Normal view

There are new articles available, click to refresh the page.
Yesterday — 25 April 2024Main stream

Coverage Guided Fuzzing – Extending Instrumentation to Hunt Down Bugs Faster!

We at IncludeSec sometimes have the need to develop fuzzing harnesses for our clients as part of our security assessment and pentesting work. Using fuzzing in an assessment methodology can uncover vulnerabilities in modern and complex software during security assessments by providing a faster way to submit highly structured inputs to the applications. This technique is usually applied when a more comprehensive effort beyond manual and traditional automated testing are requested by our clients to provide an additional analysis to uncover more esoteric vulnerabilities.

Introduction

Coverage-guided fuzzing is a useful capability in advanced fuzzers (AFL, libFuzzer, Fuzzilli, and others). This capability permits the fuzzer to acknowledge if an input can discover new edges or branches in the binary execution paths. An edge links two branches in a control flow graph (CFG). For instance, if a logical condition involves an if-else statement, there would be two edges, one for the if and the other for the else statement. It is a significant part of the fuzzing process, helping determine if the target program’s executable code is effectively covered by the fuzzer.

A guided fuzzing process usually utilizes a coverage-guided fuzzing (CGF) technique, employing very basic instrumentation to collect data needed to identify if a new edge or coverage block is hit during the execution of a fuzz test case. The instrumentation is code added during the compilation process, utilized for a number of reasons, including software debugging which is how we will use it in this post.

However, CGF instrumentation techniques can be extended, such as by adding new metrics, as demonstrated in this paper [1], where the authors consider not only the edge count but when there is a security impact too. Generally, extending instrumentation is useful to retrieve more information from the target programs.

In this post, we modify the Fuzzilli patch for the software JerryScript. JerryScript has a known and publicly available vulnerability/exploit, that we can use to show how extending Fuzzilli’s instrumentation could be helpful for more easily identifying vulnerabilities and providing more useful feedback to the fuzzer for further testing. Our aim is to demonstrate how we can modify the instrumentation and extract useful data for the fuzzing process.

[1] (Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization – NDSS Symposium (ndss-symposium.org)

Fuzzing

Fuzzing is the process of submitting random inputs to trigger an unexpected behavior from the application. In recent approaches, the fuzzers consider various aspects of the target application for generating inputs, including the seeds – sources for generating the inputs. Since modern software has complex structures, we can not reach satisfactory results using simple inputs. In other words, by not affecting most of the target program it will be difficult to discover new vulnerabilities.

The diagram below shows an essential structure for a fuzzer with mutation strategy and code coverage capability.

  1. Seeds are selected;
  2. The mutation process takes the seeds to originate inputs for the execution;
  3. The execution happens;
  4. A vulnerability can occur or;
  5. The input hits a new edge in the target application; the fuzzer keeps mutating the same seed or; 
  6. The input does not hit new edges, and the fuzzer selects a new seed for mutation.

The code coverage is helpful to identify if the input can reach different parts of the target program by pointing to the fuzzer that a new edge or block was found during the execution.

CLANG

Clang [Clang]  is a compiler for the C, C++, Objective-C, and Objective-C++ programming languages. It is part of the LLVM project and offers advantages over traditional compilers like GCC (GNU Compiler Collection), including more expressive diagnostics, faster compilation times, and extensive analysis support. 

One significant tool within the Clang compiler is the sanitizer. Sanitizers are security libraries or tools that can detect bugs and vulnerabilities automatically by instrumenting the code. The compiler checks the compiled code for security implications when the sanitizer is enabled.

There are a few types of sanitizers in this context:

  • AddressSanitizer (ASAN): This tool detects memory errors, including vulnerabilities like buffer overflows, use-after-free, double-free, and memory leaks.
  • UndefinedBehaviorSanitizer (UBSAN): Identifies undefined behavior in C/C++ code such as integer overflow, division by zero, null pointer dereferences, and others.
  • MemorySanitizer (MSAN): Detected uninitialized memory reads in C/C++ programs that can lead to unpredictable behavior.
  • ThreadSanitizer (TSAN): Detects uninitialized data races and deadlocks in multithreads C/C++ applications.
  • LeakSanitizer (LSAN): This sanitizer is integrated with AddressSanitizer and helps detect memory leaks, ensuring that all allocated memory is being freed. 

The LLVM documentation (SanitizerCoverage — Clang 19.0.0git documentation (llvm.org)) provides a few examples of what to do with the tool. The shell snippet below shows the command line for the compilation using the ASAN option to trace the program counter.

$ clang -o targetprogram -g -fsanitize=address -fsanitize-coverage=trace-pc-guard targetprogram.c

From clang documentation:

LLVM has a simple code coverage instrumentation built in (SanitizerCoverage). It inserts calls to user-defined functions on function-, basic-block-, and edge- levels. Default implementations of those callbacks are provided and implement simple coverage reporting and visualization, however if you need just coverage visualization you may want to use SourceBasedCodeCoverage instead.”

For example, code coverage in Fuzzilli (googleprojectzero/fuzzilli: A JavaScript Engine Fuzzer (github.com)), Google’s state-of-the-art JavaScript engine fuzzer, utilizes simple instrumentation to respond to Fuzzilli’s process, as demonstrated in the code snippet below.

extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    uint32_t index = *guard;
    __shmem->edges[index / 8] |= 1 << (index % 8);
    *guard = 0;
}

The function __sanitizer_cov_trace_pc_guard() will consistently execute when a new edge is found, so no condition is necessary to interpret the new edge discovery. Then, the function changes a bit in the shared bitmap __shmem->edges to 1 (bitwise OR), and then Fuzzilli analyzes the bitmap after execution.

Other tools, like LLVM-COV (llvm-cov – emit coverage information — LLVM 19.0.0git documentation), capture code coverage information statically, providing a human-readable document after execution; however, fuzzers need to be efficient, and reading documents in the disk would affect the performance.

Getting More Information

We can modify Fuzzilli’s instrumentation and observe other resources that __sanitizer_cov_trace_pc_guard() can bring to the code coverage. The code snippet below demonstrates the Fuzzilli instrumentation with a few tweaks.

extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    uint32_t index = *guard;

    void *PC = __builtin_return_address(0);
    char PcDescr[1024];

    __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
    printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);

    __shmem->edges[index / 8] |= 1 << (index % 8);
    *guard = 0;
}

We already know that the function __sanitizer_cov_trace_pc_guard() is executed every time the instrumented program hits a new edge. In this case, we are utilizing the function __builtin_return_address() to collect the return addresses from every new edge hit in the target program. Now, the pointer PC has the return address information. We can utilize the __sanitizer_symbolize_pc() function to correlate the address to the symbols, providing more information about the source code file used during the execution.

Most fuzzers use only the edge information to guide the fuzzing process. However, as we will demonstrate in the next section, using the sanitizer interface can provide compelling information for security assessments.

Lab Exercise

In our laboratory, we will utilize another JavaScript engine. In this case, an old version of JerryScript JavaScript engine to create an environment.

  • Operating System (OS): Ubuntu 22.04
  • Target Program: JerryScript 
  • Vulnerability: CVE-2023-36109

Setting Up the Environment

You can build JerryScript using the following instructions.

First, clone the repository:

$ git clone https://github.com/jerryscript-project/jerryscript.git

Enter into the JerryScript folder and checkout the 8ba0d1b6ee5a065a42f3b306771ad8e3c0d819bc commit.

$ git checkout 8ba0d1b6ee5a065a42f3b306771ad8e3c0d819bc

Fuzzilli utilizes the head 8ba0d1b6ee5a065a42f3b306771ad8e3c0d819bc for the instrumentation, and we can take advantage of the configuration done for our lab. Apply the patch available in the Fuzziilli’s repository (fuzzilli/Targets/Jerryscript/Patches/jerryscript.patch at main · googleprojectzero/fuzzilli (github.com))

$ cd jerry-main
$ wget https://github.com/googleprojectzero/fuzzilli/raw/main/Targets/Jerryscript/Patches/jerryscript.patch
$ patch < jerryscript.patch
patching file CMakeLists.txt
patching file main-fuzzilli.c
patching file main-fuzzilli.h
patching file main-options.c
patching file main-options.h
patching file main-unix.c

The instrumented file is jerry-main/main-fuzzilli.c, provided by the Fuzzilli’s patch.  It comes with the necessary to work with simple code coverage capabilities. Still, we want more, so we can use the same lines we demonstrated in the previous section to update the function __sanitizer_cov_trace_pc_guard() before the compilation. Also, adding the following header to jerry-main/main-fuzzilli.c file:

#include <sanitizer/common_interface_defs.h>

The file header describes the __sanitizer_symbolize_pc() function, which will be needed in our implementation. We will modify the function in the jerry-main/main-fuzzilli.c file.

void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    uint32_t index = *guard;
    if(!index) return;
    index--;

    void *PC = __builtin_return_address(0);
    char PcDescr[1024];

    __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
    printf("guard: %p %x PC %s\n", (void *)guard, *guard, PcDescr);
    __shmem->edges[index / 8] |= 1 << (index % 8);
    *guard = 0;
}

We now change the compilation configuration and disable the strip. The symbols are only needed to identify the possible vulnerable functions for our demonstration.

In the root folder CMakeLists.txt file

# Strip binary
if(ENABLE_STRIP AND NOT CMAKE_BUILD_TYPE STREQUAL "Debug")
  jerry_add_link_flags(-g)
endif()

It defaults with the -s option; change to -g to keep the symbols. Make sure that jerry-main/CMakeLists.txt contains the main-fuzzilli.c file, and then we are ready to compile. We can then build it using the Fuzzilli instructions.

$ python jerryscript/tools/build.py --compile-flag=-fsanitize-coverage=trace-pc-guard --profile=es2015-subset --lto=off --compile-flag=-D_POSIX_C_SOURCE=200809 --compile-flag=-Wno-strict-prototypes --stack-limit=15

If you have installed Clang, but the output line CMAKE_C_COMPILER_ID is displaying GNU or something else, you will have errors during the building.

$ python tools/build.py --compile-flag=-fsanitize-coverage=trace-pc-guard --profile=es2015-subset --lto=off --compile-flag=-D_POSIX_C_SOURCE=200809 --compile-flag=-Wno-strict-prototypes --stack-limit=15
-- CMAKE_BUILD_TYPE               MinSizeRel
-- CMAKE_C_COMPILER_ID            GNU
-- CMAKE_SYSTEM_NAME              Linux
-- CMAKE_SYSTEM_PROCESSOR         x86_64

You can simply change the CMakeLists.txt file, lines 28-42 to enforce Clang instead of GNU by modifying USING_GCC 1 to USING_CLANG 1, as shown below:

# Determining compiler
if(CMAKE_C_COMPILER_ID MATCHES "GNU")
  set(USING_CLANG 1)
endif()

if(CMAKE_C_COMPILER_ID MATCHES "Clang")
  set(USING_CLANG 1)
endif()

The instrumented binary will be the build/bin/jerry file.

Execution

Let’s start by disabling ASLR (Address Space Layout Randomization).

$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

After testing, we can re-enable the ASLR by setting the value to 2.

$ echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

We want to track the address to the source code file, and disabling the ASLR will help us stay aware during the analysis and not affect our results. The ASLR will not impact our lab, but keeping the addresses fixed during the fuzzing process will be fundamental.

Now, we can execute JerryScript using the PoC file for the vulnerability CVE-2023-36109 (Limesss/CVE-2023-36109: a poc for cve-2023-36109 (github.com)), as an argument to trigger the vulnerability. As described in the vulnerability description, the vulnerable function is at ecma_stringbuilder_append_raw in jerry-core/ecma/base/ecma-helpers-string.c, highlighted in the command snippet below. 

$ ./build/bin/jerry ./poc.js
[...]
guard: 0x55e17d12ac88 7bb PC 0x55e17d07ac6b in ecma_string_get_ascii_size ecma-helpers-string.c
guard: 0x55e17d12ac84 7ba PC 0x55e17d07acfe in ecma_string_get_ascii_size ecma-helpers-string.c
guard: 0x55e17d12ac94 7be PC 0x55e17d07ad46 in ecma_string_get_size (/jerryscript/build/bin/jerry+0x44d46) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12e87c 16b8 PC 0x55e17d09dfe1 in ecma_regexp_replace_helper (/jerryscript/build/bin/jerry+0x67fe1) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12ae04 81a PC 0x55e17d07bb64 in ecma_stringbuilder_append_raw (/jerryscript/build/bin/jerry+0x45b64) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12e890 16bd PC 0x55e17d09e053 in ecma_regexp_replace_helper (/jerryscript/build/bin/jerry+0x68053) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12e8b8 16c7 PC 0x55e17d09e0f1 in ecma_regexp_replace_helper (/jerryscript/build/bin/jerry+0x680f1) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d133508 29db PC 0x55e17d0cc292 in ecma_builtin_replace_substitute (/jerryscript/build/bin/jerry+0x96292) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d133528 29e3 PC 0x55e17d0cc5bd in ecma_builtin_replace_substitute (/jerryscript/build/bin/jerry+0x965bd) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12f078 18b7 PC 0x55e17d040a78 in jmem_heap_realloc_block (/jerryscript/build/bin/jerry+0xaa78) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12f088 18bb PC 0x55e17d040ab4 in jmem_heap_realloc_block (/jerryscript/build/bin/jerry+0xaab4) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12f08c 18bc PC 0x55e17d040c26 in jmem_heap_realloc_block (/jerryscript/build/bin/jerry+0xac26) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
guard: 0x55e17d12f094 18be PC 0x55e17d040ca3 in jmem_heap_realloc_block (/jerryscript/build/bin/jerry+0xaca3) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==27636==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x55e27da7950c (pc 0x7fe341fa092b bp 0x000000000000 sp 0x7ffc77634f18 T27636)
==27636==The signal is caused by a READ memory access.
    #0 0x7fe341fa092b  string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:513
    #1 0x55e17d0cc3bb in ecma_builtin_replace_substitute (/jerryscript/build/bin/jerry+0x963bb) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #2 0x55e17d09e103 in ecma_regexp_replace_helper (/jerryscript/build/bin/jerry+0x68103) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #3 0x55e17d084a23 in ecma_builtin_dispatch_call (/jerryscript/build/bin/jerry+0x4ea23) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #4 0x55e17d090ddc in ecma_op_function_call_native ecma-function-object.c
    #5 0x55e17d0909c1 in ecma_op_function_call (/jerryscript/build/bin/jerry+0x5a9c1) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #6 0x55e17d0d4743 in ecma_builtin_string_prototype_object_replace_helper ecma-builtin-string-prototype.c
    #7 0x55e17d084a23 in ecma_builtin_dispatch_call (/jerryscript/build/bin/jerry+0x4ea23) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #8 0x55e17d090ddc in ecma_op_function_call_native ecma-function-object.c
    #9 0x55e17d0909c1 in ecma_op_function_call (/jerryscript/build/bin/jerry+0x5a9c1) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #10 0x55e17d0b929f in vm_execute (/jerryscript/build/bin/jerry+0x8329f) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #11 0x55e17d0b8d4a in vm_run (/jerryscript/build/bin/jerry+0x82d4a) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #12 0x55e17d0b8dd0 in vm_run_global (/jerryscript/build/bin/jerry+0x82dd0) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #13 0x55e17d06d4a5 in jerry_run (/jerryscript/build/bin/jerry+0x374a5) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #14 0x55e17d069e32 in main (/jerryscript/build/bin/jerry+0x33e32) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
    #15 0x7fe341e29d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #16 0x7fe341e29e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #17 0x55e17d0412d4 in _start (/jerryscript/build/bin/jerry+0xb2d4) (BuildId: 9588e1efabff4190fd492d05d3710c7810323407)
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:513 
==27636==ABORTING

Using this technique, we could identify the root cause of the vulnerability in the function ecma_stringbuilder_append_raw() address in the stack trace. 

However, if we rely only on the sanitizer to check the stack trace, we won’t be able to see the vulnerable function name in our output:

$ ./build/bin/jerry ./poc.js 
[COV] no shared memory bitmap available, skipping
[COV] edge counters initialized. Shared memory: (null) with 14587 edges
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==54331==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x5622ae01350c (pc 0x7fc1925a092b bp 0x000000000000 sp 0x7ffed516b838 T54331)
==54331==The signal is caused by a READ memory access.
    #0 0x7fc1925a092b  string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:513
    #1 0x5621ad66636b in ecma_builtin_replace_substitute (/jerryscript/build/bin/jerry+0x9636b) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #2 0x5621ad6380b3 in ecma_regexp_replace_helper (/jerryscript/build/bin/jerry+0x680b3) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #3 0x5621ad61e9d3 in ecma_builtin_dispatch_call (/jerryscript/build/bin/jerry+0x4e9d3) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #4 0x5621ad62ad8c in ecma_op_function_call_native ecma-function-object.c
    #5 0x5621ad62a971 in ecma_op_function_call (/jerryscript/build/bin/jerry+0x5a971) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #6 0x5621ad66e6f3 in ecma_builtin_string_prototype_object_replace_helper ecma-builtin-string-prototype.c
    #7 0x5621ad61e9d3 in ecma_builtin_dispatch_call (/jerryscript/build/bin/jerry+0x4e9d3) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #8 0x5621ad62ad8c in ecma_op_function_call_native ecma-function-object.c
    #9 0x5621ad62a971 in ecma_op_function_call (/jerryscript/build/bin/jerry+0x5a971) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #10 0x5621ad65324f in vm_execute (/jerryscript/build/bin/jerry+0x8324f) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #11 0x5621ad652cfa in vm_run (/jerryscript/build/bin/jerry+0x82cfa) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #12 0x5621ad652d80 in vm_run_global (/jerryscript/build/bin/jerry+0x82d80) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #13 0x5621ad607455 in jerry_run (/jerryscript/build/bin/jerry+0x37455) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #14 0x5621ad603e32 in main (/jerryscript/build/bin/jerry+0x33e32) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)
    #15 0x7fc192429d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #16 0x7fc192429e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #17 0x5621ad5db2d4 in _start (/jerryscript/build/bin/jerry+0xb2d4) (BuildId: 15a3c1cd9721e9f1b4e15fade2028ddca6dc542a)

UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:513 
==54331==ABORTING

This behavior happens because the vulnerability occurs far from the last execution in the program. Usually, the primary action would be debugging to identify the address of the vulnerable function in memory. 

Additional Considerations

The vulnerable address or address space could be used as a guide during fuzzing. We can then compare the PC to the specific address space and instruct the fuzzer to focus on a path by mutating the same input in an attempt to cause other vulnerabilities in the same function or file.
For example, we can also feed data related to historical vulnerability identification, correlate dangerous files to their address space in a specific project and include them into the instrumentation, and give feedback to the fuzzer to achieve a more focused fuzzing campaign.

We do not necessarily need to use __sanitizer_symbolize_pc for the fuzzing process; this is done only to demonstrate the function and file utilized by each address. Our methodology would only require void *PC = __builtin_return_address(0). The PC will point to the current PC address in the execution, which is the only information needed for the comparison and guiding process.

As we demonstrated above, we can retrieve more information about the stack trace and identify vulnerable execution paths. So, let’s look at Fuzzilli’s basic algorithm, described in their NDSS paper.

In line 12, it is defined that if a new edge is found, the JavaScript code is converted back to its Intermediate Language (IL) (line 13), and the input is added to the corpus for further mutations in line 14.

What can we change to improve the algorithm? Since we have more information about historical vulnerability identification and stack traces, I think that’s a good exercise for the readers.

Conclusion

We demonstrated that we can track the real-time stack trace of a target program by extending Fuzzilli’s instrumentation. By having better visibility into the return address information and its associated source code files, it’s easier to supply the fuzzer with additional paths that can produce interesting results.

Ultimately, this instrumentation technique can be applied to any fuzzer that can take advantage of code coverage capabilities. We intend to use this modified instrumentation output technique in a part 2 blog post at a later date, showing how it can be used to direct the fuzzer to potentially interesting execution paths.

The post Coverage Guided Fuzzing – Extending Instrumentation to Hunt Down Bugs Faster! appeared first on Include Security Research Blog.

Sifting through the spines: identifying (potential) Cactus ransomware victims

25 April 2024 at 04:01

Authored by Willem Zeeman and Yun Zheng Hu

This blog is part of a series written by various Dutch cyber security firms that have collaborated on the Cactus ransomware group, which exploits Qlik Sense servers for initial access. To view all of them please check the central blog by Dutch special interest group Cyberveilig Nederland [1]

The effectiveness of the public-private partnership called Melissa [2] is increasingly evident. The Melissa partnership, which includes Fox-IT, has identified overlap in a specific ransomware tactic. Multiple partners, sharing information from incident response engagements for their clients, found that the Cactus ransomware group uses a particular method for initial access. Following that discovery, NCC Group’s Fox-IT developed a fingerprinting technique to identify which systems around the world are vulnerable to this method of initial access or, even more critically, are already compromised.

Qlik Sense vulnerabilities

Qlik Sense, a popular data visualisation and business intelligence tool, has recently become a focal point in cybersecurity discussions. This tool, designed to aid businesses in data analysis, has been identified as a key entry point for cyberattacks by the Cactus ransomware group.

The Cactus ransomware campaign

Since November 2023, the Cactus ransomware group has been actively targeting vulnerable Qlik Sense servers. These attacks are not just about exploiting software vulnerabilities; they also involve a psychological component where Cactus misleads its victims with fabricated stories about the breach. This likely is part of their strategy to obscure their actual method of entry, thus complicating mitigation and response efforts for the affected organizations.

For those looking for in-depth coverage of these exploits, the Arctic Wolf blog [3] provides detailed insights into the specific vulnerabilities being exploited, notably CVE-2023-41266, CVE-2023-41265 also known as ZeroQlik, and potentially CVE-2023-48365 also known as DoubleQlik.

Threat statistics and collaborative action

The scope of this threat is significant. In total, we identified 5205 Qlik Sense servers, 3143 servers seem to be vulnerable to the exploits used by the Cactus group. This is based on the initial scan on 17 April 2024. Closer to home in the Netherlands, we’ve identified 241 vulnerable systems, fortunately most don’t seem to have been compromised. However, 6 Dutch systems weren’t so lucky and have already fallen victim to the Cactus group. It’s crucial to understand that “already compromised” can mean that either the ransomware has been deployed and the initial access artifacts left behind were not removed, or the system remains compromised and is potentially poised for a future ransomware attack.

Since 17 April 2024, the DIVD (Dutch Institute for Vulnerability Disclosure) and the governmental bodies NCSC (Nationaal Cyber Security Centrum) and DTC (Digital Trust Center) have teamed up to globally inform (potential) victims of cyberattacks resembling those from the Cactus ransomware group. This collaborative effort has enabled them to reach out to affected organisations worldwide, sharing crucial information to help prevent further damage where possible.

Identifying vulnerable Qlik Sense servers

Expanding on Praetorian’s thorough vulnerability research on the ZeroQlik and DoubleQlik vulnerabilities [4,5], we found a method to identify the version of a Qlik Sense server by retrieving a file called product-info.json from the server. While we acknowledge the existence of Nuclei templates for the vulnerability checks, using the server version allows for a more reliable evaluation of potential vulnerability status, e.g. whether it’s patched or end of support.

This JSON file contains the release label and version numbers by which we can identify the exact version that this Qlik Sense server is running.

Figure 1: Qlik Sense product-info.json file containing version information

Keep in mind that although Qlik Sense servers are assigned version numbers, the vendor typically refers to advisories and updates by their release label, such as “February 2022 Patch 3”.

The following cURL command can be used to retrieve the product-info.json file from a Qlik server:

curl -H "Host: localhost" -vk 'https://<ip>/resources/autogenerated/product-info.json?.ttf'

Note that we specify ?.ttf at the end of the URL to let the Qlik proxy server think that we are requesting a .ttf file, as font files can be accessed unauthenticated. Also, we set the Host header to localhost or else the server will return 400 - Bad Request - Qlik Sense, with the message The http request header is incorrect.

Retrieving this file with the ?.ttf extension trick has been fixed in the patch that addresses CVE-2023-48365 and you will always get a 302 Authenticate at this location response:

> GET /resources/autogenerated/product-info.json?.ttf HTTP/1.1
> Host: localhost
> Accept: */*
>
< HTTP/1.1 302 Authenticate at this location
< Cache-Control: no-cache, no-store, must-revalidate
< Location: https://localhost/internal_forms_authentication/?targetId=2aa7575d-3234-4980-956c-2c6929c57b71
< Content-Length: 0
<

Nevertheless, this is still a good way to determine the state of a Qlik instance, because if it redirects using 302 Authenticate at this location it is likely that the server is not vulnerable to CVE-2023-48365.

An example response from a vulnerable server would return the JSON file:

> GET /resources/autogenerated/product-info.json?.ttf HTTP/1.1
> Host: localhost
> Accept: */*
>
< HTTP/1.1 200 OK
< Set-Cookie: X-Qlik-Session=893de431-1177-46aa-88c7-b95e28c5f103; Path=/; HttpOnly; SameSite=Lax; Secure
< Cache-Control: public, max-age=3600
< Transfer-Encoding: chunked
< Content-Type: application/json;charset=utf-8
< Expires: Tue, 16 Apr 2024 08:14:56 GMT
< Last-Modified: Fri, 04 Nov 2022 23:28:24 GMT
< Accept-Ranges: bytes
< ETag: 638032013040000000
< Server: Microsoft-HTTPAPI/2.0
< Date: Tue, 16 Apr 2024 07:14:55 GMT
< Age: 136
<
{"composition":{"contentHash":"89c9087978b3f026fb100267523b5204","senseId":"qliksenseserver:14.54.21","releaseLabel":"February 2022 Patch 12","originalClassName":"Composition","deprecatedProductVersion":"4.0.X","productName":"Qlik Sense","version":"14.54.21","copyrightYearRange":"1993-2022","deploymentType":"QlikSenseServer"},
<snipped>

We utilised Censys and Google BigQuery [6] to compile a list of potential Qlik Sense servers accessible on the internet and conducted a version scan against them. Subsequently, we extracted the Qlik release label from the JSON response to assess vulnerability to CVE-2023-48365.

Our vulnerability assessment for DoubleQlik / CVE-2023-48365 operated on the following criteria:

  1. The release label corresponds to vulnerability statuses outlined in the original ZeroQlik and DoubleQlik vendor advisories [7,8].
  2. The release label is designated as End of Support (EOS) by the vendor [9], such as “February 2019 Patch 5”.

We consider a server non-vulnerable if:

  1. The release label date is post-November 2023, as the advisory states that “November 2023” is not affected.
  2. The server responded with HTTP/1.1 302 Authenticate at this location.

Any other responses were disregarded as invalid Qlik server instances.

As of 17 April 2024, and as stated in the introduction of this blog, we have detected 5205 Qlik Servers on the Internet. Among them, 3143 servers are still at risk of DoubleQlik, indicating that 60% of all Qlik Servers online remain vulnerable.

Figure 2: Qlik Sense patch status for DoubleQlik CVE-2023-48365

The majority of vulnerable Qlik servers reside in the United States (396), trailed by Italy (280), Brazil (244), the Netherlands (241), and Germany (175).

Figure 3: Top 20 countries with servers vulnerable to DoubleQlik CVE-2023-48365

Identifying compromised Qlik Sense servers

Based on insights gathered from the Arctic Wolf blog and our own incident response engagements where the Cactus ransomware was observed, it’s evident that the Cactus ransomware group continues to redirect the output of executed commands to a True Type font file named qle.ttf, likely abbreviated for “qlik exploit”.

Below are a few examples of executed commands and their output redirection by the Cactus ransomware group:

whoami /all > ../Client/qmc/fonts/qle.ttf
quser > ../Client/qmc/fonts/qle.ttf

In addition to the qle.ttf file, we have also observed instances where qle.woff was used:

Figure 4: Directory listing with exploitation artefacts left by Cactus ransomware group

It’s important to note that these font files are not part of a default Qlik Sense server installation.

We discovered that files with a font file extension such as .ttf and .woff can be accessed without any authentication, regardless of whether the server is patched. This likely explains why the Cactus ransomware group opted to store command output in font files within the fonts directory, which in turn, also serves as a useful indicator of compromise.

Our scan for both font files, found a total of 122 servers with the indicator of compromise. The United States ranked highest in exploited servers with 49 online instances carrying the indicator of compromise, followed by Spain (13), Italy (11), the United Kingdom (8), Germany (7), and then Ireland and the Netherlands (6).

Figure 5: Top 20 countries with known compromised Qlik Sense servers

Out of the 122 compromised servers, 46 were not vulnerable anymore.

When the indicator of compromise artefact is present on a remote Qlik Sense server, it can imply various scenarios. Firstly, it may suggest that remote code execution was carried out on the server, followed by subsequent patching to address the vulnerability (if the server is not vulnerable anymore). Alternatively, its presence could signify a leftover artefact from a previous security incident or unauthorised access.

While the root cause for the presence of these files is hard to determine from the outside it still is a reliable indicator of compromise.

Responsible disclosure by the DIVD
We shared our fingerprints and scan data with the Dutch Institute of Vulnerability Disclosure (DIVD), who then proceeded to issue responsible disclosure notifications to the administrators of the Qlik Sense servers.

Call to action

Ensure the security of your Qlik Sense installations by checking your current version. If your software is still supported, apply the latest patches immediately. For systems that are at the end of support, consider upgrading or replacing them to maintain robust security.

Additionally, to enhance your defences, it’s recommended to avoid exposing these services to the entire internet. Implement IP whitelisting if public access is necessary, or better yet, make them accessible only through secure remote working solutions.

If you discover you’ve been running a vulnerable version, it’s crucial to contact your (external) security experts for a thorough check-up to confirm that no breaches have occurred. Taking these steps will help safeguard your data and infrastructure from potential threats.

References

  1. https://cyberveilignederland.nl/actueel/persbericht-samenwerkingsverband-melissa-vindt-diverse-nederlandse-slachtoffers-van-ransomwaregroepering-cactus ↩︎
  2. https://www.ncsc.nl/actueel/nieuws/2023/oktober/3/melissa-samenwerkingsverband-ransomwarebestrijding ↩︎
  3. https://arcticwolf.com/resources/blog/qlik-sense-exploited-in-cactus-ransomware-campaign/ ↩︎
  4. https://www.praetorian.com/blog/qlik-sense-technical-exploit/ ↩︎
  5. https://www.praetorian.com/blog/doubleqlik-bypassing-the-original-fix-for-cve-2023-41265/ ↩︎
  6. https://support.censys.io/hc/en-us/articles/360038759991-Google-BigQuery-Introduction ↩︎
  7. https://community.qlik.com/t5/Official-Support-Articles/Critical-Security-fixes-for-Qlik-Sense-Enterprise-for-Windows/ta-p/2110801 ↩︎
  8. https://community.qlik.com/t5/Official-Support-Articles/Critical-Security-fixes-for-Qlik-Sense-Enterprise-for-Windows/ta-p/2120325 ↩︎
  9. https://community.qlik.com/t5/Product-Lifecycle/Qlik-Sense-Enterprise-on-Windows-Product-Lifecycle/ta-p/1826335 ↩︎
Before yesterdayMain stream
❌
❌