Researchers from Sonatype last week reported on a supply chain attack via a malicious Python package ‘pymafka’ that was uploaded to the popular PyPI registry. The package attempted to infect users by means of typosquatting: hoping that victims looking for the legitimate ‘pykafka’ package might mistype the query and download the malware instead.
While typosquatting may seem like a rather hit-and-miss way to infect targets, it hasn’t stopped threat actors from trying their luck, and it’s the second such attack we’ve seen in recent weeks using this method. Last week, SentinelLabs reported on CrateDepression, a typosquatting attack against the Rust repository that targeted macOS and Linux users.
Both attacks also made use of red-teaming tools to drop a payload on macOS devices that ‘beacons’ out to an operator. In the case of ‘pymafka’, the attackers further made use of a very specific packing and obfuscation method to disguise the true nature of the Mach-O payload, so specific in fact that we’ve only seen that method used in the wild once before, as part of the OSX.Zuru campaign.
While the use of packing, obfuscation and beacons are all techniques common enough in the world of Windows attacks, they have rarely been seen used against macOS targets until now. In this post, we review how these TTPs were seen in pymafka and other attacks, and offer defenders indicators to help detect their use on macOS endpoints.
The Pymafka Typosquatting Attack
Since the details of this were well-covered here, we will only briefly review the first stage of the attack for context. The pymafka package was so named in the hope that users would confuse it with pykafka, a Kafka client for Python that is widely used in enterprises. Kafka itself is described as “an open-source distributed event streaming platform used by thousands of companies”, including “80% of all Fortune 100 companies”, a description which gives a fairly clear indication of the attackers’ interests.
The pymafka package contains a Python script that surveils the host and determines its operating system.
The setup.py script runs different logic for different platforms, including macOS
If the device is running macOS, it reaches out to a C2 and downloads a Mach-O binary called ‘MacOs’, which is then written to the /var/tmp (aka /private/var/tmp) directory with the filename “zad”.
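The platform-dispatch pattern described above can be sketched as follows. This is an illustration only, not the actual pymafka code: the macOS details (a Mach-O fetched from a C2 and written to /var/tmp as “zad”) come from the report, while the function itself and its name are hypothetical.

```python
import platform
from typing import Optional

# Illustrative sketch of the setup.py dispatch pattern -- NOT pymafka's code.
MACOS_DROP_PATH = "/var/tmp/zad"

def drop_path_for(system: Optional[str] = None) -> Optional[str]:
    """Where a payload would be written for the given platform, if known."""
    system = system or platform.system()
    if system == "Darwin":
        # Per the report: the 'MacOs' binary is fetched from the C2
        # and written locally as "zad" in /var/tmp
        return MACOS_DROP_PATH
    # Windows and Linux branches also existed; details omitted here
    return None
```
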
Threat hunters should note that /var/tmp is not the same as the standard /tmp directory (aka /private/tmp), nor is it the same as the Darwin User $TMPDIR directory, both of which are more typical destinations for malware payloads. This little-used location may not be scanned or monitored by some security tools.
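A simple hunt for Mach-O executables sitting in this little-used location might look like the following sketch. It checks only the file header magic values and is no substitute for a real scanner; the directory default is the /var/tmp location discussed above.

```python
import os
import struct

# Mach-O magic numbers: 32/64-bit, in both byte orders
MACHO_MAGICS = {0xFEEDFACE, 0xFEEDFACF, 0xCEFAEDFE, 0xCFFAEDFE}

def is_macho(path):
    """Return True if the file begins with a Mach-O magic number."""
    try:
        with open(path, "rb") as f:
            head = f.read(4)
    except OSError:
        return False
    if len(head) < 4:
        return False
    return struct.unpack(">I", head)[0] in MACHO_MAGICS

def hunt(directory="/var/tmp"):
    """List Mach-O files sitting directly in the given directory."""
    hits = []
    for name in os.listdir(directory):
        p = os.path.join(directory, name)
        if os.path.isfile(p) and is_macho(p):
            hits.append(p)
    return hits
```
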
It is also worth noting that ‘MacOs’ is itself a typo. The only forms of this word used by Apple are ‘MacOS’ (the name of the directory inside every application bundle that contains the program executable) and ‘macOS’ (the official name of the operating system, which replaced ‘OS X’). No Apple binary on the system takes this word as a name. ‘MacOs’ is used only as the name of the file as it is stored remotely, so it may be useful in case-sensitive hunts across URL data; as noted above, the executable is written to the local file system as “zad”.
Packed and Obfuscated Payload
The payload is packed with UPX, a common enough technique used to evade certain kinds of static scanning tools. Aside from pymafka, UPX was recently used in the Mac variant of oRat, in OSX.Zuru, and in a variant of DazzleSpy, but more interesting than the packing is the obfuscation found in the decompressed binary.
The obfuscation has strong overlaps with a payload from the OSX.Zuru campaign. In that campaign, Chinese-linked threat actors distributed a series of sophisticated trojanized apps, including iTerm, Navicat, SecureCRT and Microsoft Remote Desktop via sponsored links in the Baidu search engine. The selection of trojanized apps suggested the threat actor was targeting users of backend tools used for SSH and other remote connections and business database management.
The trojanized apps dropped a UPX-packed Mach-O at /private/tmp/GoogleUpdate that used the same obfuscation techniques we observe in the pymafka payload. In both cases, researchers suggested the payload functions as a Cobalt Strike beacon, reaching out to check-in with a remote operator for further tasking.
The unpacked binary from OSX.Zuru and the unpacked binary from pymafka are quite different in size, the former weighing in at 5.7MB versus the latter’s 3.6MB, yet analysis of the sections suggests they have been run through a common obfuscation mechanism. In particular, the __cstring and __const sections are not only the same size but have the exact same hash values in both binaries.
The highlighted data are common to both Zuru and pymafka payloads
The two executables also display very similar entropy across all sections.
The entropy profile of OSX.Zuru payload (left) and pymafka payload (right)
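Entropy profiles like those above can be computed per section with a few lines of Python. This sketch assumes you have already carved the raw section bytes out of the binary (e.g., with r2 or another Mach-O parser) and simply measures Shannon entropy; tightly packed or encrypted sections will tend toward the 8.0 ceiling.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte buffer, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```
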
At this point, we are not suggesting that the campaigns are linked; it is possible that different actors may be coalescing around a set of similar TTPs and using a common tool or technique for obfuscating Cobalt Strike payloads.
Abusing Red Teaming Tools For macOS Compromises
More widely, our report on last week’s CrateDepression supply chain attack described how threat actors used a Poseidon Mythic payload as the second-stage of their infection chain. Mythic, like Cobalt Strike, is a legitimate tool that was designed to simulate real-world attacks for use by red teams. Unlike Cobalt Strike, Mythic is open source software that can be used “as-is” or forked and adapted at will.
Both frameworks have become so adept at simulating real-world attacks that real-world attackers have adopted these frameworks as go-to tools. While this has been true for some time regarding Cobalt Strike and attacks on enterprises running Windows and Windows servers, this is a relatively new development in campaigns targeting macOS. But as the old movie quote has it, “if you build it, they will come”.
Detecting pymafka and Similar Attacks
For security teams, this means ensuring that you have good coverage against the common red-teaming tools and frameworks that are out there and which are easily available to attackers. Test that your security software can detect attacks using similar TTPs.
Threat hunters looking for this particular obfuscation technique might consider hunting for binaries with a __TEXT.__cstring section having the MD5 hash value of c5a055de400ba07ce806eabb456adf0a and binaries having similar entropy profiles as shown above.
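Checking a carved section against that IoC is a one-liner once the bytes are in hand. The sketch below assumes you have already extracted the __TEXT.__cstring section from the unpacked Mach-O with a parser such as r2 or otool; the function names are our own.

```python
import hashlib

# IoC from the text: MD5 of the __TEXT.__cstring section shared by the
# OSX.Zuru and pymafka payloads
CSTRING_IOC_MD5 = "c5a055de400ba07ce806eabb456adf0a"

def section_md5(section_bytes: bytes) -> str:
    """MD5 hex digest of a raw section, as used in the hunt above."""
    return hashlib.md5(section_bytes).hexdigest()

def matches_ioc(section_bytes: bytes) -> bool:
    # Caller is assumed to have carved __TEXT.__cstring out already
    return section_md5(section_bytes) == CSTRING_IOC_MD5
```
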
The SentinelOne Singularity platform detects and prevents attacks such as pymafka and OSX.Zuru, both in packed and unpacked form.
Conclusion
At this point in time, we can say very little about the threat actors behind the pymafka campaign, other than that the choice of package to typosquat and the use of typosquatting itself suggest a heavy interest in compromising multiple enterprises regardless of their industry vertical. While it’s not entirely unknown for highly-targeted attacks to hide behind mass intrusion techniques to obscure the real target, the simpler explanation is that this is likely a campaign with common “crimeware objectives” – stealing data, selling access, dropping ransomware and so on.
What is interesting from our point of view is that what we may be seeing now is the beginning of a ‘mirroring’ of TTPs commonly used against other enterprise platforms coming to macOS devices and Mac users. For organizations that still think of Macs as inherently safer than their Windows counterparts, this should give pause for thought and cause for concern. Security teams should consider adjusting their risk assessments accordingly.
Welcome back to our series on macOS reversing. Last time out, we took a look at challenges around string decryption, following on from our earlier posts about beating malware anti-analysis techniques and rapid triage of Mac malware with radare2. In this fourth post in the series, we tackle several related challenges that every malware hunter faces: you have a sample, you know it’s malicious, but:
How do you determine if it’s a variant of other known malware?
If it is unknown, how do you hunt for other samples like it?
How do you write robust detection rules that survive malware authors’ refactoring and recompilation?
The answer to those challenges is part Art and part Science: a mixture of practice, intuition and occasionally luck(!) blended with a solid understanding of the tools at your disposal. In this post, we’ll get into the tools and techniques, offer you tips to guide your practice, and encourage you to gain experience (which, in turn, will help you make your own luck) through a series of related examples.
As always, you’re going to need a few things to follow along, with the second and third items in this list installed in the first.
An isolated VM – see instructions here for how to get set up
By now you might have wondered more than once if this post just had a really obvious typo: Zignatures, not signatures? No, you read that right the first time! Zignatures are r2’s own format for creating and matching function signatures. We can use them to see if a sample contains a function or functions that are similar to other functions we found in other malware. Similarly, Zignatures can help analysts identify commonly re-used library code, encryption algorithms and deobfuscation routines, saving us lots of reversing time down the road (for readers familiar with IDA Pro or Ghidra, think F.L.I.R.T or Function ID).
What’s particularly nice about Zignatures is that you can not only search for exact matches but also for matches with a certain similarity score. This allows us to find functions that have been modified from one instantiation to the other but which are otherwise the same.
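The idea of a byte-level similarity score can be sketched in a few lines. To be clear, this naive positional comparison is not r2’s actual matching algorithm: Zignatures also factor in masked bytes and the function’s control-flow graph, which this toy version ignores.

```python
def byte_similarity(a: bytes, b: bytes) -> float:
    """Fraction of positions with identical bytes, over the longer length.

    A naive stand-in for the kind of score zb reports -- r2's real
    matcher also masks variable bytes and compares function graphs.
    """
    if not a and not b:
        return 1.0
    longer = max(len(a), len(b))
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / longer
```
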
Zignatures can help us to answer the question of whether an unknown sample is a variant of a known one. Once you are familiar with Zignatures, they can also help you write good detection rules, since they will allow you to see what is constant in a family of malware and what is variant. Combined with YARA rules, which we’ll take a look at later in this post, you can create effective hunting rules for malware repositories like VirusTotal to find variants or use them to help inform the detection logic in malware hunting software.
Create and Use A Zignature
Let’s jump into some malware and create our first Zignature. Here’s a recent sample of WizardUpdate (you might remember we looked at an older sample of WizardUpdate in our post on string decryption).
Loading the sample into r2, analyzing its functions, and displaying its hashes
We’ve loaded the sample into r2 and run some analysis on it. We’ve been conveniently dropped at the main() function, which looks like this.
WizardUpdate main() function
That main function contains some malware-specific strings, so it should make a nice target for a Zignature. To create one, we use the zaf command, supplying the function name and the signature name as parameters. Our sample file happened to be called “WizardUpdateB1”, so we’ll call this signature “WizardUpdateB1_main”. In r2, the full command we need, then, is:
> zaf main WizardUpdateB1_main
We can look at the newly-created Zignature in JSON format with zj~{} (if you’re not sure why we’re using the tilde, review the earlier post on grepping in r2).
An r2 Zignature viewed in JSON format
To see that the Zignature works, try zb and note the output:
zb returns how close the match was to the Zignature and the function at the current address
The first entry in the row is the most important, as that gives us the overall (i.e., average) match (between 0.00000 and 1.00000). The next two show us the match for bytes and graph, respectively. In this case, it’s a perfect match to the function, which is of course what we would expect as this is the sample from which we created the rule.
You can also create Zignatures for every function in the binary in one go with zg.
Create function signatures for every function in a binary with one command
Beware of using zg on large files with thousands of functions though, as you might get a lot of errors or junk output. For small-ish binaries with up to a couple of hundred functions it’s probably fine, but for anything larger than that I typically go for a targeted approach.
So far, we have created and tested a Zignature, but its real value lies in using it on other samples.
Create A Reusable and Extensible Zignatures File
At the moment, your Zignatures aren’t much use because we haven’t learned yet how to save and load Zignatures between samples. We’ll do that now.
We can save our generated Zignatures with zos <filename>. Note that if you just provide the bare filename it’ll save in the current working directory. If you give an absolute path to an existing file, r2 will nicely merge the Zignatures you’re saving with any existing ones in that file.
Radare2 does have a default location from which it is supposed to autoload Zignatures if the autoload variable is set, namely ~/.local/share/radare2/zigns/ (in some documentation, it’s ~/.config/radare2/zigns/). However, I’ve never quite been able to get autoload to work from either location. If you want to try it, create the above directory and add the following line to your radare2 config file (~/.radare2rc).
e zign.autoload = true
In my case, I load my zigs file manually, which is a simple command: zo <filename> to load, and zb to run the Zignatures contained in the file against the function at the current address.
Sample WizardUpdate_B2’s main function doesn’t match our Zignature
Sample WizardUpdate_B5’s main function is a perfect match for our Zignature
As you can see, sample B5 above is a perfect match to B1, whereas B2 is way off, with a match of only around 46.6%.
When you’ve built up a collection of Zignatures, they can be really useful for checking a new sample against known families. I encourage you to create Zignatures for all your samples as they will pay dividends down the line. Don’t forget to back them up too. I learned the hard way that not having a master copy of my Zigs outside of my VMs can cause a few tears!
Creating YARA Rules Within radare2
Zignatures will help you in your efforts to determine if some new malware belongs to a family you’ve come across before, but that’s only half the battle when we come across a new sample. We also want to hunt – and detect – files that are like it. For that, YARA is our friend, and r2 handily integrates the creation of YARA strings to make this easy.
In this next example, we can see that a different WizardUpdate sample doesn’t match our earlier Zignature.
The output from zb shows that the current function doesn’t match any of our previous function signatures
While we certainly want to add a function signature for this sample’s main() to our existing Zigs, we also want to hunt for this on external repos like VirusTotal and elsewhere where YARA can be used.
Our main friend here is the pcy command. Since we’ve already been dropped at main()’s address, we can just run the pcy command directly to create a YARA string for the function.
Generating a YARA string for the current function
However, this is far too specific to be useful. Fortunately, the pcy command can be tailored to give us however many bytes we wish at whatever address.
We know that WizardUpdate makes plenty of use of ioreg, so let’s start by searching for instances of that in the binary.
Searching for the string “ioreg” in a WizardUpdate sample
Lots of hits. Let’s take a closer look at the hex of the first one.
A URL embedded in the WizardUpdate sample
That URL address might be a good candidate to include in a YARA rule, let’s try it. To grab it as YARA code, we just seek to the address and state how many bytes we want.
Generating a YARA string of 48 bytes from a specific address
This works nicely and we can just copy and paste the code into VT’s search with the content modifier. Our first effort, though, only gives us 1 hit on VirusTotal, although at least it’s different from our initial sample (we’ll add that to our collection, thanks!).
Our string only found a single hit on VirusTotal
But note how we can iterate on this process, easily generating YARA strings that we can use both for inclusion and exclusion in our YARA rules.
This time we had better success with 46 hits for one string
This string gives us lots of hits, so let’s create a file and add the string.
pcy 32 >> WizardUpdate_B.yara
Outputting the YARA string to a file
From here on, we can continue to append further strings that we might want to include or exclude in our final YARA rule. When we are finished, all we have to do is open our new .yara file and add the YARA metadata and conditional logic, or we can paste the contents of our file into VT’s Livehunt template and test out our rule there.
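Assembled, the finished rule might look something like the skeleton below. The hex string here is a placeholder standing in for real pcy output, and the rule name is our own; only the “ioreg” string and the Mach-O magic check are grounded in the analysis above.

```yara
rule WizardUpdate_B_sketch
{
    meta:
        description = "Sketch: WizardUpdate variant B (strings from pcy)"
        author      = "your name here"

    strings:
        // Placeholder bytes -- substitute the output of pcy here
        $code_chunk = { 55 48 89 E5 ?? ?? ?? ?? 48 8D 3D }
        $ioreg      = "ioreg" ascii

    condition:
        // 0xfeedfacf read little-endian == 64-bit Mach-O magic
        uint32(0) == 0xfeedfacf and all of them
}
```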
Xrefs For the Win
At the beginning of this post I said that the answer to some of the challenges we would deal with today were “part Art and part Science”. We’ve done plenty of “the Science”, so I want to round out the post by talking a little about “the Art”. Let’s return to a topic we covered briefly earlier in this series – finding cross-references in r2 – and introduce a couple of handy tips that can make development of hunting rules a little easier.
When developing a hunting or detection rule for a malware family, we are trying to balance two opposing demands: we want our rule to be specific enough not to create false positives, but wide or general enough not to miss true positives. If we had perfect knowledge of all samples that ever had been or ever would be created for the family under consideration, that would be no problem at all, but that’s precisely the knowledge-gap that our rule is aiming to fill.
A common tip for writing YARA rules is to use something like a combination of strings, method names and imports to try to achieve this balance. That’s good advice, but sometimes malware is packed to have virtually none of these, or not enough to make them easily distinguishable. On top of that, malware authors can and do easily refactor such artifacts and that can make your rules date very quickly.
A supplementary approach that I often use is to focus on code logic that is less easy for authors to change and more likely to be re-used.
Let’s take a look at this sample of Adload written in Go. It’s a variant of a much more prolific version, also written in Google’s Golang. Both versions contain calls to a legit project found on Github, but this variant is missing one of the distinctive strings that made its more widespread cousin fairly easy to hunt.
A version of Adload that calls out to a popular project on Github
However, notice the URL at 0x7226. That could be interesting, but if we hit on that domain name string alone in VirusTotal we only see 3 hits, so that’s way too tight for our rule.
Your rules won’t catch much if your strings are too specific
Let’s grab some bytes immediately after the C2 string is loaded
We might do better by grabbing bytes of code right after that string has been loaded: while the C2 string itself will certainly change, the code that consumes it perhaps might not. In this case, searching on 96 bytes from 0x7255 catches a more respectable 23 hits, but that still seems too low for a malware variant that has been circulating for many months.
Notice the dates – this malware has probably far more than just 23 samples
Let’s see if we can do better. One trick I find useful with r2 is to hunt down all the XREFs to a particular piece of code and then look at the calling functions for useful sequences of byte code to hunt on.
For example, you can use sf. to seek to the beginning of a function from a given address (assuming it’s part of a function, of course) and then use axg to get the path of execution to that function all the way from main(). You can use pds to give you a summary of the calls in any function along the way, which means combining axg and pds is a very good way to quickly move around a binary in r2 to find things of interest.
Using the axg command to trace execution path back to main
Now that we can see the call graph to the C2 string, we can start hunting for logic that is more likely to be re-used across samples. In this case, let’s hunt for bytes where sym.main.main calls the function that loads the C2 URL at 0x01247a41.
Finding reusable logic that should be more general than individual strings
Grabbing 48 bytes from that address and hunting for it on VT gives us a much more respectable 45 TP hits. We can also see from VT that these files all have a common size, 5.33MB, which we can use as a further pivot for hunting.
Our hunt is starting to give better results, but don’t stop here!
We’ve made a huge improvement on our initial hits of 3 and then 23, but we’re not really done yet. If we keep iterating on this process, looking for reusable code rather than just specific strings, imports or method names, we’re likely to do much better, and by now you should have a solid understanding of how to do that using r2 to help you in your quest. All you need now, just like any good piece of malware, is a bit of persistence!
Conclusion
In this post, we’ve taken a look at some of r2’s lesser known features that are extremely useful for hunting malware families, both in terms of associating new samples to known families and in searching for unknown relations to a sample or samples we already have. If you haven’t checked out the previous posts in this series, have a look at Part 1, Part 2 and Part 3. If you would like us to cover other topics on r2 and reverse engineering macOS malware, ping me or SentinelLabs on Twitter with your suggestions.
Last month, as we closed out 2021, we shared the most recent malware discoveries afflicting the Mac platform, covering spyware, targeted attacks on developers and activists, cryptocurrency theft and cryptomining. As worrisome as those are, the bulk of infections affecting Mac users in and out of enterprise settings revolve around adware.
Once little more than a minor nuisance, adware on all platforms has taken a darker turn in recent years, often emulating malware TTPs and regularly surpassing a lot of malware families in sophistication and rapid evolution. What’s driven these developments is simple: adware makes a lot of money. Adware also harvests a lot of data from infections which can be sold off to other actors.
Most importantly from a security team’s point of view, however, is that adware infections set up hidden, persistent executables, engage in device and environmental fingerprinting, use anti-removal, anti-analysis and detection avoidance techniques, and reach out to unknown URLs to deliver custom payloads, typically without the knowledge or informed consent of the user or, in the enterprise case, the device owner.
For all these reasons, knowing how to detect an adware infection is no less important than any other malware infection. In this post, we shine a light on the most prevalent adware families affecting the Mac platform over the last 3 months and describe the typical infection patterns for each.
Cataloguing and sharing what we know in this way has two benefits. It enables defenders to improve their immediate detection responses in the short-term, and it represents a cost to threat actors in the mid-term, who are forced to invest in retooling and rethinking their approach.
1. Adload System_Service
Adload has probably been around since 2016 and is the most common family we see in live infections today. We have discussed specific Adload campaigns a few times in the past, here and here and we advise readers to review those posts for earlier Adload indicators. We include in this entry only those that we have not detailed before or which we saw in the last quarter of 2021 and early 2022.
An increasingly common pattern we are seeing throughout late 2021 involves Adload variants written in either Go (aka Rload/Lador) or Kotlin. The Go variants currently drop a payload with the following file path pattern:
Note that the executable file name only contains numerals. Although the underscore prefix is present more often than not in instances we observed, there are cases of this pattern where the underscore is not present.
3. Adload Kotlin Variant
The Kotlin variant of Adload uses a different but still quite distinctive pattern:
A pattern seen across a number of different variants involves the Adload installer dropping a Mach-O executable in the /tmp/ directory with a filename prefixed with the letters “php” followed by 6 alphanumeric characters (a similar pattern is used by MaxOfferDeal/Genieo, which we discuss below).
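The “php” plus six alphanumerics pattern lends itself to a simple regex hunt. The character class below is an assumption based on the description above, and paths should be normalized first (remember /tmp resolves to /private/tmp on macOS).

```python
import re

# "php" followed by exactly six alphanumeric characters, directly in /tmp/
# Normalize paths before matching: /tmp is a symlink to /private/tmp
ADLOAD_TMP_RE = re.compile(r"^/tmp/php[0-9A-Za-z]{6}$")

def looks_like_adload_drop(path: str) -> bool:
    return bool(ADLOAD_TMP_RE.match(path))
```
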
There are other minor variants on this naming convention that will be readily recognizable once you are familiar with the above patterns. For more information on this pattern see here.
5. Bundlore, Shlayer, and ZShlayer
Bundlore has been around since at least 2014 and, after Adload, is the most prevalent family we see in live infections throughout 2021 and into the beginning of 2022.
Bundlore payloads are typically dropped by a Shlayer or ZShlayer DMG installer. Often the Shlayer or ZShlayer installer will have one of the following file patterns:
Note that in the case of the “Install” pattern, the “I” can appear both as upper and lowercase. We see the “Player” version more often than the “Install” one.
The first-stage Bundlore payload will be dropped in a random folder created in the /tmp/ directory with a corresponding name:
Pirrit is a macOS malware family that was first seen in 2016 and remained relatively active throughout 2017, then all but disappeared until November 2021. Since then, Pirrit has seen a new burst of activity.
In common with Bundlore, Pirrit will typically drop via a user-executed DMG, although the disk image name and application name tend to be as follows:
/Volumes/Install Flash Player/Install Flash Player
Pirrit’s first-stage payload drops in the Darwin_User_Temp_Dir (rather than the system /tmp dir) and uses an 8-character random directory name with either tmp or Installer as a prefix.
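As a sketch, the directory-name convention described above can be matched with a regex like the following. The exact character class of the random component is an assumption; since the per-user DARWIN_USER_TEMP_DIR prefix varies, we match only the final path component.

```python
import re

# (tmp|Installer) followed by 8 random characters -- the alphanumeric
# character class is an assumption based on observed samples
PIRRIT_DIR_RE = re.compile(r"^(tmp|Installer)[0-9A-Za-z]{8}$")

def looks_like_pirrit_dir(name: str) -> bool:
    """Match the final directory component, not the full temp path."""
    return bool(PIRRIT_DIR_RE.match(name))
```
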
A further component is written to a folder in the User’s Library folder or local domain Library folder (depending on available permissions) and contains an application of the same name:
Genieo is another long-standing, common macOS malware family that goes in and out of periods of activity. Late 2021 saw some new variants which we continue to track but we have seen little activity. The most prevalent one on our radar uses a persistent LaunchAgent with the following pattern for its program argument:
Example
~/Library/Application Support/.gettime/GetTime
Interestingly, the persistence file is copied from a /tmp/ file that uses a similar naming pattern to Adload, namely “php” followed by 6 characters. This may be coincidence or deliberate, and either way may have caused some vendors to identify one as the other.
The same regex we showed for Adload Mach-Os above, however, will also find these .plist files.
However, in the Adload case, these files are always Mach-Os, whereas in the MaxOfferDeal/Genieo case they are always property lists. We see no other indicators or similarities between the executable and known Adload variants.
8. MMInstall/MacUpdater
MMInstall has been around since at least early 2018 and typically installs a LaunchAgent whose program argument uses a variety of names such as “MyShopCoupon”, “CouponSmart” and similar. Older forms typically had an executable with the name “mm-install-macos”, but we haven’t seen those for some time.
Apple recently updated their XProtect malware signatures for a newer version of this adware threat that appears to have been active during the middle of 2021. The following domains are still currently active:
The only known installer pattern we have seen to date is as follows.
/Volumes/search/Search.app/Contents/MacOS/Search
Conclusion
Most adware arrives in the form of trojanized applications that users are persuaded to attempt to install. Free content, cracked apps, and “special deals” are typical vectors. The fact that some – although by no means all – adware installers make a show of obtaining user consent doesn’t ameliorate the situation: in the cases where that does happen, the consent mechanism is itself often misleading or aggressive.
Regardless of how it is installed, unless the user has permission from the device owner, then adware will almost certainly be unwanted on company-owned devices. Given the aggressive behavior of adware, it should be of no less concern than any other type of malware.
We hope the information in this post will aid security teams to identify and remove adware infections on Mac devices. We would also encourage analysts to become familiar with other useful behavioral indicators associated with a wide range of macOS threats including adware families that can be found here.
Last week, Google’s Threat Analysis Group published details around what appears to be APT activity targeting, among others, Mac users visiting Hong Kong websites supporting pro-democracy activism. Google’s report focused on the use of two vulnerabilities: a zero-day and an N-day (a known vulnerability with an available patch).
By the time of Google’s publication both had, in fact, been patched for some months. What received less attention was the malware that the vulnerabilities were leveraged to drop: a backdoor that works just fine even on the latest patched systems of macOS Monterey.
Google labelled the backdoor “Macma”, and we will follow suit. Shortly after Google’s publication, a rapid triage of the backdoor was published by Objective-See (under the name “OSX.CDDS”). In this post, we take a deeper dive into macOS.Macma, reveal further IoCs to aid defenders and threat hunters, and speculate on some of macOS.Macma’s (hitherto-unmentioned) interesting artifacts.
How macOS.Macma Gains Persistence
Thanks to the work of Google’s TAG team, we were able to grab two versions of the backdoor used by the threat actors, which we will label UserAgent 2019 and UserAgent 2021. Both are interesting, but arguably the earlier 2019 version has greater longevity since the delivery mechanism appears to work just fine on macOS Monterey.
The 2019 version of macOS.Macma will run just fine on macOS Monterey
UserAgent 2019 is a Mach-O binary dropped by an application called “SafariFlashActivity.app”, itself contained in a .DMG file (the disk image sample found by Google has the name “install_flash_player_osx.dmg”). UserAgent 2021 is a standalone Mach-O binary and contains much the same functionality as the 2019 version along with some added AV capture capabilities. This version of macOS.Macma is installed by a separate Mach-O binary dropped when the threat actors leverage the vulnerabilities described in Google’s post.
Both versions install the same persistence agent, com.UserAgent.va.plist in the current user’s ~/Library/LaunchAgents folder.
Macma’s persistence agent, com.UserAgent.va.plist
The property list is worth pausing over as it contains some interesting features. First, aside from the path to the executable, we can see that the persistence agent passes two arguments to the malware before it is run: -runMode, and ifneeded.
The agent also switches the current working directory to a custom folder, in which later will be deposited data from the separate keylogger module, among other things.
We find it interesting that the developer chose to include the LimitLoadToSessionType key with the value “Aqua”. The “Aqua” value ensures the LaunchAgent only runs when there is a logged in GUI user (as opposed to running as a background task or running when a user logs in via SSH). This is likely necessary to ensure other functionality, such as requesting that the user gives access to the Microphone and Accessibility features.
Victims are prompted to allow macOS.Macma access to the Microphone
However, since launchd defaults to “Aqua” when no key is specified at all, its inclusion here is rather redundant. We might speculate that it suggests the developer is familiar with writing LaunchAgents in other contexts, where different session types are indeed necessary.
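Putting those details together, a reconstruction of what such an agent would look like is sketched below. The label, the arguments and the session type are taken from the sample as described above; the username and the RunAtLoad key are illustrative assumptions, not recovered values:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.UserAgent.va</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/victim/Library/Preferences/lib/UserAgent</string>
        <string>-runMode</string>
        <string>ifneeded</string>
    </array>
    <!-- Keylogger data and other loot is later deposited here -->
    <key>WorkingDirectory</key>
    <string>/Users/victim/Library/Preferences/lib</string>
    <!-- Only run when a GUI user is logged in -->
    <key>LimitLoadToSessionType</key>
    <string>Aqua</string>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
```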
Application Bundle Confusion Suggests A “Messy” Development Process
Since we are discussing property lists, there are some interesting artifacts in the SafariFlashActivity.app’s Info.plist, and these in turn led us to notice a number of other oddities in the bundle executables.
One of the great things about finding malware built into a bundle with an Info.plist is that it gives away some interesting details about when, and on what machine, the malware was built.
macOS.Macma was built on El Capitan
In this case, we see the malware was built on an El Capitan machine running build 15C43. That’s curious, because build 15C43 was never a public release build: it was a beta of El Capitan 10.11.2 available to developers and AppleSeed (Apple beta testers) briefly around October to November 2015. On December 8th, 2015, El Capitan 10.11.2 was released with build number 15C50, superseding the previous public release of 10.11.1, build 15B42, from October 21st.
At this juncture, let’s note that the malware was signed with an ad hoc signature, meaning it did not require an Apple Developer account or ID to satisfy code signing requirements.
Therein lies an anomaly: the bundle was signed without needing a developer account, but it seems that the macOS version used to create this version of macOS.Macma was indeed sourced from a developer account. Such an account could possibly belong to the author(s), possibly be stolen, or possibly have been acquired with a fake ID. However, the latter two scenarios seem inconsistent with the ad hoc signature: if the developer had a fake or stolen Apple ID, why not codesign with it for added credibility?
While we’re speculating about the developer or developers’ identities, two other artifacts in the bundle are worthy of mention. The main executable in ../MacOS is called “SafariFlashActivity” and was apparently compiled on Sept 16th, 2019. In the ../Resources folder, we see what appears to be an earlier version of the executable, “SafariFlashActivity1”, built some nine days earlier on Sept 7th.
While these two executables share a large amount of code and functionality, there are also a number of differences between them. Perhaps the most intriguing are that they appear – by accident or by design – to have been created by two entirely different users.
User strings from two binaries in the same macOS.Macma bundle
The user account “lifei” (speculatively, Li Fei, a common-enough Chinese name) seems to have replaced the user account “lxk”. Of course, it could be the same person operating different user accounts, or two entirely different individuals building separately from a common project. Indeed, there are sufficiently large differences in the code in such a short space of time to make it plausible to suggest that two developers were working independently on the same project and that one was chosen over the other for the final executable embedded in the ../MacOs folder.
Note that in the “lifei” builds, we see both the use of “Mac_Ma” for the first time, and “preexcel” — used as the team identifier in the final code signature. Neither of these appear in the “lxk” build, where “SafariFlashActivity” appears to be the project name. This bifurcation even extends to an unusual inconsistency between the identifier used in the bundle and that used in the code signature, where one is xxxxx.SafariFlashActivity and the other is xxxxxx.preexcl-project.
Inconsistent identifiers used in the bundle and code signature of macOS.Macma
In any case, the string “lifei” is found in several of the other binaries in the 2019 version of macOS.Macma, whereas “lxk” is not seen again. In the 2021 version, both “lifei” and “lxk” and all other developer artifacts have disappeared entirely from both the installer and UserAgent binaries, suggesting that the development process had been deliberately cleaned up.
User lifei’s “Macma” seems to have won the ‘battle of the devs’
Finally, if we return to the various (admittedly, falsifiable) compilation dates found in the bundle, there is another curiosity: we noted that the malware appears to have been compiled on a 2015 developer build of macOS, yet the Info.plist has a copyright date of 2018, and the executables in this bundle were built well over three years later, in September 2019, according to the (entirely manipulable) timestamps.
What can we conclude from all these tangled weeds? Nothing concrete, admittedly. But there do seem to be two plausible, if competing, narratives: perhaps the threat actor went to extraordinary, and likely unnecessary, lengths to muddle the artifacts in these binaries. Alternatively, the threat actor had a somewhat confused development process with more than one developer and changing requirements. No doubt the truth is far more complex, but given the nature of the artifacts above, we suspect the latter may well be at least part of the story.
For defenders, all this provides a plethora of collectible artifacts that may, perhaps, help us to identify this malware or track this threat actor in future incidents.
macOS.Macma – Links To Android and Linux Malware?
Things start to get even more interesting when we take a look at artifacts in the executable code itself. As we noted in the introduction, an early report on this malware dubbed it “OSX.CDDS”. We can see why. The code is littered with methods prefixed with CDDS.
Some of the CDDS methods found in the 2021 UserAgent executable
That code, according to Google TAG, is an implementation of a DDS – Data Distribution Service – framework. While our searches failed to turn up a specific DDS implementation matching the functions used in macOS.Macma, we did find other malware that uses the same framework.
Android malware drops an ELF bin that contains the same CDDS framework
Links to known Android malware droppers
These ELF bins and both versions of macOS.Macma’s UserAgent also share another commonality, the strings “Octstr2Dec” and “Dec2Octstr”.
Commonalities between macOS.Macma and a malicious ELF Shared object file
These latter strings, which appear to be conversions for strings containing octals and decimals, may simply be a matter of coincidence or of code reuse. The code similarities we found also have links back to installers for the notorious Shedun Android malware.
In their report, Google’s TAG pointed out that macOS.Macma was associated with an iOS exploit chain that they had not been able to entirely recover. Our analysis suggests that the actors behind macOS.Macma were at least reusing code from ELF/Android developers and may also have been targeting Android phones with malware. Further analysis is needed to see how far these connections extend.
Macma’s Keylogger and AV Capture Functionality
While the earlier reports referred to above have already covered the basics of macOS.Macma functionality, we want to expand on previous reporting to reveal further IoCs.
As previously mentioned, macOS.Macma will drop a persistence agent at ~/Library/LaunchAgents/com.UserAgent.va.plist and an executable at ~/Library/Preferences/lib/UserAgent.
As we noted above, the LaunchAgent will ensure that before the job starts, the executable’s current working directory will be changed to the aforementioned “lib” folder. This folder is used as a repository for data culled by the keylogger, “kAgent”, which itself is dropped at ~/Library/Preferences/Tools/, along with the “at” and “arch” Mach-O binaries.
Binaries dropped by macOS.Macma
The kAgent keylogger creates text files of captured keystrokes from any text input field, including Spotlight, Finder, Safari, Mail, Messages and other apps that have text fields for passwords and so on. The text files are created with Unix timestamps for names and collected in directories called “data”.
The file 1636804188 contains data captured by the keylogger
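Since those file names are nothing more than epoch seconds, triage scripts can convert them to recover collection times. A minimal sketch in Go (the helper name is ours, for illustration; it is not part of the malware):

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// timestampToUTC converts a keylog file name consisting of epoch seconds,
// e.g. "1636804188", into a human-readable UTC time string. It returns an
// error for names that are not plausible Unix timestamps.
func timestampToUTC(name string) (string, error) {
	secs, err := strconv.ParseInt(name, 10, 64)
	if err != nil {
		return "", fmt.Errorf("%q is not a Unix timestamp: %w", name, err)
	}
	return time.Unix(secs, 0).UTC().Format("2006-01-02 15:04:05"), nil
}

func main() {
	// File name taken from the sample run described in the post.
	when, err := timestampToUTC("1636804188")
	if err != nil {
		panic(err)
	}
	fmt.Println(when) // prints 2021-11-13 11:49:48
}
```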
We also note that this malware reaches out to a remote .php file to return the user’s IP address. The same URL has a long history of use.
http://cgi1.apnic.net/cgi-bin/my-ip.php
Both Android and macOS malware ping this URL
Finally, one further IoC we noted in the ../MacOS/SafariFlashActivity “lifei” binary that never appeared anywhere else, and we also did not see dropped on any of our test runs, was:
This is worth mentioning since the target folder, the User’s Library/Safari folder, is TCC protected since Mojave. For that reason, any attempt to install there would fall afoul of current TCC protections (bypasses notwithstanding). It looks, therefore, like a remnant of the earlier code development from El Capitan era, and indeed we do not see this string in later versions. However, it’s unique enough for defenders to watch out for: there’s never any legitimate reason for an executable at this path to exist on any version of macOS.
Conclusion
Catching APTs targeting macOS users is a rare event, and we are lucky in this instance to have a fairly transparent view of the malware being dropped. Regardless of the vector used to drop the malware, the payload itself is perfectly functional and capable of exfiltrating data and spying on macOS users. It’s just another reminder, if one were needed, that simply investing in a Mac does not guarantee you safe passage against bad actors. This may have been an APT-developed payload, but the code is simple enough for anyone interested in malfeasance to reproduce.
If you’ve been following this series so far, you’ll have a good idea how to use radare2 to quickly triage a Mach-O binary statically and how to move through it dynamically to beat anti-analysis attempts. But sometimes, no matter how much time you spend looking at disassembly or debugging, you’ll hit a roadblock trying to figure out your macOS malware sample’s most interesting behavior, because many of the human-readable ‘strings’ have been rendered unintelligible by encryption and/or obfuscation.
That’s the bad news; the good news is that while encryption is most definitely hard, decryption is, at least in principle, somewhat easier. Whatever methods are used, at some point during execution the malware itself has to decrypt its code. This means that, although there are many different methods of encryption, most practical implementations are amenable to reverse engineering given the right conditions.
Sometimes, we can do our decryption statically, perhaps emulating the malware’s decryption method(s) by writing our own decryption logic(s). Other times, we may have to run the malware and extract the strings as they are decrypted in memory. We’ll take a practical look at using both of these techniques in today’s post through a series of short case studies of real macOS malware.
First, we’ll look at an example of AES 128 symmetric encryption used in the recent macOS.ZuRu malware and show you how to quickly decode it; then we’ll decrypt a Vigenère cipher used in the WizardUpdate/Silver Toucan malware; finally, we’ll see how to decode strings dynamically, in-memory while executing a sample of a notorious adware installer.
Although we cannot cover all the myriad possible encryption schemes or methods you might encounter in the wild, these case studies should give you a solid basis from which to tackle other encryption challenges. We’ll also point you to some further resources showcasing other macOS malware decryption strategies to help you expand your knowledge.
For our case studies, you can grab a copy of the malware samples we’ll be using from the following links:
Don’t forget to use an isolated VM for all this work: these are live malware samples and you do not want to infect your personal or work device!
Breaking AES Encryption in macOS.ZuRu
Let’s begin with a recent strain of new macOS malware dubbed ‘macOS.ZuRu’. This malware was distributed inside trojanized applications such as iTerm, MS Remote Desktop and others in September 2021. Inside the malware’s application bundle is a Frameworks folder containing the malicious libcrypto.2.dylib. The sample we’re going to look at has the following hash signatures:
Let’s load it into r2 in the usual way (if you haven’t read the earlier posts in this series, catch up here and here), and consider the simple sequence of reversing steps illustrated in the following images.
Getting started with our macOS.ZuRu sample
As shown in the image above, after loading the binary, we use ii to look at the imports, and see among them CCCrypt (note that I piped this to head for display purposes). We then do a case insensitive search on ‘crypt’ in the functions list with afll~+crypt.
If we add [0] to the end of that, it gives us just the first column of addresses. We can then do a for-each over those using backticks to pipe them into axt to grab the XREFS. The entire command is:
> axt @@=`afll~crypt[0]`
The result, as you can see in the lower section of the image above, shows us that the malware uses CCCrypt to call the AESDecrypt128 block cipher algorithm.
AES128 requires a 128-bit key, which is the equivalent of 16 bytes. Though there’s a number of ways that such a key could be encoded in malware, the first thing we should do is a simple check for any 16 byte strings in the binary.
To do that quickly, let’s pipe the binary’s strings through awk and filter on the len column for ‘16’: That’s the fourth column in r2’s iz output. We’ll also narrow down the output to just cstrings by grepping on ‘string’, so our command is:
> iz | awk '$4==16' | grep string
We can see the output in the middle section of the following image.
Filtering the malware’s strings for possible AES 128 keys
We got lucky! There are two occurrences of what is obviously not a plain text string. Of course, it could be anything, but if we check out the XREFS we can see that this string is provided as an argument to the AESDecrypt method, as illustrated in the lower section of the above image.
All that remains now is to find the strings that are being deciphered. If we get the function summary of AESDecrypt from the address shown in our last command, 0x348b, it reveals that the function is using base64 encoded strings.
> pds @ 0x348b
Grabbing a function summary in r2 with the pds command
A quick and dirty way to look for base64 encoded strings is to grep on the “=” sign. We’ll use r2’s own grep function, ~ and pipe the result of that through another filter for “str” to further refine the output.
> iz~=~str
A quick-and-dirty grep for possible base64 cipher strings
Our search returns three hits that look like good candidates, but the proof is in the pudding! What we have at this point are candidates for:
the encryption algorithm – AES128
the key – “quwi38ie87duy78u”
three ciphers – “oPp2nG8br7oIB+5wLoA6Bg==, …”
All we need to do now is to run our suspects through the appropriate decryption routine for that algorithm. There are online tools such as Cyber Chef that can do that for you, or you can find code for most popular algorithms for your favorite language from an online search. Here, we implemented our own rough-and-ready AES128 decryption algorithm in Go to test out our candidates:
A simple AES128 ECB decryption algorithm implemented in Go
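For those who want to follow along, a rough-and-ready sketch of such a decrypter is shown below. It assumes AES-128 in ECB mode with PKCS#7 padding; the encrypt helper exists only so the logic can be sanity-checked with a round trip, and the plaintext in main is a stand-in, not a string from the malware:

```go
package main

import (
	"crypto/aes"
	"encoding/base64"
	"fmt"
)

// decryptAES128ECB base64-decodes a cipher string, decrypts it with AES-128
// in ECB mode (block by block, no IV), then strips PKCS#7 padding if present.
func decryptAES128ECB(b64cipher, key string) (string, error) {
	data, err := base64.StdEncoding.DecodeString(b64cipher)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher([]byte(key)) // key must be 16 bytes for AES-128
	if err != nil {
		return "", err
	}
	if len(data) == 0 || len(data)%aes.BlockSize != 0 {
		return "", fmt.Errorf("ciphertext length %d is not a multiple of 16", len(data))
	}
	plain := make([]byte, len(data))
	for i := 0; i < len(data); i += aes.BlockSize {
		block.Decrypt(plain[i:i+aes.BlockSize], data[i:i+aes.BlockSize])
	}
	if pad := int(plain[len(plain)-1]); pad >= 1 && pad <= aes.BlockSize {
		plain = plain[:len(plain)-pad]
	}
	return string(plain), nil
}

// encryptAES128ECB is the inverse helper, used only to sanity-check the
// decrypter with a round trip.
func encryptAES128ECB(plain, key string) (string, error) {
	block, err := aes.NewCipher([]byte(key))
	if err != nil {
		return "", err
	}
	pad := aes.BlockSize - len(plain)%aes.BlockSize
	data := []byte(plain)
	for i := 0; i < pad; i++ {
		data = append(data, byte(pad)) // PKCS#7 padding
	}
	out := make([]byte, len(data))
	for i := 0; i < len(data); i += aes.BlockSize {
		block.Encrypt(out[i:i+aes.BlockSize], data[i:i+aes.BlockSize])
	}
	return base64.StdEncoding.EncodeToString(out), nil
}

func main() {
	key := "quwi38ie87duy78u" // the candidate key recovered from the binary
	enc, _ := encryptAES128ECB("hello from the analyst VM", key)
	dec, err := decryptAES128ECB(enc, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(dec) // the round trip recovers the plaintext
}
```

Feed each candidate cipher string from the binary through decryptAES128ECB with the candidate key to recover the clear text.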
We can pipe all the candidate ciphers to file from within r2 and then use a shell one-liner in a separate Terminal window to run each line through our Go decryption script with the candidate key.
Revealing the strings in clear text with our Go decrypter
And voila! With a few short commands in r2 and a bash one-liner, we’ve decrypted the strings in macOS.ZuRu and found a valuable IoC for detection and further investigation.
Decoding a Vigenère Cipher in WizardUpdate Malware
In our second case study, we’re going to take a look at the string encryption used in a recent sample of WizardUpdate malware. The sample we’ll look at has the following hash signatures:
We’ll follow the same procedure as last time, beginning with a case insensitive search of functions with “crypt” in the name, filtering the results of that down to addresses, and getting the XREFS for each of the addresses. This is what it looks like on our new sample:
Finding our way to the string encryption code from the function analysis
We can see that there are several calls from main to a decrypt function, and that function itself calls sym.decrypt_vigenere.
Vigenère is a well-known cipher algorithm which we will say a bit more about shortly, but for now, let’s see if we can find any strings that might be either keys or ciphers.
Since a lot of the action is happening in main, let’s do a quick pds summary on the main function.
Using pds to get a quick summary of a function
There are at least two strings of interest. Let’s take a better look by leveraging r2’s afns command, which lists all strings associated with the current function.
r2’s afns can help you isolate strings in a function
That gives us a few more interesting looking candidates. Given its length and form, my suspicion at this point is that the “LBZEWWERBC” string is likely the key.
We can isolate just the strings we want by successive filtering. First, we get just the rows we want:
> afns~:1..5
And then grab just the last column (ignoring the addresses):
> afns~:1..5[2]
Then using sed to remove the “str.” prefix and grep to remove the “{MAID}” string, we end up with:
Access to the shell in r2 makes it easy to isolate the strings of interest
As before, we can now pipe these out to a “ciphers” file.
Let’s next turn to the encryption algorithm. Vigenère has a fascinating history. Once thought to be unbreakable, it’s now considered highly insecure for cryptography. In fact, if you like puzzles, you can decrypt a Vigenère cipher with a manual table.
The Vigenère cipher was invented before computers and can be solved by hand
One of the Vigenère cipher’s weaknesses is that it’s possible to discern patterns in the ciphertext that can reveal the length of the key. That problem can be avoided by encrypting a base64 encoding of the plain text rather than the plain text itself.
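The decode step itself is just modular arithmetic over whatever alphabet the malware uses. Here is a sketch in Go (in the analysis below we actually adapt a Python script; this version only illustrates the same logic, and the alphabet shown is a placeholder — the real one must be recovered from the binary):

```go
package main

import (
	"fmt"
	"strings"
)

// vigenereDecode reverses a Vigenère encoding over a custom alphabet.
// Characters not in the alphabet pass through unchanged, and the key index
// only advances on characters that were actually decoded. Key characters
// are assumed to be members of the alphabet.
func vigenereDecode(cipher, key, alphabet string) string {
	var out strings.Builder
	ki := 0
	n := len(alphabet)
	for i := 0; i < len(cipher); i++ {
		ci := strings.IndexByte(alphabet, cipher[i])
		if ci < 0 {
			out.WriteByte(cipher[i])
			continue
		}
		kc := strings.IndexByte(alphabet, key[ki%len(key)])
		out.WriteByte(alphabet[(ci-kc+n)%n])
		ki++
	}
	return out.String()
}

// vigenereEncode is the forward direction, handy for sanity-checking.
func vigenereEncode(plain, key, alphabet string) string {
	var out strings.Builder
	ki := 0
	n := len(alphabet)
	for i := 0; i < len(plain); i++ {
		pi := strings.IndexByte(alphabet, plain[i])
		if pi < 0 {
			out.WriteByte(plain[i])
			continue
		}
		kc := strings.IndexByte(alphabet, key[ki%len(key)])
		out.WriteByte(alphabet[(pi+kc)%n])
		ki++
	}
	return out.String()
}

func main() {
	// Placeholder alphabet; WizardUpdate's real alphabet comes from the binary.
	alphabet := "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	key := "LBZEWWERBC" // the suspected key string found earlier
	enc := vigenereEncode("SYSTEMPROFILER", key, alphabet)
	fmt.Println(enc, "->", vigenereDecode(enc, key, alphabet))
}
```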
Now, if we jump back into radare2, we’ll see that WizardUpdate does indeed decode the output of the Vigenère function with a base64 decoder.
WizardUpdate malware uses base64 encoding either side of encrypting/decrypting
There is one other thing we need in order to decipher a Vigenère cipher, aside from the key and ciphertext: the alphabet used in the table. Let’s use another r2 feature to see if it can help us find it. Radare2’s search function, /, has some crypto search functionality built in. Use /c? to view the help on this command.
Search for crypto materials with built-in r2 commands
The /ck search gives us a hit which looks like it could function as the Vigenère alphabet.
OK, it’s time to build our decoder. This time, I’m going to adapt a Python script from here, and then feed it our ciphers file just as before. The only differences are I’m going to hardcode the alphabet in the script and then run the output through base64. Let’s see how it looks.
Decoding the strings returns base64 as expected
So far so good. Let’s try running those through base64 -D (decode) and see if we get our plain text.
Our decoder returns gibberish after we try to decode the base64
Hmm. The script runs without error, but the final decoded base64 output is gibberish. That suggests that while our key and ciphers are correct, our alphabet might not be.
Returning to r2, let’s search more widely across the strings with iz~string.
Finding cstrings in the TEXT section with r2’s ~ filter
The first hit actually looks similar to the one we tried, but with fewer characters and a different order, which will also affect the result in a Vigenère table. Let’s try again using this as the hardcoded alphabet.
Decoding the WizardUpdate’s encrypted strings back to plain text
Success! The first cipher turns out to be an encoding of the system_profiler command that returns the device’s serial number, while the second contains the attacker’s payload URL. The third downloads the payload and executes it on the victim’s device.
Reading Encrypted Strings In-Memory
Reverse engineering is a multi-faceted puzzle, and often the pieces drop into place in no particular order. When our triage of a malware sample suggests a known or readily identifiable encryption scheme has been used as we saw with macOS.ZuRu and WizardUpdate, decrypting those strings statically can be the first domino that makes the other pieces fall into place.
However, when faced with a recalcitrant sample on which the authors have clearly spent a great deal of time second-guessing possible reversing moves, a ‘cheaper’ option is to detonate the malware and observe the strings as they are decrypted in memory. Of course, to do that, you might need to defeat some anti-analysis and anti-debugging tricks first!
In our third case study, then, we’re going to take a look at a common adware installer. Adware is big business, employs lots of professional coders, and produces code that is every bit as crafty as any sophisticated malware you’re likely to come across. If you spend any time dealing with infected Macs, coming across adware is inevitable, so knowing how to deal with it is essential.
Let’s dump this into r2 and see what a quick triage can tell us.
This sample is keeping its secrets
Well, not much! If we print the disassembly for the main function with pdf @main, we see a mass of obfuscated code.
Lots of obfuscated code in this adware installer
However, the only calls here are to system and remove, as we saw from the function list. Let’s quit and reopen in r2’s debugger mode (remember: you may need to chmod the sample and remove any code signature and extended attributes as explained here).
Let’s find the entrypoint with the ie command. We’ll set a breakpoint on that and then execute to that point.
Breaking on the entrypoint
Now that we’re at main, let’s break on the system call and take a look at the registers. To do that, first get the address of the system flag with
> f~system
Then set the breakpoint on the address returned with the db command. We can continue execution with dc.
Setting a breakpoint on the system call and continuing execution
Note that in the image above, our first attempt to continue execution results in a warning message and we actually hit our main breakpoint again. If this happens, repeating the dc command should get you past the warning. Now we can look at all the registers with drr.
Revealing the encoded strings in memory
At the rdi register, we can see the beginning of the decrypted string. Let’s see the rest of it.
The clear text is revealed in the rdi register
Ah, an encoded shell script, typical of Bundlore and Shlayer malware. One of my favorite things about r2 is how you can do a lot of otherwise complex things very easily thanks to the shell integration. Want to pretty-print that script? Just pipe the same command through sed from right within r2.
> ps 2048 @rdi | sed 's/;/\n/g'
We can easily format the output by piping it through the sed utility
More Examples of macOS String Decryption Techniques
WizardUpdate and macOS.ZuRu provided us with some real-world malware samples where we could use the same general technique: identify the encryption algorithm in the functions table, search for and isolate the key and ciphers in the strings, and then find or implement an appropriate decoding algorithm.
Some malware authors, however, will implement custom encryption and decryption schemes and you’ll have to look more closely at the code to see how the decryption routine works. Alternatively, where necessary, we can detonate the code, jump over any anti-analysis techniques and read the decrypted strings directly from memory.
If all this has piqued your interest in string encryption techniques used in macOS malware, then you might like to check out some or all of the following for further study.
EvilQuest, which we looked at in the previous post, is one example of malware that uses a custom encryption and decryption algorithm. SentinelLabs broke the encryption statically, and then created a tool based on the malware’s own decryption algorithm to decrypt any files locked by the malware. Fellow macOS researcher Scott Knight also published his Python decryption routine for EvilQuest, which is worth close study.
Adload is another malware that uses a custom encryption scheme, and for which researchers at Confiant also published decryption code.
Notorious adware dropper platforms Bundlore and Shlayer use a complex and varying set of shell obfuscation techniques which are simple enough to decode but interesting in their own right.
Likewise, XCodeSpy uses a simple but quite effective shell obfuscation trick to hide its strings from simple search tools and regex pattern matches.
Conclusion
In this post, we’ve looked at a variety of different encryption techniques used by macOS malware and how we can tackle these challenges both statically and dynamically. If you haven’t checked out the previous posts in this series, have a look Part 1 and Part 2. I hope you’ll join us for the next post in this series as we continue to look at common challenges facing macOS malware researchers.
In this second post in our series on intermediate to advanced macOS malware reversing, we start our journey into tackling common challenges when dealing with macOS malware samples. Last time out, we took a look at how to use radare2 for rapid triage, and we’ll continue using r2 as we move through these various challenges. Along the way, we’ll pick up tips on both how to beat obstacles put in place by malware authors and how to use r2 more productively.
Although we can achieve a lot from static analysis, sometimes it can be more efficient to execute the malware in a controlled environment and conduct dynamic analysis. Malware authors, however, may have other ideas and can set up various roadblocks to stop us doing exactly that. Consequently, one of the first challenges we often have to overcome is working around these attempts to prevent execution in our safe environment.
In this post, we’ll look at how to circumvent the malware author’s control flow to avoid executing unwanted parts of their code, learning along the way how to take advantage of some nice features of the r2 debugger! We’ll be looking at a sample of EvilQuest (password: infect3d), so fire up your VM and download it before reading on.
A note for the unwary: if you’re using Safari in your VM to download the file and you see “decompression failed”, go to Safari Preferences and turn off the ‘Open “safe” files after downloading’ option in the General tab and try the download again.
Getting Started With the radare2 Debugger
Our sample hit the headlines in July 2020, largely because at first glance it appeared to be a rare example of macOS ransomware. SentinelLabs quickly analyzed it and produced a decryptor to help any potential victims, but it turned out the malware was not very effective in the wild.
It may well have been a PoC, or a project still in early development stages, as the code and functionality have the look and feel of someone experimenting with how to achieve various attacker objectives. However, that’s all good news for us, as EvilQuest implements several anti-analysis features that will serve us as good practice.
The first thing you will want to do is remove any extended attributes and codesigning if the sample has a revoked signature. In this case, the sample isn’t signed at all, but if it were we could use:
% sudo codesign --remove-signature <path to bundle or file>
If we need the sample to be codesigned for execution, we can also sign it (remember your VM needs to have installed the Xcode command line tools via xcode-select --install) with:
% sudo codesign -fs - <path to bundle or file> --deep
We’ll remove the extended attributes to bypass Gatekeeper and Notarization checks with
% xattr -rc <path to bundle or file>
And we’ll attempt to attach to the radare2 debugger by adding the -d switch to our initialization command:
% r2 -AA -d patch
Unfortunately, our first attempt doesn’t go well. We already removed the extended attributes and codesigning isn’t the issue here, but the radare2 debugger fails to attach.
Failing to attach the debugger.
That ptrace: Cannot Attach: Invalid argument looks ominous, but actually the error message is misleading. The problem is that we need elevated privileges to debug, so a simple sudo should get us past our current obstacle.
The debugger needs elevated privileges
Yay, attach success! Let’s take a look around before we start diving further into the debugger.
A Faster Way of Finding XREFS and Interesting Code
Let’s run afll as we did when analyzing OSX.Calisto previously, but this time we’ll output the function list to file so that we can sort it and search it more conveniently without having to keep running the command or scrolling up in the Terminal window.
> afll > functions.txt
Looking through our text file, we can see there are a number of function names that could be related to some kind of anti-analysis.
Some of EvilQuest’s suspected anti-analysis functions
We can see that some of these only have a single cross-reference, and if we dig into them using the axt command, we see that the cross-reference (XREF) for the is_virtual_mchn function happens to be main(), so that looks like a good place to start.
Getting help on radare2’s axt command
> axt sym._is_virtual_mchn
main 0x10000be5f [CALL] sym._is_virtual_mchn
Many commands in r2 support tab expansion
Here’s a useful power trick for those already comfortable with r2. You can run any command in a for-each loop using @@. For example, with
axt @@f:<search term>
we can get the XREFS to any function containing the search term in one go.
In this case I tell r2 to give me the XREFS for every function that contains “_is_”. Then I do the same with “get”. Try @@? to see more examples of what you can do with @@.
Using a for-each in radare2
Since we see that is_virtual_mchn is called in main, we should start by disassembling the entire main function to see what’s going on, but first I’m going to change the r2 color theme to something a bit more reader-friendly with the eco command (try eco and hit the tab key to see a list of available themes).
> eco focus
> pdf @ main
Visual Graph Mode and Renaming Functions with Radare2
As we scroll back up to the beginning of the function, we can see the disassembly provides pretty interesting reading. At the beginning of main, we can see some unnamed functions are called. We’re going to jump into Visual Graph mode and start renaming code as this will give us a good idea of the malware’s execution flow and indicate what we need to do to beat the anti-analysis.
Hit VV to enter Visual Graph mode. I will try to walk you through the commands, but if you get lost at any point, don’t feel bad. It happens to us all and is part of the r2 learning curve! You can just quit out and start again if needs be (part of the beauty of r2’s speed; you can also save your project: type uppercase P? to see project options).
I prefer to view the graph as a horizontal, left-to-right flow; you can toggle between horizontal and vertical by pressing the @ key.
Viewing the sample’s visual graph horizontally
Here’s a quick summary of some useful commands (there are many more as you’ll see if you play around):
hjkl(arrow keys) – move the graph around
-/+0 – reduce, enlarge, return to default size
‘ – toggle graph comments
tab/shift-tab – move to next/previous function
dr – rename function
q – back to visual mode
t/f – follow the true/false execution chain
u – go back
? – help/available options
Hit ‘ once or twice to make sure graph comments are on.
Use the tab key to move to the first function after main() (the border will be highlighted), where we can see an unnamed function and a reference in square brackets that begins with the letter ‘o’ (for example, [ob], though it may be different in your sample). Type the letters (without the square brackets) to go to that function. Type p to rotate between different display modes till you see something similar to the next image.
As we can see, this function call is actually a call to the standard C library function strcmp(), so let’s rename it.
Type dr and at the prompt type in the name you want to use and hit ‘enter’. Unsurprisingly, I’m going to call it strcmp.
To return to the main graph, type u and you should see that all references to that previously unnamed function now show strcmp, making things much clearer.
If you scroll through the graph (hjkl, remember) you will see many other unnamed functions that, once you explore them in the same way, are just relocations of standard C library calls such as exit, time, sleep, printf, malloc, srandom and more. I suggest you repeat the above exercise and rename as many as you can. This will both make the malware’s behaviour easier to understand and build up some valuable muscle-memory for working in r2!
Beating Anti-Analysis Without Patching
There are two approaches you can take to interrupt a program's designed logic. One is to identify functions you want to avoid and patch the binary statically. This is fairly easy to do in r2, and there are quite a few tutorials on patching binaries already out there. We're not going to look at patching today because our entire objective is to run the sample dynamically, so we might as well interact with the program dynamically as well. Patching is really only worth considering if you need to create a sample for repeated use that avoids some kind of unwanted behaviour.
We have two easy options for affecting control flow dynamically. We can either execute the function but manipulate the returned value (for example, putting 0 in rax instead of 1), or skip execution of the function altogether.
We’ll see just how easy it is to do each of these, but we should first think about the different consequences of each choice based on the malware we’re dealing with.
If we NOP a function or skip over it, we’re going to lose any behaviour or memory states invoked by that function. If the function doesn’t do anything that affects the state of our program later on, this can be a good choice.
By the same token, if we execute the function but manipulate the value it returns, we may be allowing execution of code buried in that function that might trip us up. For example, if our function contains jumps to subroutines that do further anti-analysis tests, then we might get blocked before the parent function even returns, so this strategy wouldn’t help us. Clearly then, we need to take a look around the code to figure out which is the best strategy in each particular case.
Let’s take a look inside the _is_virtual_mchn function to see what it would do and work out our strategy.
If you’re still in Visual Graph mode, hit q to get back to the r2 prompt. Regardless of where you are, you can disassemble a function with pdf and the @ symbol and provide a flag or address. Remember, you can also use tab expansion to get a list of possible symbols.
It seems this function subtracts the sleep interval from the second timestamp, then compares it against the first timestamp. Jumping back out to how this result is consumed in main, it seems that if the result is not ‘0’, the malware calls exit() with ‘-1’.
The is_virtual_mchn function causes the malware to exit unless it returns ‘0’
The function appears to be somewhat misnamed as we don't see the kind of tests that we would normally expect for VM detection. In fact, it looks like an attempt to evade automated sandboxes that patch the sleep function, and we're not likely to fall foul of it just by executing in our VM. However, we can also see that the next function, user_info, also exits if it doesn't return the expected value, so let's practice both techniques discussed above so that we can use the debugger whichever one a given sample calls for.
Manipulating Execution with the radare2 Debugger
If you are at the command prompt, type Vp to go into radare2 visual mode (yup, this is another mode, and not the last!).
The Visual Debugger in radare2
Ooh, this is nice! We get registers at the top, and source code underneath. The current line where we’re stopped in the debugger is highlighted. If you don’t see that, hit uppercase S once (i.e., shift-s), which steps over one source line, and – in case you lose your way – also brings you back to the debugger view.
Let’s step smartly through the source with repeated uppercase S commands (by the way, in visual mode, lowercase ‘s’ steps in, whereas uppercase ‘S’ steps over). After a dozen or so rapid step overs, you should find yourself inside this familiar code, which is main().
main() in Visual Debugger mode
Note the highlighted dword, which holds the value of argc. It should be '2', but we can see from the register above that rdi is only 1. The code will jump over the next function call; if you hit the '1' key you can inspect it (hit u to come back) and see that it is a string comparison. Let's continue stepping over and let the jump happen, as it doesn't appear to block us. We'll stop just short of the is_virtual_mchn function.
Seek and break locations are two different things!
We know from our earlier discussion what’s going to happen here, so let’s see how to take each of our options.
The first thing to note is that although the highlighted address is where the debugger is, that’s not where you are if you enter an r2 command prompt, unless it’s a debugger command. To see what I mean, hit the colon key to enter the command line.
From there, print out one line of disassembly with this command:
> pd 1
Note that the line printed out is r2’s current seek position, shown at the top of the visual view. This is good. It means you can move around the program, seek to other functions and run other r2 commands without disturbing the debugger.
On the other hand, if you execute a debugger command on the command line it will operate on the source code where the debugger is currently parked, not on the current seek at the top of your view (unless they happen to be the same).
OK, let’s entirely skip execution of the _is_virtual_mchn function by entering the command line with : and then:
> dss 2
Hit 'return' twice. As you can see, the dss command skips the number of instructions specified by the integer you gave it, making it a very easy way to bypass unwanted code execution!
Alternatively, if we want to execute the function then manipulate the register, stop the debugger on the line where the register is compared, and enter the command line again. This time, we can use dr to both inspect and write values to our chosen register.
> dr eax // see eax’s current value
> dr eax = 0 // set eax to 0
> drr // view all the registers
> dro // see the previous values of the registers
Viewing and changing register values
And that, pretty much, is all you need to defeat anti-analysis code in terms of manipulating execution. Of course, the fun part is finding the code you need to manipulate, which is why we spent some time learning how to move around in radare2 in both visual graph mode and visual mode. Remember that in either mode you can get back to the regular command prompt by hitting q. As a bonus, you might play around with hitting p and tab when in the visual modes.
At this point, what I suggest you do is go back to the list of functions we identified at the beginning of the post and see what they do, and whether it’s best to skip them or modify their return values (or whether either option will do). You might want to look up the built-in help for listing and setting breakpoints (from a command prompt, try db?) to move quickly through the code. By the time you’ve done this a few times, you’ll be feeling pretty comfortable about tackling other samples in radare2’s debugger.
Conclusion
If you’re starting to see the potential power of r2, I strongly suggest you read the free online radare2 book, which will be well worth investing the time in. By now you should be starting to get the feel of r2 and exploring more on your own with the help of the ? and other resources. As we go into further challenges, we’ll be spending less time going over the r2 basics and digging more into the actual malware code.
In the next part of our series, we’re going to start looking at one of the major challenges in reversing macOS malware that you are bound to face on a regular basis: dealing with encrypted and obfuscated strings. I hope you’ll join us there and practice your r2 skills in the meantime!
In our previous foray into macOS malware reverse engineering, we guided those new to the field through the basics of static and dynamic analysis using nothing other than native tools such as strings, otool and lldb. In this new series of posts, we move into intermediate and more advanced techniques, introducing you to further tools and covering a wide range of real-world malware samples from commodity adware to trojans, backdoors, and spyware used by APT actors such as Lazarus and OceanLotus. We’ll walk through problems such as beating anti-analysis and sandbox checks, reversing encrypted strings, intercepting C2 comms and more.
We kick off with a walk-through on how to rapidly triage a new sample. Analysts are busy people, and the majority of malware samples you have to deal with are neither that interesting nor that complicated. We don’t want to get stuck in the weeds reversing lots of unnecessary code only to find out that the sample really wasn’t worth that much effort!
Ideally, we want to get a sample “triaged” in just a few minutes, where “triage” means that we understand the basics of the malware’s behavior and objectives, collecting just enough data to be able to effectively hunt for related samples and detect them in our environments. For those rarer samples that pique our interest and look like they need deeper analysis, we want our triage session to give an overall profile of the sample and indicate areas for further investigation.
Why Use radare2 (r2) for macOS Malware Analysis?
For rapid triage, my preferred tool is radare2 (aka r2). There are many introductory blogs on installing and using r2, and I’m not going to cover that material here. Such posts will serve you well in terms of learning your way around the basics of installing and using the tool if it’s completely new to you.
However, most such posts are aimed at CTF/crackme readers and typically showcase simple ELF or PE binaries. Very few are aimed at malware analysts, and even fewer still are aimed at macOS malware analysts, so they are not much use to us from a practical point of view. I’m going to assume that you’ve read at least one or two basic intro r2 posts before starting on the material below. For a rare example of r2 introductory material using Mach-O samples (albeit not malware), I recommend having a look at these two helpful posts: 1, 2.
Before we dive in, I do want to say a little bit about why r2 is a good choice for macOS malware analysis, as I expect at least some readers are likely already familiar with other tools such as IDA, Ghidra and perhaps even Hopper, and may be asking that question from the outset.
Radare2 is an extremely powerful and customizable reversing platform, and – at least the way I use it – a great deal of that power comes from the very feature that puts some people off: it’s a command line tool rather than a GUI tool.
Because of that, r2 is very fast, lightweight, and stable. You can install and run it very quickly in a new VM without having to worry about dependencies or licensing (the latter, because it’s free) and it’s much less likely (in my experience) to crash on you or corrupt a file or refuse to start. And as we’ll see in the tips below, you can triage a binary with it very quickly indeed!
Moreover, because it’s a command line tool, it integrates very easily with other command line tools that you are likely familiar with, including things like grep, awk, diff and so on. Other tools typically require you to develop separate scripts in python or Java to do various tailored tasks, but with r2 you can often accomplish the same just by piping output through familiar command line tools (we’ll be looking at some examples of doing that below).
Finally, because r2 is free, multi-platform and runs on pretty much anything at all that can run a terminal emulator, learning how to reverse with r2 is a transferable skill you can take advantage of anywhere.
Enough of the hard sell, let’s get down to triaging some malware! For this post, we’re going to look at a malware sample called OSX.Calisto. Be sure to set up an isolated VM, download the sample from here (password:infect3d) and install r2.
Then, let’s get started!
1. Fun with Functions, Calls, XREFS and More
Our sample, OSX.Calisto, is a backdoor that tries to exfiltrate the user’s keychain, username and clear text copy of the login password. The first tip about using r2 quickly is to load your sample with the -AA option, like so:
% r2 -AA calisto
Load and analyse macOS malware sample with radare2
This performs the same analysis as loading the file and then running aaa from within r2. It’s not only faster to do it in one step, it also cuts out the possibility of forgetting to run the analysis command after loading the binary.
Now that our Calisto sample is loaded and analysed, the first thing that we should do is list all the functions in verbose mode with afll. What is particularly useful about this command is that it gives a great overview of the malware. Not only can we see all the function calls, we can see which are imports, which are dead code, which are making the most system calls, which take the most (or least) arguments, how many variables each declares and more. From here, we are in a very good position to see both what the malware does and where it does it.
List all functions, displaying stats for calls, locals, args, and xrefs for each
Even from just the top of that list, we can see that this malware makes a lot of calls to NSUserName. Typically, though, we will want to sort that table. Although r2 has an internal function for sorting the function table (aflt), I have not found the output to be reliable.
Fortunately, there is another way, which will introduce us to a more general “power feature” of r2. This is to pipe the output of afll through awk and sort. Say, for example, we would like to sort only select columns (we don’t want all that noisy data!):
Here we pipe the output through awk to select only the columns we want, then sort on the calls column. The -n option makes the sort numerical, and -r reverses it.
Function table sorted by calls
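Since the screenshot doesn't reproduce here, the sketch below shows what that pipeline might look like. The exact field positions in afll output vary between r2 versions, so treat the column numbers in the comment as assumptions and check your own afll header first; the block simulates a few afll-style rows (function name, call count) so the awk/sort stage can be run anywhere.

```shell
# Inside r2 the pipeline might look like this (field numbers are an assumption):
#   afll | awk '{print $NF, $10}' | sort -k 2 -n
# Below we simulate three afll-style rows to demo the awk/sort stage itself.
sorted=$(printf '%s\n' \
  'sym.imp.NSUserName 0' \
  'sym.func.100005620 131' \
  'entry0 12' |
  awk '{print $1, $2}' |   # keep just the columns we care about
  sort -k 2 -n)            # numeric sort on the calls column; add -r to reverse
echo "$sorted"
```

Because the sort runs in an external pipe, the same trick works for any r2 table output, not just afll.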
Note that we never left r2 throughout this whole process, making the whole thing extremely convenient. If we wanted to do the same and output the results to file, just do that as you would normally on the command line with a > <path_to_file>.
2. Quickly Dive Into a Function’s Calls
Having found something of interest, we will naturally want to take a quick look at it to see if our hunch is right. We can do that rapidly in a couple of ways as the next few tips will show.
Normally, from that function table, it would make sense to look for functions that have a particular profile such as lots of calls, args, and/or xrefs, and then look at those particular functions in more detail.
Back in our Calisto example, we noted there was one function that had a lot of calls: sym.func.100005620, but we don’t necessarily want to spend time looking at that function if those calls aren’t doing anything interesting.
We can get a look at what calls a function makes very quickly just by typing in a variant of the afll command, aflm. You might want to just punch that in and see what it outputs.
aflm
Yeah, useful, but overwhelming! As we noted in the previous section, we can easily filter things with command line tools while still in r2, so we could pipe that output to grep. But how many lines should we grep after the pattern? For example, if you try
aflm | grep -A 100 5620:
You’ll shoot way over target, because although there may be more calls in that function, aflm only lists each unique call. A better way is to pipe through sed and tell sed to stop piping when it hits another colon (signalling another function listing).
aflm | sed -n '/5620:/,/:/p'
The above command tells sed to start at the line matching /5620:/ and keep printing (the "/,/" defines a range) until it reaches the next line matching /:/, i.e., the heading of the next function listing. The -n flag suppresses sed's default output, and the trailing p prints each line in the range.
You’ll get an output like this:
Sorting output from radare2
Awesome! Now we can see all the calls that this huge function makes. From that alone we can infer that this function appears to grab the User name, does some string searching, possibly builds an array out of what it finds, and then uploads some data to a remote server! And we haven’t even done any disassembly yet!
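To see the sed range trick in isolation, here is a self-contained mock-up. The aflm output format is approximated (function headings ending with a colon, unique calls indented beneath), so treat the exact layout as an assumption and compare against your own aflm output.

```shell
# Mock aflm-style listing: each function heading ends with ':',
# its unique calls are indented beneath it (format approximated).
cat > /tmp/aflm.txt <<'EOF'
sym.func.100001111:
   sym.imp.printf
sym.func.100005620:
   sym.imp.NSUserName
   sym.imp.strcmp
sym.func.100009999:
   sym.imp.exit
EOF
# Print from the 5620 heading up to (and including) the next heading:
section=$(sed -n '/5620:/,/:/p' /tmp/aflm.txt)
echo "$section"
```

Note that the range ends on the next line containing a colon, so the heading of the following function is printed too; that trailing line is your cue that the listing is complete.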
3. Strings on Steroids
At this point, we might want to go back to the function table and repeat the above steps on a few different functions, but we also have another option. Having seen that NSUserName is called on multiple occasions, we might want to look more closely at how the malware is interacting with the user. As we explained in our previous guide on reversing macOS malware, extracting strings from a binary can give you a very good insight into what the malware is up to, so much so that some malware authors take great efforts to obfuscate and encrypt the binary’s strings (something we’ll be looking at in a later post). Fortunately, the author of Calisto wasn’t one of those. Let’s see how we can use r2 to help us with string analysis.
The main command for dumping strings is
izz
However, that dump isn’t pretty and doesn’t make for easy analysis. Fortunately, there’s a much nicer way to look at and filter strings in radare2. Let’s try this instead:
izz~...
The tilde is r2’s internal “grep” command, but more importantly the three periods pipe the string dump into a “HUD” (Heads Up Display) from where we can type filter characters. For example, after issuing the above command, type a single “/” to reveal all strings (like paths and URLs, for example) containing a forward slash. Backspace to clear that and try other filters in turn like “http” and “user”. As the images below show, we quickly hit pay dirt!
Filtering strings in radare2
The first image above looks like a lead on the malware’s C2 addresses, while the second shows us what looks very much like a path the malware is going to write data to. Both of these are ideal for our IoCs and for hunting, subject to further confirmation.
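The HUD is interactive, but the same filtering can be reproduced non-interactively with r2's internal grep (e.g., izz~http), or, outside r2, with ordinary grep over a strings dump. In the sketch below, the non-URL strings are generic stand-ins for a real izz dump; only the URL comes from the sample discussed here.

```shell
# Stand-in for an izz strings dump; only the URL is from the actual sample.
dump='launchctl load
http://40.87.56.192/calisto/upload.php?username=
/usr/bin/whoami'
# The same filter you would type into the HUD, as a plain grep:
hits=$(echo "$dump" | grep 'http')
echo "$hits"
```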
4. Fast Seek and Disassembly
What we’ve found after just a few short commands and a couple of minutes of triaging our binary is very promising. Let’s see if we can dig a little deeper. Our output from the HUD gives us the addresses of all those strings. Let’s take a look at the address for what looks like uploading exfiltrated data to a C2:
http://40.87.56.192/calisto/upload.php?username="
From the output, we can see that this string is referenced at 0x1000128d0. Let’s go to that address and see what we have. First, double-click the address to select it then copy it with Cmd-C. To escape the HUD, hit ‘return’ so that you are returned to the r2 prompt.
Next, we’ll invoke the ‘seek’ command, which is simply the letter s, and paste the address after it. Hit ‘return’. Type pd (print disassembly) and scroll up in your Terminal window to get to the start of the disassembly.
Seeking in radare2
The disassembly shows us where the string is called via the xref at the top. Let’s again select and Cmd-C that address and do another seek. After the seek, this time we’ll do pdf.
Disassembling a function in radare2
The difference is that pdf will disassemble an entire function, no matter how long it is. On the other hand, pd will disassemble a given number of instructions. Thus, it’s good to know both. You can’t use pdf from an address that isn’t a function, and sometimes you want to just disassemble a limited number of instructions: this is where pd comes in handy. However, when what you want is a complete function’s disassembly, pdf is your friend.
The pdf command gives you exactly what you’d expect from a disassembler, and if you’ve done any reversing before or even just read some r2 intros as suggested above, you’ll recognize this output (as pretty much all r2 intros start with pdf!). In any case, from here you can get a pretty good overview of what the function does, and r2 is nicer than some other disassemblers in that things like stack strings are shown by default.
You might also like to experiment with pdc, which produces a rough pseudocode output. Rendering disassembly as good pseudocode is, it has to be said, one of r2's weak points, but pdc can sometimes help you focus.
Finally, before we move on to the next tip, I’m just going to give you a variation on something we mentioned above that I often like to do with pdf, which is to grep the calls out of it. This is particularly useful for really big functions. In other words, try
pdf~call
for a quick look at the calls in a given function. You can also get r2 to give you a summary of a function with pds.
5. Rabin2 | Master of Binary Info Extraction
When we discussed strings, I mentioned the izz command, which is a child of the iz command, which in turn is a child of r2’s i command. As you might have guessed, i stands for information, and the various incantations of i are all very useful while you’re in the middle of analysis (if you happen to forget what file you are analyzing, i~file is your friend!).
Some of the useful variants of the i command are as follows:
get file metadata [i]
look at what libraries it imports [ii]
look at what strings it contains [iz]
look at what classes/functions/methods it contains [icc]
find the entrypoint [ie]
However, for rapid triage, there is a much better way to get a bird’s eye view of everything there is to know about a file. When you installed r2, you also installed a bunch of other utilities that r2 makes use of but which you can call independently. Perhaps the most useful of these is rabin2. In a new Terminal window, try man rabin2 to see its options.
While we can take advantage of rabin2’s power via the i command in r2, we can get more juice out of it by opening a separate Terminal window and calling rabin2 directly on our malware sample. For our purposes, focused as we are in this post on rapid triage, the only rabin2 option we need to know is:
% rabin2 -g <path_to_binary>
Triaging macOS malware with rabin2
The -g option outputs everything there is to know about the file, including strings, symbols, sections, imports, and details such as whether the file is stripped, what language it was written in, and so on. It is essentially all of the options of r2's i command rolled into one (if it's possible to make r2 punch out all of that in one command, I'm not aware of how).
Strangely, one of the best outputs from rabin2 is when its -g option outputs almost nothing at all! That tells you that you are almost certainly dealing with packed malware, and that in itself is a great guide on where to go next in your investigation (we’ll be looking at packed files in a later post).
Meanwhile, it’s time to introduce our last rapid analysis pro trick, Visual Graph mode!
6. Visual Graph Mode
For those of you used to a GUI disassembler, if you’ve followed this far you may well be thinking… “ahuh…but how do I get a function call graph from a command line tool?” A graph is often a make or break deal when trying to triage malware rapidly, and a tool that doesn’t have one is probably not going to win many friends. Fortunately, r2 has you covered!
Returning to our r2 prompt, type VV to enter visual graph mode.
radare2 graph mode
Visual graph mode is super useful for being able to trace logic paths through a malware sample and to see which paths are worth further investigation. I will readily admit that learning your way around the navigation options takes some practice. However, it is an extremely useful tool and one which I frequently return to with samples that attempt to obstruct analysis.
The options for using Visual Graph mode are nicely laid out in this post here. Once you learn your way around, it’s relatively simple and powerful, but it’s also easy to get lost when you’re first starting out. Like Vi and Vim, inexperienced users can sometimes find themselves trapped in an endless world of error beeps with r2’s Visual Graph mode. However, as with all things in r2, whenever you find yourself “stuck”, hit q on the keyboard (repeatedly, if needs be). If you find yourself needing help, hit ?.
I highly recommend that you experiment with the Calisto sample to familiarize yourself with how it works. In the next post, we’ll be looking in more detail at how Visual Graph mode can help us when we tackle anti-analysis measures, so give yourself a heads up by playing around with it in the meantime.
Conclusion
In this post, we’ve looked at how to use radare2 to quickly triage macOS malware samples, seen how it can easily be integrated with other command line tools most malware analysts are already familiar with, and caught a glimpse of its visual graph mode.
There’s much more to learn about radare2 and macOS malware, and while we hope you’ve enjoyed the tips we’ve shared here, there’s many more ways to use this amazing tool to achieve your aims in reversing macOS malware. We hope you’ll join us in the next post in this series as we continue our exploration of intermediate and advanced macOS malware analysis techniques.
AdLoad is one of several widespread adware and bundleware loaders currently afflicting macOS.
In late 2019, SentinelLabs described how AdLoad was continuing to adapt and evade detection.
This year we have seen over 150 unique samples, part of a new campaign, that remain undetected by Apple's on-device malware scanner.
Some of these samples are known to have been blessed by Apple's notarization service.
We describe the infection pattern and detail the indicators of compromise for the first time.
Introduction
AdLoad is one of several widespread adware and bundleware loaders currently afflicting macOS. AdLoad is certainly no newcomer to the macOS malware party. In late 2019, SentinelLabs described how AdLoad was continuing to adapt and evade detection, and this year we have seen another iteration that continues to impact Mac users who rely solely on Apple’s built-in security control XProtect for malware detection.
In this post, we detail one of several new AdLoad campaigns we are currently tracking that remain undetected by Apple’s macOS malware scanner. We describe the infection pattern and indicators of compromise for the first time and hope this information will help others to detect and remove this threat.
AdLoad | Staying One Step Ahead of Apple
AdLoad has been around since at least 2017, and when we previously reported on it in 2019, Apple had some partial protection against its earlier variants. Alas, at that time the 2019 variant was undetected by XProtect.
As of today, however, XProtect arguably has around 11 different signatures for AdLoad (it is ‘arguable’ because Apple uses non-industry standard names for its signature rules). As best as we can track Apple’s rule names to common vendor names, the following XProtect rules appear to be all partially or wholly related to AdLoad variants:
Signatures for AdLoad variants in XProtect
The good news for those without additional security protection is that the previous variant we reported in 2019 is now detected by XProtect, via rule 22d71e9.
An earlier AdLoad variant reported by SentinelLabs is now detected by XProtect
The bad news is the variant used in this new campaign is undetected by any of those rules. Let’s see what’s changed.
AdLoad 2021 Campaign | ‘System’ and ‘Service’
Both the 2019 and 2021 variants of AdLoad used persistence and executable names that followed a consistent pattern. In 2019, that pattern included some combination of the words “Search” , “Result” and “Daemon”, as in the example shown above: “ElementarySignalSearchDaemon”. Many other examples can be found here.
The 2021 variant uses a different pattern that primarily relies on a file extension that is either .system or .service. Which file extension is used depends on the location of the dropped persistence file and executable as described below, but typically both .system and .service files will be found on the same infected device if the user gave privileges to the installer.
With or without privileges, AdLoad will install a persistence agent in the user’s Library LaunchAgents folder with patterns such as:
To date, we have found around 50 unique label patterns, with each one having both a .service and a .system version. Based on our previous understanding of AdLoad, we expect there to be many more.
When the user logs in, the AdLoad persistence agent will execute a binary hidden in the same user’s ~/Library/Application Support/ folder. That binary follows another deterministic pattern, whereby the child folder in Application Support is prepended with a period and a random string of digits. Within that directory is another directory called /Services/, which in turn contains a minimal application bundle having the same name as the LaunchAgent label. That barebones bundle contains an executable with the same name but without the com. prefix. For example:
Indicators of compromise in the User’s Library Application Support folder
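A quick way to hunt for that directory pattern is a find sweep. The sketch below creates a fake layout first (the digit string is fabricated) so the expression can be tested anywhere; on a real host you would point GNU find at "$HOME/Library/Application Support" instead.

```shell
# Fake layout imitating the pattern: a hidden, digits-only directory
# containing a Services folder (the digit string here is fabricated).
demo=/tmp/as-demo
mkdir -p "$demo/.1234567890123456789/Services" "$demo/NormalApp"
# Hunt for the pattern; on a real host, replace $demo with
# "$HOME/Library/Application Support".
hits=$(find "$demo" -maxdepth 2 -type d -regex '.*/\.[0-9][0-9]*/Services')
echo "$hits"
```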
A hidden tracker file called .logg, containing only a UUID string, is also dropped in the Application Support folder. Despite the location, if the dropper has been granted privileges, the tracker file is owned by root rather than the user.
The hidden tracker file in the User’s Library Application Support folder
Further, assuming the user supplied admin privileges as requested by the installer, another persistence mechanism is written to the domain /Library/LaunchDaemons/ folder. This plist file uses the file extension .system, and the corresponding folder in the hidden Application Support folder is also named /System/ instead of /Services/.
Indicators of compromise in the Domain Library Application Support folder
The LaunchDaemon is dropped with one of a number of pre-determined labels that mirrors the label used in the LaunchAgent, such as:
The persistence plists themselves pass different arguments to the executables they launch. For the system daemon, the first argument is -t and the second is the plist label. For the user persistence agent, the arguments -s and 6600 are passed to the first and second parameters, respectively.
AdLoad 2021 macOS persistence pattern
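The naming convention itself makes a usable indicator. The sketch below sets up a fake LaunchAgents folder with a fabricated label so it can run anywhere; on a real host you would point the grep at ~/Library/LaunchAgents and /Library/LaunchDaemons.

```shell
# Fake LaunchAgents folder: one fabricated AdLoad-style label, one benign plist.
agents=/tmp/la-demo
mkdir -p "$agents"
touch "$agents/com.ExampleUpdater.service.plist"   # fabricated label
touch "$agents/com.vendor.legit.plist"
# Flag anything matching the .system/.service persistence pattern:
hits=$(ls "$agents" | grep -E '\.(system|service)\.plist$')
echo "$hits"
```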
Interestingly, the droppers for this campaign share the same pattern as Bundlore/Shlayer droppers. They use a fake Player.app mounted in a DMG. Many are signed with a valid signature; in some cases, they have even been known to be notarized.
Like much other adware, AdLoad makes use of a fake Player.app to install malware
Typically, we observe that developer certificates used to sign the droppers are revoked by Apple within a matter of days (sometimes hours) of samples being observed on VirusTotal, offering some belated and temporary protection against further infections by those particular signed samples by means of Gatekeeper and OCSP signature checks. Also typically, we see new samples signed with fresh certificates appearing within a matter of hours and days. Truly, it is a game of whack-a-mole.
The droppers we have seen take the form of a lightly obfuscated Zsh script that decompresses a number of times before finally executing the malware out of the /tmp directory (for a discussion of how to deobfuscate such scripts, see here).
The dropper executes a shell script obfuscated several times over
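To make the pattern concrete, here is a toy stand-in, entirely fabricated and far simpler than the real droppers (which nest several compression layers): a payload is carried as base64-wrapped gzip data and unwrapped at runtime before being handed to the shell.

```shell
# Build a toy "obfuscated" payload: a harmless command, gzipped then base64-encoded.
payload=$(printf 'echo infected' | gzip -c | base64)
# A dropper would embed $payload inline and reverse the wrapping at runtime:
result=$(echo "$payload" | base64 -d | gunzip | sh)
echo "$result"
```

Deobfuscation is simply the same pipeline run up to, but not including, the final sh, which is why dumping each intermediate stage to disk is the standard triage approach.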
The final payload is not codesigned and isn’t known to the current version of Apple’s XProtect, v2149.
The malware executes out of /tmp/ and is neither codesigned nor known to XProtect
Once infection is complete, the adware pops the following page in the user's default browser
How New Is This Variant of AdLoad?
In our investigation, we found over 220 samples of this adware variant on VirusTotal, in both packed and unpacked form. At least 150 of these are unique. Interestingly, a lone sample of this variant was documented by analysts at Confiant, who described the malware’s string decryption routine in a post published on June 3rd, 2021. According to these researchers, the sample they observed had been notarized by Apple.
We note that across our corpus, all samples from November 2020 to August 2021 use the same or similar string decryption routine as that described by Confiant. Similarly, the earlier researchers’ sample, “MapperState.system” conforms to the AdLoad naming pattern that we observed and described above. Both these indicators definitively link our findings with theirs.
AdLoad binaries use a great deal of obfuscation, including custom string encryption
Three different samples, all using a similar string encryption routine
Our research showed that samples began to appear at least as early as November 2020, with regular further occurrences across the first half of 2021. However, there appears to have been a sharp uptick throughout July and in particular the early weeks of August 2021.
It certainly seems possible that the malware developers are taking advantage of the gap in XProtect, which itself has not been updated since a few weeks after Confiant’s research over two months ago. At the time of writing, XProtect was last updated to version 2149 around June 15th – 18th.
Version 2149 is the most recent version of Apple’s XProtect as of August 11th
None of the samples we found are known to XProtect since they do not match any of the scanner’s current set of AdLoad rules.
Running XProtect v2149 against 221 known samples shows no detections
However, there is reasonably good detection across a variety of different vendor engines used by VirusTotal for all the same samples that XProtect doesn’t detect.
All the samples are detected by various VT vendor engines
On our test machine, we set the policy of the SentinelOne Agent to “Detect only” in order to allow the malware to execute and observe its behaviour. In the Management console, the behavioral detection is mapped to the relevant MITRE indicators.
Behavioral Indicators from the SentinelOne agent
Since AdLoad is a common adware threat whose behavior of hijacking search engine results and injecting advertisements into web pages has been widely documented in the past, we ended our observation at this juncture.
Conclusion
As Apple itself has noted and we described elsewhere, malware on macOS is a problem that the device manufacturer is struggling to cope with. The fact that hundreds of unique samples of a well-known adware variant have been circulating for at least 10 months and yet still remain undetected by Apple’s built-in malware scanner demonstrates the necessity of adding further endpoint security controls to Mac devices.
As we indicated at the beginning of this post, this is only one campaign related to AdLoad that we are currently tracking. Further publications related to these campaigns are in progress.
Indicators of Compromise
YARA Hunting Rule
private rule Macho
{
meta:
description = "private rule to match Mach-O binaries"
condition:
uint32(0) == 0xfeedface or uint32(0) == 0xcefaedfe or uint32(0) == 0xfeedfacf or uint32(0) == 0xcffaedfe or uint32(0) == 0xcafebabe or uint32(0) == 0xbebafeca
}
rule adload_2021_system_service
{
meta:
description = "rule to catch Adload .system .service variant"
author = "Phil Stokes, SentinelLabs"
version = "1.0"
last_modified = "2021-08-10"
reference = "https://s1.ai/adload"
strings:
$a = { 48 8D 35 ?? ?? 00 00 48 8D 5D B8 BA B8 00 00 00 48 89 DF E8 ?? ?? FB FF 48 8B 43 08 48 2B 03 66 48 0F 6E C0 66 0F 62 05 ?? ?? 00 00 66 0F 5C 05 ?? ?? 00 00 0F 57 C9 66 0F 7C C0 48 8D 7D A0 0F 29 0F F2 0F 59 05 }
condition:
Macho and all of them
}
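As a side note on the private Macho rule above: those uint32 values cover the big- and little-endian forms of the 32-bit and 64-bit Mach-O magics plus the fat binary magic. A quick way to inspect a candidate file's magic bytes from the command line (demonstrated on a stub file; on a real sample, point the command at the binary itself):

```shell
# A 64-bit Mach-O stored little-endian begins with the bytes cf fa ed fe,
# which YARA's little-endian uint32(0) reads back as 0xfeedfacf.
printf '\xcf\xfa\xed\xfe' > /tmp/magic_demo

# Dump the first four bytes as hex.
od -An -tx1 -N4 /tmp/magic_demo | tr -d ' \n'
# → cffaedfe
```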
TCC is meant to protect user data from unauthorized access, but weaknesses in its design mean that protections are easily overridden inadvertently.
Automation, by design, allows Full Disk Access to be ‘backdoored’ while also lowering the authorization barrier.
Multiple partial and full TCC bypasses are known, with at least one actively exploited in the wild.
TCC does not prevent processes reading and writing to ‘protected’ locations, a loophole that can be used to hide malware.
Introduction
In recent years, protecting sensitive user data on-device has become of increasing importance, particularly now that our phones, tablets and computers are used for creating, storing and transmitting the most sensitive data about us: from selfies and family videos to passwords, banking details, health and medical data and pretty much everything else.
With macOS, Apple took a strong position on protecting user data early on, implementing controls as far back as 2012 in OS X Mountain Lion under a framework known as ‘Transparency, Consent and Control’, or TCC for short. With each iteration of macOS since then, the scope of what falls under TCC has increased to the point now that users can barely access their own data – or data-creating devices like the camera and microphone – without jumping through various hoops of giving ‘consent’ or ‘control’ to the relevant applications through which such access is mediated.
There have been plenty of complaints about what this means with regards to usability, but we do not intend to revisit those here. Our concern in this paper is to highlight a number of ways in which TCC fails when users and IT admins might reasonably expect it to succeed.
We hope that by bringing attention to these failures, users and admins might better understand how and when sensitive data can be exposed and take that into account in their working practices.
Crash Course: What’s TCC Again?
Apple’s latest platform security guide no longer mentions TCC by name, but instead refers to ‘protecting app access to user data’. The current version of the platform security guide states:
“Apple devices help prevent apps from accessing a user’s personal information without permission using various technologies…[in] System Preferences in macOS, users can see which apps they have permitted to access certain information as well as grant or revoke any future access.”
In common parlance, we’re talking about privacy protections that are primarily managed by the user in System Preferences’ Privacy tab of the Security & Privacy pane.
System Preferences.app provides the front-end for TCC
Mac devices controlled by an MDM solution may also set various privacy preferences by means of a Profile. Where such a profile is in effect, these preferences will not be visible to users in the Privacy pane above. However, they can be enumerated via the TCC database. The command for doing so changes slightly with Big Sur and later.
macOS 11 (Big Sur) and later:
sudo sqlite3 "/Library/Application Support/com.apple.TCC/TCC.db" "SELECT client,auth_value FROM access WHERE service=='kTCCServiceSystemPolicyAllFiles'" | grep '2$'
macOS 10.15 (Catalina) and earlier:
sudo sqlite3 "/Library/Application Support/com.apple.TCC/TCC.db" "SELECT client,allowed FROM access WHERE service == 'kTCCServiceSystemPolicyAllFiles'" | grep '1$'
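To see the shape of what these queries return, here is the same SELECT run against a mock database with a drastically simplified access table (the real TCC.db has many more columns, is SIP-protected, and requires Full Disk Access to read; the client names below are hypothetical entries):

```shell
# Build a minimal stand-in for TCC.db's 'access' table.
db=/tmp/mock_tcc.db; rm -f "$db"
sqlite3 "$db" "CREATE TABLE access (service TEXT, client TEXT, auth_value INT);
INSERT INTO access VALUES
 ('kTCCServiceSystemPolicyAllFiles','com.apple.Terminal',2),
 ('kTCCServiceSystemPolicyAllFiles','com.example.denied',0);"

# Same filter as above: auth_value 2 means 'allowed' on Big Sur and later.
sqlite3 "$db" "SELECT client,auth_value FROM access WHERE service=='kTCCServiceSystemPolicyAllFiles'" | grep '2$'
# → com.apple.Terminal|2
```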
The command line also presents users and administrators with the /usr/bin/tccutil utility, although its claim to offer the ability “to manage the privacy database” is a little exaggerated since the only documented command is reset. The tool is useful if you need to blanket wipe TCC permissions for the system or a user, but little else.
The spartan man page from tccutil
Under the hood, all these permissions are managed by the TCC.framework at /System/Library/PrivateFrameworks/TCC.framework/Versions/A/Resources/tccd.
Strings in tccd binary reveal some of the services afforded TCC protection
Looked at in a rather narrow way with regard to how users work with their Macs in practice, one could argue that the privacy controls Apple has designed with this framework work as intended when users (and apps) behave as intended in that narrow sense. However, as we shall now see, problems arise when one or both go off script.
Full Disk Access – One Rule That Breaks Them All
To understand the problems in Apple’s implementation of TCC, it’s important to understand that TCC privileges exist at two levels: the user level and the system level. At the user level, individual users can allow certain permissions that are designed only to apply to their own account and not others. If Alice allows the Terminal access to her Desktop or Downloads folders, that’s no skin off Bob’s nose. When Bob logs in, Terminal won’t be able to access Bob’s Desktop or Downloads folders.
At least, that’s how it’s supposed to work, but if Alice is an admin user and gives Terminal Full Disk Access (FDA), then Alice can quite happily navigate to Bob’s Desktop and Downloads folders (and everyone else’s) regardless of what TCC settings Bob (or those other users) set. Note that Bob is not afforded any special protection if he is an admin user, too. Full Disk Access means what it says: it can be set by one user with admin rights and it grants access to all users’ data system-wide.
While this may seem like good news for system administrators, there are implications that may not be readily apparent, and these implications affect the administrator’s own data security.
When Alice grants FDA permission to the Terminal for herself, all users now have FDA permission via the Terminal as well. The upshot is that Alice isn’t only granting herself the privilege to access others’ data, she’s granting others the privilege to access her data, too.
Surprisingly, Alice’s (no doubt) unintended permissiveness also extends to unprivileged users. As reported in CVE-2020-9771, allowing the Terminal to have Full Disk Access renders all data readable without any further security challenges: the entire disk can be mounted and read even by non-admin users. Exactly how this works is nicely laid out in this blog post here, but in short any user can create and mount a local snapshot of the system and read all other users’ data.
Even Standard users can read Admin’s private data
The ‘trick’ to this lies in two command line utilities, both of which are available to all users: /usr/bin/tmutil and /sbin/mount. The first allows us to create a local snapshot of the entire system, and the second to mount that snapshot as a read-only APFS file system. From there, we can navigate all users’ data as captured on the mounted snapshot.
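The sequence, as described in the linked write-up, looks roughly like this. It is macOS-only, so it is shown here as a commented, non-runnable sketch; the snapshot name and user paths are placeholders:

```
# 1. Any user can create a local APFS snapshot of the data volume:
#    tmutil localsnapshot
# 2. ...then mount that snapshot read-only with ownership checks disabled:
#    mkdir /tmp/snap
#    mount_apfs -o noowners -s com.apple.TimeMachine.<timestamp>.local \
#        /System/Volumes/Data /tmp/snap
# 3. Every user's files are now readable under the mount point:
#    ls /tmp/snap/Users/<other_user>/Desktop
```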
It’s important to understand that this is not a bug and will not be fixed (at least, ‘works as intended’ appears to be Apple’s position at the time of writing). The CVE mentioned above was the bug for being able to exploit this without Full Disk Access. Apple’s fix was to make it only possible when Full Disk Access has been granted. The tl;dr for Mac admins?
When you grant yourself Full Disk Access, you grant all users (even unprivileged users) the ability to read all other users’ data on the disk, including your own.
Backdooring Full Disk Access Through Automation
This situation isn’t restricted only to users: it extends to user processes, too. Any application granted Full Disk Access has access to all user data, by design. If that application is malware, or can be controlled by malware, then the malware has that access, too. But application control is managed by another TCC preference, Automation.
And here lies another trap: there is one app on the Mac that always has Full Disk Access but never appears in the Full Disk Access pane in System Preferences: the Finder.
Any application that can control the Finder (listed in ‘Automation’ in the Privacy pane) also has Full Disk Access, although you will see neither the Finder nor the controlling app listed in the Full Disk Access pane.
Because of this complication, administrators must be aware that even if they never grant FDA permissions, or even if they lock down Full Disk Access (perhaps via MDM solution), simply allowing an application to control the Finder in the ‘Automation’ pane will bypass those restrictions.
Automating the Finder allows the controlling app Full Disk Access
In the image above, Terminal and two legitimate third-party automation apps, Script Debugger and FastScripts, all have Full Disk Access, although none are shown in the Full Disk Access privacy pane:
Apps that backdoor FDA through Automation are not shown in the FDA pane
As noted above, this is because the Finder has irrevocable FDA permissions, and these apps have been given automation control over the Finder. To see how this works, here’s a little demonstration.
% osascript <<EOD
set a_user to do shell script "logname"
tell application "Finder"
set desc to path to home folder
set copyFile to duplicate (item "private.txt" of folder "Desktop" of folder a_user of item "Users" of disk of home) to folder desc with replacing
set t to paragraphs of (do shell script "cat " & POSIX path of (copyFile as alias)) as text
end tell
do shell script "rm " & POSIX path of (copyFile as alias)
t
EOD
Although the Terminal is not granted Full Disk Access, if it has been granted Automation privileges for any reason in the past, executing the script above in the Terminal will return the contents of whatever the file “private.txt” contains. As “private.txt” is located on the user’s Desktop, a location ostensibly protected by TCC, users might reasonably expect that the contents of this file would remain private if no applications had been explicitly granted FDA permissions. This is demonstrably not the case.
Backdooring FDA access through automating the Finder
The obvious mitigation here is not to allow apps the right to automate the Finder. However, let’s note two important points about that suggestion.
First, there are many legitimate reasons for granting automation of the Finder to the Terminal or other productivity apps: any mildly proficient user who is interested in increasing their productivity through automation may well have done so or wish to do so. Unfortunately, this is an “All-In” deal. If the user has a specific purpose for doing this, there’s no way to prevent other less legitimate uses of Terminal’s (or other programs’) use of this access.
Second, backdooring FDA access in this way results in a lowering of the authorization barrier. Granting FDA in the usual way requires an administrator password. However, one can grant consent for automation of the Finder (and thus backdoor FDA) without a password. A consent dialog with a simple click-through will suffice:
A simple ‘OK’ gives access to control the Finder, and by extension Full Disk Access.
While the warning text is explicit enough (if the user reads it), it is far from transparent that given the Finder’s irrevocable Full Disk Access rights, the power being invested in the controlling app goes far beyond the current user’s consent, or control.
As a bonus, this is not a per-time consent. If it has ever been granted at any point in the past, then that permission remains in force (and thus transparent, in the not-good sense, to the user) unless revoked in System Preferences ‘Automation’ pane or via the previously mentioned tccutil reset command.
The tl;dr: keep a close and regular eye on what is allowed to automate the Finder in your System Preferences Privacy pane.
The Sorry Tale of TCC Bypasses
Everything we’ve mentioned so far is actually by design, but there is a long history of TCC bypasses to bear in mind as well. When macOS Mojave first went on public release, SentinelOne was the first to note that TCC could be bypassed via SSH (this finding was later duplicated by others). The indications from multiple researchers are that there are plenty more bypasses out there.
The most recent TCC bypass came to light after it was discovered being exploited by XCSSET malware in August 2020. Although Apple patched this particular flaw some 9 months later in May 2021, it is still exploitable on systems that haven’t been updated to macOS 11.4 or the latest security update to 10.15.7.
On a vulnerable system, it’s trivially easy to reproduce.
1. Create a simple trojan application that needs TCC privileges. Here we’ll create an app that needs access to the current user’s Desktop to enumerate the files saved there.
2. Identify an app on the target that already has the TCC permissions you need. One way you can find the current permitted list of apps is from the ‘Files and Folders’ category in the Privacy tab of System Preferences’ Security & Privacy pane (malware takes another route, as we’ll explain shortly).
3. Copy the trojan into that app’s bundle and execute it:
% open "/Applications/Some Privileged.app/ls.app"
Security-minded readers will no doubt be wondering how an attacker achieves Step 2 without already having knowledge of TCC permissions – you can’t enumerate the list of privileged apps in the TCC.db from the Terminal unless Terminal already has Full Disk Access.
Assuming the target hasn’t already granted Terminal FDA privileges for some other legitimate reason (and who hasn’t these days?), an attacker, red teamer or malware could instead enumerate over the contents of the /Applications folder and take educated guesses based on what’s found there, e.g., Xcode, Camtasia, and Zoom are all applications that, if installed, are likely to be privileged.
Similarly, one could hardcode a list of apps known to have such permissions and search the target machine for them. This is precisely how XCSSET malware works: the malware is hardcoded with a list of apps that it expects to have screen capture permissions and injects its own app into the bundle of any of those found.
Decoded strings from XCSSET malware reveal a list of apps it exploits for TCC permissions
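The logic behind this kind of check amounts to only a few lines. Here is a sketch using a mock applications folder and a hypothetical target list (not XCSSET's actual list or code):

```shell
# Mock /Applications directory with one 'interesting' app present.
appdir=/tmp/mock_applications
mkdir -p "$appdir/Xcode.app" "$appdir/Notes.app"

# Hardcoded candidates the attacker expects to hold useful TCC permissions.
for candidate in Xcode.app Camtasia.app zoom.us.app; do
    [ -d "$appdir/$candidate" ] && echo "likely privileged: $candidate"
done
true  # keep the exit status clean when the last test fails
# → likely privileged: Xcode.app
```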
Unfortunately, the fix for this particular bug doesn’t effectively stop malware authors. If the bypass fails, it’s a simple matter to just impersonate the Finder and ask the user for control. As with the Automation request, this only requires the user to click-through their consent rather than provide a password.
Fake Finder App used by XCSSET malware to access protected areas
As we noted above, the (real) Finder already has Full Disk Access by default, so users seeing a request dialog asking to grant the Finder access to any folder should immediately raise suspicion that something is amiss.
TCC – Just One More Thing
That almost wraps up our tour of TCC gotchas, but there’s one more worth pointing out. A common misunderstanding with Apple’s User privacy controls is that it prevents access to certain locations (e.g., Desktop, Documents, Downloads, iCloud folders). However, that is not quite the case.
Administrators need to be aware that TCC doesn’t protect against files being written to TCC protected areas by unprivileged processes, and similarly nor does it stop files so written from being read by those processes.
A process can write to a TCC protected area, and read the files it writes
Why does this matter? It matters because if you have any kind of security or monitoring software installed that doesn’t have access to TCC-protected areas, there’s nothing to stop malware from hiding some or all of its components in these protected areas. TCC isn’t going to stop malware using those locations – a blind spot that not every Mac sys administrator is aware of – so don’t rely on TCC to provide some kind of built-in protected ‘safe-zone’. That’s not how it works, when it works at all.
Conclusion
We’ve seen how macOS users can easily and unknowingly expose data they think is protected by TCC simply by doing the things that macOS users, particularly admins, are often inclined to do. Ironically, most of these ‘inadvertent breaches’ are only possible because of TCC’s own lack of transparency. Why, for example, is the Finder not listed in the Full Disk Access pane? Why is it not clear that Automation of the Finder backdoors Full Disk Access? And why is password-authentication downgraded to a simple consent prompt for what is, effectively, the same privilege?
Other questions raised by this post concern whether consent should have finer grained controls so that prompts can be optionally repeated at certain intervals, and – perhaps most importantly – whether users should be able to protect their own data by being allowed to opt out of FDA granted by other users on the same device.
We know that malware abuses some of these loopholes, and that various TCC bugs exist that have yet to be patched. Our only conclusion at this point has to be that neither users nor admins should place too much faith in the ability of TCC as it is currently implemented to protect data from unauthorized access.
In our previous foray into macOS malware reverse engineering, we guided those new to the field through the basics of static and dynamic analysis using nothing other than native tools such as strings, otool and lldb. In this new series of posts, we move into intermediate and more advanced techniques, introducing you to further tools and covering a wide range of real-world malware samples from commodity adware to trojans, backdoors, and spyware used by APT actors such as Lazarus and OceanLotus. We’ll walk through problems such as beating anti-analysis and sandbox checks, reversing encrypted strings, intercepting C2 comms and more.
We kick off with a walk-through on how to rapidly triage a new sample. Analysts are busy people, and the majority of malware samples you have to deal with are neither that interesting nor that complicated. We don’t want to get stuck in the weeds reversing lots of unnecessary code only to find out that the sample really wasn’t worth that much effort!
Ideally, we want to get a sample “triaged” in just a few minutes, where “triage” means that we understand the basics of the malware’s behavior and objectives, collecting just enough data to be able to effectively hunt for related samples and detect them in our environments. For those rarer samples that pique our interest and look like they need deeper analysis, we want our triage session to give an overall profile of the sample and indicate areas for further investigation.
Why Use radare2 (r2) for macOS Malware Analysis?
For rapid triage, my preferred tool is radare2 (aka r2). There are many introductory blogs on installing and using r2, and I’m not going to cover that material here. Such posts will serve you well in terms of learning your way around the basics of installing and using the tool if it’s completely new to you.
However, most such posts are aimed at CTF/crackme readers and typically showcase simple ELF or PE binaries. Very few are aimed at malware analysts, and even fewer still are aimed at macOS malware analysts, so they are not much use to us from a practical point of view. I’m going to assume that you’ve read at least one or two basic intro r2 posts before starting on the material below. For a rare example of r2 introductory material using Mach-O samples (albeit not malware), I recommend having a look at these two helpful posts: 1, 2.
Before we dive in, I do want to say a little bit about why r2 is a good choice for macOS malware analysis, as I expect at least some readers are likely already familiar with other tools such as IDA, Ghidra and perhaps even Hopper, and may be asking that question from the outset.
Radare2 is an extremely powerful and customizable reversing platform, and – at least the way I use it – a great deal of that power comes from the very feature that puts some people off: it’s a command line tool rather than a GUI tool.
Because of that, r2 is very fast, lightweight, and stable. You can install and run it very quickly in a new VM without having to worry about dependencies or licensing (the latter, because it’s free) and it’s much less likely (in my experience) to crash on you or corrupt a file or refuse to start. And as we’ll see in the tips below, you can triage a binary with it very quickly indeed!
Moreover, because it’s a command line tool, it integrates very easily with other command line tools that you are likely familiar with, including things like grep, awk, diff and so on. Other tools typically require you to develop separate scripts in python or Java to do various tailored tasks, but with r2 you can often accomplish the same just by piping output through familiar command line tools (we’ll be looking at some examples of doing that below).
Finally, because r2 is free, multi-platform and runs on pretty much anything at all that can run a terminal emulator, learning how to reverse with r2 is a transferable skill you can take advantage of anywhere.
Enough of the hard sell, let’s get down to triaging some malware! For this post, we’re going to look at a malware sample called OSX.Calisto. Be sure to set up an isolated VM, download the sample from here (password:infect3d) and install r2.
Then, let’s get started!
1. Fun with Functions, Calls, XREFS and More
Our sample, OSX.Calisto, is a backdoor that tries to exfiltrate the user’s keychain, username and clear text copy of the login password. The first tip about using r2 quickly is to load your sample with the -AA option, like so:
% r2 -AA calisto
Load and analyse macOS malware sample with radare2
This performs the same analysis as loading the file and then running aaa from within r2. It’s not only faster to do it in one step, it also cuts out the possibility of forgetting to run the analysis command after loading the binary.
Now that our Calisto sample is loaded and analysed, the first thing that we should do is list all the functions in verbose mode with afll. What is particularly useful about this command is that it gives a great overview of the malware. Not only can we see all the function calls, we can see which are imports, which are dead code, which are making the most system calls, which take the most (or least) arguments, how many variables each declares and more. From here, we are in a very good position to see both what the malware does and where it does it.
List all functions, displaying stats for calls, locals, args, and xrefs for each
Even from just the top of that list, we can see that this malware makes a lot of calls to NSUserName. Typically, though, we will want to sort that table. Although r2 has an internal function for sorting the function table (aflt), I have not found the output to be reliable.
Fortunately, there is another way, which will introduce us to a more general “power feature” of r2. This is to pipe the output of afll through awk and sort. Say, for example, we would like to sort only select columns (we don’t want all that noisy data!):
Here we pipe the output through awk, selecting only the columns we want, and then sort on the third column (number of calls). We add the -n option to make the sort numerical. We can reverse the sort with -r.
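A concrete version of that pipeline, demonstrated here on canned afll-style output since column positions in afll vary between r2 versions (check your own output before choosing awk fields):

```shell
# Simulate a three-column extract of afll output (name, size, call count)
# and sort numerically on the third column.
printf 'sym.func_a 120 3\nsym.func_b 800 41\nsym.func_c 64 1\n' |
    sort -k3 -n
# → sym.func_c 64 1
#   sym.func_a 120 3
#   sym.func_b 800 41
```

Inside r2 the same idea applies directly, e.g. something like afll | awk '{print $1, $4, $NF}' | sort -k3 -n, adjusting the column numbers to your r2 version.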
Function table sorted by calls
Note that we never left r2 throughout this whole process, making the whole thing extremely convenient. If we wanted to do the same and output the results to file, just do that as you would normally on the command line with a > <path_to_file>.
2. Quickly Dive Into a Function’s Calls
Having found something of interest, we will naturally want to take a quick look at it to see if our hunch is right. We can do that rapidly in a couple of ways as the next few tips will show.
Normally, from that function table, it would make sense to look for functions that have a particular profile such as lots of calls, args, and/or xrefs, and then look at those particular functions in more detail.
Back in our Calisto example, we noted there was one function that had a lot of calls: sym.func.100005620, but we don’t necessarily want to spend time looking at that function if those calls aren’t doing anything interesting.
We can get a look at what calls a function makes very quickly just by typing in a variant of the afll command, aflm. You might want to just punch that in and see what it outputs.
aflm
Yeah, useful, but overwhelming! As we noted in the previous section, we can easily filter things with command line tools while still in r2, so we could pipe that output to grep. But how many lines should we grep after the pattern? For example, if you try
aflm | grep -A 100 5620:
You’ll shoot way over target, because although there may be more calls in that function, aflm only lists each unique call. A better way is to pipe through sed and tell sed to stop piping when it hits another colon (signalling another function listing).
aflm | sed -n '/5620:/,/:/p'
The above command says: start printing at the line matching /5620:/ and keep going until the next line matching /:/ – that is, the next function header. The -n flag suppresses sed’s default output, and the trailing p prints only the lines in that range.
You’ll get an output like this:
Sorting output from radare2
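A quick way to convince yourself of how the range addressing works, using canned aflm-style input (fake function labels and calls):

```shell
# Three fake function listings; extract only the one starting at 5620:.
# The range ends at (and includes) the next line containing a colon,
# i.e. the next function header.
printf '5610:\n  call sym.a\n5620:\n  call sym.imp.NSUserName\n  call sym.imp.exit\n5630:\n  call sym.b\n' |
    sed -n '/5620:/,/:/p'
# → 5620:
#     call sym.imp.NSUserName
#     call sym.imp.exit
#   5630:
```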
Awesome! Now we can see all the calls that this huge function makes. From that alone we can infer that this function appears to grab the User name, does some string searching, possibly builds an array out of what it finds, and then uploads some data to a remote server! And we haven’t even done any disassembly yet!
3. Strings on Steroids
At this point, we might want to go back to the function table and repeat the above steps on a few different functions, but we also have another option. Having seen that NSUserName is called on multiple occasions, we might want to look more closely at how the malware is interacting with the user. As we explained in our previous guide on reversing macOS malware, extracting strings from a binary can give you a very good insight into what the malware is up to, so much so that some malware authors take great efforts to obfuscate and encrypt the binary’s strings (something we’ll be looking at in a later post). Fortunately, the author of Calisto wasn’t one of those. Let’s see how we can use r2 to help us with string analysis.
The main command for dumping strings is
izz
However, that dump isn’t pretty and doesn’t make for easy analysis. Fortunately, there’s a much nicer way to look at and filter strings in radare2. Let’s try this instead:
izz~...
The tilde is r2’s internal “grep” command, but more importantly the three periods pipe the string dump into a “HUD” (Heads Up Display) from where we can type filter characters. For example, after issuing the above command, type a single “/” to reveal all strings (like paths and URLs, for example) containing a forward slash. Backspace to clear that and try other filters in turn like “http” and “user”. As the images below show, we quickly hit pay dirt!
Filtering strings in radare2
The first image above looks like a lead on the malware’s C2 addresses, while the second shows us what looks very much like a path the malware is going to write data to. Both of these are ideal for our IoCs and for hunting, subject to further confirmation.
4. Fast Seek and Disassembly
What we’ve found after just a few short commands and a couple of minutes of triaging our binary is very promising. Let’s see if we can dig a little deeper. Our output from the HUD gives us the addresses of all those strings. Let’s take a look at the address for what looks like uploading exfiltrated data to a C2:
http://40.87.56.192/calisto/upload.php?username="
From the output, we can see that this string is referenced at 0x1000128d0. Let’s go to that address and see what we have. First, double-click the address to select it then copy it with Cmd-C. To escape the HUD, hit ‘return’ so that you are returned to the r2 prompt.
Next, we’ll invoke the ‘seek’ command, which is simply the letter s, and paste the address after it. Hit ‘return’. Type pd (print disassembly) and scroll up in your Terminal window to get to the start of the disassembly.
Seeking in radare2
The disassembly shows us where the string is called via the xref at the top. Let’s again select and Cmd-C that address and do another seek. After the seek, this time we’ll do pdf.
Disassembling a function in radare2
The difference is that pdf will disassemble an entire function, no matter how long it is. On the other hand, pd will disassemble a given number of instructions. Thus, it’s good to know both. You can’t use pdf from an address that isn’t a function, and sometimes you want to just disassemble a limited number of instructions: this is where pd comes in handy. However, when what you want is a complete function’s disassembly, pdf is your friend.
The pdf command gives you exactly what you’d expect from a disassembler, and if you’ve done any reversing before or even just read some r2 intros as suggested above, you’ll recognize this output (as pretty much all r2 intros start with pdf!). In any case, from here you can get a pretty good overview of what the function does, and r2 is nicer than some other disassemblers in that things like stack strings are shown by default.
You might also like to experiment with pdc, which produces a rough pseudocode rendering of the disassembly. It has to be said that good pseudocode output is one of r2’s weak points, but pdc can sometimes be helpful for focus.
Finally, before we move on to the next tip, I’m just going to give you a variation on something we mentioned above that I often like to do with pdf, which is to grep the calls out of it. This is particularly useful for really big functions. In other words, try
pdf~call
for a quick look at the calls in a given function.
5. Rabin2 | Master of Binary Info Extraction
When we discussed strings, I mentioned the izz command, which is a child of the iz command, which in turn is a child of r2’s i command. As you might have guessed, i stands for information, and the various incantations of i are all very useful while you’re in the middle of analysis (if you happen to forget what file you are analyzing, i~file is your friend!).
Some of the useful variants of the i command are as follows:
get file metadata [i]
look at what libraries it imports [ii]
look at what strings it contains [iz]
look at what classes/functions/methods it contains [icc]
find the entrypoint [ie]
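The same queries can also be run non-interactively, which is convenient for scripting triage across many samples. A sketch using r2’s -q (quiet, exit when done) and -c (run command) flags; the sample path is illustrative:

```
% r2 -qc i /tmp/sample      # file metadata
% r2 -qc ii /tmp/sample     # imported libraries
% r2 -qc iz /tmp/sample     # strings in data sections
% r2 -qc ie /tmp/sample     # entrypoint
```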
However, for rapid triage, there is a much better way to get a bird’s eye view of everything there is to know about a file. When you installed r2, you also installed a bunch of other utilities that r2 makes use of but which you can call independently. Perhaps the most useful of these is rabin2. In a new Terminal window, try man rabin2 to see its options.
While we can take advantage of rabin2’s power via the i command in r2, we can get more juice out of it by opening a separate Terminal window and calling rabin2 directly on our malware sample. For our purposes, focused as we are in this post on rapid triage, the only rabin2 option we need to know is:
% rabin2 -g <path_to_binary>
Triaging macOS malware with rabin2
The -g option outputs everything there is to know about the file, including strings, symbols, sections and imports, as well as such things as whether the file is stripped, what language it was written in, and so on. It is essentially all of the options of r2’s i command rolled into one (if it’s possible to make r2 punch out all of that in one command, I’m not aware of how).
Strangely, one of the best outputs from rabin2 is when its -g option outputs almost nothing at all! That tells you that you are almost certainly dealing with packed malware, and that in itself is a great guide on where to go next in your investigation (we’ll be looking at packed files in a later post).
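One quick, hedged way to turn that observation into a shell heuristic is simply to count the lines -g emits; the exact numbers will vary, but a near-empty result from a non-trivial Mach-O is a strong hint that the file is packed (the sample path is illustrative):

```
% rabin2 -g /tmp/sample | wc -l
```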
Meanwhile, it’s time to introduce our last rapid analysis pro trick, Visual Graph mode!
6. Visual Graph Mode
For those of you used to a GUI disassembler, if you’ve followed this far you may well be thinking… “ahuh…but how do I get a function call graph from a command line tool?” A graph is often a make or break deal when trying to triage malware rapidly, and a tool that doesn’t have one is probably not going to win many friends. Fortunately, r2 has you covered!
Returning to our r2 prompt, type VV to enter visual graph mode.
radare2 graph mode
Visual graph mode is super useful for being able to trace logic paths through a malware sample and to see which paths are worth further investigation. I will readily admit that learning your way around the navigation options takes some practice. However, it is an extremely useful tool and one which I frequently return to with samples that attempt to obstruct analysis.
The options for using Visual Graph mode are nicely laid out in this post here. Once you learn your way around, it’s relatively simple and powerful, but it’s also easy to get lost when you’re first starting out. As with Vi and Vim, inexperienced users can sometimes find themselves trapped in an endless world of error beeps in r2’s Visual Graph mode. However, as with all things in r2, whenever you find yourself “stuck”, hit q on the keyboard (repeatedly, if needs be). If you find yourself needing help, hit ?.
I highly recommend that you experiment with the Calisto sample to familiarize yourself with how it works. In the next post, we’ll be looking in more detail at how Visual Graph mode can help us when we tackle anti-analysis measures, so give yourself a head start by playing around with it in the meantime.
Conclusion
In this post, we’ve looked at how to use radare2 to quickly triage macOS malware samples, seen how it can easily be integrated with other command line tools most malware analysts are already familiar with, and caught a glimpse of its visual graph mode.
There’s much more to learn about radare2 and macOS malware, and while we hope you’ve enjoyed the tips we’ve shared here, there are many more ways to use this amazing tool to achieve your aims in reversing macOS malware. We hope you’ll join us in the next post in this series as we continue our exploration of intermediate and advanced macOS malware analysis techniques.
AdLoad is one of several widespread adware and bundleware loaders currently afflicting macOS.
In late 2019, SentinelLabs described how AdLoad was continuing to adapt and evade detection.
This year we have seen over 150 unique samples from a new campaign that remain undetected by Apple’s on-device malware scanner.
Some of these samples are also known to have been blessed by Apple’s notarization service.
We describe the infection pattern and detail the indicators of compromise for the first time.
Introduction
AdLoad is one of several widespread adware and bundleware loaders currently afflicting macOS. AdLoad is certainly no newcomer to the macOS malware party. In late 2019, SentinelLabs described how AdLoad was continuing to adapt and evade detection, and this year we have seen another iteration that continues to impact Mac users who rely solely on Apple’s built-in security control XProtect for malware detection.
In this post, we detail one of several new AdLoad campaigns we are currently tracking that remain undetected by Apple’s macOS malware scanner. We describe the infection pattern and indicators of compromise for the first time and hope this information will help others to detect and remove this threat.
AdLoad | Staying One Step Ahead of Apple
AdLoad has been around since at least 2017, and when we previously reported on it in 2019, Apple had some partial protection against its earlier variants. Alas, at that time the 2019 variant was undetected by XProtect.
As of today, however, XProtect arguably has around 11 different signatures for AdLoad (it is ‘arguable’ because Apple uses non-industry standard names for its signature rules). As best as we can track Apple’s rule names to common vendor names, the following XProtect rules appear to be all partially or wholly related to AdLoad variants:
Signatures for AdLoad variants in XProtect
The good news for those without additional security protection is that the previous variant we reported in 2019 is now detected by XProtect, via rule 22d71e9.
An earlier AdLoad variant reported by SentinelLabs is now detected by XProtect
The bad news is the variant used in this new campaign is undetected by any of those rules. Let’s see what’s changed.
AdLoad 2021 Campaign | ‘System’ and ‘Service’
Both the 2019 and 2021 variants of AdLoad used persistence and executable names that followed a consistent pattern. In 2019, that pattern included some combination of the words “Search”, “Result” and “Daemon”, as in the example shown above: “ElementarySignalSearchDaemon”. Many other examples can be found here.
The 2021 variant uses a different pattern that primarily relies on a file extension that is either .system or .service. Which file extension is used depends on the location of the dropped persistence file and executable as described below, but typically both .system and .service files will be found on the same infected device if the user gave privileges to the installer.
With or without privileges, AdLoad will install a persistence agent in the user’s Library LaunchAgents folder, named according to one of a set of pre-determined label patterns.
To date, we have found around 50 unique label patterns, with each one having both a .service and a .system version. Based on our previous understanding of AdLoad, we expect there to be many more.
When the user logs in, the AdLoad persistence agent will execute a binary hidden in the same user’s ~/Library/Application Support/ folder. That binary follows another deterministic pattern, whereby the child folder in Application Support is prepended with a period and a random string of digits. Within that directory is another directory called /Services/, which in turn contains a minimal application bundle having the same name as the LaunchAgent label. That barebones bundle contains an executable with the same name but without the com. prefix. For example:
Indicators of compromise in the User’s Library Application Support folder
A hidden tracker file called .logg and containing only a UUID string is also dropped in the Application Support folder. Despite the location, if the dropper has also been granted privileges, then the tracker file is owned by root rather than the user.
The hidden tracker file in the User’s Library Application Support folder
Further, assuming the user supplied admin privileges as requested by the installer, another persistence mechanism is written to the domain /Library/LaunchDaemons/ folder. This plist file uses the file extension .system, and the corresponding folder in the hidden Application Support folder is also named /System/ instead of /Services/.
Indicators of compromise in the Domain Library Application Support folder
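These naming conventions lend themselves to a quick hunting sketch. The helper below is a minimal example based solely on the .service/.system extension pattern described above; matches are leads for inspection, not definitive detections, and the example label used in the comments is hypothetical:

```shell
# Minimal IoC sketch based on the AdLoad naming pattern described above.
# Lists any launch plists using the suspicious .service/.system extensions
# in whatever directory it is pointed at.
hunt_adload_plists() {
    ls "$1" 2>/dev/null | grep -E '\.(service|system)\.plist$'
}

# Usage on a live system (paths from the post):
#   hunt_adload_plists ~/Library/LaunchAgents
#   hunt_adload_plists /Library/LaunchDaemons
# A hit such as com.ExampleUpdater.service.plist (hypothetical label)
# would warrant inspecting the corresponding Application Support bundle.
```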
The LaunchDaemon is dropped with one of a number of pre-determined labels that mirror the label used in the LaunchAgent.
The persistence plists themselves pass different arguments to the executables they launch. The system daemon is passed -t followed by the plist label; the user persistence agent is passed -s followed by 6600.
AdLoad 2021 macOS persistence pattern
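Putting the pattern together, a user LaunchAgent following this scheme might look roughly like the sketch below. The label com.ExampleUpdater.service, the digit string and the username are all hypothetical, and the RunAtLoad key is an assumption based on the login-time execution described above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ExampleUpdater.service</string>
    <key>ProgramArguments</key>
    <array>
        <!-- hidden bundle in ~/Library/Application Support; path elements are illustrative -->
        <string>/Users/alice/Library/Application Support/.2387462871/Services/com.ExampleUpdater.service/ExampleUpdater.service</string>
        <string>-s</string>
        <string>6600</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
```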
Interestingly, the droppers for this campaign share the same pattern as Bundlore/Shlayer droppers. They use a fake Player.app mounted in a DMG. Many are signed with a valid signature; in some cases, they have even been known to be notarized.
Like much other adware, AdLoad makes use of a fake Player.app to install malware
Typically, we observe that developer certificates used to sign the droppers are revoked by Apple within a matter of days (sometimes hours) of samples being observed on VirusTotal, offering some belated and temporary protection against further infections by those particular signed samples by means of Gatekeeper and OCSP signature checks. Also typically, we see new samples signed with fresh certificates appearing within a matter of hours and days. Truly, it is a game of whack-a-mole.
The droppers we have seen take the form of a lightly obfuscated Zsh script that decompresses a number of times before finally executing the malware out of the /tmp directory (for a discussion of how to deobfuscate such scripts see here).
The dropper executes a shell script obfuscated several times over
The final payload is not codesigned and isn’t known to the current version of Apple’s XProtect, v2149.
The malware executes out of /tmp/ and is neither codesigned nor known to XProtect
Once infection is complete, the adware pops the following page in the user’s default browser
How New Is This Variant of AdLoad?
In our investigation, we found over 220 samples of this adware variant on VirusTotal, in both packed and unpacked form. At least 150 of these are unique. Interestingly, a lone sample of this variant was documented by analysts at Confiant, who described the malware’s string decryption routine in a post published on June 3rd, 2021. According to these researchers, the sample they observed had been notarized by Apple.
We note that across our corpus, all samples from November 2020 to August 2021 use the same or similar string decryption routine as that described by Confiant. Similarly, the earlier researchers’ sample, “MapperState.system” conforms to the AdLoad naming pattern that we observed and described above. Both these indicators definitively link our findings with theirs.
AdLoad binaries use a great deal of obfuscation, including custom string encryption
Three different samples, all using a similar string encryption routine
Our research showed that samples began to appear at least as early as November 2020, with regular further occurrences across the first half of 2021. However, there appears to have been a sharp uptick throughout July and in particular the early weeks of August 2021.
It certainly seems possible that the malware developers are taking advantage of the gap in XProtect, which itself has not been updated since a few weeks after Confiant’s research over two months ago. At the time of writing, XProtect was last updated to version 2149 around June 15th – 18th.
Version 2149 is the most recent version of Apple’s XProtect as of August 11th
None of the samples we found are known to XProtect since they do not match any of the scanner’s current set of AdLoad rules.
Running XProtect v2149 against 221 known samples shows no detections
However, there is reasonably good detection across a variety of different vendor engines used by VirusTotal for all the same samples that XProtect doesn’t detect.
All the samples are detected by various VT vendor engines
On our test machine, we set the policy of the SentinelOne Agent to “Detect only” in order to allow the malware to execute and observe its behaviour. In the Management console, the behavioral detection is mapped to the relevant MITRE indicators.
Behavioral Indicators from the SentinelOne agent
Since AdLoad is a common adware threat whose behavior of hijacking search engine results and injecting advertisements into web pages has been widely documented in the past, we ended our observation at this juncture.
Conclusion
As Apple itself has noted and we described elsewhere, malware on macOS is a problem that the device manufacturer is struggling to cope with. The fact that hundreds of unique samples of a well-known adware variant have been circulating for at least 10 months and yet still remain undetected by Apple’s built-in malware scanner demonstrates the necessity of adding further endpoint security controls to Mac devices.
As we indicated at the beginning of this post, this is only one campaign related to AdLoad that we are currently tracking. Further publications related to these campaigns are in progress.
Indicators of Compromise
YARA Hunting Rule
private rule Macho
{
    meta:
        description = "private rule to match Mach-O binaries"
    condition:
        uint32(0) == 0xfeedface or uint32(0) == 0xcefaedfe or uint32(0) == 0xfeedfacf or uint32(0) == 0xcffaedfe or uint32(0) == 0xcafebabe or uint32(0) == 0xbebafeca
}

rule adload_2021_system_service
{
    meta:
        description = "rule to catch Adload .system .service variant"
        author = "Phil Stokes, SentinelLabs"
        version = "1.0"
        last_modified = "2021-08-10"
        reference = "https://s1.ai/adload"
    strings:
        $a = { 48 8D 35 ?? ?? 00 00 48 8D 5D B8 BA B8 00 00 00 48 89 DF E8 ?? ?? FB FF 48 8B 43 08 48 2B 03 66 48 0F 6E C0 66 0F 62 05 ?? ?? 00 00 66 0F 5C 05 ?? ?? 00 00 0F 57 C9 66 0F 7C C0 48 8D 7D A0 0F 29 0F F2 0F 59 05 }
    condition:
        Macho and all of them
}
TCC is meant to protect user data from unauthorized access, but weaknesses in its design mean that protections are easily overridden inadvertently.
Automation, by design, allows Full Disk Access to be ‘backdoored’ while also lowering the authorization barrier.
Multiple partial and full TCC bypasses are known, with at least one actively exploited in the wild.
TCC does not prevent processes reading and writing to ‘protected’ locations, a loophole that can be used to hide malware.
Introduction
In recent years, protecting sensitive user data on-device has become of increasing importance, particularly now that our phones, tablets and computers are used for creating, storing and transmitting the most sensitive data about us: from selfies and family videos to passwords, banking details, health and medical data and pretty much everything else.
With macOS, Apple took a strong position on protecting user data early on, implementing controls as far back as 2012 in OSX Mountain Lion under a framework known as ‘Transparency, Consent and Control’, or TCC for short. With each iteration of macOS since then, the scope of what falls under TCC has increased to the point now that users can barely access their own data – or data-creating devices like the camera and microphone – without jumping through various hoops of giving ‘consent’ or ‘control’ to the relevant applications through which such access is mediated.
There have been plenty of complaints about what this means with regards to usability, but we do not intend to revisit those here. Our concern in this paper is to highlight a number of ways in which TCC fails when users and IT admins might reasonably expect it to succeed.
We hope that by bringing attention to these failures, users and admins might better understand how and when sensitive data can be exposed and take that into account in their working practices.
Crash Course: What’s TCC Again?
Apple’s latest platform security guide no longer mentions TCC by name, but instead refers to ‘protecting app access to user data’. The current version of the platform security guide states:
“Apple devices help prevent apps from accessing a user’s personal information without permission using various technologies…[in] System Preferences in macOS, users can see which apps they have permitted to access certain information as well as grant or revoke any future access.”
In common parlance, we’re talking about privacy protections that are primarily managed by the user in System Preferences’ Privacy tab of the Security & Privacy pane.
System Preferences.app provides the front-end for TCC
Mac devices controlled by an MDM solution may also have various privacy preferences set by means of a Profile. Where such a profile is in effect, these preferences will not be visible to users in the Privacy pane above. However, they can be enumerated via the TCC database. The command for doing so changes slightly with Big Sur and later.
macOS 11 (Big Sur) and later:
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db "SELECT client,auth_value FROM access WHERE service=='kTCCServiceSystemPolicyAllFiles'" | grep '2'$
macOS 10.15 (Catalina) and earlier:
sudo sqlite3 /Library/Application\ Support/com.apple.TCC/TCC.db "SELECT client,allowed FROM access WHERE service == 'kTCCServiceSystemPolicyAllFiles'" | grep '1'$
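Since only the column name and value encoding changed, a small helper can build the right query for either schema. This is a hedged sketch that folds the grep filter from the commands above into the SQL itself:

```shell
# Build the right FDA query for the TCC.db schema on a given macOS
# major version: 'allowed' (1 = granted) before Big Sur, 'auth_value'
# (2 = granted) on macOS 11 and later.
tcc_fda_query() {
    if [ "$1" -ge 11 ]; then
        echo "SELECT client FROM access WHERE service=='kTCCServiceSystemPolicyAllFiles' AND auth_value==2;"
    else
        echo "SELECT client FROM access WHERE service=='kTCCServiceSystemPolicyAllFiles' AND allowed==1;"
    fi
}

# Usage on macOS:
#   major=$(sw_vers -productVersion | cut -d. -f1)
#   sudo sqlite3 "/Library/Application Support/com.apple.TCC/TCC.db" "$(tcc_fda_query "$major")"
```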
The command line also presents users and administrators with the /usr/bin/tccutil utility, although its claim to offer the ability “to manage the privacy database” is a little exaggerated since the only documented command is reset. The tool is useful if you need to blanket wipe TCC permissions for the system or a user, but little else.
The spartan man page from tccutil
Under the hood, all these permissions are managed by the TCC.framework at /System/Library/PrivateFrameworks/TCC.framework/Versions/A/Resources/tccd.
Strings in tccd binary reveal some of the services afforded TCC protection
Looked at in a rather narrow way with regard to how users work with their Macs in practice, one could argue that the privacy controls Apple has built on this framework work as intended so long as users (and apps) behave as intended in that narrow sense. However, as we shall now see, problems arise when one or both go off script.
Full Disk Access – One Rule That Breaks Them All
To understand the problems in Apple’s implementation of TCC, it’s important to understand that TCC privileges exist at two levels: the user level and the system level. At the user level, individual users can allow certain permissions that are designed only to apply to their own account and not others. If Alice allows the Terminal access to her Desktop or Downloads folders, that’s no skin off Bob’s nose. When Bob logs in, Terminal won’t be able to access Bob’s Desktop or Downloads folders.
At least, that’s how it’s supposed to work, but if Alice is an admin user and gives Terminal Full Disk Access (FDA), then Alice can quite happily navigate to Bob’s Desktop and Downloads folders (and everyone else’s) regardless of what TCC settings Bob (or those other users) set. Note that Bob is not afforded any special protection if he is an admin user, too. Full Disk Access means what it says: it can be set by one user with admin rights and it grants access to all users’ data system-wide.
While this may seem like good news for system administrators, there are implications that may not be readily apparent, and these implications affect the administrator’s own data security.
When Alice grants FDA permission to the Terminal for herself, all users now have FDA permission via the Terminal as well. The upshot is that Alice isn’t only granting herself the privilege to access others’ data, she’s granting others the privilege to access her data, too.
Surprisingly, Alice’s (no doubt) unintended permissiveness also extends to unprivileged users. As reported in CVE-2020-9771, allowing the Terminal to have Full Disk Access renders all data readable without any further security challenges: the entire disk can be mounted and read even by non-admin users. Exactly how this works is nicely laid out in this blog post here, but in short any user can create and mount a local snapshot of the system and read all other users’ data.
Even Standard users can read Admin’s private data
The ‘trick’ to this lies in two command line utilities, both of which are available to all users: /usr/bin/tmutil and /sbin/mount. The first allows us to create a local snapshot of the entire system, and the second to mount that snapshot as an apfs read-only file system. From there, we can navigate all users’ data as captured on the mounted snapshot.
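In outline, the sequence looks something like the transcript below (the snapshot name, date and username are illustrative; on a patched system this now requires Full Disk Access, as noted in the blog post linked above):

```
% tmutil localsnapshot
% tmutil listlocalsnapshots /
com.apple.TimeMachine.2021-08-01-123456.local
% mkdir /tmp/snap
% mount_apfs -o ro -s com.apple.TimeMachine.2021-08-01-123456.local /System/Volumes/Data /tmp/snap
% ls /tmp/snap/Users/alice/Desktop
```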
It’s important to understand that this is not a bug and will not be fixed (at least, ‘works as intended’ appears to be Apple’s position at the time of writing). The CVE mentioned above was the bug for being able to exploit this without Full Disk Access. Apple’s fix was to make it only possible when Full Disk Access has been granted. The tl;dr for Mac admins?
When you grant yourself Full Disk Access, you grant all users (even unprivileged users) the ability to read all other users’ data on the disk, including your own.
Backdooring Full Disk Access Through Automation
This situation isn’t restricted only to users: it extends to user processes, too. Any application granted Full Disk Access has access to all user data, by design. If that application is malware, or can be controlled by malware, then the malware has that access, too. But application control is managed by another TCC preference, Automation.
And here lies another trap: there is one app on the Mac that always has Full Disk Access but never appears in the Full Disk Access pane in System Preferences: the Finder.
Any application that can control the Finder (listed in ‘Automation’ in the Privacy pane) also has Full Disk Access, although you will see neither the Finder nor the controlling app listed in the Full Disk Access pane.
Because of this complication, administrators must be aware that even if they never grant FDA permissions, or even if they lock down Full Disk Access (perhaps via MDM solution), simply allowing an application to control the Finder in the ‘Automation’ pane will bypass those restrictions.
Automating the Finder allows the controlling app Full Disk Access
In the image above, Terminal, and two legitimate third party automation apps, Script Debugger and FastScripts, all have Full Disk Access, although none are shown in the Full Disk Access privacy pane:
Apps that backdoor FDA through Automation are not shown in the FDA pane
As noted above, this is because the Finder has irrevocable FDA permissions, and these apps have been given automation control over the Finder. To see how this works, here’s a little demonstration.
~ osascript<<EOD
set a_user to do shell script "logname"
tell application "Finder"
set desc to path to home folder
set copyFile to duplicate (item "private.txt" of folder "Desktop" of folder a_user of item "Users" of disk of home) to folder desc with replacing
set t to paragraphs of (do shell script "cat " & POSIX path of (copyFile as alias)) as text
end tell
do shell script "rm " & POSIX path of (copyFile as alias)
t
EOD
Although the Terminal is not granted Full Disk Access, if it has been granted Automation privileges for any reason in the past, executing the script above in the Terminal will return the contents of whatever the file “private.txt” contains. As “private.txt” is located on the user’s Desktop, a location ostensibly protected by TCC, users might reasonably expect that the contents of this file would remain private if no applications had been explicitly granted FDA permissions. This is demonstrably not the case.
Backdooring FDA access through automating the Finder
The obvious mitigation here is not to allow apps the right to automate the Finder. However, let’s note two important points about that suggestion.
First, there are many legitimate reasons for granting automation of the Finder to the Terminal or other productivity apps: any mildly proficient user who is interested in increasing their productivity through automation may well have done so or wish to do so. Unfortunately, this is an “All-In” deal. If the user has a specific purpose for doing this, there’s no way to prevent other less legitimate uses of Terminal’s (or other programs’) use of this access.
Second, backdooring FDA access in this way results in a lowering of the authorization barrier. Granting FDA in the usual way requires an administrator password. However, one can grant consent for automation of the Finder (and thus backdoor FDA) without a password. A consent dialog with a simple click-through will suffice:
A simple ‘OK’ gives access to control the Finder, and by extension Full Disk Access.
While the warning text is explicit enough (if the user reads it), it is far from transparent that given the Finder’s irrevocable Full Disk Access rights, the power being invested in the controlling app goes far beyond the current user’s consent, or control.
As a bonus, this is not a per-time consent. If it has ever been granted at any point in the past, then that permission remains in force (and thus invisible to the user: ‘transparent’ in the not-good sense) unless revoked in System Preferences’ ‘Automation’ pane or via the previously mentioned tccutil reset command.
The tl;dr: keep a close and regular eye on what is allowed to automate the Finder in your System Preferences Privacy pane.
The Sorry Tale of TCC Bypasses
Everything we’ve mentioned so far is actually by design, but there is a long history of TCC bypasses to bear in mind as well. When macOS Mojave first went on public release, SentinelOne was the first to note that TCC could be bypassed via SSH (this finding was later duplicated by others). The indications from multiple researchers are that there are plenty more bypasses out there.
The most recent TCC bypass came to light after it was discovered being exploited by XCSSET malware in August 2020. Although Apple patched this particular flaw some 9 months later in May 2021, it is still exploitable on systems that haven’t been updated to macOS 11.4 or the latest security update to 10.15.7.
On a vulnerable system, it’s trivially easy to reproduce.
1. Create a simple trojan application that needs TCC privileges. Here we’ll create an app that needs access to the current user’s Desktop to enumerate the files saved there.
2. Copy the trojan into the bundle of an application that has already been granted the permission the trojan needs. One way you can find the current permitted list of apps is from the ‘Files and Folders’ category in the Privacy tab of System Preferences’ Security & Privacy pane (malware takes another route, as we’ll explain shortly).
3. Execute the trojan app:
% open /Applications/Some\ Privileged.app/ls.app
Security-minded readers will no doubt be wondering how an attacker achieves Step 2 without already having knowledge of TCC permissions – you can’t enumerate the list of privileged apps in the TCC.db from the Terminal unless Terminal already has Full Disk Access.
Assuming the target hasn’t already granted Terminal FDA privileges for some other legitimate reason (and who hasn’t these days?), an attacker, red teamer or malware could instead enumerate over the contents of the /Applications folder and take educated guesses based on what’s found there, e.g., Xcode, Camtasia, and Zoom are all applications that, if installed, are likely to be privileged.
Similarly, one could hardcode a list of apps known to have such permissions and search the target machine for them. This is precisely how XCSSET malware works: the malware is hardcoded with a list of apps that it expects to have screen capture permissions and injects its own app into the bundle of any of those found.
Decoded strings from XCSSET malware reveals a list of apps it exploits for TCC permissions
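A rough sketch of that approach is below. The candidate list is purely illustrative (it is not XCSSET’s actual hardcoded list), and the function is parameterized over the applications folder so the logic can be exercised anywhere:

```shell
# Guess which installed apps are likely to hold useful TCC grants by
# checking a hardcoded candidate list against an Applications folder.
find_privileged_candidates() {
    apps_dir="$1"
    shift
    for name in "$@"; do
        # Print the name of each candidate that is actually installed.
        if [ -d "$apps_dir/$name.app" ]; then
            echo "$name"
        fi
    done
}

# Usage on macOS (candidate names are illustrative):
#   find_privileged_candidates /Applications Xcode zoom.us "Camtasia 2021"
```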
Unfortunately, the fix for this particular bug doesn’t effectively stop malware authors. If the bypass fails, it’s a simple matter to just impersonate the Finder and ask the user for control. As with the Automation request, this only requires the user to click-through their consent rather than provide a password.
Fake Finder App used by XCSSET malware to access protected areas
As we noted above, the (real) Finder already has Full Disk Access by default, so users seeing a request dialog asking to grant the Finder access to any folder should immediately raise suspicion that something is amiss.
TCC – Just One More Thing
That almost wraps up our tour of TCC gotchas, but there’s one more worth pointing out. A common misunderstanding about Apple’s user privacy controls is that they prevent access to certain locations (e.g., Desktop, Documents, Downloads, iCloud folders). However, that is not quite the case.
Administrators need to be aware that TCC doesn’t prevent unprivileged processes from writing files to TCC-protected areas, nor does it stop those processes from reading back the files they have written there.
A process can write to a TCC protected area, and read the files it writes
Why does this matter? It matters because if you have any kind of security or monitoring software installed that doesn’t have access to TCC-protected areas, there’s nothing to stop malware from hiding some or all of its components in these protected areas. TCC isn’t going to stop malware using those locations – a blind spot that not every Mac sys administrator is aware of – so don’t rely on TCC to provide some kind of built-in protected ‘safe-zone’. That’s not how it works, when it works at all.
Conclusion
We’ve seen how macOS users can easily and unknowingly expose data they think is protected by TCC simply by doing the things that macOS users, particularly admins, are often inclined to do. Ironically, most of these ‘inadvertent breaches’ are only possible because of TCC’s own lack of transparency. Why, for example, is the Finder not listed in the Full Disk Access pane? Why is it not clear that Automation of the Finder backdoors Full Disk Access? And why is password-authentication downgraded to a simple consent prompt for what is, effectively, the same privilege?
Other questions raised by this post concern whether consent should have finer grained controls so that prompts can be optionally repeated at certain intervals, and – perhaps most importantly – whether users should be able to protect their own data by being allowed to opt out of FDA granted by other users on the same device.
We know that malware abuses some of these loopholes, and that various TCC bugs exist that have yet to be patched. Our only conclusion at this point has to be that neither users nor admins should place too much faith in the ability of TCC as it is currently implemented to protect data from unauthorized access.