Before publishing new articles on reversing and internals, I was asked to write a text about Malwoverview, which is a simple tool used for threat hunting. Therefore, I’ve written a short, introductory article about how to use Malwoverview and Tines:
I resumed writing the second article of the Malware Analysis Series (MAS) on January 8, after receiving outstanding support from a high-profile professional and a company in the industry. While that article is not ready yet (I’m working on page 43 and far from the end), I spent a couple of hours writing a short and simple article on malicious document analysis. I hope it helps someone.
While the first article of MAS (Malware Analysis Series) is not ready, I’m leaving here a very simple case of malicious document analysis to help my Twitter followers and any professional interested in learning how to analyze this kind of artifact.
Before starting the analysis, I’m going to use the following environment and tools:
All three tools above are usually installed on REMnux by default. However, if you are using Ubuntu or any other Linux distribution, you can install them through the links and commands above.
Like any common binary, a maldoc can be analyzed using static or dynamic analysis. As my preferred approach is always the former, let’s take it.
We’ll be analyzing the following sample: 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc
Downloading sample and gathering information
The first step is getting general information about this hash by using any well-known endpoint such as VirusTotal, Hybrid Analysis, Triage, Malware Bazaar and so on. Therefore, let’s use Malwoverview to do it on the command line and collect information from Malware Bazaar, which, fortunately, also includes information from the excellent Triage:
Given the output above, we could assume that the dropped executable comes from the maldoc itself, because Microsoft Office “loads VBA resource, possible macro or embedded object present”. Furthermore, the maldoc seems to elevate privileges (AdjustPrivilege()), install a hook procedure into a hook chain to intercept events (SetWindowsHookEx()), and possibly perform code injection (WriteProcessMemory()), so it’s reasonable to assume these Triage signatures are associated with an embedded executable. Therefore, it’s time to download the malicious document from Triage (you can also do it from the https://tria.ge/dashboard website, if you wish):
From both previous outputs, important facts come up:
Some code is executed when MS Word is executed.
A file seems to be written to the file system.
The maldoc seems to open a file (probably the same written above).
VBA macros are responsible for the entire activity.
The next step is to analyze the maldoc, which is an OLE document. We are going to use oledump.py (from Didier Stevens’s suite, @DidierStevens) to check the OLE internals and try to understand what’s happening:
According to the figure above we have:
three macros in 16, 17 and 18.
a big “content” in 11, which could be one of the “VBA resources” mentioned in Triage’s output.
Once again, we can decide to use dynamic analysis (a debugger) or static analysis to expose the real threat hidden inside this malicious document, but let’s proceed with static analysis because it will bring more details while addressing the problem.
In the next step we need to check the macros’ content by uncompressing their contents (-v option) using oledump.py:
remnux@remnux:~/articles$ oledump.py -s 16 -v 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx | more
A few details can be observed from the output above:
Obviously the code is obfuscated.
The Split function, which returns a zero-based, one-dimensional array of substrings, processes the content of UserForm1 (object 11). Apparently, this content is divided into four parts (TextBox1, TextBox2, TextBox3 and TextBox4) and its values are separated by the “!” character.
UserForm2 (TextBox1 and TextBox2) is also being used in a MoveFile operation.
The Winmgmt service, which is a WMI service operating inside the svchost process under LocalSystem account, is being used to execute an operation given by UserForm2.TextBox5.
The UserForm2.Text6 is used to create a reference to an object provided by ActiveX.
The UserForm2.Text7 is being used to save some content as a binary file.
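To make the Split observation more concrete, here is a minimal Python sketch of what the macro appears to do with UserForm1. It assumes the four text boxes are concatenated before being split on “!”; the sample values and the function name vba_like_split are invented for illustration and are not taken from the maldoc.

```python
# Hypothetical stand-ins for UserForm1.TextBox1..TextBox4.
# The values are invented; note that a number ("144") may straddle two boxes.
sample_boxes = ["77!90!14", "4!0!3", "!0!0", "!0"]


def vba_like_split(text_boxes, separator="!"):
    """Concatenate the text box values and split them like VBA's Split().

    The result is a zero-based, one-dimensional list of substrings.
    """
    joined = "".join(text_boxes)
    return joined.split(separator)


tokens = vba_like_split(sample_boxes)
print(tokens)       # list of decimal strings
print(len(tokens))  # number of encoded values
```

Joining before splitting matters in this sketch: splitting each box separately would turn the straddling value into two bogus numbers.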
Therefore we must investigate the content of object 15 (Macros/UserForm2/o):
Analyzing the image above (check SaveBinaryData() function) and previous figures, it’s reasonable to assume that an executable, which we don’t know yet, will be saved as “winword.com“ and later it will be renamed to “winword.exe“ within C:\Users\Public\Pictures\ directory. Finally, the binary will be executed by calling objProcess.create() function.
At this point, we should verify the content of object 11 (check “Macros/UserForm1/o“) because it likely contains our “hidden” executable. Thus, run the following command:
remnux@remnux:~/articles$ oledump.py 59ed41388826fed419cc3b18d28707491a4fa51309935c4fa016e53c6f2f94bc.docx -s 11 -d | more
As we expected and mentioned previously, these decimal numbers are separated by the “!” character.
Additionally, there’s a catch: according to the last figure, this object has 4 parts (UserForm1.Text1, UserForm1.Text2, UserForm1.Text3 and UserForm1.Text4), so we should dump it into a file (dump1), edit it and “join” all parts.
To dump “object 11” into a file (named dump1), execute the following command:
Edit the file using the “vi” command or any other editor.
Use “$” to go to the end of each line.
Remove occurrences of the word “Tahoma” and any garbage (easily identified) from the text.
Join each line with the next one (“J” command in “vi“).
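The manual editing steps above could also be approximated with a few lines of Python. This is only a sketch under a strong assumption: that the encoded payload is the only place in the dump where long runs of digits and “!” characters occur, so everything else (the “Tahoma” font name, binary garbage) can be dropped. The function name extract_decimal_runs, the min_length threshold and the sample bytes are all hypothetical.

```python
import re


def extract_decimal_runs(raw: bytes, min_length: int = 16) -> str:
    """Return every long run of digits and '!' found in raw, joined in order.

    Short runs are ignored because they are more likely to be stray bytes
    than part of the encoded payload (an assumption, not a guarantee).
    """
    pattern = re.compile(rb"[0-9!]{%d,}" % min_length)
    return b"".join(pattern.findall(raw)).decode("ascii")


# Synthetic stand-in for the dumped stream (invented for illustration).
sample_dump = (
    b"\x00\x02\x18" + b"77!90!144!0!3"
    + b"\x00\x05Tahoma\x00" + b"!0!0!0!4!0!0" + b"\xff\x01"
)
cleaned = extract_decimal_runs(sample_dump, min_length=8)
print(cleaned)
```

On a real dump, the result should still be reviewed by eye, since garbage that happens to consist of digits would survive this filter.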
After editing the dump1 file, we have to replace all “!” characters with commas and transform all decimal numbers into hex bytes. First, replace all “!” characters with commas using a simple “sed” command:
remnux@remnux:~/articles$ sed -e 's/!/,/g' dump1 > dump3
remnux@remnux:~/articles$ cat dump3 | more
From this point, we have to transform this file (dump3) into something useful, and we have two clear options:
We can use CyberChef to decode the dump3 file.
We can write Python 3 code to statically decode the dump3 file into a possible executable.
I’m going to show you both methods, though I always prefer programming a small script. Please note that all decimal numbers are separated by commas, which requires extra care during the decoding operation.
To decode this file on CyberChef you have to:
Load it onto CyberChef’s input pane. There’s a button on the top-right to do it.
Pick the “From Decimal” operation and set the delimiter to “Comma”.
Afterwards, you’ll see an executable in the Output pane, which can be saved to the file system.
After saving the file from the Output pane, check its type:
remnux@remnux:~/Downloads$ file download.dat
download.dat: PE32 executable (GUI) Intel 80386 Mono/.Net assembly, for MS Windows
Excellent! Let’s now write a simple Python script named python_convert.py to perform the same operation and get the same result:
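The original listing is not reproduced here, so what follows is only a minimal sketch of how such a script might look, not the exact code used in this analysis. It assumes dump3 contains comma-separated decimal values between 0 and 255 and that any line breaks fall next to a comma rather than inside a number; the function names are illustrative.

```python
import sys


def decode_decimal_text(text: str, separator: str = ",") -> bytes:
    """Turn a string such as '77,90,144' into the bytes b'MZ\\x90'.

    Empty tokens (for example from a trailing comma) are skipped, and
    bytes() raises ValueError if a value falls outside 0..255.
    """
    tokens = (token.strip() for token in text.split(separator))
    return bytes(int(token) for token in tokens if token)


def convert_file(source_path: str, target_path: str, separator: str = ",") -> int:
    """Decode source_path and write the raw bytes to target_path."""
    with open(source_path, "r", encoding="ascii", errors="ignore") as source:
        data = decode_decimal_text(source.read(), separator)
    with open(target_path, "wb") as target:
        target.write(data)
    return len(data)


# Only runs when both paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    written = convert_file(sys.argv[1], sys.argv[2])
    print(f"wrote {written} bytes to {sys.argv[2]}")
```

Running it as python3 python_convert.py dump3 final_file.bin and then calling file on final_file.bin would be the scripted equivalent of the CyberChef steps.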
final_file.bin: PE32 executable (GUI) Intel 80386 Mono/.Net assembly, for MS Windows
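The verdict given by the file command can be cross-checked from Python as well. The sketch below only tests the two classic markers of a PE file (the “MZ” magic and the “PE\0\0” signature located through the offset stored at 0x3C) and computes the SHA-256 hash that is typically used to look a sample up on online services. The helper names are illustrative, and the sample header is synthetic rather than taken from the extracted binary.

```python
import hashlib
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the DOS 'MZ' magic and the 'PE\\0\\0' signature it points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


# Synthetic minimal header: 'MZ', padding up to 0x3C, e_lfanew = 0x40, 'PE\0\0'.
sample_header = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40) + b"PE\x00\x00"
print(looks_like_pe(sample_header))
print(sha256_hex(sample_header))
```

For the real file, reading final_file.bin in binary mode and passing its content to both helpers would give a quick sanity check before submitting the hash anywhere.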
As we expected, it worked! Finally, let’s check the final binary on VirusTotal and Triage to learn a bit more about the extracted binary (next figures):
It would be super easy to extract the same malware from the maldoc by using dynamic analysis. You’ll find out that a password protects the VBA project, but it is quite trivial to remove this kind of protection:
That’s it! I hope you have learned something new from this article and see you in the next one.
“Long is the way and hard, that out of hell leads up to light.”
(by John Milton from Paradise Lost — 1667)
My name is Alexandre Borges and I’m a security researcher focused on reverse engineering, exploit development and programming. I’ll try to keep this blog updated and include write-ups about these topics.
Honestly, I hope you can learn something from my posts.
Please feel free to contact me and point out any mistakes or inaccuracies.