Knowledge Fragment: Unwrapping Fobber


About two weeks ago I came across an interesting sample that uses a noteworthy anti-analysis pattern, best described as runtime-only code decryption: prior to execution of a function, its code is decrypted, then executed, and finally encrypted again, but with a different key. Malwarebytes has already published an analysis of this family, which they called Fobber. In this blog post, however, I want to share how to unwrap the sample’s encrypted functions for easier analysis. There is also another blog post detailing how to work around the string encryption.

The sample and code related to this blog post can be found on Github.

Fobber’s Function Encryption Scheme

First off, let’s have a look at how Fobber appears in memory, visualized by IDA’s function analysis:

ida screenshot

IDA only recognizes a handful of functions. Among these are the actual code decryption routine and some code that translates relevant addresses of the position-independent code into absolute offsets.

Next, a closer look at how the on-demand decryption/encryption of functions works:

ida screenshot

We can see that function sub_95112A starts with a call to what I renamed decryptFunctionCode:

ida screenshot

This function does not make use of the stack, thus it is surrounded by a simple pushad/popad prologue/epilogue. We can see that some references are made relative to the return address, which is initially copied into esi from [esp+20h] (pushad pushes eight 32-bit registers, i.e. 0x20 bytes, so the return address sits right above them).

The actual cryptXorCode takes those values as parameters and then loops over the encrypted function body, XORing each byte with the current key and then updating the key by rotating it by 3 bits (to the right, going by the script further down) and adding 0x53.

ida screenshot
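To make the key schedule a bit more tangible, here is a minimal Python 3 sketch of the rolling XOR, assuming the rotation is a right-rotate on an 8-bit key as in the script further down. The names next_key and crypt_xor as well as the starting key 0x5A are made up for illustration and do not come from the sample. Since the key stream does not depend on the data, the same routine both encrypts and decrypts.

```python
def next_key(key):
    """Advance the 8-bit key: rotate right by 3 bits, then add 0x53 (mod 256)."""
    key = ((key >> 3) | (key << 5)) & 0xFF
    return (key + 0x53) & 0xFF


def crypt_xor(buf, key):
    """XOR every byte of buf with the rolling key and return the result."""
    out = bytearray()
    for byte in buf:
        out.append(byte ^ key)
        key = next_key(key)
    return bytes(out)


# push ebp; mov ebp, esp; retn
plain = bytes.fromhex("558becc3")
# 0x5A is a hypothetical key, chosen only for this example
encrypted = crypt_xor(plain, 0x5A)
# applying the routine a second time with the same start key restores the bytes
assert crypt_xor(encrypted, 0x5A) == plain
```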

After decryption, our function makes a lot more sense and we can see the default function prologue (push ebp; mov ebp, esp) among other things.

ida screenshot

Also note the parameters:
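Going by the offsets that the script further down reads, the parameters sit directly in front of the call: a 16-bit size XORed with 0x461F at call_origin - 5, the one-byte key at call_origin - 3, and a one-byte flag at call_origin - 2 that is zero while the function is still encrypted. The following Python 3 sketch reads them under exactly these assumptions. The helper read_parameters, the constant name SIZE_XOR and the synthetic test bytes are hypothetical, and the byte directly before the call is not read by the script, so it is just a placeholder here.

```python
import struct

SIZE_XOR = 0x461F  # constant the script XORs onto the stored size


def read_parameters(binary, call_origin):
    """Return (flag, key, size) stored in front of the call at call_origin."""
    flag = binary[call_origin - 2]
    key = binary[call_origin - 3]
    (masked_size,) = struct.unpack_from("<H", binary, call_origin - 5)
    return flag, key, masked_size ^ SIZE_XOR


# synthetic layout: masked size, key, flag, one placeholder byte, then the call
header = struct.pack("<H", 0x20 ^ SIZE_XOR) + bytes([0x5A, 0x00, 0x90])
blob = header + b"\xE8\x00\x00\x00\x00"
flag, key, size = read_parameters(blob, call_origin=5)
print(flag, hex(key), hex(size))
```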

So far so good. Now let’s decrypt all of those functions automatically.

Decrypt All The Things

First, we want to find our decryption function. For all Fobber samples I looked at, the regex r"\x60\x8B.\x24\x20\x66" produced exactly one match, which locates the decryption function.
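The pattern reads as pushad (60), followed by mov reg, [esp+20h] (8B ?? 24 20) and an operand-size prefix (66). As a sketch of how the lookup could be wrapped in Python 3, the snippet below compiles the pattern as a bytes regex and insists on a single hit. The helper name find_decrypt_function and the synthetic buffer are assumptions for illustration. re.DOTALL is set because "." would otherwise refuse to match a 0x0A byte.

```python
import re

DECRYPT_PATTERN = re.compile(rb"\x60\x8B.\x24\x20\x66", re.DOTALL)


def find_decrypt_function(binary):
    """Return the offset of the single pattern match, or raise ValueError."""
    offsets = [m.start() for m in DECRYPT_PATTERN.finditer(binary)]
    if len(offsets) != 1:
        raise ValueError("expected exactly one match, found %d" % len(offsets))
    return offsets[0]


# synthetic buffer: nop padding around pushad; mov esi, [esp+20h]; 0x66 prefix
blob = b"\x90" * 16 + b"\x60\x8B\x74\x24\x20\x66" + b"\x90" * 16
print(hex(find_decrypt_function(blob)))
```

Failing loudly on zero or multiple matches is a deliberate choice here: if the pattern stops being unique for some sample, it is better to notice that than to patch the wrong calls.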

Next, we want to find all calls to this decryption function. For this we can use the regex r"\xE8" to find all potential call rel_offset instructions. Then we just need to do some address math and check whether the call destination (calculated as image_base + call_origin + relative_call_offset + 5, where 5 is the length of the call instruction itself) is equal to the address of our decryption function. Should this be the case, we can extract the parameters as described above and decrypt the code.
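A small worked example of this address math in Python 3 may help, in particular for backward calls, where the relative offset is negative and therefore shows up as a large unsigned value. The image base 0x950000 is only inferred from the name sub_95112A, and the offset 0x1000 for the decryption function is purely hypothetical.

```python
import struct


def call_destination(image_base, call_origin, rel_call):
    """Absolute target of a 5-byte call rel32 located at call_origin."""
    return (image_base + call_origin + rel_call + 5) & 0xFFFFFFFF


image_base = 0x950000    # assumed from the name sub_95112A
call_origin = 0x112A     # offset of the E8 byte within the buffer
decrypt_offset = 0x1000  # hypothetical offset of the decryption function

# the operand as it would be stored in the binary (signed, little-endian)
packed = struct.pack("<i", decrypt_offset - (call_origin + 5))
# reading it unsigned is fine, since the final mask wraps the sum to 32 bits
rel_call = struct.unpack("<I", packed)[0]

assert rel_call == 0xFFFFFED1
assert call_destination(image_base, call_origin, rel_call) == image_base + decrypt_offset
```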

We then only need to exchange the respective bytes in our binary with the decrypted bytes. In the following code I also set the decryption flag and fix the function ending with a retn ({ C3 }) instruction to ease IDA’s job of identifying functions afterwards. Then rinse and repeat until all functions are decrypted.
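One property worth keeping in mind is that every patch preserves the buffer length, which is why offsets collected before patching stay valid afterwards. As an alternative sketch to rebuilding the buffer by slicing and concatenation, a Python 3 bytearray can be patched in place. The helper name patch and the dummy bytes below are made up for this example.

```python
def patch(image, offset, data):
    """Overwrite len(data) bytes of a bytearray in place."""
    image[offset:offset + len(data)] = data


image = bytearray(b"\xAA" * 8)
patch(image, 2, b"\x55\x8B\xEC")  # push ebp; mov ebp, esp
patch(image, 7, b"\xC3")          # retn
# same-length slice assignment never shifts the bytes that follow
assert len(image) == 8
```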

Code:

#!/usr/bin/env python3
import re
import struct


def decrypt(buf, key):
    decrypted = bytearray()
    for byte in buf:
        decrypted.append(byte ^ key)
        # rotate the 8-bit key right by 3 bits, then add 0x53
        key = ((key >> 3) | (key << (8 - 3))) & 0xFF
        key = (key + 0x53) & 0xFF
    return bytes(decrypted)


def replace_bytes(buf, offset, data):
    # overwrite len(data) bytes at offset; the buffer length stays the same
    return buf[:offset] + data + buf[offset + len(data):]


def decrypt_all(binary, image_base):
    # locate decryption function (re.DOTALL so that "." also matches byte 0x0A)
    decrypt_function_offset = re.search(rb"\x60\x8B.\x24\x20\x66", binary, re.DOTALL).start()
    # locate all calls to decryption function; the lookahead keeps each match
    # one byte wide, so an 0xE8 inside another candidate's operand is tested too
    regex_call = rb"\xe8(?=.{4})"
    for match in re.finditer(regex_call, binary, re.DOTALL):
        call_origin = match.start()
        if call_origin < 0x5:
            # not enough room in front of the call for the parameters
            continue
        packed_call = binary[call_origin + 1:call_origin + 1 + 4]
        rel_call = struct.unpack("<I", packed_call)[0]
        call_destination = (image_base + call_origin + rel_call + 5) & 0xFFFFFFFF
        if call_destination == image_base + decrypt_function_offset:
            # decrypt function and replace/fix
            decrypted_flag = binary[call_origin - 0x2]
            if decrypted_flag == 0x0:
                key = binary[call_origin - 0x3]
                size = struct.unpack("<H", binary[call_origin - 0x5:call_origin - 0x3])[0] ^ 0x461F
                buf = binary[call_origin + 0x5:call_origin + 0x5 + size]
                decrypted_function = decrypt(buf, key)
                # write the decrypted body back behind the call
                binary = replace_bytes(binary, call_origin + 0x5, decrypted_function)
                # fix the function ending with a retn
                binary = replace_bytes(binary, call_origin + len(decrypted_function), b"\xC3")
                # set the decryption flag
                binary = replace_bytes(binary, call_origin - 0x2, b"\x01")
    return binary

[...]

IDA already likes this better:

ida screenshot

However, we are not quite done yet, as IDA still barfs on a couple of functions.

Conclusion

After decrypting all functions, we can already start analyzing the sample effectively. But we are not quite done yet, and the second post looks closer at the inline usage of encrypted strings.

sample used:
md5: 49974f869f8f5d32620685bc1818c957
sha256: 93508580e84d3291f55a1f2cb15f27666238add9831fd20736a3c5e6a73a2cb4

link to original post on blogspot.