Knowledge Primer

The concepts below are useful for understanding the rest of this blog series, so they are covered here very briefly. The idea is to provide a refresher for those already familiar with them, or a starting point for anyone wanting to dive deeper into the topics covered. Some familiarity with these concepts will be highly advantageous to the reader.

PE Structure

A very high-level overview of the Portable Executable (PE) file structure is represented below.

[Figure: PE file structure overview. Image Source]

A PE file contains some “headers” that will store data and metadata about the file itself, and “sections” that will store data required for the execution.

The most common sections are:

  • .text - executable code
  • .data - initialised global and static variables used by the program. These variables are initialised with specific values at compile time.
  • .rdata - read-only data, such as constant strings, string literals, and other immutable data.
  • .rsrc - other resources the binary may use, e.g. icons, images, etc.

Among the most useful pieces of information for us are the Relative Virtual Addresses (RVAs) stored in the headers. An RVA is an offset from the address at which the PE is loaded, so given the PE base address we can navigate to specific data and resources (address = base address + RVA).
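
To make the base-plus-offset idea concrete, here is a minimal C++ sketch that walks from an image base to the NT headers. The helper names (read_u32, rva_to_ptr, find_nt_headers) are made up for this illustration. It relies only on two well-known layout facts: the DOS header starts with "MZ" and stores the offset of the NT headers (e_lfanew) at offset 0x3C, and the NT headers start with the "PE\0\0" signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a little-endian 32-bit value from a byte buffer.
inline std::uint32_t read_u32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Resolve an offset (RVA) against the base address the image is loaded at.
inline const std::uint8_t* rva_to_ptr(const std::uint8_t* image_base, std::uint32_t rva) {
    assert(image_base != nullptr);
    return image_base + rva;
}

// Locate the NT headers of an image held in a buffer of `size` bytes.
// Returns nullptr if the buffer does not look like a PE image.
inline const std::uint8_t* find_nt_headers(const std::uint8_t* image_base, std::size_t size) {
    if (size < 0x40 || image_base[0] != 'M' || image_base[1] != 'Z') {
        return nullptr; // no DOS header
    }
    const std::uint32_t e_lfanew = read_u32(image_base + 0x3C);
    if (e_lfanew > size || size - e_lfanew < 4) {
        return nullptr; // offset points outside the buffer
    }
    const std::uint8_t* nt = rva_to_ptr(image_base, e_lfanew);
    return std::memcmp(nt, "PE\0\0", 4) == 0 ? nt : nullptr;
}
```

Every later lookup follows the same pattern: read an offset from a known structure, add it to the base address, and interpret what is found there.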

More about it in @xen0vas’s blog post and on Microsoft’s website.

Windows Operating System (OS)

The Windows OS is mainly divided into Usermode and Kernelmode. Kernelmode is hard to access and typically requires an installed driver, whereas Usermode is what is accessible by users on the machine.

The following diagram shows the general flow of interaction between both modes.

[Figure: interaction between Usermode and Kernelmode. Image Source]

Debuggers

Debuggers are the tools that allow low-level programmers to inspect and debug a program. There are many debuggers available for Windows environments; I personally like the following ones:

  • x64dbg - Inspecting a program, process loaded modules, etc.
  • WinDbg - Inspecting Windows process structures such as the Process Environment Block (PEB), etc.

Stack

General

A high-level overview of how a process might be represented in memory.

[Figure: process memory layout. Image Source]

Last-In First-Out (LIFO)

A common stack organisation concept that allows low-level programmers to understand how items are placed and retrieved from the stack.

Consider the following assembly code:

push 1  ; decimal 1 value
push 2  ; decimal 2 value
push 3  ; decimal 3 value
pop rax ; rax will now contain decimal value 3

[Figure: stack LIFO push and pop. Image Source]
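
The same push/pop sequence can be modelled in a few lines of C++. This is only a toy sketch: ToyStack and demo_lifo are names invented for this illustration, and the model ignores the fact that the real x64 stack grows towards lower addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A toy model of the stack: push adds to the top, pop removes from the top.
struct ToyStack {
    std::vector<std::uint64_t> items;

    void push(std::uint64_t value) { items.push_back(value); }

    std::uint64_t pop() {
        assert(!items.empty());
        const std::uint64_t top = items.back();
        items.pop_back();
        return top;
    }
};

// Mirrors the assembly above: push 1, push 2, push 3, pop rax.
inline std::uint64_t demo_lifo() {
    ToyStack stack;
    stack.push(1);
    stack.push(2);
    stack.push(3);
    return stack.pop(); // the last value pushed is the first one popped
}
```

Calling demo_lifo() returns 3, matching the value that ends up in rax.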

x64 Assembly

Assembly Formats

Two commonly used assemblers are listed below; this blog will focus on MASM.

  • NASM (Netwide Assembler)
  • MASM (Microsoft Macro Assembler)

Data Sizes

These are common data sizes found in assembly that represent chunks of data.

  • BYTE - 8 bits of data or 1 byte
  • WORD - 16 bits of data or 2 bytes
  • DWORD - 32 bits of data or 4 bytes
  • QWORD - 64 bits of data or 8 bytes

Registers

Registers available in the x64 architecture.

[Figure: x64 registers. Image Source]

Common Instructions

This is by no means an exhaustive list, but some common instructions and their meanings are listed here.

  • xor r8, r8 - Clear the R8 register (set it to 0)
  • mov r8, rdx - Copy the value held in RDX into R8 (if RDX holds a pointer, R8 now holds the same pointer)
  • mov r8, [rdx] - Load the 64-bit value stored at the address held in RDX into R8
  • mov r8d, dword ptr [rdx] - Load the 32-bit value stored at the address held in RDX into the lower 32 bits of R8 (the upper 32 bits are zeroed)
  • mov r8w, word ptr [rdx] - Load the 16-bit value stored at the address held in RDX into the lower 16 bits of R8
  • mov r8b, byte ptr [rdx] - Load the 8-bit value stored at the address held in RDX into the lower 8 bits of R8
  • push r8 - Push the contents of R8 onto the stack (can be used for saving register values)
  • pop r8 - Pop the most recently pushed stack item into the R8 register (can be used for restoring a previously saved register value)
  • shr r8, 1 - Bit shift right (effectively dividing the unsigned register contents by 2)
  • shl r8, 1 - Bit shift left (effectively multiplying the register contents by 2)
  • jmp <label> - Jump to a specific label in the assembly code
  • cmp r8, r9 - Compare the contents of the R8 and R9 registers (the comparison is a subtraction whose result is discarded; only the status flags are updated)
  • jnz <label> - Jump to the label if the Zero Flag (ZF) is not set, e.g. when a preceding cmp found its operands not equal
  • call <label> - Call another defined assembly function

You can find some more useful information here
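
As a rough illustration of the memory loads, the following C++ sketch models what ends up in a 64-bit register after a load of each width. The function names are invented for this example, and it is a model of the documented behaviour rather than real register access: a 32-bit write zero-extends into the full register, while 8-bit and 16-bit writes leave the upper bits untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read `width` bytes (1, 2, 4 or 8) from memory as a little-endian integer,
// which is how x64 interprets BYTE/WORD/DWORD/QWORD memory operands.
inline std::uint64_t load_le(const std::uint8_t* p, std::size_t width) {
    assert(width == 1 || width == 2 || width == 4 || width == 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    }
    return value;
}

// mov r8d, dword ptr [mem]: the old register contents are fully replaced.
inline std::uint64_t mov_r32(std::uint64_t /*old_reg*/, const std::uint8_t* mem) {
    return load_le(mem, 4);
}

// mov r8w, word ptr [mem]: the upper 48 bits keep their old value.
inline std::uint64_t mov_r16(std::uint64_t old_reg, const std::uint8_t* mem) {
    return (old_reg & ~0xFFFFull) | load_le(mem, 2);
}

// mov r8b, byte ptr [mem]: the upper 56 bits keep their old value.
inline std::uint64_t mov_r8(std::uint64_t old_reg, const std::uint8_t* mem) {
    return (old_reg & ~0xFFull) | load_le(mem, 1);
}

// cmp a, b computes a - b, discards the result and sets ZF when it is zero.
inline bool zero_flag_after_cmp(std::uint64_t a, std::uint64_t b) {
    return (a - b) == 0;
}
```

The difference between the 32-bit and the 8/16-bit cases is the reason the string routines later in this post clear a register with xor before loading a single byte into it.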

String Length, Comparisons and Obfuscation in x64 MASM Assembly

String Length & Comparison with x64 MASM Assembly

This may not be the most efficient way to perform string comparisons in assembly, but understanding some of these concepts will help to understand how to search for process loaded modules and their specific functions.

As we will be looping through a list of structures that contain either a function name or module name string (which could be represented as a NULL-terminated character array), we can be quite efficient in our comparisons.

  1. Length: An obvious choice, if the module or function name character length does not match the length of the string we are searching for, we can immediately iterate to the next module or function string that needs to be compared. Sometimes we will be fortunate and Windows structures will contain the length of whatever string we want to work with, other times we will need to find the length of a given string ourselves.
  2. Byte Comparison: If the length of our search string matches the length of the current module or function name string being checked, we can perform a byte-by-byte character comparison check to make sure all characters in the current string match those of our search string.
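
The two steps can be sketched in C++ before looking at the assembly. This is an illustrative sketch only: find_name is a hypothetical helper, and it recomputes each length with strlen, whereas the point above is that a stored length (when a Windows structure provides one) makes the first check nearly free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Return the index of `wanted` in `names`, or -1 if it is not present.
// The length check runs first; bytes are only compared on a length match.
inline int find_name(const char* const* names, std::size_t count, const char* wanted) {
    assert(names != nullptr && wanted != nullptr);
    const std::size_t wanted_len = std::strlen(wanted);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strlen(names[i]) != wanted_len) {
            continue; // different length, cannot be a match
        }
        if (std::memcmp(names[i], wanted, wanted_len) == 0) {
            return static_cast<int>(i); // every byte matched
        }
    }
    return -1;
}
```

With a list such as {"NtClose", "NtDelayExecution", "NtCreateThreadEx"}, searching for "NtDelayExecution" skips the first entry on length alone and only compares bytes for the second.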

Sample x64 MASM assembly to get a string length (reading characters from a location till we hit a NULL byte):

; logic to find currentFunc str length
get_string_length PROC

    ; rcx has currentFunc ptr (first argument)

    ; rdi is non-volatile in the Windows x64 calling convention,
    ; so save it before using it as a counter
    push rdi

    ; Initialize rdi to use it as a counter
    xor rdi, rdi

    ; Loop through the string byte by byte
count_loop:
    cmp byte ptr [rcx + rdi], 0   ; Compare the current byte with the NULL terminator
    je count_done                 ; If the NULL terminator is found, end counting
    inc rdi                       ; Increment the counter
    jmp count_loop                ; Continue counting

count_done:
    ; Move the length to rax
    mov rax, rdi
    pop rdi                       ; restore the caller's rdi
    ret                           ; the length is returned to the caller in rax

get_string_length ENDP

Let’s run through the above code. Our get_string_length function takes one argument: a pointer to a string (that is, its address). This is passed into the x64 MASM assembly function in the rcx register, which carries the first (and in this case only) function parameter. We clear a register of our choice, here rdi, with the xor instruction so that we can use it as a loop counter. We then loop, comparing each byte in turn to NULL (0x00). For every byte that is not NULL we increment the counter; when we hit NULL, the counter holds the length of the string. At that point we copy the counter from rdi into rax (which carries return values) and exit the get_string_length assembly function, returning the string length to the calling function.
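
For readers more comfortable with C++, the same logic can be written as below. This is a line-for-line sketch of the assembly, not a replacement for it; string_length is a name chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

// Same logic as get_string_length: count bytes until the NULL terminator.
inline std::size_t string_length(const char* str) {
    assert(str != nullptr);
    std::size_t counter = 0;          // xor rdi, rdi
    while (str[counter] != '\0') {    // cmp byte ptr [rcx + rdi], 0 / je count_done
        ++counter;                    // inc rdi
    }
    return counter;                   // mov rax, rdi / ret
}
```

For "ntdll.dll" this returns 9 and for "NtDelayExecution" it returns 16.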

Sample x64 MASM assembly to perform byte-by-byte comparison of two strings (handles comparison of two strings of the same length that are NULL terminated):

compare_strings PROC
    ; Inputs:
    ;   rcx: Pointer to the first string  ; currentFunc
    ;   rdx: Pointer to the second string ; FuncName (r9)
    ;   r8: Length of the strings (must be non-zero: loop decrements rcx before testing it)
    ; Outputs:
    ;   rax: Zero if strings are equal, non-zero otherwise

    ; rbx, r14 and r15 are non-volatile in the Windows x64 calling convention,
    ; so save them before use and restore them before returning
    push rbx
    push r14
    push r15

    ; Compare byte by byte
    ; r14 and r15 hold the bytes being compared
    xor r14, r14
    xor r15, r15
    ; rcx is the counter used by the loop instruction
    mov rbx, rcx ; move the first string pointer into rbx
    mov rcx, r8  ; put the length in rcx

compare_loop:
    mov r14b, BYTE PTR [rbx]  ; Load byte from the first string
    mov r15b, BYTE PTR [rdx]  ; Load byte from the second string
    cmp r14b, r15b            ; Compare bytes
    jne strings_not_equal     ; If bytes are not equal, jump to strings_not_equal
    inc rbx                   ; Move to the next byte in the first string
    inc rdx                   ; Move to the next byte in the second string
    loop compare_loop         ; Decrement rcx and repeat until it reaches zero

    ; If loop completes without finding a difference, strings are equal
    mov rax, 0  ; Set return value to 0 (strings are equal)
    jmp end_compare_strings

strings_not_equal:
    ; If a difference is found, set return value to non-zero
    mov rax, 1  ; Set return value to 1 (strings are not equal)

end_compare_strings:
    ; Restore the saved registers in reverse order
    pop r15
    pop r14
    pop rbx
    ret

compare_strings ENDP

For the string comparison we take in pointers to both strings in rcx and rdx respectively, and the common length of both in r8. We use two temporary registers, r14 and r15, to perform our byte-by-byte comparison. We need to reshuffle our registers because rcx acts as the counter for the loop instruction, so we move the string length from r8 into rcx, which is automatically decremented each time the loop instruction executes. Before doing so, we move the pointer to the first string into rbx.

In the loop itself, we load the current byte (8 bits) pointed to by rbx and rdx into r14b and r15b respectively (r14b and r15b refer to the lower 8 bits of each 64-bit register). We then execute the cmp instruction, which records its result in the status flags; the one we are interested in here is the Zero Flag (ZF). If ZF is set after the comparison, the bytes are the same. If ZF is not set, the following jne instruction is taken and the function returns a non-zero value to signal a mismatch. Otherwise, the addresses held in rbx and rdx are incremented by 1 so that they point to the next character in each string. If the loop completes without a mismatch, the strings are equal and we return 0 for success from the assembly function.
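
A C++ sketch of the same contract is shown below, with the matching instructions noted in comments. compare_bytes is a name chosen for this example. One deliberate difference: with a length of 0 this version returns 0 immediately, whereas the loop instruction decrements rcx before testing it, so the assembly should only be called with a non-zero length.

```cpp
#include <cassert>
#include <cstddef>

// Same contract as compare_strings: returns 0 when the first `length` bytes
// of both strings match and 1 otherwise.
inline int compare_bytes(const char* first, const char* second, std::size_t length) {
    assert(first != nullptr && second != nullptr);
    for (std::size_t remaining = length; remaining != 0; --remaining) { // loop compare_loop
        if (*first != *second) {  // cmp r14b, r15b / jne strings_not_equal
            return 1;
        }
        ++first;                  // inc rbx
        ++second;                 // inc rdx
    }
    return 0;                     // mov rax, 0
}
```

For example, compare_bytes("Test", "Test", 4) returns 0 and compare_bytes("Test", "Tesa", 4) returns 1.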

Basic String Obfuscation in x64 MASM Assembly

Having cleartext strings in a PE is suboptimal for malware developers and costly in OPSEC terms. Cleartext content referencing ntdll.dll or NtDelayExecution is likely to raise alarms during static analysis.

Using complex encryption such as AES has the benefit of being hard to decrypt, but significantly increases the PE entropy. There are other better suited encryption functions such as XOR that we could consider using or we could implement some form of function lookup via API Hashing. But for this code example, let’s do something a lot simpler, keeping the entropy low and achieving our goal.
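
To put a number on "entropy", the usual measure is Shannon entropy over byte frequencies, which ranges from 0 bits per byte (a single repeated value) to 8 bits per byte (all 256 values equally likely, which well-encrypted data approaches). The following is a minimal sketch for measuring a buffer; shannon_entropy is a name chosen for this example and is not tied to any particular analysis tool.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Shannon entropy of a byte buffer in bits per byte (0.0 to 8.0).
inline double shannon_entropy(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size == 0) {
        return 0.0;
    }

    // Count how often each byte value occurs.
    std::array<std::size_t, 256> counts{};
    for (std::size_t i = 0; i < size; ++i) {
        ++counts[data[i]];
    }

    // Sum -p * log2(p) over the byte values that actually occur.
    double entropy = 0.0;
    for (std::size_t count : counts) {
        if (count == 0) {
            continue;
        }
        const double p = static_cast<double>(count) / static_cast<double>(size);
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

Running this over a section before and after an obfuscation step is one way to check how much a given scheme moves the needle.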

Obfuscate:

Obf PROC

	; In:
		; rcx - cleartext str pointer
		; rdx - length (unused, the loop stops at the NULL terminator)
	; Out:
		; rax - pointer to the string, obfuscated in place

	; iterate over string contents
	; for odd character positions, increment value
	; for even character positions, decrement value
	; positions are counted from 1, so the first character is odd
	; this should change the byte values +1 or -1
	; e.g. 0x42 ('B') becomes 0x43 (odd position) or 0x41 (even position)

	; rbx and r14 are non-volatile in the Windows x64 calling convention
	push rbx
	push r14
	; save pointer to string
	push rcx
	mov rbx, rcx ; move str pointer to rbx
	xor r14, r14 ; clear out r14, it will hold the current byte
	; r9 will be our counter
	xor r9, r9
	xor rcx, rcx

routine:
	; check if [rbx+r9] content is NULL
	cmp byte ptr [rbx + r9], 0
	je done ; if null - ZF set - done

	mov r14b, byte ptr [rbx + r9] ; current byte into r14b (lower 8 bits)
	; perform odd or even check on rcx (counter + 1)
	mov rcx, r9
	inc rcx

	test rcx, 1              ; Test the LSB with 1
	jz   even_number         ; Jump if the LSB is 0 (even)

	; Odd number code here
	inc r14b
	mov byte ptr [rbx + r9], r14b
	jmp end_check

even_number:
	; Even number code here
	dec r14b
	mov byte ptr [rbx + r9], r14b

end_check:

	; increment counter
	inc r9
	jmp routine ; go to next byte

done:
	pop rax ; get string pointer (now obfuscated)
	pop r14 ; restore saved registers
	pop rbx
	ret

Obf ENDP

We simply iterate over a string’s characters and increment or decrement each byte value based on whether the character sits at an odd or even position. Positions are counted from 1 in the code above (the counter is incremented before the parity test), so the first character is odd. We increment the value for odd-position characters, decrement it for even-position characters, and then return a pointer to the modified string.

We can apply the same concept for de-obfuscation, but this time decrementing value on odd index characters and incrementing value on even index characters to reverse our previous obfuscation.

De-obfuscate (based on obfuscated content above):

DeObf PROC

	; In:
		; rcx - obfuscated str pointer
		; rdx - length (unused, the loop stops at the NULL terminator)
	; Out:
		; rax - pointer to the string, restored to cleartext in place

	; iterate over string contents
	; for odd character positions, decrement value (inverse of Obfuscate)
	; for even character positions, increment value (inverse of Obfuscate)
	; positions are counted from 1, so the first character is odd
	; this should change the byte values +1 or -1
	; e.g. 0x42 ('B') becomes 0x41 (odd position) or 0x43 (even position)

	; rbx and r14 are non-volatile in the Windows x64 calling convention
	push rbx
	push r14
	; save pointer to string
	push rcx
	mov rbx, rcx ; move str pointer to rbx
	xor r14, r14 ; clear out r14, it will hold the current byte
	; r9 will be our counter
	xor r9, r9
	xor rcx, rcx

routine:
	; check if [rbx+r9] content is NULL
	cmp byte ptr [rbx + r9], 0
	je done ; if null - ZF set - done

	mov r14b, byte ptr [rbx + r9] ; current byte into r14b (lower 8 bits)
	; perform odd or even check on rcx (counter + 1)
	mov rcx, r9
	inc rcx

	test rcx, 1              ; Test the LSB with 1
	jz   even_number         ; Jump if the LSB is 0 (even)

	; Odd number code here
	dec r14b
	mov byte ptr [rbx + r9], r14b
	jmp end_check

even_number:
	; Even number code here
	inc r14b
	mov byte ptr [rbx + r9], r14b

end_check:

	; increment counter
	inc r9
	jmp routine ; go to next byte

done:
	pop rax ; get string pointer (now cleartext)
	pop r14 ; restore saved registers
	pop rbx
	ret

DeObf ENDP

If we wanted to use obfuscated strings in our C++ code we would obviously pre-obfuscate strings and place their resulting values in our code. Then simply call the de-obfuscation assembly function like so:

...
extern "C" int get_string_length(char* i);
extern "C" void* Obf(char* i, size_t l);
extern "C" void* DeObf(char* i, size_t l);
...
// Pre-obfuscated string with Obf() assembly function
char obfuscated[] = "Udts";
char* deobfuscated = {0};
deobfuscated = (char*)DeObf((char*)obfuscated, get_string_length((char*)obfuscated));
// printf("%s \n", deobfuscated); // De-obfuscated value: Test
...

Some strings and their obfuscated values if you wanted to use the above functions without modification:

Obfuscated: 'ntdll.dll' is 'osekm-ekm'
Obfuscated: 'NtDelayExecution' is 'OsEdm`zDyddtuhpm'
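
The values above can be reproduced, and new strings pre-obfuscated, with a small C++ equivalent of the two routines. This is a sketch that mirrors the assembly's position rule (positions counted from 1); shift_by_position, obfuscate and deobfuscate are names chosen for this example. Unlike the assembly, it works on a std::string copy rather than modifying a NULL-terminated buffer in place.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Shift each byte by +1 or -1 depending on its position, counted from 1 as in
// the assembly (the first character is "odd"). direction = +1 obfuscates,
// direction = -1 reverses it.
inline std::string shift_by_position(std::string text, int direction) {
    assert(direction == 1 || direction == -1);
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool odd_position = ((i + 1) & 1) != 0;
        const int delta = odd_position ? direction : -direction;
        text[i] = static_cast<char>(static_cast<unsigned char>(text[i]) + delta);
    }
    return text;
}

// Odd positions +1, even positions -1 (same as Obf).
inline std::string obfuscate(const std::string& clear) {
    return shift_by_position(clear, +1);
}

// Odd positions -1, even positions +1 (same as DeObf).
inline std::string deobfuscate(const std::string& obfuscated) {
    return shift_by_position(obfuscated, -1);
}
```

For instance, obfuscate("Test") yields "Udts", and deobfuscate("Udts") yields "Test" again.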

Disassembling with Zydis

Zydis is a neat little disassembly library that malware developers can use to disassemble code. It’s particularly useful when working with Windows libraries.

Here is an example of how it could be used:

Zydis helper functions (obviously not OPSEC safe, mainly for maldev debugging):

// Read up to maxBytes of code starting at functionAddress.
// Note: this does not work out the real size of the function, it only returns
// how many bytes could be read, which caps what we disassemble later
size_t GetFunctionSize(void* functionAddress) {
    // Define a maximum number of bytes to disassemble
    const size_t maxBytes = 1024;

    // Create a buffer to store the function bytes
    std::vector<uint8_t> buffer(maxBytes);

    // Read the function bytes from memory
    SIZE_T bytesRead = 0;
    if (!ReadProcessMemory(GetCurrentProcess(), functionAddress, buffer.data(), maxBytes, &bytesRead)) {
        printf("Failed to read function bytes. \n");
        return 0;
    }

    // Return the number of bytes read
    return bytesRead;
}

// Function to display the disassembled instructions
void DisplayByteCode(void* functionAddress, size_t size) {
    // updated code from GH
    // Initialize decoder context
    ZydisDecoder decoder;
    ZydisDecoderInit(&decoder, ZYDIS_MACHINE_MODE_LONG_64, ZYDIS_STACK_WIDTH_64);

    // Initialize formatter. Only required when you actually plan to do instruction
    // formatting ("disassembling"), like we do here
    ZydisFormatter formatter;
    ZydisFormatterInit(&formatter, ZYDIS_FORMATTER_STYLE_INTEL);

    // Loop over the instructions in memory.
    // The runtime address (instruction pointer) is set to the real function address
    // so that relative branch targets are printed as absolute addresses
    ZyanU64 runtime_address = reinterpret_cast<ZyanU64>(functionAddress);
    ZyanUSize offset = 0;
    // Only decode the first 22 bytes (fewer if less could be read)
    const ZyanUSize length = size < 22 ? size : 22;
    ZydisDecodedInstruction instruction;
    ZydisDecodedOperand operands[ZYDIS_MAX_OPERAND_COUNT]; // removing _VISIBLE fixed the error
    while (ZYAN_SUCCESS(ZydisDecoderDecodeFull(&decoder, reinterpret_cast<uint8_t*>(functionAddress) + offset, length - offset, &instruction, operands)))
    {
        // Print current instruction pointer.
        printf("%016" PRIX64 "  ", runtime_address);

        // Format & print the binary instruction structure to human-readable format
        char buffer[256];
        ZydisFormatterFormatInstruction(&formatter, &instruction, operands,
            instruction.operand_count_visible, buffer, sizeof(buffer), runtime_address, ZYAN_NULL);
        puts(buffer);

        offset += instruction.length;
        runtime_address += instruction.length;
    }
}

Calling Zydis in your main code block:

#include <windows.h>
#include <cstdio>
#include <vector>
#include <inttypes.h>
#include "Zydis.h"
...
 printf("Testing resolved function dissasembly with Zydis. \n");
 printf("--------------------------------------------------- \n");

 // GetProcAddress returns a FARPROC, so cast it explicitly to void*
 void* functionAddress = reinterpret_cast<void*>(GetProcAddress(GetModuleHandle(L"ntdll.dll"), "NtCreateThreadEx"));
 if (functionAddress == nullptr) {
     printf("Failed to resolve function address. \n");
     return 1;
 }

 // Get the number of readable bytes at the function address
 size_t functionSize = GetFunctionSize(functionAddress);
 if (functionSize == 0) {
     printf("Failed to get function size. \n");
     return 1;
 }

 // Display the disassembled instructions
 DisplayByteCode(functionAddress, functionSize);
 ...

We can place an “EDR hook” in our development machine with SylantStrike, and use Zydis to output the hooked assembly of the NtProtectVirtualMemory ntdll.dll function:

ntdll.dll base address (Assembly DLL resolve via PEB): 0x00007FFCBF420000
NtProtectVirtualMemory function address (Assembly NtFunc resolve via PEB): 0x00007FFCBF4C0BC0
NtProtectVirtualMemory function EDR hook jmp address: 0x00000000FFF50411
Testing resolved function dissasembly with Zydis.
---------------------------------------------------
00007FFCBF4C0BC0   E9 11 04 F5 FF  jmp 0x00007FFCBF410FD6
00007FFCBF4C0BC5   25 08  add [rax], al
00007FFCBF4C0BC7   7F 01  add dh, dh
00007FFCBF4C0BC9   0F 05  add al, 0x25
00007FFCBF4C0BCB   2E C3  or [rbx], al
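
The first line of the output shows how the hook address is derived. E9 is the opcode for a near jmp with a signed 32-bit displacement stored little-endian, so the bytes 11 04 F5 FF are the value 0xFFF50411 printed above, and the displacement is relative to the address of the next instruction. A minimal C++ sketch of that calculation follows; jmp_rel32_target is a name chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Decode the destination of a 5-byte `jmp rel32` (opcode E9).
// The displacement is a signed little-endian 32-bit value, relative to the
// address of the instruction that follows the jmp.
inline std::uint64_t jmp_rel32_target(std::uint64_t instruction_address, const std::uint8_t* bytes) {
    assert(bytes != nullptr && bytes[0] == 0xE9);
    const std::uint32_t raw = static_cast<std::uint32_t>(bytes[1]) |
                              (static_cast<std::uint32_t>(bytes[2]) << 8) |
                              (static_cast<std::uint32_t>(bytes[3]) << 16) |
                              (static_cast<std::uint32_t>(bytes[4]) << 24);
    // Reinterpret as signed and widen to 64 bits (sign extension).
    const std::int64_t displacement = static_cast<std::int32_t>(raw);
    return instruction_address + 5 + static_cast<std::uint64_t>(displacement);
}
```

Feeding it the address 0x00007FFCBF4C0BC0 and the bytes E9 11 04 F5 FF gives 0x00007FFCBF410FD6, the jmp destination Zydis printed.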