Memory (RAM)
Why is memory relevant in malware analysis?
This section will delve into memory, but let's get an overview of why and how it is relevant.
Volatile and ephemeral nature of data stored: Unlike a hard drive, which holds persistent data, RAM holds the current state of a machine, and that state may be irretrievably lost once the machine is switched off. Getting a snapshot of an infected machine's RAM with a memory dump tool can shed light on crucial artifacts such as processes and their modules (DLLs), network connections, user activity (e.g., clipboard contents), and keys and passwords.
Identify fileless and memory-resident malware: Fileless malware, which never writes itself to disk in executable form, may not be picked up by AV and disk-based forensic tools (e.g., FTK Imager, Autopsy/Sleuth Kit). It tends to live off the land, meaning it abuses existing, innocuous-looking tools such as psexec, WMI, and netsh to inject malicious code into legitimate processes. As such, RAM analysis may be the only way to detect such threats. Additionally, obfuscated malware is often easier to analyze once it has unpacked itself in memory than by digging through heaps of junk code (e.g., BabbleLoader).
Understand malware behavior: Analyzing memory allows us to do the following:
- enumerate running processes with tools such as Volatility to identify malware that spawns a legitimate process.
- identify injection of malicious code, modified function pointers, and hooked APIs.
- identify active network connections and their IP addresses, ports, and protocols to map out the C2 infrastructure.
- identify the anti-forensic techniques that malware uses to detect sandboxes and debuggers.
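To make the idea of pulling artifacts out of a memory image more concrete, here is a minimal sketch that extracts printable ASCII strings from raw bytes and keeps anything that looks like an IP:port pair. It is only a rough stand-in for what dedicated tools do (Volatility reconstructs processes and connections from OS structures instead of scanning for strings), and the function names and sample bytes are made up for this example.

```python
import re

# Runs of 6+ printable ASCII bytes, similar in spirit to the `strings` utility.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# Loose IPv4:port pattern; it does not validate octet or port ranges.
IP_PORT = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")


def extract_strings(dump: bytes) -> list:
    """Return every printable ASCII run found in the raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(dump)]


def find_endpoints(dump: bytes) -> list:
    """Return IP:port-looking substrings found inside the extracted strings."""
    hits = []
    for text in extract_strings(dump):
        hits.extend(IP_PORT.findall(text))
    return hits


# Fabricated bytes standing in for a slice of a memory dump.
sample = (
    b"\x00\x01MZ\x90\x00connect 203.0.113.7:4444\x00"
    b"\xff\xfepowershell -enc AAAA\x00"
)

print(extract_strings(sample))
print(find_endpoints(sample))
```

On a real dump you would read the file in chunks rather than load it whole, and treat any hit as a lead to verify, not as proof of a C2 connection.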
TL;DR: RAM is sort of a live view of what is happening in a machine, on a level granular enough that intentionally hidden processes and actions are revealed.
Cool, now let's go look at RAM.
Memory sections relevant to malware analysis:
Since we're covering malware analysis, the coverage of memory will be limited to parts which are relevant. Disclaimer done.
A program that is loaded into RAM only has access to the memory allocated to it. This memory has 4 sections:
Code: Contains the program's executable instructions, which come from the text section of a Portable Executable (PE) file. The CPU can execute the contents of this part of the program memory.
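Data: Refers to the data section of a PE file. Contains global and static variables, which exist for the whole run of the program. Their values can change, but unlike heap data they are not created or destroyed during execution.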
Heap: Contains variables and data created and destroyed during program execution. Thus known as dynamic memory.
Stack: Contains local variables, function arguments, and the return address of the caller (the function which called the current one). The stack uses a Last In First Out (LIFO) principle, meaning the last element pushed to the stack is the first to be taken out. Basically, think of a stack of plates: the plate at the top is the active frame, which belongs to the function currently running.
An example would be:
1. The main function is running.
2. main calls a function named foo().
3. Before it jumps to foo, the CPU pushes the return address onto the stack.
4. The CPU jumps to the foo function.
5. In foo, a local variable is created and pushed onto the stack.
6. foo finishes execution, and its local variable is removed from the stack.
7. The CPU retrieves the return address from the top of the stack.
8. The CPU jumps back to the instruction at the return address in main.
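The steps above can be mimicked with a toy model where a Python list plays the role of the stack (the end of the list is the top). This is only a simulation to show the LIFO ordering; the return address value is invented, and a real CPU does this with call/ret instructions, not a list.

```python
# Toy model of the call stack: the end of the list is the top of the stack.
stack = []
trace = []

RETURN_ADDR_IN_MAIN = 0x401020  # hypothetical address of the instruction after the call


def call_foo():
    # Step 3: push the return address before jumping to foo.
    stack.append(("return_address", RETURN_ADDR_IN_MAIN))
    # Step 4: jump to foo.
    trace.append("jump to foo")
    # Step 5: foo creates a local variable on the stack.
    stack.append(("local", 42))
    trace.append(f"stack depth inside foo: {len(stack)}")
    # Step 6: foo finishes, so its local variable is discarded.
    stack.pop()
    # Step 7: the return address is now on top and gets popped.
    kind, addr = stack.pop()
    assert kind == "return_address"
    # Step 8: execution resumes in main at that address.
    trace.append(f"return to {hex(addr)}")
    return addr


resumed_at = call_foo()
print("\n".join(trace))
```

Notice that the return address sits underneath foo's local variable. That ordering is exactly what makes overflowing a local buffer dangerous.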
The stack is especially relevant as we can use debuggers and disassemblers to look for unsafe functions such as strcpy, sprintf, and gets, which write into a buffer without checking its size. In the case of a buffer overflow, we can check the memory dump to find the shellcode (classic buffer overflow structure: junk data, shellcode, hijacked return address) and figure out what the malware does.
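To see why an unchecked copy is a problem, here is a small simulation of a stack frame as a byte array: a 16-byte buffer followed by an 8-byte saved return address. Everything here is an assumption for illustration (the frame layout, the addresses, and the helper names), and the "shellcode" is just placeholder bytes. Real frames also contain saved registers, padding, and possibly stack canaries.

```python
import struct

BUF_SIZE = 16
SAVED_RET = 0x401020    # hypothetical legitimate return address
HIJACKED = 0x7FFD0008   # hypothetical address the attacker wants to jump to


def new_frame() -> bytearray:
    # Layout, low to high addresses: 16-byte buffer, then 8-byte return address.
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", SAVED_RET))


def unsafe_copy(frame: bytearray, data: bytes) -> None:
    # Behaves like strcpy in spirit: copies everything, never checks BUF_SIZE.
    frame[0:len(data)] = data


def safe_copy(frame: bytearray, data: bytes) -> None:
    # Truncates the input to the buffer size before copying.
    frame[0:BUF_SIZE] = data[:BUF_SIZE].ljust(BUF_SIZE, b"\x00")


def return_address(frame: bytearray) -> int:
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]


# Classic structure: junk data, (placeholder) shellcode, hijacked return address.
payload = b"A" * 8 + b"\xcc" * 8 + struct.pack("<Q", HIJACKED)

frame = new_frame()
unsafe_copy(frame, payload)
print(hex(return_address(frame)))   # return address was overwritten

frame = new_frame()
safe_copy(frame, payload)
print(hex(return_address(frame)))   # return address is intact
```

When looking at a memory dump, the same pattern is what you hunt for: a run of filler bytes, something that disassembles into code, and a return address that points somewhere it shouldn't.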
On x86-64, the CPU uses the Stack Pointer (RSP) and Base Pointer (RBP) registers to keep track of the stack. The Stack Pointer points to the top of the stack, and the Base Pointer points to the base of the current function's stack frame.
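One practical use of RBP: when code is compiled with frame pointers, each frame typically stores the caller's saved RBP at [RBP] and the return address at [RBP+8], so a debugger can follow that chain to rebuild the call stack. The sketch below walks such a chain over a fake memory map. The addresses are invented, the "0 marks the outermost frame" convention is an assumption of this example, and optimized builds often omit frame pointers entirely, in which case this approach does not work.

```python
# Simulated memory: address -> 8-byte value. All addresses are made up.
memory = {
    # foo's frame
    0x7FFF0010: 0x7FFF0040,  # [rbp]   saved RBP of main
    0x7FFF0018: 0x401020,    # [rbp+8] return address into main
    # main's frame
    0x7FFF0040: 0x0,         # [rbp]   0 marks the outermost frame in this example
    0x7FFF0048: 0x400F00,    # [rbp+8] return address into startup code
}


def walk_frames(mem: dict, rbp: int, max_frames: int = 64) -> list:
    """Follow the saved-RBP chain and collect each frame's return address."""
    return_addresses = []
    # max_frames guards against a corrupted chain that loops forever.
    while rbp != 0 and len(return_addresses) < max_frames:
        return_addresses.append(mem[rbp + 8])
        rbp = mem[rbp]
    return return_addresses


# Start from foo's RBP, as if execution were paused inside foo.
backtrace = walk_frames(memory, 0x7FFF0010)
print([hex(addr) for addr in backtrace])
```

A return address in the chain that points into the stack or heap instead of a loaded module is a red flag worth a closer look.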
Fun fact: how does the OS know how much memory to allocate to a program?
Two factors are needed for the OS to determine this. The executable file header provides the initial size requirement, and the program itself can perform dynamic memory requests via system calls for more.
1. Executable File Header: After a program is compiled and linked, the linker packages the machine code into an executable file which includes a header. The header contains key information such as the initial code and data sizes, the initial stack size, and the initial heap size. The OS reads these size requirements from the executable file's header for the initial launch. But the requirement can change as the program runs, which is where the second part comes in.
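For PE files, these values live in the optional header. As a rough sketch, the function below reads SizeOfCode, SizeOfInitializedData, and the stack/heap reserve and commit sizes straight from the raw bytes using the offsets from the PE format specification. The function name and the returned dictionary keys are my own, and it does no bounds checking, so treat it as a reading aid and not as a robust parser.

```python
import struct


def pe_initial_sizes(data: bytes) -> dict:
    """Read the initial size fields from a PE file's optional header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # Optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    size_of_code, size_of_init_data = struct.unpack_from("<II", data, opt + 4)

    if magic == 0x20B:      # PE32+ (64-bit): four 8-byte fields
        fields = struct.unpack_from("<4Q", data, opt + 72)
    elif magic == 0x10B:    # PE32 (32-bit): four 4-byte fields
        fields = struct.unpack_from("<4I", data, opt + 72)
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")

    stack_reserve, stack_commit, heap_reserve, heap_commit = fields
    return {
        "size_of_code": size_of_code,
        "size_of_initialized_data": size_of_init_data,
        "stack_reserve": stack_reserve,
        "stack_commit": stack_commit,
        "heap_reserve": heap_reserve,
        "heap_commit": heap_commit,
    }
```

You would call it with the raw file contents, for example the bytes read from a sample opened in binary mode. The "reserve" values are address space set aside up front, while "commit" values are what is actually backed by memory at launch.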
2. Dynamic memory requests: The program's memory needs can change during runtime for various reasons, such as processing a large dataset from user input or creating objects whose size wasn't known at compile time. So the programmer uses library functions like malloc() to request more memory. When the allocator has no free memory left to hand out, it makes a system call to the OS. The kernel then creates a mapping in the page table, and the program receives a pointer to the new virtual memory.
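A rough way to see a runtime memory request in action is to ask the OS for an anonymous mapping with Python's mmap module. This is not malloc() itself, only an analogous request that goes through the OS's memory mapping facility; the request_memory helper is a name made up for this sketch.

```python
import mmap

PAGE = mmap.PAGESIZE  # the OS hands out memory in whole pages


def request_memory(num_bytes: int) -> mmap.mmap:
    """Ask the OS for at least num_bytes of fresh, zero-filled memory."""
    # Round up to whole pages, since mappings are page-granular.
    pages = (num_bytes + PAGE - 1) // PAGE
    # fileno -1 means an anonymous mapping, i.e. not backed by a file.
    return mmap.mmap(-1, pages * PAGE)


region = request_memory(10_000)
region[0:5] = b"hello"              # the new memory is writable right away
print(len(region), bytes(region[0:5]))
region.close()                      # hand the memory back to the OS
```

From a malware analysis angle, these runtime allocations matter because unpacked payloads and injected code usually end up in regions requested this way, not in the sections described by the file header.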