Lesson 3 - Static & Dynamic Analysis

The document outlines the processes of static and dynamic analysis in reverse engineering, detailing steps such as assessment, disassembly, decompilation, and monitoring. It discusses techniques for understanding code structure and behavior without execution, as well as runtime behavior analysis, including memory analysis and instrumentation. Additionally, it addresses obfuscation methods that hinder reverse engineering efforts and the importance of instrumentation for collecting runtime data.

Uploaded by

Braincain007
Copyright
© All Rights Reserved

Lesson 3:

Static & Dynamic Analysis


CSC448/548-CYEN404 – REVERSE ENGINEERING

DR. ANDREY TIMOFEYEV


OUTLINE
•Static analysis.
• Assessment and reconnaissance.
• Disassembly.
• Decompilation.
•Dynamic analysis.
• Monitoring.
• Tracing.
• Debugging.
• Memory analysis.
•Obfuscation.
•Instrumentation.
STATIC ANALYSIS: PROCESS
•Static analysis.
• Process of understanding the structure, behavior, functionality of a given file without executing it.
•Static analysis steps:
• Initial assessment & reconnaissance.
• Disassembly & decompilation.
• Functions identification.
• Data analysis.
• Control flow analysis.
• String analysis.
• Patterns & signature matching.
STATIC ANALYSIS: ASSESSMENT & RECONNAISSANCE (1)
•Initial assessment & reconnaissance.
• Gather basic file information, identify file type, assess potential obfuscation.
•File identification.
• File type & format.
• Executable, library, firmware, script, archive, document.
• File architecture.
• x86, x64, ARM.
• Headers, signatures, metadata.
• Assessing obfuscation.
• Packing.
• Compressed executable (UPX, PECompact, AsPack).
• Protection.
• Protected executable (Themida, AsProtect, Enigma Protector).
• Encryption.
STATIC ANALYSIS: ASSESSMENT & RECONNAISSANCE (2)
•Executable file headers.
• Each OS has its own format for executable files.
• Portable Executable (PE) – Windows.
.acm, .ax, .cpl, .dll, .drv, .efi, .exe, .mui, .ocx, .scr, .sys, .tsp, .mun
• Executable and Linkable Format (ELF) – Unix.
none, .axf, .bin, .elf, .o, .out, .prx, .puff, .ko, .mod, and .so
• Mach Object (Mach-O) – macOS.
none, .o, .dylib, .bundle
• Hierarchical file headers contain information about code location & entry point.
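File identification from headers can be sketched as a magic-byte check. This is a minimal illustration, not a full identifier: the function name and the table below are assumptions made for the example, and an `MZ` prefix only proves a DOS header is present (a real PE also carries a `PE\0\0` signature further in).

```python
# Magic bytes found at offset 0 of common executable formats.
MAGICS = [
    (b"MZ", "PE (Windows)"),                     # DOS/MZ header that precedes the PE header
    (b"\x7fELF", "ELF (Unix-like)"),
    (b"\xfe\xed\xfa\xce", "Mach-O 32-bit (big-endian)"),
    (b"\xfe\xed\xfa\xcf", "Mach-O 64-bit (big-endian)"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit (little-endian)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
]


def identify_format(data: bytes) -> str:
    """Return a best-guess executable format based on the leading bytes."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"
```

The file extension is ignored on purpose: extensions are easy to change, while the loader relies on the header.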
STATIC ANALYSIS: ASSESSMENT & RECONNAISSANCE (3)
•Executable file headers (cont).
• Components of PE file header:
• MZ header.
• DOS stub.
• PE header.
• Location of code & data in file.
• How file will be mapped into memory.
• Address of entry point (loaded into the EIP register; RIP on x64).
• Data directories.
• Addresses of important tables.
• Section table.
• Describes each section's name, size & location; used to locate the import table in the file.
• Import table.
• Contains libraries and APIs.
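The walk from the MZ header to the entry point can be sketched with Python's `struct` module. This is a simplified reader under stated assumptions: `parse_pe_header` and `build_sample_pe` are hypothetical names, the sample buffer is synthetic (not a runnable executable), and only a few fields are read.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Read a few fields from the MZ, COFF and optional headers of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew (offset 0x3C in the MZ header) holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF file header follows the 4-byte signature: Machine, NumberOfSections.
    machine, num_sections = struct.unpack_from("<HH", data, pe_offset + 4)
    # The optional header starts after the 20-byte COFF header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)           # 0x10B = PE32, 0x20B = PE32+
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)  # AddressOfEntryPoint
    return {"machine": machine, "sections": num_sections,
            "magic": magic, "entry_rva": entry_rva}


def build_sample_pe() -> bytes:
    """Build a synthetic header-only buffer for demonstration (not a valid program)."""
    buf = bytearray(0x100)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)           # e_lfanew
    buf[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<HH", buf, 0x84, 0x8664, 3)     # x64 machine type, 3 sections
    struct.pack_into("<H", buf, 0x98, 0x20B)          # PE32+
    struct.pack_into("<I", buf, 0x98 + 16, 0x1000)    # entry point RVA
    return bytes(buf)
```

The entry point is a relative virtual address (RVA): the loader adds it to the image base to get the address that ends up in the instruction pointer.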
STATIC ANALYSIS: ASSESSMENT & RECONNAISSANCE (5)
•Executable file headers (cont).
• Components of ELF file header:
• ELF magic.
• ELF header.
• Essential information about file format & structure.
• Program header table (optional).
• Segments of file for execution & loading.
• Section header table.
• Individual sections of the file (code, data, symbols).
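The ELF header fields can be read in the same spirit. The sketch below assumes hypothetical names (`parse_elf_header`, `build_sample_elf`) and a synthetic 64-byte buffer; it reads only the class, byte order, type, machine and entry point.

```python
import struct


def parse_elf_header(data: bytes) -> dict:
    """Read class, byte order, type, machine and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    bits = {1: 32, 2: 64}[data[4]]       # EI_CLASS
    endian = {1: "<", 2: ">"}[data[5]]   # EI_DATA: 1 = little-endian, 2 = big-endian
    # e_type, e_machine, e_version follow the 16-byte e_ident array.
    e_type, e_machine, _version = struct.unpack_from(endian + "HHI", data, 16)
    # e_entry is 4 bytes wide in ELF32 and 8 bytes wide in ELF64.
    (e_entry,) = struct.unpack_from(endian + ("I" if bits == 32 else "Q"), data, 24)
    return {"bits": bits, "little_endian": endian == "<",
            "type": e_type, "machine": e_machine, "entry": e_entry}


def build_sample_elf() -> bytes:
    """Build a synthetic ELF64 header for demonstration (not a valid program)."""
    buf = bytearray(64)
    buf[0:4] = b"\x7fELF"
    buf[4] = 2                                      # 64-bit
    buf[5] = 1                                      # little-endian
    buf[6] = 1                                      # ELF version
    struct.pack_into("<HHI", buf, 16, 2, 62, 1)     # ET_EXEC, EM_X86_64
    struct.pack_into("<Q", buf, 24, 0x401000)       # entry point
    return bytes(buf)
```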
STATIC ANALYSIS: DISASSEMBLY & DECOMPILATION (1)
•Disassembly.
• Converting machine code into assembly language representation.
•Decompilation.
• Converting assembly code into high-level language representation.
•Language generations:
• First-generation.
• 0/1, hexadecimal.
• Machine code, byte code, binaries.
• Second-generation.
• Mapping between binary/hex and opcode mnemonics.
• Assembly.
• Third-generation.
• Keywords & constructs as program building blocks.
• C, Java, Python.
• Fourth-generation.
• High-level, more abstracted, more domain specific.
• SQL, SAS, MATLAB.
STATIC ANALYSIS: DISASSEMBLY & DECOMPILATION (2)
•Basic disassembly process:
• Identify region of code to disassemble.
• Distinguish between instructions/data and find entry point.
• Consult with executable file header (PE, ELF, Mach-O).
• Match binary string with opcode mnemonic.
• Determine instruction length, determine operands, understand prefixes.
• Format & output assembly instruction.
• Choose between output syntax types (Intel, AT&T).
• Advance to next instruction & repeat until no more instructions.
STATIC ANALYSIS: DISASSEMBLY & DECOMPILATION (3)
•Disassembly algorithms:
• Linear sweep.
• Begins with first byte in code region, disassembling one instruction at a time.
• Advantage: complete coverage.
• Disadvantage: does not account for data commingled with code.
• Recursive descent.
• Based on control flow: an instruction is disassembled only if it is referenced by another instruction.
• Instructions classified by their effect on EIP:
• Sequential instructions.
• Conditional & unconditional branching.
• Function calls & returns.
• Advantage: distinguishes code from data.
• Disadvantage: inability to follow indirect code paths.
• Jumps/calls with tables of pointers for look up addresses.
• Mitigated with heuristics.
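The difference between the two algorithms shows up as soon as a data byte sits between instructions. The sketch below uses an invented toy instruction set (the opcodes, their lengths and the `SAMPLE` program are assumptions for illustration, not x86) so that both strategies fit in a few lines.

```python
# Toy ISA (hypothetical): opcode -> (mnemonic, instruction length in bytes).
OPCODES = {0x01: ("NOP", 1), 0x02: ("JMP", 2), 0x03: ("JZ", 2),
           0x04: ("CALL", 2), 0x05: ("RET", 1), 0x10: ("LOAD", 2)}

# 0: JMP 3 | 2: one data byte (0x10) | 3: NOP | 4: RET
SAMPLE = bytes([0x02, 0x03, 0x10, 0x01, 0x05])


def decode(code: bytes, addr: int):
    """Decode one instruction; unknown or truncated bytes become 1-byte DB entries."""
    name, length = OPCODES.get(code[addr], ("DB", 1))
    if addr + length > len(code):
        name, length = "DB", 1
    operand = code[addr + 1] if length == 2 else None
    return name, length, operand


def linear_sweep(code: bytes) -> dict:
    """Disassemble byte after byte from the start, ignoring control flow."""
    out, addr = {}, 0
    while addr < len(code):
        name, length, operand = decode(code, addr)
        out[addr] = (name, operand)
        addr += length
    return out


def recursive_descent(code: bytes, entry: int = 0) -> dict:
    """Disassemble only addresses reachable through control flow from entry."""
    out, worklist = {}, [entry]
    while worklist:
        addr = worklist.pop()
        while addr < len(code) and addr not in out:
            name, length, operand = decode(code, addr)
            out[addr] = (name, operand)
            if name == "JMP":              # unconditional: follow the target only
                worklist.append(operand)
                break
            if name in ("JZ", "CALL"):     # target and fall-through are both code
                worklist.append(operand)
            if name in ("RET", "DB"):      # no sequential successor
                break
            addr += length
    return out
```

On `SAMPLE`, linear sweep decodes the data byte at address 2 as `LOAD` and swallows the real `NOP` at address 3, while recursive descent follows the jump and skips the data byte.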
STATIC ANALYSIS: DISASSEMBLY & DECOMPILATION (4)
•Decompilation challenges:
• Compilation process is lossy.
• No variable/function names, no data types/structures in machine language.
• Compilation is a many-to-many operation.
• Compiling and then decompiling a file yields source that differs from the original.
• Compilation is language & library dependent.
• Appropriate decompiler must be used.
• Must be based on high quality disassembly outputs.
• Errors and omissions in disassembly propagate to decompilation.
STATIC ANALYSIS: FUNCTIONS IDENTIFICATION
•Functions identification.
• Locate function boundaries.
• Identify starting/ending points of functions within the code.
• Analyze function properties.
• Examine function names, parameters, local variables, interactions with other functions.
• Create call graph.
• Visualize function call relationships to understand code flow & structure.
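A call graph can be derived from a disassembly listing by recording which `call` instructions appear under each function label. The listing format, the function names in it, and `build_call_graph` are assumptions made for this sketch; real disassembler output needs a more tolerant parser.

```python
import re

# Hypothetical, simplified listing: a label per function, indented instructions.
LISTING = """
main:
    call parse_args
    call run
    ret
run:
    call decrypt
    call send
    ret
decrypt:
    ret
"""


def build_call_graph(listing: str) -> dict:
    """Map each labelled function to the list of functions it calls."""
    graph, current = {}, None
    for line in listing.splitlines():
        label = re.match(r"^(\w+):\s*$", line)
        call = re.match(r"^\s+call\s+(\w+)", line)
        if label:
            current = label.group(1)
            graph.setdefault(current, [])   # keep functions that call nothing
        elif call and current is not None:
            graph[current].append(call.group(1))
    return graph


def callers_of(graph: dict, target: str) -> list:
    """Invert the graph: which functions call the target?"""
    return sorted(f for f, callees in graph.items() if target in callees)
```

Callees such as `parse_args` and `send` never appear as labels here, which is how imported or external functions typically show up in a graph.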
STATIC ANALYSIS: DATA ANALYSIS
•Data analysis.
• Identify data types & structures.
• Recognize variables, data structures, and their relationships within the code.
• Track data flow.
• Follow how data is used & modified throughout the code.
• Understand purpose & potential vulnerabilities.
• Analyze cross-references.
• Find references to a particular variable or data structure.
• Grasp its impact on the code's behavior.
STATIC ANALYSIS: CONTROL FLOW ANALYSIS
•Control flow analysis.
• Trace code execution paths.
• Map the possible execution paths through the code.
• Conditional branches & loops.
• Identify key decision points.
• Understand how decisions are made within the code and their potential implications.
• Discover hidden code.
• Uncover code that might be reachable only under certain conditions or through indirect calls.
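Once basic blocks and their successors are known, candidate hidden code can be found as blocks with no direct path from the entry block. The control flow graph below is a hypothetical example, and the block names are invented.

```python
from collections import deque

# Hypothetical control flow graph: block -> list of successor blocks.
CFG = {
    "entry": ["check"],
    "check": ["ok", "fail"],   # conditional branch: two successors
    "ok": ["exit"],
    "fail": ["exit"],
    "exit": [],
    "hidden": ["exit"],        # no direct edge leads here
}


def reachable_blocks(cfg: dict, entry: str) -> set:
    """Breadth-first search over successor edges starting at the entry block."""
    seen, queue = {entry}, deque([entry])
    while queue:
        block = queue.popleft()
        for succ in cfg.get(block, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def unreachable_blocks(cfg: dict, entry: str) -> set:
    """Blocks never reached through direct edges: dead code or indirect targets."""
    return set(cfg) - reachable_blocks(cfg, entry)
```

A block flagged this way is not necessarily dead: it may be reached through an indirect call or jump that static edges do not capture, which is why it deserves a closer look.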
STATIC ANALYSIS: STRING ANALYSIS
•String analysis.
• Extract string literals.
• Isolate human-readable strings within the code.
• Provide clues about functionality or purpose.
• Search for keywords.
• Look for specific strings that might indicate functionality.
• File paths, URLs, API calls, sensitive data.
• Identify encoding & obfuscation.
• Recognize techniques used to conceal strings.
• Encryption, encoding.
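String extraction and keyword search can be sketched with a regular expression over the raw bytes, similar in spirit to the `strings` utility. The indicator list and the sample buffer are assumptions chosen for the example.

```python
import re

# Hypothetical substrings worth flagging: URLs, file paths, registry roots.
INDICATORS = ("http://", "https://", "C:\\", "/etc/", "HKEY_")

SAMPLE = (b"\x00\x01MZ\x90\x00http://example.com/payload\x00\xff\xfeab\x00"
          b"C:\\Windows\\Temp\\a.dll\x00GetProcAddress\x00")


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII characters that are at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def flag_indicators(strings: list) -> list:
    """Keep only strings that contain one of the indicator substrings."""
    return [s for s in strings if any(k in s for k in INDICATORS)]
```

A binary that yields almost no readable strings, or only a handful of imports, is itself a hint that strings are encoded or the file is packed.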
STATIC ANALYSIS: PATTERNS & SIGNATURE MATCHING
•Patterns & signature matching.
• Search for known patterns.
• Search for known code patterns or signatures that indicate specific functionality or vulnerabilities.
• Identify libraries & frameworks.
• Recognize usage of common libraries or frameworks.
• Understand the code's context & potential attack vectors.
• Detect malware traits.
• Look for patterns known to be associated with malware or malicious code.
• Create a file hash.
• Hash digest of a file can be used to identify it in a malware database.
• Can help identify different versions of the same malware.
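Creating a file hash and looking it up takes only a few lines with `hashlib`. The `KNOWN_SAMPLES` table is a stand-in for a real malware database, and its single entry (the SHA-256 of empty input) is a placeholder for illustration.

```python
import hashlib

# Hypothetical local "database": SHA-256 digest -> label.
KNOWN_SAMPLES = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855": "empty file",
}


def digest_bytes(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of the given bytes."""
    return hashlib.new(algorithm, data).hexdigest()


def lookup(data: bytes) -> str:
    """Return the label for a known sample, or 'unknown'."""
    return KNOWN_SAMPLES.get(digest_bytes(data), "unknown")
```

Changing a single byte produces a completely different digest, so each version of a sample has to be recorded in the database under its own hash.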
DYNAMIC ANALYSIS: PROCESS
•Dynamic analysis.
• Process of understanding the runtime behavior of the program.
• Helps understand how code operates & interacts with its environment.

•Dynamic analysis steps:


• Environment setup & monitoring.
• Debugging, tracing & instrumentation.
• Memory analysis.
• Additional considerations.
DYNAMIC ANALYSIS: ENVIRONMENT SETUP & MONITORING
•Environment setup.
• VMs, sandboxes, emulators.
• Dynamic analysis must be carried out in a safe / controlled environment.

•Monitoring.
• Runtime behavior is carefully monitored.
• Process monitors (system activity), network analyzers (network activity), debuggers, memory analysis tools.
• Data collection:
• System interactions (filesystem, registry, network).
• Memory usage.
• Function & API calls.
• Execution flow.
DYNAMIC ANALYSIS: DEBUGGING & TRACING
•Debugging & tracing.
• Debuggers allow monitoring execution by setting breakpoints & stepping through the code.
• Pausing execution at specific points of interest.
• Examining memory, registers, variables, stacks.
• Understand program's state at specific point.
• Tracing code execution path instruction by instruction.
• In addition, kernel-level tracing & monitoring of system activities.
•Dynamic instrumentation.
• Allows tracking function calls & memory access.
• Code injection & hooking into running processes.
DYNAMIC ANALYSIS: MEMORY ANALYSIS (1)
•Memory analysis.
• Capture & analyze memory dumps during execution.
• Understand memory usage, potential vulnerabilities, encrypted/obfuscated data.

•Virtual Address Space (VAS).


• Processes are running in VAS.
• Abstraction of physical memory.
• Each VAS is mapped to physical memory.
• Managed by OS kernel.
• Utilizes pages and page table.
• VAS is divided into user space & kernel space.
• User privileges (ring 3) & kernel privileges (ring 0).
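The page-based mapping from virtual to physical addresses can be illustrated with plain arithmetic. The sketch assumes 4 KiB pages and a single-level page table held in a dictionary; real systems use multi-level tables maintained by the kernel, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB)


def split_virtual_address(addr: int, page_size: int = PAGE_SIZE) -> tuple:
    """Split a virtual address into (virtual page number, offset within the page)."""
    return addr // page_size, addr % page_size


def translate(addr: int, page_table: dict, page_size: int = PAGE_SIZE) -> int:
    """Translate a virtual address using a {virtual page: physical frame} table."""
    page, offset = split_virtual_address(addr, page_size)
    frame = page_table[page]   # a KeyError here stands in for a page fault
    return frame * page_size + offset
```

For example, with page 0x401 mapped to frame 0x1A, virtual address 0x00401234 translates to physical address 0x1A234: the offset 0x234 is preserved and only the page number is replaced.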
DYNAMIC ANALYSIS: MEMORY ANALYSIS (2)
•Memory analysis.
• Capture & analyze memory dumps during execution.
• Understand memory usage, potential vulnerabilities, encrypted/obfuscated data.

•Virtual Address Space (VAS) (cont.)


• User space is initially allocated for:
•Stack.
•Heap.
•Program.
•Dynamic libraries.
•Thread Environment Block (TEB).
•Process Environment Block (PEB).
• Further allocations triggered by malloc or VirtualAlloc.
DYNAMIC ANALYSIS: ADDITIONAL STEPS
•Additional steps.
• Post-execution differences.
• Comparing differences between snapshots taken before / after running executable.
• Monitoring system changes.
• File monitoring.
• Created, modified, or deleted files and directories.
• Registry monitoring.
• Created, updated, or deleted registry keys, values, data.
• Analysis of runtime artifacts.
• Generated logs, crash reports, any artifacts created during runtime.
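Comparing before/after snapshots reduces to a set comparison. The sketch assumes each snapshot is a dictionary mapping a path (file or registry key) to a content hash or value; how the snapshot is collected is left out, and `diff_snapshots` is a hypothetical name.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Classify entries as created, modified or deleted between two snapshots."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return {"created": created, "modified": modified, "deleted": deleted}
```

The same function works for file and registry monitoring, since both are just mappings from a name to a value at a point in time.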
OBFUSCATION: INTRO
•Obfuscation.
• Deliberately making code difficult to understand, hindering reverse engineering analysis.
• Anti-reverse engineering.
•Obfuscation goals:
• Evasion.
• Bypassing signature-based detection.
• Analysis delay.
• Deciphering takes time & expertise.
• Anti-tampering.
• Harder to reverse-engineer & modify.
• Confidentiality.
• Harder to trace origins & link back.
•Obfuscation impacts:
• Increased damage.
• Longer operation time = more harm.
• Slower response times.
• Longer to develop effective countermeasures.
• Higher vulnerability.
• Easier to bypass security = easier to attack.
OBFUSCATION: CATEGORIES
•Obfuscation categories:
• Anti-static analysis techniques.
• Disassembly desynchronization.
• Target addresses obfuscation.
• Control flow obfuscation.
• Opcode obfuscation.
• Anti-dynamic analysis techniques.
• Detecting virtualization.
• Anti-sandboxing.
• Detecting instrumentation.
• Anti-monitoring, anti-tracing.
• Detecting/preventing debugging.
OBFUSCATION: ANTI-STATIC TECHNIQUES (1)
•Disassembly desynchronization.
• Misaligning instructions.
• Inserting seemingly harmless data bytes within the code.
• Utilizing conditional jumps that always/never execute.
• Employing indirect jumps that target instructions unconventionally.

•Target addresses obfuscation.


• Dynamically computed target addresses (DCTA).
• Addresses are computed during execution.
• Using complex math formulas/algorithms.
• Combining values from variables, registers, memory.
• Employing operations that make prediction difficult.
OBFUSCATION: ANTI-STATIC TECHNIQUES (2)
•Control flow obfuscation.
• Flattening.
• No nested conditionals / loops -> single loop controlled by switch, selecting from massive number of blocks.
• Jump table obfuscation.
• Jump table references different parts of program. Addresses calculated based on data or code execution.
• Dead code injection.
• Code is injected with harmless snippets that never execute.
• Unstructured control flow.
• Using unconventional jumps/branches.
• Bit manipulation to encode target addresses within variables or instructions.
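Control flow flattening is easiest to recognize by comparing a function with its flattened form. The example below is a hand-written illustration (the function and the state numbering are assumptions, not the output of a real obfuscator): the nested loop and conditional become a single loop driven by a state variable.

```python
def collatz_steps(n: int) -> int:
    """Original structured version: a loop with a nested conditional."""
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps


def collatz_steps_flat(n: int) -> int:
    """Flattened version: one dispatcher loop selects the next block by state."""
    steps = 0
    state = 0
    while True:
        if state == 0:      # former loop condition
            state = 1 if n != 1 else 4
        elif state == 1:    # former if/else test
            state = 2 if n % 2 == 0 else 3
        elif state == 2:    # former "even" branch
            n //= 2
            steps += 1
            state = 0
        elif state == 3:    # former "odd" branch
            n = 3 * n + 1
            steps += 1
            state = 0
        elif state == 4:    # former loop exit
            return steps
```

Both functions compute the same result, but in the flattened form every block returns to the dispatcher, so the control flow graph no longer shows the original loop and branch structure.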
OBFUSCATION: ANTI-STATIC TECHNIQUES (3)
•Opcode obfuscation.
• Encode/encrypt actual instructions when the executable is created.
• Instructions must be de-obfuscated before being executed.
• Portion of the code is unencrypted – startup routine responsible for de-obfuscation.
• Original code fed to obfuscator utility.
• Obfuscates original code/data sections.
• Adds deobfuscation stub.
• Deobfuscates code/data at startup, before the original code runs.
• Transfers control to the original entry point.
• Modifies header to redirect entry point to deobfuscation stub.
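The obfuscator/stub relationship can be modeled with a single-byte XOR. This is a deliberately simplified sketch: the "image" is a dictionary rather than a real executable, the names are hypothetical, and real packers use stronger encodings and an in-memory stub.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (applying it twice restores the input)."""
    return bytes(b ^ key for b in data)


def obfuscate(payload: bytes, key: int) -> dict:
    """Model of the obfuscator utility: encode the body, redirect entry to the stub."""
    return {"entry": "stub", "key": key, "body": xor_bytes(payload, key)}


def run_stub(image: dict) -> bytes:
    """Model of the deobfuscation stub: restore the original bytes at startup."""
    return xor_bytes(image["body"], image["key"])
```

Static tools only see the encoded body and the stub, which is why strings and signatures of the original code disappear until the stub has run.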
OBFUSCATION: ANTI-DYNAMIC TECHNIQUES (1)
•Detecting virtualization.
• Virtualization-specific software.
• Detecting hypervisors.
• Virtualization-specific hardware.
• Detecting virtualized hardware.
• Process-specific behavioral changes.
• Instructions behave differently in native vs virtualized environment.

•Detecting instrumentation.
• Check for running monitoring/tracing tools.
• Shut down if present.
OBFUSCATION: ANTI-DYNAMIC TECHNIQUES (2)
•Detecting/preventing debugging.
• Query OS for running debugger.
• Through API calls.
• Check memory/processor artifacts for debugger presence.
• Processor debug flag set to 1.
• Hinder debugging process.
• Introducing spurious breakpoints.
• Clearing hardware breakpoints.
• Intentionally generating exceptions.
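One concrete form of querying the OS for a debugger on Linux is reading the `TracerPid` field of `/proc/self/status`, which is non-zero while a tracer such as a debugger is attached. Recognizing this check helps when it appears in a target. The helper names below are hypothetical, and the sketch simply reports "not traced" on systems without `/proc`.

```python
def tracer_pid(status_text: str) -> int:
    """Extract the TracerPid value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1])
    return 0


def being_traced(status_path: str = "/proc/self/status") -> bool:
    """Return True if the current process reports an attached tracer (Linux only)."""
    try:
        with open(status_path) as handle:
            return tracer_pid(handle.read()) != 0
    except OSError:
        return False   # no procfs available: treat as not traced
```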
INSTRUMENTATION: INTRO
•Instrumentation.
• Process of inserting code/probes into a program & collecting valuable runtime data.
•Key instrumentation techniques:
• Dynamic binary instrumentation (DBI).
• Inserting custom code snippets (instrumentation) into a program during runtime for monitoring/modification.
• Code injection.
• Tracing.
• Capturing/recording events/data during execution, providing valuable insights into behavior & interactions.
• System calls, handles, library calls.
INSTRUMENTATION: DBI
•Dynamic binary instrumentation (DBI).
• Capabilities:
• Log arguments & return values.
• What data enters/exits function.
• Modify specific instructions.
• Temporarily (during runtime) change behavior.
• Trigger custom actions.
• Custom code executed whenever function is called.
• Process:
• Load target program.
• Identify instrumentation points.
• Inject probes.
• Execute instrumented program.
• Collect & analyze data.
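The DBI capabilities and process above can be imitated at the language level by replacing a function with a probe at runtime. This is an analogy in Python, not binary instrumentation: `hook`, `check_license` and the `target` namespace are hypothetical names for the sketch.

```python
import functools
import types


def check_license(key: str) -> bool:
    """Stand-in for a function inside the target program."""
    return key == "SECRET"


# Hypothetical loaded target: a namespace that holds the function to instrument.
target = types.SimpleNamespace(check_license=check_license)


def hook(namespace, name: str, log: list, override=None):
    """Inject a probe around namespace.name; return the original for restoring."""
    original = getattr(namespace, name)

    @functools.wraps(original)
    def probe(*args, **kwargs):
        result = original(*args, **kwargs)
        log.append((name, args, result))                    # log arguments & return value
        return override(result) if override else result    # optionally change behavior

    setattr(namespace, name, probe)
    return original
```

With `override=lambda r: True` the probe forces the check to succeed for the duration of the run, which mirrors the "temporarily change behavior" capability; assigning the returned original back to the namespace removes the probe.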
INSTRUMENTATION: TRACING
•Tracing.
• Classified by the type of event/resource being traced:
• Function calls tracing.
• Identifies when program calls specific functions.
• Reveals execution flow & dependencies.
• System calls tracing.
• Identifies system calls made by program.
• Reveals interactions with the OS, file system, network.
• Library & API calls tracing.
• Identifies calls to shared libraries & API calls.
• Reveals how program utilizes external data structures, functions, services.
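Function call tracing can be demonstrated inside Python with `sys.setprofile`, which invokes a callback on every function call. This only traces Python-level calls in the current thread; system call and library call tracing require OS-level tools and are not shown. The traced functions below are hypothetical.

```python
import sys


def trace_calls(func, *args):
    """Run func(*args) and return (result, names of Python functions called)."""
    calls = []

    def profiler(frame, event, arg):
        if event == "call":                  # ignore returns and C-level events
            calls.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)                 # always remove the tracer
    return result, calls


def helper(x: int) -> int:
    return x * 2


def traced_target(x: int) -> int:
    return helper(x) + 1
```

The recorded order of names reflects the execution flow, and the nesting (which function called which) is the dependency information a call tracer reveals.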
