CPP Dynamic Type Recovery
Introduction
Genesis of this Research
.idata:61EB19C8 extrn __imp_free:dword      ; HOOK: &freeHook
.idata:61EB19CC extrn __imp_malloc:dword    ; HOOK: &mallocHook
The hooks save metadata on each malloc and discard it on the matching free.
Allocation Record
Address     Alloc size  Alloc RVA
0xe105e18   0x50        0x154671
0xe874958   0x18        0xd356
0xeb0c038   0x138       0x154671
0xf0a3888   0x50        0x14f40c
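Balanced binary trees (AVL, red/black) are well suited here, because a lookup is a range query: an address anywhere inside an allocation must map to that allocation's record. Hash tables are not, since they only answer exact-key lookups.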
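The range query can be sketched with an ordered map. This is a minimal sketch, assuming `std::map` (typically a red/black tree) keyed by base address. `AllocRecord`, `AllocTable`, and the field names are hypothetical and are not the tool's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record: what a malloc hook might save per allocation.
struct AllocRecord {
    std::uintptr_t base;     // pointer returned by malloc
    std::size_t    size;     // requested size
    std::uintptr_t siteRva;  // RVA of the allocation site
};

class AllocTable {
    std::map<std::uintptr_t, AllocRecord> byBase_;  // ordered by base address
public:
    // Would be called from the malloc hook.
    void onMalloc(std::uintptr_t base, std::size_t size, std::uintptr_t siteRva) {
        byBase_[base] = AllocRecord{base, size, siteRva};
    }

    // Would be called from the free hook.
    void onFree(std::uintptr_t base) { byBase_.erase(base); }

    // Range query: return the allocation containing addr, or nullptr.
    const AllocRecord *lookup(std::uintptr_t addr) const {
        auto it = byBase_.upper_bound(addr);      // first base > addr
        if (it == byBase_.begin()) return nullptr;
        --it;                                      // last base <= addr
        const AllocRecord &r = it->second;
        return (addr - r.base < r.size) ? &r : nullptr;
    }
};
```

An interior pointer such as `base + 0x20` resolves to the same record as `base`, which is the property a hash table keyed on exact addresses cannot provide.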
Dynamic Structure Reconstruction
Existing DBI-Based Approaches
1. Locate Memory Management Functions
2. Hook Memory Management Functions
3. Run the Program, Instrumented
4. Instrument Memory References
5. Detect and Record Structure Accesses
6. Post-Process Recorded Data
Step #3: Run Program under Instrumentation
DBI
Program Inputs
void DBIMemAccessCallback(
    ADDR eaIns, ADDR eaMem,
    SIZE size, BOOL bRead)
{
    // Is the accessed address inside a tracked allocation?
    AllocRecord *ar = lookup(eaMem);
    if (ar != NULL)
        log(ar, eaIns, size, eaMem, bRead);
}
DC07900 80 6F00B
(If two sites are known to allocate the same type, we can merge their data.)
Rebuild C-Level Structures
struct X {
  // ...
  union {
    int a;
    char *b;
  };
  // ...
};

int c = x->a;     . . . might compile to . . .   mov eax, [rsi+10h]
char *d = x->b;   . . . might compile to . . .   mov rax, [rsi+10h]
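One way the recorded accesses could surface such a field is by noticing that the same offset is accessed with different widths. This is a minimal sketch of that post-processing step, assuming each log entry carries an offset into the allocation and an access size. `Access` and `mixedWidthOffsets` are hypothetical names, and mixed widths are only a hint, since a partial read of a wider field looks the same.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical log entry: one recorded access into a tracked allocation.
struct Access {
    std::size_t offset;  // offset into the allocation
    std::size_t size;    // access width in bytes
};

// Return the offsets that were accessed with more than one width.
std::vector<std::size_t> mixedWidthOffsets(const std::vector<Access> &log) {
    std::map<std::size_t, std::set<std::size_t>> widths;
    for (const Access &a : log)
        widths[a.offset].insert(a.size);

    std::vector<std::size_t> out;
    for (const auto &kv : widths)
        if (kv.second.size() > 1)
            out.push_back(kv.first);
    return out;
}
```

For the example above, a 4-byte and an 8-byte access at offset 0x10 would flag that offset as a union candidate.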
Figure: layout of page table entry PTE #3. Bits 31-12: Physical Address (0x12345000). Bits 11-1: Misc. Flags. Bit 0: present flag (P=1).
EXCEPTION: Page Fault
61F33E5A  mov eax, [ebx+4]
61F33E60  add edx, 4
Set the x86 trap flag (TF), which allows exactly one instruction
to execute before a single-step exception is raised, then
continue execution of the monitored program.
Memory Tracing via Presence
Mechanism in Detail
EXCEPTION: Single Step
61F33E5A  mov eax, [ebx+4]
61F33E60  add edx, 4
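The fault and single-step cycle can be modelled as a small state machine. This sketch is a model only: it touches no real page tables or exception handlers, and `TracedPage` and its members are hypothetical names. It assumes the page-fault handler makes the page present before resuming, and that the single-step handler clears the present bit again.

```cpp
#include <cstdint>
#include <vector>

// Model of one traced page. No real paging structures are modified.
struct TracedPage {
    bool present = false;            // models the PTE present bit (cleared to trace)
    bool trapFlag = false;           // models the trap flag (TF)
    std::vector<std::uint32_t> log;  // addresses of instructions that faulted

    // Page-fault handler: record the access, let the instruction retry,
    // and request a single-step exception after it completes.
    void onPageFault(std::uint32_t eip) {
        log.push_back(eip);
        present = true;
        trapFlag = true;
    }

    // Single-step handler: re-arm tracing for the next access.
    void onSingleStep() {
        present = false;
        trapFlag = false;
    }

    // Models executing one instruction that may touch the traced page.
    void execute(std::uint32_t eip, bool touchesPage) {
        if (touchesPage && !present)
            onPageFault(eip);
        // The instruction itself completes here.
        if (trapFlag)
            onSingleStep();
    }
};
```

After `execute(0x61F33E5A, true)` the log holds one entry and the page is non-present again, so the next access to it faults as well.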
Controller Thread
Poll buffers
Filter, log
Sleep
Architectural Revision     # Accesses per Minute
Single-Threaded            3M
Multi-Threaded             6.4M
Mini X64 Emulator V1       9M
Guard Pages                11M
SetThreadAffinityMask()    12.1M
Mini X64 Emulator V2       13.2M
1. Suggested by Yaron Dinkin
2. Suggested by Jason Geffner and a RECON attendee whose name I forget (sorry)
Existing DBI-Based Approaches
Limitations of DBI-Based Solutions
My Contributions to this Problem
Exploit X86 Demand-Based Paging
DLL Injection-Based Memory Tracking
Target Specific Allocation Sites
Exploit the Results within IDA/Hex-Rays
Target-Specific Example: unions
How to Apply Page-Based Tracking
We have shown how to track memory accesses, but not how to apply that
tracking to allocations. We explore two possibilities and a strategy for each:
Page Boundary
Pros:
- Easy to implement
- Usually thread-safe
- Naturally handles different-sized allocations
- Tuned for performance
Cons:
- Page faults for in-band metadata
- Slower than some alternatives
Target Specific Allocation Sites
Divert into Customized Slab Allocator
Slab Allocator: CHUNK #1 | CHUNK #2 | CHUNK #3 | CHUNK #4 | CHUNK #5 | CHUNK #6 | CHUNK #7 | CHUNK #8 | ...
Free List: #3, #5, #7
Pros:
- Fast allocation and range checks
- No in-band metadata
Cons:
- Fixed-size
- Must be applied judiciously
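A fixed-size slab with an out-of-band free list can be sketched as follows. This is a minimal sketch, not the allocator from the talk: `Slab` and its methods are hypothetical names, the backing store is an ordinary `std::vector` rather than dedicated pages, and thread safety is not addressed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class Slab {
    std::vector<unsigned char> storage_;   // one contiguous run of equal-sized chunks
    std::vector<std::size_t>   freeList_;  // free chunk indices, kept outside the chunks
    std::size_t                chunkSize_;
public:
    Slab(std::size_t chunkSize, std::size_t count)
        : storage_(chunkSize * count), chunkSize_(chunkSize) {
        // Push indices in reverse so chunk 0 is handed out first.
        for (std::size_t i = count; i-- > 0;)
            freeList_.push_back(i);
    }

    // Constant-time allocation: pop a free chunk index.
    void *alloc() {
        if (freeList_.empty()) return nullptr;
        std::size_t i = freeList_.back();
        freeList_.pop_back();
        return storage_.data() + i * chunkSize_;
    }

    // Constant-time release: push the chunk index back. p must come from alloc().
    void release(void *p) {
        std::size_t off = static_cast<unsigned char *>(p) - storage_.data();
        freeList_.push_back(off / chunkSize_);
    }

    // Constant-time range check: does p point into this slab?
    bool contains(const void *p) const {
        auto a = reinterpret_cast<std::uintptr_t>(p);
        auto b = reinterpret_cast<std::uintptr_t>(storage_.data());
        return a >= b && a - b < storage_.size();
    }

    // Offset of p within its chunk. Requires contains(p).
    std::size_t offsetInChunk(const void *p) const {
        auto a = reinterpret_cast<std::uintptr_t>(p);
        auto b = reinterpret_cast<std::uintptr_t>(storage_.data());
        return (a - b) % chunkSize_;
    }
};
```

Because every chunk has the same size, the range check and the offset computation are a subtraction and a modulo, and no allocator metadata sits on the traced memory.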
Summary: DBI vs. DLL Injection
union U {
00int x;
00char *y;
00void *z;
};
- The code must check a tag stored alongside the union to know which type it currently holds.
- Code using unions is littered with these checks.
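The kind of check involved can be sketched as follows. This is an illustrative example only: `Tag`, `Tagged`, and `describe` are hypothetical names wrapped around the union `U` shown above.

```cpp
#include <string>

// Hypothetical tag naming which member of U is currently valid.
enum class Tag { Int, Str, Ptr };

union U {
    int   x;
    char *y;
    void *z;
};

// The tag travels with the union, typically in the enclosing structure.
struct Tagged {
    Tag tag;
    U   u;
};

// Every use of the union is guarded by a check of the tag.
std::string describe(const Tagged &t) {
    switch (t.tag) {
    case Tag::Int: return "int " + std::to_string(t.u.x);
    case Tag::Str: return std::string("str ") + t.u.y;
    case Tag::Ptr: return t.u.z ? "ptr non-null" : "ptr null";
    }
    return "unknown";
}
```

In a decompilation every branch of such a switch reads the same offset, and each branch needs a different union member selected to read correctly.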
unions in Decompilation: Improper Selection
The same code as on the previous slide, with three union fields set properly.
unions are particularly tedious to apply manually, so let's automate.
Upon Manually Discovering a union Somewhere . . .
class mop_t {
+0x00   mopt_t t;
+0x01   char oprops;
+0x02   short valnum;
+0x04   int size;
+0x08   union { ... };
+0x10 };
enum mopt_t {
  mop_z   = 0,
  mop_r   = 1,
  mop_n   = 2,
  mop_str = 3,
  mop_d   = 4,
  mop_S   = 5,
  mop_v   = 6,
  mop_b   = 7,
  mop_f   = 8,
  mop_l   = 9,
  mop_a   = 10,
  mop_h   = 11,
  mop_c   = 12,
  mop_fn  = 13,
  mop_p   = 14,
  mop_sc  = 15
};

union {
  mreg_t r;
  mnumber_t *nnn;
  minsn_t *d;
  stkvar_ref_t *s;
  ea_t g;
  int b;
  mfuncinfo_t *f;
  lvar_ref_t *l;
  mop_addr_t *a;
  char *helper;
  char *cstr;
  mcases_t *c;
  fnumber_t *fpc;
  mop_pair_t *pair;
  scif_t *scif;
};
Via DLL injection, for every call to malloc, record the pointer.
loc_567:
v4 = malloc(0x138);
sub_123(a1,v9,0);
sub_456(a4,v7);
sub_234(v1+24,"a");
sub_345(a3,a2,v17+16);
RUNTIME_FUNCTION <rva sub_61EB80C0, rva algn_61EB80DB, rva stru_620C0390>
void __fastcall
sub_61EB8AD0(
    void *rcx0,
    unsigned int a2,
    void *a3)
commonLog:
    ; Save flags
    ; Save registers
    mov rcx, rsp            ; Point arg #0 to stack data
    call commonLogC
    mov [rsp+78h], rax      ; Return to re-entry thunk
    ; Restore registers
    ; Restore flags
    retn
Dynamic Resolution of Argument Types
Summary: Logged Data
Logged Data
Function RVA  # Arg  Alloc RVA  Alloc size  Offset into alloc
0x15f520      1      0xbe648    0x50        0x20
0x153ce0      0      0xbe648    0x50        0x0
0x147520      0      0x143fd8   0x50        0x40
0x15f8d0      1      0x143fd8   0x50        0x40
0x11530       0      0x56c89    0x630       0x0
0x55a80       0      0x56149    0x120       0x0
0x57bd0       0      0x56149    0x120       0x18
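Rows like these could be produced by checking, at function entry, whether each argument points into a tracked allocation. This is a minimal sketch under that assumption. `Alloc`, `LogEntry`, and `logArgs` are hypothetical names, and the real logging routine (`commonLogC` in the stub above) is not shown in the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-allocation metadata, keyed by base address in the map below.
struct Alloc {
    std::size_t    size;     // allocation size
    std::uintptr_t siteRva;  // RVA of the allocation site
};

// One row of the logged data.
struct LogEntry {
    std::uintptr_t funcRva;    // Function RVA
    unsigned       argIndex;   // # Arg
    std::uintptr_t allocRva;   // Alloc RVA
    std::size_t    allocSize;  // Alloc size
    std::size_t    offset;     // Offset into alloc
};

// For each argument that points into a tracked allocation, emit one row.
void logArgs(const std::map<std::uintptr_t, Alloc> &allocs,
             std::uintptr_t funcRva,
             const std::vector<std::uintptr_t> &args,
             std::vector<LogEntry> &out) {
    for (unsigned i = 0; i < args.size(); ++i) {
        auto it = allocs.upper_bound(args[i]);  // first base > argument
        if (it == allocs.begin()) continue;     // below every allocation
        --it;                                   // last base <= argument
        std::size_t off = args[i] - it->first;
        if (off < it->second.size)
            out.push_back({funcRva, i, it->second.siteRva, it->second.size, off});
    }
}
```

Arguments that are not pointers into tracked memory, such as small integers, simply produce no row.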
Once the type allocated at a site is known, the user manually sets the allocation site type.
(Or, in IDA: Edit->Operand Type->Set Operand Type)
For allocation sites with known types, the user can then apply argument types.
Multiple allocation sites can be selected at once.
Dynamic Resolution of Argument Types
Applied Argument Types
class x {
  int a;
  int b;
};

class y : public x {
  int c;
  int d;
};

Layout of y:
+0   int a;
+4   int b;
+8   int c;
+12  int d;
For derived objects, same size does not imply same type.
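A small example of why allocation size alone cannot identify a type. The classes `x` and `y` follow the slide. `z` and `sumBase` are hypothetical additions for illustration.

```cpp
#include <type_traits>

struct x { int a; int b; };

// As on the slide: y extends x with two ints.
struct y : x { int c; int d; };

// Hypothetical sibling, not from the talk: same size, different fields.
struct z : x { unsigned e; unsigned f; };

static_assert(sizeof(y) == sizeof(z), "same allocation size");
static_assert(!std::is_same<y, z>::value, "but distinct types");

// Either derived object can be passed where an x* is expected,
// so the callee's argument type does not settle the question either.
int sumBase(const x *p) { return p->a + p->b; }
```

Allocations of `y` and `z` would show the same size in the log, so their sites should not be merged on size alone.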
Challenges
Nested Structures
struct A {
  int a;
  struct B {
    struct C {
      struct D {
        int d;
      } D;
      int c;
    } C;
    int b;
  } B;
};

Layout of A: int a | int d | int c | int b
D spans int d; C spans int d, int c; B spans int d, int c, int b.
A pointer x to the offset of int d (layout: int a | int d | int c | int b) can have any of these types:

Type     Expression
int *    *x
D *      x->d
C *      x->D.d
B *      x->C.D.d
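The ambiguity comes from the innermost field and all of its enclosing structures starting at the same address. This is a small sketch that checks the property for the nested structure from the slide. `allAlias` is a hypothetical helper name.

```cpp
#include <cstddef>

// The nested structure from the slide.
struct A {
    int a;
    struct B {
        struct C {
            struct D {
                int d;
            } D;
            int c;
        } C;
        int b;
    } B;
};

// True when int d and each enclosing struct (D, C, B) begin at one address.
bool allAlias(const A &x) {
    const void *d = &x.B.C.D.d;
    return d == static_cast<const void *>(&x.B.C.D)
        && d == static_cast<const void *>(&x.B.C)
        && d == static_cast<const void *>(&x.B);
}
```

A pointer value observed at run time therefore cannot distinguish `int *`, `D *`, `C *`, and `B *`. Only how the pointer is used afterwards can.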
Conclusion
Any Questions?