Build Your Own iOS Kernel Debugger
image: http://www.instructables.com/id/Apple-iOS-SerialUSB-Cable-for-Kernel-Debugging/
now: ARM64 iOS kernel KDP won't work
    ...
    switch (class) {
    ...
    case ESR_EC_BKPT_REG_MATCH_EL1:
        if (FSC_DEBUG_FAULT == ISS_SSDE_FSC(esr)) {
            kprintf("Hardware Breakpoint Debug exception from kernel. Hanging here (by design).\n");
            for (;;);
    __unreachable_ok_push
            DebuggerCall(EXC_BREAKPOINT, &context->ss);
            break;
    __unreachable_ok_pop
        }
        panic("Unsupported Class %u event code. state=%p class=%u esr=%u far=%p",
            class, state, class, esr, (void *)far);
        assert(0); /* Unreachable */
        break;
What effect does this actually have?
ideas: we'll look at the relevant manual pages
x86 rings vs ARM64 exception levels (most privileged first)
    SMM, Ring -2        EL3 secure monitor
    Ring -1             EL2 hypervisor
    Ring 0              EL1 kernel
    Ring 1, Ring 2
    Ring 3              EL0 userspace    (less privileged)
other labels on the diagram: Secure World; paravirtualized kernel
exception levels in iOS
    ?                   (KPP runs here on A7-A9)
    EL2 hypervisor
    EL1 kernel
    EL0 userspace
devices: A10, A11 (iPhone 7+); A7, A8, A9 (iPhone 5S...iPhone 6S)
For an OS to do anything interesting it must transition between these levels.
Exceptions are the only thing which cause transitions, so it is fundamental to understand them.
Exceptions
Cause transitions upwards (or sometimes to the same level)
Synchronous: syscall, memory abort, trapped instruction, breakpoint, watchpoint...
System Registers
VBAR_EL1: vector base address register for exceptions taken to EL1 (Exception Level 1)
The _ELx suffix is the lowest exception level with access to the register
(typically read/write, sometimes only read)
read from a system register: mrs (for example, mrs x1, ESR_EL1)
Exception handling in ARM64 XNU - EL0 SVC
VBAR_EL1 + 0x400:
(diagram: stack pointer registers SPSel, SP_EL1, SP_EL3; general-purpose registers still hold userspace values, e.g. X1)
    DECLARE("ACT_CONTEXT",
        offsetof(struct thread, machine.contextData));

    struct machine_thread {
        arm_context_t *contextData; /* allocated user context */
        ...

    machine_thread_create:
        /* If this isn't a kernel thread, we'll have userspace state. */
        thread->machine.contextData = (arm_context_t *)zalloc(user_ss_zone);

(diagram: cpu exception stack; arm_context_t holding userspace X0, X1)
we've saved the userspace stack pointer to the thread's userspace saved context area
VBAR_EL1 + 0xxx?
    msr SP_EL0, x0
    msr SPSel, #0       switch SP to alias SP_EL0
we are now "running on SP0" for the purposes of another exception happening now
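
A rough sketch of how the vector offsets are laid out, to show why an EL0 syscall lands at VBAR_EL1 + 0x400 and why the SPSel setting matters for a nested exception. The layout (four source groups of 0x200 bytes, four entries of 0x80 bytes each) is the ARMv8-A one; the enum and function names here are made up for illustration and are not XNU identifiers.

```c
#include <stdint.h>

/* Where the exception came from, relative to the level taking it.
 * Names are hypothetical; only the numeric layout is architectural. */
enum vec_source {
    VEC_CURRENT_EL_SP0   = 0, /* same EL, SPSel == 0 (using SP_EL0) */
    VEC_CURRENT_EL_SPX   = 1, /* same EL, SPSel == 1 (using SP_ELx) */
    VEC_LOWER_EL_AARCH64 = 2, /* e.g. EL0 userspace into EL1 */
    VEC_LOWER_EL_AARCH32 = 3
};

enum vec_type { VEC_SYNC = 0, VEC_IRQ = 1, VEC_FIQ = 2, VEC_SERROR = 3 };

/* Each source group spans 0x200 bytes; each entry inside it spans 0x80. */
static inline uint64_t vector_offset(enum vec_source src, enum vec_type type)
{
    return (uint64_t)src * 0x200u + (uint64_t)type * 0x80u;
}

/* Address the CPU branches to, given the value held in VBAR_ELx. */
static inline uint64_t vector_address(uint64_t vbar, enum vec_source src,
                                      enum vec_type type)
{
    return vbar + vector_offset(src, type);
}
```

With this layout, vector_offset(VEC_LOWER_EL_AARCH64, VEC_SYNC) is 0x400, and a synchronous exception taken from EL1 while SPSel is 0 uses offset 0x000.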
    DECLARE("TH_KSTACKPTR",
        offsetof(struct thread, machine.kstackptr));

    machine_stack_attach:
        thread->machine.kstackptr = stack + kernel_stack_size - sizeof(struct thread_kernel_state);
    stp x2, x3, [x0, SS64_X2]      (for the syscall case this is ACT_CONTEXT)
    stp x4, x5, [x0, SS64_X4]
    ...
    stp q0, q1, [x0, NS64_Q0]      saves NEON registers
    stp q2, q3, [x0, NS64_Q2]
    ...
fleh_dispatch: first level exception handler dispatcher (locore.s, hand-written assembly)
    * spill register state
    * indirect jump to fleh
FLEH: first level exception handler (locore.s, hand-written assembly)
    * load regs for a c function call
SLEH: second level exception handler (sleh.c, c code)
    * handle the exception
last chunk of assembly code before we call into C
fleh_synchronous:
    mrs x1, ESR_EL1      load the second and third arguments for sleh_synchronous
    mrs x2, FAR_EL1

void
sleh_synchronous(arm_context_t *context, uint32_t esr, vm_offset_t far)
{
    esr_exception_class_t class = ESR_EC(esr);
    arm_saved_state_t *state = &context->ss;
    ...
    switch (class) {
    case ESR_EC_SVC_64:                  this is the syscall handler
        if (!is_saved_state64(state) ||
            !PSR64_IS_USER(get_saved_state_cpsr(state)))
        {
            panic("Invalid SVC_64 context");
        }
        handle_svc(state);
        break;
sleh.c
void
sleh_synchronous(arm_context_t *context, uint32_t esr, vm_offset_t far)
{
    esr_exception_class_t class = ESR_EC(esr);
    arm_saved_state_t *state = &context->ss;
    ...
    switch (class) {
    ...
    case ESR_EC_BKPT_REG_MATCH_EL1:
        if (FSC_DEBUG_FAULT == ISS_SSDE_FSC(esr)) {
            kprintf("Hardware Breakpoint Debug exception from kernel. Hanging here (by design).\n");
            for (;;);
    __unreachable_ok_push
            DebuggerCall(EXC_BREAKPOINT, &context->ss);
            break;
    __unreachable_ok_pop
        }
        panic("Unsupported Class %u event code. state=%p class=%u esr=%u far=%p",
            class, state, class, esr, (void *)far);
        assert(0); /* Unreachable */
        break;
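
To make the ESR_EC(esr) switch above less abstract, here is a small sketch of how the syndrome value splits into fields. The bit positions and exception class numbers are from the ARMv8-A manual as far as I know them; the helper names are invented for this example and are not the XNU macros, and equating FSC_DEBUG_FAULT with 0x22 is an assumption.

```c
#include <stdint.h>

/* Exception class values (ESR_ELx bits [31:26]); names are hypothetical. */
enum {
    EC_SVC_AARCH64   = 0x15, /* SVC executed in AArch64 state */
    EC_BKPT_LOWER_EL = 0x30, /* hardware breakpoint, from a lower EL */
    EC_BKPT_SAME_EL  = 0x31  /* hardware breakpoint, from the same EL */
};

/* Debug fault status code in ISS[5:0] for breakpoint exceptions
 * (assumed to be what FSC_DEBUG_FAULT stands for). */
#define IFSC_DEBUG_FAULT 0x22u

static inline uint32_t esr_ec(uint32_t esr)  { return (esr >> 26) & 0x3fu; }
static inline uint32_t esr_il(uint32_t esr)  { return (esr >> 25) & 0x1u; }
static inline uint32_t esr_iss(uint32_t esr) { return esr & 0x1ffffffu; }

/* For an SVC, the low 16 bits of the ISS hold the instruction's immediate. */
static inline uint32_t esr_svc_imm(uint32_t esr) { return esr_iss(esr) & 0xffffu; }

/* For a breakpoint, the low 6 bits of the ISS hold the fault status code. */
static inline uint32_t esr_ifsc(uint32_t esr) { return esr_iss(esr) & 0x3fu; }

/* True when the syndrome describes a hardware breakpoint hit in the kernel. */
static inline int esr_is_kernel_hw_breakpoint(uint32_t esr)
{
    return esr_ec(esr) == EC_BKPT_SAME_EL && esr_ifsc(esr) == IFSC_DEBUG_FAULT;
}
```

Under these assumptions, an svc #0x80 from userspace would show up as ESR 0x56000080, and a kernel hardware breakpoint as 0xC6000022.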
differences for SYNC EL1_SP0 to EL1 (VBAR_EL1)
(diagram: stack pointers SP_EL0, SP_EL1, SP_EL2, SP_EL3; SPSel selects SP_EL1)
this means a synchronous exception which originated in the kernel, like a hardware breakpoint exception!
cpu will switch SP to alias SP_EL1 for us, so we're on the core's exception stack now
first difference:
we could be here due to the kernel's stack pointer being wrong (eg stack overflow, stack buffer overflow)
so we should probably try to detect that first:
    sub sp, sp, ARM_CONTEXT_SIZE     make space on the per-core exception stack for a full register dump, just in case we will panic
    stp x0, x1, [sp, SS64_X0]
    mrs x1, ESR_EL1                  do some checking to see if this could be a problem with the thread's kernel stack
let's assume this is okay
    msr SPSel, #0                    switch to SP_EL0 (which is the thread's kernel stack)
    sub sp, sp, ARM_CONTEXT_SIZE     make space on the thread's kernel stack for a full register dump
    stp x0, x1, [sp, SS64_X0]
    add x0, sp, ARM_CONTEXT_SIZE     fill in the correct sp value
    str x0, [sp, SS64_SP]
    ...
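
The "fill in the correct sp value" step is easy to misread, so here is a tiny C model of the arithmetic. The point is that the saved sp must be the value from before the frame was pushed, not the current one. CONTEXT_SIZE is a placeholder number chosen for the example, not the real ARM_CONTEXT_SIZE, and the struct and function are hypothetical.

```c
#include <stdint.h>

/* Placeholder size; the real ARM_CONTEXT_SIZE comes from the XNU build. */
#define CONTEXT_SIZE 0x350u

struct push_result {
    uint64_t new_sp;   /* sp after making room for the register dump */
    uint64_t saved_sp; /* value written into the dump's SS64_SP slot */
};

static inline struct push_result push_context(uint64_t sp)
{
    struct push_result r;
    r.new_sp = sp - CONTEXT_SIZE;          /* sub sp, sp, ARM_CONTEXT_SIZE */
    r.saved_sp = r.new_sp + CONTEXT_SIZE;  /* add x0, sp, ARM_CONTEXT_SIZE */
    return r;                              /* str x0, [sp, SS64_SP] */
}
```

Whatever the real size is, saved_sp ends up equal to the interrupted code's stack pointer, which is what a debugger wants to see when it reads the registers back.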
causes synchronous HWBP exception, from EL1 running on SP0 to EL1
hardware breakpoint is serviced on the thread's kernel stack; registers spilled there (struct arm_context), followed by the hw_bp handling frames
schedule off: if we'll be scheduled off, FIQ posts AST, handled right before FIQ ERET's
dumps current state to top of stack
how to get hardware breakpoints to fire
read the manual :)
D2.4 Enabling debug exceptions from the current Exception level and Security state
structure of hardware breakpoint registers
MDSCR_EL1: global enable bits; per core
DBGBVR<1..15>_EL1: addresses where we want hardware breakpoints to fire
DBGBCR<1..15>_EL1: breakpoint control registers
    bcr |= ARM_DBG_CR_MODE_CONTROL_ANY;      set the flag to fire the bp in all ELs
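
A minimal sketch of the values that end up in these registers, assuming the ARMv8-A bit layout for DBGBCR and MDSCR_EL1. The macro and function names are made up for this example; treating ARM_DBG_CR_MODE_CONTROL_ANY as PMC = 0b11 is my assumption about the XNU constant. This only computes the values, it does not write any register.

```c
#include <stdint.h>

/* DBGBCR<n>_EL1 fields (hypothetical names, architectural positions). */
#define BCR_ENABLE       (1u << 0)    /* E: breakpoint enabled */
#define BCR_PMC_ANY      (3u << 1)    /* PMC = 0b11: match in EL1 and EL0 */
#define BCR_BAS_A64      (0xfu << 5)  /* BAS: match the whole 4-byte instruction */

/* MDSCR_EL1 global enable bits. */
#define MDSCR_KDE        (1u << 13)   /* allow debug exceptions taken from EL1 */
#define MDSCR_MDE        (1u << 15)   /* enable breakpoint/watchpoint exceptions */

/* Control value for an enabled instruction-address breakpoint. */
static inline uint32_t hwbp_control(void)
{
    return BCR_ENABLE | BCR_PMC_ANY | BCR_BAS_A64;
}

/* Value register: A64 instructions are 4-byte aligned, low two bits are clear. */
static inline uint64_t hwbp_value(uint64_t instruction_address)
{
    return instruction_address & ~(uint64_t)3;
}

/* Set the global enable bits without disturbing the rest of MDSCR_EL1. */
static inline uint32_t mdscr_with_debug_enabled(uint32_t mdscr)
{
    return mdscr | MDSCR_KDE | MDSCR_MDE;
}
```

Under these assumptions the control value works out to 0x1e7. Section D2.4 of the manual, cited above, also lists the PSTATE conditions that must hold before a breakpoint taken at EL1 is actually delivered.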
(stack diagram: syscall frames, then struct arm_context spilled when the breakpoint causes a synchronous HWBP exception, from EL1 running on SP0 to EL1)
client also listens on a port for exception messages from the server
    KDP_CONNECT            0      KDP_BREAKPOINT64_SET      22
    KDP_REATTACH          18      KDP_BREAKPOINT64_REMOVE   23
    KDP_VERSION            3      KDP_RESUMECPUS            12
    KDP_HOSTINFO           2      KDP_READREGS               7
    KDP_DISCONNECT         1      KDP_WRITEREGS              8
    KDP_KERNELVERSION     24      KDP_KERNEL_CONTINUE       27  (KDP_READIOPORT)
    KDP_READMEM64         20      KDP_KERNEL_SINGLE_STEP    28  (KDP_WRITEIOPORT)
    KDP_WRITEMEM64        21
They pretty much all do what you'd expect
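
For a server that only has to handle this subset, the command numbers above can be kept in one small table. This is a sketch: the numbers and names are copied from the list above (with 27 and 28 under the names used here rather than the stock I/O port names), while the struct and lookup functions are hypothetical helpers.

```c
#include <stddef.h>
#include <string.h>

struct kdp_req_entry {
    int code;
    const char *name;
};

/* The commands listed above, in numeric order. */
static const struct kdp_req_entry kdp_reqs[] = {
    {  0, "KDP_CONNECT" },          {  1, "KDP_DISCONNECT" },
    {  2, "KDP_HOSTINFO" },         {  3, "KDP_VERSION" },
    {  7, "KDP_READREGS" },         {  8, "KDP_WRITEREGS" },
    { 12, "KDP_RESUMECPUS" },       { 18, "KDP_REATTACH" },
    { 20, "KDP_READMEM64" },        { 21, "KDP_WRITEMEM64" },
    { 22, "KDP_BREAKPOINT64_SET" }, { 23, "KDP_BREAKPOINT64_REMOVE" },
    { 24, "KDP_KERNELVERSION" },    { 27, "KDP_KERNEL_CONTINUE" },
    { 28, "KDP_KERNEL_SINGLE_STEP" }
};

#define KDP_REQ_COUNT (sizeof(kdp_reqs) / sizeof(kdp_reqs[0]))

/* Returns the command name, or NULL for a command this server ignores. */
static inline const char *kdp_req_name(int code)
{
    for (size_t i = 0; i < KDP_REQ_COUNT; i++) {
        if (kdp_reqs[i].code == code)
            return kdp_reqs[i].name;
    }
    return NULL;
}

/* Returns the command number, or -1 if the name is not in the table. */
static inline int kdp_req_code(const char *name)
{
    for (size_t i = 0; i < KDP_REQ_COUNT; i++) {
        if (strcmp(kdp_reqs[i].name, name) == 0)
            return kdp_reqs[i].code;
    }
    return -1;
}
```

A lookup that returns NULL gives the packet handler a simple way to drop any request outside the supported set.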
KDP packet structure
Header (first 32-bit word, then the session key):
    bits 31..16     total length of packet, including this header
    bits 15..8      sequence number
    bit  7          reply
    bits 6..0       command
    next 32 bits    session key
The same header starts every layout shown; one layout follows the session key with a reply port, another with an error.
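
The header layout can be packed and unpacked with plain shifts, which avoids depending on compiler bitfield ordering. This sketch assumes the first word is sent little-endian, so the command and reply bit travel in the first byte; that matches my reading of the diagram but is an assumption, and the struct and function names are invented for the example.

```c
#include <stdint.h>

#define KDP_HDR_SIZE 8 /* one 32-bit header word plus the 32-bit session key */

/* Decoded view of the header; hypothetical type, not the XNU one. */
struct kdp_hdr_fields {
    uint8_t  command;  /* bits 6..0 */
    uint8_t  is_reply; /* bit 7 */
    uint8_t  seq;      /* bits 15..8 */
    uint16_t len;      /* bits 31..16, total length including the header */
    uint32_t key;      /* session key */
};

static inline void kdp_hdr_pack(const struct kdp_hdr_fields *h,
                                uint8_t out[KDP_HDR_SIZE])
{
    uint32_t w0 = (uint32_t)(h->command & 0x7fu)
                | ((uint32_t)(h->is_reply & 1u) << 7)
                | ((uint32_t)h->seq << 8)
                | ((uint32_t)h->len << 16);
    for (int i = 0; i < 4; i++) {
        out[i]     = (uint8_t)(w0 >> (8 * i));     /* little-endian word 0 */
        out[4 + i] = (uint8_t)(h->key >> (8 * i)); /* little-endian key */
    }
}

static inline void kdp_hdr_unpack(const uint8_t in[KDP_HDR_SIZE],
                                  struct kdp_hdr_fields *h)
{
    uint32_t w0 = 0, key = 0;
    for (int i = 0; i < 4; i++) {
        w0  |= (uint32_t)in[i] << (8 * i);
        key |= (uint32_t)in[4 + i] << (8 * i);
    }
    h->command  = (uint8_t)(w0 & 0x7fu);
    h->is_reply = (uint8_t)((w0 >> 7) & 1u);
    h->seq      = (uint8_t)((w0 >> 8) & 0xffu);
    h->len      = (uint16_t)(w0 >> 16);
    h->key      = key;
}
```

A reply can then reuse the request's sequence number and session key, set is_reply, and fill in its own length.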
LLDB uses this to parse the loaded kernel, find loaded kexts etc
this is the stock macOS lldb you get with Xcode
example session
[uncompressed kernel cache from IPSW, extract with joker]
$ lldb kernelcache.ip7_11_1_2.uncomp
(lldb) target create "kernelcache.ip7_11_1_2.uncomp"
Current executable set to 'kernelcache.ip7_11_1_2.uncomp' (arm64).
(lldb) kdp-remote 172.20.10.11
[172.20.10.11 is the iPhone's IP]
Version: Darwin Kernel Version 17.2.0: Fri Sep 29 18:14:50 PDT 2017;
root:xnu-4570.20.62~4/RELEASE_ARM64_T8010; UUID=5E450F40-E224-33F7-946B-A764D21DF3FC;
stext=0xfffffff021804000
[kernel version string we built]
Kernel UUID: 5E450F40-E224-33F7-946B-A764D21DF3FC
Load Address: 0xfffffff021804000
[lldb client computes this from stext]
Kernel slid 0x1a800000 in memory.
Loaded kernel file
/Users/ianbeer/prog/ios/iPhone7_firmwares/11.1.2/kernelcache.ip7_11_1_2.uncomp
Loading 165 kext modules warning: Can't find binary/dSYM for com.apple.kec.corecrypto
(B3028F6D-3547-37E1-B166-DB8972637087)
.warning: Can't find binary/dSYM for com.apple.kec.Libm
(51AFA03E-8041-3D11-BD40-A6D1AED1C667)
.warning: Can't find binary/dSYM for com.apple.kec.pthread
(422770EA-D9A0-3B84-B683-15A6910AB51E)
.warning: Can't find binary/dSYM for com.apple.iokit.IOSlowAdaptiveClockingFamily
(1D16EC28-554A-3C74-B14A-AA62B624EDF1)
...
. done.
[for macOS kernel debug we get some of these; nothing for iOS]
Process 1 stopped
* thread #1, stop reason = signal SIGSTOP
frame #0: 0xfffffff0218cc474
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol49$$kernelcache.ip7_11_1_2.uncomp
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol49$$kernelcache.ip7_11_1_2.uncomp:
-> 0xfffffff0218cc474 <+0>: msr DAIFSet, #0x3
   0xfffffff0218cc478 <+4>: mrs x3, TPIDR_EL1
   0xfffffff0218cc47c <+8>: mov sp, x21
Target 0: (kernelcache.ip7_11_1_2.uncomp) stopped.
[this is a lie, we're not actually stopped here. The initial connection stopped state is faked.]
[can't single step yet, but can set a breakpoint and continue]
(lldb)
(lldb)
(lldb) bt
* thread #1, stop reason = breakpoint 1.1
* frame #0: 0xfffffff021900fbc
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol428$$kernelcache.ip7_11_1_2.uncomp
frame #1: 0xfffffff021984b00
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol1211$$kernelcache.ip7_11_1_2.uncomp
+ 688
frame #2: 0xfffffff0218e2e80
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol197$$kernelcache.ip7_11_1_2.uncomp
+ 2704
frame #3: 0xfffffff0218f2458
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol307$$kernelcache.ip7_11_1_2.uncomp
+ 972
frame #4: 0xfffffff0219deff8
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol1697$$kernelcache.ip7_11_1_2.uncomp
+ 4388
frame #5: 0xfffffff0218cc1e0
kernelcache.ip7_11_1_2.uncomp`___lldb_unnamed_symbol34$$kernelcache.ip7_11_1_2.uncomp +
40
(lldb) x/1xg $x0
0xffffffe027b138e8: 0x000000000000003f
[we set the breakpoint at kalloc_canblock, the first argument is a pointer to the size to allocate]
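
The session reports both a load address and a slide, which is enough to map addresses from the backtrace back to the static kernelcache. The sketch below shows the arithmetic. The unslid base 0xfffffff007004000 is not printed in the session; it is simply the load address minus the slide, and the function names are hypothetical.

```c
#include <stdint.h>

/* Slide = where the kernel was actually loaded minus where it was linked. */
static inline uint64_t kernel_slide(uint64_t load_address, uint64_t unslid_base)
{
    return load_address - unslid_base;
}

/* Convert a runtime address into the address seen in the static file. */
static inline uint64_t unslide(uint64_t runtime_address, uint64_t slide)
{
    return runtime_address - slide;
}

/* Convert a static address into where it lives on the running device. */
static inline uint64_t apply_slide(uint64_t static_address, uint64_t slide)
{
    return static_address + slide;
}
```

With the numbers from the session, frame #0 at 0xfffffff021900fbc corresponds to 0xfffffff007100fbc in the unslid image.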
KPP/KTRR: if you can single step a kernel thread with them there, you can
probably steal WhatsApp/WeChat/etc messages, log GPS etc.
release