Convoluted Boot
Difficulty: Hard
Classification: Official
Synopsis
Convoluted Boot is a hard Reversing challenge. Players must reverse engineer a series of
backdoors in the Master Boot Record, kernel, libc and busybox to uncover a final password
checker.
Skills Required
Knowledge of OS fundamentals
Skills Learned
x86 fundamentals
Solution
To begin, we are given convolutedboot.bin. Running strings on it reveals the following
command:
qemu-system-i386 -boot n -device e1000,netdev=mynet0,mac=52:54:00:12:34:56 -netdev user,id=mynet0,net=192.168.76.0/24,dhcpstart=192.168.76.9,tftp=./,bootfile=convolutedboot.bin
Running this gets us a system that uses PXE boot (a form of network booting) to boot into a
32-bit Tiny Core Linux shell. The kernel doesn't seem to have KASLR enabled. Let's reverse the
bootfile, which the BIOS loads like an MBR and which starts off in 16-bit real mode.
Bootfile
For some fundamental knowledge, we can read this article, also written by the challenge author.
Other helpful links include the PXE spec, the OS wiki page on PXE, a reference for the BIOS Data
Area, and a reference for the BIOS interrupt table.
For starters, it helps to define a base address in your reverse engineering tool of choice.
According to the OS wiki, the MBR is loaded at 0x7c00. Throughout this writeup, you will see
comments next to the disassembly, left over from when I was solving this challenge. While most of
them are probably accurate, I wouldn't completely rely on them - just focus on what is discussed
in this writeup.
The function starts by retrieving the PXE structures with int 0x1a and storing some information
about them (such as the PXE API entry point at 0x7f63). The print string functions can be
identified because they call 0x7e4d in a loop, which triggers BIOS interrupt 0x10 with AH set to
0xe (write character in TTY mode).
To figure out the offsets in the structure and the behavior of int 0x1a, refer to the PXE
spec.
Next, it modifies the total available memory recorded in the BIOS Data Area in preparation for
injecting the backdoor. It then copies the bootkit (0x684 bytes long, starting at 0x7f81) to the
region of memory it reserved, configures networking with PXENV_GET_CACHED_INFO, and then makes
several calls with string commands to the PXE API with bx set to 0xe5. I couldn't find that API
function in the spec; it turns out to be specific to iPXE and stands for PXENV_FILE_EXEC.
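The memory reservation trick can be sketched in a few lines. This is only a model under assumptions: I assume the field being edited is the conventional memory size word at BDA offset 0x13 (linear address 0x413), which counts KiB, and the starting value of 639 KiB and the 20 KiB reduction are illustrative values, chosen only because the result lines up with the 0x9ac00 address that shows up later during debugging.

```python
BDA_MEM_KIB = 0x413   # assumed target: BDA word holding conventional memory size in KiB

def reserve_top_of_memory(mem_kib, reserve_kib):
    """Shrink the reported memory size; return (new_kib, segment, linear)."""
    new_kib = mem_kib - reserve_kib
    linear = new_kib * 1024      # first byte no longer reported as usable
    segment = linear >> 4        # real-mode segment of that address
    return new_kib, segment, linear

# Illustrative values only (not taken from the disassembly)
new_kib, seg, lin = reserve_top_of_memory(639, 20)
print(new_kib, hex(seg), hex(lin))
```

Anything the OS later asks the BIOS about usable low memory will stop short of that address, which is what keeps the copied bootkit from being overwritten.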
Lastly, it pushes two return addresses for subsequent ret far instructions. The first pushed is
0x7d2d, which simply executes boot with PXENV_FILE_EXEC. The second one pushed is where the
backdoor function was copied, and this is where the code flows to next.
The injected code begins by doing some runtime relocation of itself, and then hunts for some
magic bytes in memory.
Let's handle all the relocations for this part of the backdoor now. I used the following script (the
EAX value is derived from debugging, which I will discuss very soon):
data = b''
BASE = 0x7c00
I loaded this newly created binary into Binary Ninja, and after loading it at the desired addresses,
here is what this function looks like.
Once it finds the correct sequence after multiple comparisons, it modifies the code there before
doing another ret far back into the function mentioned earlier, which just boots the system.
I wasn't sure exactly what it was overwriting here, so I decided to debug this. While GDB real mode
support is really a disaster, I managed to get it working by using suggestions from this comment.
It turns out the backdoor overwrites memory at 0x7ec4de8. This debugging session also
helped me work out that the injected code region sits at address 0x9ac00. The injected code
replaces the following sequence: 8b 84 24 e0 00 00 00 81 c4 bc 00 00 00 5b 5e 5f 5d c3
with the following:
x/18bx 0x7ec4de8
0x7ec4de8: 0xbb 0x34 0xad 0x09 0x00 0xff 0xd3 0xeb
0x7ec4df0: 0xfe 0x5e 0xbe 0xe0 0xa1 0x70 0x00 0xb8
0x7ec4df8: 0x9e 0xad
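The first seven bytes of that dump can be decoded by hand. Here is a small sketch that does it; it only handles the two opcodes present (0xbb and 0xff 0xd3) and assumes the bootkit was copied to the very start of the 0x9ac00 region, so the offset it prints is relative to that base.

```python
import struct

stub = bytes([0xbb, 0x34, 0xad, 0x09, 0x00, 0xff, 0xd3])  # first 7 bytes of the dump
INJECT_BASE = 0x9ac00   # injected region, from the debugging session
BOOTKIT_LEN = 0x684     # bootkit size, from the copy described earlier

assert stub[0] == 0xbb                       # B8+r with r=3: mov ebx, imm32
target = struct.unpack_from('<I', stub, 1)[0]
assert stub[5:7] == b'\xff\xd3'              # FF /2 with ModRM 0xd3: call ebx

offset = target - INJECT_BASE
assert 0 <= offset < BOOTKIT_LEN             # the call lands inside the bootkit
print(f'mov ebx, {target:#x}; call ebx -> region offset {offset:#x}')
```

So the patched epilogue loads an absolute address inside the injected region and calls it.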
By logging the traffic in Wireshark, I was able to figure out the physical file offset of this
modification. Basically, it converted:
005a0de8  8b8424e0000000  mov     eax, dword [esp+0xe0 {arg5}]
005a0def  81c4bc000000    add     esp, 0xbc
005a0df5  5b              pop     ebx {__saved_ebx} {data_5baa00}
005a0df6  5e              pop     esi {__saved_esi}
005a0df7  5f              pop     edi {__saved_edi}
005a0df8  5d              pop     ebp {__saved_ebp}
005a0df9  c3              retn    {__return_addr}
into:
Since the shell tells us the kernel version (5.10.3), we can work out that this backdoor targets
the kernel extraction function. It forces the kernel to call into the injected backdoor memory
region before returning from a successful decompression. At that point the kernel has already
left real mode and entered 32-bit protected mode.
The called function simply accesses a function pointer table at 0xc070a1e0 (which is the syscall
table), saves 4 of the original pointers and then replaces them. The syscalls it targets are
openat (0x127), mmap2 (0xc0), close (0x6), and 0xde (unimplemented). The address seen above is
technically 0x70a1e0, but the kernel is loaded at physical address 0x100000 (and the virtual
start address without KASLR is 0xc0100000). It replaces the function pointers with pointers into
the backdoored region - this works because virtual address 0xc0000000 is essentially direct
mapped to physical address 0x0 in 32-bit Linux.
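To make the address arithmetic concrete, here is a small sketch that computes where each hooked slot lives. The constants come from the text above; the helper names (phys_to_virt, slot) are mine, and I assume 4-byte table entries, as expected on a 32-bit kernel.

```python
PAGE_OFFSET = 0xc0000000        # start of the kernel direct map on 32-bit x86
SYS_CALL_TABLE_PHYS = 0x70a1e0  # address as it appears in the backdoor code
PTR_SIZE = 4                    # one pointer per syscall number

HOOKED = {'close': 0x6, 'mmap2': 0xc0, 'unimplemented': 0xde, 'openat': 0x127}

def phys_to_virt(addr):
    """Direct-map translation: virtual = physical + PAGE_OFFSET."""
    return addr + PAGE_OFFSET

def slot(nr):
    """Virtual address of the table entry for syscall number nr."""
    return phys_to_virt(SYS_CALL_TABLE_PHYS) + nr * PTR_SIZE

for name, nr in HOOKED.items():
    print(f'{name:14s} nr={nr:#5x} slot={slot(nr):#x}')

# The same translation makes the injected region reachable from kernel space
print(hex(phys_to_virt(0x9ac00)))
```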
From here, with the patched backdoored bootloader, the other syscall backdoor functions can
easily be analyzed. Recall that syscall implementations in the kernel do not read their arguments
from the registers, but rather read them in the style of asmlinkage.
Even with no special __attribute__ to force the compiler to use the stack for parameters,
syscalls in recent kernel versions still take parameters from the stack indirectly through a pointer
to a pt_regs structure (holding all the user-space registers saved on the stack on syscall entry). This
is achieved through a moderately complex set of macros (SYSCALL_DEFINEx) that does everything
transparently.
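In practice this means a hook receives one pointer and has to pull its arguments out of the saved registers. A rough sketch of that lookup, relying on the 32-bit x86 convention that arguments 1 to 6 travel in ebx, ecx, edx, esi, edi and ebp, which are also the first six fields of pt_regs there. The register snapshot is made up for the example.

```python
import struct

# Leading fields of the 32-bit x86 struct pt_regs, in order. These are also
# the registers that carry syscall arguments 1..6.
ARG_REGS = ('bx', 'cx', 'dx', 'si', 'di', 'bp')

def syscall_args(pt_regs, nargs):
    """Read the first nargs syscall arguments out of a raw pt_regs blob."""
    regs = struct.unpack_from('<6I', pt_regs, 0)
    return regs[:nargs]

# Hypothetical snapshot for mmap2(addr, len, prot, flags, fd, pgoff)
fake_regs = struct.pack('<6I', 0, 0x1000, 5, 2, 3, 0x19)
addr, length, prot, flags, fd, pgoff = syscall_args(fake_regs, 6)
print(dict(zip(ARG_REGS, syscall_args(fake_regs, 6))))
```

This is why the hooks below read things like the fd and the protection flags at fixed offsets from a single pointer argument.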
Now, let's go through each of the backdoored syscalls. In general, they all act as a hook before
calling the real syscall.
In openat, it first checks if the file argument is "/lib/libc.so.6" - if so, it saves the
resulting fd at 0x9adae.
The backdoored mmap2 starts off in a similar way, checking whether its fd argument matches the
saved libc fd. It also checks if the protection argument has the execute flag set, and if so,
adds the writeable flag (effectively making all libc.so.6 RX regions into RWX).
Next, it checks for a marker at the start of the page before injecting two patches, at offsets
0xc6952 and 0xa06 from the first executable page. Readelf tells us that there is only one
executable region in the ELF and it starts at 0x019000.
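The two hooks are easier to follow as a model. This sketch mirrors only the logic described above; the function names and the state dictionary are mine, and the marker check and the actual byte injection are left out.

```python
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
LIBC_PATH = '/lib/libc.so.6'
TEXT_START = 0x019000                 # start of libc's only executable region (readelf)
PATCH_OFFSETS = (0xc6952, 0xa06)      # relative to the first executable page

state = {'libc_fd': None}             # stands in for the dword at 0x9adae

def hooked_openat(path, real_openat):
    """Call the real openat, remembering the fd if the file is libc."""
    fd = real_openat(path)
    if path == LIBC_PATH:
        state['libc_fd'] = fd
    return fd

def hooked_mmap2_prot(fd, prot):
    """Return the protection flags handed on to the real mmap2."""
    if fd == state['libc_fd'] and prot & PROT_EXEC:
        prot |= PROT_WRITE            # RX becomes RWX so the page can be patched
    return prot

fd = hooked_openat(LIBC_PATH, lambda path: 3)    # pretend the kernel returned fd 3
print(hooked_mmap2_prot(fd, PROT_READ | PROT_EXEC))
print([hex(TEXT_START + off) for off in PATCH_OFFSETS])
```

The last line gives the two patch locations relative to the start of the libc image.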
After dumping the backdoor bytes, I wrote the following libc patcher (the system uses libc-2.32.so):
patch1_len = 0x57
patch2_len = 5
# Backdoor body, injected at 0x019000 + 0xc6952
patch1 = (
    b'\x05\xfa\xb5\x14\x00`\xe8\x00\x00\x00\x00[\x81\xeb]\xf9\r\x00'
    b'\x81\xc3\xacN\x16\x00\x8b\x1b\x8b\x1b\x8b\x1b\x81\xfbcat\x00u/'
    b'j\x07h\xe08\x06\x00h\x00\xc0\x04\x08\xe8\x00\x00\x00\x00['
    b'\x81\xeb\x89\xf9\r\x00\x81\xc3\x80\x80\x0b\x00\xff\xd3\x83\xc4\x0c'
    b'\xbb\xad\xb7\xad\r\xb8\xde\x00\x00\x00\xcd\x80a\xc3'
)
# 5-byte call, injected at 0x019000 + 0xa06
patch2 = b'\xe8\x47\x5f\x0c\x00'
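A quick sanity check ties the two patches together: the 5-byte patch is a relative call, and its target should be the start of the larger patch. A minimal sketch, where call_target is my own helper and both locations are taken relative to the same libc image base:

```python
import struct

TEXT_START = 0x019000
CALL_SITE = TEXT_START + 0xa06        # where the 5-byte patch lands
BODY_START = TEXT_START + 0xc6952     # where the 0x57-byte patch lands
patch2 = b'\xe8\x47\x5f\x0c\x00'

def call_target(site, insn):
    """Resolve an x86 'call rel32' (opcode 0xe8) located at address site."""
    assert len(insn) == 5 and insn[0] == 0xe8
    rel = struct.unpack('<i', insn[1:])[0]   # signed 32-bit displacement
    return site + 5 + rel                    # relative to the next instruction

print(hex(call_target(CALL_SITE, patch2)), hex(BODY_START))
```

Both values come out the same, so the small patch exists only to redirect execution into the large one.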
Opening the patched libc and looking at the patch at 0x019000 + 0xa06, we see that it
backdoors __libc_start_main to call key_decryptsession:
As expected, the other patch overwrites key_decryptsession. It checks whether the name of the
currently running program is cat. If so, it calls mprotect(0x804c000, 0x638e0,
PROT_READ | PROT_WRITE | PROT_EXEC) and then triggers the unimplemented syscall backdoor
with argument 0xdadb7ad.
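Those arguments can be read straight out of the patch bytes. The sketch below decodes two excerpts of patch1 by hand (the three pushes before the mprotect call, and the register setup before int 0x80); it recognises only the specific opcodes involved and is not a general disassembler.

```python
import struct

# Excerpts of patch1: the argument pushes and the syscall setup
pushes = b'j\x07h\xe08\x06\x00h\x00\xc0\x04\x08'
trigger = b'\xbb\xad\xb7\xad\r\xb8\xde\x00\x00\x00\xcd\x80'

# cdecl pushes arguments right to left: prot, then length, then address
assert pushes[0] == 0x6a                           # push imm8
prot = pushes[1]
assert pushes[2] == 0x68 and pushes[7] == 0x68     # push imm32, twice
length = struct.unpack_from('<I', pushes, 3)[0]
addr = struct.unpack_from('<I', pushes, 8)[0]

assert trigger[0] == 0xbb and trigger[5] == 0xb8   # mov ebx, imm32 / mov eax, imm32
arg = struct.unpack_from('<I', trigger, 1)[0]
nr = struct.unpack_from('<I', trigger, 6)[0]
assert trigger[10:] == b'\xcd\x80'                 # int 0x80

print(f'mprotect({addr:#x}, {length:#x}, {prot}); syscall {nr:#x}({arg:#x})')
```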
Since this system runs a busybox shell, cat is an applet inside the busybox binary - this
becomes important as the unimplemented syscall will be backdooring busybox. The start of the
function looks like this:
It first calculates a constant based on the syscall argument and stores it. It then checks if
address 0x80a4479 still holds 0x53565755 (the bytes 55 57 56 53, i.e. an untouched push ebp;
push edi; push esi; push ebx prologue) - if it does, it jumps to the following procedure:
It's important to note that the first backdoor is created via an XOR scheme dependent on busybox's
code at the same location.
In all later calls of this unimplemented syscall backdoor, it checks a counter and patches two
dwords that are always at the same addresses. The counter is also incremented afterwards.
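The first-call behaviour can be modelled roughly as follows. The function name first_stage and the toy data are mine; the real blob and the real busybox bytes are handled by the patcher script, and the counter path of later calls is not modelled.

```python
import struct

MARKER = 0x53565755
# Little-endian, the marker is 55 57 56 53: push ebp; push edi; push esi; push ebx
PROLOGUE = struct.pack('<I', MARKER)

def first_stage(code, xor_blob):
    """XOR the blob over the code, but only while the prologue is untouched."""
    assert len(code) == len(xor_blob)
    if code[:4] != PROLOGUE:
        return code                  # already patched: later calls take the counter path
    return bytes(c ^ x for c, x in zip(code, xor_blob))

# Toy data: pretend the function is 8 bytes and the desired backdoor is 8 x int3
original = PROLOGUE + b'\x90' * 4
wanted = b'\xcc' * 8
blob = bytes(o ^ w for o, w in zip(original, wanted))

patched = first_stage(original, blob)
print(patched == wanted, first_stage(patched, blob) == patched)
```

Because the blob is only meaningful when XORed with the original code, dumping it from memory is not enough; the matching busybox bytes are needed to recover the backdoor.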
from pwn import ELF

elf = ELF('./busybox')
with open('./busybox', 'rb') as f:
    data = f.read()
patch1_off = elf.vaddr_to_offset(0x80a4479)
patch2_off = elf.vaddr_to_offset(0x80ae28f)
patch1_len = 0x104
patch2_len = 5
patch1_end = patch1_off + patch1_len
# xor decrypt: the blob is keyed on busybox's original code at the same location
xor = (
    b'\xd6\xafV\\\x05\x1e\xcc\x00\x00\x80\xb3\xef+a\xe9\x00\x00\xc7\x85x\x86i\x07\x03'
    b'W\x0b\x081I\xc2\x00c\x0f\x85m\x02\x00\x00\x80\x90\xc2\xee\x0f\x85q\x03\x00\x00'
    b'\x80\x90`\xf8\x0f\x85y\xbe\x11\x0c\x88\x90\xef\xfd\xf0z\x14\xf8\x86\x0c\x88\xc0'
    b'.+\x03\x8dE\xaf\x89\xff\xae\xf1\xac>,\xc9Q`\xffW\x8f\xcc\x81\xc1\xe4\x13\xd8\xcf'
    b'\x8dD$JP\xd9\x8f\xec/[X\xe8\x88\x91a\xf3\x00ZNJuSB\x04S\x00=$\xf6\xb5j\xaf\x17'
    b'T\xe2\xc6\x91J\xb1\xe0\x1e\x9dEm\x05\xb2\xd2\\\xb0[\xc0u\x08\x9a\xe8\x1e\xeaT1'
    b':W\xc2~\x0c\x00\x00\x00\xf7\xd5\x8c\x01\x9f\xab\x19%\xecF\xe3\xbc\x00t\x96\xe9\xf3'
    b'\xdf\xef\xb2\x8a\xc69\x1aJ\xfcfm\xa9\xbf\xccH\xf3\xdf\x82\x10n\xd7j\xe7\xfa\xd3'
    b'\xed\x06\xf7\xd6`\x14G\x16\xd9\x0c\x8b\x14_\xfd\xa3a#\xae(\\\x07<\xfcff\x14\n'
    b'\xcb\x9by\xdbb\xa2\x87\xdbju\xf0\x85\xd0e\xe2\x08\x00\x00\xd0\xd0\xc3\xea\x08\x00'
    b'\x14#N\xdf'
)
original = data[patch1_off:patch1_end]
assert len(xor) == len(original)
# Indexing bytes yields ints in Python 3, so XOR them directly
patch1 = bytes(x ^ o for x, o in zip(xor, original))
patch2 = b'\xe8\xe5\x61\xff\xff'
The patch at 0x80ae28f simply forces a call to 0x80a4479. The code at 0x80a4479 begins by
checking whether the first argument to cat is secret:
It then ensures that the secret file exists, reads 4 bytes from it, performs a calculation on
them, and passes the result to the backdoored syscall. This repeats for 7 iterations - notice
how 3 of the addresses here correspond to the later rounds of patching from the backdoored
syscall (the 2 constant patches and the counter that tells the backdoor what to patch)!
Recall that the backdoored syscall calculates a value from its integer argument - it actually
returns this value at the end. Also notice how the 2 constant patches relate: the first patches
the multiplication constant, and the second patches the comparison constant that each round of
checking must match. This is quite a crazy flag checker! If all 7 rounds succeed, the
backdoor outputs "Correct!", otherwise it outputs "Wrong!".
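Why a multiply-then-compare round is solvable at all deserves a quick illustration. The exact round function lives in the disassembly and is not reproduced here, so the forward function below is a hypothetical stand-in (a plain 32-bit wrapping multiply) and will not recover the flag by itself. The multiplier is the value the solver calls original_factor; since it is odd, it has an inverse modulo 2^32.

```python
MASK32 = 0xffffffff
factor = 0xe296df0b     # original_factor from the solver; odd, so invertible mod 2**32

def forward(x, k):
    """Hypothetical round: 32-bit wrapping multiply (an assumption, see above)."""
    return (x * k) & MASK32

def invert(target, k):
    """Recover x from forward(x, k) via the modular inverse of k (Python 3.8+)."""
    return (target * pow(k, -1, 1 << 32)) & MASK32

chunk = int.from_bytes(b'test', 'little')    # stand-in for 4 bytes read from the file
recovered = invert(forward(chunk, factor), factor)
print(recovered.to_bytes(4, 'little'))
```

A constraint solver such as z3 reaches the same result without needing the round to be invertible in closed form, which is handy when the calculation mixes several operations.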
from z3 import *
import struct
original_factor = 0xe296df0b
original_const = 0x544aa692
factors = [original_factor]
consts = [original_const]
This script yields the flag. We can confirm it by echoing the flag into a file named secret and
running cat on it, to which we receive the response Correct!