Workshop
Pepijn de Vos
July 2017
Contents
1 Introduction
1.1 About me
3 Resources
5 Assembly Basics
7 The Debugger
8 Hello world!
9 Conclusion
1 Introduction
In this workshop I will teach you how to get started reading and writing code for
the Nintendo Game Boy. First we will go over the hardware and the assembly
language, and from there start to inspect some code and make changes to it.
1.1 About me
I’m basically a software developer turned electrical engineer, doing a lot of crazy
projects along the way.
My journey with Game Boy hacking started when I connected the Game Link
port to an Arduino to spoof Pokémon trades. This ended up on Hackaday,
where someone suggested trading across the internet. I quickly dismissed the
idea as impossible due to latency. Then I dug around in the annotated Pokémon
disassembly and did it anyway.
After poking at the Game Boy from the comfort of my Arduino, I decided to
get my hands dirty and implement Pokémon Go on the Game Boy. First using
Pokémon Red with an Arduino and GPS module. Then in Pokémon Crystal by
directly controlling an SPI sensor with the Game Boy.
Having gotten comfortable with Game Boy assembly, I wrote a small paint
program from scratch, to draw graphics for a game I have yet to make. I’ll
probably keep on yak-shaving like that.
Then when I needed to do a project for a digital hardware course at the
University of Twente, we implemented the graphics chip of the Game Boy in VHDL,
driving a VGA monitor. I also wrote a simple presentation framework to run
our project presentation on our own emulator.
The above projects may sound extremely hard to you, and some of them were.
But the truth is that when I started on Bill’s Arduino, I knew nearly nothing
about game development or assembly programming.
Figure 1: Game Boy Paint
I simply started reading documentation and source code, making small changes,
learning as I went. So what I’ll attempt to do is to recreate a turbo-charged
version of my learning process. What we’ll roughly cover:
I assume you all know what a Game Boy is. Below are some factoids I copied
from the Pan Docs. Of course every single metric listed has funny particularities and
edge-cases, but we’ll get to those as needed.
3 Resources
The nice thing about the Game Boy platform is that there is just so much
documentation about everything. I will give you links and descriptions of some
particularly helpful ones.
If you have an hour to spare, definitely watch The Ultimate Game Boy Talk by
Michael Steil. He does an amazing job at describing the history of the Game
Boy, as well as going into great detail about the hardware.
The document that you will use most comes in several variations and names such
as Pan Docs, gbspec, and Game Boy CPU Manual. It is the bible of Game Boy
technology. It contains everything ranging from detailed hardware descriptions,
to IO register maps, to assembly instructions.
Since all Game Boy games were originally written in assembly language, some
lovely people on the internet decided to disassemble and annotate Pokémon Red
for your reading pleasure. This provides a great way to become familiar with
the inner workings of the game and the platform it runs on, as well as an easy
avenue into hacking the game. It is also a good reference of working code and
collection of development tools. Some of these folks also gather on IRC in #pret
to answer my stupid questions.
When you want to write games from scratch, Pokered is not ideal because it has
so much going on. A better starting place is the GameBoy Assembly Language
Primer. It contains useful boilerplate code, hardware defines, utility routines,
and a sprite font.
First we will need to clone Pokered using git. Doing so will later allow us to
easily search the source code using git grep.
Please refer to their installation instructions on how to clone the project, install
RGBDS, and build the ROMs.
For running the ROMs, in theory any emulator would work, but BGB has the
best debugger, which we’ll be using extensively. Unfortunately it is written for
Windows, but it runs excellently under Wine. Please download and install BGB from
their website.
5 Assembly Basics
Open up the Game Boy CPU manual, and skim through section 3, Game Boy
command overview. Pay special attention to 8-bit loads and jumps.
The “ld” mnemonic is the most common in code, and copies bytes between
registers and memory locations. It is especially common to load data into the
accumulator (the A register), as a lot of instructions only work on that register.
16-bit addresses are frequently loaded into the HL register for similar reasons.
(HL has a special load-and-increment instruction, allowing compact pointer loops.)
Run git grep ld in your Pokered folder to see how it is used.
Note especially the different addressing modes. You can do direct loads, where
you pass it a register or a literal number. You can also do indirect loads, where
you wrap a number or register in brackets to use it as the address to load
from/to. Some modes also support adding a number and a register, but not all
combinations are possible.
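The difference between these addressing modes can be sketched with a toy model. This is Python rather than Game Boy code, and the register dictionary, the flat memory array and the function names are all made up for illustration. It is only meant to show what a direct load, an indirect load and the load-and-increment form each do.

```python
# Toy model of three `ld` addressing modes. Not an emulator: just a
# register dict and a flat 64 KiB memory, both invented for this sketch.
regs = {"a": 0, "h": 0, "l": 0}
memory = bytearray(0x10000)


def get_hl():
    """Combine H (high byte) and L (low byte) into one 16-bit address."""
    return (regs["h"] << 8) | regs["l"]


def set_hl(value):
    """Split a 16-bit value back into the H and L registers."""
    regs["h"] = (value >> 8) & 0xFF
    regs["l"] = value & 0xFF


def ld_a_imm(value):
    """ld a, 42 : direct load of a literal number (kept to 8 bits)."""
    regs["a"] = value & 0xFF


def ld_a_ind_hl():
    """ld a, [hl] : indirect load, HL is used as the address."""
    regs["a"] = memory[get_hl()]


def ld_a_ind_hli():
    """ld a, [hli] : indirect load, then HL is incremented."""
    regs["a"] = memory[get_hl()]
    set_hl((get_hl() + 1) & 0xFFFF)


# Walk a pointer over three bytes, the way a copy loop would.
memory[0xC000:0xC003] = bytes([0x11, 0x22, 0x33])
set_hl(0xC000)
ld_a_ind_hli()          # reads the byte at C000, HL moves on to C001
first = regs["a"]
ld_a_ind_hl()           # reads the byte at C001, HL stays where it is
second = regs["a"]
```

The brackets are the whole difference: without them the operand is the value, with them the operand is where to find the value.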
The way to do loops, if statements, and functions is by using jumps and labels.
A jump instruction takes an address in the code, and jumps to that position.
Some jumps check a condition before jumping. There is JP “jump” and JR
“jump relative”. They work the same, but JR stores its target as a signed 8-bit
offset, which makes it one byte shorter and slightly faster, at the cost of only
reaching targets within roughly 128 bytes.
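To make “relative” concrete, here is a minimal Python sketch of how the operand of a JR could be computed. The function names are hypothetical, and the assembler normally does this for you. It assumes the usual encoding: JR is two bytes long and its signed offset is counted from the instruction that follows it.

```python
def jr_offset(jr_address, target):
    """Return the signed 8-bit operand for a JR located at jr_address.

    JR is two bytes long, and the offset is counted from the address of
    the instruction that follows it.
    """
    offset = target - (jr_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("target out of JR range, use JP instead")
    return offset


def encode_operand(offset):
    """Encode the signed offset as the single byte stored in the ROM."""
    return offset & 0xFF


# A loop that jumps back from 0155 to a label at 0150.
back = jr_offset(0x0155, 0x0150)
back_byte = encode_operand(back)
```

A jump to a label further away than the range check allows is exactly the case where you need JP.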
with a colon (double colon for global labels).
Run git grep jr, open one of the files, and see if you can understand what
is going on. Pay attention to the way local dot labels are used for loops and
branches, and global double colons are used for function declarations.
Before I let you figure out the rest by yourself using your CPU manual, I would
like to briefly point out three more instructions. CALL and RET “return” are
jump instructions that are used for functions. CALL pushes the address of the
instruction that follows it onto the stack, and RET pops that address and jumps
back to it. And finally, you’ll frequently see xor a, which does an exclusive or
on the “a” register with itself, essentially setting it to zero.
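These three instructions can be modelled in a few lines of Python. This is a sketch with made-up names and starting values, not an emulator. It assumes that CALL is three bytes long and that the stack grows downwards with the low byte of the return address at the lower address.

```python
# Made-up CPU state for illustration: program counter, stack pointer, A.
memory = bytearray(0x10000)
cpu = {"pc": 0x0150, "sp": 0xFFFE, "a": 0x5A}


def call(target):
    """CALL nn: push the address of the next instruction, then jump."""
    return_address = (cpu["pc"] + 3) & 0xFFFF        # CALL is 3 bytes long
    cpu["sp"] = (cpu["sp"] - 2) & 0xFFFF
    memory[cpu["sp"]] = return_address & 0xFF        # low byte
    memory[cpu["sp"] + 1] = return_address >> 8      # high byte
    cpu["pc"] = target


def ret():
    """RET: pop the saved address and continue from there."""
    low = memory[cpu["sp"]]
    high = memory[cpu["sp"] + 1]
    cpu["sp"] = (cpu["sp"] + 2) & 0xFFFF
    cpu["pc"] = (high << 8) | low


def xor_a():
    """xor a: any value XORed with itself is zero."""
    cpu["a"] ^= cpu["a"]
```

Calling `call(0x2000)` followed by `ret()` leaves the program counter at 0153, the instruction right after the CALL, which is why a function that unbalances the stack returns to garbage.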
Now that you know roughly what Game Boy assembly looks like, let’s
try to understand some interesting things. If you’ve played Pokémon a lot, you
probably know there are all these glitches that you can use to get rare Pokémon
or do other weird things. Have you ever wondered how they actually work?
Time to find out.
One of the more famous glitches is of course the Old Man Glitch. Go ahead
and read up on the cause of this bug. It is a reuse of memory combined with
an oversight in the map data. Let’s see how it works in the code.
Step one is to find where the player name is stored. I just took a guess and
grepped for terms such as “player” and “name”. After a while I learned that
all WRAM variables are declared in a file called “wram.asm”.
.nonstandardbattle
    ld a, [wBattleType]
    cp BATTLE_TYPE_SAFARI
    ld a, BATTLE_MENU_TEMPLATE
    jr nz, .menuselected
    ld a, SAFARI_BATTLE_MENU_TEMPLATE
.menuselected
    ld [wTextBoxID], a
    call DisplayTextBoxID
    ld a, [wBattleType]
    dec a
    jp nz, .handleBattleMenuInput ; handle menu input if it's not the old man
    ; the following happens for the old man tutorial
    ld hl, wPlayerName
    ld de, wGrassRate
    ld bc, NAME_LENGTH
    call CopyData ; temporarily save the player name in unused space,
                  ; which is supposed to get overwritten when entering a
                  ; map with wild Pokemon. Due to an oversight, the data
                  ; may not get overwritten (cinnabar) and the infamous
                  ; Missingno. glitch can show up.
    ld hl, .oldManName
    ld de, wPlayerName
    ld bc, NAME_LENGTH
    call CopyData
If the battle type was 1, dec a leaves zero in “a”, so the zero flag is set and the
conditional jump is not taken. The code that follows loads “wPlayerName” and
“wGrassRate” into registers and calls “CopyData”, which does exactly what it
says on the tin. (Go ahead and find its implementation if you want. Hint: look
for “CopyData::” to find the declaration.) The comment helpfully mentions that
this enables the Old Man Glitch. Then another call to “CopyData” copies
“.oldManName” to “wPlayerName”.
We can check back in “wram.asm” to see that “wGrassRate” is directly followed
by “wGrassMons”. Grepping further to see what happens with “wGrassMons”
in Cinnabar Island is left for you to explore.
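The effect of that first copy can be sketched in Python. Everything below is an assumption made for illustration: the value of NAME_LENGTH, the addresses, and the idea that the bytes after the rate are read as (level, species) pairs. Check “wram.asm” and the wild Pokémon data in your own checkout before relying on any of it.

```python
NAME_LENGTH = 11          # assumed value, check the constants in Pokered

# Illustrative layout only: one rate byte directly followed by the wild
# Pokemon slots, mirroring wGrassRate followed by wGrassMons.
W_PLAYER_NAME = 0xD158    # addresses picked for this sketch
W_GRASS_RATE = 0xD887
W_GRASS_MONS = W_GRASS_RATE + 1

wram = bytearray(0x10000)


def copy_data(src, dst, length):
    """Model of CopyData: copy `length` bytes from src (hl) to dst (de)."""
    for i in range(length):
        wram[dst + i] = wram[src + i]


def wild_slots(count):
    """Read `count` assumed (level, species) pairs starting at wGrassMons."""
    base = W_GRASS_MONS
    return [(wram[base + 2 * i], wram[base + 2 * i + 1]) for i in range(count)]


# Use the byte values 1..11 as a stand-in player name, then save it the
# way the old man tutorial does.
wram[W_PLAYER_NAME:W_PLAYER_NAME + NAME_LENGTH] = bytes(range(1, 12))
copy_data(W_PLAYER_NAME, W_GRASS_RATE, NAME_LENGTH)
slots = wild_slots(5)
```

Under these assumptions the first name byte lands in the rate, and every following pair of name bytes becomes one wild Pokémon slot. That is why the characters in your name decide what you meet.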
Figure 3: First, what is your name?
Now that we understand how this glitch works, can we find other ways to catch
different Pokémon by surfing around on Cinnabar? While what I will describe
below is currently written on Bulbapedia, that is the case because I put it there
after I discovered this new glitch while working on TCPoke.
It is really not that hard: we just need to find places that write to “wGrassMons”
or an address just before it. Since I was neck-deep in the link trading code,
“wLinkEnemyTrainerName” seemed very promising. It does not take very long
to find that indeed “engine/cable_club.asm” writes to it.
.findStartOfEnemyNameLoop
    ld a, [hli]
    and a
    jr z, .findStartOfEnemyNameLoop
    cp SERIAL_PREAMBLE_BYTE
    jr z, .findStartOfEnemyNameLoop
    cp SERIAL_NO_DATA_BYTE
    jr z, .findStartOfEnemyNameLoop
    dec hl
    ld de, wLinkEnemyTrainerName
    ld c, NAME_LENGTH
.copyEnemyNameLoop
    ld a, [hli]
    cp SERIAL_NO_DATA_BYTE
    jr z, .copyEnemyNameLoop
    ld [de], a
    inc de
    dec c
    jr nz, .copyEnemyNameLoop
What we see here is a loop in the code that parses the data that is received
over the link cable. After receiving the seeds for the random number generator,
it chomps off a few preamble bytes, and then does a loop to copy
“NAME_LENGTH” bytes to “wLinkEnemyTrainerName”.
A few things to note here: the source pointer is read through [hli], the special
load-and-increment form I mentioned earlier, while the DEstination register is
incremented manually. The copy loop also checks for “SERIAL_NO_DATA_BYTE”,
in which case it just skips to the next byte. “and a” is another of those common
patterns: it leaves “a” unchanged and just sets the “Z” flag if “a” is zero.
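If the two loops are hard to follow in assembly, the same control flow can be written out in Python. This is a sketch of the listing above, not of the whole link protocol. The constant values are assumptions, so check the Pokered constants, and the function name is made up.

```python
SERIAL_PREAMBLE_BYTE = 0xFD   # assumed values, check the Pokered constants
SERIAL_NO_DATA_BYTE = 0xFE
NAME_LENGTH = 11


def parse_enemy_name(buffer, start=0):
    """Model of the two loops: skip filler, then copy NAME_LENGTH bytes."""
    hl = start
    # .findStartOfEnemyNameLoop: skip zero, preamble and no-data bytes.
    while True:
        a = buffer[hl]
        hl += 1                       # ld a, [hli]
        if a == 0 or a in (SERIAL_PREAMBLE_BYTE, SERIAL_NO_DATA_BYTE):
            continue
        break
    hl -= 1                           # dec hl: back onto the first name byte

    name = bytearray()                # stands in for wLinkEnemyTrainerName
    c = NAME_LENGTH
    # .copyEnemyNameLoop: no-data bytes are skipped without being counted.
    while c != 0:
        a = buffer[hl]
        hl += 1                       # ld a, [hli]
        if a == SERIAL_NO_DATA_BYTE:
            continue
        name.append(a)                # ld [de], a / inc de
        c -= 1                        # dec c
    return bytes(name)


received = (bytes([0x00, 0xFD, 0xFD, 0xFE]) + b"ABC" + bytes([0xFE])
            + b"DEFGHIJK" + bytes([0x50]))
name = parse_enemy_name(received)
```

Note that the copy loop counts bytes written, not bytes read. That is why it is a useful place to look when you want to control exactly which bytes end up in “wLinkEnemyTrainerName”.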
I just browsed around the repository a bit, until I found a folder called
“data/wildPokemon”. Inside it I found neatly organized files for every route with data that
exactly fits the “wGrassMons” memory. All that is needed is to change all the
“PIDGEY” and “RATTATA” in “route1.asm” into any Pokémon defined in
“constants/pokemon_constants.asm”.
Anyway, after changing a few routes like this, running make should produce
a new ROM with new Pokémon, as per your changes. Congratulations! You
wrote your first Game Boy hack! What are you going to change next?
6.4 Your own Adventure
Now that I’ve shown you how you can grep your way through the source code
and make small changes, it’s up to you what to do next. Bulbapedia has a
whole List of glitches in Generation I for you to explore and expand upon. Or
you can do whatever else you want. Make the shops sell Master Balls, change
the starter Pokémon, add a new item, change some sprites, change what people
say, make Magikarp learn self-destruct, anything is possible.
Using the tools that were developed for Pokered, you could in theory even
disassemble other games and hack those. Except it’d be a lot harder because
you would have to do it without all the names and annotations that people
added to Pokered. You’d just have to use the debugger to find what you’re
looking for and read the code. Which brings me to...
7 The Debugger
So far we have mostly looked at assembly code where other people have given
meaningful names to all the labels. Now let’s see if we can make sense of another
popular game that has not been disassembled. I’ll be exploring Super Mario
Land, but if you’re feeling adventurous, pick any game you like.
After loading the game and doing some unavoidable “play testing”, open
up the debugger via the right-click menu under “Other”, as shown in Figure 4.
Since you don’t know what anything is, the only thing you can do is start at
a known point, or look for known register locations. The start of the ROM is
probably not a good choice, as it just contains boring setup code.
For my TCPoke project, I looked a lot at all the code that touches FF01 and
FF02, the serial registers. If you want to figure out how the bleeps and bloops
work, look for code that touches the audio registers. If you want to see how
sprites are drawn, look at code that touches the tile data or DMA registers. If
you want to see how the input is handled, look at FF00, the joypad register. If
you want to see the main game loop, the VBlank interrupt is a good place to
start. For games that do interesting graphics effects, such as the perspective in
F1 Race, HBlank is also interesting to look at. (Spoiler: it scrolls the window
every line to create curves in the road.)
When you open the debugger, it puts you at the start of the VBlank interrupt
that happens between every frame. You can use F7 (trace) to step through the
code and see the registers update on the right side.
Note that the default syntax here is slightly different from the one in Pokered.
Figure 4: Open the debugger
Figure 5: VRAM viewer
Where BGB uses “ldi a, (hl)”, RGBDS would use “ldi a, [hl]” or “ld a, [hli]”.
If you want, you can change the syntax in the BGB settings from “no$gmb” to
“rgbds”.
I thought it’d be funny to find where the game stores the lives you have and
cheat a bit. Rather than browsing the endless sea of assembly (which I did for
a good part of the day, believe me), let’s open up the VRAM viewer (Figure 5),
which can also be found in the “Other” right-click menu.
Play around a bit with the VRAM viewer; it gives you a good idea how the
game is drawn. Note for example how the BG map wraps around as you walk,
except the top bar. Look at the LCD status interrupt on LYC if you’re curious.
Now hover your mouse over the tile that holds your lives. It’ll tell you where
the tile data is stored and what the map address is (9807). If we’re unlucky this
data is written as a larger chunk, maybe starting at 9800, but it seems we’re
lucky.
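The map address itself tells you where on the screen the tile sits. A small Python sketch, assuming the first background map at 9800 with 32 tiles per row and one byte per tile. The function names are made up.

```python
BG_MAP_BASE = 0x9800   # first background tile map
MAP_WIDTH = 32         # the map is 32 x 32 tiles, one byte per tile


def map_address(column, row, base=BG_MAP_BASE):
    """Address of the map byte for the tile at (column, row)."""
    return base + row * MAP_WIDTH + column


def map_position(address, base=BG_MAP_BASE):
    """Inverse of map_address: return (column, row) for a map address."""
    offset = address - base
    return offset % MAP_WIDTH, offset // MAP_WIDTH


lives_tile = map_position(0x9807)
```

So 9807 is column 7 of the top row, which matches the position of the lives counter in the status bar.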
Hit Ctrl+F and search for 9807. It returns only one location that loads the
A register into the tile of interest. Hit F2 to set a breakpoint and F9 to run the
game. Every time you die the breakpoint gets triggered.
At the breakpoint, you’ll see that the A register contains the number of lives
you have. If you head back over to the VRAM viewer tiles tab, you’ll see that
the digit tiles 0-9 have tile numbers 0-9, making the math easier. (Read the Pan Docs
on the two tile sets, and the way they are indexed. They overlap!)
Now we just need to backtrack where the value in A came from. A dozen
lines up there is an unconditional RET that must indicate the end of another
subroutine, and a few lines below is another unconditional RET that marks the
end of the current routine. If you look at the stack pointer in the lower right,
or trace (F7) past the RET, you can find this routine is called from the VBlank
interrupt. I put another breakpoint at the beginning of the routine, but ended
up placing it after the early returns. (When this routine is called and the lives
do not need to be drawn, it seems to return early.)
After tracing through the code a few times it became evident that DA15 contains
the number of lives as binary-coded decimal (4 bits per decimal digit). You can
see the use of DAA to fix up the BCD number after binary addition and the
use of SWAP to extract individual digits.
    ldh a, [$FF9F]
    and a
    ret nz              ; early return if FF9F != 0
    ld a, [$C0A3]
    or a
    ret z               ; early return if C0A3 == 0
    cp a, $FF
    ld a, [$DA15]       ; load DA15
    jr z, .decr         ; go to .decr if C0A3 == FF
    cp a, $99           ; DA15 is binary coded decimal!
    jr z, .skip         ; skip everything if DA15 == 99
    push af             ; push DA15 on stack
    ld a, $08           ; update some unknown registers
    ld [$DFE0], a
    ldh [$FFD3], a
    pop af              ; restore DA15
    add a, $01          ; add 1 to DA15
.draw
    daa                 ; decimal adjust
    ld [$DA15], a       ; store new value in DA15
    ld a, [$DA15]       ; useless?
    ld b, a             ; copy DA15 to b
    and a, $0F          ; only get last BCD digit
    ld [$9807], a       ; write last digit to tile map
    ld a, b             ; load DA15 from b to a
    and a, $F0          ; get first BCD digit
    swap a              ; swap upper and lower bits
    ld [$9806], a       ; write first digit to tile map
.skip
    xor a
    ld [$C0A3], a       ; set C0A3 to zero
    ret                 ; return
.dead
    ld a, $39           ; update some unknown registers
    ldh [$FFB3], a
    ld [$C0A4], a
    jr .skip            ; skip to the end
.decr
    and a               ; test if DA15 is 0
    jr z, .dead         ; you're dead, else
    sub a, $01          ; decrement your lives
    jr .draw            ; draw the updated values
Figure 6: New cheat
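The BCD handling in this routine can be checked with a small Python model. This is a sketch that follows the commonly documented behaviour of DAA on the Game Boy CPU. The helper names are made up, and only the flags that DAA itself looks at are modelled.

```python
def daa(a, n, h, c):
    """Decimal-adjust A after an add (n=False) or a subtract (n=True).

    h and c are the half-carry and carry flags left by that operation.
    Returns the adjusted value and the new carry flag.
    """
    if not n:
        if c or a > 0x99:
            a += 0x60
            c = True
        if h or (a & 0x0F) > 0x09:
            a += 0x06
    else:
        if c:
            a -= 0x60
        if h:
            a -= 0x06
    return a & 0xFF, c


def bcd_add_one(a):
    """add a, $01 followed by daa."""
    result = (a + 1) & 0xFF
    h = (a & 0x0F) + 1 > 0x0F      # carry out of the low nibble
    c = a + 1 > 0xFF
    return daa(result, False, h, c)[0]


def bcd_sub_one(a):
    """sub a, $01 followed by daa."""
    result = (a - 1) & 0xFF
    h = (a & 0x0F) < 1             # borrow from the high nibble
    c = a < 1
    return daa(result, True, h, c)[0]


def digits(a):
    """Split a BCD byte like the routine: `and $0F`, and `and $F0` + swap."""
    low = a & 0x0F
    high = (a & 0xF0) >> 4         # swap moves the high nibble down
    return high, low
```

For example, `bcd_add_one(0x09)` gives 0x10 and `bcd_sub_one(0x10)` gives 0x09, so the byte always reads as a two-digit decimal number in hex. That is why the two nibbles can be written straight to the tile map as tile numbers.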
Knowing this, I opened the cheat window from the window menu in the debug
window. I created a new cheat that always read the value 99 from DA15, as
pictured in Figure 6. The result can be seen in Figure 7.
8 Hello world!
This is the part where you start your own game and do whatever you like, and
I’ll run around trying to fix everything. Pairing up is highly recommended.
Figure 7: 98 lives in Super Mario Land
Alternatively, if this is all way too much, I’ll take requests for improvised live-
coding, while you sit back and watch me make horrible mistakes.
To start making your own game, I recommend taking the GALP code as a start,
to fill in the basic details like the ROM header and setting up the screen and
tile data.
Unfortunately GALP does not compile with RGBDS out of the box. To get
you started, I took my paint program (that was based on GALP), and stripped
it back down to a “Hello World!” game. It also includes a Makefile based on
Pokered, so if RGBDS is installed correctly, you should be able to just run make
to produce a main.gbc file.
The first thing you could do is obviously make it print something else. The
default tile data it loads also includes some other characters, arrows and boxes.
The Makefile is also capable of compiling png images to 2bpp files that can be
directly included in the game as follows.
MyImage:
    INCBIN "image.2bpp"
This can then be loaded somewhere in the tile data like so:
    ld hl, MyImage  ; the image address
    ld de, TILE0    ; the first tile set
    ld bc, 16*4     ; length (16 bytes per tile)
    call mem_Copy   ; copy tile data to memory
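The “16 bytes per tile” in the length above follows from the 2bpp format: each row of 8 pixels takes two bytes, one per bit plane. Here is a minimal Python sketch of that encoding, assuming pixel values 0-3 and the low bit plane stored first. The Makefile does this conversion for you, and the function names here are made up.

```python
def encode_row(pixels):
    """Encode 8 pixels (values 0-3) as two bytes: low bit plane, then high."""
    if len(pixels) != 8:
        raise ValueError("a tile row has exactly 8 pixels")
    low = high = 0
    for value in pixels:          # the leftmost pixel ends up in bit 7
        low = (low << 1) | (value & 1)
        high = (high << 1) | ((value >> 1) & 1)
    return bytes([low, high])


def encode_tile(rows):
    """Encode an 8x8 tile (8 rows of 8 pixels) into 16 bytes."""
    if len(rows) != 8:
        raise ValueError("a tile has exactly 8 rows")
    return b"".join(encode_row(row) for row in rows)


striped_row = encode_row([0, 1, 2, 3, 0, 1, 2, 3])
solid_tile = encode_tile([[3] * 8] * 8)
```

Four such tiles come to 64 bytes, which is the `16*4` passed in bc.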
Next, you might want to add some interactivity. Maybe you could add a simple
animation or do something with the joypad.
If you look at the CPU Manual section on the joypad register FF00, you’ll see
it contains example code for reading the joypad. An RGBDS-compatible version
of that snippet is already in the code I gave you. You can have a look at how
my paint program does it. The actual tests for button presses are lines like
bit PADB_START, b, which set the zero flag depending on the button state,
usually followed by a conditional jump.
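What such a bit test does can be modelled in Python. The bit positions below are the ones commonly found in Game Boy hardware include files. Treat them as assumptions and check the defines that ship with your code. The sketch also assumes the joypad routine has already left a 1 in the register for every pressed button.

```python
# Assumed bit positions, check the hardware defines in your own project.
PADB_A, PADB_B, PADB_SELECT, PADB_START = 0, 1, 2, 3
PADB_RIGHT, PADB_LEFT, PADB_UP, PADB_DOWN = 4, 5, 6, 7


def bit(n, value):
    """Model of `bit n, r`: return the zero flag (True when bit n is 0)."""
    return (value >> n) & 1 == 0


def pressed_buttons(b):
    """List the buttons whose bit is set in the register value b."""
    names = ["A", "B", "SELECT", "START", "RIGHT", "LEFT", "UP", "DOWN"]
    return [name for n, name in enumerate(names) if not bit(n, b)]


held = pressed_buttons(0b10001001)
```

Because the zero flag is set when the button is not pressed, a `jr z` after the bit test skips the code that handles the press.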
Using these two basic features, you could already make a sprite move around
when you press buttons.
From here it’s up to your creativity. You could make the Star Wars opening
scene, or a whole interactive story/text-based RPG. You could make a little
beatbox that just makes bleeps when you press buttons, and extend it with a
basic sequencer later, or you could make a little party game where the players
have to press increasingly complex button sequences such as ↑↑↓↓←→←→BA.
9 Conclusion
That’s all I have. I hope you had fun and learned something.
In the unlikely event that I had time to cover everything, we did a lot. We
started with a crash course in 8080-style assembly, followed by reading overwhelming
amounts of said assembly. Then we reached a major milestone where we changed
some code and ran our modified Pokémon game. Then we took off the training
wheels and made a new cheat code for Super Mario Land. And finally we went
completely wild with writing our own game.