Writeup
Writeup
The content associated with this post is available on GitHub and has been tested with
Ghidra 11.0.3: https://github.com/cy1337/ghidra_gdb_tracer
• GenPyGDB.py
o Ghidra Python script which generates a GDB Python script
• CommandServer
o An example program to analyze
• CommandClient.py
o A client for interacting with CommandServer
• Dockerfile
o Source for the DockerHub image
• gdb_script.py
o Example output generated by running GenPyGDB.py on CommandServer
The container suitable for running the target program is available on DockerHub.
And then from a separate terminal you can interact with the command server:
The goal of this project is to have a flexible process for identifying which functions are
called when processing a particular testcase. We will do this by adding logging breakpoints
at the start of every function. This is an ideal way to identify targets for fuzzing without a
substantial up front reversing effort. (More on this in my last post on using AFL QEMU.)
GDB Commands
Let’s start by looking at how the debugger script will function and then circle back to
discuss what will be needed in the Ghidra script.
• Determine the base address of the debug target (e.g. because of ASLR)
• Add breakpoints at each function
• Log the function name on breakpoint trigger
• Optionally disable the triggered breakpoint
• Resume execution
Without getting into system specific modifications, the easiest way to overcome ASLR is to
determine the base address of the process at runtime. One option for doing this is to get
calculate an offset between the static symbol in Ghidra’s listing and the dynamic address
at runtime.
We can calculate use this address, 0x134610 to calculate the runtime base address
relative to the addresses in Ghidra:
starti
set $base_addr = read - 0x134610
Adding this $base_addr to an address in Ghidra will give the address in the target process
where we want to set a breakpoint. For example, setting a breakpoint at FUN_00134869
looks like:
Next we can add a command to run at that breakpoint to print the function name and
continue:
command
print "FUN_00134869()"
continue
end
GDB Python
We could use Ghidra scripting to produce a GDB command script like this but there are
some more advanced situations where this becomes unfavorable. Using GDB’s Python API
can leave us in a better position for later adding more advanced logic to analyze data
structures at runtime. Using GDB’s Python API, the logic can be summarized with:
import gdb
class FunctionBreakpoint(gdb.Breakpoint):
def __init__(self, location, func_name):
location += base_addr
bp_loc = '*{}'.format(hex(location))
super(FunctionBreakpoint, self).__init__(bp_loc, gdb.BP_BREAKPOINT,
internal=False)
self.func_name = func_name
def stop(self):
gdb.write('Function: {}\n'.format(self.func_name))
return False
def set_base_address():
gdb.execute('starti')
gdb.execute('set $base_addr = read - 0x134610')
base_addr = int(gdb.parse_and_eval('$base_addr'))
gdb.write('Base address set to: {:#x}\n'.format(base_addr))
return base_addr
base_addr = set_base_address()
FunctionBreakpoint(0x134869, 'FUN_00134869')
Ghidra Python
Now that we have a skeleton for the target script, it’s time to start authoring a Ghidra script
to generate it. The Ghidra script will need to write out this template while filling in the
address for read and then writing a FunctionBreakpoint call for each function.
sym_table = currentProgram.getSymbolTable()
for symbol in sym_table.getPrimarySymbolIterator(True):
if symbol.getName() == "read":
read_addr = symbol.getAddress()
break
fm = currentProgram.getFunctionManager()
functions = fm.getFunctions(True)
for func in functions:
func_name = func.getName()
func_entry = func.getEntryPoint().getOffset()
Artificial Blocks
The above code snippet will iterate over all function symbols Ghidra is aware of.
Unfortunately this includes some sections which are artificially created by Ghidra to
support relocation:
And so to avoid adding breakpoints within these artificial blocks, we need to walk the
memory to identify these blocks so we can avoid adding breakpoints.
memory = currentProgram.getMemory()
blocks = memory.getBlocks()
artificial_blocks = []
for block in blocks:
# Check if the block has a specific name or comment
if block.getComment() and "artificial" in block.getComment().lower():
artificial_blocks.append(block)
elif block.getName() and "artificial" in block.getName().lower():
artificial_blocks.append(block)
Test Run
Putting all these pieces together produces the script you’ll find on GitHub. Now let’s take it
for a test drive with the provided CommandServer. First load the program into Ghidra and,
using Ghidra’s script manager, run the provided script from GitHub to produce a file,
gdb_script.py or similar which can be run through gdb as follows:
Program stopped.
0x00007ffff7fd0100 in _start () from /lib64/ld-linux-x86-64.so.2
Base address set to: 0x555555454000
Breakpoint 1 at 0x555555588000
Breakpoint 2 at 0x555555588020
Breakpoint 3 at 0x5555555883d0
Breakpoint 4 at 0x5555555883e0
Breakpoint 5 at 0x5555555883f0
…
Breakpoint 90 at 0x55555558ad90
Breakpoint 91 at 0x55555558ad98
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Function: entry
Function: FUN_00136d20
Function: _DT_INIT
Function: _INIT_0
Function: FUN_001347e0
Function: FUN_00136a55
Function: socket
Function: setsockopt
Function: htons
Function: bind
Function: listen
Function: printf
Server is listening on port 8080
Function: puts
Waiting for a new connection...
Function: accept
You can now interact with CommandServer by sending data to it using CommandClient.py
as follows:
While handling this request, GDB is also logging what additional (new) functions have been
called: