Recon 2019

Ghidra 9.1 introduces new features for system call decompilation and Sleigh processor model development, including the ability to override references to display system calls correctly in the decompiler and tools for editing and testing Sleigh files that define processor instruction semantics. The document provides overviews of these new capabilities and discusses opportunities for further improving system call handling and the Sleigh development environment.

Open Source Ghidra

The First Few Months

emteere
ghidracadabra

Recon MTL 2019


Outline

Ghidra Overview

New for 9.1: System Call Decompilation

New for 9.1: Sleigh Development Tools

Community Interaction

Ghidra Overview
• Full-featured SRE framework created by NSA Research.
• In development for ∼20 years.
• Primarily written in Java.
  - Some C/C++.
  - Can write scripts in Python.

• Designed for customizability and extensibility.


• Ghidra 9.0 was publicly released in March 2019.
• Source code was released on GitHub in April 2019.
• www.ghidra-sre.org
• https://github.com/NationalSecurityAgency/ghidra

Disassembler

Function Graph

Decompiler

P-code: Ghidra’s IR
Specified Using SLEIGH Language

Connected Tools

Scripting in Java and Python

Eclipse Integration

Multi-user Server with Version Control

Batch Processing with the Headless Analyzer

support]$ ./analyzeHeadless ghidra://localhost/repo \
    -import /usr/bin/* -recursive \
    -postScript MyScript.py
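The headless command shown on this slide can also be assembled programmatically when batch jobs are launched from another tool. The sketch below is only an illustration: `headless_command` is a hypothetical helper, not part of Ghidra, and it uses only the flags that appear on the slide (`-import`, `-recursive`, `-postScript`).

```python
def headless_command(project_url, import_paths, post_script, recursive=True):
    """Return an argv list for analyzeHeadless using only the flags from the slide."""
    argv = ["./analyzeHeadless", project_url, "-import"] + list(import_paths)
    if recursive:
        argv.append("-recursive")
    argv += ["-postScript", post_script]
    return argv


# Same arguments as the slide; the project URL and script name are the slide's own.
cmd = headless_command("ghidra://localhost/repo", ["/usr/bin/*"], "MyScript.py")
print(" ".join(cmd))
```

On the slide the shell expands `/usr/bin/*` before the launcher sees it. If the argv list is passed to a process launcher without a shell, the caller would need to expand the pattern first, for example with `glob.glob`.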

Version Tracking Tool

New Features for 9.1

Decompiling System Calls (syscalls)
• System calls are a way for a program to request a service
from the operating system.
• Services include process control, file management, device
management,...
• Typical implementation includes a native instruction and a
register, which we’ll call the system call register.
• When the instruction is executed, the value in the system call
register determines which function is called.
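The dispatch described in the last bullet can be pictured as a table lookup keyed by the system call register. The sketch below assumes x86-64 Linux, where the number is passed in RAX; the table holds only a handful of entries and `resolve_syscall` is a hypothetical name chosen for illustration.

```python
# A few x86-64 Linux system call numbers (a small excerpt, not the full table).
X64_LINUX_SYSCALLS = {0: "read", 1: "write", 2: "open", 3: "close", 60: "exit"}


def resolve_syscall(registers, table=X64_LINUX_SYSCALLS):
    """Map the value in the system call register (RAX here) to a function name."""
    number = registers["RAX"]
    # Unknown numbers get a placeholder name instead of raising an error.
    return table.get(number, "syscall_%d" % number)


print(resolve_syscall({"RAX": 1, "RDI": 1}))
print(resolve_syscall({"RAX": 60, "RDI": 0}))
```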

x64 Linux syscall

System Calls as User-defined Operations
• In this example, the syscall instruction is implemented with a
pcodeop/CALLOTHER.
• Such operators certainly have their uses, but they are not very
satisfying in this case.

Desired Behavior
• We’d like to see the correct function call in the decompiler:
  - Correct name.
  - Correct signature.
  - Correct calling convention.

• We’d also like to get cross-references.

• Need dataflow analysis to determine value in syscall register.
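As a rough illustration of the kind of dataflow involved, the sketch below walks backwards through a single basic block to find a constant that reaches the system call register. It is a toy model, not Ghidra's analysis: instructions are hypothetical `(op, dest, src)` tuples, only `MOV` is understood, and sub-registers such as EAX are ignored.

```python
def syscall_register_value(block, register="RAX"):
    """Return the constant reaching `register` at the final instruction, or None.

    `block` is a list of (op, dest, src) tuples ending with the syscall itself.
    """
    tracked = register
    for op, dest, src in reversed(block[:-1]):
        if dest != tracked:
            continue
        if op == "MOV" and isinstance(src, int):
            return src            # constant load: this is the value we want
        if op == "MOV" and isinstance(src, str):
            tracked = src         # register copy: keep looking for its source
            continue
        return None               # any other write is beyond this toy analysis
    return None


block = [
    ("MOV", "RDI", 1),
    ("MOV", "RCX", 60),
    ("MOV", "RAX", "RCX"),
    ("SYSCALL", None, None),
]
print(syscall_register_value(block))
```

A real analysis also has to follow values across basic blocks and through arithmetic, which is why the slide calls for dataflow analysis rather than a simple pattern match.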

• The value in the syscall register is not necessarily the syscall number
defined in the system header file.

Additional Issues
• The system call register can be an OS decision; it is not
necessarily specified by the ISA.
• System call numbers can change based on the OS
version/service pack.
• System calls might have their own calling convention.
• There can be more than one native instruction used to make
system calls (e.g., syscall and int 2e).
• Might not use a dedicated native system call instruction, e.g.,
system calls via CALL GS:[0x10].

Where to Put Them?
• In general, the code for system call targets is not in the
program’s address space.
• Where to put them in Ghidra?
• The OTHER space is used to store data from a binary that is
not loaded into memory.
  - E.g., the .comment section of an ELF file.
• In 9.1, we’ve made the decompiler aware of the OTHER space.
• Recommendation for system calls:
  - System call targets should be in overlay(s) of the OTHER space.
  - Use the system call number as the address in the overlay.
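The recommendation amounts to a simple addressing scheme: the overlay plays the role of a namespace and the system call number is the offset. The sketch below models that with plain Python; `OverlayAddress`, the overlay name `syscall`, and the `name::offset` display format are assumptions made for illustration, not Ghidra's API.

```python
from collections import namedtuple

# Hypothetical stand-in for an address inside an overlay of the OTHER space.
OverlayAddress = namedtuple("OverlayAddress", ["space", "offset"])


def syscall_target(number, overlay="syscall"):
    """Place the target for system call `number` at offset `number` in the overlay."""
    return OverlayAddress(overlay, number)


def format_address(address):
    """Render the address as overlay name plus hexadecimal offset."""
    return "%s::%08x" % (address.space, address.offset)


print(format_address(syscall_target(60)))
```

Because the offset is the system call number, a resolved number maps to its target function without any extra lookup table.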

How to Get There?
• OK, great, we have a place for system call targets.
• How do you get there?
• New feature: Overriding References.
• Basically, this allows you to intercept certain P-code ops on
their way to the decompiler and modify them.
  - Change CALLOTHER ops to CALL ops and set destination.
  - Change CALLIND to CALL ops and set destination.
  - (plus a few others)
• See ResolveX86or64LinuxSystemCallsScript.java for an
example.
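To make the idea of intercepting ops concrete, the sketch below rewrites a list of modelled P-code ops before they would reach a decompiler. It is not the script named on the slide and does not use Ghidra's API: `PcodeOp`, `apply_overrides`, and the address-keyed override table are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PcodeOp:
    address: int                  # address of the native instruction
    opcode: str                   # e.g. "CALLOTHER", "CALLIND", "CALL"
    target: Optional[str] = None  # call destination, if known


def apply_overrides(ops, overrides):
    """Turn CALLOTHER/CALLIND ops into direct CALLs where an override exists.

    `overrides` maps an instruction address to the destination to call.
    """
    rewritten = []
    for op in ops:
        destination = overrides.get(op.address)
        if destination is not None and op.opcode in ("CALLOTHER", "CALLIND"):
            op = replace(op, opcode="CALL", target=destination)
        rewritten.append(op)
    return rewritten


ops = [PcodeOp(0x401000, "COPY"), PcodeOp(0x401005, "CALLOTHER")]
for op in apply_overrides(ops, {0x401005: "syscall::0000003c"}):
    print(hex(op.address), op.opcode, op.target)
```

Ops without an override pass through unchanged, so the rewrite only affects the instructions that have been explicitly resolved.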

x64 Linux syscall with Overriding Reference

Functions in an Overlay of the OTHER Space

x64 Linux syscall Decompilation
Ghidra 9.1 (after running script)

Future Work
• We’d like an analyzer to be able to do this (mostly)
automatically.
• Ghidra has a notion of per-processor configuration
(.pspec files) and per-compiler configuration (.cspec files).
• System call data doesn’t quite fit this model.
• Ideally all the system call related configuration would be in
one place.
• Working on a notion of an OS/environment configuration.
• This will have other applications in Ghidra as well.

Sleigh Development Tools
• Sleigh
• SleighEditor
• Sleigh P-Code Tests
• Additional Techniques
• General Sleigh Development

Sleigh Processor Models

• Memory model
• Registers
• Display (printpiece)
• Decode patterns
• Semantics (P-code)

Build it and the tools just work:
Disassembly, Assembler (patch), Decompiler, Analysis, ...

Sleigh Processors

• Currently Included - evolving list
  X86 16/32/64, ARM/AARCH64, PowerPC 32/64/VLE, MIPS 16/32/64/micro,
  68xxx, Java / DEX bytecode, PA-RISC, PIC 12/16/17/18/24, Sparc 32/64,
  CR16C, Z80, 6502, 8051, MSP430, AVR8, AVR32, and variants.

• Full Processor Contributions
  Tricore, MCS-48

• Extensions, Improvements, and Bugs
  ARM, PPC, 68xxx, AVR, PIC-16F, 6502, golang

• Seen in Development
  SH-2, WebAssembly, Hexagon, Toshiba MeP-c4, Pic16F153xx, Arm4t-gba,
  NVIDIA Falcon, PowerPC 750CL/CXe, WDC-65816, RISC-V, TI TMS9900

Sleigh Files

• LDEF
• PSPEC
• CSPEC
• SLASPEC
• SLA

• Java Files
• Manual Index
• Pattern Files

• Emulator (new)
• Sleigh P-Code Tests (new)

Sleigh Editor
• Syntax Coloring
• Hover
• Navigation
• Code Formatting
• Validation
• Quick Fixes
• Renaming
• Find References
• Content Assist
• Sleigh Compiler Error Navigation

Xtext - DSL Framework for Eclipse


Eclipse IDE for Java and DSL Developers - 2019-03

Setting up Sleigh Editor - Xtext project
• Eclipse: Help > Install New Software
  - Add Archive: GhidraSleighEditor.zip
• Convert GhidraScript to Xtext project
  - Allows for multi-file navigation
  - Good for casual browsing
  - Problem: all variables will be available (6502, PPC);
    quick-fixes will be slower
• Best: Import as new Java Project - Ghidra/Processors/6502
• Large Sleigh projects can be slow - AARCH64 - 85K LOC
• Use a separate Eclipse

Sleigh Editor

Quick Demo

After Edit - ReloadSleigh Script

Only works for some changes

No structural changes - register, memory, pcodeop, ...

Sleigh Editor - Future Features
• Better project integration
• Code-Mining - auto-comment
• Navigation from Ghidra to SleighEditor in Eclipse
• Templates of common idioms
• More Hovers
• Conversion of number to different formats
• Syntax coloring in the printpiece
• Refactoring: extract common patterns to sub-constructor
• Instruction Pattern Match Documentation

Sleigh P-Code Tests - Sleigh Testing Framework

• C code compiled for processor


• Small tests with known result
• General coverage of instructions emitted by C compilers
• Verifies core constructs - Addressing Modes, Registers
• Pcode Emulator to Execute and Verify
• Repeatable - regression testing
• Extendable - needs more cowbell
• Special case code - Assembly

Sleigh P-Code Tests - Tricore in Eclipse

State of All Processors

All Passing

Sleigh P-Code Tests - Example - Tricore

• Contribution - mumbel
• Surprisingly well written
• Call Context Save/Restore
• TRICORE_O0_EmulatorTest
• EmulateInstructionStateModifier

Sleigh P-Code Tests - Debugging Sleigh

Debug One Failing test - lots of output


Directory - test-output / cache, logs, results

Sleigh P-Code Tests - Debugging tests

Match Unique to Read/Write


0x1f/0x1b should be 0x3f/0x1a, not extracting enough

InstructionInfo - Locating problems

External Disassembly Field

• binutils wrapper gdis
  - Acts as a server

• Other Disassemblers
  - dump/scrape
  - code composer studio

• Verify, Debug, Mine
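Verifying against an external disassembler comes down to comparing two listings while ignoring cosmetic differences. The sketch below shows one possible way to do that; it is not the CompareSleighExternal script, the sample instructions are invented, and `normalize` and `compare_listings` are hypothetical helpers.

```python
import re


def normalize(text):
    """Lower-case, collapse whitespace, and drop spaces around commas."""
    text = re.sub(r"\s+", " ", text.strip().lower())
    return re.sub(r"\s*,\s*", ",", text)


def compare_listings(sleigh, external):
    """Both arguments map address -> disassembly text; return the mismatches."""
    mismatches = []
    for address in sorted(set(sleigh) & set(external)):
        if normalize(sleigh[address]) != normalize(external[address]):
            mismatches.append((address, sleigh[address], external[address]))
    return mismatches


# Invented sample data: the second instruction disagrees on its immediate.
sleigh_out = {0x1000: "MOV  R0, R1", 0x1002: "ADD R0,#0x1f"}
external_out = {0x1000: "mov r0,r1", 0x1002: "add r0,#0x3f"}
for address, ours, theirs in compare_listings(sleigh_out, external_out):
    print(hex(address), ours, "!=", theirs)
```

In practice the normalization step tends to need per-processor rules (register aliases, number formats), so this comparison should be read as a starting point only.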

Script - CompareSleighExternal

Script - DebugSleighInstructionParse

Developing a Sleigh Module - What’s Good Enough?

• Disassembly
  - Decode, Display, Flow instructions
• References
  - Addressing modes
• Decompilation
  - All Data Flow, pseudoOp In/Out, Logic, Math
• Emulation - EmulateInstructionStateModifier
• Theorem Proving - Detailed effects
• Partial languages - OK
  - Use unimpl, BadInstruction(), pseudoOp
• Speed up the Process - Automate it
  - Scraping disassembly / PDF
  - Parse disassembly tables, XML descriptions
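As one way to picture the automation mentioned in the last bullet, the sketch below turns a scraped opcode table into skeleton Sleigh constructors marked `unimpl`, so that disassembly works before any semantics are written. The table format, the single-byte token field named `op`, and the helper `sleigh_skeleton` are all assumptions for illustration; a real processor would need its own token definitions and operand patterns.

```python
def sleigh_skeleton(table_text, field="op"):
    """Emit one unimplemented Sleigh constructor per 'HEX MNEMONIC' table row."""
    lines = []
    for row in table_text.strip().splitlines():
        opcode_hex, mnemonic = row.split()
        opcode = int(opcode_hex, 16)
        # `unimpl` stands in for the semantic section until it is written.
        lines.append(":%s is %s=0x%02x unimpl" % (mnemonic, field, opcode))
    return "\n".join(lines)


# Invented two-row table standing in for scraped manual data.
TABLE = """
00 NOP
01 HALT
"""
print(sleigh_skeleton(TABLE))
```

The generated lines would still have to be placed in a .slaspec that defines the `op` token field before the Sleigh compiler could accept them.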

Developing a Sleigh Module - Now What?

• Tune for decompilation - calling convention
• Load format
  - ELF, .opinion for magic machineID
• Tune for emulation - Sleigh P-Code Tests
• Analyzers
  - Stock constant reference propagation can work well
  - Write specialized register propagation - Page register
• Pattern Files - recognize common patterns or key functions
• Variants - Pointer checking, Control Flow Guard
  - Decompiler P-code UserOp injection
  - Use context, Define, variants with Slaspec
• FID Files - Static library pattern matching

Contacting Us
• The Ghidra team is on GitHub.
• @NSAGov on Twitter announces new releases.
• The Ghidra team is not on Twitter, reddit, Slashdot,
VKontakte, ...

Reporting Bugs
• Please report bugs!
• The perfect bug report includes:
1. Source code.
2. Relevant bytes from the binary.
3. XML Debug Function Decompilation from decompiler.
4. Stack trace if there is one.
• Often we need an entire function and surrounding instructions.
• Pictures work, but can limit triage.
• We reserve the right to ignore sketchy binaries :)

www.ghidra-sre.org Stats (June 25)
• 9.0.0: 302k downloads
• 9.0.1: 36k downloads
• 9.0.2: 100k downloads
• 9.0.4: 42k downloads
• Site views: 10.6M
• Video hits: 751k

GitHub Stats (June 25)


• 16145 stars
• 2019 forks
• 718 watching
• 608 issues, 272 open
• 111 pull requests, 35 open

References

• Xtext - itemis.com, https://www.eclipse.org/Xtext/
• mumbel - https://github.com/mumbel/ghidra/tree/tricore
• SleighEditor README.html, build README.txt

Questions?

