BHUSA2015 Unicorn


Unicorn: Next Generation CPU Emulator Framework

www.unicorn-engine.org

NGUYEN Anh Quynh <aquynh -at- gmail.com>


DANG Hoang Vu <danghvu -at- gmail.com>

BlackHat USA, August 5th 2015

1 / 39 NGUYEN Anh Quynh, DANG Hoang Vu Unicorn: Next Generation CPU Emulator Framework
Self-introduction

Nguyen Anh Quynh (aquynh -at- gmail.com)


  - PhD in Computer Science, security researcher
  - Operating System, Virtual Machine, Binary analysis, Forensic, etc
  - Capstone disassembly framework (capstone-engine.org)

Dang Hoang Vu (danghvu -at- gmail.com)


  - PhD candidate in Computer Science at UIUC, security hobbyist
  - Member of VNSecurity.NET, casual CTF player, exploit writer
  - Capstone, Peda contributor

Agenda

1 CPU Emulator
Background
Problems of existing CPU emulators

2 Unicorn engine: demands, ideas, design & implementation


Goals of Unicorn
Design & implementation
Write applications with Unicorn API

3 Live demo

4 Conclusions

CPU Emulator

Definition
Emulate a physical CPU using software only.
Focus on CPU operations only, ignoring machine devices.

Applications
Emulate code without needing a real CPU.
  - Cross-architecture emulators for console games.
Safely analyze malware code, detect virus signatures.
Verify code semantics in reverse engineering.

Example
Emulate to understand code semantics.

Internals of CPU emulator

Given input code in binary form


Decode binary into separate instructions
Emulate exactly what each instruction does
  - Requires the Instruction Set Architecture (ISA) reference manual
  - Handle memory access & I/O on request
Update CPU context (registers/memory/etc) after each step

Example of emulating X86 32bit instructions

Ex: 50 → push eax

  - load eax register
  - decrease esp by 4, and update esp
  - copy eax value to the new top of the stack, at [esp]

Ex: 01D8 → add eax, ebx

  - load eax & ebx registers
  - add values of eax & ebx, then copy result to eax
  - update flags OF, SF, ZF, AF, CF, PF accordingly
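
The two examples above can be sketched as a tiny interpreter. This is a minimal illustration only, not how Unicorn or Qemu implement it: the class name `ToyX86`, the sparse dictionary memory and the initial stack address are assumptions made for the sketch, and only the byte sequences 50 and 01 D8 are handled.

```python
MASK32 = 0xFFFFFFFF


class ToyX86:
    """Toy interpreter for two X86 32-bit encodings: 50 and 01 D8 (hypothetical)."""

    def __init__(self, stack_top=0x2000):
        self.regs = {"eax": 0, "ebx": 0, "esp": stack_top, "eip": 0}
        self.flags = {"CF": 0, "PF": 0, "AF": 0, "ZF": 0, "SF": 0, "OF": 0}
        self.mem = {}  # sparse, byte-addressed memory

    def write32(self, addr, value):
        for i in range(4):  # little-endian store
            self.mem[(addr + i) & MASK32] = (value >> (8 * i)) & 0xFF

    def read32(self, addr):
        return sum(self.mem.get((addr + i) & MASK32, 0) << (8 * i) for i in range(4))

    def step(self, code):
        eip = self.regs["eip"]
        op = code[eip]
        if op == 0x50:  # push eax
            self.regs["esp"] = (self.regs["esp"] - 4) & MASK32
            self.write32(self.regs["esp"], self.regs["eax"])
            self.regs["eip"] = eip + 1
        elif op == 0x01 and code[eip + 1] == 0xD8:  # add eax, ebx
            a, b = self.regs["eax"], self.regs["ebx"]
            full = a + b
            res = full & MASK32
            self.regs["eax"] = res
            self.flags["CF"] = int(full > MASK32)            # unsigned carry out
            self.flags["ZF"] = int(res == 0)
            self.flags["SF"] = res >> 31                     # sign bit of result
            self.flags["AF"] = ((a ^ b ^ res) >> 4) & 1      # carry out of bit 3
            self.flags["PF"] = int(bin(res & 0xFF).count("1") % 2 == 0)
            self.flags["OF"] = ((~(a ^ b) & (a ^ res)) >> 31) & 1  # signed overflow
            self.regs["eip"] = eip + 2
        else:
            raise NotImplementedError(f"opcode {op:#04x} not handled in this sketch")

    def run(self, code):
        while self.regs["eip"] < len(code):
            self.step(code)


cpu = ToyX86()
cpu.regs["eax"] = 0x7FFFFFFF
cpu.regs["ebx"] = 1
cpu.run(bytes.fromhex("5001d8"))  # push eax ; add eax, ebx
print(hex(cpu.regs["esp"]), hex(cpu.regs["eax"]), cpu.flags)
```

With these inputs the push leaves esp at 0x1ffc with the old eax stored there, and the add leaves eax = 0x80000000 with OF and SF set, which shows why even a two-byte instruction needs careful flag handling.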

Challenges of building CPU emulator

A huge amount of work!

Good understanding of CPU architecture
Good understanding of instruction set
Instructions with various side effects (sometimes undocumented, as on
Intel X86)
Tough to support every kind of existing code

Good CPU emulator?

Multi-arch?
  - X86, Arm, Arm64, Mips, PowerPC, Sparc, etc
Multi-platform?
  - *nix, Windows, Android, iOS, etc
Updated?
  - Keep up with latest CPU extensions
Independent?
  - Support to build independent tools
Good performance?
  - Just-In-Time (JIT) compiler technique vs Interpreter
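
The performance point can be illustrated with a toy comparison. This is a sketch under loud assumptions: the two-opcode instruction set, `PROGRAM` and the `Translator` class are hypothetical, and Python closures stand in for the native code a real JIT would emit. It only shows the decode-once, cache, re-execute idea behind dynamic translation.

```python
# Toy ISA (hypothetical): each instruction is an (opcode, operand) pair.
PROGRAM = [("addi", 5), ("muli", 3), ("addi", -1)]


def interpret(program, acc, decode_log):
    """Interpreter: decode and dispatch every instruction on every run."""
    for op, arg in program:
        decode_log.append(op)  # decoding cost is paid on each execution
        if op == "addi":
            acc += arg
        elif op == "muli":
            acc *= arg
        else:
            raise ValueError(op)
    return acc


class Translator:
    """Decode a block once, cache the translated form, reuse it afterwards."""

    def __init__(self):
        self.cache = {}  # block id -> list of translated operations

    def translate(self, program, decode_log):
        ops = []
        for op, arg in program:
            decode_log.append(op)  # decoding cost is paid once per block
            if op == "addi":
                ops.append(lambda acc, n=arg: acc + n)
            elif op == "muli":
                ops.append(lambda acc, n=arg: acc * n)
            else:
                raise ValueError(op)
        return ops

    def run(self, block_id, program, acc, decode_log):
        if block_id not in self.cache:
            self.cache[block_id] = self.translate(program, decode_log)
        for fn in self.cache[block_id]:
            acc = fn(acc)
        return acc


log_i, log_t = [], []
tr = Translator()
for _ in range(100):
    r_i = interpret(PROGRAM, 1, log_i)
    r_t = tr.run("block0", PROGRAM, 1, log_t)
print(r_i, r_t, len(log_i), len(log_t))  # 17 17 300 3
```

Both paths compute the same result, but the interpreter decodes 300 instructions over 100 runs while the cached translation decodes 3.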

Existing CPU emulators

Features       libemu    PyEmu     IDA-x86emu   libCPU    Dream

Multi-arch     no        no        no           no [1]    yes
Updated        no        no        no           no        yes
Independent    no [2]    no [3]    no [4]       yes       yes
JIT            no        no        no           yes       yes

Multi-arch: existing tools only support X86

Updated: existing tools do not support X86_64

[1] Possible by design, but nothing actually works
[2] Focus only on detecting Windows shellcode
[3] Python only
[4] For IDA only
Dream a good emulator

Multi-architectures
  - Arm, Arm64, Mips, PowerPC, Sparc, X86 (+X86_64) + more
Multi-platform: *nix, Windows, Android, iOS, etc
Updated: latest extensions of all hardware architectures
Independent, with multiple bindings
  - Low-level framework to support all kinds of OS and tools
  - Core in pure C, with support for multiple binding languages
Good performance with JIT compiler technique
  - Dynamic compilation vs Interpreter
Allow instrumentation at various levels
  - Single-step/instruction/memory access

Problems

No reasonable CPU emulator even in 2015!


Apparently nobody wants to fix the issues
No light at the end of the dark tunnel
Until Unicorn was born!

Unicorn == Next Generation CPU Emulator

Goals of Unicorn

Multi-architectures
  - Arm, Arm64, Mips, PowerPC, Sparc, X86 (+X86_64) + more
Multi-platform: *nix, Windows, Android, iOS, etc
Updated: latest extensions of all hardware architectures
Core in pure C, with support for multiple binding languages
Good performance with JIT compiler technique
Allow instrumentation at various levels
  - Single-step/instruction/memory access

Unicorn vs others

Features       libemu    PyEmu     IDA-x86emu   libCPU    Unicorn

Multi-arch     no        no        no           no        yes
Updated        no        no        no           no        yes
Independent    no        no        no           yes       yes
JIT            no        no        no           yes       yes

Multi-arch: existing tools only support X86

Updated: existing tools do not support X86_64

Challenges to build Unicorn engine

A huge amount of work!

Too many hardware architectures
Too many instructions
Instructions with various side effects (sometimes undocumented, as on
Intel X86)
Hard to support every kind of existing code
Limited resources
  - Started as a personal, for-fun, spare-time project

Unicorn design

Ambitions & ideas

Have all features in months, not years!

Stand on the shoulders of giants in the initial phase.
Open source project, to get the community involved & contributing.
Idea: Qemu!

Introduction on Qemu

Qemu project
Open source project (GPL license) on system emulator:
http://www.qemu.org
Huge community & highly active
Multi-arch
  - X86, Arm, Arm64, Mips, PowerPC, Sparc, etc (18 architectures)
Multi-platform
  - Compile on *nix + cross-compile for Windows

Qemu architecture

Courtesy of cmchao

Why Qemu?

Supports all kinds of architectures and is kept very up to date

Already implemented in pure C, so easy to implement Unicorn core
on top
Already supports JIT in CPU emulation

Are we done?

Challenges to build Unicorn (1)

Qemu codebase is a challenge

Not just emulating the CPU, but also device models & ROM/BIOS to fully
emulate physical machines
Qemu codebase is huge and tangled like spaghetti :-(
Difficult to read, as it was contributed by many different people

Unicorn job
Keep only CPU emulation code & remove everything else (devices,
ROM/BIOS, migration, etc)
Keep supporting subsystems like Qobject, Qom
Rewrite some components but keep CPU emulation code intact (so it is
easy to sync with Qemu in future)

Challenges to build Unicorn (2)

Qemu is a set of emulators

Set of emulators for individual architectures
  - Independently built at compile time
  - Code for all archs shares a lot of internal data structures and global
    variables
Unicorn wants a single emulator that supports all archs :-(

Unicorn job
Isolated common variables & structures
  - Ensured thread-safety by design
Refactored to allow multiple instances of Unicorn at the same time
Modified the build system to support multiple archs on demand

Challenges to build Unicorn (3)

Qemu has no dynamic instrumentation

Instrumentation is available for static compilation only
JIT optimizes for performance with lots of fast-path tricks, making
code instrumentation extremely hard :-(

Unicorn job
Build dynamic fine-grained instrumentation layer from scratch
Support various levels of instrumentation
  - Single-step or on particular instruction (TCG level)
  - Instrumentation of memory accesses (TLB level)
  - Dynamically read and write register or memory during emulation.
  - Handle exception, interrupt, syscall (arch-level) through user-provided
    callback.
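
The instrumentation levels listed above can be sketched with the Unicorn Python binding. This assumes the `unicorn` package as later published is installed; the names `Uc`, `hook_add`, `UC_HOOK_CODE` and `UC_HOOK_MEM_WRITE` come from that binding rather than from these slides, while `ADDRESS`, `CODE` and `trace` are illustrative choices. If the package is missing, `trace()` simply returns `None`.

```python
try:
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32, UC_HOOK_CODE, UC_HOOK_MEM_WRITE
    from unicorn.x86_const import UC_X86_REG_ECX
    HAVE_UNICORN = True
except ImportError:  # binding not installed: the sketch becomes a no-op
    HAVE_UNICORN = False

ADDRESS = 0x1000000  # illustrative load address
# mov dword ptr [0x1000100], ecx ; inc ecx
CODE = b"\x89\x0d\x00\x01\x00\x01\x41"


def trace(code=CODE):
    """Run `code` and collect instruction and memory-write events."""
    if not HAVE_UNICORN:
        return None
    events = []

    def on_code(uc, address, size, user_data):
        # Called before each instruction; registers can be read (or written) here.
        events.append(("insn", address, size, uc.reg_read(UC_X86_REG_ECX)))

    def on_mem_write(uc, access, address, size, value, user_data):
        # Called on each write to mapped memory.
        events.append(("write", address, size, value))

    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)  # code and data share one region here
    mu.mem_write(ADDRESS, code)
    mu.reg_write(UC_X86_REG_ECX, 0x1234)
    mu.hook_add(UC_HOOK_CODE, on_code)
    mu.hook_add(UC_HOOK_MEM_WRITE, on_mem_write)
    mu.emu_start(ADDRESS, ADDRESS + len(code))
    return events


if __name__ == "__main__":
    for event in trace() or []:
        print(event)
```

Each callback receives the emulator instance, so a tool can inspect or change CPU state at exactly the point where the event fires.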

Challenges to build Unicorn (4)

Qemu is leaking memory

Objects are allocated (malloc) without being freed properly everywhere
Fine for a tool, but unacceptable for a framework

Unicorn job
Find and fix all the memory leak issues
Refactor various subsystems to keep track of and clean up dangling
pointers.

Unicorn vs Qemu

Forked Qemu, but goes far beyond it

Independent framework
Much more compact in size, lightweight in memory
Thread-safe with multiple architectures supported in a single binary
Provide interface for dynamic instrumentation
More resistant to exploitation (more secure)
  - CPU emulation component has never been exploited!
  - Easy to test and fuzz as an API.

Qemu vulnerabilities

Write applications with Unicorn

Introduce Unicorn API

Clean/simple/lightweight/intuitive architecture-neutral API.

The core provides API in C
  - open & close Unicorn instance
  - start & stop emulation (based on end-address, time or instruction
    count)
  - read & write memory
  - read & write registers
  - memory management
      * hook memory events for invalid memory access
      * dynamically map memory at runtime (handle invalid/missing memory)
  - instrument with user-defined callbacks for
    instructions/single-step/memory events, etc
Python binding built around the core
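
The memory-management items above (hooking invalid accesses and mapping memory at runtime) can be sketched as follows. This assumes the `unicorn` Python package as later published is installed; `UC_HOOK_MEM_WRITE_UNMAPPED` and the callback signature come from that binding, while `TARGET`, `PAGE` and `run_with_lazy_mapping` are illustrative names. The handler maps a single page, which is enough here because the 4-byte write does not cross a page boundary.

```python
try:
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32, UC_HOOK_MEM_WRITE_UNMAPPED
    from unicorn.x86_const import UC_X86_REG_ECX
    HAVE_UNICORN = True
except ImportError:  # binding not installed: the sketch becomes a no-op
    HAVE_UNICORN = False

ADDRESS = 0x1000000   # illustrative load address for the code
TARGET = 0xAAAAAAAA   # deliberately unmapped when emulation starts
PAGE = 0x1000         # mappings must be 4KB-aligned
# mov dword ptr [0xaaaaaaaa], ecx
CODE = b"\x89\x0d\xaa\xaa\xaa\xaa"


def run_with_lazy_mapping():
    """Map the missing page from inside the invalid-access callback."""
    if not HAVE_UNICORN:
        return None
    mapped = []

    def on_unmapped(uc, access, address, size, value, user_data):
        page = address & ~(PAGE - 1)
        uc.mem_map(page, PAGE)   # make the faulting page available
        mapped.append(page)
        return True              # True asks the engine to retry the access

    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)
    mu.mem_write(ADDRESS, CODE)
    mu.reg_write(UC_X86_REG_ECX, 0x1234)
    mu.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED, on_unmapped)
    mu.emu_start(ADDRESS, ADDRESS + len(CODE))
    stored = int.from_bytes(mu.mem_read(TARGET, 4), "little")
    return mapped, stored


if __name__ == "__main__":
    print(run_with_lazy_mapping())
```

Returning `False` from the callback instead would let the emulation stop with an invalid-memory error, which is the useful behaviour when the access is unexpected.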

Sample code in C

Sample code in Python
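
A minimal sketch of what such a script can look like, assuming the `unicorn` Python package as later published is installed. The API names (`Uc`, `mem_map`, `mem_write`, `reg_write`, `emu_start`, `reg_read`) are from that binding and are not taken from this slide; the load address and register values are illustrative. Without the package, `emulate()` returns `None`.

```python
try:
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
    from unicorn.x86_const import UC_X86_REG_ECX, UC_X86_REG_EDX
    HAVE_UNICORN = True
except ImportError:  # binding not installed: the sketch becomes a no-op
    HAVE_UNICORN = False

ADDRESS = 0x1000000        # illustrative load address
X86_CODE32 = b"\x41\x4a"   # inc ecx ; dec edx


def emulate(code=X86_CODE32):
    """Open an instance, load code, set registers, run, read results."""
    if not HAVE_UNICORN:
        return None
    mu = Uc(UC_ARCH_X86, UC_MODE_32)            # open an X86 32-bit instance
    mu.mem_map(ADDRESS, 2 * 1024 * 1024)        # map 2MB for this emulation
    mu.mem_write(ADDRESS, code)                 # write machine code to memory
    mu.reg_write(UC_X86_REG_ECX, 0x1234)        # set initial register values
    mu.reg_write(UC_X86_REG_EDX, 0x7890)
    mu.emu_start(ADDRESS, ADDRESS + len(code))  # run until the end address
    return mu.reg_read(UC_X86_REG_ECX), mu.reg_read(UC_X86_REG_EDX)


if __name__ == "__main__":
    result = emulate()
    if result is not None:
        print("ECX = 0x%x, EDX = 0x%x" % result)
```

The same steps apply to any other supported architecture by changing the architecture and mode constants and the register identifiers.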

Live demo

Status & future works

Status
Support Arm, Arm64, Mips, M68K, PowerPC, Sparc, X86 (+X86_64)
Python binding available
Based on Qemu 2.3

Future works
Support all the remaining architectures of Qemu
(alpha/s390x/microblaze/sh4/etc, 18 in total)
Strip more utility code from Qemu, e.g. improve the disassembler
(with potential integration with Capstone).
More bindings promised by community!
Synchronize with Qemu 2.4 (to be released soon)
  - Future of Unicorn is guaranteed by Qemu active development!

Conclusions

Unicorn is an innovative next generation CPU emulator

  - Multi-arch + multi-platform
  - Clean/simple/lightweight/intuitive architecture-neutral API
  - Implemented in pure C language, with bindings for Python available.
  - High performance with JIT compiler technique
  - Support fine-grained instrumentation at various levels.
  - Thread-safe by design.
  - Open source GPL license.
  - Future update guaranteed for all archs.
We are seriously committed to this project to make it the best CPU
emulator.

Call for beta testers

Run beta test before official release

Willing to help? If you can code, contact us!
  - Unicorn homepage: http://www.unicorn-engine.org
  - Unicorn twitter: @unicorn_engine
  - Unicorn mailing list:
    http://www.freelists.org/list/unicorn-engine
First public version to be released after the beta phase, under the GPL
license.

Questions and answers
Unicorn: Next Generation CPU Emulator Framework

NGUYEN Anh Quynh <aquynh -at- gmail.com>

DANG Hoang Vu <danghvu -at- gmail.com>

References

Qemu: http://www.qemu.org
libemu: http://libemu.carnivore.it
PyEmu: http://code.google.com/p/pyemu
libcpu: https://github.com/libcpu/libcpu
IDA-x86emu: http://www.idabook.com/x86emu/index.html
Unicorn engine
  - Homepage: http://www.unicorn-engine.org
  - Mailing list: http://www.freelists.org/list/unicorn-engine
  - Twitter: @unicorn_engine

Acknowledgement

Nguyen Tan Cong, for helping with the shellcode demo!

Other beta testers, for helping to improve our code!

