Boosting Instruction Set Simulator Performance with Parallel Block Optimisation and Replacement

Alexander, B., Donnellan, S., Jeffries, A., Olds, T. and Sizer, N.

Time-to-market is a critical factor in the commercial success of new consumer devices. To minimise delays, system developers and third-party software vendors must be able to test their applications before the hardware platform becomes available. Instruction Set Simulators (ISSs) underpin this early development by emulating new platforms on ordinary desktop machines. As target platforms become faster, the performance demands on ISSs become greater. A key challenge is to leverage available simulator technology to produce, at low cost, the incremental performance gains needed to keep up with these demands. In this work we use a very simple strategy, in-place block replacement, to improve the performance of the popular QEMU functional simulator. The replacement blocks are generated at runtime using the LLVM JIT running on spare processor cores. This strategy provides a very lightweight way to incrementally build an alternate code generator within an existing ISS framework without incurring a substantial runtime cost. We show that the approach is effective in reducing the runtimes of the QEMU user-space emulator on a number of SPECint 2006 benchmarks.
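
The replacement strategy described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and uses neither QEMU nor LLVM: the class `BlockCache`, its methods, the `hot_threshold` parameter and the two translator callbacks are all hypothetical names chosen for illustration. The sketch assumes a baseline translator that produces a block quickly and a slower optimising translator that runs on a background thread and then overwrites the cache entry in place. In CPython the worker thread does not give true parallelism, so the sketch shows only the structure of the approach.

```python
import queue
import threading


class BlockCache:
    """Toy translation cache: maps a guest block address to a host callable.

    All names here are hypothetical; this is not QEMU's or LLVM's API.
    """

    def __init__(self, baseline, optimise, hot_threshold=2):
        self.baseline = baseline      # quick translator: addr -> callable
        self.optimise = optimise      # slower optimising translator: addr -> callable
        self.hot_threshold = hot_threshold
        self.blocks = {}              # addr -> callable currently in use
        self.counts = {}              # addr -> number of executions so far
        self.optimised = set()        # addrs whose block has been replaced
        self.pending = queue.Queue()  # addrs waiting for the background optimiser
        self.worker = threading.Thread(target=self._optimise_loop, daemon=True)
        self.worker.start()

    def _optimise_loop(self):
        # Runs off the main execution path, standing in for a spare core.
        while True:
            addr = self.pending.get()
            # Replace the block in place: later lookups see the new code.
            self.blocks[addr] = self.optimise(addr)
            self.optimised.add(addr)
            self.pending.task_done()

    def run(self, addr, state):
        # Main loop path: translate on first use, then reuse the cached block.
        block = self.blocks.get(addr)
        if block is None:
            block = self.blocks[addr] = self.baseline(addr)
        n = self.counts[addr] = self.counts.get(addr, 0) + 1
        if n == self.hot_threshold:
            # The block is now considered hot; hand it to the optimiser once.
            self.pending.put(addr)
        return block(state)

    def drain(self):
        # Wait until every queued block has been optimised and replaced.
        self.pending.join()
```

The main loop never waits for the optimiser: it keeps executing the baseline block until the replacement appears in the cache, which is why the scheme adds little cost to the existing execution path.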
Cite as: Alexander, B., Donnellan, S., Jeffries, A., Olds, T. and Sizer, N. (2012). Boosting Instruction Set Simulator Performance with Parallel Block Optimisation and Replacement. In Proc. Australasian Computer Science Conference (ACSC 2012), Melbourne, Australia. CRPIT, 122. Reynolds, M. and Thomas, B., Eds., ACS. 11-20.