A Hardware Implementation of The Java Virtual Machine

The document discusses picoJava, a hardware implementation of the Java Virtual Machine designed for embedded systems. It aims to provide the best price/performance for running Java applications by directly executing bytecodes in hardware without an interpreter or JIT compiler. The core is optimized for Java instructions through techniques like hiding loads from local variables and pipeline enhancements to support features like method invocations.


picoJava™:

A Hardware Implementation
of the Java Virtual Machine
Marc Tremblay and Michael O'Connor
Sun Microelectronics
Slide 1

The Java/picoJava Synergy

Java's origins lie in improving the consumer embedded market
picoJava is a low cost microprocessor dedicated to executing Java-based bytecodes
  Best system price/performance
It is a processor core for:
  Network computer
  Internet chip for network appliances
  Cellular phone & telco processors
  Traditional embedded applications
Slide 2

Java in Embedded Devices

Products in the embedded market require:
Robust programs
  Graceful recovery vs. crash
Increasingly complex programs with multiple programmers
  Object-oriented language and development environment
Re-using code from one product generation to the next
  Portable code
Safe connectivity to applets
  For networked devices (PDA, pagers, cell phones)
Slide 3

Important Factors to Consider in the Embedded World

Low system cost
  Processor, ROM, DRAM, etc.
Good performance
Time-to-market
Low power consumption

Slide 4

Various Ways of Implementing the Java Virtual Machine

[Figure: layered diagram of Java platform implementations. Upper layers: Applets, HotJava, APIs, Virtual Machine, Host Porting Interface. Lower layers vary by implementation: Adaptor, Browser, OS, JavaOS, Hardware Architecture, picoJava.]

Slide 5

picoJava
Directly executes bytecodes
Excellent performance
Eliminates the need for an interpreter
or a JIT compiler
Small memory footprint

Simple core
Legacy blocks and circuits are not present

Hardware support for the runtime


Addresses overall system performance

Slide 6

Java Virtual Machine


What the virtual machine specifies:
Instruction set
Data types
Operand stack
Constant pool
Method area
Heap for runtime data
Format of the class file

Slide 7

Virtual Machine Instruction Set


Data types: byte, short, int, long, float, double,
char, object, returnAddress
All opcodes have 8 bits, but are followed by a
variable number of operands (0, 1, 2, 3, ...)
Opcodes
  200 assigned
  25 quick variations
  3 reserved
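
As a rough illustration of a one-byte opcode followed by a variable number of operand bytes, the sketch below walks a short bytecode sequence and reports the length of each instruction. The opcode values are the standard JVM ones, but the table covers only a handful of fixed-length instructions, and the names `OPERAND_BYTES` and `instruction_lengths` are invented for this example.

```python
# Operand byte counts for a few fixed-length JVM opcodes (illustrative subset).
OPERAND_BYTES = {
    0x10: 1,  # bipush <byte>
    0x15: 1,  # iload <index>
    0x36: 1,  # istore <index>
    0x60: 0,  # iadd
    0x99: 2,  # ifeq <branchbyte1> <branchbyte2>
    0xB1: 0,  # return
}


def instruction_lengths(code: bytes) -> list[int]:
    """Return the length in bytes of each instruction in `code`."""
    lengths = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        if opcode not in OPERAND_BYTES:
            raise ValueError(f"opcode {opcode:#x} is not in this sketch's table")
        size = 1 + OPERAND_BYTES[opcode]  # one opcode byte plus its operands
        lengths.append(size)
        pc += size
    return lengths


# bipush 5; iload 1; iadd; istore 2; return
code = bytes([0x10, 5, 0x15, 1, 0x60, 0x36, 2, 0xB1])
lengths = instruction_lengths(code)
average = sum(lengths) / len(lengths)
print(lengths, average)
```

For this five-instruction sequence the lengths are 2, 2, 1, 2 and 1 bytes, an average of 1.6 bytes per instruction.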

Slide 8

Java Virtual Machine Code Size


Java-based bytecodes are small
  No register specifiers
  Local variables accessed relative to a base pointer (VARS)
This results in very compact code
  Average JVM instruction is 1.8 bytes
  RISC instructions typically require 4 bytes

Slide 9

Instruction Length

[Chart: percentage breakdown of instruction lengths (1 byte, 2 bytes, 3 bytes, others), on a 0% to 100% scale, for Hot Java, Pento., Dhrys., Ray, Compr., Tomcat, and Javac.]

Slide 10

Java Virtual Machine Code Size

Java bytecodes are about 2X smaller than the RISC code from the C++ compiler
  A large application (2500+ lines) coded in both the C++ and Java languages

Slide 11

JVM Instruction Set: RISCy

Some instructions are simple
  bipush value    : push signed integer
  iadd            : integer add
  fadd            : single float add
  ifeq            : branch if equal to 0
  iload offset    : load integer from local variable

Slide 12

JVM Instruction Set: CISCy

Some instructions are complex
lookupswitch: traditional switch statement

  byte 1         byte 2   byte 3   byte 4
  opcode (171)   0..3 byte padding
  default offset
  number of pairs that follow (N)
  match 1
  jump offset 1
  match 2
  jump offset 2
  ...
  match N
  jump offset N
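
To make the layout above concrete, here is a minimal sketch of decoding a `lookupswitch` from a byte array. It assumes the standard JVM encoding: 32-bit big-endian signed values, with the padding chosen so the default offset starts on a 4-byte boundary measured from the start of the method's code. The function names `parse_lookupswitch` and `switch_target` are made up for illustration.

```python
import struct

LOOKUPSWITCH = 171  # opcode value shown on the slide


def parse_lookupswitch(code: bytes, pc: int):
    """Decode the lookupswitch at offset `pc`; return (default, pairs, length)."""
    if code[pc] != LOOKUPSWITCH:
        raise ValueError("not a lookupswitch")
    # Skip 0..3 padding bytes so the operands start on a 4-byte boundary.
    pos = (pc + 4) & ~3
    default, npairs = struct.unpack_from(">ii", code, pos)
    pos += 8
    pairs = []
    for _ in range(npairs):
        match, offset = struct.unpack_from(">ii", code, pos)
        pairs.append((match, offset))
        pos += 8
    return default, pairs, pos - pc


def switch_target(code: bytes, pc: int, key: int) -> int:
    """Return the branch target for `key`: pc plus the selected jump offset."""
    default, pairs, _ = parse_lookupswitch(code, pc)
    for match, offset in pairs:  # linear scan, kept simple for the sketch
        if match == key:
            return pc + offset
    return pc + default


# A nop, then a lookupswitch at pc = 1 with two padding bytes and two pairs.
code = (bytes([0x00, LOOKUPSWITCH, 0, 0])
        + struct.pack(">ii", 40, 2)
        + struct.pack(">iiii", 1, 16, 7, 28))
print(parse_lookupswitch(code, 1))
print(switch_target(code, 1, 7), switch_target(code, 1, 3))
```

The amount of work depends on the number of pairs, which is one reason an instruction like this is a poor fit for a single hardwired pipeline step.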

Slide 13

Interpreter Loop
loop: 1: fetch bytecodes
      2: indirect jump to emulation code

Emulation Code
  1: get operands
  2: perform operation
  3: increment PC
  4: go to loop
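
The loop above can be sketched in a few lines. This is only an illustration of the fetch and indirect-jump structure, not any particular interpreter: the `Frame` class, the handler names, and the five-opcode `HANDLERS` table are assumptions made for the example, and 32-bit overflow handling is left out.

```python
class Frame:
    """Minimal execution state: code, local variables, operand stack, PC."""

    def __init__(self, code: bytes, local_vars: list[int]):
        self.code = code
        self.local_vars = local_vars
        self.stack: list[int] = []
        self.pc = 0
        self.running = True


def op_bipush(f: Frame) -> None:
    byte = f.code[f.pc + 1]                               # 1: get operands
    f.stack.append(byte - 256 if byte > 127 else byte)    # 2: perform operation
    f.pc += 2                                             # 3: increment PC


def op_iload(f: Frame) -> None:
    f.stack.append(f.local_vars[f.code[f.pc + 1]])
    f.pc += 2


def op_istore(f: Frame) -> None:
    f.local_vars[f.code[f.pc + 1]] = f.stack.pop()
    f.pc += 2


def op_iadd(f: Frame) -> None:
    value2 = f.stack.pop()
    value1 = f.stack.pop()
    f.stack.append(value1 + value2)  # 32-bit wraparound omitted in this sketch
    f.pc += 1


def op_return(f: Frame) -> None:
    f.running = False


HANDLERS = {0x10: op_bipush, 0x15: op_iload, 0x36: op_istore,
            0x60: op_iadd, 0xB1: op_return}


def interpret(code: bytes, local_vars: list[int]) -> list[int]:
    f = Frame(code, local_vars)
    while f.running:                 # 4: go to loop
        opcode = f.code[f.pc]        # loop 1: fetch bytecode
        HANDLERS[opcode](f)          # loop 2: indirect jump to emulation code
    return f.local_vars


# bipush 5; iload 1; iadd; istore 2; return
program = bytes([0x10, 5, 0x15, 1, 0x60, 0x36, 2, 0xB1])
print(interpret(program, [0, 37, 0]))
```

Every bytecode pays for the table lookup and the indirect call on top of its own work, which is the overhead that direct execution in hardware avoids.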

Slide 14

JVM: Stack-Based Architecture

Operands typically accessed from the stack, put back on the stack
Example integer add:
  Add top 2 entries in the stack and put the result on top of the stack
Typical emulation on a RISC processor
  1: load tos
  2: load tos-1
  3: add
  4: store tos-1

Slide 15

How to Best Execute Bytecodes?


Leverage RISC techniques developed
over the past 15 years
Implement in hardware only those
instructions that make a difference
Trap for costly instructions that do not occur often
State machines for high frequency/medium
complexity instructions

Slide 16

Dynamic Instruction Distribution

[Chart: dynamic instruction mix (stack ops, ld object, st object, compute, calls/ret), on a 0% to 100% scale, for Hot Java, Pento, Dhryst., Ray, Compr, and Javac.]

Slide 17

Composite Instruction Mix

stack ops (43%): dup, push, loads and stores to local variables
compute (28%): ALU, FP, branches
calls/ret (7%): method invocation, virtual and non-virtual
ld/st object (ld object 17%, st object 5%): access to objects on the heap and array accesses

Slide 18

Loads from Local Variables

[Chart: instruction distribution: compute 36%, stack ops 29%, ld object 21%, calls/ret 8%, st object 6%.]

Loads from local variables move data within the chip
Target register is often consumed immediately
Up to 60% of them can be hidden
Resulting instruction distribution looks closer to a RISC processor
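
One way to picture hiding a load is to merge a local-variable load with the instruction that consumes its value straight away. The sketch below does this on a list of mnemonics. The merging rule and the `CONSUMERS` set are assumptions made for illustration; the slides do not spell out the conditions picoJava actually uses.

```python
LOCAL_LOADS = {"iload", "fload", "aload"}
# Hypothetical set of instructions that consume the top-of-stack value at once.
CONSUMERS = {"iadd", "isub", "imul", "fadd", "ifeq", "getfield_quick", "istore"}


def fold_loads(trace: list[str]) -> list[str]:
    """Merge each local load with an immediately following consumer."""
    folded = []
    i = 0
    while i < len(trace):
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        if trace[i] in LOCAL_LOADS and nxt in CONSUMERS:
            folded.append(f"{trace[i]}+{nxt}")  # treated as a single operation
            i += 2
        else:
            folded.append(trace[i])
            i += 1
    return folded


trace = ["iload", "iload", "iadd", "istore",
         "aload", "getfield_quick", "iload", "ifeq"]
folded = fold_loads(trace)
hidden = len(trace) - len(folded)
print(folded, hidden)
```

In this made-up trace, three of the four loads disappear into the instruction that follows them, so eight bytecodes are issued as five operations.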
Slide 19

Pipeline Design
RISC pipeline attributes
  Stages based on fundamental paths (e.g. cache access, ALU path, register access)
  No operation on cache/memory data
  Hardwire all simple operations
Enhance classic pipeline
  Support for method invocations
  Support for hiding loads from local variables

Slide 20

Implementation of Critical Instructions

getfield_quick offset
  Fetch field from object
  Executes as a load [object + offset] on picoJava
  Stack before: ..., objectref
  Stack after:  ..., value

iadd
  Fully pipelined
  Executes in a single cycle
  Stack before: ..., value1, value2
  Stack after:  ..., result
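
The stack effects of these two instructions can be modelled in a few lines. This is a sketch under simplifying assumptions: the heap is a flat list of words, an objectref is an index into that list, and the function names simply mirror the mnemonics.

```python
def getfield_quick(stack: list[int], heap: list[int], offset: int) -> None:
    """Pop objectref, push the word at heap[objectref + offset]."""
    objectref = stack.pop()
    stack.append(heap[objectref + offset])


def iadd(stack: list[int]) -> None:
    """Pop value2 and value1, push their sum wrapped to a signed 32-bit int."""
    value2 = stack.pop()
    value1 = stack.pop()
    stack.append((value1 + value2 + 2**31) % 2**32 - 2**31)


heap = [0, 0, 11, 22, 33]      # an object at address 2 with fields 11, 22, 33
stack = [99, 2]                # ..., objectref
getfield_quick(stack, heap, 1)
print(stack)                   # ..., value

stack += [20]                  # ..., value1, value2
iadd(stack)
print(stack)                   # ..., result
```

Because the field offset is already resolved, `getfield_quick` reduces to one indexed load, which is why it can be treated like an ordinary load instruction.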
Slide 21

Typical Small Benchmarks (Caffeinemarks, Pentomino, etc.)

Few objects, few calls, few threads
  Interpreter: 95%
  Run Time: 5%
Speeding up the interpreter by 30X results in:
  Interpreter: 95 -> 3.2
  Run Time: 5 -> 5
  Total: 8.2
=> Speedup of ~12X


Slide 22

Representative Applications
Lots of Objects
Threaded Code
  Interpreter: 60 - 80%
  Synchronization, Garbage Collection, Object Creation: 40 - 20%
Speeding up the interpreter by 30X results in:
  Interpreter: 60 -> 2
  Run time: 40 -> 40
  Total: 42
=> Speedup of ~2X
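
The arithmetic on this slide and the previous one is an instance of Amdahl's law: only the interpreter share of the time shrinks, so the runtime share bounds the overall gain. A small sketch, with `overall_speedup` as a made-up helper name:

```python
def overall_speedup(interpreter_share: float, factor: float) -> float:
    """Whole-program speedup when only the interpreter share runs `factor` times faster."""
    runtime_share = 1.0 - interpreter_share
    return 1.0 / (interpreter_share / factor + runtime_share)


small = overall_speedup(0.95, 30)   # small benchmarks: 95% interpreter
large = overall_speedup(0.60, 30)   # representative applications: 60% interpreter
print(f"{small:.1f}x {large:.1f}x")
```

This gives about 12.2x for the small benchmarks and about 2.4x for the 60% case, in line with the ~12X and ~2X on the slides. At the 80% end of the range the same formula gives roughly 4.4x.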


Slide 23

Percentage of Calls

[Chart: percentage of calls, y-axis from 0 to 12.]

Varies dramatically according to benchmark type


Slide 24

picoJava:
A System Performance Approach
Accelerates object-oriented programs
  Simple pipeline with enhancements for features specific to bytecodes
  Support for method invocation
Accelerates runtime (gc.c, monitor.c, threadruntime.c, etc.)
  Support for threads
  Support for garbage collection
Simple but efficient, non-invasive hardware support

Slide 25

System Programming
Instructions added to support system programming
  Available only "under the hood"
  Operating system functions
  Access to I/O devices
  Access to the internals of picoJava

Slide 26

picoJava - Summary
Best system price/performance for running
Java-powered applications in embedded markets
Embedded market very sensitive to system
cost and power consumption
Interpreter and/or JIT compiler eliminated
Excellent system performance
Efficient implementation through use of the
same methodology, process and circuit
techniques developed for RISC processors

Slide 27
