The document discusses stack machine design and optimizations for the Java Virtual Machine (JVM). It proposes a two-level stack cache with the top two stack elements stored in registers and the rest in a dual-port on-chip memory. This design simplifies the pipeline and avoids forwarding logic compared to a three-port memory implementation. Evaluation shows the two-level cache approach achieves higher frequency and smaller size than alternatives like a register file cache or single-level RAM cache.
JOPStack
The Java Virtual Machine
- JVM is a stack machine
- All instructions access the stack
- 40% access local variables
- Stack and local variables need caching
JOP Stack Architecture 3
An Efficient Stack Machine
- JVM stack is a logical stack
  - Frame for return information
  - Local variable area
  - Operand stack
- We could use independent stacks
- Argument-passing regulates the layout
Parameter passing

    int val = foo(1, 2);
    ...
    public int foo(int a, int b) {
        int c = 1;
        return a+b+c;
    }

The invocation sequence:

    aload_0           // Push the object reference
    iconst_1          // and the parameter onto the
    iconst_2          // operand stack.
    invokevirtual #2  // Invoke method foo:(II)I.
    istore_1          // Store the result in val.

public int foo(int,int):

    iconst_1          // The constant is stored in a method
    istore_3          // local variable (at position 3).
    iload_1           // Arguments are accessed as locals
    iload_2           // and pushed onto the operand stack.
    iadd              // Operation on the operand stack.
    iload_3           // Push c onto the operand stack.
    iadd
    ireturn           // Return value is on top of stack.
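The bytecode of foo can be traced by hand with a small interpreter-style sketch. The class and method names below (FooBytecodeSketch, runFoo) are hypothetical and chosen only for illustration. The sketch assumes a separate locals array and operand stack, which is the logical JVM view and not necessarily how a hardware stack cache lays them out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FooBytecodeSketch {
    // Hypothetical helper: executes the bytecode of foo(int,int) step by step.
    static int runFoo(int a, int b) {
        // Slot 0 would hold the object reference (unused in this sketch),
        // slots 1 and 2 hold the arguments, slot 3 holds the local c.
        int[] locals = new int[4];
        locals[1] = a;
        locals[2] = b;
        Deque<Integer> stack = new ArrayDeque<>();

        stack.push(1);                          // iconst_1
        locals[3] = stack.pop();                // istore_3
        stack.push(locals[1]);                  // iload_1
        stack.push(locals[2]);                  // iload_2
        stack.push(stack.pop() + stack.pop());  // iadd
        stack.push(locals[3]);                  // iload_3
        stack.push(stack.pop() + stack.pop());  // iadd
        return stack.pop();                     // ireturn
    }

    public static void main(String[] args) {
        System.out.println(runFoo(1, 2));       // prints 4
    }
}
```

Every step touches the operand stack, and four of the eight instructions also touch a local variable. Both areas are therefore on the hot path.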
Stack Layout
Stack Content
- Operand stack
  - TOS and TOS-1
- Local variable area
  - Former op stack
  - At a deeper position
- Saved context
  - Between locals and operand stack

A=B+C*D

Stack     JVM
push B    iload_1
push C    iload_2
push D    iload_3
*         imul
+         iadd
pop A     istore_0
Stack access
- Stack operation
  - Read TOS and TOS-1
  - Execute
  - Write back TOS
- Variable load
  - Read from deeper stack location
  - Write into TOS
- Variable store
  - Read TOS
  - Write into deeper stack location

Three Port Stack Memory
- Single cycle execution
- Two read ports for
  - TOS and TOS-1 or
  - Local variable
- One write port for
  - TOS or
  - Local variable
Register File Stack Cache
- Register file as circular buffer (small)
- Automatic spill/fill
- Five access ports
- picoJava, aJile

Pipeline stages:
- Instruction fetch
- Instruction decode
- RF read and execute
- RF write back
On-chip Memory Stack Cache
- Large cache
- Three-port memory
- Additional pipeline stage
- Komodo, FemtoJava

Pipeline stages:
- Instruction fetch
- Instruction decode
- Memory read
- Execute
- Memory write back

JVM Stack Access Revised

ALU operation
    A <- A op B
    B <- sm[p]
    p <- p - 1

Variable load (Push)
    A <- sm[v+n]
    B <- A
    sm[p+1] <- B
    p <- p + 1

Variable store (Pop)
    sm[v+n] <- A
    A <- B
    B <- sm[p]
    p <- p - 1

Legend:
- A is TOS
- B is TOS-1
- sm is stack array
- p points to TOS-2
- v points to local area
- n is the local offset
- op is a two operand stack operation
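The register transfers above can be exercised in software. The following is a minimal sketch, not the hardware description: the class name TwoLevelStackSketch, the memory size, and the placeholder values in A and B are all assumptions made for illustration. Each group of transfers is treated as simultaneous, so every right-hand side reads the old register values; the sketch uses temporaries to preserve that. Only commutative operations are used, so the operand order of op is not examined.

```java
import java.util.function.IntBinaryOperator;

public class TwoLevelStackSketch {
    int a;                          // register A: top of stack (TOS)
    int b;                          // register B: TOS-1
    final int[] sm = new int[32];   // stack memory; the size is arbitrary here
    int p;                          // index of TOS-2 in sm
    int v;                          // index of the first local variable in sm

    // ALU operation: A <- A op B, B <- sm[p], p <- p - 1
    void alu(IntBinaryOperator op) {
        int newA = op.applyAsInt(a, b);
        int newB = sm[p];
        a = newA;
        b = newB;
        p = p - 1;
    }

    // Variable load (push): A <- sm[v+n], B <- A, sm[p+1] <- B, p <- p + 1
    void load(int n) {
        int newA = sm[v + n];
        sm[p + 1] = b;
        b = a;
        a = newA;
        p = p + 1;
    }

    // Variable store (pop): sm[v+n] <- A, A <- B, B <- sm[p], p <- p - 1
    void store(int n) {
        int newB = sm[p];
        sm[v + n] = a;
        a = b;
        b = newB;
        p = p - 1;
    }

    // Runs A = B + C * D with the locals A..D in slots 0..3.
    static TwoLevelStackSketch runExample() {
        TwoLevelStackSketch m = new TwoLevelStackSketch();
        m.v = 0;
        m.sm[1] = 2;                // local B
        m.sm[2] = 3;                // local C
        m.sm[3] = 4;                // local D
        m.p = 3;                    // memory is occupied up to the last local
        m.b = -1;                   // placeholder words standing in for whatever
        m.a = -2;                   // is on top of the stack before the expression

        m.load(1);                  // iload_1
        m.load(2);                  // iload_2
        m.load(3);                  // iload_3
        m.alu((x, y) -> x * y);     // imul
        m.alu((x, y) -> x + y);     // iadd
        m.store(0);                 // istore_0
        return m;
    }

    public static void main(String[] args) {
        TwoLevelStackSketch m = runExample();
        System.out.println("A = " + m.sm[0] + ", p = " + m.p);  // prints A = 14, p = 3
    }
}
```

After the sequence, local A holds 2 + 3 * 4 = 14, and p, A and B are back at their starting values, as expected for a balanced expression.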
Do we need a 3-port memory?
- Stack operation:
  - Dual read from TOS and TOS-1
  - Write to TOS
- Variable load/store:
  - One read port
  - One write port
- TOS and TOS-1 as registers
- Deeper locations as on-chip memory

Two-Level Stack Cache
- Dual read only from TOS and TOS-1
- Two registers (A/B)
- Dual-port memory
- Simpler pipeline
- No forwarding logic

Pipeline stages:
- Instruction fetch
- Instruction decode
- Execute, load or store

Stack Caches Compared

Design        Cache (LC)   Cache (bit)   fmax (MHz)   Size (word)
ALU           -            -             237          -
16 register   707          0             110          16
RAM           111          8192          153          128
Two-level     112          4096          213          130
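The port argument behind the two-level design can be tallied from the register transfers listed under "JVM Stack Access Revised". The sketch below counts only accesses to the stack memory sm, since A and B are registers. The class name PortCountSketch and the enum are hypothetical; the counts are read off the transfer lists and are not measured from hardware.

```java
public class PortCountSketch {
    // Stack-memory accesses per cycle for each operation class.
    enum Op {
        ALU(1, 0),    // reads sm[p] to refill B; the result goes to register A
        LOAD(1, 1),   // reads sm[v+n]; writes the old B to sm[p+1]
        STORE(1, 1);  // reads sm[p] to refill B; writes A to sm[v+n]

        final int reads;
        final int writes;

        Op(int reads, int writes) {
            this.reads = reads;
            this.writes = writes;
        }
    }

    // Largest number of memory reads any operation needs in one cycle.
    static int maxReads() {
        int max = 0;
        for (Op op : Op.values()) {
            max = Math.max(max, op.reads);
        }
        return max;
    }

    // Largest number of memory writes any operation needs in one cycle.
    static int maxWrites() {
        int max = 0;
        for (Op op : Op.values()) {
            max = Math.max(max, op.writes);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("max reads per cycle: " + maxReads());
        System.out.println("max writes per cycle: " + maxWrites());
    }
}
```

No operation needs more than one read and one write of the stack memory in the same cycle, so one read port and one write port are enough.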
Summary
- The JVM is a stack machine
- Stack and local variables need caching
- Two-level cache
  - Two top levels as registers
  - Rest as on-chip memory (two ports)
- Small design
- Short pipeline
Further Information
- JOP Thesis: p 78-93
- Martin Schoeberl, Design and Implementation of an Efficient Stack Machine, In Proceedings of the 12th IEEE Reconfigurable Architecture Workshop, RAW 2005, Denver, Colorado, USA, April 2005.