How The Java Virtual Machine
Registers
The registers of the Java Virtual Machine are similar to the registers in our computer. However, because the Virtual Machine is stack based, its registers are not used for passing or receiving arguments. Instead, the registers hold the machine's state, and are updated after each byte code instruction is executed to maintain that state. The following four registers hold the state of the virtual machine:
frame, the reference frame, contains a pointer to the execution environment of the current method.
optop, the operand top, contains a pointer to the top of the operand stack, which is used to evaluate arithmetic expressions.
pc, the program counter, contains the address of the next byte code to be executed.
vars, the variable register, contains a pointer to the local variables.
All these registers are 32 bits wide, and are allocated immediately. This is possible because the compiler knows the size of the local variables and the operand stack, and because the interpreter knows the size of the execution environment.
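The way the four registers can be set as soon as the sizes are known can be pictured with a small toy model. This is only a sketch under stated assumptions: the class name ToyRegisters, the flat memory array, and the layout order (local variables, then execution environment, then operand stack) are all hypothetical and chosen for illustration, not taken from any real Virtual Machine.

```java
// Toy model of the four registers. Illustrative only: the flat array and the
// layout order [locals | execution environment | operand stack] are assumptions.
public class ToyRegisters {
    public int pc;      // index of the next byte code to execute
    public int optop;   // index one past the top of the operand stack
    public int vars;    // index of the first local variable
    public int frame;   // index of the current method's execution environment

    public final int[] memory = new int[16]; // hypothetical frame storage

    // Because the sizes are known up front, every register can be set at once.
    public ToyRegisters(int localCount, int envSize) {
        vars = 0;
        frame = localCount;
        optop = localCount + envSize; // operand stack starts out empty
        pc = 0;
    }

    // Pushing and popping only move optop; vars and frame stay fixed.
    public void push(int value) { memory[optop++] = value; }
    public int pop() { return memory[--optop]; }

    public static void main(String[] args) {
        ToyRegisters r = new ToyRegisters(2, 3);
        System.out.println("vars=" + r.vars + " frame=" + r.frame + " optop=" + r.optop);
        r.push(42);
        System.out.println("optop after push=" + r.optop);
    }
}
```

With two local variables and an assumed three-slot execution environment, the sketch prints vars=0 frame=2 optop=5, and optop moves to 6 after one push.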
The Stack
The Java Virtual Machine uses an operand stack to supply parameters to methods and operations, and to receive results back from them. Byte code instructions take their operands from the stack, operate on them, and return results to the stack. Like the registers in the Virtual Machine, the operand stack is 32 bits wide. The operand stack follows the last-in first-out (LIFO) methodology, and expects the operands on the stack to be in a specific order. For example, the isub byte code instruction expects two integers to be stored on the top of the stack, which means that the operands must have been pushed there by the preceding instructions. isub pops the operands off the stack, subtracts them, and then pushes the result back onto the stack.
In Java, integers are a primitive data type. Each primitive data type has its own instructions for operating on operands of that type. For example, the lsub byte code is used to perform long integer subtraction, the fsub byte code is used to perform floating-point subtraction, and the dsub byte code is used to perform double-precision floating-point subtraction. Because of this, it is illegal to push two integers onto the stack and then treat them as a single long integer. However, it is legal to push a 64-bit long integer onto the stack and have it occupy two 32-bit slots.
Each method in our Java program has a stack frame associated with it. The stack frame holds the state of the method with three sets of data: the method's local variables, the method's execution environment, and the method's operand stack. Although the sizes of the local variable and execution environment data sets are fixed at the start of the method call, the size of the operand stack changes as the method's byte code instructions are executed. Because the Java stack is 32 bits wide, 64-bit numbers are not guaranteed to be 64-bit aligned.
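The behavior of isub and the two-slot storage of a long can be imitated in a few lines of Java. This is a minimal sketch, not the Virtual Machine's actual implementation: the class OperandStackSketch and its method names are hypothetical, and storing the high word of a long before the low word is an assumption made for the example.

```java
// Toy operand stack made of 32-bit slots. Hypothetical names throughout.
public class OperandStackSketch {
    private final int[] slots = new int[8]; // each slot is 32 bits wide
    private int optop = 0;                  // index of the next free slot

    public void pushInt(int v) { slots[optop++] = v; }
    public int popInt() { return slots[--optop]; }

    // A 64-bit long occupies two 32-bit slots.
    // Assumption: high word is pushed first, low word second.
    public void pushLong(long v) {
        pushInt((int) (v >>> 32));
        pushInt((int) v);
    }

    public long popLong() {
        long low = popInt() & 0xFFFFFFFFL; // mask to avoid sign extension
        long high = popInt();
        return (high << 32) | low;
    }

    // isub: the top of the stack is the second operand, the slot below it
    // is the first; the result pushed back is first - second.
    public void isub() {
        int value2 = popInt();
        int value1 = popInt();
        pushInt(value1 - value2);
    }

    public int depth() { return optop; }

    public static void main(String[] args) {
        OperandStackSketch s = new OperandStackSketch();
        s.pushInt(10);
        s.pushInt(3);
        s.isub();
        System.out.println(s.popInt());  // prints 7
        s.pushLong(-5_000_000_000L);
        System.out.println(s.depth());   // prints 2: one long, two slots
        System.out.println(s.popLong()); // prints -5000000000
    }
}
```

Note that the order of the pushes matters: pushing 10 and then 3 yields 7, not -7, because isub subtracts the top value from the one beneath it.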
System.out.println("Hello world!");
At compile time, the Java compiler converts the single-line print statement to the following byte code:
0 getstatic #6 <Field java.lang.System.out Ljava/io/PrintStream;>
3 ldc #1 <String "Hello world!">
5 invokevirtual #7 <Method java.io.PrintStream.println(Ljava/lang/String;)V>
8 return
The JDK provides a tool for examining byte code called the Java class file disassembler. We can run the disassembler by typing javap at the command line; the -c option, followed by a class name, prints the byte code of each method in that class. Because the byte code instructions are in such a low-level format, our programs can execute at close to the speed of programs compiled to machine language. All instructions in machine language are represented by streams of 0s and 1s. In a low-level language, these streams of 0s and 1s are replaced by suitable mnemonics, such as the byte code instruction isub. As with assembly language, the basic format of a byte code instruction is:
<operation> <operand(s)>
Therefore, an instruction in the byte code instruction set consists of a 1-byte opcode specifying the operation to be performed, and zero or more operands that supply parameters or data that will be used by the operation.
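The opcode-plus-operands format can be seen by decoding the raw bytes of the Hello world listing by hand. The sketch below is a toy decoder, not a replacement for javap: the class ByteCodeDecoder and its methods are hypothetical, it recognizes only the five opcodes it names, and the byte array assumes the constant pool indexes (#6, #1, #7) shown in the listing above. The opcode values themselves (0x12 for ldc, 0x64 for isub, 0xB1 for return, 0xB2 for getstatic, 0xB6 for invokevirtual) are the standard ones.

```java
import java.util.ArrayList;
import java.util.List;

// Toy decoder: reads a 1-byte opcode, then the operand bytes that opcode expects.
public class ByteCodeDecoder {

    public static List<String> decode(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // offset of the next instruction
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x12: // ldc: one operand byte (constant pool index)
                    lines.add(pc + " ldc #" + (code[pc + 1] & 0xFF));
                    pc += 2;
                    break;
                case 0x64: // isub: no operand bytes, works on the stack
                    lines.add(pc + " isub");
                    pc += 1;
                    break;
                case 0xB1: // return: no operand bytes
                    lines.add(pc + " return");
                    pc += 1;
                    break;
                case 0xB2: // getstatic: two operand bytes (constant pool index)
                    lines.add(pc + " getstatic #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                case 0xB6: // invokevirtual: two operand bytes (constant pool index)
                    lines.add(pc + " invokevirtual #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                default:
                    throw new IllegalArgumentException(
                        "opcode not handled by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    // Combine two operand bytes into one unsigned 16-bit index (big-endian).
    private static int u2(byte[] code, int at) {
        return ((code[at] & 0xFF) << 8) | (code[at + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] hello = {
            (byte) 0xB2, 0x00, 0x06, // getstatic #6
            0x12, 0x01,              // ldc #1
            (byte) 0xB6, 0x00, 0x07, // invokevirtual #7
            (byte) 0xB1              // return
        };
        for (String line : decode(hello)) {
            System.out.println(line);
        }
    }
}
```

The offsets it prints (0, 3, 5, 8) match the left-hand column of the disassembler listing, which is simply the running total of opcode and operand bytes.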
Summary
The Java Virtual Machine exists only in the memory of our computer. Reproducing a machine within our computer's memory requires seven key objects: a set of registers, a stack, an execution environment, a garbage-collected heap, a constant pool, a method storage area, and a mechanism to tie it all together. This mechanism is the byte code instruction set. To examine byte code, we can use the Java class file disassembler, javap. By examining byte code instructions in detail, we gain valuable insight into the inner workings of the Java Virtual Machine and Java itself. Each byte code instruction performs a specific function of extremely limited scope, such as pushing a value onto the stack or popping a value off the stack. Combinations of these basic functions represent the complex high-level tasks defined as statements in the Java programming language. As amazing as it seems, sometimes dozens of byte code instructions are used to carry out the operation specified by a single Java statement. When these byte code instructions are used with the seven key objects of the Virtual Machine, Java gains its platform independence and becomes a powerful and versatile programming language.