Java Bytecode: What's Inside Class Files
Java Bytecode: What's Inside Class Files
MIT-Africa Internet
Technology Initiative
©2005
Typical Software Production
• Source code is ported to different
platform-specific sources.
• Each port is compiled with a platform-
specific compiler.
• Each compiler produces platform-
specific machine code (or binaries).
• Binaries execute on a single platform.
MIT-Africa Internet
Technology Initiative
©2005
Write Once, Run Anywhere
• Java source code is is compiled into
machine-independent bytecode class
files.
• javac command compiles .java source
code into .class bytecode.
• Bytecode is interpreted by machine-
specific Java Virtual Machines (JVMs).
• Bytecode consists of simple, step-by-
step instructions for the JVM.
MIT-Africa Internet
Technology Initiative
©2005
Java Virtual Machines
• JVM is a computer simulated in a
computer.
• JVMs are built into most web browsers.
• java command gives .class file to JVM.
• JVM interprets the bytecode into
instructions that a specific platform can
understand.
• Different JVMs for different platforms:
Windows, Mac OS X, Linux, cell
phones.
MIT-Africa Internet
Technology Initiative
©2005
Using javap
• javap is a program that can give you
information on .class files.
• If you get a class file without documentation,
javap can tell you what methods you can
call.
• javap can show you how javac compiles
your program into simple instructions.
• Good for high-performance optimizations or if
you don’t have access to an Application
Programming Interface (API).
MIT-Africa Internet
Technology Initiative
©2005
javap Output
• Given “Mystery.class”:
> javap Mystery
Compiled from "Mystery.java"
class Mystery extends java.lang.Object{
Mystery();
public static int unknown(int,int);
}
• One static method called “unknown” that
takes two integers and returns an integer.
MIT-Africa Internet
Technology Initiative
©2005
javap -c Output
• Using the “-c” flag will disassemble the
bytecode:
public static int unknown(int,int);
Code:
0: iload_0 Loads the first integer argument.
1: iload_1 Loads the second integer argument.
2: iadd Adds the two loaded integer values.
MIT-Africa Internet
Technology Initiative
©2005
public static java.lang.String
stringAdd(java.lang.String,java.lang.String,int);
Code:
0: new #2; //class StringBuffer
3: dup
4: invokespecial #3; //Method
java/lang/StringBuffer."<init>":()V
7: aload_0
8: invokevirtual #4; //Method
java/lang/StringBuffer.append:
(Ljava/lang/String;)Ljava/lang/StringBuffer;
11: aload_1
12: invokevirtual #4; //Method
java/lang/StringBuffer.append:
(Ljava/lang/String;)Ljava/lang/StringBuffer;
15: invokevirtual #5; //Method
java/lang/StringBuffer.toString:()Ljava/lang/String;
18: areturn
}
Example: String Appends
• All that bytecode is equivalent to:
StringBuffer temp;
temp = new StringBuffer();
temp.append(a);
temp.append(b);
return temp.toString();
• One line of code initializes an object
and makes three method calls.
MIT-Africa Internet
Technology Initiative
©2005
String Append Optimization
• What if we had:
• String a =“”; String b = “hello”;
for (int i=0; i<n; i++)
a += b;
• This code performs n initializations and
3n method calls.
• Using javap showed us there’s a class
called StringBuffer that has an append()
function.
MIT-Africa Internet
Technology Initiative
©2005
String Append Optimization
• A more efficient way:
StringBuffer aBuff =
new StringBuffer();
String b = “hello”;
for (int i=0; i<n; i++)
a.append(b);
String a = aBuff.toString();
• Only performs two initializations and
n+1 method calls -- three times faster.
MIT-Africa Internet
Technology Initiative
©2005
Questions
• What advantages are there in running
programs on simulated computers, like
JVMs?
• What disadvantages might there be?
• Could you compile Java directly into
native machine code like C or C++,
instead of using a JVM?
• Is it possible to generate Java source
code from class files, i.e. decompile?
MIT-Africa Internet
Technology Initiative
©2005