WebSphere Real Time: Predictable Performance in Realtime Java Applications
White paper
Creating predictable-performance Java applications in real time.
March 2007

Contents
Executive summary
What is a real-time application?
Can Java technology be used for real-time applications?
RTSJ: Addressing the challenges of real-time environments
WebSphere Real Time: A robust tool for managing real-time environments
Real-time Linux
Real-time garbage collection: Metronome
Real-time compilation
Real-time middleware
Practical applications
Summary
For more information

Executive summary
This white paper provides a short primer on real-time applications and the issues and concerns with using a standard Java™ Virtual Machine (JVM) to run them. It then describes how IBM has addressed most of these problems in the IBM WebSphere® Real Time product. In particular, this white paper discusses the innovations made in each of the core components of the new real-time JVM, including the Metronome garbage collector, the J9 JVM, IBM ahead-of-time (AOT) and just-in-time (JIT) compilers, the extensions to IBM's core class libraries and the new class libraries provided as part of IBM Real-Time Specification for Java (RTSJ) support.
What is a real-time application?
7 Real-time Linux
Real-time is a particularly broad term that is used to describe applications that
10 Real-time garbage collection:
have real-world timing requirements. For example, a sluggish user interface
Metronome
does not satisfy the generic real-time requirements of an average user. This
15 Real-time compilation
form of application is often described as a soft real-time application, because no
16 Real-time middleware
harm comes from the application being slow to respond, other than loss of sales
18 Practical applications
for a poor product. The same requirement might be more explicitly phrased as
19 Summary
“The application should not take more than a quarter of a second to respond to
19 For more information
a mouse click.” If the requirement is not met, it is a soft failure — the application
can continue and the user, although unhappy, can still use the application. In
contrast, applications where real-world timing requirements must be strictly
met are typically called hard real-time applications. An application controlling
the rudder of an airplane, for example, cannot be delayed for any reason
because the result would be catastrophic.
This white paper describes the WebSphere Real Time product, which can
provide hard response-time guarantees for real-time Java applications
requiring responses of tens of microseconds and more.
Garbage collection
Another source of frustration for hard real-time programmers using Java is
garbage collection. Errors introduced by the need to explicitly manage memory
in languages such as C and C++ are some of the most difficult problems to
diagnose. Proving the absence of such errors when an application is deployed is
also a fundamental challenge. One of the major strengths of the Java program-
ming model is that the JVM, not the application, handles memory management,
which helps eliminate this burden for the application programmer.
Unfortunately, traditional garbage collectors can incur very large application
delays that are virtually impossible for the application programmer to predict.
Delays of several hundred milliseconds are not unusual. One way to solve this
problem is to prevent garbage collections by creating a set of objects that are
reused, helping to ensure that the Java heap memory is never exhausted. In
practice, this approach generally fails because it prevents programmers from
using many of the class libraries provided in the Java Development Kit (JDK)
and by other class vendors, which typically create many temporary objects.
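The reuse approach can be sketched as a simple pool. The `MessagePool` class below is illustrative, not part of any IBM API; it also shows why the scheme is fragile: any path that fails to return a borrowed object, or any library call that allocates its own temporaries, breaks the no-allocation guarantee.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal object pool: instances are recycled instead of being left
// for the garbage collector. The scheme works only if every code path
// returns what it borrows -- library calls that create their own
// temporary objects defeat it, as the text notes.
public class MessagePool {
    private final Deque<StringBuilder> free = new ArrayDeque<>();

    public MessagePool(int capacity) {
        for (int i = 0; i < capacity; i++) {
            free.push(new StringBuilder(256)); // preallocate up front
        }
    }

    public StringBuilder borrow() {
        if (free.isEmpty()) {
            throw new IllegalStateException("pool exhausted");
        }
        return free.pop();
    }

    public void release(StringBuilder b) {
        b.setLength(0); // reset state before reuse
        free.push(b);
    }

    public int available() {
        return free.size();
    }
}
```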
Thread management
Standard Java does not provide any guarantees for thread scheduling or thread
priorities. An application that must respond to events in a well-defined time has
no way to ensure that another low-priority thread won’t get scheduled in front of
a high-priority thread. To compensate, a programmer would have to partition an application into a set of separate applications that the operating system can then run at different priorities. This approach would increase overhead and make communication between the partitioned applications far more challenging.
Scheduling
Real-time systems need to control how threads will be scheduled and guarantee
that, given the same conditions, threads are scheduled in a predictable way.
Although the Java Class Library (JCL) includes the concept of thread priorities,
the JVM is not required to enforce priorities. In addition, non-real-time Java
implementations typically use a round-robin preemptive scheduling approach
with unpredictable scheduling order. With the RTSJ, true priorities and a fixed-priority preemptive scheduler with priority-inheritance support are required for real-time threads. This scheduling approach helps ensure that the highest-priority runnable thread is always the one running, and it will continue to run until it voluntarily releases the processor or is preempted by a higher-priority thread. Priority inheritance helps avoid priority inversion when a higher-priority thread needs a resource held by a lower-priority thread.
Threads
The RTSJ adds support for two new thread classes: RealtimeThreads and
NoHeapRealtimeThreads (NHRTs). These new thread classes provide support
for priorities, periodic behavior, deadlines with handlers that can be triggered
when the deadline is exceeded, and the use of memory areas other than the
heap. NHRTs cannot access the heap, and so, unlike other types of threads,
NHRTs do not need to be interrupted or preempted by garbage collection.
Real-time systems typically use NHRTs with high priorities for tasks with
the tightest latency requirements, RealtimeThreads for tasks with latency
requirements that can be accommodated by a garbage collector and regular
Java threads for everything else.
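An RTSJ RealtimeThread with PeriodicParameters is released at a fixed period and can register a deadline-miss handler. Outside an RTSJ JVM that behavior can only be approximated with standard Java; the sketch below uses a scheduled executor and is best-effort only, since a plain JVM enforces neither priorities nor deadlines.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Best-effort approximation of an RTSJ periodic thread on a standard
// JVM. A real RealtimeThread would take PeriodicParameters and a
// deadline-miss handler and run under a true fixed-priority scheduler;
// here the priority is only a hint to the operating system.
public class PeriodicTask {
    public static int runPeriodically(Runnable logic, long periodMillis, int releases) {
        AtomicInteger count = new AtomicInteger();
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setPriority(Thread.MAX_PRIORITY); // a hint, not a guarantee
            return t;
        });
        exec.scheduleAtFixedRate(() -> {
            logic.run();
            if (count.incrementAndGet() >= releases) {
                exec.shutdown(); // cancels further periodic releases
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            exec.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return count.get();
    }
}
```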
Memory management
Although many real-time systems can tolerate the small delays resulting from
a deterministic garbage collector, there are cases where even these delays are
not acceptable. The RTSJ defines immortal- and scoped-memory areas to
supplement the standard Java heap. Objects allocated in the immortal-memory
area are accessible to all threads and are never collected, representing a limited
resource to use carefully. Scoped-memory areas can be created and destroyed
under programmer control. Each scoped-memory area is allocated with a
maximum size and can be used for object allocation. To help ensure the
integrity of references between objects, rules govern how objects in one memory
area (heap, scope or immortal) can refer to objects in another memory area.
More rules define when the objects in a scope are finalized and when the
memory area can be reused. Because of these complexities, the use of immortal
and scoped memory should be limited to components that cannot tolerate
garbage-collection pauses.
Synchronization
Synchronization must be carefully managed within a real-time system to
help prevent high-priority threads from waiting for lower-priority threads.
The RTSJ includes priority inheritance support to manage synchronization
when it occurs, and provides the ability for threads to communicate without
synchronization using wait-free read and write queues.
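The RTSJ wait-free queue classes are unavailable outside an RTSJ JVM, but the communication pattern they enable can be approximated with a standard non-blocking queue, as in this sketch (the `WaitFreeHandoff` class is illustrative):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Lock-free hand-off between a high-priority producer and a
// lower-priority consumer. Neither side ever blocks on a monitor, so
// the high-priority thread cannot stall behind a lower-priority lock
// holder -- the purpose the RTSJ wait-free queues serve for
// NoHeapRealtimeThreads.
public class WaitFreeHandoff {
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();

    public void publish(String event) {
        queue.offer(event); // non-blocking, always succeeds
    }

    public String poll() {
        return queue.poll(); // non-blocking, null if empty
    }
}
```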
Asynchrony
Real-time systems often manage and respond to asynchronous events. The
RTSJ includes support for handling asynchronous events triggered by a number
of sources including timers, operating system signals, missed deadlines and
other application-defined events.
Real-time Linux
The real-time Linux kernel is created from the mainline Linux kernel with
some patches applied to help reduce latency for real-time applications and
improve kernel performance. The patches address many of the real-time
programming issues such as timing, interrupt latency, task scheduling and
kernel preemption. Some of the major advances the real-time Linux kernel
has made in helping to reduce latency are discussed in this section.
Interrupt handlers are another point of latency caused by the lack of preemption.
To minimize latency, real-time Linux allows a real-time process to preempt
interrupt handling by converting interrupt handlers into real-time kernel
threads. This capability enables them to be scheduled, preempted and
prioritized just like any other process. Thus, the only non-preemptible portion
of interrupt handling is the few instructions that run in interrupt context to
mark the interrupt handler thread as runnable.
Priority inheritance
Priority inheritance is the real-time kernel response to priority inversions.
In the case where a low-priority process holds a lock that a high-priority
process is blocked on, it is possible to indefinitely delay both the low- and
high-priority processes with a processor-intensive, medium-priority process.
The real-time kernel doesn’t try to detect the priority inversions. Instead, it
avoids priority inversions by raising the priority of the process that owns the
lock to be the same as that of the highest-priority process that is being blocked
on that particular lock, until the process relinquishes the lock. In this manner,
blocked high-priority processes are delayed no longer than absolutely
necessary. The kernel uses priority-inheritance mutexes internally to avoid
priority inversions inside the kernel.
The ratio of time spent in the application over a given window of time is known as utilization. Expressing the guarantee in these terms lets an application developer determine whether real-time task requirements can be met at a particular utilization level in a system. Tasks are measured over a sliding window of time; applications with pause-time requirements tighter than the collector can satisfy should use the RTSJ facilities (NoHeapRealtimeThreads and non-heap memory areas) instead. The Metronome garbage collector achieves this capability by keeping individual pause increments in the garbage-collection cycle short, as well as targeting a utilization rate over a window of time.
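The utilization contract reduces to simple arithmetic. The figures below (a 10-millisecond window and a 70 percent application target) are illustrative values, not Metronome defaults:

```java
// Utilization = application time / window time. Given a target
// utilization and a window length, the remainder of the window is the
// collector's budget, spent as many short pause quanta.
public class Utilization {
    // Milliseconds of CPU available to the collector per window.
    public static double collectorBudgetMs(double windowMs, double targetUtilization) {
        return windowMs * (1.0 - targetUtilization);
    }
}
```

For example, a 10 ms window with a 70 percent application target leaves the collector 3 ms of each window.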
Root scanning
Work units within a garbage-collection quantum can consist of a number of
different operations. Generally, each is a known measurable quantity with
maximum path lengths whose cost can be evaluated to determine whether the
garbage collector should proceed or yield. However, there are some work units
whose cost cannot be easily ascertained, and these cases should be
guarded against when writing application code. These problematic cases
relate to threads and their corresponding structures. Thread stacks can be
complicated and time consuming to scan, and sufficiently deep stacks can be
the source of outliers in garbage-collection pause times. Thread-local Java
Native Interface (JNI) references, along with the thread stacks, must be
scanned as a single atomic unit. If there is a sufficiently large number
of JNI local references on a thread, the pause times could exceed the targeted
value for a quantum.
Allocation
Allocation of objects in the Metronome garbage collector is performed using
segregated free lists to manage the available memory.4 The heap is divided into
a series of evenly sized pages that represent a size class from which objects can
be allocated. These heap pages are used to create individual units of work so
that the Metronome garbage collector can schedule operations on a page with
predictable time requirements to complete the operations. The page- and
size-class splitting is calculated so that in a worst-case scenario, no more than
one-eighth of the heap (12.5 percent) would be lost because of fragmentation
or unused ranges of memory due to objects smaller than the size class being
allocated. In practice, this number rarely exceeds two percent.5
Arraylets
An area of concern in any collector is the handling of large objects, particularly
arrays. Although enough total free memory might be available to handle an
allocation, there might not be enough contiguous free memory within which
to lay the object out, in which case, the garbage collector performs a heap
compaction, which can be time consuming and not easily incrementalized. The
Metronome garbage collector uses an array-splitting technique called arraylets
to lay array objects out in memory. Arraylets are hierarchical representations of
arrays that enable array memory to be allocated individually (leaves) with a
central object representing the entire array (spine). By splitting the array up
into separately allocatable chunks, you can take advantage of the heap layout to
avoid the need for contiguous storage for large objects, and consequently avoid
having to start and complete garbage-collection cycles for the sole purpose of
freeing memory to satisfy the allocation.
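The spine-and-leaves layout can be illustrated in plain Java. The leaf size and class below are illustrative; the real implementation works directly on heap pages rather than on nested Java arrays:

```java
// Arraylet sketch: a large int array is stored as fixed-size leaves
// referenced from a spine, so no single contiguous block of memory is
// needed. Element i lives in leaf i / LEAF_SIZE at offset
// i % LEAF_SIZE.
public class Arraylet {
    private static final int LEAF_SIZE = 256; // elements per leaf (illustrative)
    private final int[][] spine;
    private final int length;

    public Arraylet(int length) {
        this.length = length;
        int leaves = (length + LEAF_SIZE - 1) / LEAF_SIZE;
        spine = new int[leaves][];
        for (int i = 0; i < leaves; i++) {
            spine[i] = new int[LEAF_SIZE]; // each leaf allocated separately
        }
    }

    public int get(int i) {
        checkBounds(i);
        return spine[i / LEAF_SIZE][i % LEAF_SIZE];
    }

    public void set(int i, int value) {
        checkBounds(i);
        spine[i / LEAF_SIZE][i % LEAF_SIZE] = value;
    }

    private void checkBounds(int i) {
        if (i < 0 || i >= length) throw new ArrayIndexOutOfBoundsException(i);
    }
}
```

The cost of this layout is the extra indirection on each element access, which the compilers must make as cheap as possible.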
Write barriers
The Metronome garbage collector is an incremental collector that achieves
a full collection by stopping the virtual machine at consistent intervals and
performing a small amount of work in each interval. The Metronome garbage
collector uses a variant of the Yuasa snapshot-at-the-beginning method,6 which incurs a small overhead on each reference store into the heap; as references between objects are created and destroyed, the virtual machine records these changes for the garbage collector to reconcile. A nonincremental garbage collector would not incur this management overhead.
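A snapshot-at-the-beginning barrier can be sketched as follows (the field name, `Node` class and remembered-list representation are illustrative): before a reference field is overwritten while a collection cycle is active, the old value is recorded, so the collector still traces the object graph as it existed when the cycle began even if the mutator unlinks objects mid-cycle.

```java
import java.util.ArrayList;
import java.util.List;

// Yuasa-style snapshot-at-the-beginning write barrier (sketch). While
// a collection cycle is active, the overwritten reference is recorded
// before every store, so no object reachable at the start of the cycle
// is missed.
public class SnapshotBarrier {
    public static class Node {
        public Node next;
    }

    private boolean cycleActive;
    private final List<Node> remembered = new ArrayList<>();

    public void startCycle() { cycleActive = true; }

    // All reference stores go through the barrier instead of a plain
    // assignment -- this is the per-store overhead the text describes.
    public void writeNext(Node obj, Node newValue) {
        if (cycleActive && obj.next != null) {
            remembered.add(obj.next); // snapshot the old reference
        }
        obj.next = newValue;
    }

    public int rememberedCount() { return remembered.size(); }
}
```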
Real-time compilation
Most JVMs employ a JIT compiler to generate native code for frequently used
methods as the application runs for several reasons: to eliminate the overhead
of bytecode interpretation, to take advantage of the strengths of the processor’s
native instruction set and to exploit dynamic application characteristics
observed when a particular program runs. Modern JIT compilers use the same
technology developed to compile static languages, such as C/C++ or Fortran,
as well as new technologies targeted at optimizing the performance of Java
programs. These new technologies are often speculative in nature because they
must account for the dynamic class-loading support required by the Java
language,7 and they often sacrifice worst-case performance to improve
average-case performance. This focus on average-case performance is one
reason why traditional Java JIT compilers cannot be used in a real-time
environment, where worst-case performance is a critical metric. In addition,
traditional JIT compilers run at the same time as the application,
randomly consuming resources to compile methods and shattering the
predictability required for real-time applications.
WebSphere Real Time provides two forms of native compilation suitable for
different classes of real-time applications. The first form is a JIT compiler that
has been adapted to avoid speculative optimizations and run at a priority level
below real-time tasks. The second form is an AOT compiler that generates Java
technology-conformant native code before the program runs.
Because the Java Language Specification7 requires dynamic class resolution,
AOT-compiled code cannot include any assumptions about field offsets
within objects, or the targets of invocations. Therefore, AOT-compiled code is
generally slower than JIT-compiled code, because many compiler optimizations
rely on precise information about fields and methods. Nonetheless, AOT-
compiled code is almost always faster than bytecode interpretation, so in cases
where a JIT compiler is not viable, AOT compilation is frequently a desirable
alternative. Furthermore, because AOT compilation time is not a runtime
cost, more methods can be compiled with an AOT compiler than with a JIT
compiler, which can result in an AOT-compiled application running faster
than a JIT-compiled application.
Real-time middleware
A key component of any deterministic enterprise solution is the real-time
messaging middleware. This software is the backbone by which real-time,
critical service providers, service consumers and complex event-processing
applications can communicate. These middleware applications have well-
defined quality-of-service agreement policies that establish the worst-case
total time from start to completion of a message or event. Without these well-
defined quality-of-service policies, latency determinism and predictability are
unachievable for either a node-to-node or end-to-end real-time, critical
business process.
As part of a total real-time enterprise solution, IBM has teamed with key
real-time middleware technology providers whose solutions and products
are based on open standards from the Object Management Group (OMG).
These providers include Real Time Innovations (RTI) with their Data
Distribution Service (DDS) technology-based real-time middleware, called
Naval Data Distribution Service (NDDS, Version 4.1), and PrismTech’s
OpenFusion RTOrb.
Figure 2 depicts how C++ and WebSphere Real Time technology-based Java
applications can communicate across different languages using the NDDS
real-time middleware environment. Here, both C++ and real-time Java
components are anchored to different domain containers that publish or subscribe
to services. Data can be sent to one or more subscribers by a publisher as long
as they have the same topic. The domain container also defines the quality-of-
service policy parameters and agreements between a set of nodes, as well as
communication between nodes (such as publish, subscribe or both).
[Figure 2: C++ and real-time Java applications communicate through a domain container. The container binds topics (A, B and C) to individual applications, defines topic formats, and sets the quality-of-service policy for latency determinism within a domain and between nodes; domain participants exchange data through the network stack.]
Practical applications
Many features of WebSphere Real Time are useful to programmers who need
to target a traditional operating system. Incremental garbage collection and
priority-based threads would clearly be useful in many applications, even
if hard real-time guarantees could not be met and only soft real-time
performance were available. For example, many developers would welcome an application server that could provide predictable performance without unpredictable garbage-collection delays. Similarly, enabling applications to
run high-priority Java health-monitor threads with reasonable scheduling
guarantees would make Java server development easier.
Summary
This white paper defines soft and hard real-time applications and predictable
performance, and presents the features of traditional JVMs that create
unpredictable delays while an application runs, including class loading,
compilation, garbage collection and thread management. It also discusses how
the WebSphere Real Time solution, in conjunction with a Linux distribution
containing real-time capabilities, and tools from the IBM alphaWorks Web site,
addresses each of these issues. Static precompilation of code helps ensure that
no compilation is required at run time.
For more information
ibm.com/software/webservers/realtime/
www.websphere.org
© Copyright IBM Corporation 2007
IBM Corporation
Software Group
Route 100
Somers, NY 10589
U.S.A.
WSW11305-USEN-00