Real-Time Java™ Platform Programming
By Peter C. Dibble
Written for experienced Java platform developers, this practical guide provides a solid
grounding in real-time programming. Dibble, a member of the RTSJ expert group, starts
with an overview of real-time issues unique to the Java platform. He then explains how to
use each major feature of the RTSJ.
Table of Contents

Copyright
Preface
Introduction
Chapter 1. Landscape
    Java Technology and Real Time
    Definition of Real Time
    Java's Problem Domain
    Real-Time Java's Problem Domain
    Summary
Chapter 2. Architecture of the Java Virtual Machine
    Write Once, Run Anywhere—Maybe
    JVM Components
    Interpreter Implementation
Chapter 3. Hardware Architecture
    Worst-Case Execution of One Instruction
    Management of Troublesome Hardware
    Effects on the JVM
Chapter 4. Garbage Collection
    Reference Counting
    Basic Garbage Collection
    Copying Collectors
    Incremental Collection
    Generational Garbage Collection
    Real-Time Issues
Chapter 5. Priority Scheduling
    Scheduling Terms
    Execution Sequences
    Preemption
    Fixed versus Dynamic Priority
    Priority Inversion
    Why 32 Priorities?
    Problems with Priority Scheduling
Chapter 6. Scheduling with Deadlines
    Underlying Mechanism
    Scope of the Scheduler
    Some Systems
    Timing Is Usually Probabilistic
Chapter 7. Rate Monotonic Analysis
    Theorems
    Restrictions
Chapter 8. Introduction to the Real-Time Java Platform
    A Brief History of Real-Time Java
    Major Features of the Specification
    Implementation
    RTSJ Hello World
Chapter 9. Closures
    The Language Construct
    Java Closures
    Limitations of Closures
Chapter 10. High-Resolution Time
    Resolution
    The "Clock"
    HighResolutionTime Base Class
    Absolute Time
    Relative Time
    Rational Time
Chapter 11. Async Events
    Binding a Happening to an Event
    Basic Async Event Operation
    Async Events without Happenings
    Implementation Discussion
Chapter 12. Real-Time Threads
    Creation
    Scheduling
    Periodic Threads without Handlers
    Periodic Threads with Handlers
    Interactions with Normal Threads
    Changing the Scheduler
Chapter 13. Non-Heap Memory
    The Advantage of Non-Heap Memory
    The Allocation Regimes
    Rules
    Mechanisms for Allocating Immortal Memory
    Mechanisms for Allocating from Scoped Memory
    Using Nested Scoped Memory
    Using Shared Scoped Memory
    Fine Print
    Quick Examples
Chapter 14. Non-Heap Access
    Interaction with Scheduler
    Rules
    Samples
    Final Remarks
Chapter 15. More Async Events
    Async Events and the Scheduler
    The createReleaseParameters Method
    Bound Async Event Handlers
    Async Event Handlers and Non-Heap Memory
    No-Heap Event Handlers vs. No-Heap Threads
    Scheduling
    Async Event Handlers and Threads
    Special Async Events
Chapter 16. Reusing Immortal Memory
    Using Fixed-Object Allocators
    Recycling RT Threads
    Recycling Async Event Handlers
Chapter 17. Asynchronous Transfer of Control
    Thread Interrupt in Context
    Asynchronous Interrupt Firing
    Rules for Async Exception Propagation
    Noninterruptible Code
    Legacy Code
    Use of ATC for Thread Termination
Chapter 18. Physical Memory
    Physical and Virtual Memory
    Physical Memory Manager
    Immortal Physical Memory
    Scoped Physical Memory
Chapter 19. Raw Memory Access
    Security
    Peek and Poke
    Get/Set Methods
    Mapping
    The RawMemoryFloatAccess Class
Chapter 20. Synchronization without Locking
    Principles of Wait-Free Queues
    The Wait-Free Write Queue
    The Wait-Free Read Queue
    The Wait-Free Double-Ended Queue
    No-Wait Queues and Memory
    Implementation Notes
Chapter 21. Recommended Practices
    Powerful and Easy-to-Use Features of the RTSJ
    Very Powerful and Dangerous Features of the RTSJ
    Very Powerful and Finicky Features of the RTSJ
    Selection of Priorities
Index
Copyright
© 2002 Sun Microsystems, Inc.—
94303 U.S.A.
All rights reserved. This product and related documentation are protected by copyright and
distributed under licenses restricting its use, copying, distribution, and decompilation. No part of
this product or related documentation may be reproduced in any form by any means without prior
written authorization of Sun and its licensors, if any.
The products described may be protected by one or more U.S. patents, foreign patents, or pending
applications.
TRADEMARKS—HotJava, Java, Java Development Kit, Solaris, SPARC, SunOS, and Sunsoft
are trademarks of Sun Microsystems, Inc.
The publisher offers discounts on this book when ordered in bulk quantities. For more information,
contact Corporate Sales Department, Prentice Hall PTR, One Lake Street, Upper Saddle River,
NJ 07458. Phone: 800-382-3419; FAX: 201-236-7141. Email: [email protected].
10 9 8 7 6 5 4 3 2 1
Preface
Real-time computing—computing with deadlines—is a field that involves every programmer, but
almost nobody gives it serious attention. The machine tool that controls a flying saw blade is a
real-time problem. So is a web site that guarantees to respond to queries within two seconds, or
the text editor that must respond to each keystroke within a tenth of a second to keep its user
comfortable. In the broadest sense, even the biweekly payroll run is a real-time system.
To the extent that typical programmers worry about timeliness, they think in terms of high-
performance algorithms, optimizing compilers, and fast processors. Those are important
considerations, but they ignore the question of consistent performance. There is a whole family
of things that can make timing undependable, and avoiding those problems is the province of real-
time programming.
Over the years I've been working on this book, the fastest Java Virtual Machines have improved to
the point where the performance of the Java platform can reliably match or exceed the
performance of C++, but that is material for a different book. A real-time programmer is certainly
interested in the time it takes to complete a computation, but the important question is whether it
will always complete on time. Are all the factors that could delay completion properly accounted
for?
The Real Time Specification for Java (RTSJ) focuses on the factors that matter to systems that
must meet deadlines, primarily time itself and things that could cause unexpected delays. This
book focuses on the same things.
This book has to be dedicated to the other members of the Real Time for Java Expert Group. Others
contributed—family, editors, employers, and friends—but the six other "experts" who saw the
effort through are special. Thank you Greg Bollella, Ben Brosgol, Steve Furr, James Gosling,
David Hardin, and Mark Turnbull. Without you the spec would not be there, this book would not
be here, and I would have missed what may have been the most exciting spec-writing exercise in
history. By my count, we spent more than a thousand hours together arguing, problem solving,
building concepts and tearing them down. Greg (our spec lead for most of the effort) was
demanding and a tireless example for us. We worked hard, but we had glorious fun. What could
be more fun than working with a group of supremely qualified friends to complete a difficult
piece of work?
I found one aspect of the process particularly interesting. We were all sent as representatives of
companies. Nevertheless, in almost every case, we operated as if our companies had instructed us
to ignore all commercial motivations and build a good specification. Perhaps it is Sun's fault. They
sent us James Gosling. Not only is he highly qualified, and revered in the Java world, he is also a
scrupulously honest scientist. Did that example motivate other companies to send similar people?
Or perhaps we can trace it back to IBM, who set the ground rules that selected us all. However it
was done, it was good. All specifications should be created this way!
The main players on the reference implementation team also deserve special mention. Doug Locke,
Pratik Solanki, and Scott Robbins wrote lots of code and participated in Expert Group conference
calls in the last year of the specification's development. Not only did they bring the spec to life in
code, they also helped us redesign some facilities.
I started writing this book long before the specification was complete. The original goal was to
have a set of books ready near the time the specification became final. There would be a
specification, a reference implementation and a selection of "how to use it" books all at about the
same time. This required a big head start, but things did not work exactly as planned.
The specification grew in a fairly organized way until the release of the preliminary specification.
Then the work got bumpy. Building the reference implementation, along with my experience
writing this book, caused major upheavals in the most complex parts of the
specification. The Expert Group had been uncertain as to the implementability of our designs for
scoped memory management, physical memory, and asynchronous transfer of control. We agreed
to put fairly aggressive designs in the preliminary specification, and see how the reference
implementation team dealt with them. It turned out that we had to tighten the rules for
asynchronous transfer of control slightly, and increase the constraints on scoped memory a lot.
Then we discovered that interaction between scoped memory and threads made starting threads in
non-heap memory painful nearly to the point of uselessness. We fixed these problems, and I wrote
and rewrote chapters.
Every specification that is released before it has been used extensively needs an author who tries
to explain the specification and to write code that uses it. That person finds problems
that are invisible to implementers and the test suite. But the author had better love the
specification. A chapter about a feature that is broken will naturally get long and complicated as it
tries to show the power and rationale behind the design. Finally it reaches the point of "this is not
complex and wonderful; it is broken!" Then the chapter becomes trash, and a new chapter appears
to explain the new design.
Happily, the Expert Group and the reference implementation team were with me. The final
specification went up on www.rtj.org in late 2001, the reference implementation appeared there in
early 2002, and this, the "how to use it" book, should be on the shelves in March 2002. We did OK.
Introduction
You can treat this book as two closely related books. Chapters 1 through 7 are background that
might help you understand the RTSJ. The remainder of the book is about the RTSJ itself. If you
already understand real-time scheduling, or you don't care about scheduling and want to get
directly to the code, you can start at Chapter 8 and read from that point on. Other than possibly
skipping the first seven chapters, I do not recommend skipping around. Few of the chapters can
stand by themselves. After you've skimmed the book once, it can work as reference material, but I
suggest that you start by reading the book sequentially.
This book is intended to serve as part of a set comprising three elements: the RTSJ specification,
the reference implementation, and this book. You can find the specification and the reference
implementation through www.phptr.com/dibble or www.rtj.org. The preliminary RTSJ document
is part of the Addison-Wesley Java Series. It is available in hard copy through your favorite book
store. However, the preliminary RTSJ has been superseded by the final version, 1.0. At
this time, the final specification is available only as downloadable PDF and HTML.
The reference implementation is a complete and usable implementation of the RTSJ for Linux.
Almost every example in this book was tested on the reference implementation. I have used the
reference implementation on PCs running Red Hat Linux and TimeSys Linux, and it should work
with other versions of x86 Linux as well, but the reference implementation relies on the
underlying operating system for scheduling, so you will find that features like priority inversion
avoidance will depend on the version of Linux you use.
The source code for the reference implementation is available. Some of it is descended from the
Sun CVM. That is available under the Sun community source license. The parts of the reference
implementation that are not related to Sun code are covered under a less restrictive open source
license.
Although the reference implementation is excellent for experimentation, it is not designed for
commercial use. It does not take the care with performance or memory use that you'd expect from
a commercial product.
You can find links to important web sites, corrections and extensions to this book, and probably
other useful things like source code at www.phptr.com/dibble.
Chapter 1. Landscape
• Java Technology and Real Time
• Definition of Real Time
• Java's Problem Domain
• Real-Time Java's Problem Domain
• Summary
Java Technology and Real Time
The Java platform seems an unusual choice for real-time programming: garbage collection freezes the
system whenever it likes, giving Java technology terrible timing behavior, and performance from one to 30
times slower than the same program written in C++, depending on the program and the details of the Java
platform. If the real-time community were not desperate for better tools, the Java platform would be
summarily rejected.
The benefits of Java technology are attractive enough that the standard Java platform has been used in a
few real-time systems, and its promise justified the effort to design and build real-time extensions—the
RTJava platform.
Real-time programming is like any other kind of programming, but arguably harder. Like an ordinary
program, a real-time program must produce correct results; it also has to produce the results at the correct
time. Traditionally, real-time programming has been practiced in antique[1] or arcane[2] languages. Real-
time Java gives real-time programmers access to a modern, mainstream language designed for productivity.
[1] Assembly language itself may not be antique, but the practice of programming in assembly language is
being obsoleted by RISC processors and sophisticated optimizing compilers.
[2] FORTH is a relatively popular arcane language. The defense department has a whole stable of other
arcane languages for real-time systems.
Java puts programmer productivity before everything else. The famous Java slogan, Write Once, Run
Anywhere, is just a specialization of programmer productivity—it is clearly inefficient for a programmer
to rewrite a program for each target platform.
Those who criticize the willingness of the designers of the Java platform to sacrifice performance for
productivity face two arguments.
1. A compiler should be able to optimize out much of the cost of Java's programmer-friendly features.
2. Moore's law has processors speeding up so fast that any reasonable constant-factor overhead
introduced by Java will quickly be covered by processor improvements.
If you feel that strongly about performance, use C or assembler, but a Java application would be designed,
written, and debugged before the C application is coded. Real-time Java is designed with the same theme.
Note
For this book, we stipulate that the Java platform is slow and has garbage collection delays. These
problems are not necessarily permanent. Tricks in the Java virtual machine (JVM) make execution
faster and garbage collection less intrusive, but those workarounds are not the focus of this book.
Real time does not necessarily mean "real fast." Computing speed is not the problem for a computer that
controls a needle that zips into a $200 glass test tube and stops abruptly a millimeter before it slams
through the bottom of the tube. If the system is slow, the solutions are well known: profile, improve the
algorithms, tweak the code, and upgrade the hardware. Inconsistent behavior is harder to address. It can
only be fixed by finding and removing the underlying cause, which, by definition, appears sporadically.
Boosting performance until the problem disappears often does more harm than good. The system may stop
failing under test but still fail after it is deployed.
The real-time programmer needs predictability. If the program stops the needle in time under test, it must
never smash the tube unless there is a hardware problem. The worst debugging problem is software that
works correctly almost every time. Removing a timing bug that won't appear reliably during testing is an
exercise in imagination. Convincing yourself that the repair worked is an exercise in faith—you cannot
make the defect appear in the defective software, and it still doesn't appear after you apply the fix. It must
be gone. Right?
Real-time programmers work in the usual engineering environment. The design has to optimize several
goals: correctness, low cost, fast time to market, compelling feature set. Real time is not the only concern.
Predictability helps to achieve a correct implementation quickly. That helps with correctness and time to
market, but cost is also an issue. Speed and low memory footprint amount to efficiency. They contribute to
cost reduction and expanded feature set. If some tool or technique makes the software faster, it can do the
following as a result:
• Let the system use a lower-performance (less expensive) processor
• Let some operations move from dedicated hardware to software
• Free some processor bandwidth for additional features
When the cost of a faster processor or special-purpose hardware is prohibitive, the real choice is to
optimize the software or quit. Prohibitive cost is a flexible term. In some fields, it is sensible to expend
years of engineering to reduce hardware costs by a few cents per unit. In other fields, it is normal to spend
hundreds of thousands of dollars on hardware to save months of engineering time for each software project.
How Big Is Java?
A PersonalJava system with an e-mail client, address book, and other similar applications can
run with four megabytes of ROM and four megabytes of RAM. This is enough memory for
applications, class libraries, the JVM, the operating system, and supporting components.
EmbeddedJava can be configured much larger and smaller than PersonalJava, but its lower limit
would be hard to push below half a megabyte of ROM and half a megabyte of RAM.
The KJava virtual machine promises to run in less than 128 kilobytes.
RTJava doesn't do anything for Java's performance. If anything, RTJava is a little slower than ordinary
Java, and ordinary (interpreted) Java is slower than C (see The Case for Java as a Programming
Language, by Arthur van Hoff in Internet Computing, January 1997, and An Empirical Comparison of
Seven Programming Languages, by Lutz Prechelt in IEEE Computer, October 2000). For now, that
shortcoming has to be accepted: Java is slower than the alternatives, and it requires a daunting amount of
memory to run a trivial program (see sidebar, "How Big Is Java?").
Java is not likely to drive C out of the traditional embedded market soon. Embedded programmers are as
conservative as cats. This caution is well founded. Software defects in embedded systems can have
spectacular, physical effects manifested in fire and crushed metal, and updating embedded software is
usually expensive.
Embedded programmers feel uncomfortable shipping technology that hasn't been thoroughly tested in
many deployed systems. The barrier to adoption of new technology is subject to a minor tunneling effect.
A few adventurous groups will try Java in embedded real-time systems. They will talk about the results. If
real-time Java proves that it is a good tool for embedded real time, it could become a common embedded
programming tool in five to ten years.
Definition of Real Time
Timeliness always matters. Few people would call a word processor, a compiler, or a payroll system a real-
time system, but perhaps they should. If a word processor takes more than a few tenths of a second to echo
a character, its user may become concerned about the competence of the programmers. If a compiler takes
much longer than expected to compile a program, its user will fear that it has seized up. Timeliness is
especially important for payroll. Late paychecks cause fear and anger in the people waiting for them.
The set of real-time problems is large and diverse, but a few standard tools and techniques work across
them all: tools that are optimized for predictability, schedulers that optimize timeliness (instead of
throughput), and analysis techniques that consider time constraints.
The space of real-time problems has at least three useful dimensions: the precision with which time is
measured, the importance of consistency, and the shape of the utility curve around the deadline.
Precision of Measurement
The precision of the units a real-time system design uses to measure time helps characterize the system
(see Figure 1-1). Does it express a second as one second, a thousand milliseconds, or a billion nanoseconds?
Submicrosecond
Some computer problems are expressed with times measured in units smaller than a microsecond (a
millionth of a second). That level of precision can be attained by a general-purpose processor dedicated to
a repetitive task or by special-purpose hardware. Someday we may be able to handle such problems on a
general-purpose system, but not today. (See Chapter 3 for reasons why it is hard to predict the execution
time of a few instructions without tightly controlling the environment.)
Ten Microseconds
Software commonly handles specifications expressed in tens of microseconds, but coding to specifications
this precise requires care and deep knowledge of the underlying hardware. This level of precision coding
often appears in carefully written device driver code, and contrary to general principle, the goal is usually
to complete a computation in a precise and short interval.
Millisecond
General real-time software usually deals with time measured in milliseconds. Most well-known real-time
systems fall into this range. These systems can be programmed with normal tools, provided that the
programmer demonstrates healthy caution about timing artifacts of the hardware, the compiler, and the
system software.
Hundredths of a Second
Distributed real-time programming sees time in hundredths of a second. The atomic unit of time is network
communication, and the network environment is dynamic. If you cannot tolerate timing that jitters badly at
the millisecond level, don't distribute that part of your application.
Tenths of a Second
Programs that interact with people see time specified in tenths of a second at the command and response
level. This is the precision of the specification for response when a user strikes a key, clicks a mouse,
pushes a button, crosses a photodetector, touches a touch-screen, or gives a computer directions in some
other way. These systems are usually programmed without any consideration for real-time discipline.
Performance problems are fixed with faster hardware or by normal profiling and tuning methods. In a real-
time sense, these systems routinely "fail" under load.
Consistency
We're dealing with computers. Although there is randomness at the subatomic level, the electrical
engineers have eliminated most of the unpredictable behaviors before the computer ships. Any random
behavior that remains is a hardware bug. If we assume bug-free hardware, everything is predictable.
However, it may be so difficult to account for all the factors influencing execution time that on many
combinations of processor and software platform, execution time is effectively unpredictable.
In the real-time field, the term determinism means that timing is predictable at the precision required by the
problem without heroic effort. Determinism is a good thing. Part of the design process for a real-time
system involves drawing timelines with events and responses to those events. Without a deterministic
operating system and processor, the analyst cannot even predict whether an event will reach the event
handler before its deadline, much less whether the event handler will complete a computation before the
deadline.
Consistency is better than mere determinism. It is useful to know that an urgent event will reach your
program sometime between 10 and 200 microseconds. It is better to know that the time interval will be
between 50 and 70 microseconds. A real-time system can be designed to operate in any deterministic
environment, but it has to assume that the system will always deliver the worst possible performance.
Designing to that assumption is wasteful since typical times are usually near the best case. A consistent
system reduces the difference between the expected performance and the worst possible performance,
ideally by improving the worst-case performance instead of only degrading the typical performance.
Consistency costs performance. A system that needs to bring its best-case (the fastest it can go) and worst-
case (the worst possible) performance as close together as possible cannot use hints or heuristics and
cannot rely on the "80/20" rule. A dynamically constructed binary tree can degenerate into a structure with
linear search time. A quicksort can take O(n²) time. Hints can be wrong, forcing the program to check the
hint, then execute the fallback strategy. The real-time approaches to these problems are as follows,
respectively: use a self-balancing binary tree, use a different sorting algorithm (mergesort is slower than
quicksort on average, but predictable), and do not use hints. The resulting software's typical performance is
at least 15 percent worse than the performance of a system that is designed to optimize typical performance,
but its worst-case performance may be orders of magnitude better than such a conventional design.
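The following minimal sketch, not from the book, illustrates those predictability-first choices in Java (it uses generics and other post-2002 syntax). TreeMap is a self-balancing red-black tree, so lookups stay O(log n) even under insertion orders that would degenerate a naive binary search tree, and Arrays.sort on an object array is a mergesort variant with a guaranteed O(n log n) bound, unlike quicksort's O(n²) worst case.

import java.util.Arrays;
import java.util.TreeMap;

public class PredictableChoices {
    public static void main(String[] args) {
        // Sorted insertion is the worst case for a naive binary search tree
        // (it degenerates into a linked list); a red-black tree stays balanced.
        TreeMap<Integer, String> index = new TreeMap<Integer, String>();
        for (int i = 0; i < 1000; i++) {
            index.put(i, "value-" + i);
        }
        System.out.println(index.get(500)); // O(log n) regardless of insertion order

        // Mergesort-based sort of objects: predictable O(n log n),
        // no quicksort-style O(n^2) corner case.
        Integer[] data = { 5, 3, 9, 1, 7 };
        Arrays.sort(data);
        System.out.println(Arrays.toString(data));
    }
}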
What happens when the real-time system is late? It misses its deadline … then what? The answer to that
question determines where the system falls on the continuum from hard real time through soft real time to
not real time.
Hard real-time systems cannot tolerate late results. Something unrecoverable happens: a person dies, a
wing falls off the airplane, a million barrels of hot petroleum squirt onto the tundra, or something else
unspeakable occurs.
Hard real-time systems are difficult to code but simple to specify. You have to establish that they will meet
all their deadlines, but you do not have to decide what to do when the systems miss a deadline. It is like the
question, "In the computer that sits in a bomb and tells it when to explode, what instruction do you put
after the one that explodes the bomb?"
If the result of the computation is worthless after the deadline, it makes no sense to waste computer time
on it. Since the system might have missed the deadline because it was overloaded, wasting time could
cause the system to miss other deadlines and convert the failure to meet a single deadline into a total
shutdown.
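A minimal sketch of that load-shedding rule, with hypothetical names and using the nanosecond clock that postdates this book: check the deadline before starting the work, and drop the request rather than compute a worthless result.

public class LoadShedder {
    // Runs the work only if its deadline has not already passed.
    static void handle(long deadlineNanos, Runnable work) {
        if (System.nanoTime() >= deadlineNanos) {
            // Already late: don't deepen the overload with useless computation.
            System.out.println("dropped late request");
        } else {
            work.run();
        }
    }

    public static void main(String[] args) {
        long deadline = System.nanoTime() + 10L * 1000 * 1000; // 10 ms from now
        handle(deadline, new Runnable() {
            public void run() {
                System.out.println("computed in time");
            }
        });
    }
}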
A missed deadline might require a response. Failing to shut off the valve that fills a water tank on time
could call for another valve to let some water out of the tank. Not responding to a database query on time
could put up a "please wait" message on the user's screen. A mistake in milling a part could call a human
supervisor or kick the part into a scrap bin.
Utility is the economists' term for how valuable something is. Economists like to draw graphs, for example,
the utility of a commodity as a function of how much you have of it. A graph (Figure 1-2) showing the
utility of completion around the deadline characterizes some classes of real-time systems.
The first graph in Figure 1-2 shows a utility function that would represent a non-real-time system. The
value of completion declines slowly after the deadline, but there is barely a difference between completing
on deadline and being seriously late. Think of mowing the lawn. Getting it done earlier is better, but there
isn't any point at which it suddenly becomes urgent.
The second graph shows a utility function for a soft real-time system. The value of completion has an
inflection point at the deadline. Late completion is worse than on-time completion, but the value stays
positive for a while after the deadline. Think of fixing lunch for your children on a relaxed summer day.
The children know the time they should be fed, and if you are late, they get fussy. The level of hungry
complaints gradually increases after the deadline, but nothing really bad will happen. On time is best, and
the value of the computation drops significantly when the deadline is missed.
The third graph represents a real-time problem with a serious time constraint. After the deadline the value
of completion quickly goes negative, but it does not become catastrophic. In this case, the slope of the
curve decreases because the system can take remedial action. Think of a child flushing a T-shirt down the
toilet. The deadline for action is the moment before the child flushes. You would much rather stop him
before he flushes. If you are late, you might be able to reach in and snag the shirt; that is unappealing, but
not disastrous. Still later there is a good chance you can recover the shirt with a plumber's snake, but we
are deep in negative utility. Still, nothing terrible has happened.
The last graph represents classic hard real time. A tiny interval after the deadline, the utility of completion
goes to negative infinity. There probably is a utility value for a late result, but it is so negative that the
creators of the system requirements don't want the engineer to consider the option of occasionally missing
the deadline. Sticking with the children analogy, this is grabbing the child before she runs into traffic. Late
is not an acceptable alternative.
Note
The utility function graphs in Figure 1-2 show various behaviors before the deadline. The utility of
early results is a whole different question. The functions show that early is the same as on time, early
is a little worse than on time, and early is much worse than on time. Since it is easy to delay the effect
of an early computation, worrying about early results is generally not interesting.
Java's Problem Domain
From the real-time perspective, Java's biggest problem isn't its performance, but its garbage collector.
Unless you use it carefully, Java will pause at unpredictable moments and collect garbage for milliseconds.
Better garbage collectors make long garbage collection pauses much less frequent or make the pauses
shorter, but even with the best technology, only Java code that pays careful attention to the garbage
collector can give predictable performance (see Figure 1-3).
The execution of programs on Java platforms is slow, and garbage collection makes it effectively
nondeterministic, but it is still useful in real-time systems. The simplest way to manage Java's problems is
to avoid them. Most real-time systems have large components with loose deadlines. A system that has to
service interrupts in 30 microseconds or lose them and must respond in 2 milliseconds or suffer serious
degradation probably uses a few thousand lines of code for that part of the system. Another 50,000 lines
might support a user interface, a logging system, system initialization, and error handling. Those
components need to communicate with the serious real-time components, but they are soft real time. They
can probably tolerate nondeterministic delays of a second or more.
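A sketch of that partitioning, not the book's design and using the java.util.concurrent API that postdates it: the time-critical side hands messages to the soft real-time side through a bounded queue and never blocks on it.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SoftSideLogger {
    private final BlockingQueue<String> queue =
            new ArrayBlockingQueue<String>(1024);

    // Called from the time-critical path. offer() never blocks; if the
    // queue is full the message is dropped instead of stalling the caller.
    public void log(String message) {
        queue.offer(message);
    }

    // Runs in a low-priority thread on the soft real-time side. A
    // nondeterministic delay of a second or more here costs nothing
    // but log latency.
    public void drainForever() throws InterruptedException {
        while (true) {
            System.out.println(queue.take());
        }
    }
}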
Java technology needs at least a medium-performance processor and a few megabytes of memory to run
well.[4] This makes it particularly well suited to systems that have aggressive deadlines with long intervals
between them. The system designers will select a powerful processor to let it meet the deadlines, and in the
intervals between deadlines it will have lots of time to run Java programs.
[4] Java Card and KJava can run satisfactorily on less powerful processors and in less memory than ordinary Java.
It is inconvenient to use multiple languages to build a system, but using C for the demanding real-time
components and Java for the bulk of the components pays for the inconvenience with productivity:
1. Java code is unusually portable. This portability lets most of the development cycle take place on
the engineer's workstation.
2. Programmers seem to be more productive in the Java language than in other common languages.
3. The Java class libraries contain many prewritten classes.
4. The Java platform works well in heterogeneous distributed systems.
5. Java applications on the real-time system communicate easily with Java applications in other
systems, such as management, supervisory, and diagnostic systems.
Java probably cannot do all the work in a real-time system, but it can do the bulk of the work. See Chapter
8 for an overview.
Real-Time Java's Problem Domain
RTJava is designed to stretch the platform slightly in the direction of real time without losing compatibility
with existing Java code. The design does not make Java faster—it probably makes it a little slower (see
"Consistency" on page 6)—but faster hardware or improved algorithms can compensate for Java's fixed
overhead. This is the standard Java tradeoff extended to real time: processor overhead for robust software
and faster development.
The class of embedded real-time systems that are stamped out like pennies is not a likely early adopter of
RTJava. Per-unit costs on those systems are reduced mercilessly. The companies that build them would
rather spend years of extra engineering than upgrade the processor or pay the license fee for a JVM.
RTJava is more attractive for systems where the cost of the processor is a small part of the total cost. This
set includes commercial and industrial applications. The cost of adding Java to the system that controls a
scientific instrument, a manufacturing system, or an ATM could be lost in the noise. When the product
costs tens of thousands of dollars and ten thousand units would be a good year's production, the advantages
in flexibility and time to market of RTJava can justify the cost of a faster processor.
RTJava may prove most useful to programmers who write interactive applications. There is an informal,
but important, deadline when a person is waiting. Handling customer service, processing insurance claims,
validating credit card transactions, and similar systems account for millions of programmer years of TSO,
CICS, Complete, VMS, CMS, UNIX, ACP, Mac, and Windows interactive programs. Fast response is
important, but fast and consistent is better. An interactive system that responds in half a second most of the
time but sometimes takes five seconds is maddening. Non-real-time tools and methodologies can improve
all performance by some factor, but that misses the point. Improving the typical response to three-tenths of
a second and the occasional glitch to three seconds is nice, but the problem is still there. The accountable
administrator wants to be able to summon a programmer and say, "A customer told me that it took more
than two seconds to validate a credit card this morning. Check the error logs. Tell me why it happened and
how to make sure it doesn't happen again," with the same confidence with which he'd tell the programmer
to check into a division-by-zero error.
Systems handling money may be even more constrained to consistently meet deadlines than are systems
handling people. Real-time software might make it reasonable for a stock brokerage to promise that all
trades will complete within 100 milliseconds.
Summary
All real-time systems need consistent performance. They differ in the precision they require and how
offended they are at a missed deadline. The Real Time Specification for Java does not require that a
conforming Java platform be unusually fast. It adds tools to Java to make it possible for a programmer to
get consistent performance.
Nearly all systems benefit from consistent performance. This generalization includes the usual real-time
systems. It also includes most commercial, industrial, recreational, and personal systems.
Chapter 2. Architecture of the Java Virtual Machine
• Write Once, Run Anywhere—Maybe
• JVM Components
• Interpreter Implementation
The Java virtual machine (JVM) is a software implementation of a computer architecture. Since Java
programs target the JVM, compiled Java programs should be portable. They execute the same instructions
and use support libraries with standard APIs and identical (or at least similar) behavior whether they
execute on an embedded system with an arcane processor or on a multiprocessor server.
Parts of the specification for the JVM are peculiarly loose. The specification seems to be a compromise
between strict specifications to support portable code, and loose specifications to make it easy to port the
JVM to diverse architectures. For instance:
• The Java Language Specification, JLS, does not insist that the JVM have multiple priorities. It
requires—
When there is competition for processing resources, threads with higher priority are generally
executed in preference to threads with lower priority. Such preference is not, however, a guarantee
that the highest priority thread will always be running, and thread priorities cannot be used to
reliably implement mutual exclusion.[1]
—which can be interpreted to mean that higher priorities are scheduled exactly like lower
priorities. (A sketch after this list shows the portable way to get mutual exclusion.)
[1] The Java Language Specification, James Gosling, Bill Joy and Guy Steele, Addison Wesley, 2000, page 415.
• Garbage collection is not required anywhere in the JLS. It is perfectly acceptable to create a JVM
with no garbage collection system provided that you don't add an explicit way to free memory.
• The specifications for Java's drawing primitives are similarly loose. They permit Motif-style or
Windows-style rectangle borders and fills, which give noticeably different results.
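The practical consequence of the scheduling language quoted above, shown as a minimal sketch that is not from the book (Counter is a hypothetical class): never use a high priority as a substitute for a lock; use monitors, which every conforming JVM must honor.

public class Counter {
    private int count = 0;

    // Correct on any conforming JVM, no matter how the scheduler
    // treats thread priorities.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        final Counter c = new Counter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    c.increment();
                }
            }
        };
        Thread low = new Thread(work);
        Thread high = new Thread(work);
        low.setPriority(Thread.MIN_PRIORITY);
        high.setPriority(Thread.MAX_PRIORITY);
        low.start();
        high.start();
        low.join();
        high.join();
        System.out.println(c.get()); // always 200000; priorities played no part
    }
}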
Even the real-time Java extensions are not magic. The JVM may let an MC68000 execute the same Java
programs as an Alpha, but it does not make them run at the same speed. Technically, a real-time program
should not depend on the performance of the platform once the platform is fast enough to meet all its
deadlines, but blind reliance on write-once-run-anywhere is a mistake. For real time it is better to think
WOCRAC (Write Once Carefully, Run Anywhere Conditionally).[2]
[2] This phrase was coined by Paul Bowman at the 1999 Mendocino, California meeting of the Real Time Java Expert Group.
JVM Components
A given JVM might be implemented as a monolithic program, but it is designed as a set of components.
The coarse-grained components are class loader, bytecode interpreter, security manager, garbage collector,
thread management, and graphics. Each of these components—except perhaps graphics—has a significant
influence on the real-time performance of the JVM.
Class Loading
The first time a Java program uses a class, the JVM finds the class and arranges to have it integrated with
the rest of the Java environment.
1. Find the class in a file with a name derived from the fully qualified name of the class by
converting dots to file delimiters (slash on UNIX, backslash on DOS-descended systems) and
adding the suffix .class to it. The JVM can search extensively for the file. It looks in each directory
named in the CLASSPATH environment variable and possibly in every directory in the trees rooted
in a directory on the class path.
2. Read the class file into a buffer.
3. Digest the class file into JVM internal data structures that reflect the data defined by the class file,
the constants used by it, the classes in it, and the methods in it.
4. Run the verifier over the class. The verifier is a "theorem prover" algorithm that proves that the
bytecode in the class obeys various rules; for instance, the verifier will not permit the JVM to load
a class that includes code that uses uninitialized data.
Note
Things that the verifier can guarantee don't need to be checked at runtime. It is better to check
as much as possible once at load time than to repeat checks every time the code is used. The
verifier uses a relatively long time to make powerful general guarantees, but the bytecode
interpreter gains considerable efficiency when problems like uninitialized data cannot happen.
5. Before the first use of the class, the JVM must initialize static data for the classes in the file. It
does not actually have to initialize the static data when the class is loaded; it could initialize the
static data any time before the static data is accessed.
6. The JVM is allowed to load classes that can be used by the classes in this file. This can cause the
first reference to a class that is not built into the JVM to load all the classes that could possibly be
used by the transitive closure of that class.
7. The JVM may choose to compile some of the newly loaded methods.
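A rough sketch of the name-to-file mapping and CLASSPATH search in step 1 (illustrative only; real class loaders also search JAR archives and the JVM's own installation directories):

import java.io.File;

class ClassFinder {
    // Step 1 in miniature: derive a relative file name from the fully
    // qualified class name and look for it in each CLASSPATH directory.
    static File find(String className) {
        String relative = className.replace('.', File.separatorChar) + ".class";
        String classPath = System.getenv("CLASSPATH");
        if (classPath == null) {
            classPath = ".";
        }
        for (String dir : classPath.split(File.pathSeparator)) {
            File candidate = new File(dir, relative);
            if (candidate.exists()) {
                return candidate;
            }
        }
        return null;   // a real JVM would throw NoClassDefFoundError
    }
}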
How it's done, the time the load operation consumes, and the time at which it happens can vary among
JVM implementations. Ordinary JVMs are optimized for throughput and can choose to defer potentially
expensive operations like initialization as long as possible—perhaps until a method that uses the field in
question is actually called. A real-time JVM cannot use that class of optimization because the programmer
would have to assume that each method call would incur initialization costs unless he could prove that the
initialization had already taken place; for example, because it was the second call to the method in
straight-line code.
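The deferral is visible in standard Java: a class's static initializer runs at the first active use of the class, not necessarily when the class file is loaded. For example:

class Lazily {
    static {
        // Runs at the first active use of Lazily, not when the
        // class file is loaded.
        System.out.println("initializing Lazily");
    }
    static int value = 42;
}

public class InitDemo {
    public static void main(String[] args) {
        System.out.println("before first use");
        System.out.println(Lazily.value);   // initialization happens here
    }
}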
Bytecode Interpreter
The JVM uses one-byte operation codes called bytecodes. The maximum possible number of one-byte
codes is 256, and the JVM defines nearly the full set. The set of standard opcodes is given in Table 2-1.
Table 2-1. The standard bytecodes
Constants
0 nop 1 aconst_null 2 iconst_m1
3 iconst_0 4 iconst_1 5 iconst_2
6 iconst_3 7 iconst_4 8 iconst_5
9 lconst_0 10 lconst_1 11 fconst_0
12 fconst_1 13 fconst_2 14 dconst_0
15 dconst_1 16 bipush 17 sipush
18 ldc 19 ldc_w 20 ldc2_w
Loads
21 iload 22 lload 23 fload
24 dload 25 aload 26 iload_0
27 iload_1 28 iload_2 29 iload_3
30 lload_0 31 lload_1 32 lload_2
33 lload_3 34 fload_0 35 fload_1
36 fload_2 37 fload_3 38 dload_0
39 dload_1 40 dload_2 41 dload_3
42 aload_0 43 aload_1 44 aload_2
45 aload_3 46 iaload 47 laload
48 faload 49 daload 50 aaload
51 baload 52 caload 53 saload
Stores
54 istore 55 lstore 56 fstore
57 dstore 58 astore 59 istore_0
60 istore_1 61 istore_2 62 istore_3
63 lstore_0 64 lstore_1 65 lstore_2
66 lstore_3 67 fstore_0 68 fstore_1
69 fstore_2 70 fstore_3 71 dstore_0
72 dstore_1 73 dstore_2 74 dstore_3
75 astore_0 76 astore_1 77 astore_2
78 astore_3 79 iastore 80 lastore
81 fastore 82 dastore 83 aastore
84 bastore 85 castore 86 sastore
Stack manipulation
87 pop 88 pop2 89 dup
90 dup_x1 91 dup_x2 92 dup2
93 dup2_x1 94 dup2_x2 95 swap
Arithmetic and logic
96 iadd 97 ladd 98 fadd
99 dadd 100 isub 101 lsub
102 fsub 103 dsub 104 imul
105 lmul 106 fmul 107 dmul
108 idiv 109 ldiv 110 fdiv
111 ddiv 112 irem 113 lrem
114 frem 115 drem 116 ineg
117 lneg 118 fneg 119 dneg
120 ishl 121 lshl 122 ishr
123 lshr 124 iushr 125 lushr
126 iand 127 land 128 ior
129 lor 130 ixor 131 lxor
132 iinc
Conversions
133 i2l 134 i2f 135 i2d
136 l2i 137 l2f 138 l2d
139 f2i 140 f2l 141 f2d
142 d2i 143 d2l 144 d2f
145 i2b 146 i2c 147 i2s
Simple flow control
148 lcmp 149 fcmpl 150 fcmpg
151 dcmpl 152 dcmpg 153 ifeq
154 ifne 155 iflt 156 ifge
157 ifgt 158 ifle 159 if_icmpeq
160 if_icmpne 161 if_icmplt 162 if_icmpge
163 if_icmpgt 164 if_icmple 165 if_acmpeq
166 if_acmpne 167 goto 168 jsr
169 ret 170 tableswitch 171 lookupswitch
172 ireturn 173 lreturn 174 freturn
175 dreturn 176 areturn 177 return
Operations on objects
178 getstatic 179 putstatic 180 getfield
181 putfield 182 invokevirtual 183 invokespecial
184 invokestatic 185 invokeinterface 186 no op assigned
187 new 188 newarray 189 anewarray
190 arraylength 191 athrow 192 checkcast
193 instanceof
Miscellaneous
194 monitorenter 195 monitorexit 196 wide
197 multianewarray 198 ifnull 199 ifnonnull
200 goto_w 201 jsr_w
Java specifies a fascinating mix of bytecodes. Most of them are nearly trivial. Many opcodes are expended
on load, store, and basic operations for each supported data type. The JVM instruction set also optimizes
the use of small constants with shorthand operations that push constants, and load and store from small
offsets in local storage.
Some JVM instructions are wildly complex: create an array of objects, invoke a method, and two different
single-instruction switch statements. The core of the interpreter is a loop around a switch on the opcode:
for (;;) {                      /* the interpreter's main loop */
    op = byteCode[pc];
    switch (op) {
    case 0: /* nop */
        pc += 1;
        break;
    case 1: /* aconst_null */
        ...
    }
}
There are many ways to accelerate this interpreter. First, brute force: almost all interpreters have been
rewritten in hand-coded assembly language. Sometimes the assembly language is generated by hand-
tuning of the output of a C compiler. Sometimes it is written from scratch. In either case, a careful, but
uninspired, assembly language implementation of the interpreter usually gets about a 30 percent
performance improvement over the C implementation of the interpreter.
There are probably hundreds of tricks that go beyond "uninspired" and yield a better than average
performance improvement. Three are presented below.
Functional Unit Scheduling. Keeping the state of each functional unit in mind while writing code is tedious, but that level of
craftsmanship can generate big performance improvements.
This isn't generally a good investment for the bytecode interpreter. First, the optimizations are specific to a
particular version of the processor. An interpreter tuned for the PowerPC 604e might run worse than an
untuned interpreter on a PowerPC 603. Second, the available performance improvement depends on the
number of functional units, and the processors with the most functional units are often the ones with the
most ingenious systems for reordering their own instruction streams.[3] Third, most Java opcodes require
only a few machine instructions. There is not much room for reorganization.
[3]
Very Long Instruction Word (VLIW) processors are an exception to this rule. They have several functional
units and insist that the instruction stream schedule them all.
Cache Optimization. Processors run faster when their code and data are in the cache, but cache is a limited
resource. It is possible to write the bulk of the Java interpreter in less than 64 kilobytes, but a 64-kilobyte
instruction cache is huge (in 1999) and the interpreter should share the cache with other code.
The amount of acceleration available for cache optimization depends on the difference between cache
speed and memory speed. Cache could be more than 10 times faster than main memory on systems with
slow memory.
The trick to this optimization is to track which memory falls into the same cache line or way. These
addresses will contend for the same cache line or small group of cache lines. Code that has "hot" cache
lines will frequently wait while the processor loads the cache with data from a new address.
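For a direct-mapped cache, the bookkeeping is simple arithmetic; a sketch with illustrative sizes:

class CacheLineIndex {
    static final int LINE_SIZE  = 32;         // bytes per cache line (illustrative)
    static final int CACHE_SIZE = 16 * 1024;  // total cache size in bytes (illustrative)
    static final int NUM_LINES  = CACHE_SIZE / LINE_SIZE;

    // Two addresses contend for the same cache line exactly when
    // their line indexes match.
    static int lineIndex(long address) {
        return (int) ((address / LINE_SIZE) % NUM_LINES);
    }
}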
Cache optimization is particularly appropriate for real-time systems because cache faults are a major cause
of nondeterministic timing behavior. Systems that control cache faults reduce this nondeterminism—but
they cannot eliminate time variations caused by the cache unless they completely control the cache.
Register Optimization. A simple port of the Java bytecode interpreter from C to assembly language usually
gets much of its performance improvement by using registers for nonlocal context that C compilers cannot
easily track. An inspired port goes to the next step in register usage.
The Java virtual machine is a stack machine. That means that it doesn't have general-purpose registers, but
keeps all its work on a stack or in various types of named fields. The bytecode interpreter that is part of the
Sun Java platform distribution and simple assembly language interpreters keep the Java stack in memory,
but this approach is inefficient. Nearly every bytecode accesses the stack once or twice. If the top few
entries on the stack were usually in registers, the interpreter could run about twice as fast.[4] It would be
easy to treat a group of registers as part of the stack if processors let you index through registers the way
you can index through an array in memory, but that would be an unusual architectural feature.
[4]
Expected speedup figures like this are wildly approximate. They depend on how good the interpreter is
before the optimization and how well suited the hardware architecture is to the optimization. In this case,
approximately two means that 1.5 and 4.0 are equally likely.
One viable approach is to dedicate some registers to a window on the top of the stack and to code separate
implementations of each opcode for each feasible arrangement of the stack in that window. If four registers
are dedicated to the top of the stack, the top of the stack could be at any of those four, and the window
could contain from zero through four stack entries. That gives 20 possible arrangements and 20
implementations of each opcode.[5]
[5]
There is little point in letting the stack window go empty, so a 4-register window would probably implement
cases for a depth of two, three, and four.
The opcode dispatcher changes from a lookup of the handler function in a single-dimensional table with
the opcode as the index, to a lookup in a two-dimensional table with the opcode as one index and the
register arrangement as the other.
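In outline, the dispatcher might look like this sketch (types and names invented; the 20 follows from the four-register window described above):

interface OpcodeHandler {
    void execute();
}

class WindowedDispatcher {
    static final int OPCODES = 256;
    static final int ARRANGEMENTS = 20;   // 4 possible stack tops x 5 window depths

    // One implementation of each opcode per stack-window arrangement.
    OpcodeHandler[][] handlers = new OpcodeHandler[OPCODES][ARRANGEMENTS];

    int arrangement;   // updated as opcodes push and pop the register window

    void dispatch(int opcode) {
        // Two-dimensional lookup: opcode as one index, register
        // arrangement as the other.
        handlers[opcode][arrangement].execute();
    }
}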
Security Manager
The SecurityManager class can prevent untrusted code from violating security constraints. The best-
known security manager is the sandbox, which wraps a security wall around applets. Every time an applet
uses a Java platform service that might impact the security or integrity of the underlying system, the
service asks the security manager for permission. Many services include code snippets like:
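SecurityManager security = System.getSecurityManager();
if (security != null) {
    security.checkXXX( /* arguments */ );
}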
where checkXXX is replaced with the name of one of the methods in SecurityManager. The security
manager either returns normally or it throws SecurityException to signify that the operation violates
security constraints.[6]
[6]
To be painfully accurate, the checkTopLevelWindow method in the security manager returns a
boolean. All other checkXXX methods in the Java 2 security manager return void.
The current security manager is determined by the class loader that loaded (or more precisely, defined) the
executing class. Applets are the standard example. They are loaded by a special loader that loads applets
from a web site, so they always operate in the sandbox, which is the security manager associated with the
applet loader.
Every class loader can be associated with a security manager that protects the system from nefarious
operations attempted by classes it brings into the system.
Security Checking Methods. The security checking methods in the security manager are summarized in
Table 2-2.
Real-Time Issues. The security manager makes the performance of applications depend on the class loader
that brings them into the system. Every checked operation includes a path through the security manager,
and the security manager depends on the class loader that loaded the real-time code. The possibility that
the security manager might reject an operation is not a uniquely real-time issue, but the time it spends
deciding whether to permit an operation matters to real-time programmers.
Applications under development are normally loaded by the default class loader. That class loader
normally uses a trusting security manager that checks nothing and consequently uses little time. If the
system can be deployed under a less credulous security manager, it may execute checked functions more
slowly than it did under test.
It's difficult to predict timing for the checked operations even without the contribution of the security
manager. Only a few of the check functions are associated with operations for which timing might
normally be easily characterized:
• checkPropertiesAccess — Checks security on operations that get or set the system properties data
structure.
• checkPackageAccess — Checks whether the caller is allowed to access a package. Since the
default implementation of checkPackageAccess always throws SecurityException, the
performance of this method should not be an issue.
The other check methods implemented by the security manager are associated with I/O or other similarly
complex operations.
Garbage Collector
Java does not technically require a garbage collector, but it is painfully restricted without one. If there is no
garbage collector, the JVM cannot detect memory that is no longer in use, and the programmer has to
adopt a coding style that creates only objects that should exist "forever." For general use, a JVM requires a
garbage collector that detects objects that are no longer in use and returns them to the free pool.
It would be nice if the Java garbage collector would collect all forms of garbage—memory, open I/O paths,
other I/O resources, and classes that cannot be reached without a major effort—but most JVMs just collect
unused memory.
Garbage collection is not a new idea. Languages like LISP and SmallTalk have used garbage collection for
decades. There are even a few tools that add garbage collection to C and C++. If the language runtime can
identify pointers and chunks of allocated memory (in an object-oriented language all those chunks are
objects), it can go through each allocated chunk and see if any other chunk points to it. If a chunk of
memory can be reached by following pointers from something the runtime knows is in use (like a variable
on the stack), it is alive. If it cannot be reached, it is dead and can be freed. Stated like that, garbage
collection looks like it takes time proportional to the square of the amount of memory the program is using.
Advanced garbage collection algorithms use heuristics to run much faster than O(n2), but worst-case
performance drops back to O(n2). The impact of garbage collection on real-time performance is enough to take
Java completely out of consideration for many real-time projects. Rejecting Java for real time because it
includes a garbage collector is not always justified.
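The reachability test can be sketched as a mark pass over a toy object graph (illustrative only; a real collector works on the JVM's internal object layout):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ToyObject {
    List<ToyObject> references = new ArrayList<ToyObject>();
    boolean marked;
}

class MarkPhase {
    // Mark everything reachable from the roots (stack variables,
    // statics, and so on). Whatever is left unmarked is dead and
    // can be freed.
    static void mark(List<ToyObject> roots) {
        Deque<ToyObject> work = new ArrayDeque<ToyObject>(roots);
        while (!work.isEmpty()) {
            ToyObject o = work.pop();
            if (!o.marked) {
                o.marked = true;
                work.addAll(o.references);
            }
        }
    }
}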
Note
Big O notation is a common notation for algorithm analysis. Technically it means "at most some
constant times." Informally it works pretty well to think of it as "order of."
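For example, consider a doubly nested loop of this shape:

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n - 1; j++) {
        // <stuff> goes here
    }
}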
The loop executes <stuff> exactly n x (n - 1) times. Execution takes some amount of time that depends on the
quality of the compiler and the underlying architecture, but it is often enough for us simply to
characterize the execution time as O(n2).
A JVM starts garbage collection in three kinds of circumstances:
• On request — The System.gc method suggests to the JVM that this would be a good time to
collect garbage. It does not require garbage collection, but, in fact, it generally causes collection to
start immediately. The exact operation of the method is JVM dependent.
• On demand — The only thing that demands garbage collection is the new function. If new needs
memory, it calls the garbage collector with a request to free at least enough memory to satisfy the
current allocation request.
• Background — The JVM has a low-priority thread that detects idleness. It spends most of the time
sleeping. Each time it runs, it sees whether other threads have run since it last looked. When the
JVM has been idle for a few periods, the idle-detection thread triggers garbage collection. The
assumption is that if nothing but the idle detector has run for a while, nothing important will want
to run for a while. This approach is also called asynchronous garbage collection.
On-demand garbage collection is always a problem because it runs when the JVM is nearly out of memory.
If the JVM could postpone demand garbage collection, it would be operating with a partially crippled
ability to allocate objects. As a rule, the JVM stops everything until demand garbage collection frees some
memory. Incremental garbage collectors may free some memory fairly soon (see Chapter 4), but in the
worst case they fall back to a roughly O(n2) algorithm. Garbage collection time depends on the exact
algorithm, the number of objects in the system, and the speed of the processor, so there is no specific time
penalty, but systems that otherwise run Java at a reasonable speed can take a large fraction of a second to
complete garbage collection.
Background and on-request garbage collection run at times when the system can survive even if they don't complete. Under
these circumstances, any garbage collection can be preempted whenever preemption will leave the JVM's
data structures consistent. The delay before a garbage collector can be preempted (see "Mark and sweep is
not preemptable" on page 50) can be as much as an order of magnitude less than the delay before it
completes.
On-request garbage collection is the programmer's weapon against on-demand collection. A programmer
presumably knows when the system will be idle for at least a few milliseconds and can request garbage
collection at that time. If the program requests garbage collection frequently enough, garbage will not
accumulate and the JVM will never be forced to demand garbage collection.
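A sketch of the idiom, assuming a periodic control loop with idle time at the end of each period (the period and the work are invented):

class PeriodicController {
    public static void main(String[] args) throws InterruptedException {
        for (;;) {
            doPeriodicWork();
            // The rest of the period is idle, so request collection now.
            // Frequent requests keep garbage from accumulating to the
            // point where new would force an on-demand collection.
            System.gc();
            Thread.sleep(50);   // wait out the rest of a 50 ms period
        }
    }

    static void doPeriodicWork() {
        // stand-in for the application's time-critical work
    }
}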
Background garbage collection is the JVM's attempt to automate on-request collection. The JVM doesn't
have any way to know when the system has time to pause for garbage collection, so it guesses. A real-time
programmer would worry that background garbage collection would run at exactly the wrong time, but
fortunately it can be disabled when the JVM is started. When background garbage collection is disabled,
the JVM is left with on-demand and on-request garbage collection. The program must request collection
often enough to prevent on-demand collection; otherwise, it has to assume that every new will include a
full garbage collection.
Finalizers. Garbage collection must run finalizers on objects that have them before it disposes of them.
This is a serious problem. Finalizers are designed as a tool to clean up resources associated with an object.
They can close an I/O path, change a GUI element, release a lock, or any other housekeeping chore that
needs to be executed when an object is no longer in use. If they are used this way, finalizers should execute
quickly and should not reanimate the object.
The JVM enforces only one rule for good behavior by finalizers. A finalizer can execute for a long time
and even add a reference to its object from some other object. Finalizers are run at the end of garbage
collection, and their effect on performance is an important real-time issue; for example, see Example 2-1.
• First, any finalizer will cause the garbage collector to reevaluate the liveness of the object and
every object it references. That could cost O(n2) time since the finalized object could reanimate
itself, which would reanimate all the objects referenced by it, etc., traversing a reference graph
that could have as many as n2 edges.
• If there are slow finalizers, they could run during any garbage collection; the system design must
allow for that extra time on every garbage collection.
• The Java platform protects itself from objects that reanimate themselves in their finalizer by
storing a reference to themselves. Reanimation is permitted, but only once. The next time the
system garbage collects the object, it does not run the finalizer.
Example 2-1 A finalizer with bad real-time behavior
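A sketch of such a finalizer, with invented names: it does slow I/O and then reanimates its object, so every collection that disposes of one pays for the I/O and must reconsider liveness:

import java.io.FileWriter;
import java.io.IOException;

class BadFinalizer {
    static BadFinalizer zombie;   // reanimation target

    protected void finalize() {
        // Slow work here stretches every garbage collection that
        // disposes of one of these objects.
        try {
            FileWriter log = new FileWriter("finalize.log", true);
            log.write("finalized\n");
            log.close();
        } catch (IOException e) {
            // ignore
        }
        zombie = this;   // reanimation: the object is reachable again
    }
}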
The best strategy is to avoid finalizers if possible and absolutely to avoid finalizers that could execute for
any significant length of time.
Defragmentation. The garbage collector operates on live memory while it frees unreachable memory. The
memory that remains allocated after the garbage collector finishes might be defragmented.
If memory is obliviously allocated by new and freed by the garbage collector, it will soon leave free
memory fragmented into numerous small extents. There may be tens of megabytes of free memory, but if
it has been fragmented such that the largest contiguous free chunk is 24 kilobytes, then no object bigger
than 24 kilobytes can be allocated no matter how aggressively the garbage collector works.
Picture an extent of memory before and after defragmentation. Before
defragmentation, the largest possible allocation was about a quarter the size of the total amount of free
memory. The defragmentation process packs all allocated memory together. After defragmentation, all the
free memory could be used by a single object.
The trick to defragmentation is how to update every reference to the objects that are moved. If the system
just moves an object, every pointer (reference in Java) to that object will point to the old address. It can be
time consuming to track down every reference to an object, but during garbage collection the JVM has
already done just that. It can move objects such that all free memory is contiguous for a cost O(number of
references to all objects + size of memory).
Thread Management
A thread encapsulates concurrency. On a multiprocessor system, each thread can execute on a separate
processor. If there is only one processor or if the JVM doesn't support multiple processors, all the threads
will execute on one processor. Threads will act nearly the same whether they are actually executing
concurrently on multiple processors or sharing one processor. The software that supports threads puts the
same interface around concurrency whether it is real or virtual.
Threads are conceptually simple. The data for a thread is little more than an execution context; that is, a set
of CPU registers, a processor stack, and a Java stack. Java APIs start threads, stop them, change their
priority, and interrupt them. The dispatcher switches control from one thread to another (when there aren't
enough physical processors to run all of them), and the scheduler decides when to run each thread. The
locking mechanism hidden under the synchronized keyword gives programs more precise control over
concurrency than does adjusting priorities. A vast body of wisdom has grown up concerning proper use
and implementation of threads, but the principle is simple.
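For example, a counter protected by synchronized is correct under any interleaving of threads, a guarantee no assignment of priorities can provide by itself:

class SharedCounter {
    private int count;

    // Only one thread at a time may hold this object's lock, so the
    // read-modify-write below cannot be interleaved with another.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }
}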
The mechanism that supports threads can be included in the operating system (relatively sophisticated
system software) or kernel (relatively simple system software), or it can be implemented in a library that is
linked directly to the JVM.
Kernel Threads. Threads or processes are a basic service of the system software. Since simple kernels are
often linked directly to application software and never provide protection domains like processes, the
distinction between kernel and library threads is only meaningful for operating systems. Nevertheless,
threads implemented in the operating system are usually called kernel threads.
If the operating system provides threads that meet the JVM's requirements, it makes good sense to use
them. The major advantages of this approach are these:
• The threads can use every processor in the machine, so Java threads can truly run in parallel.
• A thread that blocks in an operating system call does not stall the other threads in the JVM.
• Java threads are visible to the OS scheduler, so their priorities can be weighed against the other
activities on the system.
Library Threads. If the system software doesn't provide thread support or if the support doesn't meet the
JVM's requirements, the JVM can provide the support itself or use an existing library to support threads.
Note
The Sun JVM comes with a threads package called Greenthreads. This package was originally
implemented because although Sun's UNIX supported threads, they did not meet the JVM's
requirements. The Solaris JVM no longer uses Greenthreads, but it is still part of the JVM source
distribution from Sun.
Library threads have advantages of their own:
• The threads do not necessarily need to call the operating system for every thread operation. This is
particularly significant for context switching. Kernel threads need to enter and leave the OS for
every context switch. Library threads can context-switch with a function like longjmp.
• Java priorities are contained within a single process. This helps prevent rogue Java applications
from interfering with other activities in the system.
• The garbage collector has to freeze all the threads in the JVM before it can operate on memory.
Ordinary threading implementations may not include a way to selectively freeze threads, but it is
severely antisocial to freeze threads that are not involved with the JVM. An operating system
keeps code in one process from interfering with other processes; this mechanism makes it
impossible for library threads to freeze unrelated threads.[7]
[7]
Kernel threads don't automatically lock the entire system for each garbage collection, but it is
possible to lock up the system during garbage collection if the JVM is carelessly implemented.
Input/Output
Java programs have two paths to I/O. The usual path runs though the JVM to libraries that bind it to the
operating system's I/O services. The performance of these services is similar to that of the same services
from C. Most of the elapsed time for I/O passes while the system is in control. It makes little difference
that the JVM adds another layer of wrapping around the basic service.
The standard JVM is imperfectly adapted to asynchronous I/O. The base Java specification supports
asynchronous user interface I/O through AWT, and asynchronous network I/O through the networking
packages. These sources of asynchronous I/O are specifically handled in the JVM. The mechanisms for
handling this asynchrony are tightly bound to the glue code that uses them and are not accessible from
Java applications.
If a program wants to support asynchronous I/O from a new device, it has a choice of finding a way to
push it through AWT or appointing a service thread to wait on the device.
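A sketch of the service-thread option; the blocking stream stands in for whatever interface the device exposes, and handleDeviceData is an invented dispatch point:

import java.io.IOException;
import java.io.InputStream;

class DeviceService extends Thread {
    private final InputStream deviceIn;   // the device's blocking input

    DeviceService(InputStream deviceIn) {
        this.deviceIn = deviceIn;
    }

    public void run() {
        byte[] buffer = new byte[64];
        try {
            int n;
            while ((n = deviceIn.read(buffer)) >= 0) {  // blocks until data arrives
                handleDeviceData(buffer, n);
            }
        } catch (IOException e) {
            // device failed or was closed; let the service thread exit
        }
    }

    private void handleDeviceData(byte[] data, int count) {
        // invented dispatch point: hand the data to the application
    }
}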
Some Java implementations, notably JavaOS, use Java code for most OS services. They even write device
drivers in the Java programming language.
Note
Device drivers cannot be written in pure standard Java programming language. The registers that
control and monitor I/O devices are mapped into either regular memory or a special I/O address space.
In either case, they are accessed with pointers to primitive data types. References to objects supported
by the Java bytecode instruction set do not suffice. Interrupts are also commonly used for I/O, and the
Java platform has no direct access to interrupts or interrupt masking.
Java device drivers either include a layer compiled to native code or they use an "extended" JVM
instruction set that adds limited support for pointers.
Graphics
The abstract windowing toolkit (AWT) is the Java platform's graphics API. Swing and Truffles extend
AWT toward greater functionality and a touch-screen interface, respectively. Nobody would characterize
AWT as fast, but like many real-time systems, it is an event-driven system. It has pushed improved event
handling facilities into the Java programming language and libraries.
Polling. Programs with user interfaces need to know when something like a mouse click or a key press
happens in the real world. These events don't happen when an application asks for them; they happen when
the user chooses to move something. The program could keep checking (or polling) for each event that
interests it, but doing so would be tedious, and since users are slow compared to computers, it would waste
time.
Imagine yourself sitting at a traffic light. If it is a long light, you might consider reading a newspaper or
catching up on e-mail while you wait for it, but if you get involved in another activity, you need to
remember to check (or poll) the light every few seconds. If you check it too often, you don't get much
reading done. If you leave too long an interval between polls, you may not respond to a green light
promptly and the cars behind you may (um …) notify you shortly after the light changes.
Software has the same dilemma. If it polls frequently, it does not accomplish much else; if it polls
infrequently, it is unresponsive.
Events. Wouldn't it be nice if the traffic light made a polite noise when it was about to turn green? You
could then give most of your attention to e-mail and only switch attention to the light when it is about to
change color. That is what events do for software.[8]
[8]
The full treatment of real-world events requires the real-time Java enhancements to events and
asynchronous interrupts discussed in Chapters 11 and 17.
The Java event system allows a program to register an object, called a listener, with the event system, as
shown in Example 2-2.
This tiny AWT application puts up a little window with a button in it, then it loops forever, sleeping. When
the user asks the window to close, AWT calls the windowClosing method in the listener object
registered for that frame. In this case, the windowClosing method just shuts down the whole application.
The important thing is that AWT reaches in and calls windowClosing while the test object is busy
sleeping. The application doesn't need to poll the frame while it waits to see a "closing" flag asserted.
Events are tied to AWT. The event system classes are part of the AWT package, but the tools are in the
Java programming language. Other specialized event systems can be implemented outside the AWT, but it
is sometimes easier to feed all real-world input through AWT (forcing hardware like switches, valves, and
sensors to look like AWT event sources) than to maintain a separate event system.
Interpreter Implementation
The core of the JVM is a loop that interprets the Java bytecodes. There are between 200 and 256 bytecodes,
depending on how the JVM has enhanced the standard set with optimizing extensions. Most of the
operations are nearly trivial: pushing literal values, performing simple arithmetic and logic on stacked data,
and flow control. Other operations are complex; it takes only one opcode to allocate an array of objects,
execute a switch statement, or invoke a method.
Standard Interpreter
A completely untuned JVM uses a bytecode interpreter written in C. Such implementations are nicely
portable, but they gave Java its early reputation of running up to 40 times slower than comparable C++
programs. These JVMs are no longer taken seriously as anything but a hack to get a quick JVM port.
Example 2-2 Registering a window listener
import java.awt.*;
import java.awt.event.*; // This brings in WindowListener and WindowEvent
public class test {
    public static void main(String [] args){
        Frame fr = new Frame("test frame");
        /*
        Create an anonymous class that implements
        the WindowListener interface.
        Use Frame's addWindowListener to register it.
        AWT will now call methods in the listener whenever
        something interesting happens to the frame.
        */
        fr.addWindowListener(new WindowListener() {
            // Exit when the window starts closing
            public void windowClosing (WindowEvent e){
                System.exit(0);
            }
            // Ignore all other events.
            public void windowOpened (WindowEvent e){}
            public void windowClosed (WindowEvent e){}
            public void windowActivated (WindowEvent e){}
            public void windowDeactivated(WindowEvent e){}
            public void windowIconified (WindowEvent e){}
            public void windowDeiconified(WindowEvent e){}
        });
        fr.add(new Button("a button"));
        fr.setSize(200, 100);
        fr.setVisible(true);
        // Loop forever, sleeping; AWT calls the listener from its own thread.
        for (;;) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
            }
        }
    }
}
Optimized Interpreter
Recoding the bytecode interpreter into assembly language is fruitful. Humans make better decisions about
global register conventions than do compilers. A simple job of recoding keeps a global instruction pointer
and stack pointer. More careful work will keep at least the top entry in the stack in a register, and perhaps
the top four or more. (See "Bytecode Interpreter" on page 16.) The most highly optimized interpreter is
still slower than code that is compiled to native machine instructions, but the gap might be closed to less
than a factor of four.
From the real-time point of view, optimized interpreters are nearly entirely a good thing. Optimizations
that don't keep more than one stack position in a register give a performance improvement without making
performance hard to predict. Interpreters that keep the top few stack entries in registers are still predictable,
but the execution time of a single instruction depends on whether it is able to execute entirely from
registers, and that depends on the instructions that are executed before it. This complexity makes
performance prediction difficult, but it is still under control, and the speedup is worth the added difficulty
in predicting the performance of small groups of instructions.
JIT
A JIT is a just-in-time compiler. Classes are loaded into the JVM in bytecode form, but the JIT compiles
the bytecodes into native code some time after they are loaded. A JIT might compile all the methods in the
class as part of the loading process, or it might not compile a method until it has established that the
method is "hot" by calling it a few times.
A JIT generates native code. In theory, a JIT can execute benchmarks at least as fast as code that is
compiled with a conventional compiler, but a JIT poses serious problems for real-time programs. A JIT
design balances two tradeoffs:
1. Compile early and get the speed of compiled code immediately, or compile late and lessen the risk
of compiling a method that will seldom be used.
2. Optimize compiled code carefully and maximize the performance of code that calls compiled
methods frequently, but also maximize the pause while each method is compiled.
Under a JIT, the longest execution time for a method is probably the time to compile the method plus the
time to execute the compiled code. A fast compiler will lessen the damage, but it will probably generate
poorly optimized code.
If the JIT uses a simple rule—for example, compile each method the first time it is executed—the program
can control the point at which it subjects itself to the JIT overhead. The system has to be designed to
compile all methods before they are needed for real-time operation. Such a design requires care, and the
code that calls each time-critical method before it is needed surely needs explicit documentation to keep
some future programmer from removing the superfluous calls.
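Under that simple rule, the warm-up can be a block of calls like this sketch (names invented):

class WarmUp {
    // Called once at startup, before real-time operation begins.
    // If the JIT compiles a method on its first call, these otherwise
    // pointless calls control when the compilation pause is paid.
    // DO NOT REMOVE: the calls look superfluous but are load-bearing.
    static void warmUp() {
        timeCriticalStep(0);
        // ... one call per time-critical method
    }

    static int timeCriticalStep(int input) {
        return input + 1;   // stand-in for real work
    }
}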
Note
A JIT is often a less expensive way to increase performance than a faster processor. A simple JIT
generally gives applications roughly five times better performance than does an interpreter. Systems
for which a few dollars in production cost are worth a struggle may find it less expensive to add a JIT
license and RAM to hold compiled code than to upgrade the CPU.
All common JITs share a problem. They compile methods to native code and put the compiled code in a
buffer, but they are not good at removing compiled code from the buffer. With a static environment, all the
live methods are soon compiled and the system runs in a steady state. If the system runs different
applications over a span of days, a compost of old methods will build up in the native code buffer until the
system runs out of memory and chokes.
Most JITs are written for desktop systems. That class of hardware has plenty of memory for a big code
buffer, and the JVM seldom runs for more than a few hours at a time. On such systems, overflowing the
code buffer is unlikely and JIT builders concentrate on performance.
Many real-time systems are short on memory, but they are still expected to run smoothly until the
hardware wears out or power fails. Fortunately, most real-time systems also run the same set of methods
for their whole lifetime. Such systems do better with a JIT that never discards its compiled code. A JIT that
flushed old code out of the compiled code buffer would make it hard to control the time at which methods
were compiled.
Snippets
Early versions of Sun's Hotspot JVM used a snippet compiler. Snippet compilation is well suited to
dynamic code like the collections of classes that make up a Java application. A snippet compiler is hard to
drive to worst-case behavior, but that worst-case behavior is abysmal.
A snippet compiler compiles blocks of Java bytecode much smaller than a method. It uses insanely
aggressive assumptions when converting the bytecodes to native code. For instance, it feels able to inline
nonfinal methods. The inlined target could be wrong the next time the snippet is used, but chances are it
will be right. The snippet includes code that verifies that the snippet is still valid. If it is not, the snippet
is discarded and execution reverts to the original bytecode. The same type of trick works for any code that
has variable behavior; for example, locking and conditional branches. It also lets the compiler freeze the
addresses of objects into code. When the garbage collector relocates those objects, it discards snippets that
might refer to them. The JVM regenerates the snippets with new addresses when it needs them.
With a carefully contrived benchmark, snippets can give arbitrarily good speedup. Build a nest of accessor
methods as deep as you like; the snippet compiler will convert n method invocations and one field
reference into one field reference. Since the speedup is proportional to the number of nested accessor
methods, this trick can generate any speedup you want to claim, but the speedup is not totally specious.
Real Java code sometimes includes deep nests of trivial methods. A conventional compiler cannot inline
ordinary methods, so a snippet compiler can easily outperform optimized C++ on a benchmark that
features virtual method invocation.
The basic assumption of a snippet compiler is that the cost of compiling tiny bits of code is small and the
likelihood of needing to recompile any particular snippet is low enough that the ongoing cost of
maintaining the snippets is far less than the speedup they offer. It is a demonstrably good assumption. The
Sun Hotspot JVM gives good performance without obvious JIT performance glitches.
Real-time performance analysis asks how bad it can get: every snippet could become invalid immediately.
If the JVM makes snippets aggressively, the worst-case performance is a little worse than a JVM that
works by generating native code for every block of bytecode, executing it, and discarding it. That situation
would probably result in performance less than a quarter as good as an ordinary interpreter. Using that
rough guess and assuming that a JVM with a snippet compiler runs about 20 times faster than an
"ordinary" interpreter, the worst-case performance of a snippet compiler is 80 times worse than its typical performance.
Many real-time applications can tolerate the worst-case behavior of a normal JIT because they can control
it. They don't need to design for worst-case behavior because they can force the compilation to take place
when it suits them. The same trick is possible for a snippet compiler, but the rules for generating and
discarding snippets are comparatively subtle and depend on aspects of the Java environment that
programmers like to ignore. Worse, the rules are undocumented and likely to change at each release of the
JVM.
The real-time parts of an application could be written so no reasonable JVM would need to discard
snippets. It takes more effort, however, than controlling a conventional JIT, and the performance
improvement for moving from a conventional JIT to a snippet compiler is not as great as the improvement
for moving from an interpreter to a JIT.
Native Code
Adding a JVM to a system does not suddenly eliminate the system's ability to run native code, and Java
programs have several facilities for interacting with processes outside the JVM. These separate processes
might be legacy applications, highly tuned code to handle some tight timing situation, or code that uses
low-level hardware facilities that cannot easily be reached from Java classes.
The important observation here is that the JVM is just a process. Operating systems can run many
processes and manage interactions among them. This is old technology. Although handling real-time
constraints in a mix of high-level languages and assembly language is sometimes difficult, real-time
programmers can do it. The Java language and JVM add new tools to this collection.
A native process can be coded in the Java language. Several Java compilers[9] can create binary images that
execute without the services of a JVM. A programmer can use these compilers to capitalize on many of the
strengths of the Java language without accepting the overhead of an interpreter or even a JIT.
[9]
A few examples of Java compilers that can generate stand-alone programs are the Java compiler from the
Free Software Foundation, the cooperative Java effort between Edison Design Group and Dinkum Software,
and Symantec Visual Cafe.
Native Methods
A native method is native code that is bound into the Java environment. It is invoked from Java code and
can call back into the JVM. If it avoids references to Java objects, a native method can perform like any
other native code except that it is subject to the Java runtime (including garbage collection).
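The Java side of a native method is just a declaration plus a library load, as in this sketch (the library and method names are invented):

class DeviceRegisters {
    static {
        // Binds the native implementations from libdevice.so / device.dll.
        System.loadLibrary("device");
    }

    // Implemented in C and reached through JNI.
    native int readRegister(int offset);
    native void writeRegister(int offset, int value);
}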
Compilers that compile Java classes directly into native code have been developed. If these binary objects
are to be used by ordinary Java applications running in a JVM, the obvious path is JNI, the Java Native
Interface.
The native method interface, JNI, is designed to be used by programmers. It does not require the
programmer to make heroic efforts to bring parameters from the Java stack to the programmer's
environment or to access objects that were created by the JVM. The usability features of the JNI slow it
down. The cost of moving between the JVM and a native method depends on the hardware architecture
and the implementation of the JVM, but it is typically equivalent to dozens of lines of C. Furthermore,
every time a native method wants to reference an object controlled by the JVM, it must first tell the JVM
to tie down the object. Some garbage collectors might free objects that are only referenced by a native
method and not tied down, and any garbage collector might relocate such objects.
This all amounts to a significant performance penalty for native methods. They will give good
performance for a lengthy function that does not need much access to objects maintained by the JVM. They
may perform worse than interpreted bytecode for short methods that use objects maintained by the JVM.
The JIT interface can also call native code, but it assumes the native code was generated by a JIT attached
to the JVM. The JVM has no qualms about asking a JIT to generate code that carries the load of operating
with JVM internal conventions. This interface is relatively undocumented, unforgiving, and tedious to use,
but it is relatively efficient.
Chapter 3. Hardware Architecture
• Worst-Case Execution of One Instruction
• Management of Troublesome Hardware
• Effects on the JVM
The software engineer for a real-time system should not ask how fast a processor can go, but how much it
can be slowed down by unfortunate circumstances. From this point of view, it is unfortunate that modern
processor architecture optimizes for throughput. It makes excellent sense for most systems to trade a rare
factor of 100 slowdown for a performance doubling everywhere else.
This design philosophy is a major part of the reason you are probably using a desktop computer several
hundred times faster than a departmental minicomputer built ten years ago, but it leaves real-time
programs with a choice of obsolete processors or processors with scary real-time characteristics.
Many hardware architects' tricks are aimed at making the best of slow memory. Memory can be made very
fast, but lightning-fast memory is expensive. Slower memory is orders of magnitude less expensive than
the stuff that nearly keeps up with modern processors. The clever trick is to keep a copy of the most
recently used memory in high-speed memory cache, and have the processor look there before it goes to
slow memory. It turns out that just a few kilobytes of cache are enough to satisfy most memory loads and
stores from the cache. So how long does a load instruction take? If it is loading from cache, the instruction
might take one cycle.[1] If the load is not satisfied from the cache, we have what is called a cache miss, and
the processor will have to go to slow memory. It might even have to write some data from the cache to
make space for the new data. It could take a hundred cycles or more before the load instruction completes.
If the processor went directly to the memory, all memory access would take about a tenth as long as a
cache miss. A real-time programmer would like the opportunity to choose a factor of ten slowdown
everywhere over a factor of 100 slowdown at hard-to-predict intervals.
[1]
Talking about a one-cycle instruction is an oversimplification. It may take three or five or even seven
cycles to work its way through the processor pipeline, but it is called a single-cycle instruction because it
doesn't need more than one cycle at any stage of the pipeline.
Note
Seymour Cray solved the cache problem the other way. He didn't put cache into Cray computers. He
made all the memory run at cache speeds.
Demand paging is another memory optimization that everyone but real-time programmers love. Demand
paging makes disk space behave like stunningly slow memory. All the computer's RAM acts as a cache for
the disk-based memory. Provided that the software running on the computer acts like typical software, the
system's throughput will degrade slowly as it needs more and more memory until it hits a point where it
starts to thrash and suddenly slows to a crawl. It is a fine thing to be able to effectively buy RAM for the
price of hard-disk space, but a real-time programmer sees that memory access has now slowed from one
cycle for a cache access to about 10 milliseconds for a disk access.
Memory access is a particularly rich vein of performance variation, but the CPU itself can cause trouble.
Branch prediction causes the processor to execute a branch instruction faster if it goes the same way it has
been going in the recent past. This gives the instruction a significant variation in execution time, depending
on its past history.
Consider
ld r0,r7,12
—an assembly language instruction that means load register r0 from the memory 12 bytes off the address
in register r7. To give us a starting point, let's say that the processor is rated at 100 MIPS. We would
expect this instruction to take about a hundredth of a microsecond, or 10 nanoseconds.
Most of the time the instruction will take about 10 nanoseconds, but if everything conspires to hurt the
performance of this single instruction, it could take as much as 100 milliseconds to get to the next
instruction. That represents a performance difference of about a factor of ten million between the most
likely execution time for the instruction and the longest time it could take.
Worst-Case Scenario
If the processor is lucky, the instruction will be in the instruction cache. Reading the instruction from the
instruction cache takes just a cycle or two.
If the instruction is not in the cache, the processor must read it from memory.
Reading from memory takes much longer than reading from the cache. In the best case, it can take just a
few cycles. In the worst case, the processor finds that the instruction falls in a page that is not in its address
translation cache (also called a translation lookaside buffer, or TLB).
To discover how it should treat the address of the instruction, the processor has to find a page table entry in
the page table. A CISC processor would find and read a page table entry invisibly—except that it would
read memory several times as it found the data in a search tree. Many RISC processors generate an
exception when a page table entry is not cached and let the TLB fault exception handler find the page table
entry and make a place for it in the address translation cache.
At least the exception handler will not generate a translation fault. That would cause recursion that would
quickly crash the system. The OS ensures that the code and data for handling address translation faults will
not cause translation faults.
A typical address translation fault handler might be around 30 instructions long. Each of those instructions
has some execution time. All of them have to read the instruction from cache or memory. Some of them
also read data, and some of them write data.
The page table might indicate that the page is not in RAM but rather in secondary storage, probably a disk
file.
The OS has to read the page from disk into RAM, and it may have to write a page to the disk to free a
place to read the page into. Disks are getting faster, but a disk read takes on the order of 10 milliseconds.
The instruction execution and memory access times up to this point have been measured in nanoseconds.
Next the instruction must load its data. The procedure follows the same path as reading the instruction
except that when the processor loads the instruction cache, it can simply replace other instructions in the
cache. Since the data cache holds data that may have been changed since it was loaded, the processor may
have to store data from the cache to make space for the new data.
Both the store of the cache line and the load of the cache line can generate address translation cache faults
and even demand paging.
Exceptions happen.
We don't have to worry about software exceptions. They indicate things like division by zero. You can
predict when that will happen, and we'll make sure it doesn't. (Furthermore, division by zero would be
remarkable for a load instruction.)
Hardware interrupts will occur from time to time. An interrupt could occur directly before or after this
instruction. Another interrupt could take place before the first one is fully serviced, then another and
another…. Control might never return to this instruction stream.
It is safe to assume that things will not get that bad. An interrupt load so heavy that the system spends all
its time servicing interrupts is either a sign of a serious defect or an unusual system design. The intervals
between interrupts are usually distributed in a bell-curve-like fashion. The likelihood of getting even one
interrupt between two instructions is low. The chance of getting two is much lower. Unless one of the
interrupt sources can generate interrupts as fast as you can service all the system's interrupt sources, the
worst case is that every device in the system that can generate an interrupt will choose this moment to raise
its interrupt.
On a 100 MIPS RISC machine, interrupt service tends to take about 10 microseconds if the cache behaves
terribly for the interrupt code. If we have a system with ten sources of interrupts, all the interrupts together
will use 100 microseconds.
The specified access time for a memory chip is the best you can do. If the memory is DRAM, it needs to
be refreshed from time to time. Many systems use direct memory access (DMA), which uses memory
bandwidth and gets in the way of the processor. If you let it, DMA can use all the memory bandwidth.
Then, the processor will not be able to access memory until the DMA completes. A big DMA across a bus
could take 10 milliseconds.
If our processor is rated at 100 MIPS, we would expect the typical instruction to take 10 nanoseconds.
When we've considered all the major factors that can slow the instruction down, the time could be worse
than 30 milliseconds.
Table 3-1. Worst-case delays for one load instruction (nanoseconds)
Demand paging for dirty data cache write (read page)    10000000
Demand paging for data cache write (write dirty page)   20000000
Demand paging for data cache write (read page)          10000000
Demand paging for data cache read (write page)          20000000
Demand paging for data cache read (read page)           10000000
Interrupts                                                100000
One big DMA                                             10000000
Total                                                  100101660
Practical Measures
You can prevent the worst case from getting this bad. To start with, real-time systems usually try to keep
time-critical code in page-locked memory. That means that the operating system will always keep those
pages in real memory, not off in the page file on disk. That cuts 90 million nanoseconds off the worst-case
time.
Getting DMA under control reduces worst-case degradation by another two orders of magnitude. Systems
designed for real time usually have tunable DMA. The DMA can be throttled back to a percentage of the
memory bandwidth and even made to stop entirely while the system services interrupts. We can still slow
the completion of a single instruction from 10 nanoseconds to 203,310 nanoseconds, a factor of twenty
thousand, even after throttling DMA down to 50 percent of bus bandwidth and page locking the instruction
and data memory.
This kind of worst-case scenario is an exercise in balancing risk. Every possible type of system overhead
has to land on a single instruction to degrade its performance as shown in Table 3-1. Bad luck on that scale
is extremely unlikely for a single instruction and approaches impossibility for two instructions in a row. At
some point, you have to say that it is more likely for the processor to spontaneously disassemble into a
pinch of sand than to degrade a substantial piece of code by more than 50 percent or so. The careful
programmer worries about it; most others just assume that everything will continue to work fine.
Massive performance perturbation is like Brownian motion. It is visible and sometimes significant in the
microscopic frame of reference. At a large scale, it is so unlikely that it is universally ignored.
The rule-of-thumb figures in Table 3-2 seem to work well for normally careful code on modern 32-bit
processors with fast memory. Oddly, the performance variability on a processor doesn't seem to depend
strongly on the performance of the processor. Note that you should never assume better than 10-
microsecond precision unless you understand and control the entire state of the machine. Performance
variability (measured in microseconds) worsens with execution time for intervals longer than 100
microseconds, but the effect diminishes rapidly enough that ordinary engineering care should be enough to
accommodate likely variation.
Table 3-2. Timing jitter with no interrupts, translation faults, or demand paging
Typical Time    Worst-case degradation
Up to 10 µs     10 µs
Up to 100 µs    factor of 2
Above 100 µs    200 µs
Management of Troublesome Hardware
The huge performance degradation caused by demand paging is the first to go. Real-time programmers
avoid demand paging. Real-time programs run in either pinned or page locked memory, which the
operating system leaves in real memory. Often the memory used by a real-time program is page locked by
default because the program is running under an operating system that does not support demand paging.
Managing DMA
Direct memory access is too useful for real-time systems to ignore, but it can be controlled. Many DMA
controllers can be throttled such that they use no more than a specified fraction of memory bandwidth, or
they can be configured to get off the memory bus entirely when the processor is servicing interrupts.
Depending on system requirements, these mechanisms can convert DMA from a crippling problem to a
minor inconvenience.
Some systems have multiple memory buses. One bus is used for DMA and the other is reserved for high-
performance memory access. Sometimes entire pools of memory are isolated to prevent unexpected
interference with access to the memory.
In any case, DMA hurts a program's performance only when the program attempts to access memory, not
when the program is run from the cache.
Managing Cache
Cache is usually hidden from application programmers. At most, they are given instructions that flush or
invalidate caches. These instructions are required if the program deals with DMA or self-modifying code.
Processors are beginning to support caches whose contents can be controlled by software. The simplest
such mechanism just lets part of the cache be configured as cache-speed memory. The address and size of
the high-speed memory can be set with system configuration registers. Of course, the cache that has been
configured as memory no longer behaves as cache, thus slowing down everything except the code that uses
the cache-speed memory. This is a bad plan for most systems, but real-time programmers should know
what code needs to have consistent high performance.
More sophisticated caches allow chunks of the cache to be dedicated to specified processes or regions of
memory. All the cache continues to operate as a cache, but some of the cache is taken out of the general
pool and dedicated to code or data that needs predictable performance. The dedicated cache may still fault,
so its performance is not consistently optimal, but access to the dedicated part of the cache is controlled:
each cache fault can be analyzed and predicted. Code that uses a dedicated cache partition knows when the
faults will occur, so its performance is predictable.
The operating system can take some measures to make the cache more predictable. For instance, on
processors that support it, the cache can be considered part of a process state and preserved across context
switches.
The address translation cache (ATC, also called a translation lookaside buffer or TLB) is a cache. It would
make no sense to convert an ATC to memory, but dedicating parts of the ATC to software components is
common practice, and preloading the ATC at process startup and context switch time is not common, but it
is done; e.g., Microware's OS-9 operating system preloads the ATC on most processors that support it.
Managing Interrupts
Some systems just don't use interrupts. That removes the timing uncertainty caused by interrupts, but it
surrenders a powerful tool.
To some extent interrupts can be predicted—not exactly when they will occur, but about how often they
should be expected. This predictability lets the designer for a real-time system calculate how likely a block
of code is to experience a given number of interrupts.
In desperation, a program can mask interrupts. Interrupts can be masked only by system privileged code,
and masking increases interrupt response time (which is a critical performance figure), but it absolutely
prevents interrupt overhead while the mask is in place.
Effects on the JVM
Even if Java programs could control the way hardware details affect their performance, Java religion
(Write Once, Run Anywhere, or WORA) dictates that they shouldn't try. There is a contradiction in even
trying to use hardware-aware techniques to control the performance of a Java program without introducing
hardware dependencies.
Java programs are software. If the programmer is willing to sacrifice enough, Java software can surely
achieve time precision like that of similar C programs, but it would require heroic efforts, and it is the
wrong approach.
1. It is hard to document hardware dependencies in Java programs. How do you express cache line
alignment of data when Java won't even expose the size of data structures?
2. The programmer is left using low-level inspection tools or "tweak and benchmark" techniques for
hardware-dependent optimizations. These techniques work, but imagine the documentation
required to explain a field inserted in a class to adjust the alignment of subsequent fields.
3. Just thinking about it is enough to make a committed Java programmer feel sweaty and a little
faint.
There are several ways to escape from Java into native code. These mechanisms are included specifically
to enable programmers to connect code written in other languages to Java programs. It is relatively easy to
perform machine-dependent optimization on C or assembly language code, and native methods and
processes are inherently machine dependent.
Further discouraging news: native methods look as if they would let machine-dependent optimizations work, but, as with all timing work on advanced processors, micro-scale optimizations that work reliably are harder than they seem. Native code supports optimization for throughput, but performance cannot be made predictable on the microsecond scale unless the code can control the cache and the other factors mentioned in "Worst-Case Execution of One Instruction" on page 39.
The RTSJ does not try to micro-optimize the performance of the JVM. It targets problems that cause
programs to miss deadlines by tens or hundreds of milliseconds such as:
• Garbage collection
• Priority inversion
• Uncontrolled asynchronous events
Chapter 4. Garbage Collection
• Reference Counting
• Basic Garbage Collection
• Copying Collectors
• Incremental Collection
• Generational Garbage Collection
• Real-Time Issues
The Java specification does not require garbage collection; it simply provides no other mechanism for returning memory that is no longer in use to the free pool. Like the base Java specification, the RTSJ does not require garbage collection. It does require the implementation to include the class GarbageCollector, which formalizes communication with the garbage collection mechanism, but an implementation with no garbage collector at all works fine with GarbageCollector.
The two obvious choices for managing unused memory in a JVM are to dictate that no application may
allocate objects that it does not intend to keep around forever, or to find unused objects and recover their
memory.
The process of identifying unused objects and recovering their memory is called garbage collection.
The JVM will recover garbage when it thinks the system is idle, when the program requests garbage
collection, and when there is not enough free memory to meet the memory request for a new object
allocation.
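For illustration only (this example is not from the original text): standard Java gives the program no way to force collection. System.gc() is merely a request that the JVM may ignore.

    public class GcRequest {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            System.out.println("free before: " + rt.freeMemory());
            byte[] garbage = new byte[1 << 20]; // allocate a megabyte...
            garbage = null;                     // ...then drop the only reference
            System.gc();                        // a request, not a command
            System.out.println("free after:  " + rt.freeMemory());
        }
    }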
Reference Counting
If each object contains a counter that tracks the number of references to that object, the system can free an
object as soon as the reference counter goes to zero. This forces every operation that creates or deletes a
reference to an object to maintain the reference count, but it amortizes the cost of garbage collection over
all those operations and it frees memory at the earliest possible moment. Garbage collection by reference
counting is simple and reliable, except for one problem.
Reference counting cannot easily detect cycles. If A contains a reference to B, and B contains a reference
to A, they both have a reference count of one. The garbage collector will not be able to free them even
though there is no way to reach A or B. If reference counting were inexpensive, it would make a nice primary garbage collection algorithm, backed by another collector that could run occasionally to free the cycles that reference counting leaves behind. Unfortunately, reference counting is expensive in both time and space. It adds complexity to every operation that can store a reference, and it adds a reference count field to every object.[1]
[1]
Since all of memory could be full of objects that reference a single object, the reference count field must be able to accommodate counts comparable to the size of memory.
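To make the bookkeeping concrete, here is a minimal sketch (mine, not the book's) of manual reference counting in Java. Counted, retain, and release are illustrative names, and each object holds at most one outgoing reference to keep the sketch short.

    class Counted {
        int refCount = 0;
        Counted ref;                        // the single outgoing reference in this toy model

        void retain() { refCount++; }       // called whenever a reference is stored
        void release() {                    // called whenever a reference is dropped
            if (--refCount == 0) {          // free at the earliest possible moment
                System.out.println("freed " + this);
                if (ref != null) ref.release();
            }
        }
    }

    public class RefCountDemo {
        public static void main(String[] args) {
            Counted a = new Counted(); a.retain();  // one outside reference to a
            Counted b = new Counted(); b.retain();  // one outside reference to b
            a.ref = b; b.retain();                  // a -> b
            b.ref = a; a.retain();                  // b -> a: a cycle
            a.release();                            // drop both outside references...
            b.release();
            // Both counts are now 1, held only by each other. Neither object
            // is reachable, yet neither is ever freed: the cycle defeats the counts.
        }
    }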
Basic Garbage Collection
• You select a root set: the set of objects and scalar references that you know the program can reach. This contains static fields, local variables, parameters, and other data such as JVM internal pointers.
• The root set and every object that can be reached on a path from the root set are live. All the other objects are dead and can be freed.
There is a strong connection between LISP and Java. The primary inventor of Java, James
Gosling, is also the person responsible for Gosling emacs, an implementation of emacs that is
interpreted by a built-in LISP interpreter.
All the classes derived from the abstract Reference base class contain a special reference to
another object.
The reference in a PhantomReference will not prevent the garbage collector from deciding that
an object is unreachable. If an object has a PhantomReference, the garbage collector will place
that PhantomReference object on a special queue when it cannot be reached except through
phantom references. This allows the application to respond to the object's release in more
elaborate ways than a finalize method could support.
The reference in a WeakReference object will not prevent the referenced object from being
garbage collected, but the garbage collector sees the reference and makes it null when the
referenced object is ready to finalize. It may also place the WeakReference object in a queue.
This is good for data structures like WeakHashMap. It makes it easy to find objects, but the application does not need to remove objects from the map: when objects cannot be reached except through the hash table, they are automatically removed.
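As a concrete illustration (not from the original text), the standard java.lang.ref.WeakReference API can be exercised like this. Note that System.gc() is only a hint, so the final line prints null on most JVMs but is not guaranteed to.

    import java.lang.ref.WeakReference;

    public class WeakDemo {
        public static void main(String[] args) {
            Object strong = new Object();
            WeakReference<Object> weak = new WeakReference<>(strong);
            System.out.println(weak.get()); // the object: it is still strongly reachable
            strong = null;                  // drop the only strong reference
            System.gc();                    // request a collection (a hint only)
            System.out.println(weak.get()); // null once the collector has cleared it
        }
    }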
The simplest algorithm for garbage collection is mark and sweep. It is a straightforward conversion of the
preceding outline into an algorithm.
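The listing itself is not reproduced in this extract, so the following is a minimal Java sketch of the idea under my own naming (HeapObject, MarkSweep, roots, and allObjects are all illustrative, not the book's):

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    class HeapObject {
        boolean marked;                                   // the live flag
        final List<HeapObject> refs = new ArrayList<>();  // outgoing references
    }

    class MarkSweep {
        final List<HeapObject> allObjects = new ArrayList<>(); // every allocated object
        final List<HeapObject> roots = new ArrayList<>();      // the root set

        void collect() {
            for (HeapObject o : allObjects)
                o.marked = false;           // clear all marks
            for (HeapObject r : roots)
                mark(r);                    // mark everything reachable from the roots
            // Nothing is freed until this final loop, so the collector makes
            // no progress if it is preempted before reaching it.
            for (Iterator<HeapObject> it = allObjects.iterator(); it.hasNext(); )
                if (!it.next().marked)
                    it.remove();            // sweep: "free" each unmarked object
        }

        private void mark(HeapObject o) {
            if (o.marked) return;           // already visited; cycles stop here
            o.marked = true;
            for (HeapObject child : o.refs)
                mark(child);
        }
    }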
Since the algorithm does not free any memory until control reaches the last for loop, it makes no progress
if it is preempted.
The net effect of all this is that mark and sweep must grab complete control of the JVM for an amount of
time that depends on the number of objects in the system, the number of links between objects, the
performance of the processor, and the quality of the garbage collector's code. This shuts down execution of
everything but garbage collection for an interval somewhere between a tenth of a second and several
seconds.
These long pauses are terrible for real-time computation, and there is no known way to escape garbage
collection's time cost without paying a high price somewhere else. There are garbage collection algorithms
that greatly decrease the cost of garbage collection except for pathological cases (see the discussion on
page 55), but the pathological cases happen and they take as long as conventional garbage collection. If the
application is willing to commit to a memory budget (e.g., "I will allocate no more than 10 kilobytes per
second"), garbage collection can be made part of each memory allocation request. The problems with the
budgeting strategy are that it falls apart if a thread exceeds its budget and that it degrades the performance
of memory allocation substantially.
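A hedged sketch of the budgeting idea (all names here are mine, not the book's): the allocator charges a bounded slice of collection work to every allocation, so a thread that stays inside its budget never sees a long pause, at the cost of slower allocation.

    interface IncrementalCollector {
        void doWork(int units);   // perform a bounded slice of collection work
    }

    class BudgetedAllocator {
        private static final int WORK_PER_BYTE = 2;  // assumed tuning constant
        private final IncrementalCollector gc;

        BudgetedAllocator(IncrementalCollector gc) { this.gc = gc; }

        byte[] allocate(int size) {
            gc.doWork(size * WORK_PER_BYTE);  // pay for the allocation up front
            return new byte[size];            // then satisfy the request
        }
    }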
Mark and sweep really is that simple, and it takes time proportional to the number of live object references in the system plus the number of objects (live and dead) in the system.
Mark and sweep is simple, but it is hostile to real time. You can implement mark and sweep to be
preemptable at many points, but it cannot be preempted and resumed. (See Demonstration 4-1 and Figure
4-1.)
Demonstration 4–1 Mark and sweep is not preemptable
Imagine what could happen if mark and sweep were preempted and resumed:
Initial conditions:
Object D is alive, but the garbage collector has not reached it yet.
Object A contains no reference to B, and the garbage collector has already reached A and cleared its dead flag.
Thread t preempts the collector and moves the only reference to B from object D into object A (an object the garbage collector has already processed), clearing the field in D that held it.
The garbage collector resumes, scans D, and sees no reference to B.
It sees that B is still flagged as dead and frees it.
This is an error: B is still reachable through A.
A JVM that supports the RTSJ may support advanced garbage collection, but for real-time performance it allows code to avoid garbage collection altogether. (See Chapter 13.)
Defragmentation
After allocation and garbage collection proceed for a while, memory tends toward a stable state in which blocks of allocated memory are scattered between blocks of free memory. This is called fragmentation because the free memory is in fragments instead of one contiguous extent. Fragmentation causes two problems:
1. The memory allocator has to search for an extent of free memory big enough to accommodate the allocation request. Popular allocation algorithms either use the first chunk they find (that algorithm is called first fit) or look for the free extent that most nearly fits the allocation (best fit). The worst-case performance of both first fit and best fit is O(n), where n is the number of fragments. If there were just one extent of free memory, allocation time would be O(1).
2. Allocation requests can only be satisfied if the allocator can find a single extent of free memory big enough to satisfy the request. You cannot satisfy a request for 200 bytes with two 100-byte extents.
Moreover, once memory is fragmented, the standard fix is drastic: shut everything down, free all memory, and restart as best you can. Just moving allocated objects around so free memory is contiguous would require the system to locate every reference to each object it moves and update them. With a normal C-based system, this can be done only with the classic computer programmer's trick of "adding one more level of indirection."[2] This technique is best known for its heavy use in the classic Mac OS and is common in disk file systems.
[2]
I don't know where the saying originated, but it is commonplace among programmers (at least the
operating systems type) that almost every algorithm problem can be solved with one more level of indirection.
If programs are not allowed to hold pointers to memory, but only pointers to pointers to memory,
defragmentation can move memory objects and only update the single system pointer to the object. That
sounds wonderfully simple, but an actual implementation requires more infrastructure. The program must
be able to hold pointers to objects at least briefly unless it has an instruction set that lets it handle double
indirection as easily as single indirection.
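A minimal sketch of the handle idea, in Java for consistency with the rest of the book (Handle and CompactingHeap are my names, and the byte-array heap is a stand-in for real memory): the program holds only handles, so the defragmenter can move a block and update exactly one word.

    class Handle {
        int address;                          // current offset of the block in the heap
    }

    class CompactingHeap {
        private final byte[] heap = new byte[1024];

        byte read(Handle h, int offset) {
            return heap[h.address + offset];  // double indirection on every access
        }

        void move(Handle h, int newAddress, int size) {
            System.arraycopy(heap, h.address, heap, newAddress, size);
            h.address = newAddress;           // the only pointer that must change
        }
    }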
Garbage collection has a convenient interaction with defragmentation. A garbage collector must be able to
identify pointers, and by the end of most garbage collection algorithms, all the pointers to live objects have
been followed. The garbage collector has already accomplished the work that makes defragmentation
difficult. It would be sad to waste all this good information by not defragmenting memory as part of the
garbage collection process.
Defragmentation is not a natural part of mark and sweep collection, and it is not generally attached to it
since it would make a protracted process even longer. Other garbage collection algorithms defragment as
part of the process.
Copying Collectors
Copying collectors find garbage like mark and sweep collectors do but take a fundamentally different
approach to returning unreferenced memory to the free pool. Mark and sweep garbage collection works by
identifying live data, then freeing everything else. Copying collectors copy all live objects out of a region
of memory, then free the region. An outline of the algorithm is shown in Algorithm 4–2.
As each live object is identified, it is copied into the new region. The old version of the object is given a
forwarding address. As the garbage collector traverses the graph of live objects, it gives objects with
forwarding addresses special treatment:
1. It updates the reference in the current object to point to the new copy of the target object.
2. The forwarding address marks the object as the beginning of a cycle, so the traversal moves to
another branch.
When the traversal completes, every live node has been copied to the new region and every reference has
been updated to point directly to the copied instance of its target.
The region that objects have been copied into contains only live objects, and the live objects are all
allocated from contiguous memory. There is no fragmentation.
The region that objects have been copied from contains no live objects. It is ready to be the target for a
future iteration of collection.
The last advantage doesn't apply unconditionally to the Java environment. Java objects may have finalizers
that are expected to run before objects are freed.
A region can still be returned to free memory in a single block, but any finalizers in the region have to be
found and executed first.
A copying collector needs a free region as big as the region being collected, so the system can never use more than half of its total memory. For large-scale real-time systems this might be a negligible cost, but for many systems it is enough to rule out a simple copying collector.
void flip() {
    void *scan, *free;
    swap FromSpace and ToSpace pointers
    scan = ToSpace;
    free = ToSpace;
    // Copy all the objects in the root set into the new space
    for each RtPtr in root set
        RtPtr = copy(RtPtr);
    // Completing the truncated listing (Cheney-style scan): copy everything
    // the copied objects reference, until scan catches up with free
    while (scan < free) {
        for each reference P in the object at scan
            P = copy(P);
        scan += size of the object at scan;
    }
}
Incremental Collection
Long pauses while garbage collection completes are painful. Even systems that are not normally
considered real time appear broken if they stop responding for noticeable intervals.
Any garbage collector can be made slightly preemptable. The collector notices a preemption request,
proceeds backward or forward to a point where the memory system is consistent and there is no resource
leakage, and allows itself to be preempted. If garbage collection makes some progress when it is
preempted, it is an incremental collector. Unfortunately, when it is preempted, the reference graph that the
garbage collector works from becomes obsolete and invalid.
If a garbage collector were able to concentrate on finding garbage directly instead of finding live objects
(everything that is not alive is dead) it would be easy to make it incremental. A dead object cannot come
back to life,[3] so each time the garbage collector finds a piece of garbage, it can immediately free it and
make progress.
[3]
To make a dead object Y come back to life, routine X would have to place a reference to it in the root set
or in some live object. But, routine X cannot place a reference to Y anywhere unless it has that reference. If
routine X is holding a reference to Y, then Y is not dead.
// The head of this listing (Algorithm 4-3) is missing from this extract;
// the lines above the surviving tail are reconstructed from the text.
for each object x in the heap {
    referenced = false
    for each reference r in the root set and in every object {
        if (r refers to x) {
            referenced = true
            break
        }
    }
    if (referenced == false)
        free object x
}
This algorithm can be preempted with no delay, though it has to restart at the beginning. Unfortunately, it
permits garbage to survive through many iterations of the entire algorithm, and cycles are never reclaimed.
Imagine a linked list of 1000 nodes that has just been made inaccessible. The nodes in this list happen to be
ordered such that the collector sees the tail of the list first, then iterates back to the head.
The trivial incremental collector will reclaim one node of the list each time it executes.
Algorithm 4–3 fails because it has to restart each time it is preempted. A high priority thread that wakes up
every millisecond or so would never let this algorithm get beyond the beginning of its list of objects.
The insight behind Algorithm 4–4 is that a garbage collector can do all its computation on a snapshot of
memory. Any object that was garbage when the snapshot was made will still be garbage when the garbage
collector detects it. Each time the garbage collector detects a dead object, it tells the system to free the
corresponding object in active memory. It doesn't matter whether it frees the object in the snapshot. When
the garbage collection algorithm terminates, the entire buffer used for the memory snapshot is cleared and
used for the next cycle.
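The Algorithm 4–4 listing is not reproduced in this extract; the sketch below (my names, with the tracing and freeing steps stubbed out) shows the shape of the snapshot approach.

    class SnapshotCollector {
        private byte[] liveHeap = new byte[1 << 20];  // the memory the program mutates
        private byte[] snapshot;                      // working buffer, as large as the heap

        void collect() {
            synchronized (this) {             // the "long lock": writers blocked briefly
                snapshot = liveHeap.clone();  // O(n) copy of all collectable memory
            }
            // From here on the collector can be preempted and resumed at any
            // point: whatever was garbage in the snapshot is still garbage
            // in the live heap.
            for (int address : deadObjectsIn(snapshot))
                freeInLiveHeap(address);      // free the corresponding live object
            snapshot = null;                  // the buffer is reused next cycle
        }

        private int[] deadObjectsIn(byte[] image) {
            return new int[0];                // stub: trace the image as in Algorithm 4-3
        }

        private void freeInLiveHeap(int address) {
            // stub: return the block at 'address' to the free pool
        }
    }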
Except for the long lock while it makes a private image of collectable memory, this algorithm is even more
preemptable than Algorithm 4–3. After the copy operation, it can be preempted and resumed at any point.
Copying all memory is an O(n) time operation, where n is the size of memory, but it can still be quite fast.
First, this is a straightforward copy. It should run as fast as memory can handle the loads and stores.
Second, on a machine with an MMU, the memory can be copied with MMU lazy-copy techniques. The
MMU can be used to map the pages to two addresses and mark them read-only. It need only actually copy
pages that are written after it "makes a copy."
Algorithm 4–4 is in most ways an attractive incremental collector. Its only serious problems are its heavy
memory requirement for its working buffer and the way it shuts down writes to memory while it makes its
working copy.
There are many incremental garbage collection algorithms, most of them developed for multiprocessor
LISP machines.[4]