JVM Performance Engineering: Inside OpenJDK and the HotSpot Java Virtual Machine, by Monica Beckwith

The document discusses 'JVM Performance Engineering: Inside OpenJDK and the HotSpot Java Virtual Machine' by Monica Beckwith, which explores Java's evolution, performance optimization techniques, and advanced memory management. It is aimed at Java developers, performance engineers, and educators, providing insights into JVM internals and performance tuning. The book covers various topics including Java's type system evolution, modular programming, and the unified logging system, emphasizing the importance of performance engineering in software development.


JVM Performance Engineering: Inside
OpenJDK and the HotSpot Java
Virtual Machine
Monica Beckwith

A NOTE FOR EARLY RELEASE READERS


With Early Release eBooks, you get books in their earliest form—the
author’s raw and unedited content as they write—so you can take advantage
of these technologies long before the official release of these titles.
Please note that the GitHub repo will be made active closer to publication.
If you have comments about how we might improve the content and/or
examples in this book, or if you notice missing material within this title,
please reach out to Pearson at [email protected]
Contents

Preface
Acknowledgments
About the Author

Chapter 1: The Performance Evolution of Java: The Language and the


Virtual Machine
Chapter 2: Performance Implications of Java’s Type System Evolution
Chapter 3: From Monolithic to Modular Java: A Retrospective and
Ongoing Evolution
Chapter 4: The Unified Java Virtual Machine Logging Interface
Chapter 5: End-to-End Java Performance Optimization: Engineering
Techniques and Micro-benchmarking with JMH
Chapter 6: Advanced Memory Management and Garbage Collection in
OpenJDK
Chapter 7: Runtime Performance Optimizations: A Focus on Strings
and Locks
Chapter 8: Accelerating Time to Steady State with OpenJDK HotSpot
VM
Chapter 9: Harnessing Exotic Hardware: The Future of JVM
Performance Engineering
Table of Contents

Preface
Intended Audience
How to Use This Book
Acknowledgments
About the Author

Chapter 1: The Performance Evolution of Java: The Language and the


Virtual Machine
A New Ecosystem Is Born
A Few Pages from History
Understanding Java HotSpot VM and Its Compilation Strategies
HotSpot Garbage Collector: Memory Management Unit
The Evolution of the Java Programming Language and Its
Ecosystem: A Closer Look
Embracing Evolution for Enhanced Performance
Chapter 2: Performance Implications of Java’s Type System Evolution
Java’s Primitive Types and Literals Prior to Java SE 5.0
Java’s Reference Types Prior to Java SE 5.0
Java’s Type System Evolution from Java SE 5.0 until Java SE 8
Java’s Type System Evolution: Java 9 and Java 10
Java’s Type System Evolution: Java 11 to Java 17
Beyond Java 17: Project Valhalla
Conclusion
Chapter 3: From Monolithic to Modular Java: A Retrospective and
Ongoing Evolution
Introduction
Understanding the Java Platform Module System
From Monolithic to Modular: The Evolution of the JDK
Continuing the Evolution: Modular JDK in JDK 11 and Beyond
Implementing Modular Services with JDK 17
JAR Hell Versioning Problem and Jigsaw Layers
Open Services Gateway Initiative
Introduction to Jdeps, Jlink, Jdeprscan, and Jmod
Conclusion
Chapter 4: The Unified Java Virtual Machine Logging Interface
The Need for Unified Logging
Unification and Infrastructure
Tags in the Unified Logging System
Diving into Levels, Outputs, and Decorators
Practical Examples of Using the Unified Logging System
Optimizing and Managing the Unified Logging System
Asynchronous Logging and the Unified Logging System
Understanding the Enhancements in JDK 11 and JDK 17
Conclusion
Chapter 5: End-to-End Java Performance Optimization: Engineering
Techniques and Micro-benchmarking with JMH
Introduction
Performance Engineering: A Central Pillar of Software
Engineering
Metrics for Measuring Java Performance
The Role of Hardware in Performance
Performance Engineering Methodology: A Dynamic and
Detailed Approach
The Importance of Performance Benchmarking
Conclusion
Chapter 6: Advanced Memory Management and Garbage Collection in
OpenJDK
Introduction
Overview of Garbage Collection in Java
Thread-Local Allocation Buffers and Promotion-Local
Allocation Buffers
Optimizing Memory Access with NUMA-Aware Garbage
Collection
Exploring Garbage Collection Improvements
Future Trends in Garbage Collection
Practical Tips for Evaluating GC Performance
Evaluating Garbage Collection Performance in Various
Workloads
Live Data Set Pressure
Chapter 7: Runtime Performance Optimizations: A Focus on Strings
and Locks
Introduction
String Optimizations
Enhanced Multithreading Performance: Java Thread
Synchronization
Transitioning from the Thread-per-Task Model to More Scalable
Models
Conclusion
Chapter 8: Accelerating Time to Steady State with OpenJDK HotSpot
VM
Introduction
JVM Start-up and Warm-up Optimization Techniques
Decoding Time to Steady State in Java Applications
Managing State at Start-up and Ramp-up
GraalVM: Revolutionizing Java’s Time to Steady State
Emerging Technologies: CRIU and Project CRaC for
Checkpoint/Restore Functionality
Start-up and Ramp-up Optimization in Serverless and Other
Environments
Boosting Warm-up Performance with OpenJDK HotSpot VM
Conclusion
Chapter 9: Harnessing Exotic Hardware: The Future of JVM
Performance Engineering
Introduction to Exotic Hardware and the JVM
Exotic Hardware in the Cloud
The Role of Language Design and Toolchains
Case Studies
Envisioning the Future of JVM and Project Panama
Concluding Thoughts: The Future of JVM Performance
Engineering
Preface

For over 20 years, I have been immersed in the JVM and its associated
runtime, constantly in awe of its transformative evolution. This detailed and
insightful journey has provided me with invaluable knowledge and
perspectives that I am excited to share in this book.
As a performance engineer and a Java Champion, I have had the honor of
sharing my knowledge at various forums. Time and again, I’ve been
approached with questions about Java and JVM performance, the nuances
of distributed and cloud performance, and the advanced techniques that
elevate the JVM to a marvel.
In this book, I have endeavored to distill my expertise into a cohesive
narrative that sheds light on Java’s history, its innovative type system, and
its performance prowess. This book reflects my passion for Java and its
runtime. As you navigate these pages, you’ll uncover problem statements,
solutions, and the unique nuances of Java. The JVM, with its robust
runtime, stands as the bedrock of today’s advanced software architectures,
powering some of the most state-of-the-art applications and fortifying
developers with the tools needed to build resilient distributed systems. From
the granularity of microservices to the vast expanse of cloud-native
architectures, Java’s reliability and efficiency have cemented its position as
the go-to language for distributed computing.
The future of JVM performance engineering beckons, and it’s brighter than
ever. As we stand at this juncture, there’s a call to action. The next chapter
of JVM’s evolution awaits, and it’s up to us, the community, to pen this
narrative. Let’s come together, innovate, and shape the trajectory of JVM
for generations to come.
Intended Audience
This book is primarily written for Java developers and software engineers
who are keen to enhance their understanding of JVM internals and
performance tuning. It will also greatly benefit system architects and
designers, providing them with insights into JVM’s impact on system
performance. Performance engineers and JVM tuners will find advanced
techniques for optimizing JVM performance. Additionally, computer
science and engineering students and educators will gain a comprehensive
understanding of JVM’s complexities and advanced features.
With the hope of furthering education in performance engineering,
particularly with a focus on the JVM, this text also aligns with advanced
courses on programming languages, algorithms, systems, computer
architectures, and software engineering. I am passionate about fostering a
deeper understanding of these concepts and excited about contributing to
coursework that integrates the principles of JVM performance engineering
and prepares the next generation of engineers with the knowledge and skills
to excel in this critical area of technology.
Focusing on the intricacies and strengths of the language and runtime, this
book offers a thorough dissection of Java’s capabilities in concurrency, its
strengths in multithreading, and the sophisticated memory management
mechanisms that drive peak performance across varied environments.
In Chapter 1, we trace Java’s timeline from its inception in the mid-1990s
to the present day. Java’s groundbreaking runtime environment, complete
with the Java VM, expansive class libraries, and a formidable set of tools,
has set the stage with creative advancements and flexibility.
We spotlight Java’s achievements, from the transformative garbage
collector to the streamlined Java bytecode. The Java HotSpot VM, with its
advanced JIT compilation and avant-garde optimization techniques,
exemplifies Java’s commitment to performance. Its intricate compilation
methodologies, harmonious synergy between the “client” compiler (C1) and
“server” compiler (C2), and dynamic optimization capabilities ensure Java
applications remain agile and efficient.
The brilliance of Java extends to memory management with the HotSpot
Garbage Collector. Embracing generational garbage collection and the weak
generational hypothesis, it efficiently employs parallel and concurrent GC
threads, ensuring peak memory optimization and application
responsiveness.
From Java 1.1’s foundational features to the trailblazing innovations of Java
17, Java’s trajectory has been one of progress and continuous enhancement.
Java’s legacy emerges as one of perpetual innovation and excellence.
In Chapter 2, we delve into the heart of Java: its type system. This system,
integral to any programming language, has seen a remarkable evolution in
Java, with innovations that have continually refined its structure. We begin
by exploring Java’s foundational elements—primitive and reference types,
interfaces, classes, and arrays—that anchored Java programming prior to
Java SE 5.0.
The narrative continues with the transformative enhancements from Java
SE 5.0 to Java SE 8, where enumerations and annotations emerged,
amplifying Java’s adaptability. Subsequent versions, Java 9 to Java 10,
brought forth the Variable Handle Typed Reference, further enriching the
language. And as we transition to the latest iterations, Java 11 to Java 17,
we spotlight the advent of Switch Expressions, Sealed Classes, and the
eagerly awaited Records.
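The features this chapter surveys can be glimpsed in a short sketch. The class, record, and method names below are illustrative, not taken from the book, and the code assumes Java 17 or later:

```java
// Illustrative sketch: a sealed interface, record implementations, pattern
// matching for instanceof (Java 16+), and a switch expression (Java 14+).
public class TypeSystemDemo {

    // Sealed interface: only the permitted types may implement it.
    sealed interface Shape permits Circle, Square {}

    // Records: concise, immutable data carriers with generated
    // constructors, accessors, equals/hashCode, and toString.
    record Circle(double radius) implements Shape {}
    record Square(double side) implements Shape {}

    static double area(Shape s) {
        if (s instanceof Circle c) {                 // pattern variable binding
            return Math.PI * c.radius() * c.radius();
        } else if (s instanceof Square sq) {
            return sq.side() * sq.side();
        }
        throw new IllegalArgumentException("unknown shape");
    }

    // Switch expression: arrow form, yields a value, no fall-through.
    static String describe(int sides) {
        return switch (sides) {
            case 0 -> "circle";
            case 4 -> "square";
            default -> "polygon";
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Square(3.0)));   // 9.0
        System.out.println(describe(4));             // square
    }
}
```

Because the sealed hierarchy enumerates all permitted subtypes, the compiler can reason about exhaustiveness, which is one of the performance-relevant design points Chapter 2 explores.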
We then venture into the realms of Project Valhalla, examining the
performance nuances of the existing type system and the potential of future
value classes. This chapter offers insights into Project Valhalla’s ongoing
endeavors, from refined generics to the conceptualization of classes for
basic primitives.
Java’s type system is more than just a set of types—it’s a reflection of
Java’s commitment to versatility, efficiency, and innovation. The goal of
this chapter is to illuminate the type system’s past, present, and promising
future, fostering a profound understanding of its intricacies.
Chapter 3 extensively covers the Java Platform Module System (JPMS),
showcasing its breakthrough impact on modular programming. As we step
into the modular era, Java, with JPMS, has taken a giant leap into this
future. For those new to this domain, we start by unraveling the essence of
modules, complemented by hands-on examples that guide you through
module creation, compilation, and execution.
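The module walk-through described above can be previewed with a minimal, hypothetical module; the module name, package names, and paths below are illustrative, not the book's:

```java
// src/com.example.greeter/module-info.java -- a minimal module declaration.
module com.example.greeter {
    exports com.example.greeter.api;   // only this package is visible to consumers
    // requires java.logging;          // dependencies on other modules go here
}

// Compile and run (hypothetical layout):
//   javac -d mods/com.example.greeter \
//         src/com.example.greeter/module-info.java \
//         src/com.example.greeter/com/example/greeter/api/*.java
//   java --module-path mods -m com.example.greeter/com.example.greeter.api.Greeter
```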
Java’s transition from a monolithic JDK to a modular one demonstrates its
dedication to evolving needs and creative progress. A standout section of
this chapter is the practical implementation of modular services using JDK
17. We navigate the intricacies of module interactions, from service
providers to consumers, enriched by working examples. Key concepts like
encapsulation of implementation details and the challenges of Jar Hell
versioning are addressed, with the introduction of Jigsaw layers offering
solutions in the modular landscape. A hands-on segment further clarifies
these concepts, providing readers with tangible insights.
For a broader perspective, we draw comparisons with OSGi, spotlighting
the parallels and distinctions, to give readers a comprehensive
understanding of Java’s modular systems. Essential tools such as Jdeps,
Jlink, Jdeprscan, and Jmod are introduced, each integral to the modular
ecosystem. Through in-depth explanations and examples, we aim to
empower readers to effectively utilize these tools. As we wrap up, we
contemplate the performance nuances of JPMS and look ahead, speculating
on the future trajectories of Java’s modular evolution.
Logs are the unsung heroes of software development, providing invaluable
insights and aiding debugging. Chapter 4 highlights Java’s Unified
Logging System, guiding you through its proficiencies and best practices.
We commence by acknowledging the need for unified logging, highlighting
the challenges of disparate logging systems and the advantages of a unified
approach. The chapter then highlights the unification and infrastructure,
shedding light on the pivotal performance metrics for monitoring and
optimization.
We explore the vast array of log tags available, diving into their specific
roles and importance. Ensuring logs are both comprehensive and insightful,
we tackle the challenge of discerning any missing information. The
intricacies of log levels, outputs, and decorators are meticulously examined,
providing readers with a lucid understanding of how to classify, format, and
direct their logs. Practical examples further illuminate the workings of the
unified logging system, empowering readers to implement their newfound
knowledge in tangible scenarios.
Benchmarking and performance evaluation stand as pillars of any logging
system. This chapter equips readers with the tools and methodologies to
gauge and refine their logging endeavors effectively. We also touch upon
the optimization and management of the unified logging system, ensuring
its sustained efficiency. With continuous advancements, notably in JDK 11
and JDK 17, we ensure readers remain abreast of the latest in Java logging.
Concluding this chapter, we emphasize the importance of logs as a
diagnostic tool, shedding light on their role in proactive system monitoring
and reactive problem-solving. Chapter 4 highlights the power of effective
logging in Java, underscoring its significance in building and maintaining
robust applications.
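The `-Xlog` selector syntax the chapter dissects (tags, levels, outputs, decorators) can be previewed with a few representative launch lines; the file and application names are illustrative:

```
java -Xlog:gc*:file=gc.log:time,uptime,level,tags MyApp   # all gc* tags, decorated, to a file
java -Xlog:gc+heap=debug MyApp                            # a tag combination at debug level
java -Xlog:safepoint,class+load=info:stdout MyApp         # multiple selectors to stdout
java -Xlog:async -Xlog:gc* MyApp                          # asynchronous unified logging (JDK 17+)
```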
Chapter 5 focuses on the essence of performance engineering within the
Java ecosystem, emphasizing that performance transcends mere speed—it’s
about crafting an unparalleled user experience. Our voyage commences
with a formative exploration of performance engineering’s pivotal role
within the broader software development realm. By unraveling the
multifaceted layers of software engineering, we accentuate performance’s
stature as a paramount quality attribute.
With precision, we delineate the metrics pivotal to gauging Java’s
performance, encompassing aspects from footprint to the nuances of
availability, ensuring readers grasp the full spectrum of performance
dynamics. Stepping in further, we explore the intricacies of response time
and its symbiotic relationship with availability. This inspection provides
insights into the mechanics of application timelines, intricately weaving the
narrative of response time, throughput, and the inevitable pauses that
punctuate them.
Yet the performance narrative is complete only when we acknowledge the
profound influence of hardware. This chapter decodes the symbiotic
relationship between hardware and software, emphasizing the harmonious
symphony that arises from the confluence of languages, processors, and
memory models. From the subtleties of memory models and their bearing
on thread dynamics to the Java Memory Model’s foundational principles,
we journey through the maze of concurrent hardware, shedding light on the
order mechanisms pivotal to concurrent computing. Transitioning to the
realm of methodology, we introduce readers to the dynamic world of
performance engineering methodology. This section offers a panoramic
view, from the intricacies of experimental design to formulating a
comprehensive statement of work, championing a top-down approach that
guarantees a holistic perspective on the performance engineering process.
Benchmarking, the cornerstone of performance engineering, receives its
due spotlight. We underscore its indispensable role, guiding the reader
through the labyrinth of the benchmarking regime. This encompasses
everything from its inception in planning to the culmination in analysis. The
chapter provides a view into the art and science of JVM memory
management benchmarking, serving as a compass for those passionate
about performance optimization.
Finally, the Java Micro-Benchmark Suite (JMH) emerges as the pièce de
résistance. From its foundational setup to the intricacies of its myriad
features, the journey encompasses the genesis of writing benchmarks, to
their execution, enriched with insights into benchmarking modes, profilers,
and JMH’s pivotal annotations. This chapter should inspire a fervor for
relentless optimization and arms readers with the arsenal required to unlock
Java’s unparalleled performance potential.
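As a taste of what the JMH section builds toward, here is a minimal benchmark sketch. It assumes the `org.openjdk.jmh` dependency is on the classpath, and the class and method names are illustrative, not taken from the book:

```java
// Sketch of a minimal JMH micro-benchmark (requires the JMH dependency).
import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)        // report mean time per operation
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)                    // each benchmark thread gets its own state
@Warmup(iterations = 3)                 // let the JIT reach steady state first
@Measurement(iterations = 5)
@Fork(1)
public class ConcatBenchmark {
    String left = "jvm";
    String right = "performance";

    @Benchmark
    public String concat() {
        return left + right;            // indy-fied concatenation on modern JDKs
    }
}
```

JMH's warm-up iterations exist precisely because of the start-up and ramp-up dynamics Chapter 8 examines: measuring before the JIT has compiled the hot path yields misleading numbers.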
Memory management is the silent guardian of Java applications, often
operating behind the scenes but crucial to their success. Chapter 6 offers a
leap into the world of garbage collection, unraveling the techniques and
innovations that ensure Java applications run efficiently and effectively. Our
journey begins with an overview of the garbage collection in Java, setting
the stage for the intricate details that follow. We then venture into Thread-
Local Allocation Buffers (TLABs) and Promotion Local Allocation Buffers
(PLABs), elucidating their pivotal roles in memory management. As we
progress, the chapter sheds light on optimizing memory access,
emphasizing the significance of the NUMA-Aware garbage collection and
its impact on performance.
The highlight of this chapter lies in its exploration of advanced garbage
collection techniques. We review the G1 Garbage Collector (G1 GC),
unraveling its revolutionary approach to heap management. From grasping
the advantages of a regionalized heap to optimizing G1 GC parameters for
peak performance, this section promises a holistic cognizance of one of
Java’s most advanced garbage collectors. But the exploration doesn’t end
there. The Z Garbage Collector (ZGC) stands as a pinnacle of technological
advancement, offering unparalleled scalability and low latency for
managing multi-terabyte heaps. We look into the origins of ZGC, its
adaptive optimization techniques, and the advancements that make it a
game-changer in real-time applications.
This chapter also offers insights into the emerging trends in garbage
collection, setting the stage for what lies ahead. Practicality remains at the
forefront, with a dedicated section offering invaluable tips for evaluating
GC performance. From sympathizing with various workloads, such as
Online Analytical Processing (OLAP) to Online Transaction Processing
(OLTP) and Hybrid Transactional/Analytical Processing (HTAP), to
synthesizing live data set pressure and data lifespan patterns, the chapter
equips readers with the tools and knowledge to optimize memory
management effectively. This chapter is an accessible guide to the advanced
garbage collection techniques that Java professionals need to navigate
the topography of memory management.
The ability to efficiently manage concurrent tasks and optimize string
operations stands as a testament to the language’s evolution and
adaptability. Chapter 7 covers the intricacies of Java’s concurrency
mechanisms and string optimizations, offering readers a comprehensive
exploration of advanced techniques and best practices. We commence our
journey with an extensive review of the string optimizations. From
mastering the nuances of literal and interned string optimization in the
HotSpot VM to the innovative string deduplication optimization introduced
in Java 8, the chapter sheds light on techniques to reduce string footprint.
We take a further look into the “Indy-fication” of string concatenation and
the introduction of compact strings, ensuring a holistic conceptualization of
string operations in Java.
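The interning behavior discussed above, and the flag that enables string deduplication, can be sketched briefly; the class and variable names are illustrative, not the book's:

```java
// Illustrative sketch: string literals are interned, explicit copies are not,
// and intern() returns the canonical instance.
public class StringFootprintDemo {
    public static void main(String[] args) {
        String a = "hello";                  // literal: interned in the string table
        String b = new String("hello");      // distinct heap object, same contents
        String c = b.intern();               // canonical interned copy

        System.out.println(a == b);          // false: different objects
        System.out.println(a == c);          // true: same interned instance
        System.out.println(a.equals(b));     // true: same contents

        // String deduplication (Java 8u20+, G1 only) shares the backing arrays
        // of equal strings; enable it with:
        //   java -XX:+UseG1GC -XX:+UseStringDeduplication StringFootprintDemo
    }
}
```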
Next, the chapter focuses on enhanced multithreading performance,
highlighting Java’s thread synchronization mechanisms. We study the role
of monitor locks, the various lock types in OpenJDK’s HotSpot VM, and
the dynamics of lock contention. The evolution of Java’s locking
mechanisms is meticulously detailed, offering insights into the
improvements in contended locks and monitor operations. To tap into our
learnings from Chapter 5, with the help of practical testing and performance
analysis, we visualize contended lock optimization, harnessing the power of
JMH and Async-Profiler.
As we navigate the world of concurrency, the transition from the thread-per-
task model to the scalable thread-per-request model is highlighted. The
examination of Java’s Executor Service, ThreadPools, ForkJoinPool
framework, and CompletableFuture ensures a robust comprehension of
Java’s concurrency mechanisms.
Our journey in this chapter concludes with a glimpse into the future of
concurrency in Java as we reimagine concurrency with virtual threads.
From understanding virtual threads and their carriers to discussing
parallelism and integration with existing APIs, the chapter is a practical
guide to advanced concurrency mechanisms and string optimizations in
Java.
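The executor and `CompletableFuture` APIs the chapter surveys compose as in this small sketch; the class and method names are illustrative, not the book's:

```java
// Illustrative sketch: composing asynchronous work with an ExecutorService
// and CompletableFuture.
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrencyDemo {

    // Produce a value on the pool, then transform it asynchronously.
    static int computeAnswer(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> 6, pool)
                                .thenApply(n -> n * 7)
                                .join();
    }

    public static void main(String[] args) {
        // A fixed pool models the classic bounded thread-pool style; on Java 21+
        // the same code can run on virtual threads via
        // Executors.newVirtualThreadPerTaskExecutor().
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(computeAnswer(pool));   // 42
        } finally {
            pool.shutdown();
        }
    }
}
```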
In Chapter 8, the journey from start-up to steady-state performance is
explored in depth. This chapter ventures deep into the modulation of JVM
start-up and warm-up, covering techniques and best practices that ensure
peak performance. We begin by distinguishing between the often-confused
concepts of warm-up and ramp-up, setting the stage for fully understanding
JVM’s start-up dynamics. The chapter emphasizes the importance of JVM
start-up and warm-up performance, dissecting the phases of JVM startup
and the journey to an application’s steady state. As we navigate the
application’s lifecycle, the significance of managing the state during start-
up and ramp-up becomes evident, highlighting the benefits of efficient state
management.
The study of Class Data Sharing offers insights into the anatomy of shared
archive files, memory mapping, and the benefits of multi-instance setups.
Moving on to Ahead-Of-Time (AOT) compilation, the contrast between
AOT and JIT compilation is meticulously highlighted, with GraalVM
heralding a paradigm shift in Java’s performance landscape and with
HotSpot VM’s up-and-coming Project Leyden and its forecasted ability to
manage states via CDS and AOT. The chapter also addresses the unique
challenges and opportunities of serverless computing and containerized
environments. The emphasis on ensuring swift startups and efficient scaling
in these environments underscores the evolving nature of Java performance
optimization.
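The CDS workflow described above boils down to two launches: a training run that dumps an archive, and subsequent runs that memory-map it. The archive and application names below are illustrative:

```
java -XX:ArchiveClassesAtExit=app.jsa -cp app.jar MyApp   # training run: dump a dynamic archive at exit (JDK 13+)
java -XX:SharedArchiveFile=app.jsa   -cp app.jar MyApp    # later runs map the archive for faster start-up
```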
Our journey then transitions to boosting warm-up performance with
OpenJDK HotSpot VM. The chapter offers a holistic view of warm-up
optimizations, from compiler enhancements to segmented code cache and
Project Leyden enhancements in the near future. The evolution from
PermGen to Metaspace is also highlighted to showcase start-up, warm-up,
and steady-state implications.
The chapter culminates with a survey of emerging technologies such as
CRIU and OpenJDK’s Project CRaC, which are revolutionizing Java’s time to
steady state by introducing groundbreaking checkpoint/restore functionality.
Our final chapter (Chapter 9) focuses on the intersection of exotic
hardware and the Java Virtual Machine (JVM). This chapter offers readers a
considered exploration of the world of exotic hardware, its integration with
the JVM, and its galvanizing impact on performance engineering. We start
with an introduction to exotic hardware and its growing prominence in
cloud environments.
The pivotal role of language design and toolchains quickly becomes
evident, setting the stage for case studies showcasing the real-world
applications and challenges of integrating exotic hardware with the JVM.
From the light-weight Java gaming library (LWJGL), a baseline example
that offers insights into the intricacies of working with the JVM, to Aparapi,
which bridges the gap between Java and OpenCL, each case study is
carefully detailed, demonstrating the challenges, limitations, and successes
of each integration. The chapter then shifts to Project Sumatra, a significant
effort in JVM performance optimization, followed by TornadoVM, a
specialized JVM tailored for hardware accelerators.
Through these case studies, the symbiotic potential of integrating exotic
hardware with the JVM becomes increasingly evident, leading up to an
overview of Project Panama, a new horizon in JVM performance
engineering. At the heart of Project Panama lies the Vector API, a symbol of
innovation designed for vector computations. But it’s not just about
computations—it’s about ensuring they are efficiently vectorized and
tailored for hardware that thrives on vector operations. This API is an
example of Java’s commitment to evolving, ensuring that developers have
the tools to express parallel computations optimized for diverse hardware
architectures. But Panama isn’t just about vectors. The Foreign Function
and Memory API emerges as a pivotal tool, a bridge that allows Java to
converse seamlessly with native libraries. This is Java’s answer to the age-
old challenge of interoperability, ensuring Java applications can interface
effortlessly with native code, breaking language barriers.
Yet, every innovation comes with its set of challenges. Integrating exotic
hardware with the JVM is no walk in the park. From managing intricate
memory access patterns to deciphering hardware-specific behaviors, the
path to optimization is laden with complexities. But these challenges drive
innovation, pushing the boundaries of what’s possible. Looking to the
future, we envision Project Panama as the gold standard for JVM
interoperability. The horizon looks promising, with Panama poised to
redefine performance and efficiency for Java applications.
This isn’t just about the present or the imminent future. The world of JVM
performance engineering is on the cusp of a revolution. Innovations are
knocking at our door, waiting to be embraced: TornadoVM’s Hybrid APIs
today, with the HAT toolkit and Project Babylon on the horizon.

How to Use This Book


1. Sequential Reading for Comprehensive Understanding: This book is
designed to be read from beginning to end, as each chapter builds
upon the knowledge of the previous ones. This approach is especially
recommended for readers new to JVM performance engineering.
2. Modular Approach for Specific Topics: Experienced readers may
prefer to jump directly to chapters that address their specific interests
or challenges. The table of contents and index can guide you to
relevant sections.
3. Practical Examples and Code: Throughout the book, practical
examples and code snippets are provided to illustrate key concepts. To
get the most out of these examples, readers are encouraged to type out
and run the code themselves.
4. Visual Aids for Enhanced Understanding: In addition to written
explanations, this book employs a variety of visual aids to deepen
your understanding.
a. Case Studies: Real-world scenarios that demonstrate the application
of JVM performance techniques.
b. Screenshots: Visual outputs depicting profiling results as well as
various GC plots, which are essential for understanding the GC
process and phases.
c. Use-Case Diagrams: Visual representations that map out the
system’s functional requirements, showing how different entities
interact with each other.
d. Block Diagrams: Illustrations that outline the architecture of a
particular JVM or system component, highlighting performance
features.
e. Class Diagrams: Detailed object-oriented designs of various code
examples, showing relationships and hierarchies.
f. Process Flowcharts: Step-by-step diagrams that walk you through
various performance optimization processes and components.
g. Timelines: Visual representations of the different phases or state
changes in an activity and the sequence of actions that are taken.
5. Utilizing the Companion GitHub Repository: A significant portion of
the book’s value lies in its practical application. To facilitate this, I
have created the JVM Performance Engineering GitHub repository
(https://github.com/mo-beck/JVM-Performance-Engineering). Here,
you will find
a. Complete Code Listings: All the code snippets and scripts
mentioned in the book are available in full. This allows you to see
the code in its entirety and experiment with it.
b. Additional Resources and Updates: The field of JVM Performance
Engineering is ever evolving. The repository will be periodically
updated with new scripts, resources, and information to keep you
abreast of the latest developments.
c. Interactive Learning: Engage with the material by cloning the
repository, running the GC scripts against your GC log files, and
modifying them to see how outcomes better suit your GC learning
and understanding journey.
6. Engage with the Community: I encourage readers to engage with the
wider community. Use the GitHub repository to contribute your ideas,
ask questions, and share your insights. This collaborative approach
enriches the learning experience for everyone involved.
7. Feedback and Suggestions: Your feedback is invaluable. If you have
suggestions, corrections, or insights, I warmly invite you to share
them. You can provide feedback via the GitHub repository, via email
([email protected]), or via social media platforms
(https://www.linkedin.com/in/monicabeckwith/ or
https://twitter.com/JVMPerfEngineer).
__________________________________

In Java’s vast realm, my tale takes wing,


A narrative so vivid, of wonders I sing.
Distributed systems, both near and afar,
With JVM shining - the brightest star!

Its rise through the ages, a saga profound,


With each chronicle, inquiries resound.
“Where lies the wisdom, the legends so grand?”
They ask with a fervor, eager to understand.

This book is a beacon for all who pursue,


A tapestry of insights, both aged and new.
In chapters that flow, like streams to the seas,
I share my heart’s journey, my tech odyssey.

—Monica Beckwith
Acknowledgments

This content is currently in development.


About the Author

This content is currently in development.


Chapter 1. The Performance
Evolution of Java: The Language
and the Virtual Machine

More than three decades ago, the programming languages landscape was
largely defined by C and its object-oriented extension, C++. In this period,
the world of computing was undergoing a significant shift from large,
cumbersome mainframes to smaller, more efficient minicomputers. C, with
its suitability for Unix systems, and C++, with its innovative introduction of
classes for object-oriented design, were at the forefront of this technological
evolution.
However, as the industry started to shift toward more specialized and cost-
effective systems, such as microcontrollers and microcomputers, a new set
of challenges emerged. Applications were ballooning in terms of lines of
code, and the need to “port” software to various platforms became an
increasingly pressing concern. This often necessitated rewriting or heavily
modifying the application for each specific target, a labor-intensive and
error-prone process. Developers also faced the complexities of managing
numerous static library dependencies and the demand for lightweight
software on embedded systems—areas where C++ fell short.
It was against this backdrop that Java emerged in the mid-1990s. Its
creators aimed to fill this niche by offering a “write once, run anywhere”
solution. But Java was more than just a programming language. It
introduced its own runtime environment, complete with a virtual machine
(Java Virtual Machine [JVM]), class libraries, and a comprehensive set of
tools. This all-encompassing ecosystem, known as the Java Development
Kit (JDK), was designed to tackle the challenges of the era and set the stage
for the future of programming. Today, more than a quarter of a century later,
Java’s influence in the world of programming languages remains strong, a
testament to its adaptability and the robustness of its design.
The performance of applications emerged as a critical factor during this
time, especially with the rise of large-scale, data-intensive applications. The
evolution of Java’s type system has played a pivotal role in addressing these
performance challenges. Thanks to the introduction of generics, autoboxing
and unboxing, and enhancements to the concurrency utilities, Java
applications have seen significant improvements in both performance and
scalability. Moreover, the changes in the type system have had far-reaching
implications for the performance of the JVM itself. In particular, the JVM
has had to adapt and optimize its execution strategies to efficiently handle
these new language features. As you read this book, bear in mind the
historical context and the driving forces that led to Java’s inception. The
evolution of Java and its virtual machine has profoundly influenced the
way developers write and optimize software for various platforms.
In this chapter, we will thoroughly examine the history of Java and the JVM,
highlighting the technological advancements and key milestones that have
significantly shaped its development. From its early days as a solution for
platform independence, through the introduction of new language features,
to the ongoing improvements to the JVM, Java has evolved into a powerful
and versatile tool in the arsenal of modern software development.

A New Ecosystem Is Born


In the 1990s, the internet was emerging, and web pages became more
interactive with the introduction of Java applets. Java applets were small
applications that ran within web browsers, providing a “real-time”
experience for end users.
Applets were not only platform independent but also “secure,” in the sense
that the user needed to trust the applet writer. When discussing security in
the context of the JVM, it’s essential to understand that direct access to
memory should be forbidden. As a result, Java introduced its own memory
management system, called the garbage collector (GC).

Note
In this book, the acronym GC is used to refer to both garbage collection,
the process of automatic memory management, and garbage collector, the
module within the JVM that performs this process. The specific meaning
will be clear based on the context in which GC is used.

Additionally, an abstraction layer, known as Java bytecode, was added to
any executable. Java applets quickly gained popularity because their
bytecode, residing on the web server, would be transferred and executed as
its own process during web page rendering. Although the Java bytecode is
platform independent, it is interpreted and compiled into native code
specific to the underlying platform.

A Few Pages from History


The JDK included tools such as a Java compiler that translated Java code
into Java bytecode. Java bytecode is the executable handled by the Java
Runtime Environment (JRE). Thus, for different environments, only the
runtime needed to be updated. As long as a JVM for a specific environment
existed, the bytecode could be executed. The JVM and the GC served as the
execution engines. For Java versions 1.0 and 1.1, the bytecode was
interpreted to the native machine code, and there was no dynamic
compilation.
Soon after the release of Java versions 1.0 and 1.1, it became apparent that
Java needed to be more performant. Consequently, a just-in-time (JIT)
compiler was introduced in Java 1.2. When combined with the JVM, it
provided dynamic compilation based on hot methods and loop-back branch
counts. This new VM was called the Java HotSpot VM.
Understanding Java HotSpot VM and Its
Compilation Strategies
The Java HotSpot VM plays a critical role in executing Java programs
efficiently. It includes JIT compilation, tiered compilation, and adaptive
optimization to improve the performance of Java applications.

The Evolution of the HotSpot Execution Engine


The HotSpot VM performs mixed-mode execution, which means that the
VM starts in interpreted mode, with the bytecode being converted into
native code based on a description table. The table has a template of native
code corresponding to each bytecode instruction known as the
TemplateTable; it is just a simple lookup table. The execution code is stored
in a code cache (known as CodeCache). CodeCache stores native code and
is also a useful cache for storing JIT-ted code.

Note
HotSpot VM also provides an interpreter that doesn’t need a template,
called the C++ interpreter. Some OpenJDK ports1 choose this route to
simplify porting of the VM to non-x86 platforms.
1
https://wiki.openjdk.org/pages/viewpage.action?pageId=13729802

Performance-Critical Methods and Their Optimization


Performance engineering is a critical aspect of software development, and a
key part of this process involves identifying and optimizing performance-
critical methods. These methods are frequently executed or contain
performance-sensitive code, and they stand to gain the most from JIT
compilation. Optimizing performance-critical methods is not just about
choosing appropriate data structures and algorithms; it also involves
identifying and optimizing the methods based on their frequency of
invocation, size and complexity, and available system resources.
Consider the following BookProgress class as an example:
import java.util.*;

public class BookProgress {
    private String title;
    private Map<String, Integer> chapterPages;
    private Map<String, Integer> chapterPagesWritten;

    public BookProgress(String title) {
        this.title = title;
        this.chapterPages = new HashMap<>();
        this.chapterPagesWritten = new HashMap<>();
    }

    public void addChapter(String chapter, int totalPages) {
        this.chapterPages.put(chapter, totalPages);
        this.chapterPagesWritten.put(chapter, 0);
    }

    public void updateProgress(String chapter, int pagesWritten) {
        this.chapterPagesWritten.put(chapter, pagesWritten);
    }

    public int getPagesWritten(String chapter) {
        return chapterPagesWritten.get(chapter);
    }

    public double getProgress(String chapter) {
        return ((double) chapterPagesWritten.get(chapter) / chapterPages.get(chapter)) * 100;
    }

    public double getTotalProgress() {
        int totalWritten = chapterPagesWritten.values().stream().mapToInt(Integer::intValue).sum();
        int total = chapterPages.values().stream().mapToInt(Integer::intValue).sum();
        return ((double) totalWritten / total) * 100;
    }
}

class Main {
    public static void main(String[] args) {
        BookProgress book = new BookProgress("JVM Performance Engineering");
        String[] chapters = {"Performance Evolution", "Performance Implications of Type System",
                "Monolithic to Modular Programming", "Unified Logging System",
                "End-to-End Performance Optimization", "Advanced Memory Management",
                "Runtime Performance Optimization", "Accelerating Startup",
                "Harnessing Exotic Hardware"};
        for (String chapter : chapters) {
            book.addChapter(chapter, 100);
        }
        for (int i = 0; i < 500000; i++) {
            for (String chapter : chapters) {
                int currentPagesWritten = book.getPagesWritten(chapter);
                if (currentPagesWritten < 100) {
                    book.updateProgress(chapter, currentPagesWritten + 2);
                    double progress = book.getProgress(chapter);
                    System.out.println("Progress for chapter " + chapter + ": " + progress + "%");
                }
            }
        }
        System.out.println("Total book progress: " + book.getTotalProgress() + "%");
    }
}

In this code, we’ve defined a BookProgress class to track the progress of
writing a book, which is divided into chapters. Each chapter has a total
number of pages and a current count of pages written. The class provides
methods to add chapters, update progress, and calculate the progress of each
chapter and the overall book.
The Main class creates a BookProgress object for a book titled “JVM
Performance Engineering.” It adds nine chapters, each with 100 pages, and
simulates writing the book by updating the progress of each chapter in a
round-robin fashion, writing two pages at a time. After each update, it
calculates and prints the progress of the current chapter and, once all pages
are written, the overall progress of the book.
The getProgress(String chapter) method is a performance-critical
method, which is called up to 500,000 times for each chapter. This frequent
invocation makes it a prime candidate for optimization by the HotSpot VM,
illustrating how certain methods in a program may require more attention
for performance optimization due to their high frequency of use.
Interpreter and JIT Compilation
The HotSpot VM provides an interpreter that converts bytecode into native
code based on the TemplateTable. Interpretation is the first step in adaptive
optimization offered by this VM and is considered the slowest form of
bytecode execution. To make the execution faster, the HotSpot VM utilizes
adaptive JIT compilation. The JIT-optimized code replaces the template
code for methods that are identified as performance critical.
As mentioned in the previous section, the HotSpot VM monitors executed
code for performance-critical methods based on two key metrics—method
entry counts and loop-back branch counts. The VM assigns call counters to
individual methods in the Java application. When the entry count exceeds a
preestablished value, the method or its callee is chosen for asynchronous
JIT compilation. Similarly, there is a counter for each loop in the code.
Once the HotSpot VM determines that the loop-back branches (also known
as loop-back edges) have crossed their threshold, the JIT optimizes that
particular loop. This optimization is called on-stack replacement (OSR).
With OSR, only the loop for which the loop-back branch counter
overflowed will be compiled and replaced asynchronously on the execution
stack.

Print Compilation
A very handy command-line option that can help us better understand
adaptive optimization in the HotSpot VM is –XX:+PrintCompilation. This
option also returns information on different optimized compilation levels,
which are provided by an adaptive optimization called tiered compilation
(discussed in the next subsection).
The output of the –XX:+PrintCompilation option is a log of the HotSpot
VM’s compilation tasks. Each line of the log represents a single
compilation task and includes several pieces of information:
The timestamp in milliseconds since the JVM started and this
compilation task was logged.
The unique identifier for this compilation task.
Flags indicating certain properties of the method being compiled, such
as whether it’s an OSR method (%), whether it’s synchronized (s),
whether it has an exception handler (!), whether it’s blocking (b), or
whether it’s native (n).
The tiered compilation level, indicating the level of optimization
applied to this method.
The fully qualified name of the method being compiled.
For OSR methods, the bytecode index where the compilation started.
This is usually the start of a loop.
The size of the method in the bytecode, in bytes.
Here are a few examples of the output of the –XX:+PrintCompilation
option:
567 693 % ! 3 org.h2.command.dml.Insert::insertRows @ 76 (513 bytes)
656 797 n 0 java.lang.Object::clone (native)
779 835 s 4 java.lang.StringBuffer::append (13 bytes)

These logs provide valuable insights into the behavior of the HotSpot VM’s
adaptive optimization, helping us understand how our Java applications are
optimized at runtime.
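To generate a log like this yourself, you can run a small program with an obviously hot method under -XX:+PrintCompilation. The sketch below is my own illustration (the class and method names are not from any library or from the HotSpot sources): the long loop in main() accumulates loop-back branches, making main() an OSR candidate (flagged with %), while compute() should cross the method-entry threshold and be compiled in its own right.

```java
// HotLoop.java -- run with: java -XX:+PrintCompilation HotLoop
public class HotLoop {

    // Called often enough to cross the method-entry threshold,
    // so it should show up in the compilation log.
    static long compute(long x) {
        return x * 31 + 17;
    }

    public static void main(String[] args) {
        long acc = 0;
        // Loop-back branches accumulate here, making main() a
        // candidate for on-stack replacement ('%' in the log).
        for (int i = 0; i < 1_000_000; i++) {
            acc = compute(acc + i);
        }
        System.out.println("acc = " + acc);
    }
}
```

The exact timestamps, compilation task identifiers, and tier levels you see will vary from run to run and across JVM versions, but the overall shape of the log matches the fields described above.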

Tiered Compilation
Tiered compilation, which was introduced in Java 7, provides multiple
levels of optimized compilations, ranging from T0 to T4:
1. T0: Interpreted code, devoid of compilation. This is where the code
starts and then moves on to the T1, T2, or T3 level.
2. T1–T3: Client-compiled mode. T1 is the first step where the method
invocation counters and loop-back branch counters are used. At T2,
the client compiler includes profiling information, referred to as
profile-guided optimization; it may be familiar to readers who are
conversant in static compiler optimizations. At the T3 compilation
level, completely profiled code can be generated.
3. T4: The highest level of optimization provided by the HotSpot VM’s
“server.”
Prior to tiered compilation, the server compiler would employ the
interpreter to collect such profiling information. With the introduction of
tiered compilation, the code reaches client compilation levels faster, and
now the profiling information is generated by client-compiled methods
themselves, providing better start-up times.

Note
Tiered compilation has been enabled by default since Java 8.

Client and Server Compilers


The HotSpot VM provides two flavors of compilers: the fast “client”
compiler (also known as the C1 compiler) and the “server” compiler (also
known as the C2 compiler).
1. Client compiler (C1): Aims for fast start-up times in a client setup.
The JIT invocation thresholds are lower for a client compiler than for
a server compiler. This compiler is designed to compile code quickly,
providing a fast start-up time, but the code it generates is less
optimized.
2. Server compiler (C2): Offers many more adaptive optimizations and
better thresholds geared toward higher performance. The counters that
determine when a method/loop needs to be compiled are still the
same, but the invocation thresholds are different (much lower) for a
client compiler than for a server compiler. The server compiler takes
longer to compile methods but produces highly optimized code that is
beneficial for long-running applications. Some of the optimizations
performed by the C2 compiler include inlining (replacing method
invocations with the method’s body), loop unrolling (increasing the
loop body size to decrease the overhead of loop checks and to
potentially apply other optimizations such as loop vectorization), dead
code elimination (removing code that does not affect the program
results), and range-check elimination (removing checks for index out-
of-bounds errors if it can be assured that the array index never crosses
its bounds). These optimizations help to improve the execution speed
of the code and reduce the overhead of certain operations.2
2
“What the JIT!? Anatomy of the OpenJDK HotSpot VM.” infoq.com.

Segmented Code Cache


As we delve deeper into the intricacies of the HotSpot VM, it’s important to
revisit the concept of the code cache. Recall that the code cache is a storage
area for native code generated by the JIT compiler or the interpreter. With
the introduction of tiered compilation, the code cache also becomes a
repository for profiling information gathered at different levels of tiered
compilation. Interestingly, even the TemplateTable, which the interpreter
uses to look up the native code sequence for each bytecode, is stored in the
code cache.
The size of the code cache is fixed at start-up, but can be modified on the
command line by passing the desired maximum value to -
XX:ReservedCodeCacheSize. Prior to Java 7, the default value for this size
was 48 MB. Once the code cache was filled up, all compilation would
cease. This posed a significant problem when tiered compilation was
enabled, as the code cache would contain not only JIT-compiled code
(represented as nmethod in the HotSpot VM) but also profiled code. The
nmethod refers to the internal representation of a Java method that has been
compiled into machine code by the JIT compiler. In contrast, the profiled
code is the code that has been analyzed and optimized based on its runtime
behavior. The code cache needs to manage both of these types of code,
leading to increased complexity and potential performance issues.
To address these problems, the default value for ReservedCodeCacheSize
was increased to 240 MB in JDK 7 update 40. Furthermore, when the code
cache occupancy crosses a preset CodeCacheMinimumFreeSpace threshold, the
JIT compilation halts and the JVM runs a sweeper. The nmethod sweeper
reclaims space by evacuating older compilations. However, sweeping the
entire code cache data structure can be time-consuming, especially when
the code cache is large and nearly full.
Java 9 introduced a significant change to the code cache: It was segmented
into different regions based on the type of code. This not only reduced the
sweeping time, but also minimized fragmentation of the long-lived code by
shorter-lived code. Co-locating code of the same type also reduced
hardware-level instruction cache misses.
The current implementation of the segmented code cache includes the
following regions:
Non-method code heap region: This region is reserved for VM
internal data structures that are not related to Java methods. For
example, the TemplateTable, which is a VM internal data structure,
resides here. This region doesn’t contain compiled Java methods.
Non-profiled nmethod code heap: This region contains Java methods
that have been compiled by the JIT compiler without profiling
information. These methods are fully optimized and are expected to be
long-lived, meaning they won’t be recompiled frequently and may
need to be reclaimed only infrequently by the sweeper.
Profiled nmethod code heap: This region contains Java methods that
have been compiled with profiling information. These methods are not
as optimized as those in the non-profiled region. They are considered
transient because they can be recompiled into more optimized versions
and moved to the non-profiled region as more profiling information
becomes available. They can also be reclaimed by the sweeper as often
as needed.
Each of these regions has a fixed size that can be set by its respective
command-line option: -XX:NonNMethodCodeHeapSize for the non-method
code heap region, -XX:ProfiledCodeHeapSize for the profiled nmethod code
heap, and -XX:NonProfiledCodeHeapSize for the non-profiled nmethod code
heap.
Going forward, the hope is that the segmented code caches can
accommodate additional code regions for heterogeneous code such as
ahead-of-time (AOT)–compiled code and code for hardware accelerators.3
There’s also the expectation that the fixed sizing thresholds can be upgraded
to utilize adaptive resizing, thereby avoiding wastage of memory.
3
JEP 197: Segmented Code Cache. https://openjdk.org/jeps/197.
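You can also observe the segmented code cache from inside a running application by querying the JVM's memory pool MXBeans. The following sketch is my own; on a JDK 9+ HotSpot VM with tiered compilation enabled, the reported pool names typically include entries such as CodeHeap 'non-nmethods', CodeHeap 'profiled nmethods', and CodeHeap 'non-profiled nmethods', though the exact names and sizes are implementation-specific.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCacheRegions {
    public static void main(String[] args) {
        // Each MemoryPoolMXBean covers one memory pool; on a VM with a
        // segmented code cache, the code heap regions appear alongside
        // the Java heap pools. getMax() may report -1 if undefined.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-35s used=%d bytes, max=%d bytes%n",
                    pool.getName(),
                    pool.getUsage().getUsed(),
                    pool.getUsage().getMax());
        }
    }
}
```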

Adaptive Optimization and Deoptimization


Adaptive optimization allows the HotSpot VM runtime to optimize the
interpreted code into compiled code or insert an optimized loop on the stack
(so we could have something like an “interpreted to compiled, and back to
interpreted” code execution sequence). There is another major advantage of
adaptive optimization, however—in deoptimization of code. That means the
compiled code could go back to being interpreted, or a higher-optimized
code sequence could be rolled back into a less-optimized sequence.
Dynamic deoptimization helps Java reclaim code that may no longer be
relevant. A few example use cases are when checking interdependencies
during dynamic class loading, when dealing with polymorphic call sites,
and when reclaiming less-optimized code. Deoptimization will first make
the code “not entrant” and eventually reclaim it after marking it as
“zombie” code.
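Deoptimization can be observed with the same -XX:+PrintCompilation option: invalidated compilations are tagged "made not entrant" (and eventually "zombie") in the log. The following sketch is my own illustration, not taken from the HotSpot sources; it warms up a call site with a single receiver type and then introduces a second one, which commonly invalidates speculatively optimized code.

```java
// DeoptDemo.java -- run with: java -XX:+PrintCompilation DeoptDemo
// After the warm-up loop, total() may be compiled under the speculative
// assumption that only Circle is ever seen; introducing Square afterwards
// typically triggers deoptimization ("made not entrant" in the log).
public class DeoptDemo {

    static abstract class Shape { abstract double area(); }

    static class Circle extends Shape {
        double area() { return Math.PI; } // unit circle
    }

    static class Square extends Shape {
        double area() { return 1.0; }     // unit square
    }

    static double total(Shape s) { return s.area(); }

    public static void main(String[] args) {
        double sum = 0;
        Shape circle = new Circle();
        // Warm-up: a monomorphic call site, eligible for compilation.
        for (int i = 0; i < 200_000; i++) {
            sum += total(circle);
        }
        // A new receiver type appears; speculative code may be
        // deoptimized and later recompiled to handle both types.
        Shape square = new Square();
        for (int i = 0; i < 200_000; i++) {
            sum += total(square);
        }
        System.out.println("sum = " + sum);
    }
}
```

Whether and when the deoptimization happens depends on the JVM version and the compilation thresholds in effect, so treat the log output as indicative rather than guaranteed.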

Deoptimization Scenarios
Deoptimization can occur in several scenarios when working with Java
applications. In this section, we’ll explore two of these scenarios.

Class Loading and Unloading


Consider an application containing two classes, Car and DriverLicense. The
Car class requires a DriverLicense to enable drive mode. The JIT compiler
optimizes the interaction between these two classes. However, if a new
version of the DriverLicense class is loaded due to changes in driving
regulations, the previously compiled code may no longer be valid. This
necessitates deoptimization to revert to the interpreted mode or a less-
optimized state. This allows the application to employ the new version of
the DriverLicense class.
Here’s an example code snippet:
class Car {
    private DriverLicense driverLicense;

    public Car(DriverLicense driverLicense) {
        this.driverLicense = driverLicense;
    }

    public void enableDriveMode() {
        if (driverLicense.isAdult()) {
            System.out.println("Drive mode enabled!");
        } else if (driverLicense.isTeenDriver()) {
            if (driverLicense.isLearner()) {
                System.out.println("You cannot drive without a licensed adult in the car!");
            } else {
                System.out.println("Drive mode enabled!");
            }
        } else {
            System.out.println("You don't have a valid driver's license!");
        }
    }
}

class DriverLicense {
    private boolean isTeenDriver;
    private boolean isAdult;
    private boolean isLearner;

    public DriverLicense(boolean isTeenDriver, boolean isAdult, boolean isLearner) {
        this.isTeenDriver = isTeenDriver;
        this.isAdult = isAdult;
        this.isLearner = isLearner;
    }

    public boolean isTeenDriver() {
        return isTeenDriver;
    }

    public boolean isAdult() {
        return isAdult;
    }

    public boolean isLearner() {
        return isLearner;
    }
}

class Main {
    public static void main(String[] args) {
        DriverLicense driverLicense = new DriverLicense(false, true, false);
        Car myCar = new Car(driverLicense);
        myCar.enableDriveMode();
    }
}

In this example, the Car class requires a DriverLicense to enable drive
mode. The driver’s license can be for an adult, a teen driver with a learner’s
permit, or a teen driver with a full license. The enableDriveMode() method
checks the driver’s license using the isAdult(), isTeenDriver(), and
isLearner() methods, and prints the appropriate message to the console.

If a new version of the DriverLicense class is loaded, the previously
optimized code may no longer be valid, triggering deoptimization. This
allows the application to use the new version of the DriverLicense class
without any issues.

Polymorphic Call Sites


Deoptimization can also occur when working with polymorphic call sites,
where the actual method to be invoked is determined at runtime. Let’s look
at an example using the DriverLicense class:
abstract class DriverLicense {
    public abstract void drive();
}

class AdultLicense extends DriverLicense {
    public void drive() {
        System.out.println("Thanks for driving responsibly as an adult driver!");
    }
}

class TeenPermit extends DriverLicense {
    public void drive() {
        System.out.println("Thanks for learning to drive responsibly!");
    }
}

class SeniorLicense extends DriverLicense {
    public void drive() {
        System.out.println("Thanks for being a valued senior citizen driver!");
    }
}

class Main {
    public static void main(String[] args) {
        DriverLicense license = new AdultLicense();
        license.drive(); // monomorphic call site

        // Changing the call site to bimorphic
        if (Math.random() < 0.5) {
            license = new AdultLicense();
        } else {
            license = new TeenPermit();
        }
        license.drive(); // bimorphic call site

        // Changing the call site to megamorphic
        for (int i = 0; i < 100; i++) {
            if (Math.random() < 0.33) {
                license = new AdultLicense();
            } else if (Math.random() < 0.66) {
                license = new TeenPermit();
            } else {
                license = new SeniorLicense();
            }
            license.drive(); // megamorphic call site
        }
    }
}

In this example, the abstract DriverLicense class has three subclasses:
AdultLicense, TeenPermit, and SeniorLicense. The drive() method is
overridden in each subclass with different implementations.
First, when we assign an AdultLicense object to a DriverLicense variable
and call drive(), the HotSpot VM optimizes the call site to a monomorphic
call site and caches the target method address in an inline cache (a structure
to track the call site’s type profile).
Next, we change the call site to a bimorphic call site by randomly assigning
an AdultLicense or TeenPermit object to the DriverLicense variable and
calling drive(). Because there are two possible types, the VM can no
longer use the monomorphic dispatch mechanism, so it switches to the
bimorphic dispatch mechanism. This change does not require
deoptimization—and still provides a performance boost by reducing the
number of virtual method dispatches needed at the call site.
Finally, we change the call site to a megamorphic call site by randomly
assigning an AdultLicense, TeenPermit, or SeniorLicense object to the
DriverLicense variable and calling drive() 100 times. As there are now
three possible types, the VM cannot use the bimorphic dispatch mechanism
and must switch to the megamorphic dispatch mechanism. This change also
does not require deoptimization.
However, if we were to introduce a new subclass InternationalLicense
and change the call site to include it, the VM would need to deoptimize the
call site and switch to a megamorphic or polymorphic call site to handle the
new type. This change is necessary because the VM’s type profiling
information for the call site would be outdated, and the previously
optimized code would no longer be valid.
Here’s the code snippet for the new subclass and the updated call site:
class InternationalLicense extends DriverLicense {
    public void drive() {
        System.out.println("Thanks for driving responsibly as an international driver!");
    }
}

// Updated call site


for (int i = 0; i < 100; i++) {
if (Math.random() < 0.25) {
license = new AdultLicense();
} else if (Math.random() < 0.5) {
license = new TeenPermit();
} else if (Math.random() < 0.75) {
license = new SeniorLicense();
} else {
license = new InternationalLicense();
}
license.drive(); // megamorphic call site with a new type
}

HotSpot Garbage Collector: Memory Management Unit
A crucial component of the HotSpot execution engine is its memory
management unit, commonly known as the garbage collector (GC). HotSpot
provides multiple garbage collection algorithms that cater to a trifecta of
performance aspects: application responsiveness, throughput, and overall
footprint. Responsiveness refers to the time taken to receive a response from
the system after sending a stimulus. Throughput measures the number of
operations that can be performed per second on a given system. Footprint
can be defined in two ways: as optimizing the amount of data or objects that
can fit into the available space and as removing redundant information to
save space.
Generational Garbage Collection, Stop-the-
World, and Concurrent Algorithms
OpenJDK offers a variety of generational GCs that utilize different
strategies to manage memory, with the common goal of improving
application performance. These collectors are designed based on the
principle that “most objects die young,” meaning that most newly allocated
objects on the Java heap are short-lived. By taking advantage of this
observation, generational GCs aim to optimize memory management and
significantly reduce the negative impact of garbage collection on the
performance of the application.
Heap collection in GC terms involves identifying live objects, reclaiming
space occupied by garbage objects, and, in some cases, compacting the
heap to reduce fragmentation. Fragmentation can occur in two ways: (1)
internal fragmentation, where allocated memory blocks are larger than
necessary, leaving wasted space within the blocks; and (2) external
fragmentation, where memory is allocated and deallocated in such a way
that free memory is divided into noncontiguous blocks. External
fragmentation can lead to inefficient memory use and potential allocation
failures. Compaction is a technique used by some GCs to combat external
fragmentation; it involves moving objects in memory to consolidate free
memory into a single contiguous block. However, compaction can be a
costly operation in terms of CPU usage and can cause lengthy pause times
if it’s done as a stop-the-world operation.
The OpenJDK GCs employ several different GC algorithms:
Stop-the-world (STW) algorithms: STW algorithms pause
application threads for the entire duration of the garbage collection
work. Serial, Parallel, (Mostly) Concurrent Mark and Sweep (CMS),
and Garbage First (G1) GCs use STW algorithms in specific phases of
their collection cycles. The STW approach can result in longer pause
times when the heap fills up and runs out of allocation space,
especially in nongenerational heaps, which treat the heap as a single
continuous space without segregating it into generations.
Concurrent algorithms: These algorithms aim to minimize pause
times by performing most of their work concurrently with the
application threads. CMS is an example of a collector using concurrent
algorithms. However, because CMS does not perform compaction,
fragmentation can become an issue over time. This can lead to longer
pause times or even cause a fallback to a full GC using the Serial Old
collector, which does include compaction.
Incremental compacting algorithms: The G1 GC introduced
incremental compaction to deal with the fragmentation issue found in
CMS. G1 divides the heap into smaller regions and performs garbage
collection on a subset of regions during a collection cycle. This
approach helps maintain more predictable pause times while also
handling compaction.
Thread-local handshakes: Newer GCs like Shenandoah and ZGC
leverage thread-local handshakes to minimize STW pauses. By
employing this mechanism, they can perform certain GC operations on
a per-thread basis, allowing application threads to continue running
while the GC works. This approach helps to reduce the overall impact
of garbage collection on application performance.
Ultra-low-pause-time collectors: The Shenandoah and ZGC aim to
have ultra-low pause times by performing concurrent marking,
relocation, and compaction. Both minimize the STW pauses to a small
fraction of the overall garbage collection work, offering consistent low
latency for applications. While these GCs are not generational in the
traditional sense, they do divide the heap into regions and collect
different regions at different times. This approach builds upon the
principles of incremental and “garbage first” collection. As of this
writing, efforts are ongoing to further develop these newer collectors
into generational ones, but they are included in this section due to their
innovative strategies that enhance the principles of generational
garbage collection.
Each collector has its advantages and trade-offs, allowing developers to
choose the one that best suits their application requirements.
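Which of these collectors a given JVM is actually running can be inspected at runtime through the standard java.lang.management API. This short utility (a sketch, not from the book) prints each registered collector with its cumulative statistics; the names reported depend on the collector selected at start-up (for example, "G1 Young Generation" and "G1 Old Generation" under the default G1 on recent JDKs).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Prints the garbage collectors active in the running JVM, along with
// their cumulative collection counts and accumulated collection times.
public class ShowCollectors {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-25s collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```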
Young Collections and Weak Generational
Hypothesis
In the realm of a generational heap, the majority of allocations take place in
the “eden” space of the “young” generation. An allocating thread may
encounter an allocation failure when this eden space is near its capacity,
indicating that the GC must step in and reclaim space.
During the first “young” collection, the eden space undergoes a scavenging
process in which live objects are identified and subsequently moved into the
“to” survivor space. The survivor space serves as a transitional area where
surviving objects are copied, aged, and moved back and forth between the
“from” and “to” spaces until they cross a tenuring threshold. Once an object
crosses this threshold, it is promoted to the “old” generation. The
underlying objective here is to promote only those objects that have proven
their longevity, thereby creating a “teenage wasteland,” as Charlie Hunt4
would explain.
4
Charlie Hunt is my mentor, the author of Java Performance
(https://ptgmedia.pearsoncmg.com/images/9780137142521/samplepages/0137142528.pdf), and my co-author for Java Performance Companion
(www.pearson.com/en-us/subject-catalog/p/java-performance-companion/P200000009127/9780133796827).
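The eden-fills-then-scavenge cycle is observable from within a program. The sketch below (illustrative only; the class and method names are mine) churns short-lived allocations, exactly the kind the weak generational hypothesis predicts will die young, and reports how many collections the JVM performed along the way. Exact counts depend on heap size and the collector in use, so no particular number is guaranteed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Churns short-lived objects ("most objects die young") and reports how
// many collections occurred while doing so. With a typical heap, eden
// fills repeatedly and young collections run.
public class YoungChurn {
    static long totalCollections() {
        long n = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) n += c;   // -1 means "undefined" for a collector
        }
        return n;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        for (int i = 0; i < 5_000_000; i++) {
            byte[] transientObj = new byte[128];  // becomes garbage immediately
            if (transientObj.length == 0) System.out.println("unreachable");
        }
        System.out.println("collections during churn: "
                + (totalCollections() - before));
    }
}
```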
Generational garbage collection is based on two main characteristics
related to the weak generational hypothesis:
1. Most objects die young: This means that we promote only long-lived
objects. If the generational GC is efficient, we don’t promote
transients, nor do we promote medium-lived objects. This usually
results in smaller long-lived data sets, keeping premature promotions,
fragmentation, evacuation failures, and similar degenerative issues at
bay.
2. Maintenance of generations: The generational algorithm has proven
to be a great help to OpenJDK GCs, but it comes with a cost. Because
the young-generation collector works separately and more often than
the old-generation collector, it ends up moving live data. Therefore,
generational GCs incur maintenance/bookkeeping overhead to ensure
that they mark all reachable objects—a feat achieved through the use
of “write barriers” that track cross-generational references.

Figure 1.1 Key Concepts for Generational Garbage Collectors


Figure 1.1 depicts the three key concepts of generational GCs, providing a
visual reinforcement of the information discussed here. The word cloud
consists of the following phrases:
Objects die young: Highlighting the idea that most objects are short-
lived and only long-lived objects are promoted.
Small long-lived data sets: Emphasizing the efficiency of the
generational GC in not promoting transients or medium-lived objects,
resulting in smaller long-lived data sets.
Maintenance barriers: Highlighting the overhead and bookkeeping
required by generational GCs to mark all reachable objects, achieved
through the use of write barriers.
Most HotSpot GCs employ the renowned “scavenge” algorithm for young
collections. The Serial GC in HotSpot VM employs a single garbage
collection thread dedicated to efficiently reclaiming memory within the
young-generation space. In contrast, generational collectors such as the
Parallel GC (throughput collector), G1 GC, and CMS GC leverage multiple
GC worker threads.
Old-Generation Collection and Reclamation
Triggers
Old-generation reclamation algorithms in HotSpot VM’s generational GCs
are optimized for throughput, responsiveness, or a combination of both. The
Serial GC employs a single-threaded mark–sweep–compacting (MSC) GC.
The Parallel GC uses a similar MSC GC with multiple threads. The CMS
GC performs mostly concurrent marking, dividing the process into STW or
concurrent phases. After marking, CMS reclaims old-generation space by
performing in-place deallocation without compaction. If fragmentation
occurs, CMS falls back to the serial MSC.
G1 GC, introduced in Java 7 update 4 and refined over time, is the first
incremental collector. Specifically, it incrementally reclaims and compacts
the old-generation space, as opposed to performing the single monolithic
reclamation and compaction that is part of MSC. G1 GC divides the heap
into smaller regions and performs garbage collection on a subset of regions
during a collection cycle, which helps maintain more predictable pause
times while also handling compaction.
After multiple young-generation collections, the old generation starts filling
up, and garbage collection kicks in to reclaim space in the old generation.
To do so, a full heap marking cycle must be triggered by either (1) a
promotion failure, (2) a promotion of a regular-sized object that makes the
old generation or the total heap cross the marking threshold, or (3) a large
object allocation (also known as humongous allocation in the G1 GC) that
causes the heap occupancy to cross a predetermined threshold.
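For trigger (3), a hedged illustration: under G1, an allocation larger than half a heap region (regions are typically 1-32 MB, chosen ergonomically or set with -XX:G1HeapRegionSize) is classified as humongous and placed directly into contiguous humongous regions rather than eden. The 8 MB array below would be humongous for any region size of 16 MB or less; running with GC logging enabled (for example, -Xlog:gc*) makes such allocations visible.

```java
// With G1, an object larger than half a region is "humongous" and is
// allocated outside the young generation. 8 MB exceeds half of any
// region size up to 16 MB, so this array is very likely humongous
// under default ergonomics.
public class HumongousAlloc {
    public static void main(String[] args) {
        byte[] humongous = new byte[8 * 1024 * 1024];  // 8 MB
        System.out.println("allocated " + humongous.length + " bytes");
    }
}
```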
Shenandoah GC and ZGC—introduced in JDK 12 and JDK 11, respectively
—are ultra-low-pause-time collectors that aim to minimize STW pauses. In
JDK 17, they are single-generational collectors. Apart from utilizing thread-
local handshakes, these collectors know how to optimize for low-pause
scenarios either by employing the application threads to help out or by
asking the application threads to back off. This GC technique is known as
graceful degradation.
Parallel GC Threads, Concurrent GC Threads,
and Their Configuration
In the HotSpot VM, the total number of GC worker threads (also known as
parallel GC threads) is calculated as a fraction of the total number of
processing cores available to the Java process at start-up. Users can adjust
the parallel GC thread count by assigning it directly on the command line
using the -XX:ParallelGCThreads=<n> flag.
This configuration flag enables developers to define the number of parallel
GC threads for GC algorithms that use parallel collection phases. It is
particularly useful for tuning generational GCs, such as the Parallel GC and
G1 GC. Recent additions, such as Shenandoah and ZGC, also use multiple GC
worker threads and perform garbage collection concurrently with the
application threads to minimize pause times. They benefit from load
balancing, work sharing, and work stealing, which enhance performance
and efficiency by parallelizing the garbage collection process. This
parallelization is particularly beneficial for applications running on multi-
core processors, as it allows the GC to make better use of the available
hardware resources.
In a similar vein, the -XX:ConcGCThreads=<n> configuration flag allows
developers to specify the number of concurrent GC threads for specific GC
algorithms that use concurrent collection phases. This flag is particularly
useful for tuning GCs like G1, which performs concurrent work during
marking, and Shenandoah and ZGC, which aim to minimize STW pauses
by executing concurrent marking, relocation, and compaction.
By default, the number of parallel GC threads is automatically calculated
based on the available CPU cores. Concurrent GC threads usually default to
one-fourth of the parallel GC threads. However, developers may want to
adjust the number of parallel or concurrent GC threads to better align with
their application’s performance requirements and available hardware
resources.
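The defaults described above can be sketched as a formula. The helpers below approximate HotSpot's ergonomics (the exact heuristic and rounding vary across releases and collectors, so treat this as an illustration rather than the authoritative calculation): one parallel GC thread per core up to 8 cores, then 5/8 of the cores beyond 8, with concurrent GC threads at roughly one-fourth of the parallel count.

```java
// An approximation of HotSpot's ergonomic GC thread defaults. The exact
// heuristic differs across JDK releases and collectors; this is only a
// sketch of the commonly cited formula.
public class GcThreadDefaults {
    static int parallelThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    static int concurrentThreads(int parallel) {
        return Math.max(1, parallel / 4);  // ~1/4; rounding varies by collector
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int par = parallelThreads(cpus);
        System.out.println("cpus=" + cpus + " parallel=" + par
                + " concurrent=" + concurrentThreads(par));
    }
}
```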
Increasing the number of parallel GC threads can help improve overall GC
throughput, as more threads work simultaneously on the parallel phases of
this process. This increase may result in shorter GC pause times and
potentially higher application throughput, but developers should be cautious
not to over-commit processing resources.
By comparison, increasing the number of concurrent GC threads can
enhance overall GC performance and expedite the GC cycle, as more
threads work simultaneously on the concurrent phases of this process.
However, this increase may come at the cost of higher CPU utilization and
competition with application threads for CPU resources.
Conversely, reducing the number of parallel or concurrent GC threads may
lower CPU utilization but could result in longer GC pause times, potentially
affecting application performance and responsiveness. In some cases, if the
concurrent collector is unable to keep up with the rate at which the
application allocates objects (a situation referred to as the GC “losing the
race”), it may lead to a graceful degradation—that is, the GC falls back to a
less optimal but more reliable mode of operation, such as a STW collection
mode, or might employ strategies like throttling the application’s allocation
rate to prevent it from overloading the collector.
Figure 1.2 shows the key concepts as a word cloud related to GC work:

Figure 1.2 Key Concepts for Garbage Collection Work


Task queues: Highlighting the mechanisms used by GCs to manage
and distribute work among the GC threads.
Concurrent work: Emphasizing the operations performed by the GC
simultaneously with the application threads, aiming to minimize pause
times.
Graceful degradation: Referring to the GC’s ability to switch to a less
optimal but more reliable mode of operation when it can’t keep up
with the application’s object allocation rate.
Pauses: Highlighting the STW events during which application threads
are halted to allow the GC to perform certain tasks.
Task stealing: Emphasizing the strategy employed by some GCs in
which idle GC threads “steal” tasks from the work queues of busier
threads to ensure efficient load balancing.
Lots of threads: Highlighting the use of multiple threads by GCs to
parallelize the garbage collection process and improve throughput.
It is crucial to find the right balance between the number of parallel and
concurrent GC threads and application performance. Developers should
conduct performance testing and monitoring to identify the optimal
configuration for their specific use case. When tuning these threads,
consider factors such as the number of available CPU cores, the nature of
the application workload, and the desired balance between garbage
collection throughput and application responsiveness.

The Evolution of the Java Programming Language and Its Ecosystem: A Closer Look
The Java language has evolved steadily since the early Java 1.0 days. To
appreciate the advancements in the JVM, and particularly the HotSpot VM,
it’s crucial to understand the evolution of the Java programming language
and its ecosystem. Gaining insight into how language features, libraries,
frameworks, and tools have shaped and influenced the JVM’s performance
optimizations and garbage collection strategies will help us grasp the
broader context.
Java 1.1 to Java 1.4.2 (Java SE 1.4.2)
Java 1.1, originally known as JDK 1.1, introduced JavaBeans, which
allowed multiple objects to be encapsulated in a bean. This version also
brought Java Database Connectivity (JDBC), Remote Method Invocation
(RMI), and inner classes. These features set the stage for more complex
applications, which in turn demanded improved JVM performance and
garbage collection strategies.
From Java 1.2 to Java 5.0, the releases were rebranded under the “Java 2”
name, resulting in names like J2SE (Java 2 Platform, Standard Edition). The
renaming helped differentiate between Java 2 Micro Edition (J2ME) and
Java 2 Enterprise Edition (J2EE).5 J2SE 1.2 introduced two significant
improvements to Java: the Collections Framework and the JIT compiler.
The Collections Framework provided “a unified architecture for
representing and manipulating collections”6, which became essential for
managing large-scale data structures and optimizing memory management
in the JVM.
5
www.oracle.com/java/technologies/javase/javanaming.xhtml
6
Collections Framework Overview.
https://docs.oracle.com/javase/8/docs/technotes/guides/collections/overview.xhtml.
Java 1.3 (J2SE 1.3) added new APIs to the Collections Framework,
introduced Math classes, and made the HotSpot VM the default Java VM. A
directory services API was included for Java RMI to look up any directory
or name service. These enhancements further influenced JVM efficiency by
enabling more memory-efficient data management and interaction patterns.
The introduction of the New Input/Output (NIO) API in Java 1.4 (J2SE
1.4), based on Java Specification Request (JSR) #51,7 significantly
improved I/O operation efficiency. This enhancement resulted in reduced
waiting times for I/O tasks and an overall boost in JVM performance. J2SE
1.4 also introduced the Logging API, which allowed for generating text or
XML-formatted log messages that could be directed to a file or console.
7
https://jcp.org/en/jsr/detail?id=51
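The channel/buffer model that JSR 51 introduced can be sketched in a few lines. The FileChannel and ByteBuffer at the core of the example are the 1.4-era API; the Path/Files helpers used here for setup arrived later with NIO.2 in Java 7. This is an illustrative sketch, not code from the book.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Data moves between a FileChannel and a reusable ByteBuffer, rather than
// one byte (or one line) at a time through a stream.
public class NioSketch {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("nio-demo", ".bin");
        ByteBuffer out = ByteBuffer.wrap("hello, channels".getBytes());

        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.WRITE, StandardOpenOption.READ)) {
            ch.write(out);                        // drain the buffer to the file
            ch.position(0);
            ByteBuffer in = ByteBuffer.allocate(64);
            int n = ch.read(in);                  // fill the buffer from the file
            System.out.println(new String(in.array(), 0, n));
        }
        Files.deleteIfExists(tmp);
    }
}
```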
J2SE 1.4.1 was soon superseded by J2SE 1.4.2, which included numerous
performance enhancements in HotSpot’s client and server compilers.
Security enhancements were added as well, and Java users were introduced
to Java updates via the Java Plug-in Control Panel Update tab. With the
continuous improvements in the Java language and its ecosystem, JVM
performance strategies evolved to accommodate increasingly more complex
and resource-demanding applications.

Java 5 (J2SE 5.0)


The Java language made its first significant leap toward language
refinement with the release of Java 5.0. This version introduced several key
features, including generics, autoboxing/unboxing, annotations, and an
enhanced for loop.

Language Features
Generics introduced two major changes: (1) a change in syntax and (2)
modifications to the core API. Generics allow you to reuse your code for
different data types, meaning you can write just a single class—there is no
need to rewrite it for different inputs.
To compile the generics-enriched Java 5.0 code, you would need to use the
Java compiler javac, which was packaged with the Java 5.0 JDK. (Any
version prior to Java 5.0 did not have the core API changes.) The new Java
compiler would produce errors if any type safety violations were detected at
compile time. Hence, generics introduced type safety into Java. Also,
generics eliminated the need for explicit casting, as casting became implicit.
Here’s an example of how to create a generic class named
FreshmenAdmissions in Java 5.0:

class FreshmenAdmissions<K, V> {
    //…
}

In this example, K and V are placeholders for the actual types of objects. The
class FreshmenAdmissions is a generic type. If we declare an instance of this
generic type without specifying the actual types for K and V, then it is
treated as a raw type.
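Since the book elides the body of FreshmenAdmissions, the field and methods below are illustrative assumptions, not the author's code. They do, however, demonstrate the two points made above: with type arguments supplied, the compiler rejects wrong types at compile time, and no explicit cast is needed when reading values back.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical body for the book's FreshmenAdmissions class; the backing
// map and method names are assumptions made for this illustration.
class FreshmenAdmissions<K, V> {
    private final Map<K, V> admissions = new HashMap<K, V>();

    void admit(K applicant, V decision) { admissions.put(applicant, decision); }
    V decisionFor(K applicant)          { return admissions.get(applicant); }
}

public class GenericsDemo {
    public static void main(String[] args) {
        FreshmenAdmissions<String, Boolean> fall =
                new FreshmenAdmissions<String, Boolean>();  // Java 5-era syntax
        fall.admit("Pat", Boolean.TRUE);

        Boolean decision = fall.decisionFor("Pat");  // no explicit cast needed
        System.out.println("Pat admitted? " + decision);

        // fall.admit(42, Boolean.FALSE);  // compile-time error: 42 is not a String
    }
}
```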
Random documents with unrelated
content Scribd suggests to you:
And after this he was deprived of his bishopric, having a certain
pension assigned unto him for to live on in an abbey, and soon after
he died.
A SEA FIGHT (June 1, 1458).

Source.—Paston Letters, vol. i., No. 317.


John Jerningham to Margaret Paston.
... Right worshipful cousin, if it please you for to hear of such tidings
as we have here, the embassy of Burgundy shall come to Calais the
Saturday after Corpus Christi day, as men say five hundred horse of
them. Moreover, on Trinity Sunday in the morning, came tidings unto
my Lord of Warwick that there were twenty-eight sails of Spaniards
on the sea, and whereof there was sixteen great ships of forecastle;
and then my Lord went and manned five ships of forecastle, and
three carvels, and four pinnaces, and on the Monday, on the
morning after Trinity Sunday, we met together afore Calais at four at
the clock in the morning, and fought that gathering till ten at the
clock; and there we took six of their ships, and they slew of our men
about four score, and hurt two hundred of us right sore; and there
were slain on their part about twelve score; and hurt five hundred of
them.
And it happed me, at the first aboarding of us, we took a ship of 300
ton, and I was left therein and twenty-three men with me; and they
fought so sore that our men were fain to leave them, and then come
they and boarded the ship that I was in, and there I was taken, and
was prisoner with them six hours, and was delivered again for their
men that were taken before. And as men say, there was not so great
a battle upon the sea this forty winter. And forsooth, we were well
and truly beat; and my Lord hath sent for more ships, and like to
fight together again in haste.
THE EVILS IN THE CHURCH (Written
before 1458).

Source.—Gascoigne's Loci e Libro Veritatum, edited


by Rogers. (Oxford: 1881.)
Unworthy promotions [pp. 13, 14].
It is notorious now in the realm of England that boys, youths and
men dwelling in the courts of the worldly are placed in churches, in
high offices and in prelacies, others being set aside who have long
been occupied in study and preaching and in the guiding of the
people without thought of worldly lucre.... Among others unworthily
promoted, one foolish youth, eighteen years of age, was promoted
to twelve prebends and a great archdeaconry of the value of a
hundred pounds, and to one great rectory, and a certain layman
received the rents of all the said benefices, and spent upon the
youth just as much as he, the layman, pleased, and never rendered
an account, and that youth was the son of a simple knight, and, like
an idiot, was drunk almost every day.
Non-residence [pp. 3, 149].
Some never or seldom reside in their cures, and he to whom a
church is appropriated and who is non-resident, comes once a year
to his cure, or sends to the church at the end of the autumn, and
having filled his purse with money and sold his tithes, departs again
far away from his cure to the court where he occupies himself in
money-making and pleasures.... O Lord God! incline the heart of the
Pope, Thy vicar, to remedy the evils which arise through the
appropriation of churches, and by the non-residence of good curates
in the same. For now in England a time draweth nigh when men will
say, "Formerly there were rectors in England, and now there are
ruined churches in which cultured men cannot decently live...."
Church dues oppressive [p. 13].
For Rome, like a singular and principal wild beast, hath laid waste
the vineyards of the church, reserving to herself the elections of
bishops, that none may confer an episcopal church on anyone unless
they first pay the annates or first-fruits and rent of the vacant
church. Also she hath destroyed the vineyard of God's church in
many places, by annulling the elections of all the bishops in England.
Also she destroys the church by promoting wicked men according as
the King and the Pope agree.
The abuse of the Sacraments [pp. 197].
It is now known that many infants die without baptism because the
parish churches have no fonts, and divers abbeys have licence and
custom that everyone of certain parishes should baptise in their
monasteries, and yet they cannot come conveniently by night, or at
other times to the font there.
Proud Prelates [pp. 22, 23].
Bishops were wont, as is manifest in the Life of St. Cuthbert, to talk
humbly and familiarly with their inferiors and every day to give
everyone of their flock an audience if he sought to speak with his
bishop. Recently a poor man came to the servant of a certain
archbishop, the son of a lord, and said "I marvel that the archbishop
does not give audience in his own person to his flock as his
predecessor was wont to do." The servant replied "My lord the
present archbishop was not bred in the same way as his
predecessor" (meaning by this that his lord the archbishop, who was
so strange and distant to his flock, was the son of a lord, and his
predecessor was the son of a poor man); the poor man answered
the said servant, "Truly the present archbishop and his predecessor
were bred in different fashions, but it is manifest that the
predecessor was the better man and more useful to his flock and to
their souls and to the whole diocese."
THE EVILS OF MISGOVERNMENT
(1459).

Source.—An English Chronicle, edited by Davies, pp.


79, 80. (Camden Society, 1846.)
In this same time the realm of England was out of all good
governance, as it had been many days before, for the King was
simple and led by covetous counsel, and owed more than he was
worth. His debts increased daily, but payment there was none; all
the possessions and lordships that pertained to the Crown the King
had given away, some to lords and some to other simple persons, so
that he had almost nought to live on. And such impositions as were
put to the people, as taxes, tallages and quinzimes (fifteenths), all
that came from them were spent in vain, for he held no household
nor maintained no wars. For these misgovernances, and for many
other, the hearts of the people were turned away from them that
had the land in governance, and their blessing was turned into
cursing. The queen with such as were of her affinity ruled the realm
as they liked, gathering riches innumerable. The officers of the
realm, and especially the earl of Wiltshire, treasurer of England, for
to enrich himself, peeled the poor people and disinherited rightful
heirs and did many wrongs. The queen was defamed and slandered,
that he that was called Prince was not her son.... Wherefore she,
dreading that he should not succeed his father in the crown of
England, allied unto her all the knights and squires of Cheshire, for
to have their benevolence, and held open household among them ...
trusting through them to make her son King.
YORK'S POPULARITY (1460).

Source.—An English Chronicle, edited by Davies, p.


93. (Camden Society, 1846.)

Ballad set upon the Gates of Canterbury.


Send home most gracious Lord Jesu most benign,
Send home thy true blood unto his proper vein,
Richard duke of York, Job thy servant insign,
Whom Satan not ceaseth to set at care and
disdain,
But by Thee preserved he may not be slain;
Set him ut sedeat in principibus, as he did before,
And so to our new song, Lord, thine ears incline,
Gloria, laus et honor Tibi sit Rex Christe
Redemptor!
Edward Earl of March, whose fame the earth shall
spread,
Richard Earl of Salisbury named prudence,
With that noble knight and flower of manhood,
Richard Earl of Warwick, shield of our defence,
Also little Falconberg, a knight of great reverence;
Jesu them restore to their honour as they had
before,
And ever shall we sing to thine High Excellence,
Gloria, laus et honor Tibi sit Rex Christe
Redemptor!
The dead man greeteth you well,
That is just true as steel,
With very good intent.
Also the Realm of England,
Soon to loose from Sorrow's bond
By right indifferent judgement.
THE BATTLE OF NORTHAMPTON
(July 10, 1460).

Source.—An English Chronicle, edited by Davies, pp.


96-98. (Camden Society, 1846.)
The King at Northampton lay at Friars, and had ordained there a
strong and mighty field in the meadows, armed and arrayed with
guns, having the river at his back. The earls [March and Warwick]
with the number of sixty thousand, as it was said, came to
Northampton and sent certain bishops to the King beseeching him
that, in eschewing of effusion of Christian blood, he would admit and
suffer the earls for to come into his presence to declare themselves
as they were. The duke of Buckingham that stood beside the King,
said unto them, "Ye come not as bishops for to treat for peace, but
as men of arms;" because they brought with them a notable
company of men of arms. They answered and said, "We come thus
for surety of our persons, for they that be about the King be not our
friends."
"Forsooth!" said the duke, "the Earl of Warwick shall not come to the
King's presence, and if he come he shall die." The messengers
returned again and told this to the earls....

Then on the Thursday the xth day of July, the year of our Lord 1460,
at two hours after noon, the said earls of March and Warwick let cry
through the field, that no man should lay hands upon the King nor
on the common people, but only on the lords, knights, and squires:
then the trumpets blew up, and both hosts encountered and fought
together half an hour,... The duke of Buckingham, the earl of
Shrewsbury, the lord Beaumont, the lord Egremont were slain by the
Kentishmen besides the King's tent, and many other knights and
squires. The ordinance of the King's guns availed not, for that day
was so great rain that the guns lay deep in water, and so were
quenched and might not be shot. When the field was done, and the
earls through mercy and help had the victory, they came to the King
in his tent, and said in this wise: "Most noble Prince, displease you
not, though it hath pleased God of his Grace to grant us the victory
of our mortal enemies, the which by their venomous malice have
untruly steered and moved your highness to exile us out of your
land. We come not to that intent for to inquiet nor grieve your said
highness, but for to please your most noble person, desiring most
tenderly the high welfare and prosperity thereof, and of all your
realm, and for to be your true liegemen while our lives shall endure."
The King of their words was greatly recomforted, and anon was led
into Northampton with procession, where he rested him three days,
and then came to London, the xvj day of the month abovesaid, and
lodged in the bishop's palace. For the which victory London gave to
Almighty God great laud and thanking.
THE WANDERINGS OF QUEEN
MARGARET (1460).

Source.—Gregory's "Chronicle" in the Collections of


a London Citizen, pp. 208, 209. (Camden Society.)
And that same night the King [Henry VI.] removed unto London,
against his will, to the bishop's palace of London, and the Duke of
York come unto him that same night by torch-light and took upon
him as King and said in many places that "this is ours by very right."
And then the Queen, hearing this, voided unto Wales, but she was
met beside the Castle of Malpas, and a servant of her own that she
had made both yoeman and gentleman and after appointed for to be
in office with her son the prince, spoiled her and robbed her and put
her so in doubt of her life and son's life also. And then she come to
the castle of Hardelowe [Harlech] in Wales, and she had many great
gifts and [was] greatly comforted, for she had need thereof. And
most commonly she rode behind a young poor gentleman of
fourteen year age, his name was John Combe, born at Amysbery in
Wiltshire. And there hence she removed full privily unto the Lord
Jasper, Lord and Earl of Pembroke, for she durst not abide in no
place that was open, but in private. The cause was that counterfeit
tokens were sent unto her as though they had come from her most
dread lord the King Harry the VI.; but it was not of his sending,
neither of his doing, but forged thing;... for at the King's departing
from Coventry toward the field of Northampton, he kissed her and
blessed the prince, and commanded her that she should not come
unto him till that he send a special token unto her that no man knew
but the King and she. For the lords would fain had her unto London,
for they knew well that all the workings that were done grew by her,
for she was more wittier than the King, and that appeareth by his
deeds.
THE BATTLE OF WAKEFIELD (1460).

Source.—Hall's Chronicle, pp. 250, 251. (London:


1809.)
[Note.—Hall's Chronicle was first published in 1542, and therefore
the following extract is by no means contemporary with the events it
describes. But it is the only account of the battle of Wakefield, and it
derives some authority from the fact that Hall had an ancestor who
was slain in the fight.]
The duke of York with his people descended down the hill in good
order and array and was suffered to pass forward, toward the main
battle: but when he was in the plain ground between his castle and
the town of Wakefield, he was environed on every side, like a fish in
a net or a deer in a buckstall: so that he, manfully fighting, was
within half an hour slain and dead, and his whole army
discomfited.... While this battle was in fighting a priest called Sir
Robert Aspall, chaplain and schoolmaster to the young earl of
Rutland, second son to the abovenamed duke of York, of the age of
twelve years, a fair gentleman and a maidenlike person, perceiving
that flight was more safeguard than tarrying, both for him and his
master, secretly conveyed the earl out of the field ... but or he could
enter into a house the lord Clifford espied, followed and taken, and
by reason of his apparell demanded what he was. The young
gentleman, dismayed, had not a word to speak, but kneeled on his
knees imploring mercy and desiring grace both with holding up his
hands and making dolorous countenance, for his speech was gone
for fear. "Save him," said the Chaplain, "for he is a prince's son, and
peradventure may do you good hereafter." With that word the Lord
Clifford marked him and said, "By God's blood, thy father slew mine,
and so will I do thee and all thy kin," and with that word stuck the
earl to the heart with his dagger, and bade the chaplain bear the
earl's mother and brother word what he had done.... This cruel
Clifford and deadly blood-supper, not content with this homicide or
child-killing, came to the place where the dead corpse of the duke of
York lay, and caused his head to be stricken off, and set on it a
crown of paper and so fixed it on a pole and presented it to the
Queen, not lying far from the field, in great despite and much
derision, saying, "Madame, your war is done; here is your King's
ransom."
THE RAVAGES OF THE
LANCASTRIANS AFTER THE
VICTORY OF WAKEFIELD (1460).

Source.—Ingulph's Chronicles, pp. 421, 422. (Bohn


Edition.)
The duke being thus removed from this world, the north-men, being
sensible that the only impediment was now withdrawn, and that
there was no one now who could care to resist their inroads, again
swept onwards like a whirlwind from the north, and in the impulse of
their fury attempted to overrun the whole of England. At this period
too, fancying that everything tended to insure them freedom from
molestation, paupers and beggars flocked forth from those quarters
in infinite numbers, just like so many mice rushing forth from their
holes, and universally devoted themselves to spoil and rapine,
without regard of place or person. For, besides the vast quantities of
property which they collected outside, they also irreverently rushed,
in their unbridled and frantic rage, into churches and the other
sanctuaries of God, and most nefariously plundered them of their
chalices, books, and vestments, and, unutterable crime! broke open
the pixes in which were kept the body of Christ, and shook out the
sacred elements therefrom. When the priests and the other faithful
of Christ in any way offered to make resistance, like so many
abandoned wretches as they were, they cruelly slaughtered them in
the very churches or church yards. Thus did they proceed with
impunity, spreading in vast multitudes over a space of thirty miles in
breadth, and, covering the whole surface of the earth just like so
many locusts, made their way almost to the very walls of London; all
the moveables which they could possibly collect in every quarter
being placed on beasts of burden and carried off. With such avidity
for spoil did they press on, that they dug up the precious vessels,
which, through fear of them, had been concealed in the earth, and
with threats of death compelled the people to produce the treasures
which they had hidden in remote and obscure spots.
THE BATTLE OF MORTIMER'S
CROSS (1461).

Source.—Gregory's "Chronicle," in the Collections of a London Citizen, p. 211. (Camden Society.)
Also Edward Earl of March, the Duke of York's son and heir, had a
great journey at Mortimer's Cross in Wales the second day of
February next so following, and there he put to flight the Earl of
Pembroke,[18] (and) the Earl of Wiltshire. And there he took and
slew of knights and squires to the number of 3,000.
[18] Jasper Tudor.

And in that journey was Owen Tudor taken and brought unto
Hereford, and he was beheaded at the market place, and his head
set upon the highest grice[19] of the market cross, and a mad
woman combed his hair and washed away the blood of his face, and
she got candles and set them about him, burning more than a
hundred. This Owen Tudor was father unto the Earl of Pembroke,
and had wedded Queen Catherine, King Harry the VI.'s mother,
thinking and trusting all the way that he should not be beheaded
until he saw the axe and the block, and when that he was in his
doublet he trusted on pardon and grace till the collar of his red
velvet doublet was ripped off. Then he said: "That head shall lie on
the stock that was wont to lie on Queen Catherine's lap," and put his
heart and mind wholly unto God, and full meekly to his death.
[19] Grices = steps upon which crosses are placed.
BATTLE OF TOWTON (1461).

Source.—Ingulph's Chronicles, pp. 425, 426. (Bohn Edition.)
Edward pursued them as far as a level spot of ground, situate near
the castle of Pomfret and the bridge at Ferrybridge, and washed by
a stream of considerable size; where he found an army drawn up in
order of battle, composed of the remnants of the northern troops of
King Henry. They, accordingly, engaged in a most severe conflict,
and fighting hand to hand with sword and spear, there was no small
slaughter on either side. However, by the mercy of the Divine
clemency, King Edward soon experienced the favour of heaven, and,
gaining the wished-for victory over his enemies, compelled them
either to submit to be slain or to take to flight. For, their ranks being
now broken and scattered in flight, the King's army eagerly pursued
them, and cutting down the fugitives with their swords, just like so
many sheep for the slaughter, made immense havoc among them for
a distance of ten miles, as far as the city of York. Prince Edward,
however, with a part of his men, as conqueror, remained upon the
field of battle, and awaited the rest of his army, which had gone in
various directions in pursuit of the enemy.
When the solemnities of the Lord's day, which is known as Palm
Sunday, were now close at hand, after distributing rewards among
such as brought the bodies of the slain, and gave them burial, the
King hastened to enter the before-named city. Those who helped to
inter the bodies, piled up in pits and in trenches prepared for the
purpose, bear witness that eight-and-thirty thousand warriors fell on
that day, besides those who were drowned in the river before
alluded to, whose numbers we have no means of ascertaining. The
blood, too, of the slain, mingling with the snow, which at this time
covered the whole surface of the earth, afterwards ran down in the
furrows and ditches along with the melted snow, in a most shocking
manner, for a distance of two or three miles.
POPULAR BALLAD ON THE
ACCESSION OF EDWARD IV. (1461).

Source.—Archæologia, vol. xxix., p. 130.
"On Thursday the first week in Lent came Edward to London with
thirty thousand men, and so in field and town everyone called
Edward King of England and France."
Since God hath chosen thee to be his Knight,
And possessed thee in this right,
Then him honour with all thy might,
Edwardus Dei gratia!
Out of the stock that long lay dead,
God hath caused thee to spring and spread,
And of all England to be the head,
Edwardus Dei gratia!
Since God hath given thee through his might,
Out of that stock bred in sight,
The flower to spring and rose so white,
Edwardus Dei gratia!
Then give him laud and praising,
Thou virgin Knight of whom we sing,
Undefiled since thy beginning,
Edwardus Dei gratia!
God save thy countenance,
And so prosper to his pleasance,
That ever thine estate thou mayst enhance,
Edwardus Dei gratia!
THE MAYOR OF LONDON'S DIGNITY
(1463).

Source.—Gregory's "Chronicle," in the Collections of a London Citizen, pp. 222, 223. (Camden Society.)
This year, about Midsummer, at the royal feast of the Sergeants of
the Coif, the Mayor of London was desired to be at that feast. And at
dinner time he came to the feast with his officers, agreeing and
according unto his degree. For within London he is next unto the
King in all manner [of] thing. And in time of washing the Earl of
Worcester was taken before the mayor and set down in the midst of
the high table. And the mayor seeing that his place was occupied
held him content, and went home again without meat or drink or
anything, but reward him he did as his dignity required of the city.
And took with him the substance of his brethren the aldermen to his
place, and were set and served as soon as any man could devise,
both of cygnet and of other delicacies enough, that all the house
marvelled how well everything was done in so short a time....
Then the officers of the feast, full evil ashamed, informed the
masters of the feast of this mishap that is befallen. And they,
considering the great dignity and costs and charge that belonged to
the city, anon sent unto the mayor a present of meat, bread and
wine and many divers subtleties. But when they that come with the
presents saw all the gifts and the service that was at the board, he
was full sore ashamed that should do the message, for the present
was not better than the service of meat was before the mayor and
throughout the high table. But his demeaning was so that he had
love and thanks for his message and a great reward withal. And thus
the worship of the city was kept and not lost for him. And I trust
that never it shall, by the grace of God.
THE MARRIAGE OF EDWARD IV.
(1464).

Source.—Gregory's "Chronicle," in the Collections of a London Citizen, pp. 226, 227. (Camden Society.)
Now take heed what love may do, for love will not nor may not cast
no fault nor peril in nothing.
That same year, the first day of May, our sovereign lord the King
Edward IV. was wedded to the Lord Rivers' daughter; her name is
Dame Elizabeth that was wife unto Sir John Grey.... And this
marriage was kept full secretly long and many a day, that no man
knew it; but men marvelled that our sovereign lord was so long
without any wife, and were ever feared that he had been not chaste
of his living. But on All Hallows' day at Reading there it was known,
for there the King kept his common council, and the lords moved
him and exhorted him in God's name to be wedded and to live under
the law of God and Church, and (that) they would send into some
strong land to inquire a queen of good birth according to his dignity.
And then our sovereign might no longer hide his marriage, and told
them how he had done, and made that the marriage should be
opened unto his lords.
A DINNER OF FLESH (circa 1465).

Source.—The Boke of Nurture, by John Russell (1460-1470). (Roxburghe Club, 1867.)

The furst Course.
Furst set for the mustard and brawne of boore, the
wild swyne,
Such potage as the cooke hathe made of yerbis spice
and wyne,
Beeff, moton, stewed feysaund, Swan with the
Chawdyn,[20]
Capoun, pigge, vensoun bake, lech lombard,[21] frutur
veaunt[22] fyne.
And then a Sotelte: }
Maiden mary that holy virgyne, } A Sotelte.[23]
And Gabrielle gretynge hur with an Ave }

[20] A sauce for swans.


[21] A dish of pork, eggs, cloves, currants, dates, and sugar
powdered together.
[22] Meat fritter.
[23] Made of sugar and wax.

The Second Course.
Two potages, blanger mangere and also Jely
For a standard vensoun rost kyd, faun or cony,
bustard, stork, crane pecock in hakille ryally,[24]
Partriche, wodcock plovere, egret, Rabettes sowkere,
[25]
Great birds, larks gentille, Creme de mere,
dowcettes,[26] payne puff with lech Jely ambere.
... A sotelte followynge in fere,
the course for to fullfylle,
An angelle goodly can appere,
And syngynge with a mery chere
Unto iij shepperds upon an hille.

[24] Sewn in the skin.


[25] Sucking rabbits.
[26] Sweet cakes.

The iij Course.