Understanding Software Dynamics (Preview)

This document provides an overview of measuring performance in complex software environments like datacenters. It discusses key concepts like transaction latency, tail latency of the slowest transactions, and how hardware is utilized. The goal is to understand why some transactions are unexpectedly slow by observing program dynamics and making informed estimates of how long each part of a program should take. Reducing these occasional slow transactions can improve hardware efficiency and response times for users.



Part I

Measurement
Understanding variation is the key to success in quality and business.
—W. Edwards Deming

Measurement is the act of ascertaining the size, amount, or degree of something.


Careful measurements are the underpinning of understanding software performance.

This first part describes a complex hardware and software environment, the book’s emphasis on transaction latency, the concept of latency distributions, and the consequences of long 99th percentile latencies.

Our overall goal is to understand the root causes of variance in transaction latency—the apparently random, unexpectedly long response times in complex software.

The datacenter environment is a superset of the environment you might have set up when exploring the performance of database transactions, desktop software delays, dedicated controller delays, or game delays. This part also introduces the important practice of estimating within a factor of 10 how long pieces of code should take. As an underpinning for the rest of the book, it leads readers through detailed measurements of CPU, memory, disk, and network latencies. These chapters use pre-supplied but flawed programs that every reader can run to get some insight, and then can modify as directed to fix the flaws and gain substantially more insight. The resulting measurements will start to show the sources of latency variation in simple programs.

The first part serves to bring readers with varying backgrounds to a common base of knowledge about performance measurement, user- and kernel-mode software interactions, cross-thread and cross-program software interference, and interactions between complex software and computer hardware. At the end of this part, every reader will be able to make informed estimates of how long a piece of code should take.

Humble Bundle Pearson Software Development — © Pearson. Do Not Distribute.




Chapter 1
My Program Is Too Slow

Someone walks into my office and says “My program is too slow.” After a pause, I ask “How slow should it be?”

A good programmer has a ready answer to this question, as she describes the work to be done and estimates of how long each portion should take. Perhaps she says “This database query accesses 10,000 records of which about 1,000 turn out to be relevant; each access should take about 10 milliseconds and they are spread across 20 disks, so 10,000 accesses should be about 5 seconds total. There is no network activity and the CPU processing and memory use are small and simple—all much faster than the disk access time. The actual query is taking about 15 seconds, which is too slow.”
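The arithmetic behind such an estimate is worth making explicit. Here is a minimal sketch in Python, using the numbers from the example above (the variable names are mine, not from the book):

```python
# Back-of-envelope estimate for the database query described above:
# total disk time for all accesses, divided by the parallelism of the disks.
accesses = 10_000        # records touched by the query
per_access_s = 0.010     # about 10 ms per disk access
disks = 20               # accesses spread evenly across 20 disks

estimated_s = accesses * per_access_s / disks
print(estimated_s)       # 5.0 seconds, matching the estimate in the text
```

The measured 15 seconds is three times this estimate, and that gap between expectation and observation is exactly what justifies digging further.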

A sloppier programmer might answer “I wrote 1,000 lines of code all night using lots of existing
libraries, and it all works but takes about 15 seconds per query, and I want it to take 1/10 of a
second. One of those libraries must be too slow; how can I find it?” When asked, he has no idea
whether 1/10 of a second is a reasonable expectation, no idea how long each library call should
take, no idea if he is using the libraries appropriately, and no designed-in way to observe the
dynamics of his code to determine where the time really goes. We will explore all these issues in
this book.

1.1 Datacenter Context


We introduce some terms and concepts from a complex software environment. Your environment may be much simpler, but the ideas carry over almost exactly. The terminology is from datacenters, but the ideas also apply to database, desktop, vehicle, gaming, and other time-constrained environments.

A transaction or query or request is an input message to a computer system that must be dealt with as a single unit of work. Each computer processing transactions is termed a server. The latency or response time of a transaction is the time elapsed between sending a message and receiving its result. The offered load is the number of transactions sent per second; when this exceeds the number of transactions processed per second, response time suffers, sometimes dramatically. A service is a collection of programs that handle one particular kind of transaction. Large datacenters process transactions for dozens of different services simultaneously, and each service has a different offered load and a different latency goal.
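To get a feel for why response time suffers “sometimes dramatically” as offered load approaches capacity, a standard queueing approximation helps. The sketch below uses the M/M/1 model, which is my illustration rather than anything from this book: mean response time is 1/(mu - lam) for service rate mu and offered load lam.

```python
# M/M/1 queueing approximation: mean response time = 1 / (mu - lam),
# with mu the service rate and lam the offered load, both in
# transactions per second. As lam approaches mu, latency blows up.
def mean_response_time_s(offered_load_tps: float, service_rate_tps: float) -> float:
    assert offered_load_tps < service_rate_tps, "overloaded: latency grows without bound"
    return 1.0 / (service_rate_tps - offered_load_tps)

mu = 1000.0   # assumed server capacity: 1,000 transactions/second
for lam in (500.0, 900.0, 990.0):
    ms = mean_response_time_s(lam, mu) * 1000
    print(f"{lam:.0f} tps offered -> {ms:.0f} ms mean response time")
```

Raising the load from 50% to 99% of capacity, which is less than a 2x increase in traffic, multiplies the mean response time by 50.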




Transaction latency is not constant—it has a probability distribution taken over thousands of
transactions per second. Tail latency refers to the slowest transactions in this distribution. A
simple way to summarize the tail latency is to state the 99th percentile latency—the time that is
exceeded by the slowest 1% of all transactions, i.e., by 50 transactions every second if the offered
load is 5,000 transactions per second.
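The percentile computation itself is simple. A sketch using the nearest-rank method (the sample latencies below are synthetic, purely for illustration):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value that is
    greater than or equal to p percent of all the samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))   # 1-indexed rank
    return ordered[rank - 1]

# 100 synthetic latencies of 1, 2, ..., 100 ms.
latencies_ms = list(range(1, 101))
print(percentile(latencies_ms, 99))   # 99: only the slowest 1% exceed this
```

At an offered load of 5,000 transactions per second, the transactions above this threshold are exactly the 50 per second mentioned in the text.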

By the dynamics of a program or collection of programs we mean the activity over time—what pieces of code run when, what they wait for, what memory space they take, and how different programs affect each other. As programmers, we imagine in our heads simple dynamics for a program, but in reality the program may (occasionally) behave much differently than that picture and perform much more slowly than expected. If we can observe the true dynamics, we can adjust our mental picture and usually improve the code’s performance with simple changes.

We are interested in user-facing transactions in complex software—the datacenter half of cell phones, for example. We are particularly interested in transactions that are usually fast but occasionally take much longer—enough that the end user sees an annoying delay. In datacenters, the hardware budget for each service is often determined by how many transactions per second each server can “handle.” This target number is determined empirically by increasing the offered load until some tail-latency time constraint is exceeded, and then the target load is backed off a little.

If we can understand and then reduce the number of too-long transactions, the same hardware can handle larger loads within the tail-latency goal, at no additional cost. This is worth a lot of money. A skilled and somewhat lucky performance engineer can occasionally make a simple software change that saves enough money to pay for 10 years of salary. Companies and customers like such people.

Time-constrained transaction software is fundamentally different from batch or offline software (or most benchmarks). The important metric for transaction software is response time, while the important metric for batch software is usually efficient hardware utilization. For transactions, it is not the average response time that matters, but the slowest times, the tail latency.

In a datacenter, a higher average latency but shorter tail latency is usually preferred
over a lower average latency and longer tail latency. Most commuters prefer the same
thing—a route that takes a few minutes longer but always takes about the same time
is better than a slightly faster route that occasionally has unpredictable hour-long
delays.

For batch software, having the CPUs 98% busy on average can be good; for transaction software, 98% busy is a disaster, and even CPUs 50% busy on average might be too much, because it produces long response times whenever the offered load spikes for a few seconds to 3x above the average. When I first joined Google in 2004, the average datacenter CPU was 9% busy and 91% idle. The 9% busy was too low. Increasing that to 18% without increasing tail latency doubled the efficiency of all those datacenters. Doubling again to 36% busy would be good, but doubling a third time to 72% busy would likely ruin too many transactions’ time constraints.
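A crude way to see this tension is to ask what each average-busy level in the paragraph above becomes during a brief 3x load spike. Demand above 100% means some transactions queue while the spike lasts; whether that is acceptable depends on the service’s tail-latency goal. The 3x factor is the spike mentioned in the text; the rest is my illustration.

```python
# What does each average CPU-busy level become during a brief 3x load spike?
# Demand over 100% cannot be served immediately, so requests queue up.
spike_factor = 3
for avg_busy_pct in (9, 18, 36, 50, 72):
    demand_pct = avg_busy_pct * spike_factor
    print(f"{avg_busy_pct:2d}% average busy -> {demand_pct:3d}% demanded during a 3x spike")
```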

In looking at the performance of complex transaction-oriented software, we assume in this book that the programs involved fundamentally work and that on average they work quickly enough.




We won’t discuss designing or debugging such software, nor understanding or improving its
average performance. We also assume that always-slow transactions have been identified and
fixed in offline test/debug environments that have no time constraints, leaving us just with
occasionally slow transactions. We focus on the mechanisms that make occasional transactions
slow, on how to observe these mechanisms, and on how to interpret the observations.

When you use a cell phone to send a text message, read a post, search the web, look at a map,
stream a video, use an app, or even dial a telephone number, there is a datacenter somewhere
that responds to your requests. If these responses are annoyingly slow and some competing
app or service is faster, you may well switch to that one, or at least use the slow one less often.
Everyone in a time-constrained ecosystem has an incentive, often financial, to reduce annoying
delays. Few people have the skills to do so.

It is the goal of this book to teach a few more people how.

1.2 Datacenter Hardware


Large datacenters have something like 10,000 servers in a building, with each server a PC about the size of a desktop PC but without the case. Instead, about 50 server boards are mounted in a rack, and there are 200 racks spread around a very large room. A typical server has 1–4 CPU chip sockets with 4–50 CPU cores each, a boatload¹ of RAM, a couple of disks or solid-state drives (SSDs), and a network connection to a datacenter-wide switching fabric set up so that any server can communicate with any other server, and at least some of the servers can also communicate with the Internet and hence your phone. Outside the building, there are big generators that can run the entire building, including air conditioning, for days or weeks when there is a power outage. Inside, there are batteries that can run the servers and network switches for tens of seconds while the generators start up.

Each server runs multiple programs. It usually doesn’t make business sense to dedicate some servers to just doing email, others to just map tiles, and others to just instant messages. Instead, each server runs multiple programs, and each of those programs likely has multiple threads. For example, an email server program might have 100 worker threads processing email requests for several thousand users simultaneously, most of whom are typing or reading, with many of the active threads waiting for disk accesses or for other software layers. The worker threads take incoming requests, do whatever is asked, respond, and then go on to another pending request from another user. During the busiest hour of the day almost all the worker threads are busy, while during the slowest hour of the day at least half of them will be idle, waiting for work. There is a constant boom-and-bust cycle of offered work at almost all time scales—microsecond, millisecond, second, and minute. There is even a seven-day cycle with lower activity on Saturday and Sunday (for Western work weeks).

To control response times, it is important to have spare hardware resources available for user-facing transactions, since the user load tends to spike now and then based on physical-world events. But it is also economical to have some non-user-facing batch programs to run when there are otherwise idle processors. In addition to user-facing foreground programs and batch

¹ A computer science technical term, 10¹².

