
ByteByteGo - Technical Interview Prep

The document provides guidance on back-of-the-envelope estimation techniques for system design interviews, emphasizing the importance of scalability, latency, and availability metrics. It includes examples of data volume calculations, typical latency times for various operations, and common assumptions for estimating query per second (QPS) and storage requirements. Additionally, it offers tips for effective estimation during interviews, such as rounding, writing down assumptions, and labeling units.

Uploaded by

bavandla1988
11/13/24, 11:09 AM ByteByteGo | Technical Interview Prep

System Design Interview

03 Back-of-the-envelope Estimation

In a system design interview, sometimes you are asked to estimate system capacity or performance requirements using a back-of-the-envelope estimation. According to Jeff Dean, Google Senior Fellow, “back-of-the-envelope calculations are estimates you create using a combination of thought experiments and common performance numbers to get a good feel for which designs will meet your requirements” [1].
You need to have a good sense of scalability basics to effectively carry out back-of-the-envelope estimation. The following concepts should be well understood: power of two [2], latency numbers every programmer should know, and availability numbers.
Power of two
Although data volume can become enormous when dealing with distributed systems, calculation all boils down to the basics. To obtain correct calculations, it is critical to know the data volume unit using the power of 2. A byte is a sequence of 8 bits. An ASCII character uses one byte of memory (8 bits). Below is a table explaining the data volume unit (Table 1).

Power   Approximate value   Full name    Short name
10      1 Thousand          1 Kilobyte   1 KB
20      1 Million           1 Megabyte   1 MB
30      1 Billion           1 Gigabyte   1 GB
40      1 Trillion          1 Terabyte   1 TB
50      1 Quadrillion       1 Petabyte   1 PB
Table 1
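As a rough illustration of Table 1, the sketch below compares each power of two with the rounded decimal value the table pairs it with. The names UNITS, exact_bytes and approximation_error are hypothetical helpers introduced here for illustration; they do not come from the original text.

```python
# Powers of two from Table 1, with their approximate value and short unit name.
UNITS = {
    10: ("1 Thousand", "KB"),
    20: ("1 Million", "MB"),
    30: ("1 Billion", "GB"),
    40: ("1 Trillion", "TB"),
    50: ("1 Quadrillion", "PB"),
}


def exact_bytes(power: int) -> int:
    """Exact number of bytes represented by 2**power."""
    return 2 ** power


def approximation_error(power: int) -> float:
    """Relative gap between 2**power and the rounded decimal value.

    Every 10 powers of two are treated as 3 powers of ten
    (2**10 ~ 10**3, 2**20 ~ 10**6, and so on).
    """
    approx = 10 ** (3 * power // 10)
    return (exact_bytes(power) - approx) / approx


for power, (approx_name, short) in UNITS.items():
    print(
        f"2^{power} = {exact_bytes(power):,} bytes ~ {approx_name} "
        f"(1 {short}), off by {approximation_error(power):.1%}"
    )
```

The gap grows with the power, which is one reason to state which convention you are using when an estimate reaches terabytes or petabytes.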
Latency numbers every programmer should know
Dr. Dean from Google reveals the length of typical computer operations in 2010 [1]. Some numbers are outdated as computers become faster and more powerful. However, those numbers should still be able to give us an idea of the relative speed of different computer operations.

Operation name                                     Time
L1 cache reference                                 0.5 ns
Branch mispredict                                  5 ns
L2 cache reference                                 7 ns
Mutex lock/unlock                                  100 ns
Main memory reference                              100 ns
Compress 1K bytes with Zippy                       10,000 ns = 10 µs
Send 2K bytes over 1 Gbps network                  20,000 ns = 20 µs
Read 1 MB sequentially from memory                 250,000 ns = 250 µs
Round trip within the same datacenter              500,000 ns = 500 µs
Disk seek                                          10,000,000 ns = 10 ms
Read 1 MB sequentially from the network            10,000,000 ns = 10 ms
Read 1 MB sequentially from disk                   30,000,000 ns = 30 ms
Send packet CA (California) -> Netherlands -> CA   150,000,000 ns = 150 ms
Table 2
Notes
ns = nanosecond, µs = microsecond, ms = millisecond
1 ns = 10^-9 seconds
1 µs = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 µs = 1,000,000 ns
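The unit conversions in the notes can be sketched in a few lines of Python. The dictionary below copies a subset of the nanosecond figures from Table 2; the helper names humanize and ratio are hypothetical and exist only for this illustration.

```python
# A subset of the Table 2 latency figures, in nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within the same datacenter": 500_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 30_000_000,
    "Send packet CA->Netherlands->CA": 150_000_000,
}


def humanize(ns: float) -> str:
    """Render a nanosecond count in ns, µs or ms (1 µs = 1,000 ns, 1 ms = 1,000,000 ns)."""
    if ns >= 1_000_000:
        return f"{ns / 1_000_000:g} ms"
    if ns >= 1_000:
        return f"{ns / 1_000:g} µs"
    return f"{ns:g} ns"


def ratio(slow: str, fast: str) -> float:
    """How many times longer the `slow` operation takes than the `fast` one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


for name, ns in LATENCY_NS.items():
    print(f"{name}: {humanize(ns)}")

print(ratio("Read 1 MB sequentially from disk", "Read 1 MB sequentially from memory"))
```

With the 2010 figures, reading 1 MB sequentially from disk comes out 120 times slower than reading it from memory, which is the kind of ratio worth remembering even as the absolute numbers age.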
A Google software engineer built a tool to visualize Dr. Dean’s numbers. The tool also takes the time factor into consideration. Figure 1 shows the visualized latency numbers as of 2020 (source of figures: reference material [3]).
Figure 1
By analyzing the numbers in Figure 1, we get the following conclusions:
Memory is fast but the disk is slow.
Avoid disk seeks if possible.
Simple compression algorithms are fast.
Compress data before sending it over the internet if possible.
Data centers are usually in different regions, and it takes time to send data between them.

Availability numbers
High availability is the ability of a system to be continuously operational for a desirably long period of time. High availability is measured as a percentage, with 100% meaning a service that has zero downtime. Most services fall between 99% and 100%.
A service level agreement (SLA) is a commonly used term for service providers. This is an agreement between you (the service provider) and your customer, and this agreement formally defines the level of uptime your service will deliver. Cloud providers Amazon [4], Google [5], and Microsoft [6] set their SLAs at 99.9% or above. Uptime is traditionally measured in nines. The more nines, the better. As shown in Table 3, the number of nines correlates to the expected system downtime.

Availability %   Downtime per day     Downtime per week     Downtime per month   Downtime per year
99%              14.40 minutes        1.68 hours            7.31 hours           3.65 days
99.99%           8.64 seconds         1.01 minutes          4.38 minutes         52.60 minutes
99.999%          864.00 milliseconds  6.05 seconds          26.30 seconds        5.26 minutes
99.9999%         86.40 milliseconds   604.80 milliseconds   2.63 seconds         31.56 seconds
Table 3
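The downtime figures in Table 3 follow from one formula: downtime = (1 - availability) * period length. The sketch below is a minimal illustration of that arithmetic. It assumes a year of 365.25 days and a month of one twelfth of a year, which appears to match the table's rounding but is not stated in the text; the names PERIOD_SECONDS and downtime_seconds are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: 365.25-day year, month = year / 12 (not stated in the source).
PERIOD_SECONDS = {
    "day": SECONDS_PER_DAY,
    "week": 7 * SECONDS_PER_DAY,
    "month": 365.25 / 12 * SECONDS_PER_DAY,
    "year": 365.25 * SECONDS_PER_DAY,
}


def downtime_seconds(availability_percent: float, period: str) -> float:
    """Expected downtime in seconds for a given availability over one period."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * PERIOD_SECONDS[period]


for availability in (99, 99.99, 99.999, 99.9999):
    per_day = downtime_seconds(availability, "day")
    per_year = downtime_seconds(availability, "year")
    print(f"{availability}%: {per_day:.4g} s per day, {per_year / 60:.4g} min per year")
```

Each additional nine divides the allowed downtime by ten, so the step from 99% to 99.99% takes the daily budget from 864 seconds (14.40 minutes) down to 8.64 seconds.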
Example: Estimate Twitter QPS and storage requirements
Please note the following numbers are for this exercise only as they are not real numbers from Twitter.
Assumptions:
  300 million monthly active users.
  50% of users use Twitter daily.
  Users post 2 tweets per day on average.
  10% of tweets contain media.
  Data is stored for 5 years.
Estimations:
Query per second (QPS) estimate:
  Daily active users (DAU) = 300 million * 50% = 150 million
  Tweets QPS = 150 million * 2 tweets / 24 hours / 3600 seconds = ~3500
  Peak QPS = 2 * QPS = ~7000
We will only estimate media storage here.
  Average tweet size:
    tweet_id 64 bytes
    text 140 bytes
    media 1 MB
  Media storage: 150 million * 2 * 10% * 1 MB = 30 TB per day
  5-year media storage: 30 TB * 365 * 5 = ~55 PB
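The same estimate can be written out as a short script, which makes every assumption explicit and easy to change. This is a minimal sketch: the constant names are hypothetical, and it uses decimal units (1 TB = 10^6 MB, 1 PB = 10^3 TB) as an assumption to match the rounding in the worked example.

```python
# Assumptions from the exercise (not real Twitter numbers).
MONTHLY_ACTIVE_USERS = 300_000_000
DAILY_ACTIVE_RATIO = 0.5
TWEETS_PER_USER_PER_DAY = 2
MEDIA_RATIO = 0.10
MEDIA_SIZE_MB = 1
RETENTION_YEARS = 5
SECONDS_PER_DAY = 24 * 3600

# QPS estimate.
daily_active_users = MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO
tweets_per_day = daily_active_users * TWEETS_PER_USER_PER_DAY
tweet_qps = tweets_per_day / SECONDS_PER_DAY
peak_qps = 2 * tweet_qps  # peak assumed to be twice the average, as in the text

# Media storage estimate, in decimal units (assumption: 1 TB = 10**6 MB).
media_tb_per_day = tweets_per_day * MEDIA_RATIO * MEDIA_SIZE_MB / 1_000_000
media_pb_total = media_tb_per_day * 365 * RETENTION_YEARS / 1_000

print(f"DAU: {daily_active_users:,.0f}")
print(f"Tweet QPS: {tweet_qps:,.0f} (peak {peak_qps:,.0f})")
print(f"Media storage: {media_tb_per_day:,.0f} TB per day")
print(f"{RETENTION_YEARS}-year media storage: {media_pb_total:,.2f} PB")
```

The unrounded results (about 3,472 QPS and 54.75 PB) sit close to the ~3500 and ~55 PB quoted above, which shows how little is lost by rounding during the interview.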

Tips
Back-of-the-envelope estimation is all about the process. Solving the problem is more important than obtaining results. Interviewers may test your problem-solving skills. Here are a few tips to follow:
Rounding and Approximation. It is difficult to perform complicated math operations during the interview. For example, what is the result of “99987 / 9.1”? There is no need to spend valuable time solving complicated math problems. Precision is not expected. Use round numbers and approximation to your advantage. The division question can be simplified as follows: “100,000 / 10”.
Write down your assumptions. It is a good idea to write down your assumptions to be referenced later.
Label your units. When you write down “5”, does it mean 5 KB or 5 MB? You might confuse yourself with this. Write down the units because “5 MB” helps to remove ambiguity.
Commonly asked back-of-the-envelope estimations: QPS, peak QPS, storage, cache, number of servers, etc. You can practice these calculations when preparing for an interview. Practice makes perfect.
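To get a feel for how much the rounding tip costs in accuracy, the sketch below compares the exact division from the example with its rounded form. The helper relative_error is a hypothetical name used only for this illustration.

```python
def relative_error(exact: float, approx: float) -> float:
    """Relative difference between an approximation and the exact value."""
    return abs(approx - exact) / abs(exact)


exact = 99987 / 9.1      # the "hard" division from the example
rounded = 100_000 / 10   # the simplified version suggested in the tip

print(f"exact   = {exact:,.1f}")
print(f"rounded = {rounded:,.1f}")
print(f"relative error = {relative_error(exact, rounded):.1%}")
```

The rounded answer lands within roughly 9% of the exact one, which is usually well inside the tolerance of an estimate whose inputs are themselves assumptions.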
Congratulations on getting this far! Now give yourself a pat on the back. Good job!

Reference materials
[1] J. Dean. Google Pro Tip: Use Back-Of-The-Envelope-Calculations To Choose The Best Design:
http://highscalability.com/blog/2011/1/26/google-pro-tip-use-ba
he-envelope-calculations-to-choo.html
[2] System design primer:
https://github.com/donnemartin/system-design-primer
[3] Latency Numbers Every Programmer Should Know:
https://colin-scott.github.io/personal_website/research/interactiv
cy.html
[4] Amazon Compute Service Level Agreement:
https://aws.amazon.com/compute/sla/
[5] Compute Engine Service Level Agreement (SLA):
https://cloud.google.com/compute/sla

https://bytebytego.com/courses/system-design-interview/back-of-the-envelope-estimation 7/7
