
System Design

System design is based on the principles of distributed systems: groups of computers working together to achieve a common goal.

A group of computers can handle more data and more users, and can take on tougher tasks.

Vertical Scaling and Horizontal Scaling


Performance: how fast your system works.
Reliability: a system can be made more reliable using mechanisms like replication, redundancy, and failover.

Strong consistency can slow down performance, so many systems adopt eventual consistency.

Trade-offs are made based on what we need.
Storage involves choosing the right database, designing the DB schema, and using techniques like partitioning, sharding, and replication for optimal storage and retrieval.

Partitioning

Partitioning divides a single database table into smaller, more manageable pieces called
partitions. These partitions are stored within the same database instance. Partitioning is typically
used to improve query performance, optimize storage, and facilitate easier maintenance.

Sharding

Sharding is a form of horizontal partitioning but is applied across multiple database instances
(or servers). Sharding is commonly used in distributed systems to handle large-scale datasets and
high traffic by spreading the load across multiple machines.
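
A minimal sketch of how a shard router might pick a database instance; the shard names and the user_id partition key are illustrative assumptions, not a specific product's API:

import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # separate DB instances

def shard_for(user_id: str) -> str:
    # Hash the partition key and map it onto one of the shards.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user-42"))  # the same user always routes to the same shard
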
 To make sure everything we do in a database is done correctly and reliably.

Consistent hashing makes it easier to add or remove servers without causing too many disruptions, and it also helps improve load balancing and scalability.
Building Blocks:
3. Databases  most common are SQL and NoSQL
Message queues decouple sender and receiver, allowing them to work independently and at different rates.

Distributed Systems book: Designing Data-Intensive Applications


Object Storage  example Amazon S3

It can be used for various purposes like Load Balancing, Caching, Security or Content Filtering

8. CDN  Content Delivery Network  group of servers spread across different locations worldwide.

It stores copies of website content like images, videos, and files. When you visit a website, the CDN serves the content from the nearest server, making the website load faster.

System design interviews are all about trade-offs

Expectation in a system design interview is that you can clarify functional and non-functional requirements
Netflix Blog, Uber Blog, Airbnb Tech Blogs
LLD: implementation details  writing actual code, classes, and objects

UML: Unified Modelling Language


Design Patterns:

How to answer a LLD question in an interview

1) Clarify requirements and core use cases


2) Identify entities
3) Create classes
4) Identify core methods based on use cases
5) Define relationships between classes
6) Implement methods
7) Exception Handling  How you would handle errors, edge cases, exceptions, and unexpected input
Best LLD Coding practices:

Design patterns in system design provide reusable solutions to common problems encountered
when designing complex systems. These patterns help ensure scalability, maintainability,
reliability, and efficiency. Below is an overview of some key design patterns used in system
design:

HLD Patterns are about macro-level architecture—how systems interact, scale, and operate.
LLD Patterns are about micro-level design—how classes and objects work together within the
architecture.
High Level Design

https://www.hellointerview.com/learn/system-design/in-a-hurry/core-concepts

Scalability:
How Databases Help Solve Synchronization Challenges

Transactions:

 Most databases provide transactions to ensure atomicity, consistency, isolation, and durability
(ACID).
 Transactions allow multiple operations (like reads and writes) to be treated as a single unit,
ensuring that all operations succeed or fail together.
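
For example, a minimal sketch using Python's built-in sqlite3, with an illustrative accounts table; both updates commit together or roll back together:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")

try:
    with conn:  # BEGIN ... COMMIT, or automatic ROLLBACK on exception
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 'bob'")
except sqlite3.Error:
    pass  # if either update failed, neither took effect

print(conn.execute("SELECT * FROM accounts").fetchall())
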
In Redis, this distributed locking scheme is called Redlock.
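
A hedged sketch of the single-instance lock-acquire step that Redlock builds on, assuming the redis-py client and a local Redis server; full Redlock acquires the lock on a majority of several independent Redis instances:

import uuid
import redis

r = redis.Redis(host="localhost", port=6379)
lock_key, token = "lock:order-123", str(uuid.uuid4())  # illustrative key

# SET ... NX PX: succeeds only if the key does not already exist, and
# expires automatically so a crashed lock holder cannot block others forever.
if r.set(lock_key, token, nx=True, px=30_000):
    try:
        ...  # critical-section work
    finally:
        # Release only if we still own the lock (check our token first);
        # production code makes this compare-and-delete atomic with a Lua script.
        if r.get(lock_key) == token.encode():
            r.delete(lock_key)
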
Fan-Out in Distributed Systems
Consistency:

What is Cache Invalidation?

A cache is a high-speed data storage layer that temporarily stores frequently accessed data to
improve application performance. However, if the original data changes and the cache is not
updated, the cache contains outdated (or "stale") data. Invalidating the cache ensures that it
stops serving stale data and retrieves fresh data from the primary source.
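
A minimal sketch of one common approach (invalidate-on-write), using plain dicts to stand in for the cache and the primary store:

cache = {}
database = {"user:1": "Alice"}  # the primary data source

def read(key):
    if key not in cache:            # cache miss
        cache[key] = database[key]  # fetch fresh data from the primary source
    return cache[key]

def write(key, value):
    database[key] = value
    cache.pop(key, None)            # invalidate the now-stale cached copy

print(read("user:1"))    # 'Alice' (cached on first read)
write("user:1", "Alicia")
print(read("user:1"))    # 'Alicia' (stale entry was invalidated)
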
Locking:

Locks are important for enforcing the correctness of our system but can be disastrous for
performance.

How OCC (Optimistic Concurrency Control) Works


Indexing:
Indexing in Relational Databases

In relational databases (e.g., MySQL, PostgreSQL, SQL Server), indexes can be created on one
or more columns of a table. These indexes are maintained by the database system and are highly
optimized for common operations like searching, sorting, and filtering.
You cannot have unlimited secondary indexes due to performance and storage considerations.

B-Tree index:
In most relational database management systems (RDBMS), a B-tree index is created by default on the primary key of a table.
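
For instance, SQLite (whose indexes are B-trees) indexes the primary key automatically, and secondary indexes can be added explicitly; the table and column names below are illustrative:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")  # secondary index

# The query planner now uses the B-tree index instead of a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)  # the plan output mentions idx_users_email
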
Geo Spatial Indexing:
Vector indexes (for High Dimensional data):
Full Text indexes:
Creating and using specialized indexes allows you to efficiently search and query large datasets, whether
you are dealing with geospatial data, high-dimensional vectors, or large text fields. Each type of index
has its own strengths and use cases and understanding when and how to use them is crucial for
optimizing performance in database design.

Communication Protocols:
You'll be asked to reason about the communication protocols you'll use to build your system. You've got two different categories of protocols to handle: internal and external. Internally, for a typical microservice application, which constitutes 90%+ of system design problems, either HTTP(S) or gRPC will do the job. Don't make things complicated.
Externally, you'll need to consider how your clients will communicate with your
system: who initiates the communication, what are the latency considerations,
and how much data needs to be sent.
Across these choices, most systems can be built with a combination of HTTP(S), SSE or long polling, and WebSockets. Browsers and apps are built to handle these protocols, they're easy to use, and, generally speaking, most system design interviews don't deal with clients that need custom, high-performance protocols.
A server can handle multiple WebSocket connections with more than one client at a time. In fact, WebSocket servers are designed to support many concurrent connections, allowing each client to maintain a separate WebSocket connection to the server for real-time, bidirectional communication.
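
A hedged sketch of one server holding many concurrent connections, assuming the third-party websockets package (the handler signature varies slightly between package versions):

import asyncio
import websockets

connected = set()  # one entry per currently connected client

async def handler(ws):
    connected.add(ws)
    try:
        async for message in ws:    # read from this client...
            for peer in connected:  # ...and push to every connected client
                await peer.send(message)
    finally:
        connected.discard(ws)

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run forever

asyncio.run(main())
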
Ⅰ. Scalability:

As a system grows, the performance starts to degrade unless we adapt it to deal with that growth.
Scalability is the property of a system to handle a growing amount of load by adding resources to the
system.

System can grow in multiple ways:


Ways to scale a system:
3. Load Balancers
4. Caching
5. CDN’s

6. Partitioning
Amazon DynamoDB uses partitioning to distribute data and traffic for its NoSQL database service across
many servers, ensuring fast performance and scalability.

Asynchronous communication:
Slack uses asynchronous communication to handle messaging. Here's how it works:
Ⅱ. Availability

Redundancy Techniques:
Load Balancers:

 Hardware Load Balancers: Physical devices that distribute traffic based on pre-configured rules.
 Software Load Balancers: Software solutions that manage traffic distribution, such as HAProxy, Nginx, or cloud-based solutions like AWS Elastic Load Balancer.

Failover Mechanisms: Failover mechanisms automatically switch to a redundant system when a failure
is detected.

Data replication:

Monitoring and Alerts:


Ⅲ. CAP Theorem: desirable properties of distributed systems with replicated data: CAP
In a consistent distributed system, if you write data to node A, a read operation from node B will
immediately reflect the write operation on node A.
Consistency is crucial for applications where having the most up-to-date data is critical, such as
financial systems, where a balance inquiry must reflect the most up-to-date state of an account.
Availability is important for applications that need to remain operational at all
times, such as online retail systems.
A network partition occurs when a network failure causes a distributed system to split into two
or more groups of nodes that cannot communicate with each other.

When there is a network partition, the system must choose between Consistency and Availability.

Partition Tolerance is essential for distributed systems because network failures can and do
happen. A system that tolerates partitions can maintain operations across different network
segments.
Designing distributed systems requires carefully balancing these trade-offs
based on application requirements.
Systems like Cassandra allow configuring the level of consistency on a per-query basis, providing
flexibility.
Ⅳ. ACID Properties:

"Rolled back" means undoing all the changes made during a transaction, ensuring that the system
returns to its previous state as if the transaction never occurred. This is a critical feature of
Atomicity in ACID transactions.
Distributed databases, like Amazon DynamoDB or Google Spanner, operate across multiple
nodes or regions to handle massive amounts of data efficiently. Ensuring consistency in such
environments is challenging because of the need to balance three key aspects: Consistency,
Availability, and Partition Tolerance (as described in the CAP theorem). This gives rise to two
models of consistency:
Choosing Between Strong and Eventual Consistency:

 Strong Consistency: Suitable for critical systems like financial services, healthcare applications,
or inventory management, where data accuracy is non-negotiable.
 Eventual Consistency: Works well for use cases like social media, content distribution, or
caching systems, where availability and performance are more important than immediate
accuracy.
How do ACID transactions work?
 Begin Transaction
 Execute Transaction
 Commit Transaction
 Rollback Transaction
V. Consistent Hashing: a common method to distribute data as evenly as possible among servers.

serverIndex = hash(key) % N, where N is the size of the server pool.

If the number of servers stays the same, an object key always maps to the same server. This approach works when the data distribution is even and the number of servers does not change, but it breaks down when servers are added or removed.
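
A quick demonstration of that weakness; going from 4 to 5 servers remaps roughly 80% of keys under modulo hashing:

import hashlib

def server_index(key, n):
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"key-{i}" for i in range(1000)]
moved = sum(server_index(k, 4) != server_index(k, 5) for k in keys)
print(f"{moved}/1000 keys remapped")  # typically around 800
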
ByteByteGo video: Consistent Hashing  hash both object keys and server names with the same hash function over the same range of values, called the hash space. We connect the hash space's beginning and end to form a hash ring. Using the hash function, we hash each server by its name or IP address and place the server onto the ring. Next, we hash each object with the same hash function, this time without the modulo operator, to place it on the hash ring. To locate the server for a particular object, we go clockwise from the location of the object key on the ring until a server is found.
With Simple Hashing, when a new server is added, almost all the keys need to be remapped. But
with consistent hashing, adding a new server only requires redistribution of a fraction of the
keys.

Cons: the distribution of the objects across the servers on the ring is likely to be uneven. Picking n random points on the ring is very unlikely to produce a perfect partition of the ring into equally sized segments, even if the original servers are uniformly placed. To avoid this, we use the concept of virtual nodes: each server handles multiple segments on the ring. As the number of virtual nodes increases, the distribution of objects becomes more balanced. Having more virtual nodes means taking more space to store the metadata about the virtual nodes. This is a trade-off: we can tune the number of virtual nodes to fit our system requirements.

Amazon DynamoDB and Apache Cassandra use consistent hashing for data partitioning. It helps these databases minimize data movement during rebalancing.

CDNs like Akamai use consistent hashing to help distribute web content evenly among the edge servers.
Load balancers like Google's load balancer use consistent hashing to distribute persistent connections evenly across backend servers. This limits the number of connections that need to be re-established when a backend server goes down.

 Node: a server that provides functionality to other services


 Hash function: a mathematical function used to map data of
arbitrary size to fixed-size values
 Data partitioning: a technique of distributing data across multiple
nodes to improve the performance and scalability of the system
 Data replication: a technique of storing multiple copies of the same
data on different nodes to improve the availability and durability of
the system
 Hotspot: A performance-degraded node in a distributed system
due to a large share of data storage and a high volume of retrieval
or storage requests
 Gossip protocol: peer-to-peer communication technique used by
nodes to periodically exchange state information.
Orthogonality of Replication and Partitioning:

The terms orthogonal and separate are used here to explain that replication and partitioning
are independent of each other and can be applied together in a flexible way.

 Partitioning and Replication Are Independent: You can use partitioning (sharding)
without replication, or you can replicate the data across servers while still partitioning it.
These two strategies do not directly affect each other but can be combined for better
performance and fault tolerance.

 Replication Tradeoffs:

 More replicas increase availability but also introduce the cost of synchronizing replicas.
 Spread: Having multiple replicas of the same data item can increase the spread, but the
system must avoid overloading certain replicas.

 Partitioning Tradeoffs:

 Memory Bound: Partitioning helps prevent overloading individual servers and ensures
that the data set can scale.
 Increased Throughput: Partitioning helps distribute the workload across multiple
servers, improving throughput and enabling the system to handle more operations in
parallel.
Static hash partitioning:
Consistent Hashing: Consistent hashing minimizes the number of keys to be
remapped when the total number of nodes changes
The following operations are executed to locate the position of a node on the hash ring:
1. Hash the internet protocol (IP) address or domain name of the
node using a hash function
2. The hash code is base converted
3. Modulo the hash code with the total number of available positions
on the hash ring

Suppose the hash function produces an output space size of 10 bits (2¹⁰
= 1024), the hash ring formed is a virtual circle with a number range
starting from 0 to 1023. The hashed value of the IP address of a node is
used to assign a location for the node on the hash ring.

The key of the data object is hashed using the same hash function to
locate the position of the key on the hash ring. The hash ring is
traversed in the clockwise direction starting from the position of the key
until a node is found. The data object is stored on the node that was
found. In simple words, the first node with a position value greater than
the position of the key stores the data object.

The failure (crash) of a node results in the movement of data objects from the failed node to the immediate neighboring node in the clockwise direction. The remaining nodes on the hash ring are unaffected.

Average number of keys stored on a node = k/N

where k is the total number of keys (data objects) and N is the number
of nodes.

The deletion or addition of a node results in the movement of, on average, the number of keys stored on a single node (k/N). Consistent hashing aids cloud computing by minimizing the movement of data when the total number of nodes changes due to dynamic load.
There is a chance that nodes are not uniformly distributed on the consistent hash ring. The nodes that receive a huge amount of traffic become hotspots, resulting in cascading failure of the nodes.

Consistent Hashing: Virtual Nodes

The nodes are assigned to multiple positions on the hash ring by hashing
the node IDs through distinct hash functions to ensure uniform
distribution of keys among the nodes. The technique of assigning
multiple positions to a node is known as a virtual node. The virtual
nodes improve the load balancing of the system and prevent hotspots.
The number of positions for a node is decided by the heterogeneity of
the node. In other words, the nodes with a higher capacity are assigned
more positions on the hash ring.

The data objects can be replicated on adjacent nodes to minimize the data movement when a node crashes or when a node is added to the hash ring. In conclusion, consistent hashing resolves the problem of dynamic load.
The BST data structure is stored on a centralized highly available
service. As an alternative, the BST data structure is stored on each node,
and the state information between the nodes is synchronized through
the gossip protocol.
In the diagram, suppose the hash of an arbitrary key ‘xyz’ yields the
hash code output 5. The successor BST node is 6 and the data object
with the key ‘xyz’ is stored on the node that is at position 6. In general,
the following operations are executed to insert a key (data object):
1. Hash the key of the data object
2. Search the BST in logarithmic time to find the BST node
immediately greater than the hashed output
3. Store the data object in the successor node.
The insertion of a new node results in the movement of data objects that
fall within the range of the new node from the successor node. Each
node might store an internal or an external BST to track the keys
allocated in the node. The following operations are executed to insert a
node on the hash ring:
1. Insert the hash of the node ID in BST in logarithmic time
2. Identify the keys that fall within the subrange of the new node from
the successor node on BST
3. Move the keys to the new node

The deletion of a node results in the movement of data objects that fall
within the range of the decommissioned node to the successor node. An
additional external BST can be used to track the keys allocated in the
node. The following operations are executed to delete a node on the
hash ring:
1. Delete the hash of the decommissioned node ID in BST in
logarithmic time
2. Identify the keys that fall within the range of the decommissioned
node
3. Move the keys to the successor node
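
A minimal sketch of such a ring in Python; a sorted list plus binary search (bisect) plays the role of the BST, giving logarithmic successor lookups, and virtual nodes are simulated by hashing each node ID several times:

import bisect
import hashlib

class HashRing:
    def __init__(self, vnodes=100):
        self.vnodes = vnodes  # positions (virtual nodes) per physical node
        self.ring = []        # sorted hash positions
        self.owner = {}       # position -> node id

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node_id):
        for i in range(self.vnodes):
            pos = self._hash(f"{node_id}#{i}")
            bisect.insort(self.ring, pos)  # logarithmic-time search
            self.owner[pos] = node_id

    def remove_node(self, node_id):
        for i in range(self.vnodes):
            pos = self._hash(f"{node_id}#{i}")
            self.ring.remove(pos)
            del self.owner[pos]

    def lookup(self, key):
        # First position clockwise from the key's hash, wrapping around 0.
        idx = bisect.bisect(self.ring, self._hash(key)) % len(self.ring)
        return self.owner[self.ring[idx]]

ring = HashRing()
for n in ("node-a", "node-b", "node-c"):
    ring.add_node(n)
print(ring.lookup("xyz"))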

The asymptotic complexity of consistent hashing operations is expressed in terms of k, the number of keys, and n, the total number of nodes.
The following are the disadvantages of consistent hashing:
 cascading failure due to hotspots
 non-uniform distribution of nodes and data
 oblivious to the heterogeneity in the performance of nodes

The following are the disadvantages of virtual nodes:
 when a specific data object becomes extremely popular, consistent hashing will still send all the requests for the popular data object to the same subset of nodes, resulting in a degradation of the service
 capacity planning is trickier with virtual nodes
 memory costs and operational complexity increase due to the maintenance of the BST
 replication of data objects is challenging due to the additional logic to identify the distinct physical nodes
 downtime of a virtual node affects multiple nodes on the ring
Discord and distributed NoSQL data stores such as Amazon DynamoDB, Apache Cassandra, and Riak use consistent hashing to dynamically partition the data set across the set of nodes. The data is partitioned for incremental scalability.
The video storage and streaming service Vimeo uses consistent hashing
for load balancing the traffic to stream videos.
The video streaming service Netflix uses consistent hashing to distribute
the uploaded video content across the content delivery network (CDN).

Bounded Load Consistent Hashing:

Multiprobe Consistent Hashing: multi-probe consistent hashing is an enhancement of the traditional consistent hashing algorithm. Its primary goal is to improve the load balancing of keys across cache servers, especially in systems where there are relatively few servers, or when virtual nodes (used to improve load distribution) are not feasible or sufficient.

How Multi-Probe Consistent Hashing Works


 Key Idea: Instead of using a single hash value to determine a server for a key, multiple
hash values are used to probe multiple positions in the hash space. The best position is
chosen based on some optimization criteria (e.g., minimizing load imbalance).
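
A hedged sketch of that key idea, reusing the HashRing class from the earlier example (built with vnodes=1, i.e., no virtual nodes); the probe count of 3 is an arbitrary choice:

import bisect
import hashlib

def multiprobe_lookup(ring, key, probes=3):
    # Hash the key several times; keep the probe that lands closest
    # (clockwise) to a node, which evens out load without virtual nodes.
    best_node, best_gap = None, None
    for i in range(probes):
        h = int(hashlib.md5(f"{key}/{i}".encode()).hexdigest(), 16)
        idx = bisect.bisect(ring.ring, h) % len(ring.ring)
        gap = (ring.ring[idx] - h) % (2 ** 128)  # clockwise distance to successor
        if best_gap is None or gap < best_gap:
            best_node, best_gap = ring.owner[ring.ring[idx]], gap
    return best_node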

Consistent hashing is popular among distributed systems such as internet-scale URL shorteners and Pastebin. The most common use cases of consistent hashing are data partitioning and load balancing.

Ⅵ. Rate Limiting:
Token Bucket Algorithm:
The Token Bucket Algorithm is ideal for scenarios where occasional bursts of requests are acceptable, as
it combines a steady refill rate with the flexibility to handle short-term spikes. However, it may not be
suitable for environments with strict smoothness requirements or where scaling to a large number of
users is a concern.
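
A minimal in-process sketch of the algorithm; the rate and capacity values are arbitrary:

import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill at the steady rate, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1      # each request consumes one token
            return True
        return False              # bucket empty: reject or queue the request

bucket = TokenBucket(rate=5, capacity=10)
print(sum(bucket.allow() for _ in range(12)))  # a burst of 12: the first 10 pass
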
Leaky Bucket:
Fixed Window Counter:
Sliding Window Counter:
Single Point of Failure: (SPOF)
The application servers are not SPOFs since you have two of them. If one fails,
the other can still handle requests, assuming the load balancer can distribute
traffic effectively.
Strategies to avoid SPOF:

2. Load Balancing
5. Graceful Handling of Failures

Design applications to handle failures without crashing.

Example: If a service that provides user recommendations fails, the application should still function, perhaps with a message indicating limited features temporarily.

Implement failover mechanisms to automatically switch to backup systems when failures are detected.
Fault Tolerance:
Fault tolerance describes a system’s ability to handle errors and
outages without any loss of functionality. It is a critical capability,
especially in cloud computing, where reliability and uptime are
paramount.
What is fault tolerance in cloud computing? It involves designing
systems that can automatically recover from failures, ensuring
minimal disruption to services. This is essential for maintaining
customer trust and business continuity.

These are some of the most common approaches to achieving fault tolerance:

Multiple hardware systems capable of doing the same work. For example, an application can have its two databases located on two different physical servers, potentially in different locations. That way, if the primary database server experiences an error, a hardware failure, or a power outage, the other server might not be affected.

Multiple instances of software capable of doing the same work. For example, many modern applications make use of containerization platforms such as Kubernetes so that they can run multiple instances of software services. One reason for this is so that if one instance encounters an error or goes offline, traffic can be routed to other instances to maintain application functionality.

Backup sources of power, such as generators, are often used in on-premises systems to protect the application from being knocked offline if power to the servers is impacted by, for example, the weather. That type of outage is more common than you might expect.

High availability refers to a system's total uptime, and achieving high availability is one of the primary reasons architects look to build fault-tolerant systems. But availability and fault tolerance are not the same thing. Keeping an application highly available is not simply a matter of making it fault tolerant. A highly fault-tolerant application could still fail to achieve high availability if, for example, it has to be taken offline regularly to upgrade software components, change the database schema, etc. However, it's difficult to achieve high availability without robust fault-tolerant systems.

It's important to assess the level of fault tolerance your application requires and build your system accordingly, since building fault-tolerant systems can be complex and expensive.

In this case, your goal is normal functioning: you want your application, and by extension the user's experience, to remain unchanged even if an element of your system fails or is knocked offline.

Another approach is aiming for what's called graceful degradation, where outages and errors are allowed to impact functionality and degrade the user experience, but not knock the application out entirely. For example, if a software instance encounters an error during a period of heavy traffic, the application experience may slow for other users, and certain features might become unavailable.

Building for normal functioning obviously provides a superior user experience, but it's also generally more expensive. The goals for a specific application, then, might depend on what it is used for. Mission-critical applications and systems will likely need to maintain normal functioning, whereas it might make economic sense to allow less essential systems to degrade gracefully.

Survival goals can vary, but here are some common ones for
applications that run on one or more of the public clouds, in
ascending order of resilience:

 Survive node failure. Running instances of your software on multiple nodes (often different physical servers) within the same AZ (data center) can allow your application to survive faults (such as hardware failures or errors) on one or more of those nodes.
 Survive AZ failure. Running instances of your software
across multiple availability zones (data centers) within a cloud
region will allow you to survive AZ outages, such as a specific
data center losing power during a storm.
 Survive region failure. Running instances of your software across multiple cloud regions can allow you to survive an outage affecting an entire region, such as the AWS us-east-1 outage in 2020.
 Survive cloud provider failure. Running instances of your
software both in the cloud and on-premises, or across multiple
cloud providers, can allow you to survive even a full cloud
provider outage.
This application could survive a node, AZ, or even region failure
affecting its application layer, its database layer, or both.
In the Application Layer:

In the diagram above, the application is spread across multiple regions, with each region having its own Kubernetes cluster.

Within each region, the application is built with microservices that execute specific tasks, and these microservices are typically operated inside Kubernetes pods. This allows for much greater fault tolerance, since a new pod with a new instance can be started up whenever an existing pod encounters an error. This approach also makes the application easier to scale horizontally: as the load on a specific service increases, additional instances of that service can be added in real time to handle the load, and then removed when the load dies down again and they're no longer needed.

In the Database Layer:

Data Pipeline

 CDC (Change Data Capture): Changes in the database are captured in real time and
sent to Kafka, which acts as a message broker for streaming data.
 Data Warehouse: Kafka streams the captured data changes to the data warehouse,
enabling:
o Analytics and reporting.
o Real-time data processing for insights.
 This ensures that the data warehouse stays in sync with the operational database.
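
A hedged sketch of the Kafka hop in such a pipeline, assuming the kafka-python client and a broker on localhost:9092; the topic name and event payload are illustrative (a real CDC tool such as Debezium would produce these events):

import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)
# One captured change event: a row updated in the operational database.
producer.send("db.changes", {"table": "orders", "op": "UPDATE", "id": 42})
producer.flush()

# The data-warehouse loader consumes the same topic to stay in sync.
consumer = KafkaConsumer("db.changes", bootstrap_servers="localhost:9092")
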
The application in the diagram above takes a similar approach in the
database layer. Here, CockroachDB is chosen because its
distributed, node-based nature naturally provides a high level of
fault tolerance and the same flexibility when it comes to scaling up
and down horizontally. Being a distributed SQL database, it also
allows for strong consistency guarantees, which is important for
most transactional workloads.

CockroachDB also makes sense for this architecture because although it's a distributed database, it can be treated like a single-instance Postgres database by the application: almost all the complexity of distributing the data to meet your application's availability and survival goals happens under the hood.

What is Consensus in Distributed System?

In a distributed system, multiple computers (known as nodes) are mutually connected and collaborate with each other through message passing. During computation, they need to agree upon a common value to coordinate among multiple processes. This phenomenon is known as distributed consensus.

In a distributed system, it may happen that multiple nodes are processing large computations distributedly, and they need to know the results of each node to keep themselves updated about the whole system. In such a situation, the nodes need to agree upon a common value. This is where the requirement for consensus comes into the picture.
Challenges in Distributed Consensus
A distributed system can face mainly two types of failure.

1. Crash failure
2. Byzantine failure
Consensus Algorithms
Practical Byzantine Fault Tolerance

There are other voting-based consensus algorithms like HotStuff, Paxos, Raft, etc.
Consensus algorithms are used in distributed systems to ensure that multiple independent nodes (or entities) can agree on a common state, despite failures or network partitions. These algorithms are crucial in decentralized networks, where there is no central authority to coordinate decisions.
Gossip Protocol:

Centralized State Management Service


A centralized state management service such as Apache Zookeeper can
be configured as the service discovery to keep track of the state of
every node in the system. Although this approach provides a strong
consistency guarantee, the primary drawbacks are the state
management service becomes a single point of failure and runs into
scalability problems for a large distributed system.

Peer-To-Peer State Management Service

The peer-to-peer state management approach is inclined towards high availability and eventual consistency. The gossip protocol algorithms can be used to implement peer-to-peer state management services with high scalability and improved resilience [1].
The gossip protocol is also known as the epidemic
protocol because the transmission of the messages is similar to the
way how epidemics spread. The concept of communication in gossip
protocol is analogous to the spread of rumors among the office staff or
the dissemination of information on a social media website.
Broadcast Protocols

The popular message broadcasting techniques in a distributed system are the following:
 point-to-point broadcast
 eager reliable broadcast
 gossip protocol
The gossip protocol is a decentralized peer-to-peer communication
technique to transmit messages in an enormous distributed system. The
key concept of gossip protocol is that every node periodically sends out
a message to a subset of other random nodes. The entire system will
receive the message eventually with a high probability. In layman’s
terms, the gossip protocol is a technique for nodes to build a global map
through limited local interactions.
After a few rounds, there is a high probability that each node will receive the message.
The gossip protocol is built on a robust, scalable, and eventually consistent algorithm. It is typically used to maintain the node membership list, achieve consensus, and detect faults in a distributed system. Additional information, such as application-level data, can be piggybacked on gossip messages.
Piggybacking in computer science and distributed systems refers to the practice of attaching
additional information to an existing communication or message, rather than sending a separate,
standalone message for that information.

The gossip protocol is reliable because a node failure can be overcome by the retransmission of a message by another node. First-in-first-out (FIFO) broadcast, causality broadcast, and total order broadcast can be implemented with the gossip protocol. The gossip protocol parameters, such as cycle and fanout, can be tuned to improve its probabilistic guarantees.

The gossip protocol is a decentralized communication protocol designed for scalability, reliability, and
efficiency in large-scale distributed systems.
The number of nodes that will receive the message from a particular
node is known as the fanout. The count of gossip rounds required to
spread a message across the entire cluster is known as the cycle.
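
A toy simulation of these two parameters; each informed node forwards the message to fanout random peers per cycle, and nearly all nodes are reached within a logarithmic number of cycles:

import random

N, FANOUT = 1000, 3
informed = {0}                 # node 0 starts with the message
cycles = 0
while len(informed) < N:
    cycles += 1
    for node in list(informed):
        # Each informed node gossips to FANOUT randomly chosen peers.
        informed.update(random.sample(range(N), FANOUT))
print(f"all {N} nodes informed after {cycles} cycles")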

Gossip Protocol Advantages:

Scalability is the ability of the system to handle increasing load without degradation of performance. The gossip protocol cycle requires logarithmic time to achieve convergence. In addition, every node interacts with only a fixed number of nodes and sends only a fixed number of messages, independent of the number of nodes in the system. A node doesn't wait for an acknowledgment, which improves latency.

Fault tolerance is the ability of the system to remain functional in the occurrence of failures such as node crashes, network partitions, or message loss. A distributed system employing the gossip protocol is fault tolerant due to its tolerance of unreliable networks. The redundancy, parallelism, and randomness offered by the gossip protocol improve the fault tolerance of the system.

Robustness:

The symmetric nature of the nodes participating in the gossip protocol improves the robustness of the system. A node failure will not disrupt the system quality. The gossip protocol is also robust against transient network partitions. However, the gossip protocol is not robust against a malfunctioning node or a malicious gossip message unless the data is self-verified.

A score-based reputation system for nodes can be used to prevent gossip system corruption by malicious nodes. Appropriate mechanisms and policies, such as encryption, authentication, and authorization, must be implemented to enforce the privacy and security of the gossip system.

Convergent Consistency

Consistency is the technique of ensuring the same state view across every node in the system. The different consistency levels, such as strong, eventual, causal, and probabilistic consistency, have different implications for the performance, availability, and correctness of the system [2]. The gossip protocol converges to a consistent state in logarithmic time complexity through the exponential spread of data.

Decentralization: The gossip protocol offers an extremely decentralized model of information discovery through peer-to-peer communication.

Simplicity: Most variants of the gossip protocol can be implemented with very little code and low complexity. The symmetric nature of the nodes makes it trivial to execute the gossip protocol.

Integration and Interoperability:

The gossip protocol can be integrated and interoperated with distributed system components such as the database, cache, and queue. Common interfaces, data formats, and protocols must be defined to implement the gossip protocol across different distributed system components.

Bounded Load: Classic distributed system protocols usually generate high surge loads that might overload individual distributed system components. The gossip protocol will produce only a strictly bounded worst-case load on individual distributed system components, avoiding disruption of service quality. The peer node selection in the gossip protocol can be tuned to reduce the load on network links. In practice, the load generated by the gossip protocol is not only bounded but also negligible compared to the available bandwidth.
Pending: from Types of Gossip Protocol.

Service Discovery:

Today's modern applications are far more complex, consisting of dozens or even hundreds of services, each with multiple instances that scale up and down dynamically.

This makes it harder for services to efficiently find and communicate with
each other across networks.

That’s where Service Discovery comes into play.

Service discovery is a mechanism that allows services in a distributed system to find and
communicate with each other dynamically. It hides the complex details of where services are
located, so they can interact without knowing each other's exact network spots. Service discovery
registers and maintains a record of all your services in a service registry. This service registry
acts as a single source of truth that allows your services to query and communicate with each
other.

Example service registry record of a service (see the sketch after this list):

A service registry typically stores:
 Basic Details: Service name, IP, port, and status.
 Metadata: Version, environment, region, tags, etc.
 Health Information: Health status, last health check.
 Load Balancing Info: Weights, priorities.
 Secure Communication: Protocols, certificates.
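
An illustrative (made-up) registry record with those fields, shown as a Python dict:

registry_record = {
    "service_name": "payment-service",   # basic details
    "instance_id": "payment-7f9c",
    "ip": "10.0.3.17",
    "port": 8443,
    "status": "UP",
    "metadata": {"version": "2.4.1", "environment": "prod", "region": "us-east-1"},
    "health": {"status": "healthy", "last_check": "2024-01-01T12:00:00Z"},
    "load_balancing": {"weight": 5, "priority": 1},
    "secure_communication": {"protocol": "https", "certificate": "..."},
}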

This abstraction is important in environments where services are constantly being added, removed, or scaled.
Think about a massive system like Netflix, with hundreds of microservices
working together. Hardcoding the locations of these services isn’t scalable. If a
service moves to a new server or scales dynamically, it could break the entire
system.

Service discovery solves this by dynamically and reliably enabling services to locate and communicate with one another.

Service registration options:

Service registration is the process where a service announces its availability to a service registry, making it discoverable by other services. The method of registration can vary depending on the architecture, tools, and deployment environment.
Types of Service Discovery
There are two primary types of service discovery: client-side discovery and
server-side discovery.

Client-Side Discovery:

Netflix's open-source library, Eureka, is a popular tool for client-side service discovery.
Service discovery may not be the most glamorous aspect of distributed systems,
but it is undoubtedly one of the most essential. Think of service discovery as the
address book of your microservices architecture. Without it, scaling and
maintaining distributed systems would be chaotic. It serves as the backbone that
enables the seamless communication and coordination between services,
allowing complex applications to function reliably and efficiently.

Best Practices for implementing Service discovery:

Disaster Recovery:

Disaster recovery (DR) is an organization's ability to restore access and functionality to IT infrastructure after a disaster event, whether natural or caused by human action (or error). DR is considered a subset of business continuity, explicitly focusing on ensuring that the IT systems that support critical business functions are operational as soon as possible after a disruptive event occurs.
DR planning and strategies focus on responding to and
recovering from disasters—events that disrupt or completely
stop a business from operating.

Disaster recovery relies on having a solid plan to get critical applications and infrastructure up
and running after an outage—ideally within minutes.

An effective DR plan addresses three different elements for recovery:

 Preventive: Ensuring your systems are as secure and reliable as possible, using tools and
techniques to prevent a disaster from occurring in the first place. This may include
backing up critical data or continuously monitoring environments for configuration
errors and compliance violations.
 Detective: For rapid recovery, you’ll need to know when a response is necessary. These
measures focus on detecting or discovering unwanted events as they happen in real
time.
 Corrective: These measures are aimed at planning for potential DR scenarios, ensuring
backup operations to reduce impact, and putting recovery procedures into action to
restore data and systems quickly when the time comes.

Typically, disaster recovery involves securely replicating and backing up critical data and workloads to a secondary location or multiple locations: disaster recovery sites. A disaster recovery site can be used to recover data from the most recent backup or a previous point in time. Organizations can also switch to using a DR site if the primary location and its systems fail due to an unforeseen event, until the primary one is restored.

Types of Disaster Recovery:

 Backups: With backups, you back up data to an offsite system or ship an external drive to an offsite location. However, backups do not include any IT infrastructure, so they are not considered a full disaster recovery solution.
 Backup as a service (BaaS): Similar to remote data
backups, BaaS solutions provide regular data backups offered
by a third-party provider.
 Disaster recovery as a service (DRaaS): Many cloud
providers offer DRaaS, along with cloud service models
like IaaS and PaaS. A DRaaS service model allows you to back
up your data and IT infrastructure and host them on a third-
party provider’s cloud infrastructure. During a crisis, the
provider will implement and orchestrate your DR plan to help
recover access and functionality with minimal interruption to
operations.
 Point-in-time snapshots: Also known as point-in-time copies,
snapshots replicate data, files, or even an entire database at a
specific point in time. Snapshots can be used to restore data as
long as the copy is stored in a location unaffected by the
event. However, some data loss can occur depending on when
the snapshot was made.
 Virtual DR: Virtual DR solutions allow you to back up
operations and data or even create a complete replica of your
IT infrastructure and run it on offsite virtual machines (VMs). In
the event of a disaster, you can reload your backup and
resume operation quickly. This solution requires frequent data
and workload transfers to be effective.
 Disaster recovery sites: These are locations that
organizations can temporarily use after a disaster event, which
contain backups of data, systems, and other technology
infrastructure.

When it comes to creating disaster recovery strategies, you should carefully consider the following key metrics:

 Recovery time objective (RTO): The maximum acceptable length of time that systems and applications can be down without causing significant damage to the business. For example, some applications can be offline for an hour, while others might need to recover in minutes.
 Recovery point objective (RPO): The maximum age of data
you need to recover to resume operations after a major event.
RPO helps to define the frequency of backups.

Typically, the smaller your RTO and RPO values (or the faster
your applications need to recover after an interruption), the
higher the cost to run your application.

Cloud disaster recovery can greatly reduce the costs of RTO and
RPO when it comes to fulfilling on-premises requirements for
capacity, security, network infrastructure, bandwidth, support, and
facilities.

Distributed Tracing:

Cloud computing, microservices, open-source tools, and container-based delivery have made applications more distributed across an increasingly complex landscape. As a result, distributed tracing has become crucial to responding quickly to issues.

Distributed tracing is a method of observing requests as they propagate through distributed cloud environments. It follows an interaction and tags it with a unique identifier. This identifier stays with the transaction as it interacts with microservices, containers, and infrastructure. In turn, this identifier offers real-time visibility into user experience, from the top of the stack to the application layer and the infrastructure beneath.

 Monolithic applications are typically hosted on a few static servers, making monitoring
straightforward with traditional tools.

 In cloud-native architectures, applications are distributed across multiple servers, containers, or clusters, often running in dynamic environments like Kubernetes. These architectures scale up or down automatically, and instances can appear or disappear at any time. Traditional tools designed for static environments struggle to track such changes.

 In monolithic systems, components communicate internally, often within the same server,
making performance issues easier to trace.

 Microservices architectures involve many loosely coupled services communicating over networks using APIs. Monitoring these interactions requires tools capable of tracking dependencies and pinpointing performance bottlenecks across services.

 Monolithic monitoring tools primarily focus on system-level metrics like CPU usage or
memory consumption.

 Cloud-native systems demand observability at a deeper level, covering logs, metrics, and distributed traces, across a sprawling ecosystem. Modern tools like Prometheus, Grafana, or OpenTelemetry are designed to provide this observability.

Different Types of Tracing:

Distributed tracing offers teams and organizations multiple benefits, including improved application performance, improved compliance with service-level agreements (SLAs), and faster time to market.

Distributed tracing enables teams to monitor and analyze the performance of an application across its
entire architecture, including microservices, APIs, and databases. By identifying bottlenecks, slow
services, or inefficient resource usage, teams can take corrective actions to optimize performance. It
provides a clear picture of how requests flow through different services, allowing developers to pinpoint
and address latency or errors at specific points in the workflow.
SLAs often specify stringent performance and uptime requirements. Distributed tracing helps
organizations meet these commitments by providing real-time visibility into application health and
performance. With detailed insights into response times and service dependencies, teams can ensure
that applications meet agreed-upon thresholds for availability and responsiveness. Moreover,
distributed tracing allows proactive monitoring and resolution of potential issues before they breach SLA
terms, reducing the risk of penalties and maintaining trust with clients or users.

By offering a detailed view of an application’s behavior, distributed tracing significantly accelerates the
debugging and troubleshooting process. It eliminates the need to sift through logs manually by providing
precise data about where and why failures occur. This efficiency enables development teams to resolve
issues quickly, focus more on building new features, and iterate faster. Additionally, distributed tracing
fosters smoother collaboration between teams working on different parts of the application, ensuring
quicker and more coordinated deployments.

Bottom-line growth refers to the increase in a company's profitability, specifically the net profit or the
"bottom line" shown on its financial statement.

Challenges of Distributed Tracing:

Sampling in the context of distributed tracing refers to the practice of selectively capturing or recording a subset of all the trace data generated by requests or transactions in a system, rather than recording every single trace. Since capturing every trace can generate a large volume of data and incur performance costs, sampling helps reduce the overhead by only tracking a smaller, representative sample.

Distributed tracing is essential to monitoring, debugging, and optimizing distributed software architectures such as microservices, especially in dynamic microservices architectures. It tracks a single request by collecting and analyzing data on every interaction with every service the request touches.

In distributed tracing, each individual operation that is part of a larger request is referred to as a span
(or segment). A span represents a single unit of work performed by a service, such as a database query,
API call, or function execution. Each span contains essential metadata, such as the operation name, start
and end timestamps, and any other relevant information, like the status of the operation, logs, or error
details.
As a request travels through multiple services in a microservices-based architecture, the request
may trigger multiple spans. These spans are connected in a parent-child relationship:

 Parent Span: A span that initiates a larger unit of work and may have one or more child
spans associated with it.
 Child Span: A span triggered by a parent span, representing a sub-operation or activity
that takes place as part of the larger task.

The distributed trace is the full picture of a request's lifecycle as it traverses multiple services, and it
links all these spans together in the correct order to provide visibility into the entire flow of the request.
 End-to-end visibility: By following the trace from one service to the next, teams can
understand the flow of requests across different systems.
 Performance monitoring: Identifying which spans take the most time helps in pinpointing
performance issues, such as slow database queries or network delays.

 Troubleshooting: Distributed traces enable easier debugging by showing the sequence of operations and any errors that may have occurred in the process.
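
A minimal sketch of spans linked into one trace; the field names are illustrative rather than any particular tracing library's API:

import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    trace_id: str                   # shared by every span in the same request
    parent_id: str | None = None    # links a child span to its parent
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    start: float = field(default_factory=time.time)
    end: float | None = None

trace_id = uuid.uuid4().hex
root = Span("GET /checkout", trace_id)                        # parent span
db = Span("SELECT orders", trace_id, parent_id=root.span_id)  # child span
db.end = time.time()
root.end = time.time()
print(root, db, sep="\n")  # a tracing backend stitches these into the full trace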

In microservices or serverless architectures especially, distributed tracing is essential for quickly getting answers to specific questions. DevOps, operations teams, and site reliability engineers (SREs) all find distributed tracing useful for the following situations:

 understanding the current health of microservices within a distributed system;
 rapidly identifying the root cause of errors in the same setting; and
 spotting performance bottlenecks that either currently affect or potentially impact the user experience.

Distributed tracing can also be a strategic asset for proactively optimizing problematic or inefficient code within certain microservices.

Log files contain valuable details that are a crucial part of distributed tracing, but logging and distributed tracing are not the same. Writing log files is as much an art as it is a science. Logs must contain enough information to trigger the appropriate action but be lightweight enough not to bog down system resources. To understand the difference, it's helpful to first be aware of two types of logging:

1. centralized logging
2. distributed logging
In centralized logging, each service generates logs that are sent to a central logging system (like
ELK Stack, Splunk, or AWS CloudWatch), where they can be aggregated, searched, and
analyzed. This approach simplifies log management and helps in troubleshooting issues across
multiple services in a distributed system.
In distributed logging, each service or component maintains its own logs, which are stored
locally or in a decentralized manner. This can be beneficial for high-traffic systems or
applications composed of many microservices, where constant transmission of logs to a central
server could overwhelm the system's resources.
Key Performance Metrics:

Key performance metrics (KPMs) are specific, critical data points that provide insight into the
most important aspects of system performance. These metrics are used to evaluate the overall
health and efficiency of an application or service. KPMs often align with business objectives or
operational goals and can help teams identify areas that need attention. Some examples of key
performance metrics for distributed systems and applications might include:
1. Response Time / Latency: The time taken for a request to travel from the client to the
server and back. This is crucial in determining the user experience.
2. Throughput: The number of requests or transactions processed in a given time period.
This measures how much work the system is handling.
3. Error Rate: The percentage of requests or operations that fail, helping teams understand
the reliability of a system.
4. Availability: The percentage of time the system is operational and can serve requests. It's
crucial for ensuring the system’s uptime and reliability.
5. Resource Utilization: Metrics such as CPU, memory, and disk usage that track how
efficiently the system is utilizing resources.
6. Saturation: How close the system is to reaching its maximum capacity, indicating
potential for overload.

In the context of distributed tracing, key performance metrics help teams understand how their
system is performing in real time, from latency and response times to error rates and throughput.
This information allows teams to optimize performance, resolve issues, and prevent system
degradation before it impacts users.
This diagram illustrates the flow of data and interactions between various components in a typical web application architecture.
Content Delivery Network: CDNs were originally developed to speed up the delivery of static HTML content for users all around the world. At a fundamental level, a CDN brings content closer to the user. This improves the performance of a web service as perceived by the user.

To bring a service closer to the users, a CDN deploys servers at hundreds of locations all around the world. These server locations are called points of presence, or POPs. A server inside a POP is now commonly called an edge server. Having many POPs all over the world ensures that every user can reach a fast edge server close to them. Different CDNs use different technologies to direct a user's request to the closest POP.

Two common ones are DNS-based routing and Anycast.

With DNS-based routing, each POP has its own IP address. When a
user looks up the IP address for the CDN, DNS returns the IP address
of the POP closest to them. With Anycast, all POPs share the same IP
address. When a request comes into the Anycast network for that IP
address, the network sends the request to the POP that is closest to
the requester.

Each edge server acts as a reverse proxy with a huge content cache.

Static contents are cached at the edge server in the content cache. If a piece of content is in the cache, it can be quickly returned to the user. The edge server only asks for a copy of the static content from the origin server if it is not in the cache. This greatly reduces the load and bandwidth requirements of the origin server cluster.

A modern CDN could also transform static content into more optimized formats. For example, it could minify JavaScript bundles on the fly or transform an image file from an old format to a modern one like WebP or AVIF.

The edge server also serves a very important role in the modern HTTP stack. All TLS connections terminate at the edge server. TLS handshakes are expensive; a commonly used TLS version like TLS 1.2 takes several network round trips to establish. By terminating the TLS connection at the edge, a CDN significantly reduces the latency for the user to establish an encrypted TCP connection. This is one reason why many modern applications send even dynamic, uncacheable HTTP content through the CDN.
Besides performance, a modern CDN brings two other major
benefits. First is security. All modern CDNs have huge network
capacity at the edge. This is the key to providing effective DDoS
protection against large-scale attacks by having a network with
capacity much larger than the attackers.

This is especially helpful when the CDN is built using Anycast.

A modern CDN improves availability. A CDN, by its very nature, is highly distributed. By having copies of content available in many POPs, a CDN can withstand many more hardware failures than the origin servers. A modern CDN provides many benefits: if we are serving HTTP traffic, we should be using a CDN.

Forward Proxy vs Reverse proxy:

Forward Proxy: A forward proxy is a server that sits between a group of client machines and the internet. When those clients make requests to websites on the internet, the forward proxy acts as a middleman: it intercepts those requests and talks to the web servers on behalf of those client machines.

A forward proxy protects the client's online identity. By using a forward proxy to connect to a website, the IP address of the client is hidden from the server; only the IP address of the proxy is visible, making it harder to trace activity back to the client.

A forward proxy can also be used to bypass browsing restrictions. Some institutions like governments, schools, and big businesses use firewalls to restrict access to the internet. By connecting to a forward proxy outside the firewalls, the client machine can potentially get around these restrictions. This does not always work, because the firewalls themselves could block the connections to the proxy.

A forward proxy can also be used to block access to certain content. It is not uncommon for schools and businesses to configure their networks to connect all clients to the web through a proxy and apply filtering rules to disallow sites like social networks.
Configuration: A forward proxy requires the client (e.g., browser or application) to explicitly configure its
settings to point to the proxy server.

For large institutions, configuring every client machine is impractical, so they usually apply a technique called transparent proxying to streamline the process. A transparent proxy does not require the client to be aware of its presence or to configure any settings. Instead, it intercepts the client's traffic and redirects it to the proxy automatically, usually with the help of a Layer 4 switch or a router. It is difficult to bypass a transparent proxy when the client is on the institution's network. Transparent proxies are particularly useful in environments like schools, large corporations, or ISPs, where managing configurations for numerous devices would be impractical.

Once configured, all web requests from the browser are routed through the proxy.

Why it Matters: Without this configuration, the forward proxy won’t intercept or process the
client’s requests.
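
As a concrete illustration, a Java application can be pointed at a forward proxy through the standard JVM proxy system properties. This is a minimal sketch; the proxy host and port values are placeholders, not a real proxy:

// Point the JVM's HTTP stack at a forward proxy.
public class ProxyConfigDemo {
    public static void main(String[] args) throws Exception {
        System.setProperty("http.proxyHost", "proxy.example.com");   // hypothetical proxy
        System.setProperty("http.proxyPort", "8080");
        System.setProperty("https.proxyHost", "proxy.example.com");
        System.setProperty("https.proxyPort", "8080");
        // Hosts listed here bypass the proxy entirely.
        System.setProperty("http.nonProxyHosts", "localhost|*.internal.example.com");

        // Requests made through the standard URL stack now go via the proxy.
        java.net.URL url = new java.net.URL("http://example.com/");
        try (java.io.InputStream in = url.openStream()) {
            System.out.println("Fetched " + in.readAllBytes().length + " bytes via the proxy");
        }
    }
}
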
Reverse Proxy: A reverse proxy sits between the internet and the web servers. It intercepts the requests from the clients and talks to the web servers on behalf of the clients.

A reverse proxy can be used to protect a website. The website's IP addresses are hidden behind the reverse proxy and are not revealed to the clients, which makes it much harder to target a DDoS attack against the website.

A reverse proxy is also used for load balancing.

A popular website handling millions of users every day is unlikely to be able to handle the traffic
with a single server. A reverse proxy can balance a large amount of incoming requests by
distributing the traffic to a large pool of web servers and effectively preventing any single one of
them from becoming overloaded.

Services like Cloudflare put reverse proxy servers in hundreds of locations all around the world, typically in content delivery network (CDN) nodes. This puts the reverse proxy close to the users and, at the same time, provides a large amount of processing capacity, improving both performance and reliability.

Reverse proxy caches static content. A piece of content could be cached on the reverse proxy for
a period of time. If the same piece of content is requested again from the reverse proxy, the
locally cached version could be quickly returned.

Here is how Cloudflare's reverse proxy works:

1. User Request:

 A user requests a website or service (e.g., example.com).
 The request is routed to the nearest Cloudflare reverse proxy server instead of the website's origin server.

2. Processing at the Reverse Proxy:

 Caching: The proxy checks if the requested content (e.g., HTML, images, scripts) is already cached at the node. If cached, it serves the content directly to the user, reducing latency and load on the origin server.
 Security Checks: The proxy analyzes the request to detect and block malicious traffic, such as DDoS attacks or bots.
 Compression and Optimization: The proxy optimizes resources (e.g., compressing images) before sending them to the user.

3. Forwarding to the Origin Server (if needed):

 If the requested content isn't cached or requires dynamic processing (e.g., login pages), the proxy forwards the request to the origin server.
 Once the origin server responds, the proxy processes the response and sends it back to the user.

4. Global Distribution:

 Cloudflare maintains servers in hundreds of global locations, meaning requests are served from the closest proxy, reducing latency and improving speed.
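
A highly simplified sketch of that flow in Java is shown below. All class and method names are hypothetical stand-ins; a real edge proxy is of course far more sophisticated:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EdgeProxy {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    String handle(String url, String clientIp) {
        if (looksMalicious(clientIp)) {            // security check (bot/DDoS rules)
            return "403 Forbidden";
        }
        String cached = cache.get(url);            // 1. try the edge cache
        if (cached != null) {
            return cached;                         // cache hit: serve locally
        }
        String response = fetchFromOrigin(url);    // 2. cache miss: go to the origin
        cache.put(url, response);                  // 3. cache it for future requests
        return response;
    }

    private boolean looksMalicious(String clientIp) { return false; }          // placeholder
    private String fetchFromOrigin(String url) { return "<html>...</html>"; }  // placeholder
}
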

Benefits of Cloudflare's Reverse Proxy Architecture:

SSL encryption: A reverse proxy can handle SSL encryption. SSL handshake is computationally
expensive. A reverse proxy can free up the origin servers from those expensive operations.
Instead of handling SSL for all clients, a website only needs to handle SSL handshake from a
small number of reverse proxies.
For a modern website, it is not uncommon to have many layers of reverse proxies. The first layer
could be an edge service like Cloudflare. The reverse proxies are deployed to hundreds of
locations worldwide close to the users. The second layer could be an API gateway or load
balancer at the hosting provider. Many Cloud providers combine these two layers into a single
ingress service. The user would enter the Cloud network at the edge close to the user. From the
edge, the reverse proxy connects over a fast fiber network to the load balancer, where the request
is evenly distributed over a cluster of web servers.

An ingress service is a component in modern cloud and containerized environments (e.g., Kubernetes)
that manages and directs incoming traffic from external users or clients to services within the network.
It acts as a gateway or entry point to handle and route requests efficiently. The ingress service typically
performs tasks like routing, load balancing, SSL termination, and enforcing security policies.

DNS: DNS stands for Domain Name System, and it resolves names to numbers; to be more specific, it resolves domain names to IP addresses. If you type a web address into your web browser, DNS will resolve the name to a number, because the only thing computers know are numbers. For example, if you wanted to go to a certain website, you would open up your web browser and type in the domain name of that website.

Let us use yahoo.com. Technically, you do not have to type in yahoo.com to retrieve the Yahoo web page; you could type in the IP address instead if you already knew it. But since we are not accustomed to memorizing and dealing with numbers, especially when there are millions of websites on the internet, we type in the domain name and let DNS convert it to an IP address for us. So, when you type yahoo.com into your web browser, the DNS server searches through its database to find a matching IP address for that domain name. When it finds it, it resolves the domain name to the IP address of the Yahoo website, and once that is done, your computer can communicate with the Yahoo web server and retrieve the web page.

So when you type in yahoo.com in your web browser, and if your web browser or operating system can't
find the IP address in its own cache memory, it will send the query to the next level to what is called the
resolver server. The resolver server is basically your ISP or internet service provider. So, when the
resolver receives the query, it will check its own cache memory to find an IP address for yahoo.com and
if it can't find it, it will send the query to the next level, which is the root server.

The root servers are the top or the root of a DNS hierarchy. There are 13 sets of these root servers. And
they are strategically placed around the world, and they are operated by 12 different organizations.

And each set of these root servers has their own unique IP address. So, when a root server receives a
query for the IP address for yahoo.com, the root server is not going to know what the IP address is. But
the root server does know where to send the resolver to help it find the IP address. So, the root server
will direct the resolver to the TLD or top-level domain server for the .com domain. So, the resolver will
now ask the TLD server for the IP address for yahoo.com. The top-level domain server stores the address
information for top level domains such as .com, .net, .org and so on.

The resolver is directed to the TLD server that manages the .com domain, which yahoo.com is a part of. When the TLD server receives the query for the IP address for yahoo.com, it will not know the IP address either, so it directs the resolver to the next and final level, which are the authoritative name servers.

The authoritative name server or servers are responsible for knowing everything about the domain,
which includes the IP address. They are the final authority. So, when the authoritative name server
receives the query from the resolver, the name server will respond with the IP address for yahoo.com.
And finally, the resolver will tell your computer the IP address for yahoo.com and then your computer
can now retrieve the Yahoo web page. It is important to note that once the resolver receives the IP
address, it will store it in its cache memory in case it receives another query for yahoo.com so it does
not have to go through all those steps again.
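
In application code, this entire chain hides behind a single library call. The sketch below uses Java's InetAddress to resolve a name; the resolver, root, TLD, and authoritative servers all participate transparently:

import java.net.InetAddress;

public class DnsLookupDemo {
    public static void main(String[] args) throws Exception {
        // Triggers the full resolution chain (or returns a cached answer).
        for (InetAddress addr : InetAddress.getAllByName("yahoo.com")) {
            System.out.println(addr.getHostAddress());
        }
        // Repeat lookups are typically answered from a cache (the JVM, the OS,
        // or the resolver) without walking the hierarchy again.
    }
}
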

Caching:
Caching is a common technique in modern computing to enhance system performance and reduce
response time. From the front end to the back end, caching plays a crucial role in improving the
efficiency of various applications and systems. A typical system architecture involves several layers of
caching. At each layer, there are multiple strategies and mechanisms for caching data, depending on the
requirements and the constraints of the specific application.

Caching within a computer: The most common hardware caches are L1, L2, and L3 caches. L1 cache is
the smallest and fastest cache, typically integrated into the CPU itself. It stores frequently accessed data
and instructions, allowing the CPU to quickly access them without having to fetch them from slower
memory. L2 cache is larger but slower than L1 cache and is typically located on the CPU die. L3 cache is
even larger and slower than L2 cache and is often shared between multiple CPU cores.
Another common hardware cache is the Translation Look-Aside Buffer or TLB. It stores recently used
virtual to physical address translations. It is used by the CPU to quickly translate virtual memory
addresses to physical memory addresses, reducing the time needed to access data from memory.

At the operating system level, there are page cache and other file system caches. Page cache is managed
by the operating system and resides in main memory. It is used to store recently used disk blocks in
memory. When a program requests data from the disk, the operating system can quickly retrieve the
data from memory instead of reading it from disk.

Caching in a typical application system architecture.

On the application frontend, web browsers can cache HTTP responses to enable faster retrieval of data. When we request data over HTTP for the first time, it is returned with an expiration policy in the HTTP header; when we request the same data again, the browser returns the data from its cache if it is still available.
Content delivery networks, or CDNs, are widely used to improve the delivery of static content, such as images, videos, and other web assets. One of the ways that CDNs speed up content delivery is through
caching. When a user requests contents from a CDN, the CDN network looks for the requested content
in its cache. If the content is not already in the cache, the CDN fetches it from the origin server and
caches it on its edge servers. When another user requests the same content, the CDN can deliver the
content directly from its cache, eliminating the need to fetch it from the origin server again.

Some load balancers can cache resources to reduce the load on backend servers. When a user requests
content from a server behind a load balancer, the load balancer can cache the response and serve it
directly to future users, who request the same content. This can improve response times and reduce the
load on backend servers.

Caching does not always have to be in memory. In messaging infrastructure, message brokers such as
Kafka can cache a massive number of messages on disk. This allows consumers to retrieve the messages
at their own pace. The messages can be cached for a long period of time based on the retention policy.
Unlike traditional in-memory caching (e.g., in Redis or Memcached), message brokers like Kafka
implement disk-based caching, which enables large-scale, durable, and efficient message storage.

How does Kafka's disk-based caching work?

Distributed caches, such as Redis, can store key-value pairs in memory, providing high read-write performance compared to traditional databases.

Full-text search engines, such as Elasticsearch, are specialized tools designed to index, search, and analyze large volumes of textual data. They provide quick and efficient access to specific data, for document search and log search, even in datasets containing millions of documents or logs. Elasticsearch is a distributed, open-source search and analytics engine, commonly used for full-text searches, log analysis, and real-time analytics.

Key components in Elasticsearch:

How Does Elasticsearch Use the Inverted Index?
Even within the database, there are multiple levels of caching available. Data is typically written to a
write-ahead log before being indexed in a B-tree. The buffer pool is a memory area used to cache query
results, while materialized views can pre-compute query results for faster performance. The transaction
log records all transactions and updates to the database, while the replication log tracks the replication
state in a database cluster.
Caching is like a memory layer that stores copies of frequently accessed data. It is a strategy to speed things up by keeping data readily available, reducing the need to fetch it from slower databases every time it is requested. Think about a database with user profiles: a cache for this database might store the most popular user profiles, so that when someone views a profile, it loads instantly instead of hitting the database on every view. Even with these performance gains, however, caching also introduces new challenges.

Problems with Caching:

Cache Stampede:

Imagine a web server using Redis to cache pages for a set duration. With caching, the system stays
responsive under high load since resource-heavy pages are served from cache.

However, under extreme traffic, if a cached page expires, multiple threads across the web cluster may try refreshing the expired page at the same time. This flood of requests can overwhelm the database, potentially causing system failure and preventing the page from being re-cached.
How can we prevent a stampede? There are a few key strategies. One is locking: upon a cache miss, each request attempts to acquire a lock for that cache key before recomputing the expired page. If the lock is not acquired, there are some options.

One, the request can wait until the value is recomputed by another thread.

Two, the request can immediately return a not found response and let the client handle the situation
with a backup retry.
Three, the system can maintain a stale version of the cache item to be used temporarily while the new
value is recomputed.

Locking requires an additional write operation for the lock itself, and implementing lock acquisition
correctly can be challenging.
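
For a single application node, the locking idea can be sketched with ConcurrentHashMap.computeIfAbsent, which guarantees that only one thread recomputes a missing key while concurrent callers wait for the result. Across a web cluster, the same idea is usually implemented with a distributed lock (for example a Redis SET ... NX key). A minimal sketch:

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class StampedeSafeCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // the expensive recomputation (DB query, page render...)

    StampedeSafeCache(Function<K, V> loader) { this.loader = loader; }

    V get(K key) {
        // On a miss, exactly one caller runs the loader; other callers
        // asking for the same key block until the value is available.
        return cache.computeIfAbsent(key, loader);
    }
}
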
Another solution is offloading recomputation to an external process. This method can be activated in
various ways, either proactively when a cache key is nearing expiration, or reactively when a cache miss
occurs. This approach adds another moving part to the architecture that requires careful ongoing
maintenance and monitoring.

A third approach is probabilistic early expiration. In this strategy, each request has a small chance of proactively triggering a recomputation of the cache value before its actual expiration. The likelihood of this happening increases as the expiration time approaches. This staggered early refreshing mitigates the impact of a stampede, since fewer keys expire at the same moment.
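
A sketch of that check, loosely based on the published "XFetch" technique (treat the exact formula as an assumption): delta is how long the last recomputation took, and beta tunes how aggressively we refresh early.

class EarlyExpiry {
    static boolean shouldRecompute(long nowMillis, long expiryMillis,
                                   long deltaMillis, double beta) {
        // -ln(U) for U in (0,1] is a random positive factor; the closer we
        // are to expiry, the more likely this check pushes us past it.
        double earlyBy = deltaMillis * beta * -Math.log(1.0 - Math.random());
        return nowMillis + (long) earlyBy >= expiryMillis;
    }
}

Each request calls shouldRecompute before serving a cached value; a true result triggers a refresh ahead of the real expiration.
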
Cache penetration: This happens when a request is made for data that doesn't exist in the database or cache. This results in unnecessary load as the system tries to retrieve non-existent data, and it can destabilize the entire system if the request volume is high. To mitigate cache penetration, implement a placeholder value for non-existent keys. This way, follow-up requests for the same missing data hit the placeholders in cache instead of pointlessly hitting the database again.

Setting appropriate TTL for these placeholders prevents them from occupying cache space indefinitely.
However, this approach requires careful tuning to avoid significant cache resource consumption,
especially for systems with many lookups of non-existent keys.

Another approach uses Bloom filters, a space-efficient probabilistic data structure for quickly checking if
elements are in a set before querying the databases. When new records are added to storage, their keys
are recorded in Bloom Filter. Before fetching records, the application checks the Bloom Filter first. If the
key is absent, the record conclusively doesn't exist, allowing the application to return a null value
immediately.
However, a positive result does not guarantee existence: Bloom filters produce occasional false positives, so a small percentage of reads for keys that do not exist will still fall through to the database.
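
A minimal sketch of the Bloom filter check, assuming Guava is on the classpath; the capacity and false-positive rate below are illustrative:

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomFilterDemo {
    public static void main(String[] args) {
        BloomFilter<String> knownKeys = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8),
                1_000_000,   // expected number of keys
                0.01);       // ~1% false-positive rate

        knownKeys.put("user:42");   // record keys as they are written to storage

        System.out.println(knownKeys.mightContain("user:42"));   // true
        // false means "conclusively absent": return null without touching the DB
        System.out.println(knownKeys.mightContain("user:999"));  // almost certainly false
    }
}
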

Cache Crash:

Our entire cache system suddenly fails. What happens next? With no cache layer, every single request
now slams straight into the database. This sudden spike in traffic can easily overwhelm databases,
jeopardizing overall system stability. What's worse? Users start obsessively hitting refresh, compounding
the problem. A close cousin to the cache crash is cache avalanche. This can happen in two scenarios.

One, when a massive chunk of cache data expires all at once.

Two, when the cache restarts and is cold and empty.


In both cases, a crushing wave of requests hits the databases all at once. This sudden load spike overwhelms the system, much like hundreds of people abruptly cramming through a single tiny door after a fire alarm.

So how do we tackle these challenges? First option, implement a circuit breaker, which temporarily
blocks incoming requests when the system is clearly overloaded. This prevents total meltdown and buys
time for recovery.

Next strategy, deploy highly available cache cluster with redundancy. If parts of the cache go down,
other parts remain operational. The goal is to reduce the severity of full crashes. And don't dismiss
cache pre-warming, particularly critical after a cold start. Here, essential data is proactively populated in
the cold cache before it's put into service. This avoids abruptly bombarding the databases.
A cache stampede happens when many requests simultaneously hit the same expired cache entry, overwhelming the database as it tries to refresh just that single data point. A cache avalanche is a broader issue, where numerous requests for different data flood the system after a cache is cleared or restarted, putting a strain on resources.
Caching is a powerful technique in system design that, when implemented correctly, can drastically improve the performance, scalability, and cost-efficiency of a system. However, it comes with its own set of challenges, particularly around consistency and invalidation. By understanding the different types of caches, cache placement strategies, and best practices, you can design a robust caching strategy that meets the needs of your application.

Caching Strategies: There are several caching strategies, depending on what a system needs: whether the focus is on optimizing for read-heavy workloads, write-heavy operations, or ensuring data consistency.

1. Read Through Strategy:

2. Cache Aside

Cache Aside, also known as "Lazy Loading", is a strategy where the application code handles
the interaction between the cache and the database. The data is loaded into the cache only
when needed.

The application first checks the cache for data. If the data exists in cache (cache hit), it’s
returned to the application. If the data isn't found in cache (cache miss), the application
retrieves it from the database (or the primary data store), then loads it into the cache for
subsequent requests.
Cache Aside is perfect for systems where the read-to-write ratio is high, and
data updates are infrequent. For example, in an e-commerce website, product
data (like prices, descriptions, or stock status) is often read much more
frequently than it's updated.
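
A minimal cache-aside sketch; CacheClient and Database are illustrative stand-ins for, say, a Redis client and a primary data store:

interface CacheClient { String get(String key); void set(String key, String value, int ttlSeconds); }
interface Database { String query(String key); }

class CacheAsideReader {
    private final CacheClient cache;
    private final Database db;

    CacheAsideReader(CacheClient cache, Database db) { this.cache = cache; this.db = db; }

    String read(String key) {
        String value = cache.get(key);     // 1. check the cache first
        if (value != null) return value;   // cache hit
        value = db.query(key);             // 2. cache miss: read the database
        if (value != null) {
            cache.set(key, value, 300);    // 3. populate the cache for later reads (TTL assumed)
        }
        return value;
    }
}
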
3. Write Through

In the Write Through strategy, every write goes to the cache and the database together, keeping the two in sync. Write Through is ideal for consistency-critical systems, such as financial applications or online transaction processing systems, where the cache and database must always have the latest data.

4. Write Around
Write Around is a caching strategy where data is written directly to the
database, bypassing the cache.

The cache is only updated when the data is requested later during a read
operation, at which point the Cache Aside strategy is used to load the data into
the cache.
Distributed Caching:

What is Distributed Caching?

Distributed caching is a technique where cache data is stored across multiple nodes (servers) instead of being confined to a single machine. This allows the cache to scale horizontally and accommodate the needs of large-scale applications.

Caching is used to temporarily store copies of frequently accessed data in high-speed storage layers (such as RAM) to reduce latency and load on the server or database.

When your dataset size is small, it’s usually enough to keep all the cache data
on one server.

But as the system gets bigger, the cache size also gets bigger and a single-node
cache often falls short when scaling to handle millions of users and massive
datasets.

In such scenarios, we need to distribute the cache data across multiple servers.

This is where distributed caching comes into play.

Here, we explore distributed caching in detail, including what it is, why it's important, how it works, its components, challenges, best practices, and popular distributed caching solutions.

Why Use Distributed Caching?

1. Scalability

Distributed caching allows applications to scale horizontally by adding more cache nodes. This helps manage more traffic without a significant drop in performance.

2. Fault Tolerance

Since data is spread across multiple nodes, the failure of a single node doesn’t
result in the loss of the entire cache.

The remaining nodes can continue to serve requests, allowing the system to recover gracefully.

3. Load Balancing

By distributing the cache across several nodes, the load is spread evenly. This helps prevent any single node from becoming a bottleneck.

Components of Distributed Caching

A distributed cache system typically consists of the following components:

1. Cache Nodes: These are the individual servers where the cache data is
stored. Each node is a part of the overall cache cluster.
2. Client Library/Cache Client: Applications use a client library to talk to
the distributed cache. This library handles the logic of connecting to cache
nodes, distributing data, and retrieving cached data.
3. Consistent Hashing: This method spreads data evenly across cache nodes. It ensures that adding or removing nodes has minimal impact on the system (a short code sketch follows this list).
4. Replication: To make the system more reliable, some distributed
caches replicate data across multiple nodes. If one node goes down, the
data is still available on another.
5. Sharding: Data is split into shards, and each shard is stored on a different
cache node. It helps distribute the data evenly and allows the cache to
scale horizontally.
6. Eviction Policies: Caches implement eviction policies like LRU (Least
Recently Used), LFU (Least Frequently Used), or TTL (Time to Live) to
get rid of old or less-used data and make space for new data.
7. Coordination and Synchronization: Coordination mechanisms
like distributed locks or consensus protocols ensure that cache nodes
remain synchronized, especially when multiple nodes try to change the
same data at the same time.
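
Here is the promised sketch of consistent hashing: a sorted map acts as the hash ring, and virtual nodes smooth out the key distribution. The hash function choice is illustrative only.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int vnodes;

    HashRing(int vnodes) { this.vnodes = vnodes; }

    void addNode(String node) {      // adding a node only remaps neighboring keys
        for (int i = 0; i < vnodes; i++) ring.put(hash(node + "#" + i), node);
    }

    void removeNode(String node) {
        for (int i = 0; i < vnodes; i++) ring.remove(hash(node + "#" + i));
    }

    String nodeFor(String key) {     // walk clockwise to the first node at or after the key
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);  // fold first 8 digest bytes
            return h;
        } catch (Exception e) { throw new RuntimeException(e); }
    }
}
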

Dedicated Cache Servers vs. Co-located Cache
When designing a caching strategy for a distributed system, one of the critical
decisions you need to make is where to host the cache.

The two primary options are using dedicated cache servers or co-locating the
cache with application servers.

1. Dedicated Cache Servers

Dedicated cache servers are standalone machines or virtual instances used only
for caching.

They are separate from the application servers and are optimized for caching.

Advantages:

 Scalability: They can be scaled independently of the application servers. You can add more cache nodes without impacting the application layer.
 Resource Isolation: Keeping the cache on separate servers means it won't slow down your main servers.

Disadvantages:

 Cost: Running dedicated cache servers can be pricey, especially if you need many such nodes.
 Network Latency: Since they are separate from the application servers, it might take longer to fetch data from the cache.
2. Co-located Cache

Co-located cache means running the cache and the application on the same
server.

In this setup, the application and the cache share the same hardware resources,
such as CPU, memory, and network interfaces.
If you need your system to handle a lot of users, keep resources separate, and
have the budget for it, using dedicated cache servers is likely the better option.

But if your app is smaller, you want to save money, or need it to be super-fast,
putting the cache and app on the same server can work well.
Cache Invalidation is the process of removing or updating stale data from a cache to ensure it
reflects the most up-to-date version of the data stored in the underlying system (e.g., a database).
Since caching stores copies of data for faster access, it can become out-of-date when the original
data changes. Cache invalidation ensures consistency between the cache and the underlying data
source.
However, distributed caching also introduces complexities, such as data consistency, cache invalidation, and network partitioning, which must be carefully managed.
Load Balancer:

Load balancing is the process of distributing incoming network traffic across multiple servers to ensure that no single server is overwhelmed. By evenly spreading the workload, load balancing aims to prevent overload on a single server, enhance performance by reducing response times, and improve availability by rerouting traffic in case of server failures.
There are several algorithms to achieve load balancing, each with its pros and cons.

Round Robin drawbacks:
 Does not consider server load or response time.
 Can lead to inefficiencies if servers have different processing capabilities.

Weighted Round Robin drawbacks:
 Slightly more complex to implement than simple Round Robin.
 Does not consider current server load or response time.

IP Hash computes an MD5 hash of the client's IP address and uses the modulo operator to determine the index of the server to which the request should be routed. This ensures that requests from the same IP address are always directed to the same server.
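
A small sketch of that IP-hash routing in Java; the hashing details beyond "MD5 plus modulo" are illustrative:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;

class IpHashBalancer {
    private final List<String> servers;

    IpHashBalancer(List<String> servers) { this.servers = servers; }

    String serverFor(String clientIp) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(clientIp.getBytes(StandardCharsets.UTF_8));
        // Use the first four digest bytes as an int, then take it modulo
        // the server count: the same IP always maps to the same server.
        int hash = ((digest[0] & 0xFF) << 24) | ((digest[1] & 0xFF) << 16)
                 | ((digest[2] & 0xFF) << 8)  |  (digest[3] & 0xFF);
        return servers.get(Math.floorMod(hash, servers.size()));
    }
}

Note that this simple modulo scheme reshuffles most clients whenever a server is added or removed; consistent hashing (sketched earlier) avoids that.
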
In system design, in-memory refers to the practice of storing data in the main memory (RAM) of a
computer instead of persisting it on disk storage (such as SSDs or HDDs). This approach is used when
systems require fast access to data, as accessing data from memory is significantly faster than reading it
from disk.

https://www.youtube.com/watch?v=4jyAzM5Ejxo

Databases:

Relational Databases: Sharding is difficult, provides vertical scaling

A schema defines the structure and organization of data in a relational database. It specifies how
tables, columns, data types, relationships, constraints, and other database elements are arranged.
A schema provides the blueprint for organizing data, ensuring efficiency, integrity, and scalability in relational databases.
Components of a Schema
e.g. MySQL, PostgreSQL, Oracle DB, Microsoft SQL Server
Every time a customer does something, we store it in the event table and use a foreign key to connect the event table with the customer table. You would use a relational database for structured data and ACID compliance.

If you have a lot of data, like billions of records, traditional databases can be slow. To improve query performance with big data, columnar databases have been developed. Imagine a spreadsheet populated with some customer data.

First, let us see how a traditional (row-oriented) database works. It must read the data from left to right, starting from the beginning of each row, and proceed from row to row, much like reading a book. So if we ask a traditional database to return all paying customers, it has to scan every row in full.

This is where a columnar database comes in handy. It reads the data from top to bottom, and only the columns that are asked for. Asking the columnar database the same question, to return all paying customers, it scans just the relevant column.
Columnar DBs: a fixed schema is present, but there is no support for ACID transactions. Used for heavy reads.

e.g.: streaming data, event data

Apache Cassandra and HBase are examples of columnar DBs.

both relational and columnar databases require you to define a schema before you can store anything. If
your requirements change over time, you may need to add additional columns or perhaps change the
data type of one of them.
NoSQL databases, by contrast, support a schema-less or dynamic schema model, which means you don't need to define all columns in advance, making them more adaptable to changing data needs.

These databases follow an eventual consistency model, which ensures data synchronization across all nodes over time. However, this means they may not guarantee immediate consistency.

What if you have an e-commerce application with thousands of products? Different products usually
have a different number of attributes. Managing thousands of attributes in relational databases is
inefficient and can slow down your database. This is an ideal use case for the Document database. You
can store a product with all its attributes in a single document. It simplifies management and improves
the reading speed. And if you change the attribute of one product, it will not affect others.

Document database is a type of database used to store and query data in JSON-like documents. JSON is
a data format that is both human and machine readable. Developers can use JSON documents in their
code and save them directly into the document database.
Flexible, semi-structured, and hierarchical nature of documents and document databases allows them to
evolve with the needs of applications.

Document Based: used when there is

No fixed schema  due to the lack of a fixed schema there can be NULL or empty values, and we have to handle that in the application code

Heavy reads and writes

No ACID transactions  these have to be handled in the application code; they are not provided by the DB

Highly scalable, provides sharding; special query operations / aggregation queries are provided by document-based DBs.
JSON:

JSON represents data in three ways. The first is key-value pairs, recorded within curly braces; the key is a string, and the value can be any data type, such as integer, decimal, or Boolean. For example, a simple key-value pair is "year": 2024. Next is the array: an ordered collection of values defined within left and right brackets, with items comma-separated, for example "fruit": ["apple", "grapes"]. And finally, objects: an object is a collection of key-value pairs, and JSON documents allow developers to embed objects and create nested pairs, for example "address": {"country": "USA", "state": "Texas"}.

Consider a JSON-like document that describes a film dataset. Such a document holds simple values, arrays, and objects quite flexibly; you can even have an array with JSON objects within it. Document-oriented databases allow you to create an unlimited hierarchy of embedded JSON objects. It is entirely up to you what schema you want to give to your document store --> flexible schema.
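
As a small illustration (not the original film example), here is a document combining all three forms; the field values are made up:

{
  "year": 2024,
  "fruit": ["apple", "grapes"],
  "address": {
    "country": "USA",
    "state": "Texas"
  }
}
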

You can create, read, update, and delete entire documents stored in the database. Document databases
provide a query language or API that allows developers to run these operations.
Each document has a unique identifier that serves as a key. Then you can use API or query language to
read document data. You can run queries using field values or keys. You can also add indexes to the
database to increase read performance. And finally, you can update existing documents flexibly. You can
either rewrite the entire document or update individual values.
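
A sketch of these operations, assuming the MongoDB Java driver and a local server; database, collection, and field names are illustrative:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

public class DocumentCrudDemo {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> products =
                    client.getDatabase("shop").getCollection("products");

            // Create: each product carries its own set of attributes
            products.insertOne(new Document("_id", "p1")
                    .append("name", "Laptop").append("price", 999));

            // Read: query by a field value (an index on "name" would speed this up)
            Document laptop = products.find(Filters.eq("name", "Laptop")).first();
            System.out.println(laptop);

            // Update: change an individual value instead of rewriting the document
            products.updateOne(Filters.eq("_id", "p1"), Updates.set("price", 899));

            // Delete the entire document
            products.deleteOne(Filters.eq("_id", "p1"));
        }
    }
}
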

Advantages of document databases:

JSON documents map to objects, a common data type in most programming languages. When building
applications, developers can flexibly create and update documents directly from the code. This means
they spend less time creating data models beforehand. As a result, application development becomes
more rapid and efficient.
Another advantage is the flexible schema. A document-oriented database allows you to create multiple documents with different fields within the same collection. This can be handy when storing unstructured data like emails or social media posts. However, some document databases offer schema validation, allowing you to impose some restrictions on the structure.

Another advantage is performance at scale. Document databases offer built-in distribution capabilities.
You can scale them horizontally across multiple servers without impacting performance. Which is also
cost-efficient. Also, document databases provide fault tolerance and high availability through built-in
replication.

e.g.: MongoDB

Search DBs: used for full-text queries.

A search DB is not the primary data store; the actual data is stored in a relational / non-relational database, and the search DB holds an indexed copy of it.

e.g. Elasticsearch

Images/videos are stored in Amazon S3


Graph Databases: Have you ever seen a detective board in the movies with pictures, news articles, and
notes connected by thumbtacks and yarn? Immediately, you can see the power of connecting the dots
in all of those relationships. Imagine taking that detective board and applying a mathematical engine
that could query its data relationships. Well, that's a graph database.

I want to explain graph databases by starting with relational databases. One of the main traits of the relational database is the constraining nature of its relationships.
Imagine you want to know the relationship between a group of 10 students originating from completely
different universities. Well, at first glance, you would think that since the students didn't go to the same
universities, they really don't have a connection. But if you look at their professors, we can discover that
they all shared a common professor when they were students. So now let us zoom in and describe some of the traits of a graph database. First, you have nodes, which are essentially records. Connected to those nodes is a type of relationship, which can have a direction and properties associated with it. In our case, the direction points from the original professor, the relationship type is 'a student of', and a property is the year and semester in which they were taught.

Now, querying this database isn't like your typical SQL query. Graph database vendors often have their
own query language, so this is something that the industry is still working out.

You need to be very careful with graph databases because they can infer connections that don't actually
mean anything. For example, imagine the inference you could make if all the students in our previous
example ended up dropping out of school. Does that mean that original professor had some kind of
meaningful impact on that bad outcome? Well, anything is possible, but we have to be a little bit more
skeptical of such conspiratorial patterns. So graph databases are usually a mechanism for starting
questions. but not necessarily answering them. So what we see is data science community using graph
databases to test inferences.
e.g.: ArangoDB
Vector Databases: Information comes in many forms. Some information is unstructured, like text documents, pictures, videos, and audio, while some is structured, like application logs, tables, and graphs. Vector databases store data as high-dimensional vectors; each vector has a certain number of dimensions, ranging from tens to thousands, depending on the complexity of the data.
We can apply transformations to the raw data and encode all types of data into vectors that capture the meaning and context of the asset. This allows us to find similar assets by searching for neighboring data points.

Vector search methods enable unique experiences, such as taking a photograph with your smartphone
and searching for similar images. You can also find documents that are similar to a given document
based on the topic and sentiment. And find products that are similar to a given product based on their
features and ratings.
e.g.: Milvus

Key-Value Databases: a key-value database stores data as a collection of key-value pairs, where a key serves as a unique identifier. Both keys and values can be anything, ranging from simple objects to complex compound objects.

The document database is a special type of key-value store where keys can only be strings. Also, when querying your document store, you can read the entire value or a part of the value, especially if the value is another JSON object.

Advantages of key-value databases: most key-value databases can scale horizontally and automatically distribute data across servers to reduce bottlenecks at a single server. Then there is ease of use: key-value databases follow the object-oriented paradigm, allowing developers to map real-world objects directly to software objects. And unlike relational databases, key-value databases do not have to perform resource-intensive table joins, which makes them much faster.

Key-value stores: caching solutions are implemented using key-value stores. They are quite fast and provide quick access; requests and responses can also be stored in key-value stores.

e.g. Redis, Memcached, DynamoDB (Amazon Aurora and Azure SQL, by contrast, are relational databases, not key-value stores)

Use Cases:

e.g., etcd; Kubernetes (k8s) uses etcd as its key-value store.

Time series DB:

One example of a general-purpose metrics database is InfluxDB. We also frequently use Prometheus to collect time-series data from infrastructure. Prometheus is also a time-series database, with additional features to query targets such as VMs and Kubernetes pods.
Overview of How Data Is Stored and Accessed in Databases

When working with databases, it's essential to understand how they organize, store, and retrieve
data both in RAM and on disk. Here's a breakdown to clarify your questions:
Database handles everything.

SQL vs NOSQL:

In the document model, data is stored as documents in formats such as JSON or XML. Each document contains a unique identifier (key) and a set of key-value pairs (attributes). Documents can have varying structures, making the document model schema-less or flexible.

Column-Family model:

In the column-family model, data is organized into rows and columns, but unlike the relational model, each row can have a variable number of columns. It is optimized for fast querying and large-scale distributed storage.

Graph model:

In this model, querying relationships (e.g., finding all orders placed by a user) is highly efficient, especially for applications with complex interconnected data.

Schema:

If you need to add a new column, modify a data type, or change relationships, it often requires schema migrations. This can lead to downtime or require careful planning in production systems to avoid disruptions.

SQL databases are typically designed to scale vertically (also known as scale-
up).
NoSQL databases are designed to scale horizontally (also known as scale-out).
Unlike SQL databases, NoSQL databases do not prioritize full ACID transactions, due to the need for high availability and scalability in distributed environments.

Instead, many NoSQL databases follow the BASE model:

 Basically Available: The system guarantees availability, meaning that data can always be
read or written, even if some nodes in the distributed system are unavailable.
 Soft state: The system may be in a temporarily inconsistent state, but eventual
consistency will be reached over time.
 Eventually consistent: Over time, the system will become consistent, though it may not
happen immediately. This trades immediate consistency for higher availability.
 The BASE model is designed for scenarios where strict consistency is not required, and performance and availability are more important, such as real-time data analytics, social media platforms, or large-scale distributed applications.
 While some NoSQL databases offer ACID-like features, they are generally less robust than those in SQL databases.
Database Indexes:
A database index is a super-efficient lookup table that allows a database to find
data much faster. It holds the indexed column values along with pointers to the
corresponding rows in the table.

Without an index, the database might have to scan every single row in a massive
table to find what you want – a painfully slow process. But, with an index, the
database can zero in on the exact location of the desired data using the index’s
pointers.
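
A toy illustration of the idea (real database indexes are typically B-trees, but the effect is the same): instead of scanning every row, we look the value up in a map that points at row positions.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyIndex {
    private final Map<String, List<Integer>> valueToRows = new HashMap<>();

    void add(String columnValue, int rowId) {
        valueToRows.computeIfAbsent(columnValue, v -> new ArrayList<>()).add(rowId);
    }

    List<Integer> lookup(String columnValue) {
        // O(1) average-case lookup instead of an O(n) full-table scan
        return valueToRows.getOrDefault(columnValue, List.of());
    }
}
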
Types of Data indexes:

Indexes based on Structure and Key Attributes:


Indexes based on Data Coverage:

Dense Index:
Specialized Index Types:
Filtered Index:
Covering Index:
Pending  data structures of indexes
Consistency patterns:

A distributed system provides benefits such as scalability and fault tolerance. However, maintaining consistency across the distributed system is non-trivial. Consistency is vital to achieving reliability, a deterministic system state, and an improved user experience.

The consistency patterns can be broadly categorized as follows:

 strong consistency
 eventual consistency
 weak consistency

The eventual consistency model is an optimal choice for distributed systems that favor high availability and performance over consistency. Strong consistency is the optimal model when the same data view must be visible across the distributed system without delay.

In the strong consistency pattern, read operations performed on any server must always retrieve the data that was included in the latest write operation. The strong consistency pattern typically replicates data synchronously across multiple servers. Put another way, when a write operation is executed on a server, subsequent read operations on every other server must return the latest written data.

The benefits of strong consistency are the following:

 simplified application logic
 increased data durability
 guaranteed consistent data view across the system

The limitations of strong consistency are as follows:

 reduced availability of the service
 latency

Message Queues:

From Gaurav Sen's video: notifier + load balancing + heartbeat + persistence  message queue.

A message queue takes tasks, persists them, and assigns them to a server; if a server takes too long to return an acknowledgement, the queue assumes the server is dead and reassigns the task to a new server, following one of many strategies.
HLD Interview:

1. Requirements:
1.1 Functional requirements:
1.2 Non-functional requirements:
Here is a checklist of things to consider that might help you identify the most important non-
functional requirements for your system. You will want to identify the top 3-5 that are most
relevant to your system.
2. Capacity estimation
DAU Daily active users

QPS  Queries per second

In system design, capacity estimation becomes necessary when key design decisions
depend on understanding the scale of data or traffic. This is not about doing math for math's
sake; it’s about figuring out whether the chosen architecture will scale effectively for the
problem at hand. A concrete example is designing a TopK system for trending topics on
Facebook posts.
3. Core entities
4. API or system interface

Do not overthink this. Bias toward creating a REST API. Use GraphQL only if
you really need clients to fetch only the requested data (no over- or under-
fetching). If you are going to use websockets, you will want to describe the
wire protocol.
For Twitter, we would choose REST and would have the following endpoints.
Notice how we can use our core entities as the objects that are exchanged via
the API.
5. Data Flow
6. High Level Design
Now that you have a clear understanding of the requirements, entities, and API of your system,
you can start to design the high-level architecture. This consists of drawing boxes and arrows to
represent the different components of your system and how they interact. Components are
basic building blocks like servers, databases, caches, etc.
For our simple Twitter example, here is how you might build up your design, one endpoint at a
time:
7. Deep Dives

Now that you have a high-level design in place, you're going to harden your design by (a)
ensuring it meets all of your non-functional requirements (b) addressing edge cases (c)
identifying and addressing issues and bottlenecks and (d) improving the design based on
probes from your interviewer.
Capacity estimation can include things like expected daily/monthly users, read/write requests per second, and data storage and network bandwidth needs.
What is Denormalization?

Denormalization is a database optimization technique where we reduce the level of normalization by intentionally duplicating data or precomputing derived values to improve read performance. It is often used in read-heavy systems like reporting dashboards or recommendation engines.
Typical Use Case:

 Processing user login or fetching personalized data for a logged-in user.


Cache Invalidation vs Cache Eviction: invalidation removes or updates cached entries because they have become stale relative to the source of truth, while eviction removes entries to free up space according to a policy such as LRU, LFU, or TTL.
Web Server and Application Server refer to different types of software that help process requests and deliver content to users over the internet.

Web server examples:

 Apache HTTP Server
 Nginx

Application server examples:

 Apache Tomcat (a servlet container and application server)
 JBoss/WildFly (Java-based application server)
 GlassFish
 WebLogic
Low Level Design

SOLID Principles:

Advantages:
Now, we can have separate implementations for audio and video players:
Why is DIP important?

Without DIP, high-level modules directly depend on low-level modules, creating tight coupling. This makes the code harder to modify and extend. For example, if you want to change the behavior of a low-level module, you might need to modify the high-level module as well, breaking the design's flexibility.
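
A minimal sketch of DIP in Java (class names are illustrative): the high-level NotificationService depends only on the MessageSender abstraction, so new senders can be added without touching it.

interface MessageSender {
    void send(String to, String body);
}

class EmailSender implements MessageSender {    // low-level module
    public void send(String to, String body) { System.out.println("Email to " + to); }
}

class SmsSender implements MessageSender {      // another low-level module
    public void send(String to, String body) { System.out.println("SMS to " + to); }
}

class NotificationService {                     // high-level module
    private final MessageSender sender;

    NotificationService(MessageSender sender) { this.sender = sender; }  // injected abstraction

    void notifyUser(String user, String message) {
        sender.send(user, message);   // no knowledge of the concrete implementation
    }
}
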
Design Patterns:

Creational  creational design patterns are responsible for creating objects / controlling the creation of objects.

Prototype pattern: used to make a copy/clone from an existing object, since creating a new object from scratch can be expensive. Cloning should be the responsibility of the class itself, not the client.
When to Use the Prototype Pattern

 When the cost of creating a new instance of an object is expensive or complicated.
 When you need to create objects by copying an existing instance.
Yes, a shallow copy means that the copied object itself is a new instance, but its fields (if they are references) still refer to the same objects as in the original object. This applies only to fields that are references (e.g., objects or arrays). Let's clarify this with an example.

Shallow Copy Example:
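
A minimal Java sketch of this behavior, using the Cloneable/clone() approach described above; class and field names are illustrative, and the expected output is noted in comments.

class Address {
    String city;
    Address(String city) { this.city = city; }
}

class Person implements Cloneable {
    String name;       // String is immutable, so sharing it is safe
    Address address;   // reference field: shared after a shallow copy

    Person(String name, Address address) { this.name = name; this.address = address; }

    @Override
    protected Person clone() throws CloneNotSupportedException {
        return (Person) super.clone();   // shallow copy
    }

    public static void main(String[] args) throws Exception {
        Person original = new Person("Asha", new Address("Hyderabad"));
        Person copy = original.clone();

        copy.address.city = "Chennai";               // mutate through the copy
        System.out.println(original == copy);        // false: the copy is a new instance
        System.out.println(original.address.city);   // "Chennai": the Address is shared!
    }
}
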
Limitations:

 Only creates a shallow copy by default. For deep copying, you must handle it manually.
 Relies on the class implementing the Cloneable interface, which is considered less flexible
than alternatives like the copy constructor or serialization.

Singleton Pattern: It is used when we have to create only one instance of the class.

Use cases: DB connections

4 ways to achieve this:

1. Eager --> the object is created at the time of class loading itself, hence the name eager.
2. Lazy --> we create the object only when it is first needed, inside the getInstance method. However, two threads arriving at the same time can create two objects; we can avoid this using the synchronized keyword.
3. Synchronized --> correct but expensive, because every call to getInstance acquires a lock, so it is rarely used on its own.
4. Double-checked locking --> used in the industry.
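
A sketch of the double-checked locking idiom; the volatile keyword matters, since it prevents other threads from observing a partially constructed instance:

class DatabaseConnectionPool {
    private static volatile DatabaseConnectionPool instance;

    private DatabaseConnectionPool() { }   // private constructor: no outside instantiation

    static DatabaseConnectionPool getInstance() {
        if (instance == null) {                          // first check: no lock on the fast path
            synchronized (DatabaseConnectionPool.class) {
                if (instance == null) {                  // second check: only one thread creates
                    instance = new DatabaseConnectionPool();
                }
            }
        }
        return instance;
    }
}
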
Factory pattern: The Factory Pattern ensures that the logic for creating
objects is encapsulated in a single place, typically a factory class, so that
the client code doesn’t need to be modified when new object types are
introduced. This achieves loose coupling and makes it easier to extend or
modify the application.
If we want to remove Square, we do not need to modify all the classes in which we created Square objects; only the factory changes.

Extended with Dependency Injection (no changes to factory logic):

To extend the Factory Pattern with Dependency Injection or a Dynamic Registry, we remove the need to modify the factory logic when new types are added. Instead of hardcoding the object creation logic in the factory, we use a registry that maps shape types to their respective class implementations. This registry can be populated dynamically, making the factory truly open for extension and closed for modification (OCP from SOLID principles). The factory method then creates an instance of the shape dynamically, for example via reflection or registered constructors; if the shapeType isn't registered, an exception is thrown.

Client code:
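
A minimal sketch of such a registry-based factory, including client usage. It registers constructor references (Supplier) rather than using raw reflection, which achieves the same open/closed behavior; all names are illustrative.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface Shape { void draw(); }

class Circle implements Shape {
    public void draw() { System.out.println("Drawing a circle"); }
}

class ShapeFactory {
    private static final Map<String, Supplier<Shape>> registry = new HashMap<>();

    static void register(String type, Supplier<Shape> supplier) {
        registry.put(type, supplier);      // new shapes register themselves
    }

    static Shape create(String type) {
        Supplier<Shape> supplier = registry.get(type);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown shape type: " + type);  // unregistered type
        }
        return supplier.get();
    }
}

class FactoryClient {
    public static void main(String[] args) {
        ShapeFactory.register("circle", Circle::new);   // registration, not factory edits
        ShapeFactory.create("circle").draw();
    }
}
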
Abstract Factory pattern: a factory of factories.

The Abstract Factory Pattern is a creational design pattern used to provide an interface for
creating families of related or dependent objects without specifying their concrete classes. It is
commonly used when:

1. You need to create objects that are part of a group or "family" and are designed to work
together.
2. You want to enforce consistency among the objects in the group.
3. You want to encapsulate the creation logic, making it easier to introduce new families of objects.

Use Cases

1. Cross-platform UI toolkits (e.g., WindowsButton, MacOSButton).


2. Multiple database support (e.g., MySQL, MongoDB connectors).
3. Themed designs (e.g., DarkTheme, LightTheme widgets).
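
A compact sketch of the cross-platform UI example from this list; names are illustrative:

interface Button { void render(); }
class WindowsButton implements Button { public void render() { System.out.println("Windows button"); } }
class MacOSButton   implements Button { public void render() { System.out.println("macOS button"); } }

interface GuiFactory { Button createButton(); }   // the abstract factory

class WindowsFactory implements GuiFactory { public Button createButton() { return new WindowsButton(); } }
class MacOSFactory   implements GuiFactory { public Button createButton() { return new MacOSButton(); } }

class GuiApp {
    public static void main(String[] args) {
        // Pick one family at startup; the rest of the code never names a concrete class.
        GuiFactory factory = System.getProperty("os.name").startsWith("Windows")
                ? new WindowsFactory() : new MacOSFactory();
        factory.createButton().render();
    }
}
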
Builder Pattern: When we want to create an object step by step.
Builder Pattern is a creational design pattern used to construct
complex objects step by step. It allows for more flexible object creation,
especially when the object needs to be created with various parts, and you
want to separate the construction process from the actual representation of
the object.
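
A minimal builder sketch: the object is assembled step by step and only constructed at the end. Field names are illustrative.

class HttpRequest {
    private final String url;
    private final String method;
    private final String body;

    private HttpRequest(Builder b) {
        this.url = b.url; this.method = b.method; this.body = b.body;
    }

    @Override
    public String toString() { return method + " " + url + " " + body; }

    static class Builder {
        private String url;
        private String method = "GET";   // sensible default
        private String body = "";

        Builder url(String url) { this.url = url; return this; }
        Builder method(String method) { this.method = method; return this; }
        Builder body(String body) { this.body = body; return this; }

        HttpRequest build() { return new HttpRequest(this); }   // construction happens last
    }
}

// Usage:
// HttpRequest req = new HttpRequest.Builder()
//         .url("https://example.com").method("POST").body("{}").build();
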
Structural Design Patterns:

Decorator Pattern: The Decorator pattern is a structural design pattern that allows you to dynamically add behavior to an object at runtime without altering its structure. It's often used to extend the functionalities of objects in a flexible and reusable way.
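
A small decorator sketch, where behavior is layered on by wrapping; names are illustrative:

interface DataSource { String read(); }

class FileDataSource implements DataSource {
    public String read() { return "raw data"; }
}

class CompressionDecorator implements DataSource {
    private final DataSource wrapped;
    CompressionDecorator(DataSource wrapped) { this.wrapped = wrapped; }
    public String read() { return "decompress(" + wrapped.read() + ")"; }   // extra behavior
}

class EncryptionDecorator implements DataSource {
    private final DataSource wrapped;
    EncryptionDecorator(DataSource wrapped) { this.wrapped = wrapped; }
    public String read() { return "decrypt(" + wrapped.read() + ")"; }      // stacks freely
}

// Usage: decorators can be combined at runtime without changing FileDataSource:
// DataSource ds = new EncryptionDecorator(new CompressionDecorator(new FileDataSource()));
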
Proxy Pattern: The Proxy Design Pattern is a structural pattern that provides an object
representing another object. This pattern is used to control access to an object by acting as an
intermediary. A proxy can be used to:

 Control access to the original object (e.g., lazy loading, access control).
 Add additional functionality to the object (e.g., logging, caching).
 Prevent direct access to the object to protect it from misuse or heavy computation.
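
A lazy-loading protection proxy sketch in the same spirit; names are illustrative:

interface Image { void display(); }

class RealImage implements Image {
    private final String filename;
    RealImage(String filename) {
        this.filename = filename;
        System.out.println("Loading " + filename + " from disk (expensive)");
    }
    public void display() { System.out.println("Displaying " + filename); }
}

class ImageProxy implements Image {
    private final String filename;
    private RealImage real;                   // created only when first needed

    ImageProxy(String filename) { this.filename = filename; }

    public void display() {
        if (real == null) {
            real = new RealImage(filename);   // lazy loading on first use
        }
        real.display();                       // delegate to the real object
    }
}
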
LLD interview:
For the "Design Stack Overflow" question:

Let's assume the interviewer wants us to focus on:
Stack Overflow class implementation: Java

If the expectation is to demo and test the code, you can create a separate demo
class like StackOverflowDemo
You can find the implementation of Design Stack Overflow in Java and
Python here.
