System Design Handbooks

Uploaded by

Vadla Bhaskar
Copyright
© © All Rights Reserved

brijpandeyji

System Design Terminology
The Ultimate Guide to System Design Terms and Glossary

Contents
API Gateway
Asynchronism
Async API
Availability
Availability patterns
Fail-over
Replication
Availability in numbers
Batch Processing
Bloom Filter
Cache
Client caching
CDN caching
Web server caching
Database caching
Application caching

Cache-aside
Write-through
Write-behind
Refresh-ahead
Cache Stampede
CDN (Content Delivery Network)
Push CDN
Pull CDN
CQRS
Clocks
Communication
Consensus
Consistency patterns
Weak consistency
Eventual consistency
Strong consistency
Consistent Hashing
Count Min Sketch
Data Warehousing
Row oriented Storage
Column oriented storage
Data Cube
Denormalization
Deserialization
Disaster recovery
Distributed File Storage
DNS (Domain Name System)
Document Store
Enterprise Service Bus (ESB)
Event-Driven Architecture (EDA)
Event Sourcing
Federation
Geohashing
gRPC
GraphQL
Graph Databases
Indexes
Implicit indexes
Composite indexes
Key-value store
Caching
Session management
High-speed data storage
Large scale systems
Latency
Linearizability
Load balancer
Long polling
Maintainability
Master-master replication
Master-slave replication
Memory Cache
Message Brokers
Message Queues
Microservices
mTLS
N-tier architecture
Non-Relational Database
Object Store
Ordering
OAuth 2.0
OpenID Connect (OIDC)
Partitioning/Sharding
Horizontal Sharding

Vertical Sharding
Performance
Publish-Subscribe
Quadtrees
REST
Read API
Read Replicas
RDBMS
Relational Databases
Reliability
Remote Procedure Call (RPC)
Replication
Request Coalescing
Response time
Reverse Proxy
Rate Limiting
Scalability
Serialization
Server-Sent Events (SSE)
Service Discovery
SLA (Service Level Agreement)
SLO (Service Level Objective)
SLI (Service Level Indicator)
Single Sign-On (SSO)
SQL Tuning
Indexing
Query optimization
Partitioning
Denormalization
SSL (Secure Sockets Layer)
Stream Processing
TCP
Three Tier Caching
Throughput
TLS (Transport Layer Security)
Transaction
Atomicity
Consistency
Isolation
Durability
UDP
Virtual Machines (VMs)
Wide column store
Web Sockets

API Gateway
API Gateway refers to a server that acts as an entry point for client requests
to access web services or APIs.

Asynchronism
Asynchronism refers to a programming paradigm where multiple tasks can be
executed concurrently, without waiting for each other to complete.

Async API
Async APIs enable multiple clients to initiate and manage requests
concurrently, improving system performance and scalability.

Availability
Availability refers to the ability of a system to remain operational and
accessible to users, even in the event of failures or outages.

Availability Patterns
Availability patterns are the design techniques and practices used to ensure
high availability of a system. These patterns include replication, fail-over
and availability in numbers.

Fail-over
Fail-over is an availability pattern that involves switching to a backup
system when the primary system fails.

Replication
Replication is an availability pattern that involves maintaining multiple
copies of data in different locations.

Availability in Numbers
Availability in numbers is an availability pattern that involves using
redundancy to ensure that there are multiple instances of a system available
to handle requests.

Batch Processing
Batch processing is the execution of a series of jobs or tasks in a single
batch or group, rather than individually or in real-time.
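
The effect of redundancy on availability can be put into numbers. The sketch below is only an illustration and assumes that component failures are independent, which real systems rarely guarantee; the function names `in_sequence` and `in_parallel` are made up for this example. Components that all must be up multiply their availabilities, while redundant components fail only when every copy is down.

```python
def in_sequence(*availabilities: float) -> float:
    """Availability when every component must be up (assumes independence)."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total


def in_parallel(*availabilities: float) -> float:
    """Availability when at least one redundant component must be up."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)  # probability that this copy is down too
    return 1.0 - all_down


if __name__ == "__main__":
    # Two 99.9% components chained together are less available than one...
    print(in_sequence(0.999, 0.999))
    # ...while two redundant 99.9% components are more available.
    print(in_parallel(0.999, 0.999))
```

Under these assumptions, chaining lowers availability and redundancy raises it, which is why fail-over and replication appear as availability patterns.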

Bloom Filter
Bloom filter is a probabilistic data structure used to test whether an element
is a member of a set.

Cache
Cache is a temporary storage location that stores frequently accessed data to
improve system performance.

Client Caching
Client caching is the process of storing frequently used data on the
client-side.

CDN Caching
CDN caching involves storing frequently accessed data on a Content Delivery
Network.

Web Server Caching
Web server caching is the process of storing web pages on the server-side.

Database Caching
Database caching is the process of storing frequently used data at the
database layer.
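
To make the Bloom filter entry above concrete, here is a minimal sketch. It is not a production implementation: the bit-array size, the number of hashes, and the trick of salting SHA-256 with an index to get several hash functions are all assumptions chosen for brevity. The key property it shows is that a negative answer is definite, while a positive answer only means "possibly present".

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add("alice")
    print(bf.might_contain("alice"))  # an added item is always reported
```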

Application Caching
Application caching is the process of storing frequently used data in the
application.

Cache Aside
Cache-aside is a cache management technique that involves storing data in
cache only when it's needed.

Write Through
Write-through involves updating the cache and the original data source at the
same time.

Write Behind
Write-behind, or write-back, is a cache management technique that involves
updating the cache first and then updating the original data source later.

Refresh Ahead
Refresh-ahead involves updating the cache with anticipated data before it is
requested.

Cache Stampede
Cache stampede is where multiple clients simultaneously request data that is
not currently in the cache, resulting in a surge of requests to the original
data source.
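
The cache-aside technique described above can be sketched in a few lines. This is an illustrative, single-threaded sketch: the class name `CacheAside`, the `load_from_source` callback, and the `source_reads` counter are hypothetical names introduced here, and an in-process dictionary stands in for a real cache.

```python
from typing import Callable, Dict


class CacheAside:
    def __init__(self, load_from_source: Callable[[str], str]) -> None:
        self._load = load_from_source      # stands in for a database query
        self._cache: Dict[str, str] = {}
        self.source_reads = 0              # counts trips to the data source

    def get(self, key: str) -> str:
        if key in self._cache:             # cache hit: no source access
            return self._cache[key]
        self.source_reads += 1             # cache miss: load, then populate
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key: str) -> None:
        # Call after the source changes so the next read reloads the value.
        self._cache.pop(key, None)


if __name__ == "__main__":
    cache = CacheAside(lambda key: key.upper())
    cache.get("a")
    cache.get("a")
    print(cache.source_reads)  # only the first read reached the source
```

This sketch does nothing to prevent a cache stampede: with concurrent callers, several of them could miss at once and all hit the source.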

CDN
CDN (Content Delivery Network) is a network of geographically distributed
servers that provides fast and reliable content delivery to users.

Push CDN
A CDN that pushes content to a network of servers distributed around the world
before the content is requested, enabling faster delivery to users.

Pull CDN
A CDN that pulls content from the origin server when it is requested, and
caches it at edge servers distributed around the world, enabling faster
delivery to users.

CQRS
CQRS (Command Query Responsibility Segregation) is an architectural pattern
that separates read and write operations into distinct models, each of which
may have its own data store.

Clocks
Clocks refer to a mechanism used to keep track of time and synchronize events
across distributed systems.

Communication
Communication in system design refers to the exchange of information between
different components of a distributed system.

Consensus
Consensus is a mechanism used to reach agreement among distributed components
on a single value or decision.

Consistency Patterns
Consistency patterns are techniques used to ensure that different components
of a distributed system share a consistent view of data.

Weak Consistency
In a weak consistency pattern, data may be inconsistent for a short period of
time.

Eventual Consistency
Data will eventually become consistent, but there may be a delay.

Strong Consistency
Once a write completes, every subsequent read returns the updated data, so all
components see the same value at the same time.

Consistent Hashing
Consistent hashing is a technique to partition and distribute data across
multiple nodes in a way that minimizes the amount of data movement when nodes
are added or removed from the system.
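
The "minimal data movement" property of consistent hashing is easier to see in code. The following is a rough sketch, assuming a hash ring with virtual nodes; the class name `HashRing`, the `replicas` parameter, and the use of SHA-256 are choices made for this example and are not prescribed by the handbook.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(value: str) -> int:
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes: Iterable[str] = (), replicas: int = 100) -> None:
        self.replicas = replicas            # virtual nodes per physical node
        self._keys: List[int] = []          # sorted positions on the ring
        self._owners: Dict[int, str] = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            pos = _hash(f"{node}#{i}")
            bisect.insort(self._keys, pos)
            self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        for i in range(self.replicas):
            pos = _hash(f"{node}#{i}")
            self._keys.remove(pos)
            del self._owners[pos]

    def get_node(self, key: str) -> str:
        if not self._keys:
            raise ValueError("ring has no nodes")
        # Walk clockwise to the first position after the key's hash.
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._owners[self._keys[idx]]


if __name__ == "__main__":
    ring = HashRing(["a", "b", "c"])
    print(ring.get_node("user-42"))
```

When a node is removed, only the keys that node owned are reassigned; keys owned by the remaining nodes stay where they were.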

Count Min Sketch
Count Min Sketch is a probabilistic data structure used for frequency counting
and approximate querying.

Data Warehousing
Data warehousing is a technique used in system design to store and analyze
large amounts of data from multiple sources in a centralized repository.

Row Oriented Storage
The database is partitioned horizontally, so data is stored row by row. With
this approach, writes are performed more easily than reads.

Column Oriented Storage
The database is partitioned vertically, so data is stored column by column.
With this approach, reads are performed more easily than writes.

Data Cube
A data cube allows data to be modelled and viewed in multiple dimensions.

Denormalization
Denormalization is a technique used in database design to optimize query
performance by adding redundant data to a database schema.
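
As a companion to the Count Min Sketch entry above, here is a small sketch of the idea. The table dimensions and the salted SHA-256 hashing are assumptions made for this example. Each item increments one counter per row, and the estimate is the minimum across rows, so the result can overcount because of collisions but never undercounts.

```python
import hashlib
from typing import List


class CountMinSketch:
    def __init__(self, width: int = 256, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.table: List[List[int]] = [[0] * width for _ in range(depth)]

    def _index(self, row: int, item: str) -> int:
        # One hash function per row, simulated by salting with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item: str) -> int:
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )


if __name__ == "__main__":
    cms = CountMinSketch()
    for word in ["x", "x", "x", "y"]:
        cms.add(word)
    print(cms.estimate("x"), cms.estimate("y"))
```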

Deserialization
Deserialization is the process of converting data in a serialized format back
into its original form, such as converting binary data into an object.

Disaster Recovery
Disaster recovery is a process of preparing for and recovering from an
unexpected event that causes a system outage or data loss.

Distributed File Storage
Distributed file storage is a system design approach that allows files to be
stored across multiple nodes or servers in a distributed network.

DNS (Domain Name System)
DNS (Domain Name System) is a hierarchical distributed system that translates
human-readable domain names into IP addresses that machines can understand.

Document Store
A document store is a NoSQL database that stores and retrieves data in JSON,
XML, or other document formats, providing flexible and schemaless data
modeling.

Enterprise Service Bus (ESB)
Enterprise Service Bus (ESB) is a software architecture that provides a
messaging infrastructure to facilitate communication between disparate
applications, services, and systems within an organization.

Event-Driven Architecture (EDA)
Event-Driven Architecture (EDA) is a software design pattern that emphasizes
the use of events to trigger and communicate between different parts of a
system.

Event Sourcing
Event Sourcing is a design pattern in which the state of a system is derived
from a sequence of events.

Federation
Federation is a system design concept in which multiple autonomous systems are
combined into a single cohesive system.

Geohashing
Geohashing is a system design concept used for spatial indexing of geographic
data.

gRPC
gRPC is a high-performance open-source Remote Procedure Call (RPC) framework
that uses protocol buffers to enable communication between distributed
systems.

GraphQL
GraphQL is a query language for APIs that allows clients to specify exactly
what data they need and provides a predictable and efficient approach for
fetching data.
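
Event sourcing, as defined above, derives state from a sequence of events rather than storing the state directly. The sketch below illustrates this with a bank balance; the `Event` type and the event names "deposited" and "withdrawn" are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event names)
    amount: int


def replay(events: Iterable[Event]) -> int:
    """Rebuild the current balance by applying every event in order."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


if __name__ == "__main__":
    log = [Event("deposited", 100), Event("withdrawn", 30)]
    print(replay(log))  # prints 70
```

Because the event log is the source of truth, replaying a prefix of it gives the state at any earlier point in time.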

Graph Databases
Graph databases are NoSQL databases that store data in nodes and edges,
enabling the representation and querying of complex, interconnected data.

Indexes
Indexes are data structures used in databases to speed up the retrieval of
data by allowing for faster lookup of data based on specific fields.

Implicit Indexes
Implicit indexes are created automatically by the database for internal use,
so that data can be stored and retrieved quickly and efficiently.

Composite Indexes
Composite indexes are created on multiple columns that together identify the
data points.

Key-Value Store
A key-value store is a type of NoSQL database that stores data as key-value
pairs, enabling high-speed data retrieval and storage.

Caching
Key-value stores are often used as an in-memory cache to store frequently
accessed data, such as user sessions, to improve the performance of web
applications.

Session Management
Key-value stores are often used to store session data, such as user
information and shopping cart contents, to keep track of the current state of
a user's session.

High-speed Data Storage
Key-value stores are often used to store large amounts of data quickly and
efficiently, such as in real-time analytics or distributed systems.

Large Scale Systems
Large scale systems refer to complex software systems that need to handle
large amounts of data or traffic, and require scalable, fault-tolerant
architecture.

Latency
Latency is the time it takes for a request to be sent and a response to be
received, and is an important factor in system performance and user
experience.

Linearizability
Linearizability is a consistency model used in distributed systems to ensure
that all nodes see the same order of events, even when requests are processed
concurrently.

Load Balancer
A load balancer is a network device or software component that distributes
traffic across multiple servers to improve system performance and prevent
overloading of individual servers.
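
One simple way a load balancer can distribute traffic is round-robin selection. The sketch below shows only that selection step, under the assumption that all servers are healthy and equally capable; the class name and server labels are illustrative, and real load balancers add health checks and other strategies.

```python
import itertools
from typing import Iterable, Iterator, List


class RoundRobinBalancer:
    def __init__(self, servers: Iterable[str]) -> None:
        pool: List[str] = list(servers)
        if not pool:
            raise ValueError("at least one server is required")
        self._cycle: Iterator[str] = itertools.cycle(pool)

    def next_server(self) -> str:
        # Each call hands out the next server, wrapping around at the end.
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["s1", "s2", "s3"])
    print([balancer.next_server() for _ in range(4)])
    # prints ['s1', 's2', 's3', 's1']
```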

Long Polling
Long polling is a technique used in web applications to allow for real-time
updates without constantly refreshing the page, by holding open a connection
until new data is available.

Maintainability
Maintainability is a software quality attribute that refers to how easy it is
to modify and maintain a system over time.

Master-Master Replication
Master-master replication is a database replication technique where two or
more nodes can act as both the master and the slave, allowing for data to be
updated on any node and then propagated to all other nodes.

Master-Slave Replication
Master-slave replication is a database replication technique where one node
(the master) accepts updates and propagates them to one or more other nodes
(the slaves).

Memory Cache
Memory cache is a type of cache that stores data in the memory of a system,
enabling high-speed data retrieval.

Message Brokers
Message brokers are software components that enable communication between
different parts of a system by routing messages between different nodes.

Message Queues
Message queues are a type of message broker that allows for asynchronous
communication between different parts of a system by storing messages until
they can be processed.

Microservices
Microservices are a software architecture pattern where a system is broken
down into small, independent services that can be developed and deployed
separately.

mTLS
mTLS (Mutual Transport Layer Security) is a security protocol that uses
certificates to establish a secure connection between two nodes, ensuring that
both parties can verify each other's identity.

N-tier Architecture
N-tier architecture is a software architecture pattern where a system is
divided into multiple layers, with each layer responsible for a specific set
of functions.

Non-Relational Databases
Non-relational (NoSQL) databases are databases that do not use a traditional
relational data model, allowing for greater flexibility and scalability.

Object Store
Object store is a type of data storage system that stores data in a flat
namespace, using unique identifiers for each object.

Ordering
Ordering refers to the sequence in which events occur in a system, and is an
important consideration in distributed systems to ensure consistency and
correctness.

OAuth 2.0
An authorization framework that enables third-party applications to access
user data on a resource server without the user sharing their credentials with
the application.

OpenID Connect (OIDC)
An authentication protocol built on the OAuth 2.0 framework that enables
single sign-on (SSO) between different systems.

Partitioning/Sharding
A database scaling technique where data is distributed across multiple nodes
to improve performance, availability, and scalability.

Horizontal Sharding
Horizontal sharding is a database partitioning technique where rows of data
are distributed across multiple nodes based on a shard key.

Vertical Sharding
Vertical sharding involves splitting a table's columns into multiple physical
tables.
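
For horizontal sharding, something has to map each shard key to a node. A common and simple approach is to hash the key and take the remainder, sketched below. The function name `shard_for` and the key format are assumptions for this example. A stable hash such as SHA-256 is used because Python's built-in `hash()` for strings can vary between processes.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for user_id in ["user-1", "user-2", "user-3"]:
        print(user_id, "-> shard", shard_for(user_id, 4))
```

The drawback of plain modulo placement is that changing `num_shards` remaps most keys, which is the problem consistent hashing is meant to reduce.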

Performance
A measure of how well a system or component accomplishes its intended
function, usually measured in terms of response time, throughput, or resource
utilization.

Publish-Subscribe
A messaging pattern where a message producer sends messages to multiple
consumers who are interested in receiving them, without requiring the producer
to know the consumers' identities.

Quadtrees
A data structure that represents a two-dimensional space recursively
partitioned into four smaller regions, used to perform spatial queries
efficiently.

REST
An architectural style for designing web services that uses HTTP methods to
perform operations on resources.

Read API
An API designed for reading data from a system or database, usually optimized
for high throughput and low latency.

Read Replicas
Duplicate copies of a database that are used to offload read traffic from the
primary database, improving performance and scalability.
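
The publish-subscribe pattern described above can be illustrated with a tiny in-process broker. This is a sketch only: the `Broker` class, the topic name, and the synchronous delivery are assumptions for the example, whereas real systems typically deliver messages asynchronously over a network. The point it shows is that the publisher addresses a topic and never refers to the subscribers directly.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to every subscriber; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("orders", lambda m: print("billing saw:", m))
    broker.subscribe("orders", lambda m: print("shipping saw:", m))
    broker.publish("orders", "order created")
```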

RDBMS
A Relational Database Management System (RDBMS) is a software system that
manages relational databases, providing tools for data storage, retrieval, and
modification.

Relational Databases
A type of database that stores data in tables with predefined relationships
between them, usually managed by an RDBMS.

Reliability
The ability of a system or component to perform its intended function with a
certain level of confidence and under specific conditions.

Remote Procedure Call (RPC)
A protocol that enables a program to execute a procedure on a remote system
over a network, as if it were local.

Replication
The process of copying data from one database to another for redundancy,
availability, and scalability purposes.

Request Coalescing
A technique that combines multiple requests, such as several small requests or
concurrent requests for the same resource, into a single request to reduce
overhead and improve performance.

Response Time
The time it takes for a system or component to respond to a request or event.

Reverse Proxy
A server that acts as an intermediary between clients and one or more back-end
servers, usually used for load balancing, security, and caching purposes.

Rate Limiting
A technique used to limit the amount of traffic or requests that a system or
API will accept over a given period of time, to prevent overload and maintain
performance.

Scalability
The ability of a system or component to handle increasing amounts of work or
traffic by adding resources or nodes to the system.

Serialization
The process of converting an object into a format that can be transmitted over
a network or stored in a file.

Server-Sent Events (SSE)
A mechanism for real-time, one-way communication from a server to clients over
HTTP, used for streaming data or notifications.
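
Rate limiting can be implemented in several ways; the token bucket is one common choice and is sketched below. Nothing here comes from the handbook itself: the class name, the parameters, and the injectable `clock` argument (added so the behavior can be checked without sleeping) are all assumptions made for this example.

```python
import time
from typing import Callable


class TokenBucket:
    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = capacity       # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, consuming one token."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(
            self.capacity, self._tokens + elapsed * self.refill_per_second
        )
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=2, refill_per_second=1)
    print([bucket.allow() for _ in range(3)])  # a burst of 2, then rejection
```

The capacity bounds the burst size and the refill rate bounds the sustained request rate.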

Service Discovery
The process of automatically discovering and registering services in a
distributed system, usually using a centralized registry or a peer-to-peer
protocol.

SLA (Service Level Agreement)
A contractual agreement that specifies the expected level of service and
performance that a provider will deliver to a client.

SLO (Service Level Objective)
A measurable goal or target for a service's performance or availability,
usually based on an SLA or business requirements.

SLI (Service Level Indicator)
A measurable metric used to track and monitor a service's performance or
availability, usually measured against an SLO or business requirements.

Single Sign-On (SSO)
A process that enables users to authenticate once and access multiple
applications or systems without requiring additional authentication steps.

SQL Tuning
The process of optimizing SQL queries to improve their performance and
efficiency.
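
The relationship between an SLI and an SLO can be shown with a little arithmetic. In this sketch the SLI is the fraction of successful requests and the SLO is a target for that fraction; the function names, the sample numbers, and the choice to treat a period with no traffic as fully available are assumptions for illustration.

```python
def availability_sli(successful: int, total: int) -> float:
    """Fraction of requests that succeeded during the measurement window."""
    if total == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return successful / total


def meets_slo(sli: float, slo_target: float) -> bool:
    """An SLO is met when the measured SLI reaches the target."""
    return sli >= slo_target


if __name__ == "__main__":
    sli = availability_sli(successful=9990, total=10000)
    print(sli, meets_slo(sli, 0.995), meets_slo(sli, 0.9995))
```

An SLA would then attach contractual consequences to missing a target of this kind.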

Indexing
Creating indexes on columns that are frequently used in WHERE clauses and
JOINs can significantly improve query performance.

Query Optimization
The database optimizer analyzes the query and selects the most efficient
execution plan.

Partitioning
Partitioning large tables into smaller, more manageable chunks can improve
query performance, especially for large data sets.

Denormalization
Denormalization is the process of purposely adding redundant data to one or
more tables to improve query performance.

SSL (Secure Sockets Layer)
A security protocol that encrypts data transmitted over the internet to ensure
secure communication.

Stream Processing
A method of processing continuous streams of data in real-time to derive
insights and take action.

TCP (Transmission Control Protocol)
A reliable, connection-oriented protocol used for transmitting data over the
internet.

Three Tier Caching
A caching strategy that uses three levels of cache: client, application
server, and database server, to improve application performance.

Throughput
The amount of data or transactions that a system can process in a given amount
of time.

TLS (Transport Layer Security)
A security protocol that provides encryption and authentication of data
transmitted over the internet.

Transaction
A logical unit of work that comprises one or more database operations that
must be performed as a single, indivisible unit.

Atomicity
It means either all the operations of a transaction are properly reflected in
the database or none of them.
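
Atomicity is easiest to appreciate when a transaction fails halfway. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table, the account names, and the `transfer` and `make_db` helpers are hypothetical and exist only for this example. Using the connection as a context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # both updates were rolled back together


if __name__ == "__main__":
    db = make_db()
    print(transfer(db, "alice", "bob", 30))   # succeeds
    print(transfer(db, "alice", "bob", 500))  # fails and changes nothing
    print(dict(db.execute("SELECT name, balance FROM accounts")))
```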

Consistency
It means a transaction should take the database from one valid state to
another, so that data consistency is maintained.

Isolation
In situations where multiple transactions are executing, each transaction
should be unaware of the other executing transactions and should be isolated
from them.

Durability
It means that after the transaction is complete, the changes that are made to
the database should persist even in case of system failures.

UDP (User Datagram Protocol)
A lightweight, connectionless protocol used for transmitting data over the
internet.

Virtual Machines (VMs)
A software emulation of a physical computer that allows multiple operating
systems to run on a single physical machine.

Wide Column Store
A type of NoSQL database that uses a column-family data model to store data in
a distributed and scalable manner.

Web Sockets
A protocol that provides bi-directional, full-duplex communication channels
over a single, long-lived connection between a client and server.

For More Interesting Content
Brij Kishore Pandey

Follow Me On LinkedIn
https://www.linkedin.com/in/brijpandeyji/