Uploaded by

tejaswi nadendla

System Design

August 11, 2024

Day 1: Basics
What is System Design?
System design is the process of defining the architecture, components, modules,
interfaces, and data for a system to satisfy specific requirements.

Horizontal vs. Vertical Scaling


Horizontal Scaling: Adding more servers to a system to distribute the load.
Vertical Scaling: Adding more power (CPU, RAM) to an existing server.

What is Capacity Estimation?


Capacity estimation is the process of determining the resources (e.g., CPU,
memory, bandwidth) required to handle a specific workload.
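
As a rough illustration, a back-of-envelope estimate can be written as a few
lines of arithmetic. Every input below (10 million daily users, 20 requests
per user, 2 KB per response, a 3x peak factor) is a made-up assumption for the
sketch, not a figure from these notes, and estimate_capacity is a hypothetical
helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def estimate_capacity(daily_users, requests_per_user, bytes_per_response, peak_factor):
    """Return (average QPS, peak QPS, peak outbound bandwidth in Mbit/s)."""
    avg_qps = daily_users * requests_per_user / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    # bytes/s -> bits/s -> megabits/s
    peak_mbps = peak_qps * bytes_per_response * 8 / 1_000_000
    return avg_qps, peak_qps, peak_mbps

# Assumed workload, chosen only to make the arithmetic concrete.
avg_qps, peak_qps, peak_mbps = estimate_capacity(
    daily_users=10_000_000,
    requests_per_user=20,
    bytes_per_response=2_000,
    peak_factor=3,
)
print(f"average QPS ~ {avg_qps:,.0f}")
print(f"peak QPS ~ {peak_qps:,.0f}")
print(f"peak egress ~ {peak_mbps:,.0f} Mbit/s")
```

The same pattern extends to storage (writes per day times record size times
retention) and memory (hot data set size times replication).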

What is HTTP?
HTTP (HyperText Transfer Protocol) is the foundation of data communication
on the web, facilitating the transfer of resources like HTML pages.

What is the Internet TCP/IP stack?


The TCP/IP stack is a suite of communication protocols used to interconnect
network devices on the internet, consisting of four layers: Application,
Transport, Internet, and Link.

What happens when you enter Google.com?


When you enter Google.com, your browser resolves the domain to an IP address
through DNS, establishes a TCP connection (plus a TLS handshake for HTTPS),
sends an HTTP request, and receives the response containing the webpage.

What are Relational Databases?
Relational databases store data in structured tables with predefined schemas
and use SQL for querying.

What are Database Indexes?


Database indexes are data structures that improve the speed of data retrieval
operations on a database table at the cost of additional storage and maintenance.
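
The effect of an index can be seen with Python's built-in sqlite3 module. This
is a minimal sketch: the users table, its columns, and the index name
idx_users_email are invented for the example. It asks SQLite for its query
plan before and after the index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def query_plan(sql, params=()):
    """Return SQLite's human-readable plan steps for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description

lookup = "SELECT name FROM users WHERE email = ?"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(lookup, ("user42@example.com",))

# The index costs extra storage and write work, but lets SQLite search directly.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan
should describe a scan and the second a search using idx_users_email.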

What are NoSQL databases?


NoSQL databases are non-relational databases designed for horizontal scaling
and handling large amounts of unstructured or semi-structured data.

What is a Cache?
A cache is a high-speed data storage layer that stores a subset of data, typically
transient, to serve future requests faster.

What is Thrashing?
Thrashing occurs when a system spends more time swapping data between
memory and disk than executing actual tasks, often due to insufficient memory.

What are Threads?


A thread is the smallest unit of execution that can be scheduled independently;
threads within the same process share that process's memory and resources.

Day 2: Load Balancing
What is Load Balancing?
Load balancing is the process of distributing network or application traffic across
multiple servers to ensure no single server becomes a bottleneck, improving
performance and reliability.

What is Consistent Hashing?


Consistent hashing is a technique used in distributed systems to evenly
distribute data across a dynamic set of nodes, minimizing the number of data
items that need to be relocated when nodes are added or removed.
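
One common way to realize this is a hash ring with virtual nodes. The sketch
below is illustrative only: the HashRing class, the node names, and the choice
of MD5 with 100 virtual nodes per server are assumptions made for the example.

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with several virtual nodes per server."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each server is placed on the ring many times to even out the load.
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]

keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {key: ring.get_node(key) for key in keys}

ring.add_node("cache-d")
after = {key: ring.get_node(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

Only keys that now belong to the new node move; with plain modulo hashing
(hash % number_of_nodes) most keys would be reassigned instead.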

What is Sharding?
Sharding is a database partitioning technique that divides large databases into
smaller, faster, and more easily managed parts called shards, often used in
NoSQL databases to improve scalability.

Day 3: DataStores
What are Bloom Filters?
Bloom filters are probabilistic data structures used to test whether an element
is a member of a set, allowing for false positives but not false negatives, and are
space-efficient.
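
A minimal sketch of the idea follows. The BloomFilter class, the bit-array
size, and the use of salted SHA-256 to derive the hash positions are choices
made for this example; production filters usually size the array and the
number of hashes from a target false-positive rate.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not present"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter(size_bits=1024, num_hashes=3)
added = ["example.com/a", "example.com/b", "example.com/c"]
for url in added:
    seen.add(url)

# Added items are never reported missing (no false negatives).
print([seen.might_contain(url) for url in added])
# An item that was never added is usually rejected, but a false positive is possible.
print(seen.might_contain("example.com/never-added"))
```

A typical use is to check the filter before an expensive lookup: a "no" skips
the lookup entirely, while a "maybe" falls through to the real data store.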

What is Data Replication?


Data replication is the process of storing copies of data on multiple servers or
locations to improve availability, fault tolerance, and access speed.

How are NoSQL databases optimized?


NoSQL databases are optimized through techniques like horizontal scaling,
sharding, denormalization, and eventual consistency to handle large-scale,
distributed data efficiently.

What are Location-based Databases?


Location-based databases are designed to store and query geographically
referenced data, and are often optimized for geospatial queries and real-time
location tracking.

Database Migrations
Database migrations involve applying incremental changes to the database schema,
such as adding or altering tables and indexes, while preserving the existing data.

Day 4: Consistency vs. Availability
What is Data Consistency?
Data consistency ensures that data remains accurate, valid, and in sync across
all nodes in a distributed system after any operation.

Data Consistency Levels


Data consistency levels define the guarantees a system provides regarding the
visibility of data updates, ranging from strong consistency (immediate visibility)
to eventual consistency (updates propagate over time).

Transaction Isolation Levels


Transaction isolation levels define the extent to which transactions are isolated
from each other, preventing phenomena like dirty reads, non-repeatable reads,
and phantom reads.

Day 5: Message Queues
What is a Message Queue?
A message queue is a form of asynchronous communication in which messages are
stored in a queue until they are processed by the receiving application,
enabling decoupled systems.

What is the publisher-subscriber model?


The publisher-subscriber model is a messaging pattern where publishers send
messages to a topic, and subscribers receive only the messages of the topics they
are interested in.
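
The pattern can be sketched with an in-process broker. The Broker class and
the topic names are hypothetical, and delivery here is synchronous for
simplicity; real brokers typically deliver asynchronously over the network.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver only to callbacks registered for this topic.
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)

broker = Broker()
order_events = []
payment_events = []

broker.subscribe("orders", order_events.append)
broker.subscribe("payments", payment_events.append)

broker.publish("orders", {"order_id": 1, "status": "created"})
broker.publish("payments", {"order_id": 1, "status": "paid"})
delivered = broker.publish("shipping", {"order_id": 1})  # nobody is listening

print(order_events)
print(payment_events)
print(delivered)
```

The publisher never references a subscriber directly, which is what keeps the
two sides decoupled.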

What are event-driven systems?


Event-driven systems are architectures where system components communicate
through events, reacting to state changes asynchronously, which improves
scalability and responsiveness.

Database as a Message Queue


Using a database as a message queue involves storing messages in database
tables, although it may lead to performance issues and is less efficient compared
to dedicated message queue systems.

Day 6: DevOps Concepts
What is a Single Point of Failure?
A single point of failure is a component in a system that, if it fails, will cause
the entire system to fail, making it critical to design for redundancy and fault
tolerance.

What are Containers?


Containers are lightweight, portable, and self-sufficient environments that
package an application and its dependencies, ensuring consistency across
different computing environments.

What is Service Discovery and Heartbeats?


Service discovery is the process of automatically detecting and connecting to
services in a network, while heartbeats are regular signals sent by services to
indicate they are active and healthy.
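
A registry that combines the two ideas might look like the sketch below. The
ServiceRegistry class, the 10-second timeout, and the addresses are
assumptions for illustration; the clock is injectable so the example can
simulate time passing without sleeping.

```python
import time

class ServiceRegistry:
    """Tracks service instances and drops those whose heartbeats stop."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self._clock = clock
        self._last_seen = {}  # (service, address) -> time of last heartbeat

    def heartbeat(self, service, address):
        # Registering and renewing are the same operation in this sketch.
        self._last_seen[(service, address)] = self._clock()

    def healthy_instances(self, service):
        now = self._clock()
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and now - seen <= self.timeout
        )

fake_now = [0.0]  # simulated clock, in seconds
registry = ServiceRegistry(timeout_seconds=10.0, clock=lambda: fake_now[0])

registry.heartbeat("payments", "10.0.0.1:8080")
registry.heartbeat("payments", "10.0.0.2:8080")

fake_now[0] = 8.0
registry.heartbeat("payments", "10.0.0.1:8080")  # only the first instance renews

fake_now[0] = 15.0
live = registry.healthy_instances("payments")
print(live)
```

Clients (or a load balancer) would query healthy_instances before routing a
request, so an instance that stops sending heartbeats stops receiving traffic.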

How to avoid Cascading Failures?


Cascading failures can be avoided by implementing redundancy, circuit breakers,
rate limiting, and isolating components to prevent failures from propagating
through the system.
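
Of these techniques, the circuit breaker is the easiest to show in code. The
sketch below is a simplified version under stated assumptions: the
CircuitBreaker class, the threshold of 3 failures, and the 30-second reset
timeout are invented for the example, and it is not thread-safe.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result

fake_now = [0.0]  # simulated clock, in seconds
breaker = CircuitBreaker(failure_threshold=3, reset_timeout=30.0,
                         clock=lambda: fake_now[0])
attempts = []

def flaky():
    attempts.append("called")
    raise ConnectionError("downstream unavailable")

outcomes = []
for _ in range(5):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")

fake_now[0] = 31.0  # after the reset timeout, a trial call is allowed
recovered = breaker.call(lambda: "ok")
print(outcomes, len(attempts), recovered)
```

Once the circuit opens, callers fail immediately instead of piling more load
onto a struggling dependency, which is what stops the failure from spreading.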

Anomaly Detection in Distributed Systems


Anomaly detection in distributed systems involves monitoring and analyzing
system metrics to identify and respond to unusual patterns or behaviors that
could indicate failures or security threats.

Distributed Rate Limiting


Distributed rate limiting controls the rate of requests across distributed systems
to prevent overloading services, often implemented using token bucket or leaky
bucket algorithms.
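
The token bucket variant can be sketched as follows. This is a single-process
illustration; in a distributed deployment the bucket state would have to live
in a shared store with atomic updates, which is outside this sketch. The
TokenBucket class and its parameters are hypothetical names.

```python
import time

class TokenBucket:
    """Allows bursts up to `capacity` and a steady rate of `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._updated = clock()

    def allow(self, cost=1):
        now = self._clock()
        elapsed = now - self._updated
        # Refill based on elapsed time, never exceeding the capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False

fake_now = [0.0]  # simulated clock, in seconds
bucket = TokenBucket(capacity=5, refill_per_second=1, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(7)]   # 7 requests at once
fake_now[0] = 2.0                            # two seconds later
later = [bucket.allow() for _ in range(3)]

print(burst)
print(later)
```

A leaky bucket differs mainly in that it drains queued requests at a fixed
rate instead of permitting bursts up to the bucket capacity.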

Day 7: Caching
What is Distributed Caching?
Distributed caching stores cached data across multiple servers to improve
scalability, fault tolerance, and access speed, keeping frequently used data
close to the application.

What are Content Delivery Networks?


Content Delivery Networks (CDNs) are geographically distributed networks of
servers that cache and deliver content to users based on their location, reducing
latency and load on the origin server.

Write Policies
Write policies in caching determine how writes reach the cache and the backing
database. Common strategies are write-through (each write goes synchronously to
both the cache and the database) and write-back (the write goes to the cache
first and is flushed to the database later).
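
The difference between the two policies is easiest to see side by side. In
this sketch a plain dict stands in for the database, and the class names are
invented for the example.

```python
class WriteThroughCache:
    def __init__(self, store):
        self.store = store  # stands in for the database
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value  # synchronous write to the backing store

class WriteBackCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()  # keys written to the cache but not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # defer the write to the backing store

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()

through_db = {}
through = WriteThroughCache(through_db)
through.write("user:1", "Ada")
db_after_write_through = dict(through_db)

back_db = {}
back = WriteBackCache(back_db)
back.write("user:2", "Grace")
db_before_flush = dict(back_db)
back.flush()
db_after_flush = dict(back_db)

print(db_after_write_through, db_before_flush, db_after_flush)
```

Write-back makes writes faster but risks losing unflushed data if the cache
node fails; write-through pays the database latency on every write in exchange
for durability.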

Replacement Policies
Replacement policies define how cached data is evicted when the cache reaches
capacity, with strategies like LRU (Least Recently Used), FIFO (First In, First
Out), and LFU (Least Frequently Used).
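
As an illustration of one of these policies, an LRU cache can be built on
Python's OrderedDict. The LRUCache class below is a minimal sketch and is not
thread-safe.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used entry comes first

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._items)

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used entry
cache.put("c", 3)     # cache is full, so "b" is evicted
print(cache.keys())   # ['a', 'c']
```

FIFO would evict "a" here regardless of the read, because it ignores access
order and only considers insertion order.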

Day 8: Microservices
Microservices vs. Monoliths
Microservices are small, independent services that communicate over a network,
offering flexibility and scalability, while monoliths are single, unified applications
with tightly coupled components.

How monoliths are migrated


Monoliths are migrated to microservices by gradually extracting and refactoring
individual components into services, ensuring each service is independently
deployable and scalable.

Day 9: API Gateways
How are APIs designed?
APIs are designed by defining the endpoints, request/response formats,
authentication methods, and rate limits, focusing on providing a clear,
consistent, and efficient interface for consumers.

What are asynchronous APIs?


Asynchronous APIs allow the client to make a request without waiting for the
server to complete processing, enabling better performance and user experience,
especially in long-running operations.

Day 10: Authentication Mechanisms
OAuth
OAuth is an open standard for token-based authorization (delegated access),
allowing third-party services to access user resources without exposing
credentials. Single sign-on is commonly built on top of it, for example with
OpenID Connect.

Token Based Auth


Token-based authentication involves issuing a token to the client upon successful
login, which is used to authenticate subsequent requests without needing to
re-enter credentials.

Access Control Lists and Rule Engines


Access Control Lists (ACLs) and Rule Engines are used to define and enforce
fine-grained permissions on resources, ensuring that only authorized users or
services can perform specific actions.

Day 11: System Design Tradeoffs
Pull vs. Push
Pull: The client requests updates or data periodically.
Push: The server sends updates or data to the client as soon as they are
available.

Memory vs. Latency


More memory can reduce latency by storing more data in-memory, reducing
access times, while less memory can increase latency due to more frequent disk
I/O operations.

Throughput vs. Latency


Increasing throughput (amount of work done in a given time) may increase
latency (time taken for a single operation) as the system processes more requests
simultaneously.

Consistency vs. Availability


In distributed systems, consistency ensures all nodes see the same data at the
same time, while availability ensures the system continues to operate even if
some nodes fail, often requiring trade-offs between the two.

Latency vs. Accuracy


Lower latency can improve system responsiveness but may result in reduced
accuracy due to incomplete data processing or estimation.

SQL vs. NoSQL databases


SQL databases typically provide strong consistency and ACID transactions, while
NoSQL databases often trade some of those guarantees (for example, by offering
eventual consistency) for schema flexibility and horizontal scalability, making
the two suitable for different use cases.

Day 12: Practice Problems

System Design of a Live-Streaming App
Approach:
1. Architecture: Use a distributed architecture with a combination of a
central server and a Content Delivery Network (CDN) to handle the
streaming and distribution of video content efficiently.
2. Ingestion: Capture live video from broadcasters, encode it into multiple
formats to support various devices and bandwidths, and push the streams
to the CDN.

3. Real-Time Interaction: Implement low-latency communication using
WebSockets or gRPC for real-time chat and interactions between users.
4. Scalability: Use auto-scaling groups for handling sudden spikes in user
traffic and geographically distributed edge servers to reduce latency.
5. Load Balancing: Distribute incoming traffic across multiple servers
using load balancers to ensure no single server is overwhelmed.
6. Monitoring and Analytics: Implement real-time monitoring for system
health and user metrics to optimize performance and user experience.

System Design of Instagram
Approach:
1. Architecture: Use a microservices architecture to handle different
functionalities like user management, photo storage, feed generation, and
notifications.
2. Photo Storage: Store photos in a distributed storage system like
Amazon S3 or a similar object storage service, with a CDN to serve images
quickly.

3. Feed Generation: Use a distributed task queue to generate and update
user feeds asynchronously, ensuring scalability as the number of users
grows.
4. Database: Use a NoSQL database like Cassandra for storing user data,
posts, and comments, ensuring high availability and scalability.

5. Caching: Implement caching at multiple levels (e.g., Redis) to speed up
frequent read operations like fetching user feeds or profiles.
6. Load Balancing: Distribute user requests across multiple instances of
each microservice using load balancers.

System Design of Tinder
Approach:
1. Architecture: Implement a microservices architecture to handle different
services like user profile management, matching, messaging, and
notifications.
2. Matching Algorithm: Use an efficient matching algorithm that considers
user preferences and location, possibly incorporating machine learning
for improved recommendations.

3. Real-Time Messaging: Use WebSockets or gRPC for real-time messaging
between matched users, ensuring low latency.
4. Database: Use a combination of SQL for relational data (user profiles)
and NoSQL for scalable storage of matches and messages.
5. Caching: Implement caching for frequently accessed data, such as
potential matches, using a system like Redis.
6. Scalability: Ensure scalability through horizontal scaling of microservices
and load balancing across instances.

System Design of WhatsApp
Approach:
1. Architecture: Use a microservices architecture to handle various features
like messaging, media sharing, user management, and notifications.

2. Real-Time Messaging: Implement messaging using WebSockets or XMPP
(Extensible Messaging and Presence Protocol) for real-time communication.
3. End-to-End Encryption: Ensure security by implementing end-to-end
encryption for all messages and media shared between users.
4. Database: Use NoSQL databases like Cassandra for storing messages,
with replication for high availability.
5. Media Storage: Store media files (images, videos) in a distributed
storage system like Amazon S3, served via a CDN for quick access.

6. Load Balancing: Use load balancers to distribute user traffic across
multiple servers, ensuring even load distribution.

System Design of TikTok
Approach:
1. Architecture: Use a microservices architecture to handle different
functionalities such as video uploading, processing, feed generation, and
user management.
2. Video Processing: Implement a video processing pipeline that includes
encoding, thumbnail generation, and storage in a distributed file system.
3. Feed Generation: Use machine learning algorithms to generate
personalized video feeds for users based on their interests and interaction
history.
4. Database: Use a NoSQL database like DynamoDB for storing user data,
video metadata, and interaction logs.
5. Caching: Cache popular videos and user profiles to reduce database load
and improve response times.

6. CDN: Use a CDN to deliver video content efficiently to users around the
world, minimizing latency.

System Design of an Online Coding Judge - Part 1
Approach:
1. Architecture: Use a microservices architecture to separate components
like code submission, compilation, testing, and user management.

2. Code Submission: Implement a submission system where users can
upload their code, which is then queued for compilation and testing.
3. Compilation and Testing: Set up isolated environments (e.g., Docker
containers) to compile and run code against test cases, ensuring security
and consistency.
4. Result Storage: Store the results of the tests in a database, which users
can access to see their submission outcomes.
5. Load Balancing: Use load balancers to distribute submission requests
across multiple servers for efficient processing.

System Design of an Online Coding Judge - Part 2
Approach:
1. Concurrency: Implement a task queue system to handle multiple
concurrent submissions, ensuring that each submission is processed
independently.
2. Scalability: Scale the compilation and testing services horizontally to
handle large numbers of submissions, especially during coding contests.
3. Leaderboard: Implement a leaderboard system to rank users based on
their performance, updating it in real-time during contests.
4. Database: Use a NoSQL database to store submission results and user
rankings, ensuring quick read/write operations.
5. Caching: Cache frequently accessed data like user rankings and problem
statements to reduce database load.

System Design of UPI Payments
Approach:
1. Architecture: Use a distributed microservices architecture to handle
transactions, user authentication, and payment processing.

2. Security: Implement robust security measures like encryption,
tokenization, and two-factor authentication to protect user data and
transactions.
3. Transaction Handling: Use ACID-compliant databases to ensure that
transactions are processed reliably and consistently.

4. High Availability: Implement replication and failover mechanisms to
ensure the system is available 24/7.
5. Load Balancing: Use load balancers to distribute transaction requests
across multiple servers, preventing bottlenecks.

6. Fraud Detection: Implement real-time monitoring and machine learning
algorithms to detect and prevent fraudulent transactions.

System Design of IRCTC
Approach:
1. Architecture: Use a microservices architecture to handle booking,
payment processing, seat allocation, and user management.

2. Booking System: Implement a robust booking system that can handle
a high volume of concurrent users, particularly during peak hours.
3. Seat Allocation: Design a seat allocation algorithm that efficiently
assigns seats based on user preferences and availability.

4. Database: Use a relational database to store user bookings, train
schedules, and seat availability, with indexing for quick queries.
5. Load Balancing: Implement load balancers to distribute user requests
across multiple servers, ensuring smooth operation during peak times.

6. High Availability: Use replication and failover strategies to ensure the
booking system remains available even in case of server failures.

System Design of Netflix Video Onboarding Pipeline
Approach:
1. Architecture: Use a distributed microservices architecture to handle
video ingestion, encoding, storage, and distribution.

2. Video Ingestion: Implement a pipeline to upload and ingest videos,
breaking them down into smaller segments for processing.
3. Video Encoding: Set up a distributed encoding system that processes
video segments in parallel, converting them into multiple formats and
bitrates.
4. Storage: Store encoded videos in a distributed storage system, using a
CDN to deliver them efficiently to users.
5. Scalability: Ensure scalability by distributing the encoding and storage
workloads across multiple servers and data centers.

6. Monitoring: Implement real-time monitoring to track the status of video
processing tasks, ensuring timely completion.

System Design of Doordash
Approach:
1. Architecture: Use a microservices architecture to handle order
placement, routing, delivery management, and user notifications.

2. Order Placement: Implement a robust system for handling high volumes
of orders, ensuring accuracy and reliability.
3. Routing and Delivery: Use algorithms for optimal routing and delivery
assignment, taking into account real-time traffic and driver availability.

4. Database: Use a NoSQL database to store order details, user preferences,
and delivery status, ensuring scalability and quick access.
5. Real-Time Tracking: Implement real-time tracking for users to follow
their orders, using WebSockets or similar technologies.

6. Scalability: Ensure scalability by horizontally scaling the microservices
and using load balancers to distribute requests.

System Design of Amazon Online Shops
Approach:
1. Architecture: Use a microservices architecture to handle various aspects
of the platform, including product search, inventory management, user
accounts, and payment processing.
2. Product Search: Implement a search engine optimized for fast retrieval
of products based on keywords, categories, and filters.
3. Inventory Management: Design a real-time inventory management
system that updates stock levels as orders are placed, ensuring accuracy.
4. Database: Use a combination of SQL and NoSQL databases to manage
product details, user accounts, and transaction history.
5. Scalability: Use auto-scaling and load balancing to handle large volumes
of user traffic, especially during peak shopping seasons.

6. Security: Implement secure payment gateways and data encryption to
protect user information and transaction details.

System Design of Google Maps
Approach:
1. Architecture: Use a distributed architecture to handle geospatial data
processing, real-time navigation, and user interaction.

2. Geospatial Data: Implement a system for managing and processing
large volumes of geospatial data, including maps, routes, and traffic
information.
3. Real-Time Navigation: Use algorithms for real-time route optimization
and turn-by-turn navigation, factoring in traffic and road conditions.
4. Database: Use a spatial database like PostGIS for storing and querying
geospatial data efficiently.
5. Load Balancing: Distribute user requests across multiple servers using
load balancers, ensuring smooth operation.

6. Scalability: Ensure scalability by horizontally scaling the data processing
and navigation services to handle millions of concurrent users.

System Design of Gmail
Approach:
1. Architecture: Use a microservices architecture to manage user accounts,
emails, storage, and spam filtering.

2. Email Storage: Store emails in a distributed storage system, ensuring
quick access and reliable backups.
3. Spam Filtering: Implement machine learning algorithms to detect and
filter out spam emails, ensuring user inboxes remain clean.

4. Database: Use a combination of SQL and NoSQL databases for storing
user data, emails, and metadata.
5. Scalability: Use load balancers and auto-scaling to manage high volumes
of incoming and outgoing emails, especially during peak hours.

6. Security: Ensure the security of user accounts and emails through
encryption, two-factor authentication, and regular security audits.

System Design of a Chess Website
Approach:
1. Architecture: Use a microservices architecture to handle user accounts,
matchmaking, game logic, and real-time gameplay.

2. Game Logic: Implement the chess game logic in a stateless service that
can be scaled horizontally to handle multiple games concurrently.
3. Real-Time Gameplay: Use WebSockets or gRPC to enable real-time
communication between players, ensuring low-latency gameplay.

4. Database: Use a NoSQL database to store user profiles, game history,
and leaderboards, ensuring scalability and quick access.
5. Scalability: Ensure scalability through horizontal scaling of the game
logic service and matchmaking system.

6. Security: Implement measures to prevent cheating, such as detecting
unusual patterns of play and securing communication channels.

System Design of Uber
Approach:
1. Architecture: Use a microservices architecture to handle user accounts,
driver management, ride matching, and payments.

2. Real-Time Matching: Implement a real-time matching system that
pairs riders with the nearest available drivers, considering factors like
location and traffic.
3. Routing and Navigation: Use real-time routing algorithms to optimize
the routes for drivers, factoring in traffic conditions and estimated arrival
times.
4. Database: Use a combination of SQL and NoSQL databases to manage
user accounts, ride history, and payment details.
5. Scalability: Ensure scalability by distributing the workload across
multiple services and using load balancers to handle high traffic volumes.
6. Security: Implement secure payment gateways and encryption to protect
user data and financial transactions.
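
The real-time matching step above can be sketched in its simplest form as a
nearest-available-driver search. This is only an illustration under stated
assumptions: it scans every driver and uses straight-line (haversine)
distance, whereas a production system would more likely use a geospatial index
and estimated travel time. The function names, driver records, and coordinates
are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_driver(rider, drivers):
    """Return the closest available driver, or None if nobody is available."""
    available = [d for d in drivers if d["available"]]
    if not available:
        return None
    return min(
        available,
        key=lambda d: haversine_km(rider[0], rider[1], d["lat"], d["lon"]),
    )

rider = (12.9716, 77.5946)  # hypothetical pickup location
drivers = [
    {"id": "d1", "lat": 12.9720, "lon": 77.5950, "available": False},  # closest, but busy
    {"id": "d2", "lat": 12.9800, "lon": 77.6000, "available": True},
    {"id": "d3", "lat": 13.0500, "lon": 77.7000, "available": True},
]

match = nearest_driver(rider, drivers)
print(match["id"])  # d2
```

Replacing the distance function with an estimated time of arrival is the
natural next step once traffic data is available.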

System Design of Google Docs
Approach:
1. Architecture: Use a microservices architecture to manage document
creation, editing, collaboration, and storage.

2. Real-Time Collaboration: Implement real-time collaboration using
Operational Transformation (OT) or Conflict-Free Replicated Data Types
(CRDTs) to handle concurrent edits.
3. Document Storage: Store documents in a distributed storage system,
with version control to manage edits and revisions.
4. Database: Use a combination of SQL and NoSQL databases to store user
data, document metadata, and collaboration history.
5. Scalability: Ensure scalability by horizontally scaling the document
storage and collaboration services to handle millions of concurrent users.

6. Security: Implement encryption and access control measures to protect
user documents and data from unauthorized access.
