90 Must Know Interview Questions

Uploaded by

antoajpstar
Copyright
© All Rights Reserved
90 Must-Know Tech Concepts for Your Next Interview
Data Partition / Sharding
Allows for the utilization of multiple
nodes, each responsible for managing a
distinct portion of the entire dataset.
Data Replication
Multiple data copies across nodes for
availability, scalability, and performance

Synchronous replication

Asynchronous replication
RPC
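Remote procedure call (RPC) is an
interprocess communication protocol used
in distributed systems. It lets a program
call a procedure on another machine as if
it were local, and it spans the transport
and application layers.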
Consistency
Every node returns the same, most
recently written data at any given time.

Eventual Consistency
Causal Consistency
Sequential Consistency
Strict Consistency
CAP Theorem
A distributed system can provide only two
of the following three properties
simultaneously. In practice, when a
network partition occurs, the system must
choose between consistency and
availability.

Consistency
Availability
Partition Tolerance

CP, CA, AP
Rate Limiting
Rate limiting caps the number of requests
a client may send in a given period,
protecting the service from overload and
keeping throughput stable.

1️⃣ Token bucket
2️⃣ Leaky bucket
3️⃣ Fixed window counter
4️⃣ Sliding window log
5️⃣ Sliding window counter
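
As an illustration of the first algorithm in the list, here is a minimal token bucket sketch. The class name `TokenBucket` and the parameters `capacity`, `rate`, and `clock` are assumptions made for this example; the injectable clock is only there so the behavior can be checked without waiting.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock          # hypothetical hook, useful for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                # caller should reject or delay the request
```

A request is served only when `allow()` returns True, so short bursts up to `capacity` are accepted while the long-run rate stays at `rate` requests per second.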
Canary Test
Deploy the new version alongside the
current production version and route a
small share of traffic to the canary
version, then test and evaluate it before
rolling it out further.
Load Balancer
A load balancer distributes the incoming
load across servers so that no single
server is overstressed, which improves
overall application scalability.

A load balancer helps an application
architecture improve its performance,
scalability, and availability. Apart from
load distribution, load balancers offer
many other features such as service
discovery, health checking, logging, and
security.
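
One simple way to distribute load is round robin combined with health checking. The sketch below is illustrative only; `RoundRobinBalancer`, `mark_down`, `mark_up`, and `next_server` are hypothetical names, and real load balancers support many other strategies.

```python
import itertools


class RoundRobinBalancer:
    """Cycles through servers in order, skipping any marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        # Would normally be driven by a failed health check.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")
```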
Scrum
Scrum is an agile approach to developing
and releasing software products.

Scrum defines three key roles:

Product Owner
Scrum Master
Scrum team / Feature team / Development team
A/B Test
Route a subset of users to a new version
based on rules (such as geolocation,
selected users, browser version, and
other criteria) and test against it.
Distributed message queue
A producer puts a message or unit of work
on a queue, and consumers pick it up for
further processing. This is an important
concept for explaining how you can scale
your system and perform asynchronous
processing.
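
The producer and consumer roles can be sketched in a single process with Python's standard `queue` and `threading` modules. This is an in-process stand-in, not a distributed queue; the function `run_pipeline` and the squaring "work" are assumptions for illustration.

```python
import queue
import threading


def run_pipeline(items, workers=2):
    """Producer enqueues items; worker threads consume and process them."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def consumer():
        while True:
            item = q.get()
            if item is None:            # sentinel: no more work
                q.task_done()
                break
            with lock:
                results.append(item * item)   # placeholder for real work
            q.task_done()

    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for t in threads:
        t.start()
    for item in items:                  # the producer side
        q.put(item)
    for _ in threads:
        q.put(None)                     # one sentinel per consumer
    q.join()
    for t in threads:
        t.join()
    return sorted(results)
```

Because the producer only touches the queue, consumers can be added or removed without changing producer code, which is the scaling property the concept is about.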
DNS
Translates website names to IP addresses.
A DNS service works much like a phone
directory. Services like Route53 help
configure the domain and routing, and
one needs to understand the common
resource record types such as A, CNAME,
NS, and MX.
CDN
A content delivery network (CDN) sits
between the origin and the client.

A CDN is a group of proxy servers
distributed across the world and placed
at the network edge.

A CDN makes the product faster, reduces
overall latency, and enables
data-intensive applications, making them
highly available, scalable, reliable, and
secure.
Pub/Sub
Publish-subscribe messaging works
asynchronously and involves publishers,
topics, subscribers, and messages. It
helps decouple the system and improves
scalability, durability, etc.

A set of publishers publish information
to an event bus, and interested
subscribers consume the data; neither
has to know about the other. Picture a
group chat where some friends share
excellent news (publishers), and others
who want to know about it (subscribers)
read it.
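
The publisher, topic, and subscriber relationship can be sketched as a small in-memory event bus. The class `EventBus` and its methods are hypothetical names for this example; a production system would deliver messages asynchronously and across machines.

```python
from collections import defaultdict


class EventBus:
    """In-memory topic registry: publishers and subscribers never meet."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver to every handler registered for the topic.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)            # number of subscribers reached
```

For example, two friends subscribing to a "news" topic with `bus.subscribe("news", inbox.append)` each receive whatever is later passed to `bus.publish("news", ...)`.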
Distributed Caching
Caching servers store the data in memory
and return it to the end users, improving
the overall performance and end-user
experience. A fleet of servers works
together to keep and serve the data in a
distributed caching system, enhancing
availability.
Database
Understanding data storage, storage
types, and how to choose a database
(RDBMS vs. NoSQL) is vital when designing
a system. It is essential to deeply
understand which requirements an RDBMS
fits and which ones NoSQL fits.
Observability
It's important to know what is happening
in the system. How do we make the
system easier to debug and fix?

Observability helps drive the system's
maintainability, resiliency, and
reliability.
Task Scheduler
An essential component that decides how
tasks are prioritized and delegated and
how many resources are allocated to
them.
Unstructured Data Storage
Storing photos, videos, audio, and other
unstructured data is essential when
designing a system since these are a poor
fit for traditional databases.

One needs to create a highly available,
consistent, reliable, and efficient
solution.
Scaling Services
Scalability is an essential building block
of any robust system. Understanding the
various scaling options and which ones
to use is critical.

From serverless to ID generators, one
must understand the multiple options and
how to design them.
Rolling Test
The team releases a new version by
updating subsets of servers group by
group while testing during the rollout.
Distributed Search
Search is the core of many system
designs. Understanding how to build a
search system and its core building
blocks, including crawling and indexing,
is essential.
Blue-Green Test
Two identical environments run the
application. Deploy the new version to
one environment, test it, and then switch
traffic over so end users get the latest
version.
Circuit breaker
A circuit breaker handles faults in
remote services that have variable
recovery times, so that a failure does
not cascade and bring down the entire
system.
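
A minimal sketch of the idea, assuming the common closed / open / half-open state model. The names `CircuitBreaker`, `CircuitOpenError`, `max_failures`, and `reset_timeout` are illustrative, and the injectable `clock` exists only to make the behavior easy to check.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open on too many failures, or re-open if the trial call failed.
            if self.failures >= self.max_failures or self.opened_at is not None:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail immediately instead of waiting on a struggling remote service, which is what stops the failure from cascading.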
Tap Compare Test
Record the response traffic from the
existing and the newly launched
environments. Analyze and compare the
outcomes against predefined evaluation
criteria.
Synthetic Test
Automated tests, including UI, SSL, and
performance checks, run regularly against
the production environment to evaluate
functionality, performance, page loading
speed, 404 errors, and other critical
aspects, ensuring proper operation.
Throttling
Regulate resource usage for applications,
tenants, and services to avert any
overload that might impact the system.
Chaos Test
Tests that evaluate how the system
behaves during outages or failures. They
deliberately simulate many of these
failure conditions in the production
environment.
Centralized & Decentralized Pattern
In a centralized pattern, a server node
(also known as the controller node)
manages the work distribution, and the
worker nodes connected to it do the
actual work.

In a decentralized pattern, multiple
server nodes can control the system and
make their own decisions. Well-known
examples are blockchain and
cryptocurrencies.
Feature Flag Test
Feature flags let you turn features on
and off. With them, features are tested
in production under various conditions
to evaluate correctness, risk,
performance, and other aspects.
Layered Architecture
The system is divided into multiple layers
(presentation, business, data), making
components reusable and portable. Each
layer is a collection of modules. The
layers are dependent on each other.

Think of a computer system as a
sandwich with different layers. Each layer
has its job (presentation, business,
data), and they all work together.
Observability-Based Test
Traces, logs, and metrics provide vital
information; testing can also benefit from
this traceability. Captured traces can be
asserted, tests can be generated based
on these traces, and many other
validations can be carried out
Pipe & Filter
The pipe and filter pattern is a standard
method for organizing data
transformation systems.

In this pattern, individual filters carry
out specific transformations on input
data and pass the converted data to the
next filter through a pipe. Image and
text editors use this.
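
A small text-processing sketch of the pattern, using generators as filters and function composition as the pipe. The filter names and the `pipeline` helper are hypothetical, chosen only for this example.

```python
def strip_whitespace(lines):
    for line in lines:
        yield line.strip()


def drop_empty(lines):
    for line in lines:
        if line:
            yield line


def to_upper(lines):
    for line in lines:
        yield line.upper()


def pipeline(*filters):
    """Connect filters so each one's output feeds the next one's input."""
    def run(data):
        for f in filters:
            data = f(data)
        return data
    return run


clean = pipeline(strip_whitespace, drop_empty, to_upper)
```

Each filter knows nothing about its neighbors, so filters can be reordered, removed, or reused in other pipelines.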
Dogfooding Test
A strategy in which internal employees
test the new version of the product: 50%
of employee traffic is diverted to a few
production instances so that the product
can be evaluated before being released
to customers.
Hexagonal
Also known as the ports and adapters
architecture. Input reaches the
application through an adapter, and
output leaves the application through a
port to an adapter.
Shadow Test
Production traffic is captured and
replayed against the new version of the
release to test and evaluate
correctness, functionality, performance,
etc.
Monolithic Architecture
The business logic, UI, and data layers
are tightly coupled, integrated, and
deployed as one entity.
Distributed Test
Replicating the production environment
fully in QA environments is challenging.
Hence, it becomes vital to run some
tests, such as integration tests,
scalability tests, and disaster recovery
tests, in the production environment.
Microservices Architecture
The services are broken down into small
components that can be developed,
managed, and deployed independently.
MVC
The application is broken down into
model, view, and controller. The model
holds business logic, the view holds UI,
and the controller handles user input and
interactions.
Broker Pattern
A middleman called a broker will integrate
clients and servers. When a client
requires a service, it sends a request to
the broker, and then the broker forwards
it to the appropriate service.
Serverless
The underlying infrastructure is
completely abstracted so that the team
doesn't worry about provisioning,
deploying, and maintaining servers.
Developers can focus on developing the
applications.
MVVM
Presentation, business logic, and data
needs are segregated in this pattern.
The view model connects the model and
the view. It is like a superhero team
where each member has their own
extraordinary power.
Event Driven
A system or component acts or makes
changes in response to external events.
For example, a successful order can
trigger the generation of gift vouchers
for the customer.
Mean time to detection
MTTD is the amount of time between a
failure occurring and its detection,
when repair operations can begin.

MTTD can be reduced by having good
observability in place.
Mean time to repair
MTTR is the period of time during which
the workload is unavailable while the
failed subsystem is repaired or returned
to service.

MTTR can be reduced by having good
replication and automation in place.
Mean time between failure
MTBF is the average time between when a
workload begins normal operation and its
next failure.

MTBF can be increased by focusing more
on testing and putting the required tools
and processes in place for it.
Availability
The percentage of time the service is
accessible, so that every request
receives a non-error response.

A = Uptime / (Uptime + Downtime) * 100
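
A short worked example of the availability calculation. The function names are assumptions; the second helper uses the commonly cited form A = MTBF / (MTBF + MTTR), which treats mean time between failures as uptime and mean time to repair as downtime.

```python
def availability_pct(uptime, downtime):
    """Availability as a percentage of total time (same unit for both)."""
    total = uptime + downtime
    if total <= 0:
        raise ValueError("total time must be positive")
    return uptime / total * 100


def availability_from_mtbf(mtbf, mttr):
    """Same ratio, expressed with MTBF as uptime and MTTR as downtime."""
    return availability_pct(mtbf, mttr)


# 999 hours up and 1 hour down is roughly "three nines".
three_nines = availability_pct(999, 1)
```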


Reliability
The probability that a service will
continue to perform its intended function
for a specified time.
Scalability
The ability of a system to handle an
increase in workload without
compromising performance.
Maintainability
The probability that a system or service
is restored within a given time after a
fault occurs.

MTTR = Total Maintenance Time / Total Number of Repairs
Fault Tolerance
The ability of a system to continue to
operate and respond despite one of the
components failing

Fault tolerance techniques:

Replication
Checkpointing
Key-Value Store
Data is stored as a collection of
key-value pairs. Each key is unique and
maps to a specific value.

Scalability
Performance
Session management
Cache
Sequencer
Generates globally unique IDs to track
and identify events.

Range Handler
UUID
Vector Clocks
TrueTime
Blob Store
Used to store unstructured data such as
audio, video, and other multimedia. It
follows a flat hierarchy.

Scalable
Reliable
Available
Sharded Counters
A counter is split into a specified
number of shards running on various
nodes. Requests are forwarded to the
shards, and the values stored in all the
shards are summed to get the total.

Example: counting all the likes on a
celebrity's LinkedIn post.
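
A single-process sketch of the idea, where a list stands in for shards that would live on separate nodes. `ShardedCounter` and its methods are hypothetical names, and picking a shard at random is just one possible routing choice.

```python
import random


class ShardedCounter:
    """Spreads increments over several shards; reads sum all shards."""

    def __init__(self, num_shards=8, rng=None):
        self.shards = [0] * num_shards
        self._rng = rng or random.Random()

    def increment(self, amount=1):
        # Each write touches only one shard, reducing write contention.
        index = self._rng.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self):
        # Reads are more expensive: every shard must be consulted.
        return sum(self.shards)
```

The trade-off is cheap, contention-free writes in exchange for reads that must aggregate across all shards.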
Heartbeat
At regular intervals, nodes send a
message to indicate that they are
healthy. A node is considered failed if
it stops sending heartbeats.
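
A minimal failure detector along these lines might look as follows. `HeartbeatMonitor`, `timeout`, and the injectable `clock` are assumptions for the sketch; real systems also have to deal with network delay and clock differences.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports overdue nodes."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}

    def beat(self, node):
        # Called whenever a heartbeat message arrives from `node`.
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```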
Consistent Hashing
An algorithm for mapping keys to nodes
so that only a small fraction of keys
needs to be remapped when the number of
nodes changes. When nodes are added or
removed, only the keys near the affected
nodes are reassigned, and the rest of the
data distribution is not impacted.
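
A compact hash ring sketch showing that property. `HashRing`, the `vnodes` parameter (virtual nodes per server), and the use of MD5 as the hash are all choices made for this example rather than anything prescribed above.

```python
import bisect
import hashlib


def _hash(key):
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []                 # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place several virtual points per node to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key):
        if not self._ring:
            raise KeyError("ring is empty")
        # First point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

When a node is added, the only keys that change owner are the ones the new node takes over; every other key stays where it was.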
HTTP Long Polling
Used to exchange information between
client and server. The client sends a
request, and the server holds the
connection open and responds as soon as
data is available, so the client does not
have to keep asking.

The connection stays open until data is
sent or it reaches the timeout, after
which the client sends a new request.

With short polling, clients request
updates from the server at frequent
intervals, whether or not the server has
new data.
Web Sockets
Data can be sent and received
simultaneously by the client and the
server. It is a stateful protocol.

WebSocket facilitates real-time data
transfer and is used in chat systems.

A WebSocket initially sets up an HTTP
connection and then upgrades it to the
WebSocket protocol, enabling direct
transmission over the TCP channel.
Typeahead Search
It provides a list of suggestions to
complete a query in the search box for a
user, improving the overall user
experience.

The trie data structure is commonly used
to implement this.
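
A minimal trie sketch for prefix suggestions. `Trie`, `TrieNode`, `suggest`, and the `limit` parameter are hypothetical names; production typeahead systems usually also rank suggestions by popularity, which is left out here.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=5):
        """Return up to `limit` stored words starting with `prefix`."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push in reverse so words come out in alphabetical order.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```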
Multi-Tenancy
A SaaS architecture that allows multiple
users (tenants) to share the same system
while maintaining data isolation.
SaaS
SaaS is a business model that can be
delivered as multi-tenant or
single-tenant. Multi-tenancy is not the
same thing as SaaS.

Silo, Pool, and Bridge are some SaaS
models built and offered to meet
compliance, cost, security, performance,
and other requirements. Some tenants
are okay with a shared database, while
others need a dedicated one.
MapReduce
Initially developed by Google to batch
process large volumes of data. In the map
phase, the input is divided into small
chunks that multiple nodes process in
parallel, emitting key-value pairs.

The intermediate pairs are then grouped
and sorted by key, and in the reduce
phase the values for each key are
aggregated.
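
The classic word-count example shows the phases in miniature. Everything runs sequentially in one process here, and the function names are illustrative; in a real MapReduce job the map and reduce calls are spread over many nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key and sort the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Aggregate the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))
```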
Process, Threads
A process is a program loaded into
memory for execution; it has a process
ID, state, and so on.

A thread is a unit of execution within a
process, and a process can have multiple
threads.
SOAP, REST
Both are used for data exchange.

SOAP - A web communication protocol with
pre-defined rules and a highly structured
format. It can work with any transport
protocol and can be stateful. SOAP has
built-in compliance for

REST - An architectural style with loose
guidelines. It is typically used over
HTTP/HTTPS and is stateless.
SSL
A way to secure the connection and
communication between client and server
through encryption.

TLS is the newer, more secure successor
to SSL.

HTTPS is the secure version of the HTTP
protocol; SSL/TLS provides its
encryption.
Cache Eviction Policies
First In First Out (FIFO): The cache
removes the block that was added first,
without considering how often or how
recently it was accessed.

Last In First Out (LIFO): The cache
removes the most recently added block,
without considering how often or how
recently it was accessed.

Least Recently Used (LRU): Removes the
least recently used items.

Most Recently Used (MRU): Removes the
most recently used items.

Least Frequently Used (LFU): Removes the
least frequently used items, tracked by
counting accesses.

Window TinyLFU (W-TinyLFU): Prioritizes
cache entries based on recency and
frequency of access to retain the most
relevant data.

Time To Live (TTL): Cache entries expire
after a set time period, regardless of
access frequency or recency.
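
As an example of one of these policies, here is a small LRU cache sketch built on `collections.OrderedDict`. The class name `LRUCache` and its `get` / `put` methods are assumptions for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default              # cache miss
        self._data.move_to_end(key)     # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # drop the oldest entry
```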
Symmetric Encryption
A mechanism for encoding data that uses
the same key for encryption and
decryption.

The data can be read by anybody who has
access to the key.
Asymmetric Encryption
A mechanism for encoding data that uses
a public key and a private key, which
together form a key pair.

The public key is available to all,
whereas the private key must remain
confidential.
Cache Hit & Cache Miss
When a component requests data, the
system first checks the cache. If the
requested data is found in the cache, it's
considered a cache hit.

It's called a cache miss if the data is
not found in the cache.
Reverse Proxy
A proxy server located between the
client and the backend servers that acts
as a gateway for client requests.

It directs requests to the appropriate
backend server.

Load balancing
Caching
SSL
Forward Proxy
A proxy server located between the
client and the internet.

It forwards requests to the internet and
returns the responses to the client.

Anonymity
Content Filtering
Caching
Websocket
WebSocket is a communication protocol
enabling full-duplex asynchronous
communication over a single TCP
connection.

The client and the server can send and
receive data simultaneously.

WebSocket is a stateful protocol. After
the initial handshake it has less
per-message overhead than repeated HTTP
requests, which makes it faster for
frequent exchanges.
HTTP Push
A communication method where the
server pushes the data to the client
without the client explicitly requesting it

Message Application
Stock App
HTTP Pull
A communication method where the client
requests the data and the server sends
it.

Message Application
Stock App
Distributed Cache
Temporary data storage that serves data
faster to end users by keeping it in
memory.

Multiple servers coordinate to form a
distributed cache that stores and serves
data, which also enhances scalability
and availability as the data grows.
TrueTime
TrueTime, a distributed clock available on
all Google servers, enables applications
to generate monotonically increasing
timestamps: a timestamp T is always
greater than any timestamp T' generated
before T, across all servers.

Knowing about TrueTime is vital when
designing a sequencer.
API Gateway
An essential element facilitating
communication between clients and
servers by managing API requests and
routing them efficiently, consolidating
the requests for processing.

Key features are:

1. Authorization and access control
2. Traffic management
3. Throttling
4. API monitoring
Quorum
In a distributed system, consensus
among nodes is crucial to avoid
inconsistencies.

For instance, a predefined minimum
number of participating nodes must
commit to an operation before it is
declared successful.
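
Two small checks illustrate the arithmetic. A majority quorum is one common choice for the "predefined minimum", and the rule R + W > N is the usual condition for read and write quorums to overlap; neither is stated above, so treat both helpers and their names as assumptions.

```python
def has_majority(acks, total_nodes):
    """True when strictly more than half of the nodes acknowledged."""
    return acks >= total_nodes // 2 + 1


def quorums_overlap(n, r, w):
    """With N replicas, reads of R nodes and writes of W nodes overlap
    on at least one node whenever R + W > N."""
    return r + w > n
```

For example, with five nodes an operation needs three acknowledgements to reach a majority, and reading two replicas after writing to four guarantees that at least one read sees the latest write.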
ACID Transactions
In a relational database management
system (RDBMS), four key
characteristics ensure reliable
transaction behavior and keep the data
consistent.

Atomicity: Transaction operations are
executed entirely or not at all.

Consistency: The DB remains valid
despite transaction failures or aborts.

Isolation: One transaction's execution
doesn't interfere with another's.

Durability: Committed transactions
persist despite failures.
Idempotency
An operation is idempotent if the result
stays the same regardless of how many
times it is applied after the initial
execution.

Applying it once or multiple times does
not change the result.
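
One common way to make a non-idempotent action safe to retry is an idempotency key. The sketch below is hypothetical: `PaymentService`, `charge`, and the key format are invented for the example, and a real service would persist the seen keys.

```python
class PaymentService:
    """Replays the stored result when the same request key is seen again."""

    def __init__(self):
        self.balance = 0
        self._seen = {}                 # idempotency key -> earlier result

    def charge(self, idempotency_key, amount):
        if idempotency_key in self._seen:
            # Retry of an earlier request: do not apply the charge twice.
            return self._seen[idempotency_key]
        self.balance += amount
        result = {"charged": amount, "balance": self.balance}
        self._seen[idempotency_key] = result
        return result
```

A client that times out and retries with the same key gets the original response back, and the balance changes only once.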
Consensus
Agreement among nodes is necessary in a
distributed system for leader election,
data replication, blockchain, and
distributed locking and transactions.

Some of the algorithms used are:

Paxos
Raft
Proof of Work
Concurrency
A strategy for dealing with multiple
tasks or processes that execute at
overlapping times or in parallel.

Optimistic concurrency - Allows
concurrent execution without locks and
checks for conflicts before committing.

Pessimistic concurrency - Locks are
acquired before operations are
performed.
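
Optimistic concurrency is often implemented with version numbers: a write succeeds only if the record has not changed since it was read. The sketch below uses hypothetical names (`VersionedStore`, `VersionConflict`) and is single-threaded; in a real store the version check and the write must happen atomically.

```python
class VersionConflict(Exception):
    """Raised when the record changed since the caller last read it."""


class VersionedStore:
    def __init__(self):
        self._data = {}                 # key -> (version, value)

    def read(self, key):
        # Unknown keys start at version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1
```

A writer that hits `VersionConflict` re-reads the record and retries, which works well when conflicts are rare; pessimistic locking avoids the retry at the cost of holding locks.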
Distributed Coordination
Enables nodes in a distributed system to
coordinate their operations, driving
synchronization and consistency.
Failover
A backup system or service that takes
over when the primary has failed.
Micro-Frontend Architecture
The frontend components can be built,
tested, deployed, and released
independently as part of the
micro-frontend architecture.

To the user, all the independent
components are bundled and displayed as
one single UI, but in reality the user is
interacting with independent child
components. Each component is called a
micro-frontend and drives a specific
business functionality.
