OOPs and System Design
Object-Oriented Programming (OOP) is a way of writing and organizing code in modern programming languages like C++, Java, Python, PHP, C#, and more. Its main focus is on classes and objects, which allow us to structure programs in a way that mirrors real-world things and ideas.
1. Classes: A blueprint or template for creating objects. Think of a class like a blueprint for a house—you
can create multiple houses (objects) from the same blueprint.
2. Objects: Instances created from a class. If the class is a blueprint for a car, the actual car built from
that blueprint is an object.
Real-World Example:
Each car (object) is built from the car blueprint (class), but they can have different colors, engine sizes, etc.
The goal of OOP is to combine data (like variables) and functions (actions or methods) that work on that data
into a single unit, called an object. This makes the code safer and easier to manage. Only the specific
functions in that object can change the data, so other parts of the code can’t mess with it.
Example:
● A car’s speed might be private data inside the object, and only a specific function like accelerate()
or brake() can change the speed.
● This prevents anyone from accidentally changing the car’s speed without using the proper methods.
● Encapsulation: Protects the data by keeping it inside the object and only allowing specific functions to
access it.
● Inheritance: Allows one class to inherit features (like attributes and methods) from another class,
making code reuse easier.
● Polymorphism: Lets objects of different classes be treated as objects of a common class (e.g., both
cars and bikes are vehicles but behave differently).
● Abstraction: Simplifies complex systems by hiding unnecessary details and showing only the essential
features.
In simple terms, OOP organizes code to make it more secure, reusable, and easy to manage, just like how
real-world objects (like cars or phones) have specific behaviors and properties that are protected from random
changes.
"The goal of OOP is to combine data (like variables) and functions (actions or methods) that work on
that data into a single unit, called an object."
OOP combines data (variables) and functions (methods) into a single unit called an object.
● Data (Variables): Characteristics of the object, e.g., a car's color, speed, and fuel level.
● Functions (Methods): Actions the object can perform, e.g., accelerate, brake, refuel.
● Object: A combination of data and functions. A car object includes both its properties and actions.
By combining these into one object, the car's behavior and data are grouped together, making it easier to
manage and protect. It helps organize code and ensures that data is modified only through specific actions.
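The code example these notes refer to does not appear here, so below is a minimal sketch of what such a Car class might look like in C++. The member names (color, speed, fuelLevel) and methods (accelerate(), brake(), refuel()) follow the surrounding text; everything else is an assumption.

#include <iostream>
#include <string>

class Car {
private:
    // Internal state: hidden from the outside world (encapsulation)
    std::string color;
    int speed;      // km/h
    int fuelLevel;  // percentage, 0-100

public:
    Car(const std::string& c) : color(c), speed(0), fuelLevel(100) {}

    // Controlled access: speed and fuel can only change through these methods
    void accelerate(int amount) {
        if (amount > 0) speed += amount;
    }
    void brake(int amount) {
        speed -= amount;
        if (speed < 0) speed = 0;  // never allow an invalid (negative) speed
    }
    void refuel(int amount) {
        fuelLevel += amount;
        if (fuelLevel > 100) fuelLevel = 100;  // keep fuel within 0-100%
    }
    void showStatus() const {
        std::cout << color << " car: " << speed << " km/h, "
                  << fuelLevel << "% fuel\n";
    }
};

int main() {
    Car myCar("red");
    myCar.accelerate(50);
    myCar.brake(60);   // speed is clamped to 0, not -10
    myCar.showStatus();
}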
In this example, the Car class combines both the data (like color, speed, fuel level) and the functions (like
accelerate, brake, refuel) into one object. This encapsulation of data and methods in a single unit makes the
object easy to manage, and the data is only modified through specific functions, maintaining safety and control
over how the car’s state changes.
1. Encapsulation:
Encapsulation is a key principle in OOP that helps protect the data and ensure security by hiding the internal
state of the object and only allowing it to be changed through specific methods.
In the Car example above, the data members (color, speed, and fuelLevel) are private, meaning they cannot be directly accessed or modified from outside the Car class. This ensures that the internal state of the car (e.g., its speed or fuel level) cannot be accidentally or maliciously altered by other parts of the program.
1. Data Protection:
○ By making the data private, you ensure that only the class’s methods can modify the data. In
the example, speed and fuelLevel can only be changed using the accelerate(),
brake(), or refuel() methods.
○ This ensures that the car’s speed doesn’t accidentally get set to a negative number or some
invalid value, as it is controlled by the class logic.
2. Controlled Access:
○ Only methods in the class can change the data. So if a program wants to change the car’s
speed, it has to do so in a controlled way through accelerate() or brake() methods.
○ This allows you to validate the input before changing the data. For example, in the brake()
function, if the speed goes below zero, we set it back to zero. This prevents the car from having
an invalid speed like -10.
3. Prevents Unexpected Behavior:
○ Imagine if you could modify the car’s fuel level directly without checks. You could accidentally
set the fuel level to something nonsensical, like 200%, which would break the logic of your
program.
○ In the refuel() method, we make sure the fuel level stays within a valid range (0% to 100%).
If you allow any random part of the program to change the fuel level directly, it can introduce
bugs or security vulnerabilities.
2. Abstraction:
Encapsulation also ties into abstraction, where you expose only the necessary details (the methods) and hide
the internal implementation from the outside world. In the Car class example:
● You don't need to know how the car increases speed or how the fuel system works. You just call the
accelerate() or refuel() method.
● This prevents external code from messing with the internal logic of the class.
Conclusion:
The reason OOP is secure is because of encapsulation—which keeps data hidden and only allows controlled
access through specific methods. This keeps the internal state of an object safe from unexpected changes and
makes the program more robust and less prone to errors.
If anyone tries to bypass these rules (like directly changing the speed), the compiler throws an error. This
ensures that data is protected and only modified in a controlled manner.
When we say "data is kept inside the object", we are really referring to the fact that the class defines the
structure and behavior (data and methods), and when we create an object from that class, the actual data is
stored inside that object.
1. Class:
○ Acts as a blueprint that defines the structure (data members) and behavior (methods).
○ Data members are defined but don’t hold actual values yet.
2. Object:
○ When an object is created, it gets its own copy of the data members defined in the class.
○ The data (specific values) is stored inside the object, not the class.
1. Data Protection:
○ The object stores the data safely and only allows changes via controlled methods.
2. Object-Specific Data:
○ Each object maintains its own data, ensuring independence and flexibility.
3. Organized Code:
○ Grouping data and methods within an object simplifies management and prevents unintended
interference.
A class is like a blueprint for creating objects. Think of it as a template that describes what properties
(variables) and behaviors (functions) an object will have. It doesn't take up any memory itself until you actually
create an object from it.
Static vs. Dynamic Allocation:
1. Static Allocation: The object is created on the stack (e.g., Car myCar;), and its memory is managed automatically.
2. Dynamic Allocation: The object is created on the heap using the new keyword (e.g., Car* carPtr = new Car();), and its memory must be freed explicitly with delete carPtr;.
Key Differences:
● Static Object: Memory is automatically managed; the object is destroyed when it goes out of scope.
● Dynamic Object: Memory must be explicitly deallocated using delete.
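The usage example that belonged here appears to have been lost; below is a minimal sketch, assuming the Car class from the sketch above and the carPtr pointer named in the text:

int main() {
    // Static allocation: the object lives on the stack
    Car staticCar("blue");            // destroyed automatically at end of scope

    // Dynamic allocation: the object lives on the heap
    Car* carPtr = new Car("green");   // must be freed manually
    carPtr->accelerate(30);

    delete carPtr;   // explicit deallocation; forgetting this leaks memory
    return 0;
}   // staticCar goes out of scope here and is destroyed automatically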
1. Abstraction: In programming, abstraction hides the implementation details and only shows the functionality.
2. Inheritance: Allows one class to acquire the attributes and methods of another class, enabling code reuse.
3. Polymorphism: Allows the same interface to behave differently depending on the object using it.
4. Encapsulation: Binds data and the methods that work on that data into a single unit and restricts direct access to it.
Recap:
1. Easier development and maintenance: Dividing code into classes and objects makes large projects
manageable and maintainable.
2. Data hiding and security: Encapsulation keeps sensitive data safe and protected.
3. Solving real-world problems: Objects mirror real-world entities, making it intuitive to model complex
problems.
4. Code reusability: Inheritance allows the reuse of code across different parts of your system.
5. Generic code: Polymorphism lets you write methods that work for different types of data, reducing
repetition.
In conclusion, OOP makes programming more efficient, organized, secure, and scalable, which is
essential for building complex and robust applications.
"You need to have a class before you can create an object. When a class is defined, no memory is
allocated, but memory is allocated when it is instantiated (i.e., an object is created)."
In Object-Oriented Programming (OOP), Access Specifiers control how the members (data and methods) of
a class are accessed. They determine whether the data or functions of a class can be used directly by code
outside the class or from other classes.
Access Specifiers help enforce the principle of encapsulation, which keeps sensitive data hidden from
unauthorized access and defines how objects interact with each other.
There are three access specifiers in C++:
1. Public: Members can be accessed from anywhere in the program, including outside the class.
2. Private: Members can only be accessed by the member functions of the same class.
3. Protected: Members behave like private members, except that they are also accessible within derived classes.
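A minimal sketch showing how the three specifiers behave (the Base/Derived classes here are hypothetical, not from the original notes):

class Base {
private:
    int secret;     // accessible only inside Base
protected:
    int shared;     // accessible inside Base and inside classes derived from it
public:
    int open;       // accessible from anywhere
    void setSecret(int s) { secret = s; }  // controlled access to private data
};

class Derived : public Base {
public:
    void touch() {
        shared = 1;    // OK: protected members are visible in derived classes
        open = 2;      // OK: public
        // secret = 3; // error: private members of Base are not accessible here
    }
};

int main() {
    Derived d;
    d.open = 5;        // OK from outside: public
    d.setSecret(7);    // OK: private data changed through a public method
    // d.shared = 6;   // error: protected is not accessible from outside
    d.touch();
}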
The four core OOP concepts, for reference:
○ Inheritance: Allows classes to inherit properties and behaviors from other classes.
○ Encapsulation: Combines data and methods into a single unit and restricts access to some of
the object's components.
○ Polymorphism: Allows objects to be treated as instances of their parent class. It provides the
ability to call the correct method depending on the object.
○ Data Abstraction: Provides only essential information to the outside world and hides the
background details.
A constructor is a special member function of a class that is called automatically when an object is created. It
is used to initialize the object's data members. There are three types of constructors in C++:
Types of Constructors:
1. Default Constructor:
a. A constructor that doesn't take any arguments.
b. If not explicitly defined, the compiler provides a default constructor.
2. Parameterized Constructor:
● Takes parameters to initialize an object with specific values.
3. Copy Constructor:
a. A constructor that initializes an object by copying another object of the same class.
b. The copy constructor is invoked when an object is initialized using another object.
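The constructor examples themselves are not included in these notes; here is a minimal sketch with a hypothetical Student class:

#include <string>

class Student {
    std::string name;
    int age;
public:
    // 1. Default constructor: no arguments
    Student() : name("unknown"), age(0) {}

    // 2. Parameterized constructor: initializes with specific values
    Student(const std::string& n, int a) : name(n), age(a) {}

    // 3. Copy constructor: initializes from another Student
    Student(const Student& other) : name(other.name), age(other.age) {}
};

int main() {
    Student a;               // default constructor
    Student b("Asha", 21);   // parameterized constructor
    Student c(b);            // copy constructor (equivalently: Student c = b;)
}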
Copy Constructor Flow
A copy constructor is a special constructor in C++ that is used to create a new object as a copy of an existing object. The flow of a copy constructor involves the following key steps:
1. A new object is declared and initialized from an existing object of the same class.
2. The compiler invokes the copy constructor, passing the existing object to it by constant reference.
3. Each data member of the new object is initialized from the corresponding member of the source object.
In object-oriented programming, when copying an object, a shallow copy copies the object's fields as-is,
including pointers or references to the same memory addresses. A deep copy, on the other hand, creates a
completely independent copy of the object, including the memory that the object references, so the new object
is a separate entity in memory.
A shallow copy is the default behavior when copying objects. It copies the values of data members, including
pointers. If the data members are pointers, both the original and the copied objects point to the same memory
location. This can lead to issues like double-free errors.
A deep copy creates a new memory allocation for the copied object. This ensures that both objects have their
own separate memory, and changes to one object do not affect the other.
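The shallow/deep copy code is missing from these notes; the sketch below (hypothetical Buffer class) shows the double-free problem a shallow copy would cause and the deep-copy fix:

#include <cstring>

class Buffer {
    char* data;
public:
    Buffer(const char* s) {
        data = new char[std::strlen(s) + 1];
        std::strcpy(data, s);
    }

    // Deep copy: allocate fresh memory instead of copying the pointer.
    // Without this, the compiler-generated (shallow) copy constructor would
    // make both objects point at the same buffer, and both destructors would
    // then delete it -- a double-free error.
    Buffer(const Buffer& other) {
        data = new char[std::strlen(other.data) + 1];
        std::strcpy(data, other.data);
    }

    ~Buffer() { delete[] data; }
};

int main() {
    Buffer a("hello");
    Buffer b(a);    // deep copy: b owns its own, separate memory
}                   // both destructors run safely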
Summary:
● Use a shallow copy when your class does not manage dynamic memory (i.e., doesn't use pointers).
● Use a deep copy when your class uses dynamic memory allocation to avoid issues like double-free or
unintended data sharing between objects.
Constructor Overloading in C++
Constructor overloading allows a class to have multiple constructors, each with a different set of parameters.
The compiler differentiates these constructors based on the number and type of parameters passed when
creating an object. This allows objects to be initialized in different ways, depending on the available
information.
Key Points:
1. Overloaded constructors must have different signatures (different numbers or types of parameters).
2. The appropriate constructor is invoked based on the arguments provided when an object is created.
3. Overloaded constructors enable flexibility in object initialization, depending on the needs of the
application.
In the example below, the smartphone class has three constructors with varying numbers of parameters:
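The example code itself is missing here; the sketch below is reconstructed from the field names and default values given in the explanation, so the exact signatures are assumptions:

#include <string>

class smartphone {
    std::string model;
    int year_of_manufacture;
    bool _5g_supported;
public:
    // Constructor with 0 parameters: default values (unknown, 0, false)
    smartphone()
        : model("unknown"), year_of_manufacture(0), _5g_supported(false) {}

    // Constructor with 2 parameters: year_of_manufacture keeps its default (0)
    smartphone(const std::string& m, bool fiveG)
        : model(m), year_of_manufacture(0), _5g_supported(fiveG) {}

    // Constructor with 3 parameters: fully initializes the object
    smartphone(const std::string& m, int year, bool fiveG)
        : model(m), year_of_manufacture(year), _5g_supported(fiveG) {}
};

int main() {
    smartphone s1;                        // 0-parameter constructor
    smartphone s2("Pixel", true);         // 2-parameter constructor
    smartphone s3("Pixel", 2023, true);   // 3-parameter constructor
}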
Explanation:
● Constructor with 0 parameters: Initializes the object with default values (unknown, 0, false).
● Constructor with 2 parameters: Initializes the model and _5g_supported fields while leaving
year_of_manufacture with a default value (0).
● Constructor with 3 parameters: Fully initializes the object with all the provided values for model,
year_of_manufacture, and _5g_supported.
Key Characteristics of Destructors:
● A destructor's name is the same as the class name, prefixed by a tilde (~).
● There can only be one destructor in a class, and it cannot be overloaded.
● Destructors do not accept parameters and have no return type.
● They are typically declared in the public section of the class.
● If a destructor is not explicitly defined, the C++ compiler provides a default destructor.
Destructor Rules:
1. Tilde Symbol: The name of the destructor should begin with a tilde (~).
2. One Destructor: There can be only one destructor in a class, and it cannot be overloaded.
3. No Parameters: Destructors cannot take any parameters.
4. No Return Type: Destructors do not have any return type, not even void.
5. Automatic Invocation: Destructors are automatically called by the compiler, and programmers cannot
call them directly.
6. Compiler-Generated Destructor: If you don't define a destructor, the compiler automatically generates
a default one, which performs a shallow cleanup (destroying non-dynamically allocated members).
● If an object is created using new or dynamically allocates memory in its constructor, the destructor
should use delete to free that memory when the object is destroyed.
● Constructors and destructors do not have a return type, not even void. They are special member
functions whose purpose is to initialize and clean up resources for an object.
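A minimal sketch (hypothetical Resource class) of a destructor releasing dynamically allocated memory:

class Resource {
    int* data;
public:
    Resource() { data = new int[100]; }  // acquire memory in the constructor

    ~Resource() {          // note the tilde; no parameters, no return type
        delete[] data;     // release the memory when the object is destroyed
    }
};

int main() {
    Resource r;   // constructor runs here
}                 // destructor runs automatically when r goes out of scope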
The this Pointer:
● In the set_details method, both the local variables (model, year_of_manufacture) and the instance variables of the mobile class share the same name.
● The this pointer is used to distinguish between the local variables and the instance variables. For
example, this->model refers to the instance variable model of the current object, while model
(without this) refers to the local parameter.
● The this pointer is implicitly available in every member function and points to the object that invoked
the method. It is necessary to avoid ambiguity and to make clear that the assignment is intended for the
instance variables, not the local parameters.
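The set_details code is not included in these notes; here is a reconstructed sketch, assuming a mobile class with the members described above:

#include <string>

class mobile {
    std::string model;
    int year_of_manufacture;
public:
    void set_details(const std::string& model, int year_of_manufacture) {
        // 'this->model' is the instance variable; plain 'model' is the parameter
        this->model = model;
        this->year_of_manufacture = year_of_manufacture;
    }
};

int main() {
    mobile m;
    m.set_details("Nokia 3310", 2000);  // inside set_details, 'this' points to m
}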
Summary:
1. Method Chaining: You return the current object (*this) from a method so you can call multiple methods on the
same object in a single statement.
2. Passing the Current Object: You pass the current object to another method or function by dereferencing the
this pointer (*this), which allows the function to operate on the invoking object.
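A minimal sketch of method chaining via *this (hypothetical Counter class):

#include <iostream>

class Counter {
    int value = 0;
public:
    // Returning *this by reference lets calls be chained on the same object
    Counter& add(int n) { value += n; return *this; }
    Counter& reset()    { value = 0;  return *this; }
    void print() const  { std::cout << value << "\n"; }
};

int main() {
    Counter c;
    c.add(5).add(10).print();   // method chaining: each call returns c itself
    c.reset().add(3).print();
}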
In C++, whenever an object calls a member function, it implicitly passes its own address (reference to itself) to that
function using the this pointer. This this pointer holds the memory address of the current object and allows the
member function to know which object's data members are being referred to during the execution.
Key Points:
1. Implicit Object Passing: When you call a member function using an object, the object’s address is implicitly
passed to the function through the this pointer. This lets the function know which specific object it is working on,
especially when dealing with multiple objects.
2. Disambiguation: When a member function has local variables with the same name as the class's data members,
there can be ambiguity. The this pointer allows the function to distinguish between the local variables and the
object’s data members by using this-> to explicitly refer to the object's members.
Non-static member functions belong to a specific object of the class. They can access both static and
non-static data members of the class. A non-static member function is invoked using the object of the class,
and the this pointer is automatically passed to it, which points to the calling object.
"The this pointer is available only within the non-static member functions of a class. If the member
function is static, it will be common to all the objects, and hence a single object can’t refer to those
functions independently."
● Non-static Functions: These are called by individual objects. The this pointer helps the function identify the
calling object and access its specific members. Each object has its own copy of non-static data members, and the
this pointer ensures the correct object is referred to.
● Static Functions: These are common to the class and not tied to any specific object. Since static functions do not
operate on individual objects, the this pointer is not needed. Static functions can only operate on static data
members, which are shared by all instances of the class. Therefore, no particular object can invoke a static
function independently, which is why the this pointer isn't available in them.
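A minimal sketch (hypothetical Widget class) illustrating why this is unavailable in static member functions:

#include <iostream>

class Widget {
    int id;             // non-static: one copy per object
    static int count;   // static: one copy shared by all objects
public:
    Widget(int i) : id(i) { ++count; }

    int getId() const { return this->id; }  // non-static: 'this' points to the caller

    static int getCount() {
        // return this->id;  // error: 'this' does not exist in a static function
        return count;        // static functions may only touch static members
    }
};

int Widget::count = 0;

int main() {
    Widget a(1), b(2);
    std::cout << a.getId() << " of " << Widget::getCount() << " widgets\n";
}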
In C++, when you do not explicitly define a copy constructor or assignment operator, the compiler
automatically provides them. These are known as the default copy constructor and default assignment
operator, and they perform a shallow copy by default.
Implicit Shallow Copy: The default behavior of the compiler-generated copy constructor and assignment
operator leads to a shallow copy unless explicitly overridden by the programmer to perform a deep copy.
Encapsulation in C++:
Encapsulation is a fundamental concept in object-oriented programming (OOP) that combines data and the
methods that manipulate that data into a single entity, typically a class. This provides two major benefits:
Data Hiding: Prevents direct access to the object's internal state from outside the class.
Data Control: Controls how data is accessed or modified using methods.
Key Points:
Private Members: Encapsulated data (variables) that cannot be accessed directly from outside the class.
Public Methods: Functions that provide controlled access to the private members, such as get and set
methods.
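A minimal sketch of get/set-style controlled access (hypothetical Account class):

#include <iostream>

class Account {
private:
    double balance = 0;   // private: cannot be touched directly from outside
public:
    double getBalance() const { return balance; }  // controlled read (getter)
    void deposit(double amount) {                  // controlled write (setter-style)
        if (amount > 0) balance += amount;         // validation lives in one place
    }
};

int main() {
    Account acc;
    acc.deposit(500);
    // acc.balance = -100;   // error: 'balance' is private
    std::cout << acc.getBalance() << "\n";   // prints 500
}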
Encapsulation ensures that the internal representation of an object is hidden from the outside. Only the
methods of the object can manipulate the data, thus protecting the object from unintended interference or
misuse.
Benefits of Encapsulation:
1. Data Security: By hiding the data, you protect the class’s internal state from being altered directly from
outside, ensuring integrity.
2. Modularization: The implementation details are hidden, allowing you to change how the class operates
internally without affecting other parts of the code.
3. Maintainability: By controlling how the data is accessed and modified, encapsulation leads to better
code organization and reduces complexity.
Abstraction in C++:
Abstraction is one of the core principles of Object-Oriented Programming (OOP), which focuses on showing
only essential details to the user and hiding the internal implementation of functionalities. It simplifies complex
systems by reducing the visible complexity and only presenting the necessary interfaces to interact with an
object.
Key Points:
1. Hides Complexity: The user does not need to know how the functionality is implemented, only how to
use it.
2. Provides Relevant Information: Only the necessary details (like methods and interfaces) are exposed
to the user.
3. Implemented via Classes: In C++, abstraction is implemented using classes and access specifiers
(private, public, protected).
Real-Life Example:
When you send an email, you only see a "Send" button. You don’t know (or need to know) the technical steps
that happen behind the scenes to send the email. This is an example of abstraction where the complex
process of sending data is hidden from the user.
Advantages of Abstraction:
1. Data Security: It restricts direct access to data, ensuring data protection and security.
2. Reduces Complexity: Users interact with the object through a simple interface, without needing to
understand the underlying implementation.
3. Increases Reusability: The same class can be used in different contexts without exposing internal
details.
4. Avoids Duplication: By hiding the implementation, the same functionality can be reused without code
duplication.
● Private: Members declared as private are hidden from the outside world. They can only be accessed
by the member functions of the class.
● Public: Members declared as public can be accessed from outside the class.
Access specifiers help in defining what part of the class is abstracted from the user and what is available for
external use.
Inheritance in C++
Inheritance allows one class (the derived class) to acquire the properties and behaviors of another class (the base class).
Modes of Inheritance:
● Public Inheritance: The public members of the base class remain public in the derived class, and
protected members stay protected.
● Protected Inheritance: The public and protected members of the base class become protected in the
derived class.
● Private Inheritance: The public and protected members of the base class become private in the
derived class.
Benefits of Inheritance:
1. Code Reusability: Inherited classes can reuse the functionality of the base class.
2. Reduced Redundancy: Common features are implemented once in the base class and reused in
derived classes.
3. Easier Maintenance: Changes in the base class automatically propagate to derived classes.
4. Faster Development: New classes can be built on top of existing ones without starting from scratch.
Types of Inheritance:
1. Single Inheritance: One class inherits from one base class.
2. Multiple Inheritance: A class can inherit from more than one base class.
3. Multilevel Inheritance: A derived class acts as a base class for another class.
4. Hierarchical Inheritance: One base class serves as the parent class for multiple derived classes.
5. Hybrid Inheritance: A combination of more than one type of inheritance.
Real-Life Example
Consider a person who embodies different roles at the same time: for example, the same person can be a parent at home, an employee at the office, and a customer in a shop. In different contexts, the same person exhibits different behaviors or characteristics, which is a practical illustration of polymorphism.
Types of Polymorphism in C++
There are two main types of polymorphism in C++: compile-time polymorphism and runtime polymorphism.
Compile-Time Polymorphism:
a) Function Overloading
b) Operator Overloading
Runtime Polymorphism:
a) Method Overriding
● Both the parent and child class methods must have the same name.
● Both methods must have the same parameters (signature).
● It is applicable through inheritance.
Explanation
● When you try to call show() inside D::display(), the compiler raises an error because it cannot
determine whether to call B::show() or C::show().
○ Explicit Call Test: Demonstrates how to resolve ambiguity by explicitly specifying which base
class’s method to call.
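The code this explanation refers to is missing; below is a reconstructed sketch of the ambiguity (the class names B, C, D come from the explanation; their contents are assumptions):

#include <iostream>

class B { public: void show() { std::cout << "B::show\n"; } };
class C { public: void show() { std::cout << "C::show\n"; } };

// D inherits from both B and C (multiple inheritance)
class D : public B, public C {
public:
    void display() {
        // show();   // error: ambiguous -- B::show() or C::show()?
        B::show();   // explicit call test: resolve by naming the base class
        C::show();
    }
};

int main() {
    D d;
    d.display();   // prints B::show, then C::show
}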
Summary
● Virtual functions allow you to call derived class methods through base class pointers, enabling
runtime polymorphism.
● Pure virtual functions make a class abstract and require derived classes to provide implementations,
ensuring that they cannot be instantiated without implementing the required methods.
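A minimal sketch of both ideas (hypothetical Shape hierarchy): the pure virtual function makes Shape abstract, and calls through a base-class pointer resolve to the derived implementation at runtime:

#include <iostream>

class Shape {
public:
    virtual void show() = 0;   // pure virtual: derived classes must implement it
    virtual ~Shape() {}        // virtual destructor for safe deletion via base pointer
};

class Circle : public Shape {
public:
    void show() override { std::cout << "Circle\n"; }
};

class Square : public Shape {
public:
    void show() override { std::cout << "Square\n"; }
};

int main() {
    // Shape s;   // error: cannot instantiate an abstract class
    Shape* shapes[] = { new Circle(), new Square() };
    for (Shape* s : shapes) {
        s->show();   // runtime polymorphism: the derived method is chosen
        delete s;
    }
}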
Important Notes
● If the derived class does not provide an implementation for the pure virtual function, it will also be
treated as an abstract class and cannot be instantiated.
● Abstract classes are commonly used in scenarios where you want to define a common interface for
multiple derived classes without providing a complete implementation in the base class.
Friend Functions
A friend function is not a member of a class, yet it is granted access to the class's private and protected members.
Use Cases
● Operator Overloading: Friend functions are often used in operator overloading to enable operators to
access private members of the classes they operate on.
● Non-Member Functions: When you want to create functions that operate on class objects but do not
logically belong to any particular class.
● Global Function: Can be defined outside any class and still access private members of the
class it is a friend of.
● Member of Another Class: A member function of another class can also be declared as a
friend and can access the private members of the class it is a friend of.
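A minimal sketch of a friend function (hypothetical Box class): a global function granted access to private members:

#include <iostream>

class Box {
private:
    int width = 10;

    // Declaring a global function as a friend lets it access private members
    friend void printWidth(const Box& b);
};

// A non-member (global) function, yet it can read Box's private 'width'
void printWidth(const Box& b) {
    std::cout << "width = " << b.width << "\n";
}

int main() {
    Box box;
    printWidth(box);   // prints width = 10
}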
Explanation: Virtual functions are used to achieve runtime polymorphism, while constructors are used for
object initialization. Since constructors must always be called to create an instance of a class, it doesn't make
sense to have a virtual constructor. However, you can use virtual destructors to ensure proper cleanup of
derived class resources when an object is deleted through a base class pointer.
SYSTEM DESIGN
What is System Design? System design is the process of defining different parts or "elements" that make up
a large application or system, organizing how these parts work together to meet specific requirements. Think of
it as creating a blueprint that shows how all the components of an application fit and communicate with each
other to perform a set of tasks effectively.
Key Elements of System Design:
1. Architecture: The overall structure of the system. It’s like the backbone that connects all parts. We
design the architecture to determine how different modules or components will interact.
2. Modules: These are independent parts of the system that handle specific tasks. For example, in a
shopping website, the payment module, product catalog module, and user account module are
separate modules. Each module has a defined responsibility.
3. Components or Subsystems: These smaller parts are built within the modules to perform more
specific functions. For example, in the payment module, a payment processor and a transaction log
could be separate components.
4. Flow of Data: This refers to how information moves between different parts of the system. For
instance, when a user places an order on an e-commerce site, their order details go from the shopping
cart module to the payment processing module.
5. Business Logic: The set of rules that define how the system behaves based on inputs. For instance,
"If a user adds an item to the cart, reduce the stock quantity by one" is an example of business logic.
This is coded within the modules.
6. Implementation: This is where the actual code or program is written to make the blueprint work as
intended. The implementation brings the design to life.
Levels of System Design:
1. High-Level Design (HLD): This gives a broad view of the system. It includes the overall structure,
showing how the different modules connect but does not go into deep detail.
2. Low-Level Design (LLD): This dives into specific details within each module, including functions and
classes, and explains exactly how each part works.
Example
Imagine designing a social media application. Here’s how the design process would look:
1. Define the architecture: Decide on the major modules—such as user profiles, posts, notifications, and
search.
2. Design modules: Each module handles specific tasks—such as managing user details or showing
recent posts.
3. Set up data flow: Define how information moves from one part to another—for example, from user
posts to followers’ feeds.
4. Implement business logic: Write code for actions like “follow a user” or “like a post.”
5. Implement and test: Create the application based on this structure and ensure it works as expected.
What is Architecture in System Design? In system design, architecture refers to the internal structure of an
application—how it's organized and built from the inside. When developing an application, architecture helps
determine how all its components or "modules" connect and communicate with each other. It’s essentially a
detailed design blueprint that guides how the system will function as a whole.
Monolithic Architecture Monolithic architecture is one of the most straightforward types of system
architecture. In this architecture, the entire application—front end, back end, and data storage—is bundled
together in a single structure. This means all the code for the different components resides in one place,
functioning as a unified application.
Components of a Monolithic Application:
1. Frontend: This is the user interface where users interact with the application.
2. Backend: The part of the system that processes user inputs, performs logic, and communicates with
the database.
3. Data Storage: A place to store information, like a database, where user data or application data is
saved.
Advantages of Monolithic Architecture:
1. Simplicity: Because all components are in one place, it’s easier to develop, test, and deploy.
2. Easy Integration and Testing: With everything integrated, testing and securing the application are
straightforward.
3. Performance: Monolithic applications can perform faster in certain conditions since all components are
in the same system and there’s no need for network calls between services.
Disadvantages of Monolithic Architecture:
1. Scalability Issues: Scaling a monolithic app is challenging because you can’t scale individual
components independently.
2. Code Complexity: As the application grows, the code becomes complex and harder to manage.
3. Deployment Challenges: If any single component needs an update, the whole application needs
redeployment. This can be time-consuming and lead to downtime.
Comparison with Distributed System Architecture
In a distributed or "microservices" architecture, different parts of the application are divided into smaller,
independent services. Each service can be deployed, updated, or scaled independently. However, this requires
additional communication between services, which may increase complexity.
Distributed System Overview A distributed system is a collection of independent machines or "nodes" that
work together and communicate over a network to achieve a unified goal. Unlike monolithic systems, where all
modules and components are located in one place, a distributed system has its components spread across
multiple servers or computers. This distribution allows each component to perform specific functions, making
the system scalable and efficient.
● Monolithic System: All modules and components reside in a single system, which makes it easier to
manage but challenging to scale. If any part fails, it could bring down the entire application due to a
single point of failure.
● Distributed System: Components are spread across multiple machines, which communicate over a
network. If one machine fails, others continue to operate, reducing the risk of complete system failure.
Key Features of Distributed Systems:
1. Scalability: Distributed systems can be scaled horizontally, meaning that more machines can be added
to handle increased workload or functionality. If a task is too large for one machine, it can be divided
among multiple machines to improve processing power.
2. No Single Point of Failure: In a distributed setup, if one node or machine fails, the system remains
operational, as other nodes can take over the workload.
3. Replication: Data is copied across multiple nodes to ensure availability. If a machine fails, data can still
be accessed from another node with a copy, thus reducing the risk of data loss.
Advantages of Distributed Systems:
1. Improved Reliability: Since the system is not dependent on a single machine, it can continue to
operate even if one node fails.
2. Increased Scalability: More machines can be added to the system as the workload increases,
allowing the system to handle more users or requests.
3. High Availability: Replicating data across multiple nodes ensures data can always be accessed, even
if one server goes offline.
Challenges and Disadvantages of Distributed Systems
1. Complexity in Management: Distributed systems require network management, load balancing, and
synchronization between nodes, which can increase complexity.
2. Security: Each node needs individual security, which can require more resources and oversight.
3. Latency Issues: Since nodes communicate over a network, there can be delays, especially if nodes
are located far apart or are under heavy load.
Distributed systems are widely used in cloud computing, large-scale applications, and global systems where a
high volume of data and concurrent users must be supported. Examples include Google Search, Netflix, and
Amazon.
Interview Questions
1. What is a distributed system, and how does it differ from a monolithic system?
2. Explain the concept of "no single point of failure" in distributed systems.
3. What is horizontal scaling, and how does it benefit a distributed system?
4. What are the main advantages of using a distributed system?
5. Describe a challenge associated with managing a distributed system.
What is Latency? Latency in web applications is the total time it takes for a user request to travel to a server,
be processed, and return a response. For example, when you type "facebook.com," a request goes to
Facebook's server, where it is processed, and a response is sent back. The total time for this "round trip" is
called latency.
Breaking it down, latency comprises:
● Network Delay: Time for the data to travel over the network.
● Compute Delay: Time for the server to process the request.
In monolithic architectures, all components are deployed in a single location, which reduces network delay
since there is minimal communication between separate servers. In distributed architectures, the application
is divided into multiple modules, which are often deployed on separate servers, introducing network delay as
data travels between these modules.
Reducing Latency
1. Caching: Cache data close to the user to avoid unnecessary trips to the main server. For example,
once a response is received, it is stored temporarily in a cache. If a similar request is made, the system
can retrieve the cached data instead of querying the server again.
2. Content Delivery Network (CDN): CDNs host copies of static data (e.g., images, scripts) on multiple
servers worldwide. When a user requests data, it is delivered from the nearest CDN server, reducing
the response time.
3. Server Improvements: Upgrading server hardware or optimizing software can also help decrease
processing time and improve overall response speed.
● CDN: Primarily serves static data and is geographically distributed to ensure that users worldwide can
access content quickly.
● Caching: Stores data closer to where it’s frequently requested, such as on the server itself, reducing
the need to re-process data that has already been computed.
What is Throughput? Throughput represents the amount of data processed by a system per unit of time. It
measures the "work volume" or "information flow" through a system. For instance, if someone creates 100
videos in 100 days, their throughput is 1 video per day.
Formally, Throughput = Amount of Data Transmitted per Unit of Time. In network terms, it's often
measured in bits per second (bps).
● Monolithic System: A monolithic system has all components deployed in a single codebase, making it
limited by the resources available to that single server.
● Distributed System: A distributed system divides work across multiple servers or nodes. This
approach allows for horizontal scaling by adding more machines, which can handle tasks in parallel and
increase throughput. This makes distributed systems generally have higher throughput because they
can process more requests without being limited to a single resource.
Ways to Improve Throughput:
1. Load Balancing: A load balancer distributes incoming requests across multiple servers, balancing the
load to avoid overloading any single server. This method improves system efficiency and throughput.
2. Caching: By temporarily storing frequently accessed data close to where it’s needed, caching reduces
the need to repeatedly fetch data from the main server, speeding up response times.
3. Content Delivery Network (CDN): CDNs store static data on geographically distributed servers. This
reduces the time needed to retrieve this data, especially for users who are far from the main server.
4. Distributed System Architecture: Distributed systems allow horizontal scaling, which means you can
add more servers to handle additional tasks, thus increasing throughput.
5. Upgrading Hardware and Resources: Upgrading server hardware or increasing processing resources
can improve the rate at which data is processed.
Factors That Reduce Throughput:
1. High Latency: Latency delays the response time, which can decrease the number of tasks a system
completes per second.
2. Protocol Overhead: Certain communication protocols between servers (like handshakes) may add
overhead, increasing latency and reducing throughput.
3. Network Congestion: Too many simultaneous requests can cause congestion, resulting in slower
response times and decreased throughput.
Availability is the degree to which a system remains operational and accessible for users. This concept
becomes particularly relevant when many users try to access a service simultaneously, such as checking
examination results online. For instance, during CBSE 12th-grade result announcements, the website
experiences a massive increase in requests, which can overwhelm a monolithic system if all processes are
handled by a single server.
In a monolithic architecture, if the system is not designed to handle high availability, it cannot serve users effectively when a failure occurs. Availability in monolithic systems is therefore generally limited due to reliance on a single resource.
In a distributed system:
● Components or modules are spread across multiple servers, allowing work to be distributed and
reducing the risk of a single point of failure.
● This fault-tolerant design means that if one server or module fails, others can continue to operate,
improving overall availability.
In distributed systems, we can also create replicas of critical components. For example, multiple servers may
hold copies of essential data or applications, and a load balancer directs requests to available servers. This
redundancy improves fault tolerance, ensuring that users experience minimal downtime.
Fault tolerance directly impacts availability. Systems with higher fault tolerance are more available because
they can continue to serve users even if some components fail.
● Application Redundancy: In systems with multiple application nodes, each application can
independently serve requests. For example, in microservices, each service can run on separate nodes,
allowing services to work in isolation.
● Data Redundancy: Ensures that data storage is spread across multiple locations, especially crucial in
distributed databases where each node stores a copy of the data to provide continuity.
Replication and redundancy both refer to strategies used in distributed systems to increase availability, but they are applied in different ways:
1. Replication:
○ Refers specifically to creating exact copies of data or applications across multiple servers or
nodes in a distributed system.
○ It ensures that even if one server or instance fails, other replicas can continue to provide the
required data or services without interruption.
○ For example, database replication allows multiple copies of the data to be available, which
supports continuity and fault tolerance in case one database server goes down.
2. Redundancy:
○ A broader term that includes replication but also refers to any method that ensures backup
resources are available to handle failures.
○ Redundancy may involve not only replicating data or applications but also adding extra
hardware or system resources to act as backup in case of component failure.
○ This approach is essential in distributed systems to enhance availability, as it prevents single
points of failure by spreading critical functions across multiple, redundant resources.
Interview Questions
1. What is the primary reason distributed systems have higher availability than monolithic
systems?
2. Explain how replication enhances availability in distributed systems.
3. Describe the concept of a single point of failure and its impact on system availability.
4. What is the difference between application redundancy and data redundancy?
5. How does a load balancer contribute to fault tolerance in distributed systems?
This section covers consistency in distributed systems, focusing on how it affects data availability across different servers or locations.
Key Points
1. Understanding Consistency:
○ Consistency means that the same data should be accessible by all users at any time,
regardless of where they access it.
○ For instance, if you deposit ₹100 in your bank account, it should immediately reflect across all
systems like ATMs and customer care services.
2. Problem of Inconsistency:
○ If data is not updated simultaneously across all systems, users may see outdated or incorrect
data. This issue is common in distributed systems.
○ An example is when someone books a movie ticket in one location (e.g., Delhi), but another
user from a different location (e.g., Pune) might still see the ticket as available if the database
hasn’t synced instantly.
3. Monolithic vs. Distributed Systems:
○ In a monolithic system, everything runs on one server, so consistency is typically high
because there’s no need to sync across servers.
○ In a distributed system, data is spread across multiple servers or locations. Syncing across
servers can lead to a delay, causing temporary inconsistency.
4. Factors Improving Consistency:
○ Network Speed: A faster network reduces sync time, helping data become consistent more
quickly.
○ Stopping Read Operations: Disabling read operations while data updates can prevent users
from reading incorrect data.
○ Physical Proximity of Servers: Placing servers closer to each other can reduce sync time, as
data travels faster over shorter distances.
5. Types of Consistency:
○ Strong Consistency: All reads always return the most recent write. Systems won’t allow read
operations until all replicas are updated. This is critical in systems like booking, where exact
data is required.
○ Eventual Consistency: Reads might temporarily return outdated data, but over time, all
replicas will sync up. This type of consistency is often used in social media, where minor delays
in updates are acceptable.
○ Weak Consistency: Not all replicas need to be updated immediately or consistently. It’s based
on the business requirement and is often used when real-time updates are not crucial.
The CAP theorem explains that in distributed systems, we can only achieve two out of three properties
simultaneously: Consistency (C), Availability (A), and Partition Tolerance (P). Here’s what each means:
1. Consistency (C): Every read receives the most recent write. For example, if two people, A and B, are
booking a movie ticket, they should see the same availability status. If person A sees a seat available,
person B should also see it, ensuring that the system remains consistent.
2. Availability (A): The system should always be accessible for requests. For instance, sites like Google
are almost always available, regardless of when you access them. However, sometimes maintaining
consistency compromises availability, especially when updates are being processed.
3. Partition Tolerance (P): The system should continue to function despite network failures that may
disconnect parts of the system. This is essential in distributed systems, where data is stored across
different servers. If one server goes down, others should handle the load.
CAP Theorem’s Triangle of Choice: According to CAP theorem, you can choose only two of the three
properties:
● Consistency and Availability (no Partition Tolerance): Typically a centralized system where data is
always up-to-date and accessible but doesn’t handle network failures well.
● Consistency and Partition Tolerance (no Availability): Systems where data is consistent and resilient
to network issues but may become inaccessible temporarily.
● Availability and Partition Tolerance (no Consistency): Systems that prioritize uptime and fault
tolerance but may show outdated data, common in social media where exact consistency is not critical.
Examples:
● Banking: Prioritizes consistency over availability to ensure account balances are accurate.
● Social Media: Availability is crucial; slight delays in data synchronization are acceptable.
● Stock Trading: Consistency is prioritized to show accurate stock prices.
Interview Questions
1. Explain the CAP theorem. Why can’t all three properties be achieved at once?
2. Describe a real-world example where availability is more important than consistency.
3. What type of system would prioritize consistency and partition tolerance, and why?
4. How does CAP theorem influence the design of distributed systems?
5. In CAP theorem, which property is most important for a banking application, and why?
In distributed systems, we have multiple servers located in different geographical locations, such as one in
Australia, one in the US, and one in India. Each server operates on its own clock, which may show different
times due to time zone differences. This leads to a significant challenge: determining the sequence of events.
In a monolithic system (a system where all components are interconnected), the time is uniform and events
can be easily sequenced. However, in distributed systems, where events can occur simultaneously across
different servers, it's difficult to ascertain which event happened first.
Physical clocks are insufficient for determining the order of events because they do not account for the inherent
differences in time between the different servers. To address this issue, logical clocks were introduced. The
significance of time in this context is to determine the sequence of events.
In a distributed system, every process (e.g., P1 and P2) maintains its own logical clock, referred to as a
counter. Each time an event occurs within a process, that process increments its counter.
1. Event Occurrence: When an event occurs in a process, it increments its counter. For example, if P1
has a counter value of 0 and an event occurs, the counter is incremented to 1.
2. Message Sending: When a process sends a message, it includes its counter value with the message.
3. Message Receiving: Upon receiving a message, a process takes the maximum value between its own
counter and the counter received in the message, and then increments it by one. This ensures that the
receiving process is aware of the sequence of events in relation to the sender's events.
Example of Counting
● If P1's counter is 2 and it sends a message to P2, P2 will compare its current counter with the one
received from P1.
● If P2’s counter is 1, it will update its counter to max(1, 2) + 1, which will result in 3.
This method effectively allows processes to determine the order of events across distributed systems by
maintaining a logical sequence, even though physical time may differ.
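A minimal sketch of these rules in C++ (the Process struct is hypothetical; the algorithm itself is language-agnostic), reproducing the P1/P2 example above:

#include <algorithm>

struct Process {
    long counter = 0;

    // Event occurrence: increment the counter
    void event() { ++counter; }

    // Message sending: attach the current counter value to the message
    long send() const { return counter; }

    // Message receiving: counter = max(own counter, received counter) + 1
    void receive(long received) { counter = std::max(counter, received) + 1; }
};

int main() {
    Process p1, p2;
    p1.event(); p1.event();   // P1's counter: 2
    p2.event();               // P2's counter: 1
    long msg = p1.send();     // message carries the value 2
    p2.receive(msg);          // P2's counter: max(1, 2) + 1 = 3
}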
Conclusion
The Lamport logical clock helps in establishing a partial ordering of events in distributed systems. By using
logical clocks, we can deduce which event occurred before another, even when they happen on different
servers at different times.
What is Scalability?
Scalability refers to a system's ability to handle increased loads or requests without significant degradation in
performance. In simple terms, it assesses whether a system can manage a rise in the number of
requests—whether that means keeping the response time low or maintaining it at an acceptable level.
Types of Scalability
There are two primary types of scalability: Vertical Scaling and Horizontal Scaling.
Vertical Scaling:
Vertical scaling involves increasing the capacity of a single machine (server) to handle more load. This could
mean adding more RAM, CPU power, or disk space to an existing server.
Example:
● Imagine you have a server with 8 GB of RAM, and your user base grows from 100 to 100,000. To
accommodate this increase, you might upgrade the server to 32 GB of RAM or add a larger SSD for
more storage.
Disadvantages of Vertical Scaling:
● Single Point of Failure: If the server goes down, the entire system becomes unavailable.
● Resource Limits: There's a physical limit to how much you can upgrade a single machine.
● Higher Costs: Upgrading to higher-quality components can become expensive.
Horizontal Scaling:
Horizontal scaling involves adding more machines (servers) to the system rather than upgrading existing ones.
This approach distributes the load across multiple servers.
Example:
● If you have a database that requires 100 GB of storage, instead of upgrading one server to hold all that
data, you might set up four servers with 25 GB each. If your needs increase to 125 GB, you simply add
another server.
Advantages of Horizontal Scaling:
● No Single Point of Failure: If one server fails, others can continue to handle the load, improving
overall system reliability.
● Flexibility: You can easily add more servers as your requirements grow without hitting a physical limit.
● Cost-Effectiveness: You can often use less expensive resources since you don't need the highest
quality for each machine.
Conclusion
The choice between vertical and horizontal scaling depends on the specific needs and constraints of your
application and infrastructure. Understanding these two methods helps in designing scalable systems that can
efficiently handle varying loads.
1. Redundancy
Redundancy means having a duplicate copy of a server or node in a system. This duplication ensures that if
one server fails, another is available to handle requests without interrupting the service for users. Redundancy
is a common feature in high-availability systems to make sure they stay up even if some components go down.
● Active Redundancy: Here, all redundant servers are actively running and ready to accept requests.
For instance, if there are three servers in a distributed system, all three can receive requests
simultaneously. The load balancer distributes requests among all active servers, deciding which server
will handle each incoming request. This setup is often used to improve performance since multiple
servers share the load.
● Passive Redundancy: In passive redundancy, only one server is actively handling requests at any
given time, while the others remain in standby mode. If the active server fails, a passive server takes
over, ensuring continuous operation. This type of redundancy is particularly useful when an immediate
switchover is needed without splitting the load across multiple servers.
2. Replication
Replication involves not just duplicating servers but also ensuring synchronization across them, meaning
that the data on all replicated servers is always the same. For example, in a replicated database system, if you
update one server, that change should appear on all other servers as well, so they stay consistent.
Similar to redundancy, replication also has two types:
● Active Replication: In active replication, all replicated servers are synchronized in real-time. This
means that if there are three database servers, any changes made to one database (such as adding a
new record) are immediately mirrored on all other servers. This setup helps in scenarios where
consistent data availability across multiple servers is necessary.
● Passive Replication (Master-Slave): In passive replication, there is a master server and one or more
slave servers. All updates are made on the master server, which then synchronizes those changes with
the slave servers. The master is responsible for all read and write operations, while the slave servers
update themselves based on changes made to the master. If the master server fails, one of the slave
servers is promoted to take over as the master, ensuring continuity.
Redundancy vs. Replication:
● Redundancy is simply about duplicating servers to ensure availability if one fails. No synchronization is
required between them.
● Replication adds a layer of synchronization so that data consistency is maintained across all replicated
servers.
In synchronous replication, every update or change made on the master server is immediately copied to the
slave servers before the operation is considered complete. This means that:
● The master waits for confirmation from each slave that it has received and applied the change.
● Only once the change is confirmed across all slaves does the master report the operation as
successful.
Advantages of Synchronous Replication:
● Data Consistency: All nodes (master and slaves) stay synchronized in real-time, ensuring that every
server has the exact same data at any moment.
● High Reliability: If the master server fails, a slave can take over with minimal data loss.
Disadvantages of Synchronous Replication:
● Latency: Sync replication can be slower since the master has to wait for each slave to confirm the
change before proceeding.
● Network Overhead: Each write operation requires network communication with all slaves, which can
slow down the process, especially with more slaves or longer distances between servers.
Use Case: Sync replication is ideal for critical applications where data accuracy and consistency are essential,
such as financial transactions or e-commerce systems.
In asynchronous replication:
● Changes are applied to the master, and the update is queued for the slaves.
● Slaves replicate changes at their own pace, meaning there may be a delay (a lag) before they reflect
the latest data from the master.
Advantages of Asynchronous Replication:
● Lower Latency: The master can continue processing other requests immediately without waiting for
slaves, making it faster.
● Scalability: With async replication, the system can handle more slaves without affecting the master’s
performance as much.
Disadvantages of Asynchronous Replication:
● Data Inconsistency: There may be a delay, so slave servers may not have the most up-to-date data. If
the master fails, recent changes might be lost.
● Risk of Data Loss: If the master goes down before changes are synced to slaves, some data can be
lost.
Use Case: Async replication is suited for applications where speed is more important than real-time data
consistency, like content delivery networks (CDNs) or backup systems.
Load Balancing Algorithms:
1. Round Robin:
○ Sequentially distributes requests to each server in turn.
2. Weighted Round Robin:
○ Distributes requests based on server capacity. Servers with higher capacity receive more
requests.
3. IP Hash:
○ Maps client IPs to specific servers, ensuring requests from the same client go to the same
server.
4. Least Connections:
○ Routes requests to the server with the fewest active connections, aiming for the lowest
response time.
5. Source IP Hash:
○ Combines client IP and destination address, so each client’s requests are directed to the same
node, ensuring consistency for specific clients.
Benefits of Load Balancing:
● Better User Experience: High availability and quick response times due to resource distribution.
● Downtime Prevention: If a server fails, the load balancer directs requests to other servers.
● Scalability and Flexibility: Additional servers can be added, and load balancers distribute workloads
among them effectively.
Static vs. Dynamic Algorithms:
● Static Algorithms (e.g., Round Robin): Defined in advance; routing doesn’t adapt dynamically.
● Dynamic Algorithms (e.g., Least Connections): Real-time adjustments based on server conditions
and active connections.
This systematic approach ensures system efficiency, reliability, and scalability in distributed environments.
Introduction: In a distributed system, where multiple servers (called nodes) handle requests, load balancing is
a method used to manage and distribute incoming network traffic efficiently. Imagine we have a system where
multiple users send requests, and we have four servers that can process these requests. A load balancer
decides which server (or node) should handle each incoming request. This is crucial to avoid overloading a
single server and to ensure smooth system performance.
Key Concepts:
1. Load Balancing:
○ Load balancing refers to the process of distributing network traffic across all nodes in a system
to ensure efficient handling of requests. This prevents any single server from becoming a
bottleneck due to too many requests, while others remain underutilized.
○ The load balancer decides where each request should go among the available nodes. This
allows the system to provide faster responses and maintain high availability.
2. Health Check of Servers:
○ The load balancer continuously monitors the health of each server. It checks whether each
server is active and capable of handling requests.
○ If a server becomes unresponsive, the load balancer stops sending requests to it and directs
them to a healthy server instead. This ensures high availability and reduces the risk of
downtime.
3. High Scalability and Throughput:
○ Scalability is the system’s ability to handle an increasing number of requests by adding more
servers. With load balancing, additional servers can be easily integrated into the system.
○ This high scalability is essential in distributed systems, as it allows the system to handle more
requests efficiently.
4. Redundancy:
○ Redundancy means having backup servers with the same code and configuration as the main
servers. If one server fails, another identical server can take over, ensuring there is no
interruption in service.
Load balancers use different algorithms to decide how to distribute requests among servers. Here are some
commonly used ones:
1. Round Robin:
○ Requests are distributed in a rotating manner. For example, the first request goes to the first
server, the second to the second server, and so on. After reaching the last server, the cycle
repeats.
○ This method is simple and works well when all servers have equal capacity.
2. Weighted Round Robin:
○ If some servers have more capacity than others, Weighted Round Robin assigns weights to
each server. Servers with higher weights will receive more requests.
○ For instance, if Server A can handle three times as many requests as Server B, then Server A
will get three times more requests than Server B.
3. IP Hash:
○ This algorithm uses the IP address of the client to determine which server will handle the
request. It applies a hash function to the client’s IP, and the output corresponds to a specific
server.
○ This method is useful when each client needs to consistently connect to the same server (e.g.,
for session management).
4. Least Connections:
○ The load balancer directs the request to the server with the least number of active connections.
This is a dynamic approach, as it considers the real-time load on each server.
○ This ensures that servers with fewer connections take on more requests, balancing the load
effectively.
5. Static vs. Dynamic Load Balancing:
○ Static algorithms like Round Robin and IP Hash have predefined rules that don’t change in
real-time.
○ Dynamic algorithms like Least Connections are adaptive and distribute traffic based on the
current load, making them suitable for systems with unpredictable traffic patterns.
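To make the first four algorithms concrete, here is a minimal Python sketch. The server names and weights are made up, and a production load balancer would track connections and health dynamically rather than in plain dictionaries.

```python
import hashlib
from itertools import cycle

servers = ["server-a", "server-b", "server-c"]

# 1. Round Robin: rotate through the servers in order.
rr = cycle(servers)
def round_robin():
    return next(rr)

# 2. Weighted Round Robin: a server with weight 3 appears 3 times in the rotation.
weights = {"server-a": 3, "server-b": 1, "server-c": 1}
wrr = cycle([s for s, w in weights.items() for _ in range(w)])
def weighted_round_robin():
    return next(wrr)

# 3. IP Hash: the same client IP always maps to the same server.
def ip_hash(client_ip):
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]

# 4. Least Connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}
def least_connections():
    server = min(active, key=active.get)
    active[server] += 1   # remember to decrement when the request finishes
    return server

print(round_robin(), weighted_round_robin(),
      ip_hash("203.0.113.7"), least_connections())
```

Note how the first three are static (the routing rule is fixed in advance), while least_connections is dynamic: it consults the current state before choosing a server.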
What is Caching?
Imagine you open a social media app like Instagram. When you click on your profile, you see your bio, profile
picture, followers, following, and posts. Normally, to load all this information, the system fetches data from a
database. This process can be slow if every element (followers, following, posts, etc.) is retrieved each time
you open your profile.
To improve speed, especially when the profile content doesn’t change frequently, caching is used. Caching
temporarily stores data, so the next time you open your profile, the data is quickly loaded from the cache
instead of the database. Since cache often uses faster memory, like RAM, loading from cache is much faster.
For example:
1. The first time you open your Instagram profile, data is loaded from the database and stored in the
cache.
2. For subsequent profile views, the app pulls the information from the cache until there’s an update (like a
new post or a bio change).
Benefits of Caching
1. Reduced Database Load: Caching prevents repeated calls to the database, reducing server load.
2. Faster Response Times: Caching leverages faster storage, improving the app’s performance by
reducing loading time.
3. Cost Efficiency: Fewer database queries mean lower operational costs and faster processing speeds.
Types of Caching
1. In-Memory Cache (Local): Stores data on a single server, accessible only to that server. It’s faster but
limited to individual servers.
2. Distributed Cache: Shared across multiple servers, making it ideal for large applications with several
servers. Tools like Redis and Memcached are commonly used for distributed caching.
Common Use Cases of Caching
● Read-Intensive Applications: News sites like Times of India or Wikipedia, where users mostly view
content.
● Static Content: Unchanging data, like HTML, images, or common API responses, is cached to reduce
load time.
● Content Delivery Networks (CDNs): CDNs cache static assets geographically to deliver them faster to
users around the world.
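A minimal sketch of this pattern (often called cache-aside) is shown below. Here, fetch_profile_from_db is a hypothetical stand-in for the real database query, and a plain dictionary stands in for Redis or Memcached.

```python
cache = {}  # in production this would be Redis or Memcached, not a dict

def fetch_profile_from_db(user_id):
    # Stand-in for a slow database query.
    return {"user_id": user_id, "bio": "hello", "followers": 1200}

def get_profile(user_id):
    if user_id in cache:                          # cache hit: served from fast memory
        return cache[user_id]
    profile = fetch_profile_from_db(user_id)      # cache miss: go to the database
    cache[user_id] = profile                      # store for subsequent views
    return profile

def update_bio(user_id, new_bio):
    # On a write (e.g., a bio change), invalidate the stale cache entry
    # so the next read fetches fresh data from the database.
    cache.pop(user_id, None)
```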
Cache Eviction
In systems that use caching to store data, there's a need to manage the storage efficiently. As the cache has a
limited size, some data must be deleted over time to make room for new data. This process is known as cache
eviction.
Several strategies exist for determining which data to evict from the cache:
● Least Recently Used (LRU): Evicts the entry that has not been accessed for the longest time.
● Least Frequently Used (LFU): Evicts the entry that is accessed least often.
● First In, First Out (FIFO): Evicts the oldest entry, regardless of how recently it was used.
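As one example, here is a minimal LRU cache sketch built on Python's OrderedDict; the capacity of 2 is purely illustrative.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)         # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1); cache.put("b", 2); cache.put("c", 3)
print(cache.get("a"))  # None: "a" was evicted to make room for "c"
```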
Challenges of RDBMS
1. Rigidity:
○ RDBMS requires a predefined structure for tables, making it difficult to change the schema once
established. For example, adding new fields often necessitates altering the entire table
structure.
2. Scalability Issues:
○ RDBMS can face challenges with horizontal scaling (adding more machines). If data is
distributed across multiple tables, managing relationships can become complex. For instance, if
customer and order data are stored across different servers, ensuring consistency across these
tables can be problematic.
3. Complexity in Distributed Environments:
○ When scaling horizontally, managing relationships between tables becomes challenging. If an
order is on one server and the corresponding customer data is on another, it complicates data
retrieval and integrity.
RDBMS primarily stores data in tables. However, RDBMS faces several challenges, such as slow speed and
difficulty in horizontal scaling.
To overcome these challenges, NoSQL databases were introduced. The term "NoSQL" stands for "Not Only
SQL" or "Non-relational" databases. Unlike RDBMS, which is structured, NoSQL databases provide a flexible
schema and are designed to scale easily.
Types of NoSQL Databases: NoSQL is an umbrella term encompassing four main types of databases:
1. Key-Value Stores:
○ These databases store data in the form of key-value pairs. A key corresponds to a single value,
making it a simple and efficient way to access data (see the sketch after this list).
○ Example: Redis is a popular key-value store, often used for caching.
2. Document Databases:
○ Document databases store data in documents, typically in JSON or XML format. This approach
allows for a dynamic schema, meaning you can easily add new fields without needing to modify
a predefined table structure.
○ Example: MongoDB is a well-known document database that combines relational concepts with
the flexibility of NoSQL.
3. Columnar Databases:
○ In columnar databases, data is stored in columns rather than rows. This method is
advantageous for analytical queries because it allows for faster data retrieval for specific
columns.
○ Example: Apache Cassandra and HBase are popular wide-column stores, commonly used in
data analytics, where querying specific columns can yield results without scanning entire rows.
4. Graph Databases:
○ Graph databases represent data as nodes and relationships. They excel at storing
interconnected data, making them ideal for applications like social networks, where relationships
between entities are critical.
○ Example: Neo4j is a popular graph database, widely used for social networking applications
and recommendations.
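To make the key-value model concrete, here is a short sketch using the redis-py client; it assumes a Redis server is running locally on the default port.

```python
import redis  # third-party client: pip install redis

# Assumes a Redis server running locally on the default port 6379.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# A key maps to a single value: simple, fast lookups.
r.set("session:42", "user_ram", ex=1800)   # ex=1800 expires the key in 30 minutes
print(r.get("session:42"))                 # -> "user_ram"

# Document-style data would instead live in something like MongoDB as
# JSON-like documents; Redis here stores flat values (strings, hashes, lists, sets).
```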
Interview Explanation
When explaining NoSQL databases in an interview, you could frame it like this:
"Traditional relational databases store data in tables with a fixed schema, which can lead to performance
bottlenecks and challenges in scaling. NoSQL databases, which include key-value stores, document stores,
columnar databases, and graph databases, address these limitations. Each type of NoSQL database is
designed for specific use cases, allowing for flexibility and improved performance. For instance, document
databases like MongoDB are great for applications needing dynamic schemas, while graph databases like
Neo4j are perfect for applications that rely on complex relationships, such as social networks."
1. What are the main differences between RDBMS and NoSQL databases?
2. Can you explain how a key-value store works and give an example of its use case?
3. What are the advantages of using a document database over a traditional relational database?
4. Why would you choose a columnar database for data analysis tasks?
5. How do graph databases represent data, and in what scenarios are they particularly useful?
6. What kind of database would you use for a real-time chat application and why?
7. In which situations might you still choose to use an RDBMS over a NoSQL solution?
In this segment, we’re discussing Polyglot Persistence. This concept refers to using multiple types of
databases to handle different parts of an application’s data storage needs. The primary motivation behind this
approach is that a single type of database often cannot efficiently satisfy all the requirements of a complex
application.
Key Points:
● A single database type rarely fits every workload of a complex application.
● Each component uses the database best suited to its data: for example, a key-value store for cart
operations, a document database for order details, an RDBMS for payment transactions, and a graph
database for customer relationships.
Interview Explanation
"Polyglot Persistence is a strategy that allows applications to use multiple database technologies, each
selected for its strengths based on specific use cases. For example, an e-commerce application might use a
key-value store for simple cart operations, a document database for complex order details, an RDBMS for
handling payment transactions, and a graph database to manage customer relationships. This approach
ensures that each component of the application is optimized for its specific data requirements, leading to
improved performance and scalability."
In this segment, we’ll discuss Normalization and Denormalization in database design, essential concepts for
managing data efficiently in relational databases.
What is Normalization?
1. Definition:
○ Normalization is the process of organizing data in a database to reduce redundancy and
improve data integrity. This involves dividing a single table into multiple tables to minimize
duplicate data.
2. Example:
○ Consider an Employee table that contains the following columns: Employee ID, Department ID,
Department Name, and Department Description. If there are 100 employees and only 2
departments, each employee record would repeat the department details, leading to
redundancy.
○ To normalize this, you would keep the Employee table with just the Employee ID and
Department ID, and create a separate Department table with Department ID, Name, and
Description. This reduces redundancy and storage requirements.
What is Denormalization?
1. Definition:
○ Denormalization is the reverse process, where you combine multiple normalized tables into a
single table. This is done to improve read performance and simplify data retrieval, especially
when read operations are more frequent than writes.
2. Example:
○ If you have an Employee table and a Department table, denormalizing would involve merging
them into a single table that includes Employee ID, Department ID, Department Name, and
Department Description. This allows for faster data retrieval, as all relevant data can be
accessed from one table without the need for joins.
Benefits of Denormalization:
1. Faster Reads:
○ All relevant data lives in one table, so queries avoid the cost of joins.
2. Simpler Retrieval:
○ Data retrieval logic is easier to write and reason about when the data is already combined.
Challenges of Normalization and Denormalization:
1. Data Redundancy:
○ While normalization reduces redundancy, denormalization reintroduces it, which can lead to
larger database sizes.
2. Complexity:
○ Normalized databases can become complex due to multiple tables and relationships, making
them harder to manage.
3. Inconsistency:
○ If the same data is duplicated across multiple tables, inconsistencies can arise if updates are not
applied uniformly.
4. Slow Write Operations:
○ Denormalization can lead to slower write operations, as multiple copies must be updated when
changes are made.
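The Employee/Department example can be sketched with plain Python dictionaries to show the trade-off; the field names are illustrative.

```python
# Denormalized: department details repeated on every employee row.
employees_denormalized = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "HR", "dept_desc": "People ops"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "HR", "dept_desc": "People ops"},  # duplicated
]

# Normalized: employees reference departments by ID; details stored once.
employees = [{"emp_id": 1, "dept_id": 10}, {"emp_id": 2, "dept_id": 10}]
departments = {10: {"dept_name": "HR", "dept_desc": "People ops"}}

def employee_with_department(emp):
    """The 'join' a normalized schema pays on every read."""
    return {**emp, **departments[emp["dept_id"]]}

print(employee_with_department(employees[0]))
```

Reads from the denormalized list need no join, but updating the HR description means touching every duplicated row, which is exactly the slow-write and inconsistency risk described above.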
Interview Explanation
"Normalization is a technique used to reduce data redundancy in a relational database by dividing data into
multiple related tables. For instance, in an employee management system, we can separate employee details
from department details to minimize repetition. Denormalization, on the other hand, involves merging these
tables back into a single table to optimize read performance, especially in scenarios where data retrieval is
more frequent than data modification. Each approach has its advantages and challenges, and the choice
between them depends on the specific requirements of the application."
What is Indexing?
Indexing is a method used to optimize the speed of data retrieval operations on a database table. Think of it as
a way to organize your information so that you can find what you need quickly, much like how a well-organized
library allows you to find books easily.
Real-World Analogy
● When you have a table with many records, searching through every entry one by one (like a linear
search) can be time-consuming, especially if there are millions of entries.
● For example, if you have a million records and need to find a specific one, a linear search would require
you to check each record until you find the one you're looking for. This has a time complexity of O(n)
where n is the number of records.
Optimized Searching
● Binary Search: If the records are sorted, you can use binary search, which is much faster and has a
time complexity of O(log n).
● Indexing: When you create an index on a column (e.g., net worth), the database maintains a separate
lookup structure (commonly a B-tree). It keeps the column's values in sorted order along with pointers
to the original rows in the table, significantly speeding up search queries.
Implementation of Indexing
1. Creating an Index: Suppose you have a table of students and you frequently search based on their net
worth. By indexing the net worth column, you create a lookup table that points to the rows in your main
table.
2. Lookup Table Structure: The index will have sorted entries of the net worth and pointers to the
respective rows in the original table. This allows the database to quickly find and return the desired
records with minimal searching.
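Here is a toy sketch of the lookup-table idea using Python's bisect module; the student rows and the net_worth column are illustrative.

```python
import bisect

# The main table: rows in arbitrary (insertion) order.
students = [
    {"id": 0, "name": "Asha",  "net_worth": 500},
    {"id": 1, "name": "Ram",   "net_worth": 120},
    {"id": 2, "name": "Meera", "net_worth": 980},
]

# The index: (net_worth, row_position) pairs kept sorted by net_worth.
index = sorted((s["net_worth"], i) for i, s in enumerate(students))
keys = [k for k, _ in index]

def find_by_net_worth(value):
    """O(log n) lookup via the index instead of an O(n) table scan."""
    pos = bisect.bisect_left(keys, value)
    if pos < len(keys) and keys[pos] == value:
        return students[index[pos][1]]   # follow the pointer to the row
    return None

print(find_by_net_worth(980))  # -> Meera's row

# The write cost: every insert/update must also maintain the sorted index.
```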
When to Use Indexing
● Read-Intensive Applications: If your application frequently reads data (e.g., querying user
information), indexing is beneficial as it speeds up these read operations.
● Write-Intensive Applications: If your application involves many write operations (inserts, updates,
deletes), adding indexes might slow down these operations because the index must also be updated
with every change. In such cases, it’s often better not to use indexing.
Testing Questions
1. Can you explain how indexing can affect the performance of read vs. write operations in a
database?
2. What might happen if you index too many columns in a database table?
3. How does the use of B-trees in indexing help maintain order and efficiency?
Synchronous Communication: Synchronous communication occurs when a sender and receiver interact in
real-time, meaning the sender must wait for the receiver's response before continuing with other tasks. This
can be likened to making a cash withdrawal at an ATM.
1. Sequential Process:
○ Another way to understand this is through a sequence of events. For instance, consider you
want to eat at a restaurant but first need to withdraw money from the ATM. The process must
follow specific steps:
1. Drive to the ATM.
2. Withdraw cash.
3. Go to the restaurant and order food.
○ Each step must be completed in order; skipping any step (like not withdrawing cash) means you
cannot proceed to the next (eating at the restaurant). This sequential dependency highlights the
nature of synchronous communication.
2. Programming Context:
○ In programming, consider a scenario with three statements:
1. Fetch data from the database.
2. Process that data.
3. Return and print the processed data.
○ These statements will execute sequentially. You cannot start processing data until it has been
fetched, and you cannot print the data until it has been processed (see the sketch after this list).
This ensures consistency and order, which are vital in synchronous communication.
3. Achieving Consistency:
○ Synchronous communication is essential for achieving consistency in systems with multiple
replicas. For instance, if you update a value in a database, that change must be reflected across
all replicas. If one replica is not updated in a timely manner, it may lead to inconsistencies,
where different replicas have different values.
4. High Consistency Needs:
○ Industries like stock markets and banking require high consistency. For example, when you
make a payment, the transaction must ensure that all systems reflect the same state
immediately to avoid issues like double spending.
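Here is a minimal sketch of the three-statement example above; fetch_data and process are hypothetical stand-ins, and time.sleep simulates a slow database call.

```python
import time

def fetch_data():
    time.sleep(1)            # stands in for a slow database call
    return [3, 1, 2]

def process(data):
    return sorted(data)

# Synchronous: each call blocks until the previous one completes.
data = fetch_data()          # nothing below runs until this returns
result = process(data)       # cannot start before the fetch finishes
print(result)                # cannot print before processing finishes
```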
While synchronous communication is useful for ensuring consistency, it can be limiting in scenarios requiring
responsiveness. In the next segment, we'll learn about non-blocking calls, which allow processes to continue
running without waiting for a response, enhancing system performance in distributed environments.
Asynchronous communication allows a process to send a request and then continue executing without waiting
for the response. This is in contrast to synchronous communication, where the process must wait for the
request to be completed before it can proceed.
In this topic, we are focusing on message-based communication, which is a method of data exchange
between systems or services by sending messages back and forth. Let’s break down the main elements:
● Producer (sender): The service that creates messages and sends them.
● Message queue / broker: The middleware (for example, RabbitMQ or Kafka) that holds messages until
they are consumed.
● Consumer (receiver): The service that reads messages from the queue and processes them at its own
pace.
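A minimal sketch of this idea, using a Python queue.Queue as a toy stand-in for a message broker:

```python
import queue
import threading
import time

broker = queue.Queue()  # toy stand-in for a broker like RabbitMQ or Kafka

def producer():
    for i in range(3):
        broker.put(f"order-{i}")   # send and move on; no waiting for a reply
        print(f"sent order-{i}")

def consumer():
    while True:
        message = broker.get()     # the consumer processes at its own pace
        time.sleep(0.5)            # simulate slow handling
        print(f"processed {message}")
        broker.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()        # returns immediately; messages may still sit in the queue
broker.join()     # only so this demo prints everything before exiting
```

The producer never blocks on the consumer, which is exactly the decoupling that asynchronous, message-based communication provides.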
Web Servers
A web server is the system that keeps web applications continuously available by accepting client requests
over HTTP and responding with web content. The term can refer to both hardware (the physical machine) and
software (the programs that manage web requests).
Testing Questions
1. What is the purpose of a web server?
2. Explain the difference between the hardware and software components of a web server.
3. What role does HTTP play in a web server?
4. Give an example of how a browser and web server interact when you visit a website.
5. Name two common web servers and describe what they do.
A web application is an interactive application that operates on the internet. Unlike a website (which is mostly
static and just shows content to users), a web application allows for user interaction. For example, in a website,
content rarely changes unless the owner updates it, like a blog. In contrast, web applications, such as social
media platforms like Facebook, allow all users to interact with and update content.
Client and Server
● Client: This is the user-side device or application that makes requests. Examples include mobile apps
or browsers on a laptop.
● Server: This is the system that responds to the client’s requests by providing data or performing tasks.
For example, when you use Instagram on your phone (the client), your device sends requests to
Instagram’s servers.
Servers can also act as clients when they request data from other servers. This is essential in more complex
systems where different servers may need to interact for information.
REST (Representational State Transfer) is a set of guidelines for communication between client and server
over the internet:
● Statelessness: Each request carries all the information the server needs; the server does not store
client session state between requests.
● Resources and URLs: Everything is modeled as a resource identified by a URL (e.g., /users/1).
● Standard HTTP methods: GET to read, POST to create, PUT/PATCH to update, and DELETE to
remove resources.
● Representations: Clients and servers exchange resource representations in standard formats such as
JSON or XML.
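As a rough illustration of these guidelines (the endpoint api.example.com is hypothetical), the sketch below builds a stateless GET and POST request using Python's standard library:

```python
import json
import urllib.request

BASE = "https://api.example.com"   # hypothetical REST API

# GET /users/1: read a resource; the URL identifies it, the method says "read".
req = urllib.request.Request(f"{BASE}/users/1", method="GET")

# POST /users: create a resource; the representation travels as JSON.
body = json.dumps({"name": "Ram"}).encode()
create = urllib.request.Request(
    f"{BASE}/users",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Each request is stateless: it carries everything the server needs
# (URL, method, headers, body); no session lives on the server.
print(req.get_method(), req.full_url)
print(create.get_method(), create.full_url)
# with urllib.request.urlopen(req) as resp:   # would actually send the request
#     user = json.loads(resp.read())
```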
SOA and Microservices
● SOA: This is an architectural style where an application is divided into different services that can be
used independently or together. SOA helps with reusability and allows selective scaling, so we can
scale only the necessary parts.
● Microservices: This is an evolved form of SOA with more loose coupling. Each service operates
independently, making it highly scalable. In a microservices architecture, each service can also have its
own data storage, allowing it to operate without relying on other services.
Communication protocols define how data is exchanged between a client (like your web browser or app) and a
server. The goal is to manage requests and responses efficiently, especially under different conditions (high
demand, specific timing needs, etc.). Here are the main types:
1. Polling
● Explanation: Polling is like going to a shop and asking if a product is available. The client requests
information, and the server responds if it has the requested data. However, if the data isn’t ready, the
client just keeps asking until the server can fulfill the request.
● Limitations: Polling is simple but can strain the server if multiple clients make frequent requests. For
example, if 100 clients are constantly asking for the same new update, the server may struggle to
handle these repeated requests.
● Usage: Polling works well for situations where updates aren’t frequent but are still regular enough to
justify periodic checks.
2. Long Polling
● Explanation: Imagine you go to a shop, ask for a product, and the shopkeeper takes note of your
request. When the product is available, the shopkeeper will let you know instead of making you
repeatedly ask. Here, the client sends a request and the server keeps the connection open until it has a
response.
● Limitations: This approach increases server load because it has to keep requests open. The server
has to maintain a kind of "registry" for each request, adding complexity and resource usage.
● Usage: Long polling is used when you need updates that aren’t frequent but must be delivered as soon
as available, like news or notifications that should show up as soon as they’re posted.
3. Push
● Explanation: Push works like setting up a subscription. You tell the server you’re interested in updates,
and whenever there’s new data, it pushes that to you without any request from your side. Think of
notifications on your phone—like getting alerts about Instagram likes or new messages.
● Limitations: Push notifications can be a bit disruptive if data arrives when it’s not needed or if the client
isn’t actively using the service.
● Usage: Push is widely used in real-time notification systems, like email, chat apps, and social media
alerts.
4. Sockets (WebSockets)
● Explanation: Sockets allow continuous, real-time two-way communication. It’s like opening a phone
line—you can speak and listen simultaneously without needing to hang up and redial. The connection
remains open, and data can flow back and forth continuously.
● Usage: This is ideal for chat applications, live video streaming, and other systems that need rapid,
ongoing updates without the overhead of repeatedly opening and closing connections.
5. Server-Sent Events (SSE)
● Explanation: This is like push but happens only while the user is on a particular page. The client
subscribes to a data stream from the server, and the server continues sending updates until the client
disconnects. SSEs work well for updates that happen over long-lived connections, like live sports
scores or stock prices.
● Limitations: SSE only works as long as the page is open and active.
● Usage: Examples include live dashboards or continuously updating data pages.
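A minimal client-side loop for protocol 1 (polling) might look like the sketch below; check_for_update is a hypothetical stand-in for the actual server call.

```python
import time

def check_for_update():
    """Hypothetical call to the server: returns new data or None."""
    return None  # pretend nothing is ready yet

# Polling: keep asking on a fixed interval until the data appears.
def poll(interval=5, attempts=3):
    for _ in range(attempts):
        data = check_for_update()   # every call hits the server, even when
        if data is not None:        # nothing has changed; this repeated asking
            return data             # is the load problem polling creates
        time.sleep(interval)
    return None

# Long polling differs mainly on the server side: the server holds the
# request open until data exists, so the client's loop rarely spins.
poll(interval=0.1)
```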
In software architecture, a tiered architecture divides an application into multiple logical layers or "tiers" based
on functionality. Each tier is responsible for specific tasks, allowing for separation of concerns, scalability, and
maintainability. Here’s a look at the common tiers in a tiered architecture:
1. Presentation Tier
● Description: The presentation tier is the top-most layer responsible for the user interface. It interacts
directly with users and displays information from the system. It also collects input from users and sends
it to the application tier.
● Examples: Web browsers, mobile apps, desktop applications.
2. Application (Logic) Tier
● Description: This tier contains the application’s business logic, processing rules, and core
functionalities. It processes inputs from the presentation tier, makes decisions, and interacts with the
data tier as necessary. This layer is crucial for implementing the application’s primary operations and
workflows.
● Examples: Web servers, APIs, business logic code.
3. Data Tier
● Description: The data tier is responsible for managing and storing data. It communicates with the
application tier to fetch, store, or update data. This layer often contains a relational or NoSQL database
and is optimized for data management.
● Examples: Databases like MySQL, PostgreSQL, MongoDB, or data warehouses.
Common Tiered Architectures
1. 2-Tier Architecture
○ In a 2-tier architecture, the presentation and data layers communicate directly. This architecture
is typically simpler but has limited scalability and is less flexible.
○ Example: Client-server applications where a client directly interacts with the database.
2. 3-Tier Architecture
○ In a 3-tier architecture, the application logic is separated into its own tier, sitting between the
presentation and data layers. This separation allows for better scalability, security, and
maintenance.
○ Example: Most modern web applications use this structure, where the client (browser) talks to
the web server (logic tier), which then interacts with the database (data tier).
3. N-Tier Architecture
○ An extension of the 3-tier model, N-tier architecture includes additional layers, such as a service
layer or an integration layer. This setup is more flexible, allowing for complex, distributed
applications.
○ Example: Enterprise-level applications that may include multiple services, integrations, and
specialized business logic.
Benefits of Tiered Architecture
● Modularity: Allows for easier maintenance and updates, as each tier can be modified independently.
● Scalability: Each tier can be scaled independently to handle increased load.
● Security: Sensitive data can be protected in the data tier, with strict access control from the application
tier.
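A toy sketch of the 3-tier separation in Python; UserRepository, AccountService, and handle_request are illustrative names for the data, logic, and presentation tiers respectively.

```python
# Data tier: storage and retrieval only.
class UserRepository:
    _rows = {1: {"name": "Ram", "balance": 100}}
    def find(self, user_id):
        return self._rows.get(user_id)

# Application (logic) tier: business rules; talks to the data tier.
class AccountService:
    def __init__(self, repo):
        self.repo = repo
    def can_withdraw(self, user_id, amount):
        user = self.repo.find(user_id)
        return user is not None and user["balance"] >= amount

# Presentation tier: formats the result for the user; no business logic here.
def handle_request(user_id, amount, service):
    ok = service.can_withdraw(user_id, amount)
    return f"Withdrawal {'approved' if ok else 'declined'}"

print(handle_request(1, 50, AccountService(UserRepository())))
```

Because each tier only talks to the one below it, any layer can be swapped (say, a different database behind UserRepository) without touching the others.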
Let's break down authentication and authorization in simple terms, especially focusing on how you might
explain this to an interviewer in system design.
1. Understanding Authentication
● What it is: Authentication is the process of verifying the identity of a user. In simple words, it asks,
"Who are you?" When you try to log into a system, the system needs to confirm you are who you say
you are.
● Example: If your name is Ram and you want to access an application, you provide your
credentials—like a username and password. When you submit these, the system checks if they match
the stored records and confirms your identity. If the credentials are correct, it authenticates you. This
process is called authentication.
2. Understanding Authorization
● What it is: Authorization defines what actions you are allowed to perform after you are authenticated.
In other words, it asks, "What can you do?" Just because you have logged in does not mean you have
full control over everything.
● Example: Suppose you join a company, and after logging in, you access their database using your
credentials. You might be able to see the data but not necessarily modify it. Let’s say you have
“read-only” access; you can view tables but cannot delete or update records. The permissions you are
given—whether to read, update, or delete—are determined by authorization.
3. Key Differences
● Authentication: Confirms your identity. It’s like showing your ID to verify who you are.
● Authorization: Defines your permissions. It’s like showing a ticket at an event that only allows you into
certain areas.
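A minimal sketch of the two checks; the user store, plain-text password, and permission set are purely illustrative (real systems store salted password hashes, never plain text).

```python
# Toy user store: password and permissions per user (illustrative only).
USERS = {"ram": {"password": "s3cret", "permissions": {"read"}}}

def authenticate(username, password):
    """Authentication: 'Who are you?' (verify identity)."""
    user = USERS.get(username)
    return user is not None and user["password"] == password

def authorize(username, action):
    """Authorization: 'What can you do?' (check permissions)."""
    return action in USERS.get(username, {}).get("permissions", set())

assert authenticate("ram", "s3cret")    # identity confirmed
assert authorize("ram", "read")         # read-only access: allowed
assert not authorize("ram", "delete")   # delete: denied by authorization
```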
Let’s delve into token-based authentication, a popular method used to enhance security and improve user
experience in applications. Here’s how you can explain it in an interview context, followed by questions to test
your understanding.
Token-Based Authentication Explained
1. Initial Registration:
○ The process starts with the client (user) registering on the application. During registration, the
user creates a username and password. Once registered, these credentials are stored securely
in the system.
2. Login Process:
○ When the user wants to log in, they provide their username and password. Upon successful
verification, the server generates a token (usually a JSON Web Token, or JWT). This token is
unique and contains information about the user's identity and possibly their permissions.
3. Using the Token:
○ After logging in, the user will receive this token. For every subsequent request to the server
(e.g., accessing resources or services), the user does not need to provide their username and
password again.
○ Instead, the user includes the token in the HTTP headers of their requests. This allows the
server to authenticate the user without needing the original credentials again, enhancing
security.
4. Token Expiration:
○ Tokens are typically designed to expire after a certain period (e.g., 30 minutes). This means that
if the user is inactive for too long, the token will become invalid. If the user tries to use an
expired token, they will need to log in again to obtain a new token.
○ For example, in a banking application, after logging in, the user receives a token. If they remain
inactive for a while, the token will expire, and they will be logged out for security reasons.
5. Advantages:
○ Security: Since passwords are not sent with every request, it minimizes the risk of interception.
○ Convenience: Users don't need to enter their credentials repeatedly.
○ Statelessness: Tokens allow the server to be stateless, meaning it doesn't need to keep track
of session states, which can improve scalability.
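A minimal sketch of this flow using the third-party PyJWT library (one common choice; the secret and claims here are illustrative):

```python
import time
import jwt  # PyJWT: pip install PyJWT

SECRET = "server-side-secret"   # illustrative; keep real secrets out of code

def login(username):
    """Issue a token after credentials are verified (verification omitted)."""
    payload = {
        "sub": username,                    # who the token belongs to
        "exp": int(time.time()) + 30 * 60,  # expires in 30 minutes
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")

def handle_request(token):
    """Each request authenticates via the token; no password is resent."""
    try:
        claims = jwt.decode(token, SECRET, algorithms=["HS256"])
        return f"hello, {claims['sub']}"
    except jwt.ExpiredSignatureError:
        return "token expired: please log in again"

token = login("ram")
print(handle_request(token))
```

Note the statelessness: the server verifies the signature and expiry from the token itself, without looking up any session record.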
Understanding OAuth
OAuth (Open Authorization) is a widely used protocol that allows users to grant third-party applications limited
access to their resources on a different service without sharing their credentials (like passwords).
Benefits of OAuth
1. Security:
○ OAuth allows users to grant access to their information without exposing their passwords to
third-party applications. This minimizes the risk of credential theft.
2. Convenience:
○ Users can log into multiple applications without having to remember different usernames and
passwords for each service.
3. Granular Control:
○ Users can specify what data a third-party application can access and revoke permissions at any
time. For example, a user can allow an app to view their email but not modify it.
4. Standardization:
○ OAuth is an industry-standard protocol, which means many applications and services support it.
This makes it easier for developers to integrate secure authentication into their apps.
5. Single Sign-On (SSO):
○ OAuth is often used in conjunction with SSO solutions, allowing users to access multiple
applications with a single set of credentials.
A proxy server acts as an intermediary between a client (like a user's computer) and a server (like a web
service). It helps to route requests and responses between the client and the server. Let's break down the key
concepts and functions of a proxy server.
● A proxy server is either hardware or software that sits between a client and an application server,
providing various services, including anonymity, security, and caching.
Types of Proxy Servers
1. Forward Proxy:
○ A forward proxy is used to access the internet on behalf of a client. When a client requests a
resource (like a webpage), the request goes to the forward proxy, which then forwards it to the
target server. The target server only sees the proxy's IP address, not the client's IP address.
○ Example: If a student (client) wants to access a blocked website, they can use a forward proxy.
The proxy makes the request, and the blocked website does not know who the actual requester
is.
2. Reverse Proxy:
○ A reverse proxy sits in front of a web server and forwards requests to it. It can be used to
distribute load, provide SSL termination, and enhance security. Unlike a forward proxy, the client
is unaware of the reverse proxy's existence; they only know the IP of the reverse proxy.
How a Forward Proxy Works
1. Client Request:
○ When a client wants to access a resource (for example, a website), it sends a request to the
forward proxy.
2. Proxy Processes Request:
○ The proxy server receives the request and determines where to forward it based on the URL.
3. Request to Actual Server:
○ The proxy forwards the request to the actual server hosting the desired resource.
4. Response from Server:
○ The actual server sends the response back to the proxy.
5. Return to Client:
○ The proxy then sends the response back to the client.
6. Identity Concealment:
○ The actual server never sees the client's IP address; it only sees the proxy's IP address, which
helps to maintain the client's anonymity.
Benefits of a Forward Proxy
● Anonymity: The actual identity (IP address) of the client is hidden from the server, providing a level of
anonymity.
● Access Control: Organizations can use proxies to restrict access to certain websites or services based
on policies.
● Caching: Proxies can cache frequently accessed resources, reducing load times and saving
bandwidth.
● Bypassing Restrictions: Users can access blocked or restricted content by routing their requests
through a proxy server located in a different region.
● Improved Security: Proxies can act as an additional layer of security, filtering out malicious requests
before they reach the actual server.
A reverse proxy is a type of proxy server that sits in front of one or more web servers and forwards client
requests to them. It acts as an intermediary, but unlike a forward proxy that hides the client's identity, a reverse
proxy hides the identity of the backend servers from the clients. Let’s break down the concept:
Key Functions of a Reverse Proxy
1. Load Balancing:
○ Reverse proxies can distribute incoming traffic across multiple backend servers. This helps
manage load effectively and ensures no single server becomes overwhelmed, improving overall
performance and reliability.
2. Abstraction:
○ By using a reverse proxy, clients do not need to know the details of the backend servers. They
only interact with the reverse proxy, which abstracts the complexity of multiple servers.
3. Security:
○ A reverse proxy hides the identity and structure of the backend servers. If a client sends a
request, the actual server’s IP address remains concealed, protecting it from direct exposure to
the internet and potential attacks.
4. SSL Termination:
○ Reverse proxies can handle SSL encryption, allowing backend servers to focus on processing
requests rather than managing secure connections.
5. Caching:
○ They can cache responses from backend servers. If the same request is made multiple times,
the reverse proxy can serve the cached response instead of forwarding the request to the
server each time, thus improving response time and reducing load.
How a Reverse Proxy Works
1. Client Request:
○ A client (like a web browser) makes a request for a resource (e.g., a webpage) using the
domain name (e.g., amazon.com).
2. Reverse Proxy Receives Request:
○ The request goes to the reverse proxy instead of directly to the backend servers.
3. Routing the Request:
○ The reverse proxy determines which backend server should handle the request based on the
URL or request type.
4. Forwarding to Backend Server:
○ The reverse proxy forwards the request to the appropriate backend server, which processes the
request.
5. Response from Backend Server:
○ The backend server sends the response back to the reverse proxy.
6. Return to Client:
○ The reverse proxy then sends the response back to the client, completing the request.
Benefits of a Reverse Proxy
● Simplified Client Interaction: Clients interact with a single point (the reverse proxy), not worrying
about multiple backend servers.
● Enhanced Security: The backend server's IP addresses are not exposed, reducing the risk of attacks.
● Dynamic Routing: Requests can be routed to different servers based on various factors (like server
load or type of request), optimizing resource usage.
● Efficient Management: Administrators can change backend servers without disrupting service, as
clients continue to connect to the reverse proxy.
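A toy sketch combining two of these functions, caching and round-robin routing; the backend addresses are entirely made up, and forward is a stand-in for the real HTTP hop to a backend server.

```python
import itertools

# The backend pool is invisible to clients; they only ever see the proxy.
BACKENDS = itertools.cycle(["10.0.0.1:8080", "10.0.0.2:8080"])
CACHE = {}

def forward(backend, path):
    # Stand-in for the real HTTP request to the backend server.
    return f"response for {path} (served by {backend})"

def reverse_proxy(path):
    """Toy reverse proxy: caching plus round-robin routing to hidden backends."""
    if path in CACHE:                 # serve repeated requests from the cache
        return CACHE[path]
    backend = next(BACKENDS)          # pick a backend (load balancing)
    response = forward(backend, path) # the client never learns this address
    CACHE[path] = response
    return response

print(reverse_proxy("/products/1"))
print(reverse_proxy("/products/1"))   # second call: cache hit, no backend hop
```

The client only ever calls reverse_proxy; which backend answered, and whether the response came from cache, stays hidden behind it.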