DS Notes
UNIT – 1
ABSTRACT DATA TYPE
Abstract Data Types (ADTs) – ADTs and classes – introduction to OOP – classes in Python –
inheritance – namespaces – shallow and deep copying – Introduction to analysis of algorithms
– asymptotic notations – divide & conquer – recursion – analyzing recursive algorithms
• Data structure is a branch of computer science. The study of data structure helps to
understand how data is organized and how data flow is managed to increase the
efficiency of any process or program.
Definition
A data structure is a way of storing, organizing and retrieving data in a computer so that it can
be used efficiently. It provides a way to manage large amounts of data effectively.
• Primitive data structures are basic data structures. These can be manipulated or
operated directly by the machine level instructions. Basic data types such as integer,
real, character and Boolean come under this type. Example: int, char, float.
A data structure that maintains a linear relationship among the elements is called a
linear data structure.
Here, the data are arranged in a sequential fashion. But in the memory the
arrangement may not be sequential.
A data structure that maintains the data elements in hierarchical order is
known as a nonlinear data structure. Here, the elements are present at various levels.
• In static data structures, the content can be modified without changing the memory
space allocated to it. Example: Array
In dynamic data structures the size of the structure is not fixed and can be modified during
the operations performed on it.
• Dynamic data structures are designed to facilitate change of data structures in the run
time. Example: Linked list.
• For example, suppose a new field must be added to a student record to keep track of
more information about each student, and it then becomes better to replace an array with
a linked structure to improve the program's efficiency. In such a scenario, rewriting every
procedure that uses the changed structure is not desirable.
• Therefore, a better alternative is to separate the use of a data structure from the details
of its implementation. This is the principle underlying use of abstract data type.
Definition
• Examples: List ADT, Stack ADT, Queue ADT, Trees, Graphs etc.
• The definition of ADT only mentions what operations are to be performed but not
how these operations will be implemented. It does not specify how data will be
organized in memory and what algorithms will be used for implementing the
operations. It is called “abstract” because it gives an implementation-independent
view. The process of hiding the nonessential details and providing only the essentials
of problem solving is known as data abstraction.
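As an illustration of separating what an ADT does from how it is implemented, the sketch below gives two interchangeable implementations of a small Stack ADT. The class names ListStack and LinkedStack and the helper reverse_with are hypothetical, chosen only for this example; code that uses just push, pop and is_empty works with either one.

```python
class ListStack:
    """Stack ADT stored in a Python list (top is the end of the list)."""
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return len(self._items) == 0


class LinkedStack:
    """Same Stack ADT stored as a chain of (item, next) pairs."""
    def __init__(self):
        self._head = None

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        item, self._head = self._head
        return item

    def is_empty(self):
        return self._head is None


def reverse_with(stack, values):
    """Uses only the ADT operations, so any implementation will do."""
    for v in values:
        stack.push(v)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result


print(reverse_with(ListStack(), [1, 2, 3]))
print(reverse_with(LinkedStack(), [1, 2, 3]))
```

Both calls give the same result, which is the point: the caller depends on the operations, not on how the data is organized in memory.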
The set of operations can be grouped into four categories. They are,
• Constructors : Functions used to initialize new instances of the ADT at the time of
instantiation.
• ADTs give the feel of a plug and play interface, so it is easy for the programmer to
use an ADT. For example, to store a collection of items, the programmer can simply put the
items in a list. That is,
BirdsList = ['Parrot', 'Dove', 'Duck', 'Cuckoo']
• To find number of items in a list, the programmer can use len function without
implementing the code.
• ADTs reduce program development time, because the programmer can use the
predefined functions rather than developing the logic and implementing the same.
• The usage of ADTs reduces the chance of logical errors in the code, as the
predefined functions in the ADTs are already well tested.
• ADTs provide well defined interfaces for interacting with the implementing code.
In this way, the user can use an abstract data type to define new class by specifying
attributes and operations
• The main ‘actors’ in the object-oriented paradigm are called objects. Each object is an
instance of a class.
• Each class presents to the outside world a concise and consistent view of the objects,
that are instances of this class, without going into too much unnecessary detail or
giving others access to the inner workings of the objects.
• The class definition typically specifies instance variables, also known as data
members, that the object contains, as well as the methods, also known as member
functions, that the object can execute.
• This view of computing is intended to fulfill several goals and incorporate several
design principles.
Example 2:
• This Student class would represent a composite type of data, named Student,
consisting of student_name (string type), student_id (integer or string type) and
student_mark (float type).
• Thus, an instance of this student class represents an abstract view of the student
details. The class Student can be considered to represent a data type that is abstract in
nature. Thus, the classes are known to be abstract data type (ADT).
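A minimal sketch of the Student class described above, assuming the three attributes are set in the constructor. The describe method and the sample values are hypothetical additions for illustration.

```python
class Student:
    """Composite data type holding the details of one student."""
    def __init__(self, student_name, student_id, student_mark):
        self.student_name = student_name      # string
        self.student_id = student_id          # integer or string
        self.student_mark = student_mark      # float

    def describe(self):
        # Hypothetical helper: returns a one-line summary of the record.
        return f"{self.student_id}: {self.student_name} ({self.student_mark})"


s = Student("Amala", 1091, 87.5)
print(s.describe())
```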
Robustness
• Every good programmer wants to develop software that is correct, which means that a
program produces the right output for all the anticipated inputs in the program’s
application. The software is said to be robust, if it is capable of handling unexpected
inputs that are not explicitly defined for its application. For example, if a program is
expecting a positive integer and instead is given a negative integer, then the program
should be able to recover gracefully from this error.
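The positive-integer case above can be sketched as follows. The function name read_positive_int and its default parameter are hypothetical; the function reports bad input and falls back to a default value instead of crashing.

```python
def read_positive_int(text, default=1):
    """Convert text to a positive integer, recovering gracefully from bad input."""
    try:
        value = int(text)
    except ValueError:
        print(f"'{text}' is not an integer; using {default}")
        return default
    if value <= 0:
        print(f"{value} is not positive; using {default}")
        return default
    return value


print(read_positive_int("25"))
print(read_positive_int("-4"))
print(read_positive_int("abc"))
```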
• Modularity
• Abstraction
• Encapsulation
Modularity
• Modern software systems consist of several different components that must interact
correctly in order for the entire system to work properly. Keeping these interactions
straight requires that these different components be well organized. Modularity refers to an
organizing principle in which the different components of a software system are divided
into separate functional units.
• In Python, a module is a collection of closely related functions and classes that are
defined together in a single file of source code. For example, Python's standard library
math module provides definitions for key mathematical constants and functions, and the
os module provides support for interacting with the operating system.
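A short sketch of using the two modules mentioned above. The path string is a made-up value used only to show a call from os.path.

```python
import math   # mathematical constants and functions
import os     # interface to the operating system

print(math.pi)                               # a constant defined in the math module
print(math.sqrt(16))                         # a function defined in the math module
print(os.path.basename("notes/unit1.txt"))   # path handling provided through os
```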
Abstraction
• Abstraction allows dealing with the complexity of the object. Abstraction allows
picking out the relevant details of the object, and ignoring the non-essential details.
• Applying the abstraction to the design of data structures gives rise to Abstract Data
Types (ADTs). An ADT is a mathematical model of a data structure that specifies the
type of data stored, the operations supported on them, and the types of parameters of
the operations. An ADT specifies what each operation does, but not how it does it.
• Python supports abstract data types using a mechanism known as an abstract base
class (ABC). An abstract base class cannot be instantiated (i.e., you cannot directly
create an instance of that class), but it defines one or more common methods that all
implementations of the abstraction must have. An ABC is realized by one or more
concrete classes that inherit from the abstract base class while providing
implementations for the methods declared by the ABC.
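The mechanism can be sketched with the standard abc module. The Shape and Rectangle classes are hypothetical names chosen for this illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: declares what every shape must provide."""
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Rectangle(Shape):
    """Concrete class: supplies the implementation declared by Shape."""
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


print(Rectangle(3, 4).area())

try:
    Shape()                      # an ABC cannot be instantiated directly
except TypeError as err:
    print("Cannot instantiate:", err)
```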
Encapsulation
• Encapsulation means information hiding. It hides the data defined in the class.
Encapsulation separates implementation of the class from its interface. The interaction
with the class is through the interface provided by the set of methods defined in the
class. This separation of interface from its implementation allows changes to be made
in the class without affecting its interface.
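A small sketch of this separation, using a hypothetical Account class. Callers go through deposit and get_balance, so the internal representation (here a double-underscore attribute, which Python name-mangles) can change without affecting them.

```python
class Account:
    def __init__(self, opening_balance):
        self.__balance = opening_balance      # hidden internal state

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def get_balance(self):
        return self.__balance


acct = Account(100)
acct.deposit(50)
print(acct.get_balance())
print(hasattr(acct, "__balance"))   # the plain name is not visible from outside
```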
• It describes the main elements of a solution in an abstract way that can be specialized
for a specific problem at hand. The design pattern can be consistently applied to
implementations of data structures and algorithms.
• These design patterns fall into two groups: patterns for solving algorithm design
problems and patterns for solving software engineering problems. The algorithm design
patterns include:
• Recursion
• Amortization
• Divide and conquer
• Prune and search
• Brute force
• Dynamic Programming
• The greedy method
Likewise, the software engineering design patterns include:
• Iterator
• Adapter
• Position
• Composition
• Template method
• Locator
CLASSES IN PYTHON
The data values stored inside an object are called attributes. The state information for
each instance is represented in the form of attributes (also known as fields, instance
variables, or data members).
A class provides a set of behaviors in the form of member functions (also known as
methods), with implementations that are common to all instances of that class.
Defining a class
A class is the definition of data and methods for a specific type of object.
• Syntax:
class classname:
    <statement1>
    <statement>
The class definition begins with the keyword class, followed by the name of the class,
a colon and an indented block of code that serves as the body of the class.
The body includes definitions for all methods of the class. These methods are defined
as functions, with a special parameter, named self, that is used to identify the
particular instance upon which a member is invoked.
When a class definition is entered, a new namespace is created, and used as the local
scope. Thus, all assignments to local variables go into this new namespace.
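• Example:
class customer:
    # constructor: stores the customer's details on the instance
    def __init__(self, name, iden, acno):
        self.custName = name
        self.custID = iden
        self.custAccNo = acno
    def display(self):
        print("Customer ID = ", self.custID)
c = customer("Ramesh", 10046, 327659)
c.display()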
• In Python, the self identifier plays a key role. self identifies the instance upon which
a method is invoked. When writing a method in a class, at least one parameter has to be
declared, and it is called the self parameter. The self parameter is a reference to the
current instance of the class and is used to access variables that belong to that instance.
• By convention, self is the name of the first parameter of every instance method; it
refers to the current instance of the class. So self is used to access all the instance
variables and instance methods.
Object Creation
An object is the runtime entity used to provide the functionality to the python class.
The attributes defined inside the class are accessed only using objects of that class.
As soon as a class is created with attributes and methods, a new class object is created
with the same name as the class.
This class object permits access to the different attributes as well as instantiation of new
objects of that class.
An instance of the class is created by calling the class name, and this is
known as object instantiation.
Syntax:
object_name = class_name(arguments)
A method of the object is then called using the dot operator.
Syntax:
object_name.function_name()
• A class variable is defined in the class and can be used by all the instances of that class.
An instance variable is defined in a method and its scope is only within the object that
defines it.
Every object of the class has its own copy of an instance variable. Any change made to the
variable is not reflected in other objects of that class.
Instance variables are unique for each instance, while class variables are shared by all
instances.
Example:
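class Order:
    def __init__(self, coffee_name, price):
        self.coffee_name = coffee_name
        self.price = price
ram_order = Order("Espresso", 210)
paul_order = Order("Latte", 275)
print(ram_order.coffee_name)
print(ram_order.price)
print(paul_order.coffee_name)
print(paul_order.price)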
• In this example, coffee_name and price are instance variables. ram_order and
paul_order are the two instances of this class. Each of these instances has its own
values set for the coffee_name and price instance variables.
• When Ram's order details are printed in the console, the values Espresso and 210 are
displayed. When Paul's order details are printed in the console, the values Latte and
275 are displayed.
• This shows that instance variables can have different values for each instance of the
class, whereas class variables are the same across all instances.
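The example above uses only instance variables. The following sketch adds a class variable to show the contrast; the Order class is declared again here with a hypothetical shop_name class variable.

```python
class Order:
    shop_name = "Campus Cafe"           # class variable: shared by all instances

    def __init__(self, coffee_name, price):
        self.coffee_name = coffee_name  # instance variables: one copy per object
        self.price = price


ram_order = Order("Espresso", 210)
paul_order = Order("Latte", 275)

print(ram_order.shop_name, paul_order.shop_name)   # same value from both objects
Order.shop_name = "City Cafe"                      # change it once, on the class
print(ram_order.shop_name, paul_order.shop_name)   # both objects see the change
print(ram_order.price, paul_order.price)           # instance values stay separate
```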
Constructor
In Python, the __init__() method is called the constructor of the class. It is always
called when an object is created.
Syntax:
def __init__(self):
    # body of the constructor
Where,
__init__() method: It is a reserved method. This method gets called as soon as an
object of a class is instantiated.
self: The first argument self refers to the current object. It binds the instance to the
__init__() method. It is usually named self to follow the naming convention.
Example:
class student:
    def __init__(self, rollno, name, age):
        self.rollno = rollno
        self.name = name
        self.age = age
s = student(1091, "Amala", 21)
Output:
Types of constructors
1. Default Constructor
2. Non-Parameterized Constructor
3. Parameterized Constructor
Python adds a default constructor when the programmer does not include a
constructor in the class or forgets to declare it.
The default constructor does not perform any task other than initializing the object. It is
an empty constructor without a body.
Non-Parameterized Constructor
This constructor does not accept the arguments during object creation. Instead, it
initializes every object with the same set of values.
Example:
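class Employee:
    # constructor takes no arguments: every object gets the same values
    def __init__(self):
        self.name = "Ramesh"
        self.EmpId = 100456
    def display(self):
        # body reconstructed to match the output shown below
        print("Employee Id =", self.EmpId)
emp = Employee()
emp.display()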
Output:
Employee Id = 100456
Parameterized constructor
• The first parameter of the constructor is self, which is a reference to the object being
constructed, and the rest of the arguments are provided by the programmer. A parameterized
constructor can have any number of arguments.
Example:
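class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
    def display(self):
        # body reconstructed to match the output shown below
        print(self.name, self.age, self.salary)
# object creation reconstructed from the output shown below
emp1 = Employee("Banu", 23, 17500)
emp2 = Employee("Jack", 25, 18500)
emp1.display()
emp2.display()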
Output:
Banu 23 17500
Jack 25 18500
Destructor
A class can define a special method called a destructor with the help of __del__(). In
Python, the destructor is not called manually; it is called automatically when the
instance (object) is about to be destroyed.
class Student:
    # constructor
    def __init__(self, name):
        print('Inside Constructor')
        self.name = name
        print('Object initialized')
    def display(self):
        # body missing in the source; assumed to print the name
        print('Name:', self.name)
    # destructor
    def __del__(self):
        print('Inside destructor')
        print('Object destroyed')
# create object
s1 = Student('Raja')
s1.display()
# delete object
del s1
Output:
Inside Constructor
Object initialized
Name: Raja
Inside destructor
Object destroyed
ENCAPSULATION
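class Employee:
    # constructor (reconstructed; the def line is missing in the source)
    def __init__(self, name, salary, project):
        # data members
        self.name = name
        self.salary = salary
        self.project = project
    # methods (bodies missing in the source; the messages are assumed)
    def display(self):
        print("Name:", self.name, "Salary:", self.salary)
    def work(self):
        print(self.name, "is working on", self.project)
# sample values are hypothetical
emp = Employee("Ramesh", 25000, "Python")
emp.display()
emp.work()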
Output:
Advantages of Encapsulation
1. The main advantage of using encapsulation is the security of the data. Encapsulation
protects an object from unauthorized access.
2. Encapsulation hides an object's internal representation from the outside; this is
called data hiding.
4. Bundling data and methods within a class makes code more readable and
maintainable.
OPERATOR OVERLOADING
When the same built-in operator or function shows different behavior for objects of
different classes, this is called operator overloading.
Example:
# add 2 numbers
print(100 + 200)
print('Python' + 'Programming')
Output:
300
PythonProgramming
The operator + is used to carry out different operations for distinct data types. This is
one of the simplest occurrences of polymorphism in Python.
Example:
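class item:
    def __init__(self, price):
        self.price = price
    def __add__(self, other):   # overloads + (assumed; missing in the source)
        return self.price + other.price
b1 = item(400)
b2 = item(300)
print(b1 + b2)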
Output:
INHERITANCE
There are two ways in which a subclass can differentiate itself from its superclass. A
subclass may specialize an existing behavior by providing a new implementation that
overrides an existing method. A subclass may also extend its superclass by providing
brand new methods.
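Both ways of differentiating a subclass can be sketched in a few lines. The class and method names below are hypothetical and chosen only for this illustration.

```python
class Vehicle:
    def info(self):
        return "This is a vehicle"


class Car(Vehicle):
    def info(self):                 # specialize: overrides Vehicle.info
        return "This is a car"

    def honk(self):                 # extend: a brand new method
        return "Beep"


class Cart(Vehicle):
    pass                            # inherits info unchanged


print(Car().info())
print(Car().honk())
print(Cart().info())
```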
In Python, based upon the number of child and parent classes involved, there are five
types of inheritance. The types of inheritance are listed below:
i. Single inheritance
ii. Multiple inheritance
iii. Multilevel inheritance
iv. Hierarchical inheritance
v. Hybrid inheritance
Single Inheritance
In single inheritance, a child class inherits from a single-parent class. Here is one
child class and one parent class.
Example:
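# Base class
class Vehicle:
    def Vehicle_info(self):
        print('Inside Vehicle class')   # message assumed; body missing in the source
# Child class
class Car(Vehicle):
    def car_info(self):
        print('Inside Car class')       # message assumed; body missing in the source
car = Car()
car.Vehicle_info()
car.car_info()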
Output:
Multiple Inheritance
• In multiple inheritance, one child class can inherit from multiple parent classes. So
here is one child class and multiple parent classes.
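# Parent class 1
class Person:
    def person_info(self, name, age):
        print('Name:', name, 'Age:', age)   # body assumed; missing in the source
# Parent class 2
class Company:
    pass                                    # members missing in the source
# Child class
class Employee(Person, Company):
    pass                                    # inherits from both parent classes
emp = Employee()
# access data
emp.person_info('Jessica', 28)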
Output:
Multilevel Inheritance
• In multilevel inheritance, a class inherits from a child class or derived class. Suppose
three classes A, B, C. A is the superclass, B is the child class of A, C is the child class
of B. In other words, a chain of classes is called multilevel inheritance.
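# Base class
class Vehicle:
    def Vehicle_info(self):
        print('Inside Vehicle class')      # message assumed; body missing in the source
# Child class
class Car(Vehicle):
    def car_info(self):
        print('Inside Car class')          # message assumed
# Child class
class SportsCar(Car):
    def sports_car_info(self):
        print('Inside SportsCar class')    # message assumed
s_car = SportsCar()
s_car.Vehicle_info()
s_car.car_info()
s_car.sports_car_info()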
Output:
Hierarchical Inheritance
• In Hierarchical inheritance, more than one child class is derived from a single parent
class. In other words, one parent class and multiple child classes.
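class Vehicle:
    def info(self):
        print("This is Vehicle")
class Car(Vehicle):
    def car_info(self, name):
        print("Car name is:", name)     # body assumed; missing in the source
class Truck(Vehicle):
    def truck_info(self, name):
        print("Truck name is:", name)   # body assumed; missing in the source
obj1 = Car()
obj1.info()
obj1.car_info('BMW')
obj2 = Truck()
obj2.info()
obj2.truck_info('Ford')
Output:
This is Vehicle
Car name is: BMW
This is Vehicle
Truck name is: Ford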
Hybrid Inheritance
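class Vehicle:
    def vehicle_info(self):
        print("Inside Vehicle class")      # message assumed
class Car(Vehicle):
    def car_info(self):
        print("Inside Car class")          # message assumed
class Truck(Vehicle):
    def truck_info(self):
        print("Inside Truck class")        # message assumed
# class header missing in the source; SportsCar is assumed to derive from Car and Vehicle
class SportsCar(Car, Vehicle):
    def sports_car_info(self):
        print("Inside SportsCar class")    # message assumed
# create object
s_car = SportsCar()
s_car.vehicle_info()
s_car.car_info()
s_car.sports_car_info()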
Output:
• In this application, we wish to create a new list named palette, which is a copy of the
warmtones list, and subsequently want to add additional colors to palette,
or to modify or remove some of the existing colors, without affecting the contents of
warmtones.
• palette = warmtones
• This creates an alias and no new list is created. Instead, the new
identifier palette references the original list. If we add or remove colors from palette,
the warmtones list changes as well.
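The difference between an alias, a shallow copy and a deep copy can be sketched with the standard copy module. The contents of warmtones below are hypothetical; each color is stored as a mutable list so that the sharing of elements is visible.

```python
import copy

warmtones = [["orange", 249, 124, 43], ["brown", 169, 163, 52]]

alias = warmtones                    # no new list: both names refer to one object
shallow = copy.copy(warmtones)       # new outer list, but the same inner color lists
deep = copy.deepcopy(warmtones)      # new outer list and new inner color lists

shallow.append(["red", 255, 0, 0])   # does not affect warmtones
shallow[0][0] = "amber"              # shared inner list: warmtones sees this change
deep[1][0] = "tan"                   # fully independent: warmtones is untouched

print(alias is warmtones)            # True
print(len(warmtones))                # 2
print(warmtones[0][0])               # amber
print(warmtones[1][0])               # brown
```

A shallow copy is enough when the elements are immutable or will not be modified in place; a deep copy is needed when the copy must be able to change the elements themselves.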
• To bake a cake, we can get different recipes from the internet. We can find any
number of steps for different varieties of cakes. Each of those step-by-step
procedures to make a cake can be called an algorithm. We can choose the simplest, easiest
and most convenient way to make a cake.
• Similarly, in computer science, multiple algorithms are available for solving the same
problem. Algorithm analysis helps us to determine which algorithm is most
efficient in terms of running time and memory space consumed.
Complexities of an Algorithm
• The complexity of an algorithm measures the amount of time and space required by
the algorithm for an input of size n. The complexity of an algorithm is of two
types: time complexity and space complexity.
Time complexity measures the amount of time required to run an algorithm, as input size of
the algorithm increases.
Space complexity measures the total amount of memory that an algorithm or operation needs
to run according to its input size.
ASYMPTOTIC NOTATIONS
Asymptotic notation is one of the most common ways to express the time complexity of an
algorithm.
Asymptotic notations are mathematical tools to represent time complexity of algorithms for
asymptotic analysis.
The three asymptotic notations used to represent time complexity of algorithms are
Big Oh (O), Big Omega (Ω) and Big Theta (Θ).
Big Oh is the asymptotic notation commonly used for the worst-case scenario. The Big-O
notation defines an asymptotic upper bound of an algorithm; it bounds a function only from above.
It is represented as f(n) = O(g(n)). That is, at higher values of n, the upper bound of f(n) is
a constant multiple of g(n).
f(n) ∈ O(g(n)) if and only if there exist some positive constant c and some non-negative
integer n₀ such that f(n) ≤ c g(n) for all n ≥ n₀, where n₀ ≥ 1 and c > 0.
The above definition says, in the worst case, let the function f(n) be the algorithm's runtime,
and g(n) be an arbitrary time complexity. Then O(g(n)) says that the function f(n) never
grows faster than g(n), that is, f(n) ≤ c g(n) for large n. Thus c g(n) is a function that
gives the maximum runtime (upper bound) and f(n) is the algorithm's runtime.
• Big Omega is the asymptotic notation commonly used for the best-case scenario. Big Ω
notation defines an asymptotic lower bound.
• f(n) ∈ Ω(g(n)) if and only if there exist some positive constant c and some non-
negative integer n₀ such that, f(n) ≥ c g(n) for all n ≥ n₀, n₀ ≥ 1 and c>0.
• The above definition says, in the best case, let the function f(n) be the algorithm's
runtime and g(n) be an arbitrary time complexity. Then Ω(g(n)) says that the
function f(n) never grows more slowly than g(n), i.e. f(n) ≥ c g(n) for large n, so c g(n)
indicates the minimum number of steps that the algorithm will take.
• Here, c g(n) is a function that gives the minimum runtime (lower
bound) and f(n) is the algorithm's runtime.
• Big Theta is the asymptotic notation for a tight bound, and it is often loosely associated
with the average case. Theta notation bounds a function from both below and above, so it
lies between the lower bound and the upper bound. It provides an asymptotically tight bound
for the growth rate of an algorithm. If the upper bound and lower bound of the function have
the same rate of growth, then the Θ notation will also have that rate of growth.
• f(n) ∈ Θ(g(n)) if and only if there exist some positive constants c₁ and c₂ and some
non-negative integer n₀ such that c₁ g(n) ≤ f(n) ≤ c₂ g(n) for all n ≥ n₀, where n₀ ≥ 1 and
c₁, c₂ > 0.
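The three definitions can be checked numerically for a concrete function. The sketch below assumes f(n) = 3n + 2 and g(n) = n, with the constants c = 4, n₀ = 2 for Big Oh and c = 3, n₀ = 1 for Big Omega; these values are chosen for illustration. A finite check like this is not a proof; it only illustrates the inequalities.

```python
def f(n):
    return 3 * n + 2      # assumed runtime function

def g(n):
    return n              # assumed comparison function

def holds(predicate, n0, limit=10_000):
    """Check predicate(n) for every n from n0 up to limit."""
    return all(predicate(n) for n in range(n0, limit + 1))

# Big Oh: f(n) <= c*g(n) with c = 4, n0 = 2
print(holds(lambda n: f(n) <= 4 * g(n), n0=2))
# Big Omega: f(n) >= c*g(n) with c = 3, n0 = 1
print(holds(lambda n: f(n) >= 3 * g(n), n0=1))
# Big Theta: both bounds at once with c1 = 3, c2 = 4, n0 = 2
print(holds(lambda n: 3 * g(n) <= f(n) <= 4 * g(n), n0=2))
# The upper bound fails below n0: f(1) = 5 is greater than 4*g(1) = 4
print(f(1) <= 4 * g(1))
```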
• A divide and conquer algorithm works by recursively breaking down a problem into
two or more subproblems of the same type, until they become simple enough to be
solved directly. The solutions to the subproblems are then combined to give a
solution to the original problem.
i. Divide the problem into a number of subproblems that are smaller instances of the
same problem.
ii. Conquer the subproblems by solving them recursively. If they are small enough,
solve the subproblems as base cases.
iii. Combine the solutions to the subproblems into the solution for the original problem.
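The three steps can be sketched with merge sort, used here only as a familiar illustration of the pattern. The function name merge_sort is chosen for this example.

```python
def merge_sort(data):
    """Return a new sorted list using divide and conquer."""
    if len(data) <= 1:                 # base case: already sorted
        return list(data)
    mid = len(data) // 2
    left = merge_sort(data[:mid])      # divide, then conquer the first half
    right = merge_sort(data[mid:])     # divide, then conquer the second half
    merged = []                        # combine: merge two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])            # copy whatever remains of either half
    merged.extend(right[j:])
    return merged


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
```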
It efficiently uses cache memory without occupying much space because it solves
simple subproblems within the cache memory instead of accessing the slower main
memory.
Divide and conquer allows the subproblems to be solved independently. This
allows for execution of subproblems on different processors.
Recursion is incorporated into the majority of divide and conquer algorithms; hence it
requires intensive memory management.
If the recursion goes deeper than the call stack allows, the program can
even crash.
Applications
Binary Search
Matrix multiplication
RECURSION
• Recursion is a technique by which a function makes one or more calls to itself during
execution.
• The factorial function is a classic mathematical function that has a natural recursive
definition.
• The factorial of a positive integer n is defined as the product of the integers from 1 to
n. If n = 0, then n! is defined as 1.
• In this case, n = 0 is the base case. It is defined non-recursively in terms of fixed
quantities. n! = n · (n−1)! is the recursive case.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
In this function, repetition is provided by the repeated recursive invocation of the
function. Each time the function is invoked, its argument is smaller by one, and when the
base case is reached, no further recursive calls are made.
Binary Search
• When the sequence is unsorted the standard approach for searching a target value is
sequential search. When the sequence is sorted and indexable, then binary search is
used to efficiently locate a target value within a sorted sequence of n elements.
• For any index j, all the values stored at indices 0 to j-1 are less than or equal to the
value at index j, and all the values stored at indices j+1 to n-1 are greater than or equal to
that at index j. This allows us to quickly search for a target value.
• The algorithm maintains two parameters, low and high, such that all the candidate
entries have index at least low and at most high. Initially, low = 0 and high = n−1.
Then we compare the target value to the median candidate, that is, the item data[mid]
with index mid = ⌊(low+high)/2⌋.
If the target equals data[mid], then we have found the item, and the search terminates
successfully.
If target < data[mid], then we recur on the first half of the sequence, that is, on the
interval of indices from low to mid−1.
If target > data[mid], then we recur on the second half of the sequence, that is, on the
interval of indices from mid+1 to high.
An unsuccessful search occurs if low > high, as the interval [low, high] is empty. This
algorithm is known as binary search.
Implementation
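A minimal recursive sketch that follows the steps described above. The function name binary_search, its parameter order and the sample data are assumptions for this illustration.

```python
def binary_search(data, target, low, high):
    """Return True if target is found in the sorted list data[low..high]."""
    if low > high:
        return False                      # interval is empty: unsuccessful search
    mid = (low + high) // 2               # integer (floor) division
    if target == data[mid]:
        return True                       # found a match
    elif target < data[mid]:
        return binary_search(data, target, low, mid - 1)    # first half
    else:
        return binary_search(data, target, mid + 1, high)   # second half


data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
print(binary_search(data, 22, 0, len(data) - 1))
print(binary_search(data, 6, 0, len(data) - 1))
```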
This binary search algorithm requires O(log n) time, whereas the sequential search
algorithm uses O(n) time.
Computing factorials
• To compute factorial(n), there are a total of n + 1 activations, and each individual
activation of factorial executes a constant number of operations. Therefore, we
conclude that the overall number of operations for computing factorial(n) is O(n), as
there are n+1 activations, each of which accounts for O(1) operations.
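The count of n + 1 activations can be observed directly. The sketch below uses a hypothetical variant, factorial_with_count, that returns the number of activations alongside the result.

```python
def factorial_with_count(n):
    """Return (n!, number of activations of the recursive function)."""
    if n == 0:
        return 1, 1                       # base case: one activation
    value, calls = factorial_with_count(n - 1)
    return n * value, calls + 1           # this activation plus the deeper ones


for n in (0, 1, 5, 10):
    value, calls = factorial_with_count(n)
    print(n, value, calls)                # calls is always n + 1
```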
• The efficiency of the English ruler algorithm depends on the number of lines that are
generated by an initial call draw_interval(c), where c denotes the center length. Each
line of output is based upon a call to the draw_line utility, and each recursive call to
draw_interval makes exactly one call to draw_line, unless c = 0.
• A call to draw_interval(c) for c > 0 spawns two calls to draw_interval(c−1) and a single
call to draw_line. For c ≥ 0, a call to draw_interval(c) results in precisely 2^c − 1 lines
of output.
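The line count can be checked with a small sketch of the two functions named above. As an assumption for this illustration, the lines are collected in a list passed as the output parameter instead of being printed, so that they can be counted.

```python
def draw_line(tick_length, output):
    """Record one tick line made of tick_length dashes."""
    output.append("-" * tick_length)

def draw_interval(center_length, output):
    """Record the ticks of an interval whose central tick has the given length."""
    if center_length > 0:                          # c = 0 draws nothing
        draw_interval(center_length - 1, output)   # top half
        draw_line(center_length, output)           # central tick
        draw_interval(center_length - 1, output)   # bottom half


for c in range(5):
    lines = []
    draw_interval(c, lines)
    print(c, len(lines), 2 ** c - 1)               # the two counts agree
```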