
DATA STRUCTURES

CLASSIFICATION OF DATA STRUCTURES

• Primitive and Non-primitive Data Structures


• Primitive data structures are the fundamental data types which are supported by a programming
language. Some basic data types are integer, real, character, and Boolean.
• The terms ‘data type’, ‘basic data type’, and ‘primitive data type’ are often used interchangeably.
• Non-primitive data structures are those data structures which are created using primitive data
structures.
• Examples of such data structures are linked lists, stacks, trees, and graphs.
• Non-primitive data structures can further be classified into two categories: linear and non-linear
data structures.
• ABSTRACT DATA TYPE
• An abstract data type is an abstraction of a data structure that provides only the
interface to which the data structure must adhere. The interface does not give any
specific details about how something should be implemented or in what programming
language.
• Abstract data types are the entities that are definitions of data and operations but do
not have implementation details.
• In this case, we know the data that we are storing and the operations that can be
performed on the data, but we don't know the implementation details. The
reason for not having implementation details is that every programming language
has a different implementation strategy. For example, a C data structure is
implemented using structures, while a C++ data structure is implemented using
objects and classes.
• For example, a List is an abstract data type that can be implemented using a dynamic
array or a linked list. A queue can be implemented as a linked list-based queue, an array-
based queue, or a stack-based queue. A Map can be implemented using a tree map, a hash
map, or a hash table.
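As a concrete illustration, here is a minimal sketch of what a stack ADT's interface might look like in C (the type and function names are illustrative, not from the slides): callers see only the operations, while the representation stays hidden in a separate source file.

/* stack.h -- interface only; the representation is hidden */
typedef struct Stack Stack;            /* opaque type: layout invisible to callers */

Stack *stack_create(void);             /* create an empty stack */
void   stack_push(Stack *s, int value);
int    stack_pop(Stack *s);            /* precondition: stack is not empty */
int    stack_is_empty(const Stack *s);
void   stack_destroy(Stack *s);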
• Advantage of using ADTs
• In the real world, programs evolve as a result of new requirements or constraints,
so a modification to a program commonly requires a change in one or more of its
data structures.
• For example, if you want to add a new field to a student's record to keep track of
more information about each student, it may be better to replace an array with
a linked structure to improve the program's efficiency.
• In such a scenario, rewriting every procedure that uses the changed structure is not
desirable. Therefore, a better alternative is to separate the use of a data structure
from the details of its implementation. This is the principle underlying the use of
abstract data types.
• ALGORITHMS

• ‘A formally defined procedure for performing some calculation’.


• An algorithm is basically a set of instructions that solve a problem. It is not uncommon
to have multiple algorithms to tackle the same problem, but the choice of a particular
algorithm must depend on the time and space complexity of the algorithm.
• An algorithm provides a blueprint to write a program to solve a particular problem. It is
considered to be an effective procedure for solving a problem in a finite number of steps.
• So a well-defined algorithm always provides an answer and is guaranteed to terminate.
• Algorithms are mainly used to achieve software reuse. Once written, an algorithm can be
implemented in any high-level language like C, C++, or Java.
• DIFFERENT APPROACHES TO DESIGNING AN ALGORITHM
• Algorithms are used to manipulate the data contained in data structures. When working with data structures,
algorithms are used to perform operations on the stored data.
• A complex algorithm is often divided into smaller units called modules. This process of dividing an algorithm into
modules is called modularization. The key advantages of modularization are as follows:
• It makes the complex algorithm simpler to design and implement.
• Each module can be designed independently. While designing one module, the details of other modules
can be ignored, thereby enhancing clarity in design, which in turn simplifies implementation, debugging, testing,
documenting, and maintenance of the overall algorithm.
• There are two main approaches to designing an algorithm—the top-down approach and the bottom-up approach.


• Top-down approach A top-down design approach starts by dividing the complex algorithm into one or more
modules.
• These modules can further be decomposed into one or more sub-modules, and this process of decomposition
is iterated until the desired level of module complexity is achieved.
• Top-down design method is a form of stepwise refinement where we begin with the topmost module and
incrementally add modules that it calls.
• Therefore, in a top-down approach, we start from an abstract design and then at each step, this design is
refined into more concrete levels until a level is reached that requires no further refinement.

• Bottom-up approach A bottom-up approach is just the reverse of top-down approach.


• In the bottom-up design, we start with designing the most basic or concrete modules and then proceed
towards designing higher level modules.
• The higher level modules are implemented by using the operations performed by lower level modules. Thus,
in this approach sub-modules are grouped together to form a higher level module.
• All the higher level modules are clubbed together to form even higher level modules. This process is
repeated until the design of the complete algorithm is obtained.
• TIME AND SPACE COMPLEXITY
• Analyzing an algorithm means determining the amount of resources (such as time and memory) needed to execute it.
• Algorithms are generally designed to work with an arbitrary number of inputs, so the efficiency or complexity of an
algorithm is stated in terms of time and space complexity.
• The time complexity of an algorithm is basically the running time of a program as a function of the input size.
• In other words, the number of machine instructions which a program executes is called its time
complexity. This number is primarily dependent on the size of the program's input and the algorithm used.

• The space complexity of an algorithm is the amount of computer memory that is required during the program execution as a
function of the input size.
• The space needed by a program depends on the following two parts:

• Fixed part: It varies from problem to problem. It includes the space needed for storing instructions, constants, variables, and
structured variables (like arrays and structures).
• Variable part: It varies from program to program. It includes the space needed for the recursion stack, and for structured
variables that are allocated space dynamically during the runtime of a program.
• Running time efficiency of algorithms
• Worst-case, Average-case, Best-case, and Amortized Time Complexity

• Worst-case running time This denotes the behavior of an algorithm with respect to the worst possible
case of the input instance.
• The worst-case running time of an algorithm is an upper bound on the running time for any input.
• Therefore, having the knowledge of worst-case running time gives us an assurance that the algorithm will
never go beyond this time limit.

• Average-case running time The average-case running time of an algorithm is an estimate of the running
time for an ‘average’ input.
• It specifies the expected behavior of the algorithm when the input is randomly drawn from a given
distribution.
• Average-case running time assumes that all inputs of a given size are equally likely.
• Best-case running time The term ‘best-case performance’ is used to analyze an algorithm
under optimal conditions.
• For example, the best case for a simple linear search on an array occurs when the desired
element is the first in the list.
• However, while developing and choosing an algorithm to solve a problem, we hardly
base our decision on the best-case performance. It is always recommended to improve the
average performance and the worst-case performance of an algorithm.

• Amortized running time Amortized running time refers to the time required to perform a
sequence of (related) operations averaged over all the operations performed.
• Amortized analysis guarantees the average performance of each operation in the worst
case.
Asymptotic Notations
• Asymptotic notations are the mathematical notations used to describe the running
time of an algorithm when the input tends towards a particular value or a limiting
value.
• For example: in bubble sort, when the input array is already sorted, the time taken by
the algorithm is linear, i.e., the best case.
• But when the input array is in reverse order, the algorithm takes the maximum
time (quadratic) to sort the elements, i.e., the worst case.
• When the input array is neither sorted nor in reverse order, it takes average time.
These durations are denoted using asymptotic notations.
• There are mainly three asymptotic notations:
1. Best Case (Omega Notation (Ω))
2. Average Case (Theta Notation (Θ))
3. Worst Case (Big O Notation (O))
• Omega notation on its own rarely helps to analyze an algorithm, because a guarantee
that holds only for the best-case inputs tells us little about typical or worst-case behavior.
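For reference, the standard formal definitions behind these notations (stated here for completeness; they are not spelled out on the slides) are:

\[
\begin{aligned}
f(n) = O(g(n)) &\iff \exists\, c > 0,\ n_0 > 0 \text{ such that } f(n) \le c\,g(n) \text{ for all } n \ge n_0,\\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 > 0 \text{ such that } f(n) \ge c\,g(n) \text{ for all } n \ge n_0,\\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)).
\end{aligned}
\]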
• Time–Space Trade-off
• The best algorithm to solve a particular problem at hand is no doubt the one
that requires less memory space and takes less time to complete its execution.
• Designing such an ideal algorithm is not a trivial task. There can be more than
one algorithm to solve a particular problem.
• One may require less memory space, while the other may require less CPU
time to execute. Thus, it is not uncommon to sacrifice one thing for the other.
Hence, there exists a time–space trade-off among algorithms.
• So, if space is a big constraint, then one might choose a program that takes
less space at the cost of more CPU time. On the contrary, if time is a major
constraint, then one might choose a program that takes minimum time to
execute at the cost of more space.
• Expressing Time and Space Complexity
• The time and space complexity can be expressed using a function f(n) where n is the input size for a
given instance of the problem being solved. Expressing the complexity is required when
• We want to predict the rate of growth of complexity as the input size of the problem increases.
• There are multiple algorithms that find a solution to a given problem and we need to find the
algorithm that is most efficient.
• The most widely used notation to express this function f(n) is the Big O notation. It provides the
upper bound for the complexity.

• Algorithm Efficiency
• If a function is linear (without any loops or recursion), its efficiency, or running time, can be
given as the number of instructions it contains.
• However, if an algorithm contains loops, then its efficiency may vary depending on
the number of loops and the running time of each loop in the algorithm.
• Let us consider different cases in which loops determine the efficiency of an algorithm, as in the sketch below.
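A minimal sketch (the variable names and sizes are illustrative): a single loop over n elements runs in O(n), while two nested loops over n elements run in O(n^2).

#include <stdio.h>

int main(void)
{
    int n = 5, sum = 0, pairs = 0;

    /* one loop over n elements: the body runs n times -> O(n) */
    for (int i = 0; i < n; i++)
        sum += i;

    /* two nested loops: the body runs n * n times -> O(n^2) */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            pairs++;

    printf("sum = %d, pairs = %d\n", sum, pairs);
    return 0;
}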
• CONTROL STRUCTURES USED IN ALGORITHMS
• An algorithm has a finite number of steps. Some steps may involve decision-making and repetition. Broadly speaking,
an algorithm may employ one of the following control structures:
• (a) sequence
• (b) decision
• (c) repetition

• Sequence
By sequence, we mean that each step of an algorithm is executed in a specified order. Let us write an algorithm to
add two numbers. This algorithm performs the steps in a purely sequential order.
• Decision
• Decision statements are used when the execution of a process depends on the outcome of some
condition.
• For example, if x = y, then print EQUAL.
IF condition Then process

Repetition
• Repetition involves executing one or more steps a number of times, and can be implemented using constructs such as the
while, do–while, and for loops. These loops execute one or more steps as long as some condition is true. All three
control structures are illustrated in the sketch below.
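A minimal C sketch tying the three control structures together (the variable names are illustrative): the additions run in sequence, the if statement makes a decision, and the for loop repeats a step.

#include <stdio.h>

int main(void)
{
    int a = 2, b = 3;

    int sum = a + b;            /* sequence: steps execute in order */
    printf("sum = %d\n", sum);

    if (a == b)                 /* decision: IF condition Then process */
        printf("EQUAL\n");
    else
        printf("NOT EQUAL\n");

    for (int i = 0; i < 3; i++) /* repetition: loop while the condition holds */
        printf("i = %d\n", i);

    return 0;
}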
• BIG O NOTATION
• In practice, we are hardly concerned with the exact running time of an algorithm.
• We are more interested in knowing the generic order of the magnitude of the algorithm. If we have two different
algorithms to solve the same problem where one algorithm executes in 10 iterations and the other in 20 iterations, the
difference between the two algorithms is not much. However, if the first algorithm executes in 10 iterations and the
other in 1000 iterations, then it is a matter of concern.
• We have seen that the number of statements executed in the program for n elements of the data is a function of the
number of elements, expressed as f(n). Even if the expression derived for a function is complex, a dominant factor
in the expression is sufficient to determine the order of the magnitude of the result and, hence, the efficiency of the
algorithm. This factor is the Big O, and is expressed as O(n).
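For example, if counting the statements executed gives f(n) = n²/2 + 5n + 100 (an illustrative polynomial), the n² term dominates as n grows, so the constant factor 1/2 and the lower-order terms are dropped and the algorithm is O(n²).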
• To summarize:
• Best-case O describes an upper bound for the best-case input; it is possibly lower than the
worst case. For example, when sorting an array the best case is when the array is already correctly
sorted.
• Worst-case O describes an upper bound for worst-case input combinations; it is possibly greater
than the best case. For example, when sorting an array the worst case is when the array is sorted in
reverse order.
• If we simply write O, it means the same as worst-case O.
ARRAY
• An array is a collection of similar data elements. These data elements have the same data type.
• The elements of the array are stored in consecutive memory locations and are referenced by an index
(also known as the subscript). The subscript is an ordinal number which is used to identify an element
of the array.
• DECLARATION OF ARRAYS
• An array must be declared before being used. Declaring an array means specifying the following:
• Data type—the kind of values it can store, for example, int, char, float, double.
• Name—to identify the array.
• Size—the maximum number of values that the array can hold.
type name[size];
where type is int, float, double, char, or any other valid data type.


• ACCESSING THE ELEMENTS OF AN ARRAY
• To access all the elements, we must use a loop. That is, we can access all the elements of an array by varying
the value of the subscript, as in the sketch below.
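A minimal sketch (the array name and size are illustrative):

#include <stdio.h>

int main(void)
{
    int arr[5] = {10, 20, 30, 40, 50};

    /* vary the subscript i to visit every element */
    for (int i = 0; i < 5; i++)
        printf("arr[%d] = %d\n", i, arr[i]);
    return 0;
}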

Calculating the address of array elements
If BA is the base address of an array, w is the size of each element, and the index starts at 0, then:
Address(arr[k]) = BA + w × k
For example, if BA = 1000 and the size of int is 2, then the address of arr[3] = 1000 + 2 × 3 = 1006.
• STORING VALUES IN ARRAYS
• When we declare an array, we are just allocating space for its elements; no values are stored in the array. There are three ways to
store values in an array.
• First, to initialize the array elements during declaration;
• second, to input values for individual elements from the keyboard;
• third, to assign values to individual elements.
• Inputting Values from the Keyboard
An array can be initialized by inputting values from the keyboard. In this method, a while/do–while or a for loop
is executed to input the value for each element of the array.

The index value i is incremented to access the next element in succession.
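A minimal sketch using a for loop and scanf (the array name and size are illustrative):

#include <stdio.h>

int main(void)
{
    int arr[10];

    /* read one value from the keyboard for each element */
    for (int i = 0; i < 10; i++)
        scanf("%d", &arr[i]);
    return 0;
}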


Values can also be assigned to individual elements. For example, we can copy the elements of one array
into another: when such a copy loop is executed, arr2[0] = arr1[0], arr2[1] = arr1[1],
arr2[2] = arr1[2], and so on.
Alternatively, we can assign to each element a computed value, for example a value equal to twice its
index, where the index starts from 0. After executing such a loop, we will have arr[0]=0,
arr[1]=2, arr[2]=4, and so on. Both loops are sketched below.
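Minimal sketches of the two assignment loops just described (the names arr1, arr2, arr, and the size are illustrative):

#include <stdio.h>

int main(void)
{
    int arr1[5] = {10, 20, 30, 40, 50};
    int arr2[5];
    int arr[5];

    /* copy arr1 into arr2 element by element */
    for (int i = 0; i < 5; i++)
        arr2[i] = arr1[i];

    /* assign each element twice its index: 0, 2, 4, ... */
    for (int i = 0; i < 5; i++)
        arr[i] = 2 * i;

    printf("%d %d\n", arr2[0], arr[2]);   /* prints: 10 4 */
    return 0;
}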
• Traversing an array
Traversing an array means accessing each and every element of the array for a specific purpose.
• Traversing the data elements of an array A can include printing every element, counting the total number of
elements, or performing any process on these elements.
• Since an array is a linear data structure (all its elements form a sequence), traversing its elements is
very simple and straightforward.
ASSIGNMENT: Write a program to find
the mean of n numbers using arrays.
• Inserting an element in an array
• If an element has to be inserted at the end of an existing array, then the task of insertion is quite simple.
• We just have to add 1 to the upper_bound and assign the value.
• Here, we assume that the memory space allocated for the array is still available.
• For example, if an array is declared to contain 10 elements, but currently it has only 8 elements, then
obviously there is space to accommodate two more elements. But if it already has 10 elements, then we will
not be able to add another element to it.

In Step 1, we increment the value of the upper_bound. In Step 2, the new value is stored at the position
pointed to by the upper_bound.
To insert an element in the middle of the array:
• In Step 1, we first initialize I with the total number of elements in the array.
• In Step 2, a while loop is executed which moves all the elements having an index greater
than POS one position towards the right to create space for the new element.
• In Step 5, we increment the total number of elements in the array by 1,
and finally, in Step 6, the new value is inserted at the desired position. A sketch of the whole insertion appears below.
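A minimal sketch of insertion at a given position, assuming the array still has spare capacity (the names n, pos, value, and the sizes are illustrative):

#include <stdio.h>

int main(void)
{
    int arr[10] = {10, 20, 30, 40, 50};   /* capacity 10, 5 elements used */
    int n = 5, pos = 2, value = 25;

    /* shift elements at index >= pos one place to the right */
    for (int i = n; i > pos; i--)
        arr[i] = arr[i - 1];

    arr[pos] = value;                     /* store the new value */
    n++;                                  /* one more element in the array */

    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);            /* prints: 10 20 25 30 40 50 */
    return 0;
}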
ASSIGNMENT: Write a program to insert a number at a given location in an array.
Write a program to insert a number in an array that is already sorted in ascending order.
• Deleting an element from an array
• Deleting an element from an array means removing a data element from an already existing array.
• If the element has to be deleted from the end of the existing array, then the task of deletion is quite simple. We just
have to subtract 1 from the upper_bound.

• To DELETE an element from the middle of the array, we must first find the location from where the
element has to be deleted and then move all the elements having an index greater than that of the deleted
element one position towards the left, so that the space vacated by the deleted element can be occupied by
the rest of the elements.
• The deletion algorithm first initializes I with the position from which the element
has to be deleted.
• In Step 2, a while loop is executed which moves all the elements having an index greater than POS one space
towards the left to occupy the space vacated by the deleted element.
• When we say that we are deleting an element, we are actually overwriting the element with the value of its
successive element.
• In Step 5, we decrement the total number of elements in the array by 1. A sketch appears below.
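A minimal sketch of deletion from a given position (the names n and pos and the values are illustrative):

#include <stdio.h>

int main(void)
{
    int arr[10] = {10, 20, 30, 40, 50};
    int n = 5, pos = 2;                   /* delete the element at index 2 */

    /* overwrite each element with its successor, from pos onwards */
    for (int i = pos; i < n - 1; i++)
        arr[i] = arr[i + 1];

    n--;                                  /* one element fewer */

    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);            /* prints: 10 20 40 50 */
    return 0;
}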
• Merging Two arrays
• Merging two arrays in a third array means first copying the contents of the first array into the third array and then
copying the contents of the second array into the third array.
• Hence, the merged array contains the contents of the first array followed by the contents of the second array.
• If the arrays are unsorted, then merging them is very simple, as one just needs to copy the contents of one array
into the merged array followed by the contents of the other.
• But merging is not a trivial task when the two arrays are sorted and the merged array also needs to be sorted.
Suppose we have two sorted arrays and the resultant merged array also needs to be a sorted one.

• Here, we first compare the 1st element of array1 with the 1st element of array2, and then put the smaller element
in the merged array.
• Since 20 > 15, we put 15 as the first element in the merged array.
• We then compare the 2nd element of the second array with the 1st element of the first array.
• Since 20 < 22, now 20 is stored as the second element of the merged array.
• Next, the 2nd element of the first array is compared with the 2nd element of the second array. Since 30 > 22,
we store 22 as the third element of the merged array.
• Now, we will compare the 2nd element of the first array with the 3rd element of the second array. Because 30 <
31, we store 30 as the 4th element of the merged array.
• This procedure will be repeated until elements of both the arrays are placed in the right location
in the merged array.
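A minimal sketch of the sorted merge, using example arrays consistent with the walkthrough above (the exact contents are illustrative):

#include <stdio.h>

int main(void)
{
    int arr1[] = {20, 30, 45};
    int arr2[] = {15, 22, 31, 60};
    int n1 = 3, n2 = 4;
    int merged[7];
    int i = 0, j = 0, k = 0;

    /* repeatedly copy the smaller front element into the merged array */
    while (i < n1 && j < n2)
        merged[k++] = (arr1[i] <= arr2[j]) ? arr1[i++] : arr2[j++];

    while (i < n1) merged[k++] = arr1[i++];   /* leftovers from arr1 */
    while (j < n2) merged[k++] = arr2[j++];   /* leftovers from arr2 */

    for (k = 0; k < n1 + n2; k++)
        printf("%d ", merged[k]);             /* prints: 15 20 22 30 31 45 60 */
    return 0;
}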
• PASSING ARRAYS TO FUNCTIONS
• Like variables of other data types, we can also pass an array to a function. In some situations, you may want
to pass individual elements of the array; while in other situations, you may want to pass the entire array.
• Passing Individual elements
The individual elements of an array can be passed to a function by passing either their data values or
addresses
• Passing Data Values
Individual elements can be passed in the same manner as we pass variables of any other data type. The
condition is just that the data type of the array element must match with the type of the function
parameter.
Passing the entire array
• We have discussed that in C the array name refers to the first byte of the array in the memory.
• The address of the remaining elements in the array can be calculated using the array name and the index value of the
element. Therefore, when we need to pass an entire array to a function, we can simply pass the name of the array.

• A function that accepts an array can declare the formal parameter in either of the two following ways.
func(int arr[]); or func(int *arr);
• When we pass the name of an array to a function, the address of the zeroth element of the array is
copied to the local pointer variable in the function.
• When a formal parameter is declared in a function header as an array, it is interpreted as a pointer to a
variable and not as an array.
• With this pointer variable you can access all the elements of the array by using the expression:
array_name + index. You can also pass the size of the array as another parameter to the function.
• So for a function that accepts an array as parameter, the declaration should be as follows.
func(int arr[], int n); or func(int *arr, int n);
• We can also pass a part of the array known as a sub-array. A pointer to a sub-array is also an array pointer.

func(&arr[2], 8);
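A minimal sketch of passing an entire array, and a sub-array, to a function (the function name sum_array is illustrative):

#include <stdio.h>

/* int arr[] in the parameter list is interpreted as int *arr */
int sum_array(int arr[], int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += arr[i];
    return sum;
}

int main(void)
{
    int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    printf("%d\n", sum_array(arr, 10));      /* whole array: 55 */
    printf("%d\n", sum_array(&arr[2], 8));   /* sub-array from index 2: 52 */
    return 0;
}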
• POINTERS AND ARRAYS
• The concept of array is very much bound to the concept of pointer.
int arr[] = {1, 2, 3, 4, 5};
• Array notation is a form of pointer notation. The name of the array is the starting address of the array in
memory. It is also known as the base address.
• In other words, base address is the address of the first element in the array or the address of arr[0]. Now let us
use a pointer variable as given in the statement below.
int *ptr;
ptr = &arr[0];
• Here, ptr is made to point to the first element of the array. Execute the code given below and observe the output
which will make the concept clear to you.

#include <stdio.h>

int main(void)
{
    int arr[] = {1, 2, 3, 4, 5};
    /* all three expressions print the same address: the base address */
    printf("\n Address of array = %p %p %p",
           (void *)arr, (void *)&arr[0], (void *)&arr);
    return 0;
}
• If pointer variable ptr holds the address of the first element in the array, then the address of successive
elements can be calculated by writing ptr++.
int *ptr = &arr[0];
ptr++;
printf("\n The value of the second element of the array is %d", *ptr);
• ARRAYS OF POINTERS
An array of pointers can be declared as
int *ptr[10];
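A minimal sketch of how such an array of pointers might be used (the variable names and values are illustrative):

#include <stdio.h>

int main(void)
{
    int a = 10, b = 20, c = 30;
    int *ptr[10];                 /* an array of 10 pointers to int */

    ptr[0] = &a;                  /* each slot stores the address of an int */
    ptr[1] = &b;
    ptr[2] = &c;

    for (int i = 0; i < 3; i++)
        printf("%d ", *ptr[i]);   /* prints: 10 20 30 */
    return 0;
}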

• TWO-DIMENSIONAL ARRAYS
• A two-dimensional array is specified using two subscripts where the first subscript denotes the row and the second
denotes the column. The C compiler treats a two-dimensional array as an array of one-dimensional arrays.
• Declaring Two-dimensional arrays
Any array must be declared before being used. The declaration statement tells the compiler the name of the array,
the data type of each element in the array, and the size of each dimension. A two-dimensional array is declared as:
data_type array_name[row_size][column_size];

Consider a 20 × 5 two-dimensional array marks which has its base address = 1000
and the size of an element = 2. Now compute the address of the element marks[18][4],
assuming that the elements are stored in row major order.
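Under row-major order, the address works out as follows (a worked solution to the exercise above):
Address(marks[i][j]) = Base + w × (i × number_of_columns + j)
Address(marks[18][4]) = 1000 + 2 × (18 × 5 + 4) = 1000 + 2 × 94 = 1188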
• Initializing Two-dimensional arrays
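A minimal sketch of initializing a two-dimensional array during declaration (the name and values are illustrative):

/* 2 rows, 3 columns; each inner brace pair initializes one row */
int marks[2][3] = { {90, 87, 78},
                    {68, 62, 71} };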


• OPERATIONS ON TWO-DIMENSIONAL ARRAYS


• PASSING TWO-DIMENSIONAL ARRAYS TO FUNCTIONS

Passing a Row
A row of a two-dimensional array can be passed by indexing the array name with the row number.
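A minimal sketch of passing one row (row i of a two-dimensional array is itself a one-dimensional array; the function name print_row is illustrative):

#include <stdio.h>

void print_row(int row[], int ncols)
{
    for (int j = 0; j < ncols; j++)
        printf("%d ", row[j]);
    printf("\n");
}

int main(void)
{
    int marks[2][3] = { {90, 87, 78}, {68, 62, 71} };

    print_row(marks[1], 3);   /* pass row 1 by indexing the array name */
    return 0;
}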
• POINTERS AND THREE-DIMENSIONAL ARRAYS


• SPARSE MATRICES

A sparse matrix is a matrix that has a large number of elements with a zero value. In order to utilize
memory efficiently, specialized algorithms and data structures that take advantage of the sparse
structure should be used. If we apply operations using standard matrix structures and algorithms
to sparse matrices, the execution will slow down and the matrix will consume a large amount of
memory. Sparse data can be easily compressed, which in turn can significantly reduce memory usage;
one such compressed representation is sketched below.
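One common specialized structure (not detailed on the slides) is the triplet, or coordinate, representation, which stores only the non-zero elements as (row, column, value) entries; a minimal sketch:

#include <stdio.h>

/* one non-zero element: its row, column, and value */
struct Triplet { int row, col, value; };

int main(void)
{
    /* a 4 x 4 matrix with only three non-zero elements */
    struct Triplet sparse[] = { {0, 2, 5}, {1, 0, 8}, {3, 3, 6} };
    int nonzero = 3;

    for (int i = 0; i < nonzero; i++)
        printf("(%d, %d) = %d\n",
               sparse[i].row, sparse[i].col, sparse[i].value);
    return 0;
}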
