DATA STRUCTURES-lect1
• Top-down approach A top-down design approach starts by dividing the complex algorithm into one or more
modules.
• These modules can further be decomposed into one or more sub-modules, and this process of decomposition
is iterated until the desired level of module complexity is achieved.
• Top-down design method is a form of stepwise refinement where we begin with the topmost module and
incrementally add modules that it calls.
• Therefore, in a top-down approach, we start from an abstract design and then at each step, this design is
refined into more concrete levels until a level is reached that requires no further refinement.
• The number of machine instructions which a program executes is called its time complexity. This number is primarily dependent on the size of the program’s
input and the algorithm used.
• The space complexity of an algorithm is the amount of computer memory that is required during the program execution as a
function of the input size.
• The space needed by a program depends on the following two parts:
• Fixed part: This part is independent of the size of the input. It includes the space needed for storing instructions, constants, variables, and
structured variables (like arrays and structures).
• Variable part: This part varies from one problem instance to another. It includes the space needed for the recursion stack, and for structured
variables that are allocated space dynamically during the runtime of a program.
• Running time efficiency of algorithms
• Worst-case, Average-case, Best-case, and Amortized Time Complexity
• Worst-case running time This denotes the behavior of an algorithm with respect to the worst possible
case of the input instance.
• The worst-case running time of an algorithm is an upper bound on the running time for any input.
• Therefore, having the knowledge of worst-case running time gives us an assurance that the algorithm will
never go beyond this time limit.
• Average-case running time The average-case running time of an algorithm is an estimate of the running
time for an ‘average’ input.
• It specifies the expected behavior of the algorithm when the input is randomly drawn from a given
distribution.
• In the simplest analysis, all inputs of a given size are assumed to be equally likely.
• Best-case running time The term ‘best-case performance’ is used to analyze an algorithm
under optimal conditions.
• For example, the best case for a simple linear search on an array occurs when the desired
element is the first in the list.
• However, while developing and choosing an algorithm to solve a problem, we hardly
base our decision on the best-case performance. It is always recommended to improve the
average performance and the worst-case performance of an algorithm.
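The linear-search example above can be sketched in C; the function name and the sample array are illustrative, not from the text:

```c
/* Returns the index of key in arr[0..n-1], or -1 if it is absent.
   Best case: key is the first element (one comparison).
   Worst case: key is last or absent (n comparisons). */
int linear_search(const int arr[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (arr[i] == key)
            return i;
    return -1;
}
```

For an array {7, 3, 9, 1, 5}, searching for 7 hits the best case (index 0 after one comparison), while searching for 5 or for a missing value hits the worst case (all five elements examined).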
• Amortized running time Amortized running time refers to the time required to perform a
sequence of (related) operations averaged over all the operations performed.
• Amortized analysis guarantees the average performance of each operation in the worst
case.
Asymptotic Notations
• Asymptotic notations are the mathematical notations used to describe the running
time of an algorithm when the input tends towards a particular value or a limiting
value.
• For example: In bubble sort, when the input array is already sorted, the time taken by
the algorithm is linear i.e. the best case.
• But, when the input array is sorted in reverse order, the algorithm takes the maximum
time (quadratic) to sort the elements, i.e., the worst case.
• When the input array is neither sorted nor in reverse order, then it takes average time.
These durations are denoted using asymptotic notations.
• There are mainly three asymptotic notations:
1. Omega notation (Ω) — a lower bound, usually associated with the best case
2. Theta notation (Θ) — a tight bound, usually associated with the average case
3. Big O notation (O) — an upper bound, usually associated with the worst case
• Omega notation alone rarely helps in choosing an algorithm, because it is
misleading to judge an algorithm only by its most favourable inputs.
• Time–Space Trade-off
• The best algorithm to solve a particular problem at hand is no doubt the one
that requires less memory space and takes less time to complete its execution.
• Designing such an ideal algorithm is not a trivial task. There can be more than
one algorithm to solve a particular problem.
• One may require less memory space, while the other may require less CPU
time to execute. Thus, it is not uncommon to sacrifice one thing for the other.
Hence, there exists a time–space trade-off among algorithms.
• So, if space is a big constraint, then one might choose a program that takes
less space at the cost of more CPU time. On the contrary, if time is a major
constraint, then one might choose a program that takes minimum time to
execute at the cost of more space.
• Expressing Time and Space Complexity
• The time and space complexity can be expressed using a function f(n) where n is the input size for a
given instance of the problem being solved. Expressing the complexity is required when
• We want to predict the rate of growth of complexity as the input size of the problem increases.
• There are multiple algorithms that find a solution to a given problem and we need to find the
algorithm that is most efficient.
• The most widely used notation to express this function f(n) is the Big O notation. It provides the
upper bound for the complexity.
• Algorithm Efficiency
• If a function is linear (without any loops or recursions), the efficiency of that algorithm or the running
time of that algorithm can be given as the number of instructions it contains.
• However, if an algorithm contains loops, then the efficiency of that algorithm may vary depending on
the number of loops and the running time of each loop in the algorithm.
• Let us consider different cases in which loops determine the efficiency of an algorithm .
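The effect of loops on efficiency can be made concrete with a sketch (the step-counting functions below are illustrative):

```c
/* A single loop over n elements does O(n) work. */
long count_single(int n)
{
    long steps = 0;
    for (int i = 0; i < n; i++)
        steps++;               /* executed n times */
    return steps;
}

/* Nesting a second loop multiplies the counts: O(n^2) work. */
long count_nested(int n)
{
    long steps = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            steps++;           /* executed n * n times */
    return steps;
}
```

For n = 100 the single loop executes its body 100 times while the nested pair executes it 10,000 times, which is why nested loops dominate the running time.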
• CONTROL STRUCTURES USED IN ALGORITHMS
• An algorithm has a finite number of steps. Some steps may involve decision-making and repetition. Broadly speaking,
an algorithm may employ one of the following control structures:
• (a) sequence
• (b) decision
• (c) repetition
• Sequence
By sequence, we mean that each step of an algorithm is executed in a specified order. Let us write an algorithm to
add two numbers. This algorithm performs the steps in a purely sequential order.
• Decision
• Decision statements are used when the execution of a process depends on the outcome of some
condition
• For example, if x = y, then print EQUAL.
IF condition Then process
Repetition
• Involves executing one or more steps for a number of times, can be implemented using constructs such as
while, do–while, and for loops. These loops execute one or more steps until some condition is true.
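All three control structures can appear in one small algorithm; the function below is an illustrative sketch, not from the text:

```c
/* Sequence: statements execute in order.
   Decision: the if chooses whether to add the current number.
   Repetition: the for loop repeats the step n times. */
int sum_of_evens(int n)
{
    int sum = 0;                    /* sequence */
    for (int i = 1; i <= n; i++) {  /* repetition */
        if (i % 2 == 0)             /* decision */
            sum += i;
    }
    return sum;
}
```

For example, sum_of_evens(10) adds 2 + 4 + 6 + 8 + 10 and returns 30.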
• BIG O NOTATION
• In practice, we are hardly concerned with the exact running time of an algorithm.
• We are more interested in knowing the generic order of the magnitude of the algorithm. If we have two different
algorithms to solve the same problem where one algorithm executes in 10 iterations and the other in 20 iterations, the
difference between the two algorithms is not much. However, if the first algorithm executes in 10 iterations and the
other in 1000 iterations, then it is a matter of concern.
• We have seen that the number of statements executed in the program for n elements of the data is a function of the
number of elements, expressed as f(n). Even if the expression derived for a function is complex, a dominant factor
in the expression is sufficient to determine the order of the magnitude of the result and, hence, the efficiency of the
algorithm. This factor is the Big O, and is expressed as O(n).
• To summarize:
• Best-case O describes the growth rate of the running time on the most favourable input. It is possibly lower than the
worst case. For example, when sorting an array the best case is when the array is already correctly
sorted.
• Worst-case O describes the growth rate of the running time on the least favourable input. It is possibly greater
than the best case. For example, when sorting an array the worst case is when the array is sorted in
reverse order.
• If we simply write O, it means the same as worst-case O.
ARRAY
• An array is a collection of similar data elements. These data elements have the same data type.
• The elements of the array are stored in consecutive memory locations and are referenced by an index
(also known as the subscript). The subscript is an ordinal number which is used to identify an element
of the array.
• DECLARATION OF ARRAYS
• An array must be declared before being used. Declaring an array means specifying the following:
• Data type—the kind of values it can store, for example, int, char, float, double.
• Name—to identify the array.
• Size—the maximum number of values that the array can hold.
type name[size];
The size of each element depends on its data type and the platform; for example, sizeof(int) is 2 bytes on older 16-bit compilers and commonly 4 bytes on modern systems.
• STORING VALUES IN ARRAYS
• When we declare an array, we are just allocating space for its elements; no values are stored in the array. There are three ways to
store values in an array.
• First, to initialize the array elements during declaration;
• second, to input values for individual elements from the keyboard;
• third, to assign values to individual elements.
• Inputting Values from the Keyboard
An array can be initialized by inputting values from the keyboard. In this method, a while/do–while or a for loop
is executed to input the value for each element of the array.
• To DELETE an element from the middle of the array, we must first find the location from which the
element has to be deleted and then move all the elements (having an index greater than that of the
deleted element) one position towards the left, so that the space vacated by the deleted element can be
occupied by the rest of the elements.
• Figure 3.16 shows the algorithm in which we first initialize I with the position from which the element
has to be deleted.
• In Step 2, a while loop is executed which will move all the elements having an index greater than POS one space
towards left to occupy the space vacated by the deleted element.
• When we say that we are deleting an element, actually we are overwriting the element with the value of its
successive element.
• In Step 5, we decrement the total number of elements in the array by 1
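The deletion steps above can be sketched as a C function (the name delete_at and parameter names are illustrative):

```c
/* Deletes the element at index pos from arr[0..(*n)-1] by shifting
   every later element one position to the left (each element is
   overwritten by its successor), then decrementing the count. */
void delete_at(int arr[], int *n, int pos)
{
    for (int i = pos; i < *n - 1; i++)
        arr[i] = arr[i + 1];   /* overwrite with the successor */
    (*n)--;                    /* Step 5: one fewer element */
}
```

Deleting index 2 from {10, 20, 30, 40, 50} leaves {10, 20, 40, 50} with the count reduced to 4.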
• Merging Two arrays
• Merging two arrays in a third array means first copying the contents of the first array into the third array and then
copying the contents of the second array into the third array.
• Hence, the merged array contains the contents of the first array followed by the contents of the second array.
• If the arrays are unsorted, then merging the arrays is very simple, as one just needs to copy the contents of both
arrays, one after the other, into the third array.
• But merging is not a trivial task when the two arrays are sorted and the merged array also needs to be sorted.
Suppose we have two sorted arrays, say array1 = {20, 30, ...} and array2 = {15, 22, 31, ...}, and the resultant merged array also needs to be a sorted one.
• Here, we first compare the 1st element of array1 with the 1st element of array2, and then put the smaller element
in the merged array.
• Since 20 > 15, we put 15 as the first element in the merged array.
• We then compare the 2nd element of the second array with the 1st element of the first array.
• Since 20 < 22, now 20 is stored as the second element of the merged array.
• Next, the 2nd element of the first array is compared with the 2nd element of the second array. Since 30 > 22,
we store 22 as the third element of the merged array.
• Now, we will compare the 2nd element of the first array with the 3rd element of the second array. Because 30 <
31, we store 30 as the 4th element of the merged array.
• This procedure will be repeated until elements of both the arrays are placed in the right location
in the merged array.
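The step-by-step comparison described above is the standard merge procedure; a sketch in C (the function name and parameters are illustrative):

```c
/* Merges sorted a[0..m-1] and b[0..n-1] into c[0..m+n-1] by always
   copying the smaller of the two current front elements. */
void merge(const int a[], int m, const int b[], int n, int c[])
{
    int i = 0, j = 0, k = 0;
    while (i < m && j < n)
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m) c[k++] = a[i++];   /* copy any leftovers of a */
    while (j < n) c[k++] = b[j++];   /* copy any leftovers of b */
}
```

With a = {20, 30} and b = {15, 22, 31}, the merged result is {15, 20, 22, 30, 31}, matching the trace in the text.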
• PASSING ARRAYS TO FUNCTIONS
• Like variables of other data types, we can also pass an array to a function. In some situations, you may want
to pass individual elements of the array; while in other situations, you may want to pass the entire array.
• Passing Individual elements
The individual elements of an array can be passed to a function by passing either their data values or
addresses
• Passing Data Values
Individual elements can be passed in the same manner as we pass variables of any other data type. The
condition is just that the data type of the array element must match with the type of the function
parameter.
Passing the entire array
• We have discussed that in C the array name refers to the first byte of the array in the memory.
• The address of the remaining elements in the array can be calculated using the array name and the index value of the
element. Therefore, when we need to pass an entire array to a function, we can simply pass the name of the array.
• A function that accepts an array can declare the formal parameter in either of the two following ways.
func(int arr[]); or func(int *arr);
• When we pass the name of an array to a function, the address of the zeroth element of the array is
copied to the local pointer variable in the function.
• When a formal parameter is declared in a function header as an array, it is interpreted as a pointer to a
variable and not as an array.
• With this pointer variable you can access all the elements of the array by using the expression:
array_name + index. You can also pass the size of the array as another parameter to the function.
• So for a function that accepts an array as parameter, the declaration should be as follows.
func(int arr[], int n); or func(int *arr, int n);
• We can also pass a part of the array known as a sub-array. A pointer to a sub-array is also an array pointer.
func(&arr[2], 8);
• POINTERS AND ARRAYS
• The concept of array is very much bound to the concept of pointer.
int arr[] = {1, 2, 3, 4, 5};
• Array notation is a form of pointer notation. The name of the array is the starting address of the array in
memory. It is also known as the base address.
• In other words, base address is the address of the first element in the array or the address of arr[0]. Now let us
use a pointer variable as given in the statement below.
int *ptr;
ptr = &arr[0];
• Here, ptr is made to point to the first element of the array. Execute the code given below and observe the output
which will make the concept clear to you.
#include <stdio.h>

int main(void)
{
    int arr[] = {1, 2, 3, 4, 5};
    printf("\n Address of array = %p %p %p", (void *)arr, (void *)&arr[0], (void *)&arr);
    return 0;
}
• If pointer variable ptr holds the address of the first element in the array, then the address of successive
elements can be calculated by writing ptr++.
int *ptr = &arr[0];
ptr++;
printf("\n The value of the second element of the array is %d", *ptr);
• ARRAYS OF POINTERS
An array of pointers can be declared as
int *ptr[10];
• TWO-DIMENSIONAL ARRAYS
• A two-dimensional array is specified using two subscripts where the first subscript denotes the row and the second
denotes the column. The C compiler treats a two-dimensional array as an array of one-dimensional arrays.
• Declaring Two-dimensional arrays
Any array must be declared before being used. The declaration statement tells the compiler the name of the array,
the data type of each element in the array, and the size of each dimension. A two-dimensional array is declared as:
data_type array_name[row_size][column_size];
Consider a 20 * 5 two-dimensional array marks which has its base address = 1000
and the size of an element = 2. Now compute the address of the element, marks[18]
[4] assuming that the elements are stored in row major order.
• Initializing Two-dimensional arrays
• OPERATIONS ON TWO-DIMENSIONAL ARRAYS
• PASSING TWO-DIMENSIONAL ARRAYS TO FUNCTIONS
Passing a Row
A row of a two-dimensional array can be passed by indexing the array name with the row number.
• POINTERS AND THREE-DIMENSIONAL ARRAYS
• SPARSE MATRICES
A sparse matrix is a matrix that has a large number of elements with a zero value. In order to efficiently
utilize the memory, specialized algorithms and data structures that take advantage of the sparse
structure should be used. If we apply standard matrix structures and algorithms
to sparse matrices, then the execution slows down and the matrix consumes a large amount of
memory. Sparse data can be easily compressed, which in turn can significantly reduce memory usage.
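One common compressed form is the triplet (coordinate) representation, which stores only the nonzero entries as (row, col, value) records; a minimal sketch assuming a fixed column count of 4 (the type and function names are illustrative):

```c
/* One nonzero entry of a sparse matrix. */
typedef struct { int row, col, value; } Triplet;

/* Scans a rows x 4 matrix and records its nonzero entries in t[];
   returns the number of nonzeros found. */
int to_triplets(int rows, const int m[][4], Triplet t[])
{
    int k = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < 4; j++)
            if (m[i][j] != 0) {
                t[k].row = i;
                t[k].col = j;
                t[k].value = m[i][j];
                k++;
            }
    return k;
}
```

A 3 × 4 matrix with only two nonzero elements is thus stored in just two triplets instead of twelve cells.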