Chapter4_Data_Structures_and_Algorithms

Chapter 4 of EE3491 covers data structures and algorithms, focusing on the importance of complex data structures in real-life applications. It discusses various data structures such as arrays, lists, queues, stacks, trees, sets, and maps, as well as dynamic memory management techniques in C and C++. The chapter emphasizes efficient data management through appropriate algorithms and memory allocation methods.

Uploaded by

ngu2762005
Copyright
© All Rights Reserved

EE3491 - Kỹ thuật lập trình (Programming Techniques)

EE3490E – Programming Techniques


Chapter 4: Data Structures & Algorithms
Lecturer: Dr. Hoang Duc Chinh (Hoàng Đức Chính)
Department of Automation
School of Electrical and Electronic Engineering
Email: [email protected]

© HĐC 2024.2
Contents

4.1. Introduction to data structures
4.2. Arrays and dynamic memory management
4.3. List structure
4.4. Algorithms Overview
4.5. Sorting Algorithms
4.6. Searching Algorithms

© HDC 2024.2 Chapter 4: Data Structures & Algorithms 2


4.1 Introduction

Many real-life applications require complex data structures; basic data types alone may not be sufficient to represent the data
Examples:
• Student data: name, birthday, hometown, student identification (ID), etc.
• Transfer function: numerator and denominator polynomials
• State-space model: A, B, C, D matrices
• Process (sensing) data: parameter name, measurement range, value, unit, timestamp, accuracy, threshold, etc.
• Graphic object: size, color, line weight, font, etc.
Data structure representation method: define a new data type using a structure (struct, class, union, etc.)
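As a minimal sketch of this idea, the process (sensing) data from the list above could be grouped into one C structure. The type name `ProcessValue`, its field names and sizes, and the helper `in_range` are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical record describing one measured process value. */
typedef struct {
    char   name[32];   /* parameter name, e.g. "Tank level" */
    char   unit[8];    /* engineering unit, e.g. "m" */
    double min, max;   /* measurement range */
    double value;      /* current measurement */
    time_t timestamp;  /* time of the measurement */
} ProcessValue;

/* Return true if the stored value lies inside the measurement range. */
bool in_range(const ProcessValue *pv) {
    assert(pv != NULL);
    return pv->value >= pv->min && pv->value <= pv->max;
}
```

All fields that describe one measurement travel together, so a function such as `in_range` needs only a single argument.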
Problem: Represent a set of data

Most of the data in an application are related to each other, so a set of data needs to be represented with structure, e.g.:
• Student list: records are kept in alphabetical order
• Generic model of a control system: includes multiple components interacting with each other
• Process (sensing) data: a data set that carries the values of an input at discrete times; input values are related to output values
• Graphic objects: a window includes several graphic objects; a schematic also includes a number of graphic objects
In general, data in the same set have the same type, or at least compatible types


Problem: Manage (a set of) data

Any data set can be represented by properly combining data structures and arrays
Algorithms (functions) process the data in order to manage it efficiently:
• Add a new data record to a list, a table, a set, etc.
• Delete a record from a list, a table, a set, etc.
• Search for a record in a list, a table, a set, etc. based on a criterion
• Sort a list according to a criterion
• …


How to manage data efficiently

• Minimize memory usage: "overhead" information should be insignificant compared with the main data
• Fast and convenient access: the time taken to add, search for, or delete a data record should be short
• Flexibility: the number of data records should not be limited and need not be known in advance; the structure should suit both small- and large-scale problems
• Efficiency of data management depends on:
  • The data structure
  • The algorithms used to add, search, sort, and delete data
• References:
  • Goodrich, M. T., Tamassia, R., Mount, D. M. (2011). Data Structures and Algorithms in C++. United Kingdom: Wiley.
  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. (2001). Introduction to Algorithms. Cambridge, Mass: MIT Press.
  • Sedgewick, R., Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley Professional.
Common data structures
• Array (in the extended sense): a set of data whose elements can be accessed by index
[Figure: array cells indexed 0, 1, 2, …, n-2, n-1]
• List: a data set in which each element links to another, so the set is accessed in sequence
[Figure: linked list starting at "head"]
• Queue: a data set with elements kept in sequence; elements are pushed (enqueued) at one end, the rear, and extracted (dequeued) from the other end, the front, i.e. a first-in first-out (FIFO) list
[Figure: queue with dequeue at the front and enqueue at the rear]
• Ring buffer (circular buffer): similar to a queue, but it uses a single fixed-size buffer as if it were connected end-to-end. If it is full, the new element replaces the oldest one.
[Figure: ring buffer holding 6 7 8 9 A B 5, with R (rear) and F (front) markers]


Common data structures (cont.)
• Stack: a data structure with elements kept in sequence that can be accessed from one end only, i.e. last-in first-out (LIFO), also called first-in last-out (FILO)
[Figure: stack with push and pop at the top]
• Tree: a data set whose elements are linked to each other hierarchically and are accessed starting from the root
  • A tree in which each node has at most two children (two branches) is called a binary tree
[Figure: tree with root A and nodes B to K]
Common data structures (cont.)
• Set: a data structure that stores a collection of distinct elements
[Figure: set containing Alice, Bob, Carolyne, Dave, Emily, Farah]
• Map: a data set of key-value pairs, typically kept sorted by key, in which an element can be accessed quickly by its "key"
• Hash table: a data structure whose elements are placed according to integer codes (hash values) generated by a special function (the hash function)
[Figure: a hash function maps key_1, key_2, key_3 to indices 0 to 3 of a table holding value_1 to value_4]
• In mathematical calculation and control systems: vector, matrix, polynomial, rational fraction, transfer function, etc.
4.2 Array and dynamic memory allocation

Data can be represented efficiently using arrays:
• Data are read and written quickly via index or address
• Memory is saved
Fixed-size array:
char buffer[SIZE];
Student student_list[100];
• The size of the array must be known at compile time; users cannot enter the number of elements, in other words this number cannot be a variable → less flexible
• It occupies a fixed slot on the stack (if it is a local variable) or in the data segment (if it is a global variable) → inefficient and inflexible use of memory


4.2.1 Dynamic array

• A dynamic array is allocated in memory as required, at run time
#include <stdlib.h> /* C */
int n = 50;
...
float* p1 = (float*) malloc(n*sizeof(float)); /* C */
double* p2 = new double[n]; // C++
• A pointer is used to handle a dynamic array; its usage is similar to that of a fixed array
p1[0] = 1.0f;
p2[0] = 2.0;
• Once the array is no longer in use, free the allocated memory
free(p1); /* C */
delete [] p2; // C++


Memory allocation revision

4 memory segments: code segment, data segment, stack, heap
Code segment (sometimes called text segment)
• Stores the program code and, typically, the constants that have been defined
• Read-only
• Handled by the compiler
Data segment
• Used when the program is running
• Contains global and static variables (initialized or uninitialized)
• Readable and writable
• Handled by the compiler
[Figure: memory map showing operating system, other programs, code (text), global variables, free memory space (heap), stack (arguments, local variables), free memory space (heap)]


Memory allocation revision (cont.)

• Stack
  • Temporary memory for local variables (automatic extent)
  • Data are stored in sequence following the FILO rule
  • Data are pushed when a function is called and freed (popped) when the function exits
  • Handled by the compiler
• Heap (free store)
  • Dynamic memory allocation
  • Handled by the developer (not the compiler)
  • Managed by standard functions (malloc(), calloc(), realloc(), free()) in C or by operators (new, delete) in C++
[Figure: memory map showing operating system, other programs, code (text), global variables, free memory space (heap), stack (arguments, local variables), free memory space (heap)]


Risks

The developer takes full responsibility for managing dynamic memory
The compiler gives no support in handling this memory
• May cause real-time issues
Storing data on the stack is much faster than storing it on the heap
• The stack is managed simply by adjusting a pointer, whereas the heap requires internal memory management
• It is better to use the stack where possible


Allocating and freeing dynamic memory

C:
• malloc(): the input argument is the number of bytes; it returns an untyped pointer (void*) containing the address of the allocated memory block (on the heap), or NULL (0) on failure
• free(): the input argument is an untyped pointer (void*); it frees the memory block at the given address
C++:
• Operator new[] allocates heap space for an array with the given element type and number of elements, and returns a typed pointer. On failure, standard C++ throws std::bad_alloc; only the new(std::nothrow) form returns a null pointer
• Operator delete[] deallocates the space; its operand is a pointer
• The non-array forms new and delete allocate and deallocate single variables and objects


Dynamic memory handling

Functions that handle dynamic memory are provided by the standard library; in C they are the standard way to access the heap
Allocating memory
void *malloc(size_t size)
int *p = malloc(10 * sizeof(int)); /* C: no cast needed */
int *p = (int *) malloc(10 * sizeof(int)); /* with explicit cast (required in C++) */


malloc()

If malloc() succeeds, it returns a pointer to the allocated memory
Otherwise it returns NULL
In practice, virtual memory makes the available space very large, so allocation rarely fails, but memory is never truly unlimited
It is a good habit to check the return value of this function
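A minimal sketch of such a check is shown below. The function name `make_sequence` and the choice to report the failure on stderr are assumptions made for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate an array holding 0, 1, ..., n-1.
   Returns NULL if n is 0, if n * sizeof(int) would overflow,
   or if malloc() fails. The caller must free() the result. */
int *make_sequence(size_t n) {
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;
    int *p = malloc(n * sizeof *p);
    if (p == NULL) {                 /* always check before using p */
        fprintf(stderr, "make_sequence: out of memory\n");
        return NULL;
    }
    for (size_t i = 0; i < n; ++i)
        p[i] = (int)i;
    return p;
}
```

The caller repeats the same pattern: test the returned pointer against NULL before indexing it, and call free() when the array is no longer needed.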



free()

free() releases the memory block allocated by malloc()
The memory is freed and becomes available for subsequent use
void free(void *p)
There is no need to cast a pointer of type xxx * to void *
int *p = (int *) malloc(10 * sizeof(int));
...
free(p);


calloc()

calloc() is similar to malloc(), but the memory is initialized to zero ("clear allocate")
Its interface differs slightly from that of malloc()
void *calloc(size_t n, size_t size)
int *p = calloc(10, sizeof(int));


realloc()

realloc() is used to resize memory previously allocated by malloc(), calloc(), or realloc()
void *realloc(void *p, size_t size)
Special cases:
• If p is NULL, it behaves like malloc()
• If it fails, it returns NULL and the original memory block is left unchanged
• If size is 0, many implementations behave like free() and return NULL, but the C standard does not guarantee this, so it is safer to call free() directly


More on realloc()

realloc() is used to grow or shrink a dynamic memory block
If the block grows, the existing elements are unchanged and the newly added elements are uninitialized
If the block shrinks, the remaining elements are unchanged
However, if the block cannot be extended in place, realloc() allocates a new memory block, copies the whole old block into it, and then frees the old one
• This invalidates every pointer that still points to the old memory block
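Because the block may move and the call may fail, a common pattern is to store the result of realloc() in a temporary pointer first. The helper name `resize_ints` below is hypothetical.

```c
#include <stdlib.h>

/* Resize *arr so that it can hold new_count ints (new_count > 0).
   On success, update *arr and return 0.
   On failure, leave *arr valid and unchanged and return -1. */
int resize_ints(int **arr, size_t new_count) {
    if (new_count == 0)
        return -1;                   /* avoid realloc(p, 0) */
    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;                   /* the old block is still allocated */
    *arr = tmp;                      /* the old address must not be used again */
    return 0;
}
```

Writing `p = realloc(p, ...)` directly would lose the only pointer to the old block when the call fails, which is a memory leak.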



Dynamic memory management

Handled by the developer
It includes keeping track of:
• The pointer to the memory block
• Allocating and freeing the memory block
• The size of the memory block
Errors of this kind can occur in most C applications.


Example of wrong usage
char *string_duplicate(char *s)
/* Dynamically allocate a copy of a
   string. User must remember to free this
   memory. */
{
    /* +1 for '\0' */
    char *p = malloc(strlen(s)+1);
    return strcpy(p, s); /* wrong: p is not checked for NULL */
}

char *s1;
s1 = string_duplicate("this is a string");
...
free(s1);
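One weakness of the function above is that the result of malloc() goes straight into strcpy(), so a failed allocation leads to a write through a NULL pointer. A checked variant might look like the sketch below; the name `string_duplicate_checked` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Return a dynamically allocated copy of s, or NULL on failure.
   The caller must free() the result. */
char *string_duplicate_checked(const char *s) {
    if (s == NULL)
        return NULL;
    size_t len = strlen(s) + 1;      /* +1 for '\0' */
    char *p = malloc(len);
    if (p == NULL)
        return NULL;                 /* never copy into a NULL pointer */
    memcpy(p, s, len);               /* copies the terminator as well */
    return p;
}
```

The caller should test the returned pointer before using it and still has to call free() afterwards.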



Common errors

The pointer holds an undefined (uninitialized) value
• "memory corruption"
The pointer is NULL when it is dereferenced
• The program halts
Freeing a pointer to memory that was not dynamically allocated, such as stack memory or constant data
Not freeing memory after use (memory leak)
Accessing elements outside the range of the allocated array


Good habits

Each malloc() should be matched by a free()
• This avoids memory corruption and memory leaks
• Where possible, call malloc() and the matching free() within the same function
• For complex objects, always write a create() function together with a matching destroy() function
Pointers must be initialized when they are declared
• Initialize to NULL or to a valid address
• NULL means "points to nowhere"
A pointer should be set to NULL after being freed
• free(NULL) has no effect, so an accidental second free() becomes harmless
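These habits can be combined in a create()/destroy() pair, sketched below. The type `SampleBuffer` and the function names are assumptions made for this illustration.

```c
#include <stdlib.h>

/* Hypothetical object that owns a dynamically allocated array. */
typedef struct {
    double *samples;
    size_t  count;
} SampleBuffer;

/* Create a buffer with count zero-initialized samples (count > 0).
   Returns NULL on failure; nothing is leaked in that case. */
SampleBuffer *buffer_create(size_t count) {
    if (count == 0)
        return NULL;
    SampleBuffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->samples = calloc(count, sizeof *b->samples);
    if (b->samples == NULL) {
        free(b);                     /* undo the first allocation */
        return NULL;
    }
    b->count = count;
    return b;
}

/* Free the buffer and set the caller's pointer to NULL. */
void buffer_destroy(SampleBuffer **pb) {
    if (pb == NULL || *pb == NULL)
        return;
    free((*pb)->samples);
    free(*pb);
    *pb = NULL;                      /* a second call is now harmless */
}
```

Passing the address of the pointer to `buffer_destroy` lets the function reset it, so the caller cannot accidentally reuse a freed object.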



4.2.2 Matrix

For a matrix with fixed size (e.g. 3x3), the size can be defined at compile time
The following example shows a matrix created with a size requested by the user at run time
double **matrix = create_matrix(2,3);
matrix[0][2] = 5.4;
...
destroy_matrix(matrix);


Example: Matrix (cont.)
double **create_matrix1(int m, int n) {
    double **p;
    int i;
    /* Allocate pointer array. */
    p = (double **) malloc(m * sizeof(double*));
    /* Allocate rows. Note: this first version does not check
       whether malloc() succeeded. */
    for (i = 0; i < m; ++i)
        p[i] = (double *) malloc(n * sizeof(double));
    return p;
}


Example: Matrix (cont.)
double **create_matrix2(int m, int n) {
    double **p; int i;
    assert(m>0 && n>0);
    p = (double **) malloc(m * sizeof(double*));
    if (p == NULL)
        return p;
    for (i = 0; i < m; ++i) { /* Allocate rows. */
        p[i] = (double *) malloc(n * sizeof(double));
        if (p[i] == NULL)
            goto failed; /* Allocation failed */
    }
    return p;
failed:
    for (--i; i >= 0; --i)
        free(p[i]); /* delete allocated memory.*/
    free(p);
    return NULL;
}


Example: Matrix (cont.)
/* Destroy an (m x n) matrix. Notice, the n
   variable is not used, it is just there to
   assist using the function. */
void destroy_matrix1(double **p, int m, int n) {
    int i;
    assert(m>0 && n>0);
    for (i = 0; i < m; ++i)
        free(p[i]);
    free(p);
}


Example: Matrix (cont.)
double **create_matrix(int m, int n) {
    double **p, *q;
    int i;
    assert(m>0 && n>0);
    p = (double **) malloc(m * sizeof(double*));
    if (p == NULL)
        return p;
    /* Allocate all rows as one contiguous block. */
    q = (double *) malloc(m * n * sizeof(double));
    if (q == NULL) {
        free(p);
        return NULL;
    }
    for (i = 0; i < m; ++i, q += n)
        p[i] = q;
    return p;
}


Example: Matrix (cont.)
void destroy_matrix(double **p)
/* Destroy a matrix. Notice, due to the
   method by which this matrix was created,
   the size of the matrix is not required.*/
{
    free(p[0]);
    free(p);
}


4.2.3 Mathematical vector structure

• Problem: how to represent a mathematical vector in C/C++
• Simple solution: an ordinary dynamic array, but …
  • Inconvenient to use: the user must call the allocation and deallocation functions personally, and the number of elements must be passed along as well
  • Unsafe: a small mistake may have serious consequences
int n = 10;
double *v1, *v2, d;
v1 = (double*) malloc(n*sizeof(double));
v2 = (double*) malloc(n*sizeof(double));
d = scalarProd(v1,v2,n); // assumes scalarProd() already exists
d = v1 * v2; // OOPS! pointers cannot be multiplied
v1[10] = 0; // OOPS! index out of range
free(v1);
free(v2);


Vector structure definition

File name: vector.h
Data structure:
struct Vector {
    double *data;
    int nelem;
};
Declaration of its basic functions:
Vector createVector(int n, double init);
void destroyVector(Vector);
double getElem(Vector, int i);
void putElem(Vector, int i, double d);
Vector addVector(Vector, Vector);
Vector subVector(Vector, Vector);
double scalarProd(Vector, Vector);
...


Define basic functions

• File name: vector.cpp
#include <stdlib.h>
#include "vector.h"
Vector createVector(int n, double init) {
    Vector v;
    v.nelem = n;
    v.data = (double*) malloc(n*sizeof(double));
    if (v.data == NULL) { /* allocation failed: return an empty vector */
        v.nelem = 0;
        return v;
    }
    while (n--) v.data[n] = init;
    return v;
}
void destroyVector(Vector v) {
    free(v.data);
}
double getElem(Vector v, int i) {
    if (i < v.nelem && i >= 0) return v.data[i];
    return 0;
}


Define basic functions (cont.)

void putElem(Vector v, int i, double d) {
    if (i >= 0 && i < v.nelem) v.data[i] = d;
}

Vector addVector(Vector a, Vector b) {
    Vector c = {0,0};
    if (a.nelem == b.nelem) {
        c = createVector(a.nelem,0.0);
        for (int i=0; i < a.nelem; ++i)
            c.data[i] = a.data[i] + b.data[i];
    }
    return c;
}

Vector subVector(Vector a, Vector b) {
    Vector c = {0,0};
    ...
    return c;
}
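The header declares scalarProd() but the slides do not show its body. One possible implementation is sketched below in plain C, which is why the structure is repeated here with a typedef; treating a length mismatch as a programming error caught by assert() is an assumption of this sketch.

```c
#include <assert.h>

typedef struct Vector {
    double *data;
    int nelem;
} Vector;

/* Scalar (dot) product of a and b.
   Both vectors must have the same number of elements. */
double scalarProd(Vector a, Vector b) {
    assert(a.nelem == b.nelem);
    double sum = 0.0;
    for (int i = 0; i < a.nelem; ++i)
        sum += a.data[i] * b.data[i];
    return sum;
}
```

Because the number of elements is stored inside the structure, the function no longer needs a separate length argument.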



Usage
#include "vector.h"
int main() {
    int n = 10;
    Vector a, b, c;
    a = createVector(n,1.0);
    b = createVector(n,2.0);
    c = addVector(a,b);
    //...
    destroyVector(a);
    destroyVector(b);
    destroyVector(c);
    return 0;
}


4.2.4 Extendable array

An extendable array overcomes the disadvantage of the fixed-size array in C
The size of the array follows demand. Elements are added to the end of the array and the indexes are updated automatically
Disadvantages:
• The array always occupies memory
• Adding a new element may require copying all the old ones to new positions
Thus, memory should be allocated one block at a time instead of one slot per element


Extendable array

Method
Each time the array is full, a larger memory block is allocated:
newsize = K * oldsize;
• Usually, we choose K = 2
The following example demonstrates the use of realloc()
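To see why the size is multiplied rather than increased by one, the sketch below counts how many element copies are needed to append n items, assuming the worst case in which every reallocation moves the whole array. The function names are hypothetical.

```c
#include <stddef.h>

/* Copies made when n items are appended one at a time and the capacity
   (initially 1) is multiplied by k whenever the array is full. */
size_t copies_geometric(size_t n, size_t k) {
    size_t capacity = 1, size = 0, copies = 0;
    for (size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size;          /* every existing element is moved */
            capacity *= k;
        }
        ++size;
    }
    return copies;
}

/* Copies made when the capacity grows by exactly one slot each time. */
size_t copies_by_one(size_t n) {
    size_t copies = 0;
    for (size_t size = 1; size < n; ++size)
        copies += size;              /* appending item size+1 moves size items */
    return copies;
}
```

For n = 1000 and K = 2 the first function returns 1023, while growing one slot at a time costs 499500 copies, so geometric growth keeps the average cost per insertion constant.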



Example: Extendable array (cont.)
#include "vector.h"
#include <stdlib.h>
#include <assert.h>
/* Private interface */
/* initial vector capacity */
static const int StartSize = 1;
/* geometric growth of vector capacity */
static const float GrowthRate = 1.5;
/* pointer to vector elements */
static int *data = NULL;
/* current size of vector */
static int vectorsize = 0;
/* current reserved memory for vector */
static int capacity = 0;



Example: Extendable array

/* Vector access operations. */
int push_back(int item);
int pop_back(void);
int* get_element(int index);
/* Manual resizing operations. */
int get_size(void);
int set_size(int size);
int get_capacity(void);
int set_capacity(int size);


Example: Extendable array (cont.)
/* Add element to back of vector. Return index of new
   element if successful, and -1 if fails. */
int push_back(int item) {
    /* If out-of-space, allocate more. */
    if (vectorsize == capacity) {
        int newsize = (capacity == 0) ? StartSize :
            (int)(capacity*GrowthRate + 1.0);
        int *p = (int *)realloc(data, newsize*sizeof(int));
        if (p == NULL)
            return -1;
        capacity = newsize; /* update data-structure */
        data = p;
    }
    data[vectorsize] = item; /* We have enough room. */
    return vectorsize++;
}


Example: Extendable array (cont.)

/* Return element from back of vector, and
   remove it from the vector. */
int pop_back(void) {
    assert(vectorsize > 0);
    return data[--vectorsize];
}

/* Return pointer to the element at the
   specified index. */
int* get_element(int index) {
    assert(index >= 0 && index < vectorsize);
    return data + index;
}


Example: Extendable array (cont.)
/* Manual size operations. */
int get_size(void) {return vectorsize;}
int get_capacity(void) {return capacity;}

/* Set vector size.
   Return 0 if successful, -1 if fails. */
int set_size(int size) {
    if (size > capacity) {
        int *p = (int *) realloc(data, size*sizeof(int));
        if (p == NULL)
            return -1;
        /* allocate succeeds, update data-structure */
        capacity = size;
        data = p;
    }
    vectorsize = size;
    return 0;
}


Example: Extendable array (cont.)
/* Shrink or grow allocated memory reserve for array.
   A size of 0 deletes the array. Return 0 if
   successful, -1 if fails. */
int set_capacity(int size) {
    if (size != capacity) {
        if (size == 0) {
            /* realloc(data, 0) is not portable: free explicitly */
            free(data);
            data = NULL;
        } else {
            int *p = (int *) realloc(data, size*sizeof(int));
            if (p == NULL)
                return -1;
            data = p;
        }
        capacity = size;
    }
    if (size < vectorsize)
        vectorsize = size;
    return 0;
}


Vector in C++ STL

• In C++, vector is a dynamic array that resizes itself automatically when elements are inserted or deleted
• It is part of the C++ Standard Template Library and is defined in the <vector> header file as
vector<T> vec_name;
where:
  • T: type of the elements in the vector
  • vec_name: name assigned to the vector
  • E.g.: vector<int> vector1 = {1, 2, 3, 4, 5};
• C++ vector functions
  • size() returns the number of elements in the vector
  • clear() removes all the elements of the vector
  • front() returns the first element of the vector
  • back() returns the last element of the vector
  • empty() returns true if the vector is empty
  • capacity() returns the number of elements the vector can hold before it must reallocate
  • …
Vector in C++ STL - Example
// vector::begin/end
#include <iostream>
#include <vector>

int main () {
    std::vector<int> myvector;
    for (int i=1; i<=5; i++) myvector.push_back(i);

    std::cout << "myvector contains:";
    for (std::vector<int>::iterator it = myvector.begin();
         it != myvector.end();
         ++it)
        std::cout << ' ' << *it;
    std::cout << '\n';

    return 0;
}


4.2.5 Further discussion
• A pointer is used to handle a dynamic array, but the pointer itself is not the dynamic array
• Memory space is allocated and deallocated, not the pointer
• Free a memory block only once
int* p;
p[0] = 1; // never do this: p is uninitialized
new(p); // wrong: this does not allocate an array for p
p = new int[100]; // OK
p[0] = 1; // OK
int* p2 = p; // OK
delete[] p2; // OK
p[0] = 1; // access violation! the array has already been freed
delete[] p; // very bad! the array is freed a second time
p = new int[50]; // OK, new array
...


Allocate dynamic memory for a single variable

• Purpose: objects can be created dynamically while the program is running, such as adding a new student to a list, drawing a new shape in a schematic, inserting a component into a system, etc.
• Syntax:
int* p = new int;
*p = 1;
p[0] = 2; // same element as *p
p[1] = 1; // access violation! only one int was allocated
int* p2 = new int(1); // with initialization
delete p;
delete p2;
Student* ps = new Student;
ps->code = 1000;
...
delete ps;
• A single variable is not an array with one element: memory from new is released with delete, memory from new[] with delete[]
Example: using dynamic memory
Date* createDateList(int n) {
    Date* p = new Date[n];
    return p;
}
int main() {
    int n;
    cout << "Enter the number of your national holidays:";
    cin >> n;
    Date* date_list = createDateList(n);
    for (int i=0; i < n; ++i) { ... }
    for (....) { cout << ....}
    delete [] date_list;
    return 0;
}


Output argument is a pointer
void createDateList(int n, Date* &p) {
    p = new Date[n];
}

int main() {
    int n;
    cout << "Enter the number of your national holidays:";
    cin >> n;
    Date* date_list;
    createDateList(n, date_list);
    for (int i=0; i < n; ++i) {
        ...
    }
    for (....) { cout << ....}
    delete [] date_list;
    return 0;
}


Summary on dynamic memory

Efficiency
• Memory is allocated only as much as required and only when required, while the program is running
• Memory is allocated from the free space of the computer (heap), so it is limited only by the computer's memory
• Allocated memory can be freed once it is no longer used
Flexibility
• The lifetime of dynamically allocated memory may be longer than the lifetime of the object that allocated it
• One function can allocate the memory and another function can deallocate it
• This flexibility may lead to memory leaks


4.3 List structure

Problem: create a structure to manage dynamic data efficiently and flexibly, e.g.:
• Email
• To-do list
• Graphical objects in a figure
• Dynamic blocks in a simulation model (similar to SIMULINK)
Requirements:
• The number of records/elements in the list changes frequently
• Adding and deleting data should be fast and simple
• Memory usage should be minimized


Using array?

The number of elements in an array is actually fixed. The memory size must be known when the array is allocated; the array cannot be extended or shrunk in place
If the memory used is less than the memory allocated → memory is wasted
If the array is full and more elements need to be added, memory must be reallocated and all existing data copied to the new array → time consuming if the array is large
If an element has to be added or deleted at the first position or in the middle of the array, the rest of the data must be copied and shifted → time consuming
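The shifting cost can be seen in a small sketch of insertion into an array. The function name `insert_at` is hypothetical, and the array is assumed to have room for one more element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert value at index pos of arr, which currently holds *count
   elements and has capacity for at least *count + 1.
   All elements from pos onwards are shifted one slot to the right. */
void insert_at(int *arr, size_t *count, size_t pos, int value) {
    assert(pos <= *count);
    memmove(&arr[pos + 1], &arr[pos], (*count - pos) * sizeof arr[0]);
    arr[pos] = value;
    ++*count;
}
```

Inserting at position 0 moves every element, so the cost grows linearly with the array length; this is the case a linked list handles better.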



4.3.1 Linked list

[Figure: singly linked list. pHead points to the first item (Data A); each item stores its data and a pointer to the next item (Data B, Data C, …, Data X); the pointer of the last item (Data Z) is 0x00]


Linked list: Insert data

[Figure: inserting a new item (Data T). At the beginning of the list: pHead points to Data T, which points to Data A. In the middle of the list: Data B points to Data T, which points to Data C.]


Linked list: Delete data

[Figure: deleting an item from the list Data A, Data B, Data C, Data X, Data Z, at the beginning of the list and in the middle of the list]


Summary

Advantages:
• Flexible usage: memory is allocated when needed and deallocated after use
• Elements are added and deleted via pointers; once the position is known, the time taken is constant and does not depend on the list length or the position
• Data are accessed in sequence
Disadvantages:
• Memory must be allocated dynamically for each added element
• Deleting an element requires its memory to be freed
• If the element data type is small, the overhead (the pointers) may dominate
• Searching is linear, which takes more time
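The trade-off can be sketched with a small singly linked list in C. The type `Node` and the function names are assumptions made for this illustration.

```c
#include <stdlib.h>

typedef struct Node {
    int          data;
    struct Node *next;
} Node;

/* Insert at the front in constant time. Returns the new head,
   or NULL if the allocation fails (the old list is unchanged). */
Node *push_front(Node *head, int data) {
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = data;
    n->next = head;
    return n;
}

/* Remove the node that follows prev in constant time. */
void remove_after(Node *prev) {
    Node *victim = prev->next;
    if (victim == NULL)
        return;
    prev->next = victim->next;
    free(victim);
}

/* Linear search: the list must be walked from the head. */
Node *find(Node *head, int data) {
    while (head != NULL && head->data != data)
        head = head->next;
    return head;
}

/* Free every node of the list. */
void free_list(Node *head) {
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Insertion and removal touch only one or two pointers, whereas `find` has to visit the nodes one by one.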



Example: mail box

#include <string>
using namespace std;
struct MessageItem {
    string subject;
    string content;
    MessageItem* pNext;
};
struct MessageList {
    MessageItem* pHead;
};
void initMessageList(MessageList& l);
void addMessage(MessageList&, const string& sj,
                const string& ct);
bool removeMessageBySubject(MessageList& l,
                            const string& sj);
void removeAllMessages(MessageList&);


Example: mail box (cont.)
#include "List.h"
void initMessageList(MessageList& l) {
    l.pHead = 0;
}
void addMessage(MessageList& l, const string& sj,
                const string& ct) {
    MessageItem* pItem = new MessageItem;
    pItem->content = ct;
    pItem->subject = sj;
    pItem->pNext = l.pHead;
    l.pHead = pItem;
}
void removeAllMessages(MessageList& l) {
    MessageItem *pItem = l.pHead;
    while (pItem != 0) {
        MessageItem* pItemNext = pItem->pNext;
        delete pItem;
        pItem = pItemNext;
    }
    l.pHead = 0;
}


Example: mail box (cont.)
bool removeMessageBySubject(MessageList& l,
                            const string& sj) {
    MessageItem* pItem = l.pHead;
    MessageItem* pItemBefore = 0;
    while (pItem != 0 && pItem->subject != sj) {
        pItemBefore = pItem;
        pItem = pItem->pNext;
    }
    if (pItem == 0)
        return false; // subject not found
    if (pItem == l.pHead)
        l.pHead = pItem->pNext; // keep the rest of the list
    else
        pItemBefore->pNext = pItem->pNext;
    delete pItem;
    return true;
}


Example: mail box usage (cont.)
#include <iostream>
#include "List.h"
using namespace std;
int main() {
    MessageList myMailBox;
    initMessageList(myMailBox);
    addMessage(myMailBox,"Hi","Welcome, my friend!");
    addMessage(myMailBox,"Test","Test my mailbox");
    addMessage(myMailBox,"Lecture Notes","Programming Techniques");
    removeMessageBySubject(myMailBox,"Test");
    MessageItem* pItem = myMailBox.pHead;
    while (pItem != 0) {
        cout << pItem->subject << ":" << pItem->content << '\n';
        pItem = pItem->pNext;
    }
    char c;
    cin >> c;
    removeAllMessages(myMailBox);
    return 0;
}


Homework

Create a linked list of the public holidays of a year, with a description of each day (as a string), so that:
• A new public holiday can be added to the beginning of the list
• A new public holiday can be inserted at a specific position in the list, e.g. after the closest day in the list
• The description of a day can be searched for (the input argument is a date including day and month)
• A public holiday at the beginning of the list can be deleted
• A public holiday in the middle of the list can be deleted (the input argument is a date including day and month)
• The whole list can be cleared
Write a program to demonstrate the usage of the above list


4.3.2 More on lists

Doubly Linked List
Stack
Circular Buffer
Hash table


Doubly Linked List

• A Doubly Linked List (DLL) contains an extra pointer, typically called the previous pointer, in addition to the next pointer and the data found in a singly linked list.
struct List {
    int item;
    struct List *next;
    struct List *prev;
};

• Advantages:
  • A DLL can be traversed in both the forward and backward directions
  • The delete operation is more efficient when a pointer to the node to be deleted is given
  • A new node can be inserted quickly before a given node
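The more efficient delete can be sketched as follows: with a prev pointer, a node is unlinked without walking the list from the head. The function name `delete_node` and its convention of returning the new head are assumptions of this example.

```c
#include <stdlib.h>

struct List {
    int item;
    struct List *next;
    struct List *prev;
};

/* Unlink and free node in constant time.
   Returns the head of the list after the deletion. */
struct List *delete_node(struct List *head, struct List *node) {
    if (node == NULL)
        return head;
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        head = node->next;           /* node was the first element */
    if (node->next != NULL)
        node->next->prev = node->prev;
    free(node);
    return head;
}
```

In a singly linked list the same operation first needs a linear search for the predecessor of the node.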



Doubly Linked List example

struct List *insert_after(struct List *node, int item)
{
    /* Allocate memory for new node. */
    struct List *newnode = (struct List *)
        malloc(sizeof(struct List));
    if (newnode == NULL)
        return NULL; /* allocation failed */
    /* If list is not empty, splice new node into list. */
    if (node) {
        newnode->next = node->next;
        newnode->prev = node;
        if (node->next)
            node->next->prev = newnode;
        node->next = newnode;
    }
    else {
        newnode->next = NULL;
        newnode->prev = NULL;
    }
    newnode->item = item;
    return newnode;
}


typedef and Structure

• typedef is similar to #define, but it is a C keyword handled by the compiler rather than the preprocessor
• It creates a new name (alias) for an existing data type
typedef int Length;
Length len, maxlen;
Length lengths[50];
• typedef simplifies the use of structures in C (it is not required in C++)
typedef struct Point {
    int x;
    int y;
} Point;

Point pt1, pt2;


typedef and Linked List

Example:
typedef struct list_t List;
struct list_t {
    int item;
    List *next;
};
Main reasons to use typedef
• To simplify complex type names
  • Example with a function pointer:
typedef int (*PFI)(char *, char *);
PFI pfarray[10];
• Self-defined type names make the program easier to read (e.g. Length is more intelligible than int).


Usage of typedef

typedef enable to hide the incompatible codes


amongst different microprocessors
 E.g.: in a 32 bit computer, we can write:
typedef short INT16;
typedef int INT32;
 In a 16 bit computer, they can be written as:
typedef int INT16;
typedef long INT32;
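Since C99, the standard header <stdint.h> applies the same idea: the compiler vendor provides typedefs with a guaranteed width. A small sketch, assuming a C99 compiler that offers the exact-width types; the helper functions are hypothetical and only report the widths.

```c
#include <limits.h>
#include <stdint.h>

/* Project-specific aliases built on the standard fixed-width types. */
typedef int16_t INT16;
typedef int32_t INT32;

/* Number of bits occupied by an INT16 value. */
int int16_bits(void)
{
    return (int)(sizeof(INT16) * CHAR_BIT);
}

/* Number of bits occupied by an INT32 value. */
int int32_bits(void)
{
    return (int)(sizeof(INT32) * CHAR_BIT);
}
```

With this approach the typedef lines no longer need to be rewritten by hand for each platform.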



Usage of typedef

typedef makes it possible to write code that works with different data types
typedef int ValueType;
typedef struct List {
ValueType item;
struct List *next;
} List;

List *insert_back(List *node, ValueType item);
List *insert_after(List *node, ValueType item);
Changing the single typedef of ValueType adapts the whole list to another element type. This is the simplest form of generic programming.



Stack

Basic operations
 Initializing the stack, using it, and then de-initializing it
Two primary operations:
 push(): storing an element on top of the stack
 pop(): removing the element on top of the stack and returning it

Elements are pushed and popped at the same end (the top), so the last element pushed is the first one popped (LIFO).



Stack
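#define MAXSIZE 100 /* assumed capacity; the value is not given on the slide */

typedef struct Stack {
    double buffer[MAXSIZE]; /* Stack buffer. */
    int count; /* Number of elements in stack. */
} Stack;

/* Precondition: s->count < MAXSIZE (stack is not full). */
void push(Stack *s, double item) {
    s->buffer[s->count++] = item;
}

/* Precondition: s->count > 0 (stack is not empty). */
double pop(Stack *s) {
    return s->buffer[--s->count];
}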
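The functions above do not check whether the stack is full or empty. A possible checked variant is sketched below; stack_init, try_push and try_pop are hypothetical names, and MAXSIZE = 100 is an assumed capacity.

```c
#define MAXSIZE 100 /* assumed capacity */

typedef struct Stack {
    double buffer[MAXSIZE]; /* Stack buffer. */
    int count;              /* Number of elements in stack. */
} Stack;

/* Make the stack empty. */
void stack_init(Stack *s)
{
    s->count = 0;
}

/* Returns 1 on success, 0 if the stack is full. */
int try_push(Stack *s, double item)
{
    if (s->count >= MAXSIZE)
        return 0;
    s->buffer[s->count++] = item;
    return 1;
}

/* Returns 1 and stores the top element in *item,
   or returns 0 if the stack is empty. */
int try_pop(Stack *s, double *item)
{
    if (s->count <= 0)
        return 0;
    *item = s->buffer[--s->count];
    return 1;
}
```

Returning a status code keeps an overflow or underflow from silently corrupting memory next to the buffer.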



Circular Buffer

An array of pre-defined size, also called a “Ring Buffer”
The read/write position returns to the beginning of the array once it passes the last element
It works as a queue (FIFO); in the overwriting variant, the oldest element is replaced by the new one when the buffer is full
It is commonly used in real-time control systems, where several processes exchange data through the buffer



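Circular Buffer
Figure: buffer contents as items are added at the rear (R) and removed from the front (F):
[1] -> [1 2] -> [1 2 3] -> [2 3] -> [3] -> [3 4] -> [3 4 5 …]
After R wraps around to the start of the array, new items overwrite the oldest ones and F advances:
[6 7 8 9 3 4 5] -> [6 7 8 9 A 4 5] -> [6 7 8 9 A B 5]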


Circular Buffer – Implementation using fixed array
#include <stdio.h>

enum { N = 20 };
// N elements of the circular buffer
int buffer [N];
// Note that N-1 is the actual
// capacity, see put function
int rear = 0; int front = 0;

int put (int item) {
    if ((rear + 1) % N == front) {
        // buffer is full, avoid overflow
        return 0;
    }
    buffer[rear] = item;
    rear = (rear + 1) % N;
    return 1;
}

int get (int * value) {
    if (front == rear) {
        // buffer is empty
        return 0;
    }
    *value = buffer[front];
    front = (front + 1) % N;
    return 1;
}

int main () {
    // test circular buffer
    int value = 1001;
    while (put (value ++));
    while (get (& value))
        printf ("read %d\n", value);
    return 0;
}



Circular Buffer – Implementation using dynamic array

typedef struct CircBuf_t {


ValueType *array; /* Pointer to array of items */
int size; /* Maximum number of items in buffer */
int nelems; /* Current number of items in buffer */
int front; /* Index to front of buffer */
int back; /* Index to back of buffer */
} CircBuf;



Example

typedef double ValueType;


typedef struct CircBuf_t CircBuf;
/* create-destroy buffer */
CircBuf *create_buffer(int size);
void destroy_buffer(CircBuf *cb);
/* add-remove elements */
int add_item(CircBuf *cb, const ValueType *item);
int get_item(CircBuf *cb, ValueType *item);
/* query state */
int get_nitems(const CircBuf *cb);
int get_size(const CircBuf *cb);



Example
int add_item(CircBuf *cb, const ValueType *item) {
    /* Add a new element to front of buffer.
       Returns 0 for success, and -1 if buffer is full. */
    if (cb->nelems == cb->size)
        return -1;
    cb->array[cb->front] = *item;
    if (++cb->front == cb->size) /* wrap around */
        cb->front = 0;
    ++cb->nelems;
    return 0;
}

int get_item(CircBuf *cb, ValueType *item) {
    /* Remove element from back of buffer, and assign it to *item.
       Returns 0 for success, and -1 if buffer is empty. */
    if (cb->nelems == 0)
        return -1;
    --cb->nelems;
    *item = cb->array[cb->back];
    if (++cb->back == cb->size) /* wrap around */
        cb->back = 0;
    return 0;
}
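The slides declare create_buffer and destroy_buffer but do not show their bodies. One possible implementation is sketched below under the assumption that both the structure and the array are allocated with malloc. add_item and get_item are repeated from the slide so that the block compiles on its own.

```c
#include <stdlib.h>

typedef double ValueType;

typedef struct CircBuf_t {
    ValueType *array; /* Pointer to array of items */
    int size;         /* Maximum number of items in buffer */
    int nelems;       /* Current number of items in buffer */
    int front;        /* Index to front of buffer */
    int back;         /* Index to back of buffer */
} CircBuf;

/* Allocate an empty buffer for `size` items; NULL on failure. */
CircBuf *create_buffer(int size)
{
    CircBuf *cb;
    if (size <= 0)
        return NULL;
    cb = malloc(sizeof *cb);
    if (cb == NULL)
        return NULL;
    cb->array = malloc((size_t)size * sizeof *cb->array);
    if (cb->array == NULL) {
        free(cb); /* do not leak the structure */
        return NULL;
    }
    cb->size = size;
    cb->nelems = 0;
    cb->front = 0;
    cb->back = 0;
    return cb;
}

/* Release the array first, then the structure itself. */
void destroy_buffer(CircBuf *cb)
{
    if (cb != NULL) {
        free(cb->array);
        free(cb);
    }
}

int add_item(CircBuf *cb, const ValueType *item)
{
    if (cb->nelems == cb->size)
        return -1;
    cb->array[cb->front] = *item;
    if (++cb->front == cb->size) /* wrap around */
        cb->front = 0;
    ++cb->nelems;
    return 0;
}

int get_item(CircBuf *cb, ValueType *item)
{
    if (cb->nelems == 0)
        return -1;
    --cb->nelems;
    *item = cb->array[cb->back];
    if (++cb->back == cb->size) /* wrap around */
        cb->back = 0;
    return 0;
}

int get_nitems(const CircBuf *cb) { return cb->nelems; }
int get_size(const CircBuf *cb) { return cb->size; }
```

Because nelems is stored separately, this version can use all size slots, unlike the fixed-array version, which keeps one slot free to tell a full buffer from an empty one.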



Hash Table

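Data is stored in an array, and a hash function converts each key into the array index where its value is stored
This gives fast lookup: the index of the desired data is computed directly from its key instead of being searched for
Different keys may hash to the same index (a collision), so the most common implementation is an array combined with linked lists (chaining)
Basic operations are Search, Insert, Delete

key_1, key_2, key_3 -> Hash Function -> index into the table
Index Value
0 value_1
1 value_2
2 value_3
3 value_4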



Example
typedef struct Dictionary_t Dictionary;
Dictionary *create_table(void);
void destroy_table(Dictionary *);
int add_word(Dictionary *, const char *key, const char *defn);
char *find_word(const Dictionary *, const char *key);
void delete_word(Dictionary *, const char *key);



Example
#define HASHSIZE 101
struct Nlist {
char *word; /* search word */
char *defn; /* word definition */
struct Nlist *next; /* pointer to next entry in chain */
};

struct Dictionary_t {
/* table is an array of pointers to entries */
struct Nlist *table[HASHSIZE];
};

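static unsigned hash_function(const char *str) {
    /* Hashing function converts a string to an index within hash
       table. */
    const unsigned HashValue = 31;
    unsigned h;

    /* Cast to unsigned char so that characters with negative
       char values do not disturb the computation. */
    for (h = 0; *str != '\0'; ++str)
        h = (unsigned char)*str + HashValue * h;
    return h % HASHSIZE;
}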



Example
/* Requires <stdlib.h> and <string.h>. allocate_string() and makenode()
   are helper functions that are not shown on the slides. */
int add_word(Dictionary *dict, const char *key, const char *defn) {
    /* Add new word to table. Replaces old definition if word already exists.
       Returns 0 if successful, and -1 if it fails. */
    unsigned i = hash_function(key); /* get table index */
    struct Nlist *pnode = dict->table[i];
    while (pnode && strcmp(pnode->word, key) != 0) /* search chain */
        pnode = pnode->next;
    if (pnode) { /* match found, replace definition */
        char *str = allocate_string(defn);
        if (str == NULL) /* allocation fails, return fail and keep old defn */
            return -1;
        free(pnode->defn);
        pnode->defn = str;
    }
    else { /* no match, add new entry to head of chain */
        pnode = makenode(key, defn);
        if (pnode == NULL)
            return -1;
        pnode->next = dict->table[i];
        dict->table[i] = pnode;
    }
    return 0;
}
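add_word relies on two helpers whose bodies are not given on the slides. The sketch below shows one way they might be written; it is an assumption, not the lecturer's code. Both helpers copy their string arguments to the heap so that the table owns its data.

```c
#include <stdlib.h>
#include <string.h>

struct Nlist {
    char *word;         /* search word */
    char *defn;         /* word definition */
    struct Nlist *next; /* pointer to next entry in chain */
};

/* Return a heap-allocated copy of str, or NULL if allocation fails. */
char *allocate_string(const char *str)
{
    char *copy = malloc(strlen(str) + 1); /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, str);
    return copy;
}

/* Create an unlinked entry holding copies of key and defn.
   Returns NULL (and releases everything) if any allocation fails. */
struct Nlist *makenode(const char *key, const char *defn)
{
    struct Nlist *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->word = allocate_string(key);
    node->defn = allocate_string(defn);
    if (node->word == NULL || node->defn == NULL) {
        free(node->word); /* free(NULL) is allowed */
        free(node->defn);
        free(node);
        return NULL;
    }
    node->next = NULL;
    return node;
}
```

Copying the strings means the caller may reuse or free its own buffers after add_word returns.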



Example
char *find_word(const Dictionary *dict, const char *key) {
    /* Find definition for keyword.
       Return NULL if key not found. */
    unsigned i = hash_function(key); /* get table index */
    struct Nlist *pnode = dict->table[i];

    while (pnode && strcmp(pnode->word, key) != 0)
        pnode = pnode->next; /* search index chain */

    if (pnode) /* match found */
        return pnode->defn;

    return NULL;
}



4.4. Algorithms Overview

An algorithm is a well-defined sequential computational technique that accepts a value or a collection of values as input and produces the output(s) needed to solve a problem.
An algorithm is thus a sequence of computational steps that transform the input into the output.
An algorithm is a tool for solving a computational problem
 An algorithm solves a problem if it returns a correct output for every problem input
 An algorithm is said to be correct if and only if it halts with the proper output for each input instance



4.4. Algorithms Overview

Why algorithm?
 Efficiency: Algorithms can perform tasks quickly and
accurately.
 Consistency: Algorithms are repeatable and produce consistent
results every time they are executed.
 Scalability: Algorithms can be scaled up to handle large
datasets or complex problems.
 Automation: Algorithms can automate repetitive tasks, reducing the need for human intervention and freeing up time for other tasks.
 Standardization: Algorithms can be standardized and shared
among different teams or organizations.



4.4. Algorithms Overview

Fundamental types of algorithms:
 Sorting algorithms: Bubble sort, Insertion sort, Merge sort, …
 Searching algorithms: Linear search, Binary search, Interpolation search, …
Graph algorithms
 Shortest path, Minimum spanning tree, Maximum flow (network flow) algorithms, …
Optimization algorithms
 Greedy, Dynamic programming, …
 Metaheuristic, nature-inspired optimization algorithms
and many more



4.5. Sorting algorithms

Objective: To rearrange the elements of a given array or list according to a comparison operator on the elements. The comparison operator decides the new order of the elements in the data structure. E.g. sorting an array of numbers in:
 Ascending order: from the smallest to the largest number
 Descending order: from the largest to the smallest number
Problem statement:
 Input: an array a of size n: a[0], a[1], …, a[n-1]
 Output: a permutation (reordering) a’[0], a’[1], …, a’[n-1] of a such that:
a’[0] <= a’[1] <= ... <= a’[n-1]



4.5. Sorting algorithms

 Sorting has many real-world applications:
 Commercial computing: government organizations, financial institutions, and commercial enterprises organize much of their information by sorting it
 Searching for information: keeping data in sorted order makes it possible to search through it efficiently, e.g. a search engine such as Google ranking websites by number of visits
 String processing algorithms are often based on sorting
 Sorting is a building block for more complex problems in numerical computation, operations research, optimization, etc.
 Classic sorting algorithms such as merge sort, bubble sort and quick sort date back to the 1940s to early 1960s; new ones are still being invented
 This chapter introduces 4 algorithms: selection sort, merge sort, insertion sort and quick sort



4.5. Sorting algorithms

Input data:
 a[0..n-1]: unsorted
Desired output:
 The data sorted in a given order, e.g. a[0] <= a[1] <= … <= a[n-1]



4.5.1. Selection sort

State before step k:
 a[0..k-1]: the k smallest numbers, already sorted
 a[k..n-1]: the remaining data, unsorted
Steps:
 Find the smallest number in a[k..n-1]
 Swap a[k] and the smallest number we found above
State after step k:
 a[0..k]: the smallest numbers, already sorted (the sorted part has grown by one element)



Find the smallest number
/* Yield location of smallest element
in a[k..n-1] */
/* Assumption: k < n */
/* Returns index of smallest, does not return the
smallest value itself */

int min_loc (int a[], int k, int n) {


/* a[pos] is smallest element found so far */
int j, pos;
pos = k;
for (j = k + 1; j < n; j = j + 1)
if (a[j] < a[pos])
pos = j;
return pos;
}



Swapping for rearrangement

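/* Exchange the values pointed to by x and y */
void swap (int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Sort a[0..n-1] in non-decreasing order (rearrange
   elements in a so that a[0]<=a[1]<=…<=a[n-1] ) */
void sel_sort (int a[], int n) {
    int k, m;
    for (k = 0; k < n - 1; k = k + 1) {
        m = min_loc(a,k,n);
        swap(&a[k], &a[m]);
    }
}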
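Putting the pieces together, a self-contained sketch of selection sort could look as follows. It reuses the function names from the slides; the body of swap is an assumption, since the slides only call it.

```c
/* Return the index of the smallest element in a[k..n-1] (k < n). */
int min_loc(int a[], int k, int n)
{
    int j, pos = k; /* a[pos] is the smallest element found so far */
    for (j = k + 1; j < n; j = j + 1)
        if (a[j] < a[pos])
            pos = j;
    return pos;
}

/* Exchange the values pointed to by x and y. */
void swap(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Sort a[0..n-1] in non-decreasing order. */
void sel_sort(int a[], int n)
{
    int k, m;
    for (k = 0; k < n - 1; k = k + 1) {
        m = min_loc(a, k, n);   /* smallest element of the unsorted part */
        swap(&a[k], &a[m]);     /* move it to the end of the sorted part */
    }
}
```

For example, calling sel_sort on {3, 12, -5, 6, 142, 21, -17, 45} with n = 8 rearranges the array into {-17, -5, 3, 6, 12, 21, 45, 142}.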





Algorithm evaluation

How many steps do we need to sort n numbers?
 Before each swap, the whole unsorted part of the array must be scanned
 The length of the unsorted part is initially n and shrinks by 1 after every swap
 The algorithm repeats the scan/swap step n-1 times
 The total number of steps is proportional to n²
Conclusion: selection sort is pretty slow with large arrays
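The count can be made explicit. Assuming the sel_sort and min_loc functions shown earlier, pass k (for k = 0, …, n-2) scans a[k+1..n-1] and therefore performs n-1-k comparisons, so the total number of comparisons is:

```latex
\[
\sum_{k=0}^{n-2} (n-1-k) \;=\; (n-1) + (n-2) + \dots + 1 \;=\; \frac{n(n-1)}{2} \;\approx\; \frac{n^2}{2}
\]
```

For n = 8 this gives 28 comparisons, and doubling n roughly quadruples the work.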



Better algorithm?

Algorithms with complexity of n log n
 Merge sort
 Quick sort (on average)
As the array size grows, the time taken by algorithms with n² complexity becomes much higher than that of algorithms with n log n complexity



4.5.2 Merge sort

Basic idea:
 Start with two sorted arrays, with a pointer (index) at the beginning of each array
 Repeatedly compare the two elements at the pointers, copy the smaller one to the output array and advance its pointer; the result is a bigger sorted array
 After the two arrays have been merged completely, we get the desired sorted array
The basic operation is merging
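The merging operation can be sketched as follows. The function name merge and its parameter list are assumptions made for this illustration.

```c
/* Merge sorted a[0..na-1] and sorted b[0..nb-1] into out[0..na+nb-1].
   out must not overlap a or b. */
void merge(const int a[], int na, const int b[], int nb, int out[])
{
    int i = 0; /* pointer into a */
    int j = 0; /* pointer into b */
    int k = 0; /* next free position in out */

    /* Copy the smaller front element until one array is used up. */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* Copy whatever is left; at most one of these loops runs. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element is copied exactly once, which is why the work is proportional to the total number of elements.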





Merge sort

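Merging two sorted arrays with n elements in total needs at most n comparisons and n copy operations, thus the workload is proportional to n
However, merging alone is not yet a sorting algorithm, because it requires inputs that are already sorted
So how do we sort with it?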



Merge sort implementation

To merge two arrays, we need to know where each sorted array starts and ends (the positions of the pointers)
At the starting point, each element of the array is treated as a sorted array of 1 element
Merge sort:
 Merge each pair of 1-element arrays into a 2-element array
 Merge each pair of 2-element arrays into a 4-element array
 Merge each pair of 4-element arrays into an 8-element array
 And so on until the whole array is merged



Example

a 3 12 -5 6 142 21 -17 45



Example

Sorting task is done
Each merging step takes a duration proportional to n
How many times do we merge? It is 3 in this example
In general, we need log2(n) merging steps
 When n = 8: log2(8) = 3
Total time consumed is proportional to n log2(n) (or n log n).
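The pairwise merging passes described in this section correspond to a bottom-up merge sort. The sketch below is one possible implementation, not necessarily the one intended by the slides; merge_runs and merge_sort are hypothetical names, and a temporary array of n elements is assumed to be acceptable.

```c
#include <stdlib.h>
#include <string.h>

/* Merge sorted src[lo..mid-1] and src[mid..hi-1] into dst[lo..hi-1]. */
static void merge_runs(const int src[], int dst[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[i] <= src[j]) ? src[i++] : src[j++];
    while (i < mid)
        dst[k++] = src[i++];
    while (j < hi)
        dst[k++] = src[j++];
}

/* Sort a[0..n-1] in non-decreasing order.
   Returns 0 on success, -1 if the temporary array cannot be allocated. */
int merge_sort(int a[], int n)
{
    int width, lo;
    int *tmp;

    if (n < 2)
        return 0;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    /* width = length of the sorted runs merged in this pass: 1, 2, 4, ... */
    for (width = 1; width < n; width *= 2) {
        for (lo = 0; lo < n; lo += 2 * width) {
            int mid = (lo + width < n) ? lo + width : n;
            int hi = (lo + 2 * width < n) ? lo + 2 * width : n;
            merge_runs(a, tmp, lo, mid, hi);
        }
        memcpy(a, tmp, (size_t)n * sizeof *a); /* result of this pass */
    }
    free(tmp);
    return 0;
}
```

For the example array 3 12 -5 6 142 21 -17 45, the three passes produce 3 12 | -5 6 | 21 142 | -17 45, then -5 3 6 12 | -17 21 45 142, and finally -17 -5 3 6 12 21 45 142.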





Can any approach perform better than n log n?

In general, for sorting based on comparisons, the answer is NO
However, in some special cases, we may do better
E.g.: when sorting exam papers by score, put each paper into one of 10 piles according to its score; the required time is proportional to n
The performance of an algorithm can be evaluated via mathematical analysis, without running it on a computer
This area of mathematics and computer science is computational complexity theory. It contains many interesting open problems
E.g.: The P versus NP problem: whether every problem whose solution can be quickly verified can also be solved quickly
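The exam-paper example can be sketched as a counting sort. The code assumes integer scores in the range 0..9 (one pile per score); pile_sort and NUM_PILES are hypothetical names.

```c
enum { NUM_PILES = 10 }; /* assumed: scores are integers 0..9 */

/* Sort scores[0..n-1] in non-decreasing order in time proportional to n.
   Returns 0 on success, or -1 (array unchanged) if a score is out of range. */
int pile_sort(int scores[], int n)
{
    int count[NUM_PILES] = { 0 }; /* count[s] = number of papers in pile s */
    int i, s, k = 0;

    for (i = 0; i < n; i++)
        if (scores[i] < 0 || scores[i] >= NUM_PILES)
            return -1;
    /* Put every paper on its pile: one step per paper. */
    for (i = 0; i < n; i++)
        count[scores[i]]++;
    /* Collect the piles from the lowest score to the highest. */
    for (s = 0; s < NUM_PILES; s++)
        while (count[s]-- > 0)
            scores[k++] = s;
    return 0;
}
```

No two papers are ever compared with each other, which is why the n log n limit for comparison-based sorting does not apply here.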



Efficiency

An efficient program saves resources
Efficiency is usually measured by execution time and required memory space
Many details of programming have little or no effect on efficiency
Efficiency is most often achieved by selecting the right algorithm and/or data structure



4.5.3 Example of sorting complex data structure

Revision of structures and arrays: a structure represents a single data record, while a computer program usually processes a set of data records
E.g.: student records, staff records, customer records, etc.
In each case there are many records of the same kind, so an array of structures is used to store the data



Sorting an array of structure

Before sorting:
name   David   Kathryn  Sarah   Phil    Casey
id     920915  901028   900317  920914  910607
score  2.9     4.0      3.9     2.8     3.6

After sorting by score:
name   Phil    David    Casey   Sarah   Kathryn
id     920914  920915   910607  900317  901028
score  2.8     2.9      3.6     3.9     4.0

typedef struct {
char name[MAX_NAME + 1];
int id;
double score;
} StudentRecord;



Revision of selection sort
int min_loc (int a[ ], int k, int n) {
int j, pos; pos = k;
for (j = k + 1; j < n; j = j + 1)
if (a[j] < a[pos])
pos = j;
return pos;
}
void swap (int *x, int *y);
void sel_sort (int a[ ], int n) {
int k, m;
for (k = 0; k < n - 1; k = k + 1) {
m = min_loc(a,k,n);
swap(&a[k], &a[m]);
}
}



Sorting an array of structure

First, identify the field to sort by
 E.g.: sort by score
Change the element type of the array to StudentRecord
Rewrite the comparison in the min_loc function
Write a swap function for StudentRecord



Sorting an array of structure
int min_loc (StudentRecord a[ ], int k, int n)
{
int j, pos; pos = k;
for (j = k + 1; j < n; j = j + 1)
if (a[j].score < a[pos].score)
pos = j;
return pos;
}

void swap (StudentRecord *x, StudentRecord *y);
void sel_sort (StudentRecord a[ ], int n) {
int k, m;
for (k = 0; k < n - 1; k = k + 1) {
m = min_loc(a,k,n);
swap(&a[k], &a[m]);
}
}
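A self-contained version of the record sort is sketched below. The value MAX_NAME = 31 and the body of swap are assumptions, since the slides only declare them; structure assignment copies a whole record, including the name array.

```c
#define MAX_NAME 31 /* assumed maximum name length */

typedef struct {
    char name[MAX_NAME + 1];
    int id;
    double score;
} StudentRecord;

/* Exchange two whole records. */
void swap(StudentRecord *x, StudentRecord *y)
{
    StudentRecord tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Return the index of the record with the lowest score in a[k..n-1]. */
int min_loc(StudentRecord a[], int k, int n)
{
    int j, pos = k;
    for (j = k + 1; j < n; j = j + 1)
        if (a[j].score < a[pos].score)
            pos = j;
    return pos;
}

/* Sort a[0..n-1] in non-decreasing order of score. */
void sel_sort(StudentRecord a[], int n)
{
    int k, m;
    for (k = 0; k < n - 1; k = k + 1) {
        m = min_loc(a, k, n);
        swap(&a[k], &a[m]);
    }
}
```

Applied to the five example records, the resulting order is Phil, David, Casey, Sarah, Kathryn. Only the comparison in min_loc and the type in swap differ from the integer version.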



Sorting by alphabetic order

A function for comparing two strings is required

Before sorting:
name   David   Kathryn  Sarah   Phil    Casey
id     920915  901028   900317  920914  910607
score  2.9     4.0      3.9     2.8     3.6

After sorting by name:
name   Casey   David    Kathryn  Phil    Sarah
id     910607  920915   901028   920914  900317
score  3.6     2.9      4.0      2.8     3.9

typedef struct {
char name[MAX_NAME + 1];
int id;
double score;
} StudentRecord;



String compare revision

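"Alice" < "Bob"
"Dave" < "David"
"Rob" < "Robert"
Strings are compared character by character in dictionary (lexicographic) order; a string that is a prefix of another one is the smaller of the two
#include <string.h>
int strcmp(const char *str1, const char *str2);
The return value is
 A negative number if str1 < str2
 Zero if str1 equals str2
 A positive number if str1 > str2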



Sorting by alphabetic order
int min_loc (StudentRecord a[ ], int k, int n)
{
int j, pos; pos = k;
for (j = k + 1; j < n; j = j + 1)
if (0 > strcmp(a[j].name, a[pos].name))
pos = j;
return pos;
}
void swap (StudentRecord *x, StudentRecord *y);
void sel_sort (StudentRecord a[ ], int n) {
int k, m;
for (k = 0; k < n - 1; k = k + 1) {
m = min_loc(a,k,n);
swap(&a[k], &a[m]);
}
}



Thinking in data structure

If we want to store information of a song in a computer


 What kind of information needs to be stored?
 How to organize the information?
 How to implement in C?
And if
 We need information of a CD
 Or information of a set of CDs



4.5.4 Insertion sort

Insertion sort is a simple sorting algorithm that works by iteratively inserting each element of an unsorted list into its correct position in a sorted portion of the list.
Steps for ascending sorting:
 Compare the 2nd element with the 1st one (assumed to be sorted) and swap them if 2nd element < 1st element
 Compare the 3rd element with the 2nd one, then the 1st one, and move the 3rd to its correct position as necessary
 Continue the process until the end of the array



4.5.4 Insertion sort

a 3 12 -5 6 142 21 -17 45

a 3 12 -5 6 142 21 -17 45

a -5 3 12 6 142 21 -17 45

a -5 3 6 12 142 21 -17 45



4.5.4 Insertion sort

a -5 3 6 12 142 21 -17 45

a -5 3 6 12 21 142 -17 45

a -17 -5 3 6 12 21 142 45

a -17 -5 3 6 12 21 45 142



Implementation

/* sort student records a[0..size-1] in */
/* ascending order by score */
void insert (StudentRecord a[ ], int j); /* defined below */

void sort (StudentRecord a[ ], int size)
{
    int j;
    for (j = 1; j < size; j = j + 1)
        insert(a, j);
}

Implementation

/* given that a[0..j-1] is sorted, move a[j]
   to the correct location so that a[0..j]
   is sorted by score */
void insert (StudentRecord a[ ], int j) {
    int i;
    StudentRecord temp;
    temp = a[j];
    for (i = j; i > 0 &&
         a[i-1].score > temp.score; i = i-1) {
        a[i] = a[i-1];
    }
    a[i] = temp;
}

Implementation

/* alternative version of insert: given that a[0..j-1]
   is sorted, move a[j] to the correct location so that
   a[0..j] is sorted by name */
void insert (StudentRecord a[ ], int j) {
    int i;
    StudentRecord temp;
    temp = a[j];
    for (i = j; i > 0 &&
         strcmp(a[i-1].name, temp.name) > 0;
         i = i-1) {
        a[i] = a[i-1];
    }
    a[i] = temp;
}
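The same technique applied to a plain integer array is sketched below, so that it can be checked against the trace shown earlier in this section. The function name insertion_sort is an assumption; insert follows the structure of the record version.

```c
/* Given that a[0..j-1] is sorted, move a[j] to the correct
   location so that a[0..j] is sorted. */
void insert(int a[], int j)
{
    int i;
    int temp = a[j];
    /* Shift larger elements one position to the right. */
    for (i = j; i > 0 && a[i - 1] > temp; i = i - 1)
        a[i] = a[i - 1];
    a[i] = temp;
}

/* Sort a[0..size-1] in ascending order. */
void insertion_sort(int a[], int size)
{
    int j;
    for (j = 1; j < size; j = j + 1)
        insert(a, j);
}
```

Starting from 3 12 -5 6 142 21 -17 45, the call insert(a, 2) produces -5 3 12 6 142 21 -17 45, which matches the third row of the trace.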



4.5.5 Quick sort

Sort the following integer array:
40 20 10 80 60 50 7 30 100

Select the ‘pivot’ element, i.e. the element that the others are compared with: e.g. select the first element (40)



4.5.5 Quick sort

Divide the array into smaller ones: given the pivot element, the array is divided into 2 sub-arrays
 The first sub-array includes the elements <= the pivot
 The other includes the elements > the pivot

7 20 10 30 40 50 60 80 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]

<= data[pivot] > data[pivot]



4.5.5 Quick sort

Recursion: Quicksort Sub-arrays

7 20 10 30 40 50 60 80 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]

<= data[pivot] > data[pivot]
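The partition and recursion steps can be sketched as follows. This is one common scheme with the first element as pivot, shown as an illustration rather than as the reference solution; the names swap_int, partition and quick_sort are assumptions.

```c
/* Exchange the values pointed to by x and y. */
static void swap_int(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Partition data[first..last] around the pivot data[first].
   Afterwards, elements <= pivot are to the left of the pivot and
   elements > pivot are to its right. Returns the pivot's final index. */
int partition(int data[], int first, int last)
{
    int pivot = data[first];
    int i = first + 1; /* scans right for an element > pivot */
    int j = last;      /* scans left for an element <= pivot */

    while (1) {
        while (i <= j && data[i] <= pivot)
            i++;
        while (i <= j && data[j] > pivot)
            j--;
        if (i > j)
            break;
        swap_int(&data[i], &data[j]);
    }
    swap_int(&data[first], &data[j]); /* put the pivot in its place */
    return j;
}

/* Sort data[first..last] in non-decreasing order. */
void quick_sort(int data[], int first, int last)
{
    if (first < last) {
        int p = partition(data, first, last);
        quick_sort(data, first, p - 1);
        quick_sort(data, p + 1, last);
    }
}
```

With the array 40 20 10 80 60 50 7 30 100, the first call to partition yields 7 20 10 30 40 50 60 80 100 and returns index 4.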



Quicksort Analysis

 Assume that keys are random, uniformly distributed.
 What is the best-case running time?
 Recursion:
1. Partition splits the array into two sub-arrays of size about n/2
2. Quicksort each sub-array
 Depth of recursion tree? O(log2 n)
 Number of accesses in partition at each level of the tree? O(n)

7 20 10 30 7 20 10 7 10 20



Quicksort Analysis

 Assume that all elements are randomly distributed
 Best-case running time: O(n log2 n)
 How about the worst-case scenario?
 Assume that the elements are already in ascending order and the leftmost element is selected as the pivot

pivot_index = 0
2 4 10 12 13 50 57 63 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]

 Each partition splits the array into two extremely unbalanced parts: an empty one and one with all the remaining elements
 The recursion depth becomes n, so the time complexity is O(n²)



Another approach to choose the pivot

Select the median of 3 elements in the array
 data[0], data[n/2], and data[n-1].
Use this median value as the pivot
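A small sketch of this pivot choice is given below. It returns the index of the median (the middle value) of data[0], data[n/2] and data[n-1]; the function name median_of_three is an assumption.

```c
/* Return the index (0, n/2 or n-1) that holds the median of
   data[0], data[n/2] and data[n-1]. Requires n >= 1. */
int median_of_three(const int data[], int n)
{
    int a = 0, b = n / 2, c = n - 1;

    if (data[a] < data[b]) {
        if (data[b] < data[c])
            return b;                       /* a < b < c  */
        return (data[a] < data[c]) ? c : a; /* c <= b     */
    }
    /* here data[b] <= data[a] */
    if (data[a] < data[c])
        return a;                           /* b <= a < c */
    return (data[b] < data[c]) ? c : b;     /* c <= a     */
}
```

For an array that is already sorted, the function returns n/2, so the partition is balanced instead of degenerate.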



Optimizing Quicksort

Find the pivot element


If the size is smaller than or equal to 3:
 One element: do nothing
 Two elements: if (data[first] > data[second]) swap them
 Three elements: homework
Quicksort can be implemented using recursion (homework)



4.6. Searching

Why are searching algorithms important?


Used in a wide range of problem-solving tasks
 Find an approximate root of hard-to-solve equations
 Search for the optimal operation point of a system
 Figure out the best route in a map
 Choose the best move in a game
 Optimize an industrial process by changing the parameters of the process
 Solve scheduling problems
To improve the efficiency and performance of a system
 A good searching algorithm can shorten decision time
 Performance is a critical criterion for real-time systems
Two popular algorithms: Linear Search and Binary Search
4.6.1. Linear Search

Assume that we need to find the number k = 5 in an array a[] = {2, -3, -8, 5, 11}
i = 0: a[0] = 2, not equal to k
i = 1: a[1] = -3, not equal to k
i = 2: a[2] = -8, not equal to k
i = 3: a[3] = 5, equal to k, return i



4.6.1. Linear Search
#include <stdio.h>
#include <stdlib.h>

// search for an element with value x in
// an N-element array
int linearSearch(int arr[], int N, int x)
{
for (int i = 0; i < N; i++)
if (arr[i] == x)
return i;
return -1;
}



4.6.1. Linear Search
int main(void)
{
int arr[] = { 4, 6, 7, 12, 46 };
int x = 7;
int N = sizeof(arr) / sizeof(arr[0]);

// Function call
int result = linearSearch(arr, N, x);
if (result == -1) {
printf("Element is not present in array");
}
else {
printf("Element is present at index %d", result);
}
return 0;
}



4.6.2. Binary Search

Binary search is an algorithm whose input is a sorted list of elements.
 If the element you’re looking for is in that list, binary search returns the position where it’s located.
 Otherwise, binary search returns a special value meaning “not found” (-1 in the implementation later in this section).



4.6.2. Binary Search

Steps:
 Divide the search space into two halves by finding the middle
index “mid”
 Compare the middle element of the search space with the key
 If the key is found at middle element, the process is terminated
 If the key is not found at middle element, choose which half
will be used as the next search space
o If the key is smaller than the middle element, then the left side is used
for next search
o If the key is larger than the middle element, then the right side is used
for next search
 This process is continued until the key is found or the total
search space is exhausted
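The steps above map directly onto a recursive function. The sketch below is one possible form; the name binary_search_rec is an assumption, and -1 is used as the "not found" value.

```c
/* Search sorted arr[low..high] for x.
   Returns the index of x, or -1 if x is not present. */
int binary_search_rec(const int arr[], int low, int high, int x)
{
    int mid;

    if (low > high) /* search space is exhausted */
        return -1;
    mid = low + (high - low) / 2; /* avoids overflow of low + high */
    if (arr[mid] == x)
        return mid;
    if (arr[mid] < x) /* key is larger: continue with the right side */
        return binary_search_rec(arr, mid + 1, high, x);
    /* key is smaller: continue with the left side */
    return binary_search_rec(arr, low, mid - 1, x);
}
```

Each call halves the search space, so at most about log2(n) + 1 calls are needed for an array of n elements.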



4.6.2. Binary Search

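Consider the array arr[] below, in which the target = 23 is to be found.
arr[] 2 5 8 12 16 23 38 56 72 91

Step 1: low = 0, high = 9, mid = 4: arr[4] = 16 < 23, continue with the right half
Step 2: low = 5, high = 9, mid = 7: arr[7] = 56 > 23, continue with the left half
Step 3: low = 5, high = 6, mid = 5: arr[5] = 23 = target, found at index 5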
4.6.2. Binary Search

Implementation:
 Iterative Binary Search Algorithm
 Recursive Binary Search Algorithm
Complexity
 Time complexity:
o Worst case complexity: O(log n)
o Best case complexity: Ω(1)
o Average case complexity: Θ(log n)
 Space complexity: O(1) for the iterative version, i.e., three extra variables for indices
Applications:
 Be a part of complex algorithms used in machine learning
 Searching in computer graphics
 Searching in database



Binary Search - Implementation
#include <stdio.h>
#include <stdlib.h>

// An iterative binary search function.


int binarySearch(int arr[], int low, int high, int x) {
while (low <= high) {
int mid = low + (high - low) / 2;

// Check if x is present at mid


if (arr[mid] == x) return mid;

// If x greater, ignore left half


if (arr[mid] < x) low = mid + 1;

// If x is smaller, ignore right half


else high = mid - 1;
}

// If we reach here, then element was not present


return -1;
}



Binary Search - Implementation
int main(int argc, char* argv[]) {
int arr[] = { 4, 6, 7, 12, 46 };
int n = sizeof(arr) / sizeof(arr[0]);
int x = 12;

int result = binarySearch(arr, 0, n - 1, x);


if (result == -1)
printf("Element is not present in array");
else
printf("Element is present at index %d", result);

return 0;
}

Homework:
 Implementation of Binary search algorithm with Recursive
approach



END OF CHAPTER 4
