178 Midterm Review

Uploaded by

Ryan Wilder

178 Midterm Review

Weekly Review

Week 1

Long and Short Qualifiers


- The long and short qualifiers do not fix an exact size: the C standard only guarantees minimum ranges, and each compiler chooses the actual widths.
o On Windows compilers, long int and int have the same length of 32 bits.
o On other systems, such as Linux and macOS, long int can be 64 bits while int is 32 bits.
- sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
- The role of the long and short qualifiers is compiler dependent and related to the target system that the executable will run on.
- short is typically 2 bytes, int is 4, and long is 4 or 8 depending on the target system.

Pointers
- Pointer declaration: int *p or int* p
- A pointer stores an address in memory as its value.
o int a = 5;
  p = &a; // p = the address of a
- Once a pointer is assigned the address of a variable, it is said to point to that variable.
- If you look at the value of p you will see a memory address.
o printf("Value stored in variable 'a': %d", a); // prints 5
o printf("Address of variable 'a': %p", (void *)&a); // prints an address, something like 0x7ffcad243
o printf("Value stored in pointer 'p': %p", (void *)p); // prints the same address
- You are able to retrieve or change the value of whatever is stored in the memory location pointed to by a pointer using a dereferencing operation.
o The dereferencing operator is simply an asterisk (*) placed before the pointer name. Every time this is seen aside from the pointer’s declaration, it means ‘the value pointed to by’.
o *p = 6; should be read as “the value pointed to by p is assigned the value of 6”, which results in a being assigned the value of 6.
o int b = *p; should be read as “b is assigned the value pointed to by p”, which results in b being assigned the value of 6.

Double Pointers
- Double pointer declaration: int **dp or int** dp.
- A double pointer stores the address of another pointer in memory as its value
(remember, pointers themselves take up space in memory and have their own address)
dp = &p;
o Once a double pointer is assigned the address of another pointer, it is said to point to that pointer (dp points to p).
o If you take a peek at the value of dp, you will see a memory address (the memory address that pointer p is located at).
- You are able to retrieve or change the address that is stored in the pointer pointed to by a double pointer using a single dereferencing operation.
o *dp = &b; // the pointer pointed to by dp (p) now points to b.
          // in other words, p now points to b.
- You are able to retrieve or change the value that is stored in the memory location
pointed to by the pointer pointed to by the double pointer using a double dereferencing
operation (**)
o**dp = 10; //b now equals 10
- To change the value of a pointer that is passed to a function as an argument, you need a pointer to a pointer, because arguments are passed into a function by value.
o Pass by value means a copy of the actual parameter’s value is made in memory and passed in, so the function works on a copy of the contents of the actual parameter.
▪ Ex: I create a function that takes two pointers as arguments and has them swap their values (i.e. what they’re pointing to).
▪ If only a pointer is used: the function will not work, because when the function is called elsewhere and pointers are passed in as arguments, a duplicate set of pointers (or rather, the values held by the pointers, which are addresses) is passed to the function instead of the original pointers. Thus, any modifications to the pointers in the SwapRefs function apply only to the pointer copies and not to the original pointers, due to the pass-by-value characteristic.
▪ If a double pointer is used:
void SwapRefs(int **dptr_1, int **dptr_2) {
    int *temp;
    temp = *dptr_1;
    *dptr_1 = *dptr_2;
    *dptr_2 = temp;
}
If we use double pointers as function arguments, the double pointers themselves are still copied due to pass-by-value. However, the copies hold the addresses of the original pointers, so dereferencing them inside the function modifies the original pointers, and any changes made this way are retained outside of the function.
- In simple words, pass in double pointers (**) when you want to preserve or retain changes in pointer assignments or memory allocations outside of a function’s call.

Examples
- void example2() {
    printf("Showing example 2: Pointers to Pointers Example\n");
    int var = 2022;
    int* ptr;
    int** ptrToPtr;
    ptr = &var;
    ptrToPtr = &ptr;
    printf("Output of var = %d\n", var); // 2022
    printf("Output of *ptr = %d\n", *ptr); // 2022 (contents of memory cell pointed to by ptr)
    printf("Output of ptr = %p\n", (void *)ptr); // e.g. 0x1230 (address of var, contents of ptr)
    printf("Output of **ptrToPtr = %d\n", **ptrToPtr); // 2022 (content of the variable whose address is held by ptr, whose address is held by ptrToPtr)
    printf("Output of *ptrToPtr = %p\n", (void *)*ptrToPtr); // e.g. 0x1230 (address of var, content of the pointer pointed to by ptrToPtr)
    printf("Output of ptrToPtr = %p\n", (void *)ptrToPtr); // e.g. 0x1220 (address of ptr, content of ptrToPtr)
}

Conclusion:
- the asterisk in front of pointers gives access to the content of the memory cells they
point to, a process known as dereferencing.
- (*) in front of pointer: gives access to the content of the variable it points to, and this variable can be of various data types such as int, char, float, or any other.
- (*) in front of pointer-to-pointer: gives access to the content of the pointer they are
pointing to, and this content is typically the address of a memory cell.
- (**) in front of pointer-to-pointer: gives access to the content of the variable pointed to
by the pointer which in turn is pointed to by that pointer-to-pointer.
Week 2

Function Pointers

- Pointers to functions
o Ex: If you had a function fun() that returned an int and took a single int as a parameter, a function pointer could be declared and initialised like this:
int (*fun_p)(int) = &fun;
Or
int (*fun_p)(int);
fun_p = &fun;
o The first int in the declaration is the return type of the function being pointed to. The pointer name with * must be in (); without the (), the declaration becomes a prototype for a function that returns int *. The int in the second () is the type of the input parameter.
- You don’t need the & and * operators for getting ‘the address of a function’ and the ‘function pointed to by the function pointer’, respectively. So, this would also be legal:
int (*fun_p)(int) = fun; // & removed
fun_p(10); // * removed, synonymous to (*fun_p)(10);
- Function pointers can be useful when you want to create a callback mechanism (a function that is not called explicitly by the programmer; instead, some mechanism waits for events to occur, such as an interrupt, and calls selected functions in response to particular events) and need to pass the address of a function to another function. They can also be useful when you want to store an array of functions to call dynamically.
Example
- You have data in an array that you want to process, but it isn’t immediately clear how, and you want the user to choose at runtime what should be done with that data. For instance, a user could:
▪ Press 0 to add an integer to all elements of an array.
▪ Press 1 to subtract an integer from all elements of an array.
▪ Press 2 to multiply all elements of an array by an integer.
And you can write functions for each of these operations:
o void add(int a[], int a_size, int b){
    for (int i=0; i<a_size; i++)
        a[i]+=b;
}
o void subtract(int a[], int a_size, int b){
    for (int i=0; i<a_size; i++)
        a[i]-=b;
}
o void multiply(int a[], int a_size, int b){
    for (int i=0; i<a_size; i++)
        a[i]*=b;
}
Instead of having to write a switch statement to take in an input from the user and run the correct function, you can refer to these functions in an array of function pointers, where whatever the user inputs through scanf can be used directly as an index into that array.
int main(){
    // fun_ptr_arr is an array of function pointers matching add, subtract and multiply
    void (*fun_ptr_arr[])(int[], int, int) = {add, subtract, multiply};
    int data[] = {15, 20, 25};
    unsigned int ch;
    int b = 10;

    printf("Enter Choice: 0 for add, 1 for subtract and 2 for multiply\n");

    if (scanf("%u", &ch) != 1 || ch > 2)
        return 0;

    (*fun_ptr_arr[ch])(data, 3, b); // calls the selected function on data
    return 0;
}

Static Variables vs Global Variables in C

- Static variables are variables which, once declared, get destroyed only when the program has completed its execution.
- They retain their previous value between uses, instead of being recreated each time their declaration is reached.
- They are different from normal variables because normal variables do not retain their previous value. Normal variables get destroyed once they go out of their scope. Static variables, once initialized, get destroyed only after the whole program has executed.
o Consider a scenario where you have a function, and you have to keep track of the number of calls made to this function. Using a normal variable to keep the count of the function calls will not work, since at every function call the count variable will reinitialize its value.
o In order to count the function calls, we can use static variables, since static variables retain their value from the previous call.
▪ #include <stdio.h>
void function1(){
    static int count1 = 0;
    count1++;
    printf("count of function 1 is %d\n", count1);
}

int main(){
    function1(); // prints 'count of function 1 is 1'
    function1(); // prints 'count of function 1 is 2'
    return 0;
}

Syntax of Static Variable in C

- The syntax for declaring a static variable in C is as follows:


static int variable_name; // declares a static integer named variable_name
- We can also initialize the value of the static variable while declaring it as follows:
static int variable_name = 10; // declares and initializes a static int
- Note: The value of a static variable can be reassigned anywhere within its scope.
#include <stdio.h>
int main(){
    static int variable = 10;
    variable++;
    printf("%d\n", variable); // prints 11
    variable = 0; // assigning a new value to the static variable
    printf("%d\n", variable); // prints 0
}

Difference Between Static and Global Variables

- Static variables can be declared both inside and outside functions, while global variables are always declared outside all functions.
- If we call a function in which a static variable is declared as well as initialized, calling this function again will not reinitialize the value of this static variable.
o If this function is called multiple times, the static variable will retain the value from the previous call instead of reinitializing its value.
- If we have a global variable named count declared outside all functions and a function that declares its own local variable named count, then the local variable shadows the global one: changing count inside the function will not alter the value of the global variable count.
Primary Uses of a Static Keyword

- Use of static keyword inside a function
oIf we want to use a common value of a variable at every function call, we can
declare a static variable inside the function.
oWhen a static keyword is used inside a function, it prevents the reinitialization of
the variable on multiple function calls.
 Let us say that a function has a static variable having name as
variable_name and its value is initialized with 10. Also the function
increments the value of variable_name on every function call.
#include<stdio.h>
void my_function(){
    static int variable_name = 10;
    variable_name++;
    printf("%d\n", variable_name);
}
int main(){
    my_function(); // prints 11
    my_function(); // prints 12
}
▪ On calling this function the first time, variable_name will have the value 10 and after incrementation, its value will become 11. So, the function will print 11. Now, if we call this function again, the value of variable_name will not be re-initialized to 10. It keeps its value from the previous call, which is 11, and after incrementation it will become 12. So, this time the function will print 12.
- Use of static keyword to declare a global variable
o Static variables can also be declared globally, that is, outside of any function. This means that we can have the advantages of a static as well as global variable in a single variable.
o The difference between a static local variable and a static global variable lies in their scope. A static global variable can be accessed from anywhere inside the source file in which it is declared (but, unlike an ordinary global variable, not from other source files), while a static local variable only has block scope.
▪ So, the benefit of declaring a static variable globally is that it can be accessed from any function in that file.
#include<stdio.h>
static int static_global_variable = 0; // static global variable
void my_function(){
    static int static_variable = 0; // this variable can only be accessed in this function
}
int main(){
    static_global_variable++; // static global variable can be used here
    static_variable++; // gives a compilation error since static local variables only have block scope
}
- Properties of static variable
1. A static variable is destroyed only after the whole program gets executed. It
does not depend on the scope of the function in which it is declared.
2. A static variable retains its value from the previous call. This means that its value does not get re-initialized if the function in which it is declared gets called multiple times.
3. If no value is assigned to a static variable, by default, it takes the value 0.
4. The memory of the static variable is available throughout the program, but its
scope is only restricted to the block where it is declared. For example, if we
declare a static variable inside a function, we cannot access this static variable
from outside of the function.

Other

- The difference between static global variables and static local variables is that a static global variable can be accessed from anywhere inside the source file where it is declared, while a static local variable can be accessed only where its scope exists.
- A static variable gets destroyed only after the whole program gets executed.

Week 3

Arrays vs Linked List

Arrays | Linked List
Fixed size: resizing is expensive | Dynamic size
Insertions and deletions are inefficient: elements usually have to be shifted | Insertions and deletions are efficient: no shifting required
Random access: efficient indexing | No random access: not suitable for operations requiring access to elements by index, such as sorting
No memory waste if the array is full or almost full; otherwise, may result in much memory waste | Since memory is allocated dynamically, according to our need, there is no waste of memory
Sequential access is faster since elements are in contiguous memory locations | Sequential access is slow since elements are not in contiguous memory

Stack
- A stack is an ordered collection of items where the insertion and deletion of items occur
at the same end, often referred to as the ‘top’.
- It follows the Last-In-First-Out (LIFO) principle, where Push (add an item to the top) and Pop (remove the item from the top) operations are defined on stacks.
- This ADT is simple and efficient for managing function calls, recursion, and undo
mechanisms.

Reverse Polish Notation (RPN)

- Also known as Postfix Notation


- A mathematical expression format where operators are placed after their operands.
o It eliminates the need for parentheses and follows a straightforward evaluation approach.
- Ex: 3 4 + 5 * 6 -
1. Push ‘3’ onto the stack.
2. Push ‘4’ onto the stack.
3. Encountering ‘+’: Pop two operands (4, 3), perform the addition 3 + 4, push the result (7) onto the stack.
4. Push ‘5’ onto the stack.
5. Encountering ‘*’: Pop two operands (5, 7), perform the multiplication 7 * 5, push the result (35) onto the stack.
6. Push ‘6’ onto the stack.
7. Encountering ‘-’: Pop two operands (6, 35), perform the subtraction 35 - 6 (the operand popped second is the left-hand side), push the result (29) onto the stack.
8. The final result is 29.
- RPN is particularly suited for computer processing as it avoids the need for parentheses and facilitates stack-based evaluation.

Polish Notation (PN)

- Also known as Prefix Notation.


- A mathematical expression format where operators precede their operands.
o Provides a concise and unambiguous representation of expressions.
- Ex: - + * 3 4 5 2 (evaluated by scanning from right to left, so the innermost operator is reached first)
1. Encountering ‘*’: Multiply operands ‘3’ and ‘4’ to get result (12).
2. Encountering ‘+’: Add result (12) and ‘5’ to get result (17).
3. Encountering ‘-’: Subtract ‘2’ from result (17) to get result (15).
4. Final result is 15.
- Polish Notation eliminates the need for parentheses and avoids ambiguity in the order of operations.

Infix Notation
- The standard mathematical expression format where operators are placed between
their operands.
oMost familiar notation but may require parentheses for clarity.
- Ex: (3+4)*5-6
1. Inside parentheses, add ‘3’ and ‘4’ to get (7).
2. Multiply (7) and ‘5’ to get (35).
3. Subtract ‘6’ from (35) to get (29).
4. Final result is 29.

Use Cases and Considerations:

- Infix Notation
oWidely used in mathematical literature and everyday expression.
oHuman-readable but requires parentheses for clarity.
- Polish Notation (Prefix) (PN)
oCommonly used in computer science and artificial intelligence.
oFacilitates easy parsing and evaluation.
- Reverse Polish Notation (Postfix) (RPN)
oPrevalent in stack-based calculators and some programming languages.
oEliminates the need for parentheses, making it suitable for automated processing.
- In conclusion, the choice between infix, prefix, or postfix, depends on the context and
requirements of the application, with considerations for both human readability and
computational efficiency.

Week 4

Stack ADT
- A stack is a linear data structure that stores arbitrary objects.
- Stack operations are performed based on the Last In First Out (LIFO) principle.
o Data is inserted at the top (last node) and removed at the top (last node) of the stack.
- A stack can be implemented using either an Array or a Singly Linked List as the underlying data structure.
1. History and Applications of a stack
oUsed by text editors for the undo or redo operations, string reversal, and web
browsers for page-visited history.
oCan be used indirectly as an auxiliary data structure when writing algorithms or as
a component of other data structures.
2. Operations
opush(e)
 Inserts element e, to be at the top of the stack.
o pop()
▪ Removes and returns the topmost element from the stack, or returns null if the stack is empty.
o isEmpty()
▪ Returns a Boolean value indicating if the stack is empty or not.
otop()
 Returns the topmost element in the stack or null if the stack is empty.
3. Performance and Memory Usage
o A stack can be implemented using either an Array or a Singly Linked List:
Arrays
▪ An ArrayStack is easy to implement and efficient.
▪ Each operation runs in constant time; time complexity: O(1).
▪ The memory usage (space complexity) is O(n), where n represents the number of elements.
▪ The drawbacks of this implementation are that the capacity of the stack needs to be set at compile-time and cannot be changed, and that if we try to push a new element onto an already full stack it will throw an exception.
Singly Linked List
▪ The Singly Linked List Stack aims to solve the problem identified in the ArrayStack implementation.
▪ By using a Singly Linked List under the hood, the capacity of the stack is dynamic.
▪ The top of the stack is set to the head of the Singly Linked List in order to be able to insert (push(e)) and delete (pop()) in constant time O(1) at the head.
▪ For the isEmpty() method to run in constant time (O(1)) we can keep track of the number of elements in an instance variable (like the “count” variable in Queue), or simply check whether the head is null.
▪ A stack backed by a Singly Linked List is the most efficient implementation, since all operations run in O(1) and memory usage is O(n).

Queue ADT

- A queue is a linear data structure that stores arbitrary objects.


- Queue operations are performed based on the First In First Out (FIFO) principle.
o Data is inserted at the tail (last node) and removed at the head (first node) of the queue.
- A queue can be implemented using either an Array or a Singly Linked List as the
underlying data structure.
1. History and Application of a Queue
o Used to access shared resources (e.g. a printer), message queues, and process scheduling.
oCan also be used indirectly as an auxiliary data structure when writing algorithms
or as a component of other data structures.
2. Operations
oenqueue(e)
 Inserts the element e at the rear of the queue.
odequeue()
 Removes and returns the element at the front of the queue, and returns
null if the queue is empty.
osize()
 Returns the number of elements in the queue, similar to ‘count’ defined
previously.
o isEmpty()
▪ Returns a Boolean value that indicates whether the queue is empty.
3. Performance and memory usage
o A queue can be implemented using either an Array, an Array-List, or a Singly Linked List.
o Array Based
▪ To keep the dequeue() operation from taking O(n) time to run, where n represents the number of elements that would need to be shifted forward, we define two variables, front and rear, which have the following meaning.
  ▪ Front: refers to the index of the first element in the queue, which is next to be dequeued unless the queue is empty (front = rear = -1). Initially set to -1. When an element is dequeued we need to increment the front index to the next element in the queue.
  ▪ Rear: refers to the index of the next available cell in the queue. Initially set to -1. When a new element is enqueued we need to increment the rear index to the next available cell in the queue.
▪ By using front and rear variables we are able to implement the enqueue(e) and dequeue() methods in constant time, O(1).
▪ Drawbacks
  ▪ IndexOutOfBounds
    o When we repeatedly enqueue and dequeue elements N times until front = rear = N, and then try to enqueue one more, an IndexOutOfBounds error would be thrown, even though the cells at the start of the array are free again.
  ▪ Inflexible Maximum Size
    o We have to specify the maximum size of the queue. The memory usage (space complexity) in a normal Array-Based queue is O(n), where n represents the number of elements.
▪ Circular Queue (Array-Based)
  ▪ The circular queue implementation aims to solve the first problem identified in the normal Array-Based queue implementation.
  ▪ To implement a circular queue, the modulo (%) operator is used to wrap indices around the end of the array: (front+1)%N or (rear+1)%N.
  ▪ Each method in a circular queue runs in constant time, O(1).
  ▪ The memory usage in a circular queue is O(n), where n represents the number of elements.
  ▪ Since this implementation also uses an Array under the hood, the second drawback (inflexible maximum size) identified in the Array-Based implementation still exists here.
oSingly Linked List
 The singly linked list implementation aims to solve the drawbacks
identified in the previous two implementations.
 By using a singly linked list under the hood, each queue operation runs in
time O(1) and we don’t have to specify the maximum size of the queue.
 The memory usage is O(n), where n represents the number of elements.
 To implement a queue using a singly linked list we need to maintain
references to both head and tail nodes in the list.
 In this way, we can remove from the head and add new nodes at
the tail efficiently.

Deque ADT

- Also known as a double-ended queue


- A data structure that supports insertion and deletion operations at both the front and
back of the queue.
- Is usually pronounced as ‘deck’ to avoid confusion with the dequeue method of a Queue
ADT.
1. History and Application of a Deque
o Used for a web browser’s history, where the most recently visited sites are added to the front of the deque and the sites at the back of the deque are removed after N minutes.
o Could also be thought of as waiting in line at an amusement park such as Wonderland: fast-pass holders joining at the front and people leaving from the back of a long line are exceptions to typical queue behaviour.
2. Operations
oinsert_front(e)
 Inserts a new element e at the front of the deque.
oinsert_rear(e)
 Inserts a new element e at the back of the deque.
oremove_front()
 Removes and returns the first element of the deque. Returns null if the
deque is empty.
oremove_rear()
 Removes and returns the last element of the deque. Returns null if the
deque is empty.
oisEmpty()
 Returns a Boolean value indicating whether or not the deque is empty.
3. Performance and memory usage
o A deque ADT is most often implemented using a doubly linked list under the hood, so it does not have a predefined size.
▪ Thus, the space used by a list with n elements is O(n).
▪ All deque operations run in constant time, O(1).

Week 5

Tree

Terminology
- Root
oThe first node of the tree
- Edge
o The connecting link between any two nodes is called an edge of the tree. A tree with N nodes has exactly N-1 edges.
- Parent
oThe node that is the predecessor of any node is known as a parent node.
- Child
o The immediate descendant of any node is known as a child node.
oIn a tree, a parent node can have up to the maximum number of allowed children
per node which is determined by the tree node’s structure.
oEvery node except for the root node is a child node.
- Siblings
o In a tree data structure, nodes that belong to the same parent are called siblings.
- Leaf
o A node with no children is known as a leaf node.
- Internal nodes
oNodes with at least one child are known as internal nodes.
oIn trees, nodes other than leaf nodes are internal nodes.
- Degree
oThe total number of children of a node is called the degree of the node.
- Level
o The root node is said to be at level 0, the root node’s children are at level 1, the children of the nodes at level 1 are at level 2, and so on.
- Path
o The sequence of nodes from one node to another node is called the path between those two nodes.
o The length of a path is the total number of edges in that path.
- Height
o The number of edges on the longest path from a node down to a leaf is called the height of that node (the length of the path from the root to a node is that node’s depth). In particular, the number of edges on the longest path from the root node to a leaf node is known as the ‘height of the tree’.
- Subtree
oEach child from a node shapes a sub-tree recursively and every child in the tree will
form a sub-tree on its parent node.

Tree Traversal

Breadth-first search vs Depth-first search


- Breadth-first search
oIs when you inspect every node on a level starting at the top of the tree and then
move to the next level.
oStoring the nodes in a queue data structure creates the level-by-level pattern of a
breadth-first search.
oChild nodes are searched in the order they are added to the queue. The nodes on
the next level are always behind the nodes on the current level.
oBreadth-first search is known as a complete algorithm since no matter how deep
the goal node is in the tree, it will always traverse every node up to and including
the goal node level.
- Depth-first search
oIs where you search deep into a branch and don’t move to the next one until
you’ve reached the end.
o Tree nodes stored in a stack data structure create the deep dive of a depth-first search.
oNodes added to the frontier early on can expect to remain in the stack while their
siblings’ children are searched.
oDepth-first search is not considered a complete algorithm since searching an
infinite branch in a tree can go on forever.
 In this situation, an entire section of the tree would be left uninspected.
- The location of the goal node has a significant impact on determining which search
algorithm will be able to find the goal first.
- With more information on the location of the goal value in the tree, you can choose to use either the breadth-first search or depth-first search algorithm.
- A tree traversal is an algorithm that visits every node in a tree in a specific order.

Binary Tree

Properties
- Each node in a binary tree can have at most two child nodes.
- These two children are called the left child and the right child.

DFS traversal algorithm


- Preorder
oroot before children.
- Inorder
oleft child, then root, then right child.
- Postorder
ochildren before root.
o Postorder comes up in problems where we have to aggregate information about the entire subtree rooted at each node.
oClassic examples are computing the size, the height, or the sum of values of the
tree.

Implementation notes
- Because trees are recursive data structures, algorithms on trees are most naturally
expressed recursively.
- However, the call stack has a limited size, which limits how many nested calls a program can make.
o If the height of the tree is larger than this limit, the program will crash with a stack overflow error.
o A recursive implementation is safe to use if:
▪ Somehow, we know that the input trees will be small enough.
▪ The tree is balanced, which means that, for each node, the left and right subtrees have roughly the same height.
o However, if we are not in either of the cases above, an iterative solution is safer (in terms of space complexity).
▪ Recursive and iterative traversals have the same runtime complexity, so this is not a concern when choosing between them (both run in linear time, O(n)).
oThe main approach for converting recursive implementations to iterative ones is to
‘simulate’ the stack memory with an actual stack where we push and pop the
nodes explicitly.

More terminology
- Balanced Binary Tree
oa binary tree where the left and right subtrees of every node differ in height by no
more than 1.
- Full Binary Tree
oA full binary tree is a type of binary tree in which every node has either two or no
children/subtrees.
- Complete Tree
oIs a binary tree where every level except the last one is completely filled, and
where the last level is filled from left to right.
Searching cont.
- These definitions, especially the balanced tree, are important when we are talking about
self-balancing of the binary search trees.

Binary Search Tree


- a binary tree with the additional property that every node in the left subtree has a
smaller value than the root, and every node in the right subtree has a larger value than
the root.
- This ordering property makes it possible to efficiently search, insert, and delete elements
in the tree.
- A BST supports operations like search, insert, and delete in O(h) time, where h is the height of the BST; in a balanced BST, h is O(log n). To keep a BST balanced, self-balancing BSTs like AVL trees are used in practice.

Week 6

AVL Tree

What is an AVL Tree


- Is a binary search tree that is balanced.
- This means that the heights of the two subtrees of any node differ by one at most.
- An AVL tree is a binary search tree in which the difference between the heights of the
left and right subtrees of any node is at most one.

Properties of AVL Trees


1. The height of the tree is at most O(log n), where n is the number of nodes in the tree.
2. For every node in the tree, the heights of the left and right subtrees differ by at most
one.
3. The left and right subtrees of every node are themselves AVL trees.

Balancing Factor
- In an AVL tree, the balancing factor can either be -1, 0, or 1.
oIf the balancing factor of a node is 1, it means the node is left-skewed (left heavy).
oIf the balancing factor is 0, it means that the node is balanced.
oIf the balancing factor is -1, it means that the node is right-skewed (right heavy).
- The AVL tree uses the balancing factor to maintain a balanced tree structure.
oWhenever a new node is added to the tree, or an existing node is removed, the AVL
tree preforms rotation operations to ensure that the balancing factor of every
node in the tree is either -1, 0, 1.
oThis ensures that the height of the tree remains logarithmic in proportion to the
number of nodes in the tree.
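
A common way to get the balancing factor cheaply is to cache each subtree's height in its node. The sketch below assumes that design; the struct layout and the names avl_height, avl_update_height and avl_balance_factor are hypothetical, and height is counted in levels (empty = 0, leaf = 1).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: the height of the subtree is cached in the node. */
struct avl_node {
    int key;
    int height;
    struct avl_node *left, *right;
};

/* Height of a possibly empty subtree. */
int avl_height(const struct avl_node *n)
{
    return n ? n->height : 0;
}

/* Recompute the cached height of n from its children (which must be up to date). */
void avl_update_height(struct avl_node *n)
{
    assert(n != NULL);
    int l = avl_height(n->left), r = avl_height(n->right);
    n->height = 1 + (l > r ? l : r);
}

/* Left height minus right height: positive = left-heavy, negative = right-heavy. */
int avl_balance_factor(const struct avl_node *n)
{
    return n ? avl_height(n->left) - avl_height(n->right) : 0;
}
```

A result of 2 or -2 from avl_balance_factor is the signal that a rotation is needed at that node.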

Rotations
- In an AVL tree, there are four types of rotation operations that can be performed to
balance the tree.
1. Left Rotation
oA left rotation is performed when a node becomes right-heavy.
oThe node's right child becomes the new root of the subtree, and the node moves down to
become the new root's left child. The new root's old left subtree becomes the node's
right subtree.
oThis operation makes the left side of the subtree taller and the right side shorter,
which restores the balance.
2. Right Rotation
oA right rotation is performed when a node becomes left-heavy.
oThe node's left child becomes the new root of the subtree, and the node moves down to
become the new root's right child. The new root's old right subtree becomes the node's
left subtree.
oThis operation makes the right side of the subtree taller and the left side shorter,
which restores the balance.
3. Left-Right Rotation
oThis is performed when a node becomes left-heavy, and its left child becomes right-
heavy. It involves first performing a left rotation on the left child, and then a right
rotation on the original node.
4. Right-Left Rotation
oThis is performed when a node becomes right-heavy, and its right child becomes
left-heavy. It involves first performing a right rotation on the right child, and then
a left rotation on the original node.
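
The four rotations can be sketched in C as pointer rewiring. This is an illustrative sketch under the assumption that each node caches its subtree height (empty = 0, leaf = 1); the struct and function names are hypothetical. Each function returns the new root of the subtree, which the caller must store back into the parent.

```c
#include <assert.h>
#include <stddef.h>

struct avl_node {
    int key, height;
    struct avl_node *left, *right;
};

static int height(const struct avl_node *n) { return n ? n->height : 0; }

static void update_height(struct avl_node *n)
{
    int l = height(n->left), r = height(n->right);
    n->height = 1 + (l > r ? l : r);
}

/* Left rotation: x's right child y becomes the subtree root, x becomes y's left child. */
struct avl_node *rotate_left(struct avl_node *x)
{
    assert(x != NULL && x->right != NULL);
    struct avl_node *y = x->right;
    x->right = y->left;     /* y's old left subtree moves under x */
    y->left = x;
    update_height(x);       /* x is now lower, so update it first */
    update_height(y);
    return y;
}

/* Right rotation: mirror image of rotate_left. */
struct avl_node *rotate_right(struct avl_node *y)
{
    assert(y != NULL && y->left != NULL);
    struct avl_node *x = y->left;
    y->left = x->right;     /* x's old right subtree moves under y */
    x->right = y;
    update_height(y);
    update_height(x);
    return x;
}

/* Left-Right: node is left-heavy and its left child is right-heavy. */
struct avl_node *rotate_left_right(struct avl_node *n)
{
    n->left = rotate_left(n->left);
    return rotate_right(n);
}

/* Right-Left: node is right-heavy and its right child is left-heavy. */
struct avl_node *rotate_right_left(struct avl_node *n)
{
    n->right = rotate_right(n->right);
    return rotate_left(n);
}
```

For example, applying rotate_left to the chain 1 -> 2 -> 3 (each node the right child of the previous one) yields a subtree rooted at 2 with children 1 and 3.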

Insertion and deletion in AVL Trees

Insertion
- This operation inserts a new node into the tree while maintaining the balance of the
tree:
1. Create a new node with the given value to be inserted.
2. If the tree is empty, make the new node the root of the tree.
3. If the tree is not empty, perform a binary search to find the appropriate position for the
new node in the tree.
4. Insert the new node at the appropriate position as in a regular binary search tree.
5. Starting from the newly inserted node, move up the tree towards the root, updating the
height of each node and checking the balancing factor of each node.
6. If the balancing factor of any node becomes greater than 1 or less than -1, perform the
appropriate rotation(s) to balance the tree.
7. Continue moving up the tree until the root node is reached, and the tree is balanced.
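
The insertion steps above can be sketched recursively: the recursion finds the position (steps 3 and 4), and the rebalancing happens as the calls return, which is the "move up towards the root" part (steps 5 to 7). All names here are hypothetical, duplicates are ignored, and heights are cached in the nodes (empty = 0, leaf = 1).

```c
#include <assert.h>
#include <stdlib.h>

struct avl_node {
    int key, height;
    struct avl_node *left, *right;
};

static int height(const struct avl_node *n) { return n ? n->height : 0; }
static int balance(const struct avl_node *n) { return height(n->left) - height(n->right); }

static void update_height(struct avl_node *n)
{
    int l = height(n->left), r = height(n->right);
    n->height = 1 + (l > r ? l : r);
}

static struct avl_node *rotate_left(struct avl_node *x)
{
    assert(x != NULL && x->right != NULL);
    struct avl_node *y = x->right;
    x->right = y->left;
    y->left = x;
    update_height(x);
    update_height(y);
    return y;
}

static struct avl_node *rotate_right(struct avl_node *y)
{
    assert(y != NULL && y->left != NULL);
    struct avl_node *x = y->left;
    y->left = x->right;
    x->right = y;
    update_height(y);
    update_height(x);
    return x;
}

/* Steps 5 and 6: refresh the height, then rotate if the factor left [-1, 1]. */
static struct avl_node *rebalance(struct avl_node *n)
{
    update_height(n);
    if (balance(n) > 1) {                       /* left-heavy */
        if (balance(n->left) < 0)               /* Left-Right case */
            n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (balance(n) < -1) {                      /* right-heavy */
        if (balance(n->right) > 0)              /* Right-Left case */
            n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}

/* Insert key and return the new root of the subtree. */
struct avl_node *avl_insert(struct avl_node *root, int key)
{
    if (root == NULL) {                         /* steps 1, 2 and 4 */
        struct avl_node *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;                        /* allocation failed */
        n->key = key;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = avl_insert(root->left, key);
    else if (key > root->key)
        root->right = avl_insert(root->right, key);
    else
        return root;                            /* duplicate: nothing to do */
    return rebalance(root);                     /* runs on the way back up */
}
```

Inserting the keys 1 to 7 in sorted order, which would produce a chain in a plain BST, gives a tree rooted at 4 with children 2 and 6 and a height of 3.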

Deletion
- This operation removes a node from the tree while maintaining the balance of the tree:
1. Perform a binary search to find the node to be deleted.
2. If the node is a leaf node or has only one child, delete the node and replace it with its
child (if any).
3. If the node has two children, find its in-order successor (the smallest node in its right
subtree) or in-order predecessor (the largest node in its left subtree).
4. Replace the value of the node to be deleted with the value of its in-order successor or
predecessor.
5. Recursively delete the in-order successor or predecessor from its original location.
6. Starting from the parent of the node that was actually removed, move up the tree towards
the root, updating the height of each node and checking the balancing factor of each node.
7. If the balancing factor of any node becomes greater than 1 or less than -1, perform the
appropriate rotation(s) to balance the tree.
8. Continue moving up the tree until the root node is reached, and the tree is balanced.
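
The deletion steps can be sketched the same way as insertion: recurse down to the node, remove it, and rebalance while the recursion unwinds. This is a sketch under assumptions, not a definitive implementation: all names are hypothetical, nodes are assumed to have been allocated with malloc, heights are cached (empty = 0, leaf = 1), and the in-order successor is used in the two-children case.

```c
#include <assert.h>
#include <stdlib.h>

struct avl_node {
    int key, height;
    struct avl_node *left, *right;
};

static int height(const struct avl_node *n) { return n ? n->height : 0; }
static int balance(const struct avl_node *n) { return height(n->left) - height(n->right); }

static void update_height(struct avl_node *n)
{
    int l = height(n->left), r = height(n->right);
    n->height = 1 + (l > r ? l : r);
}

static struct avl_node *rotate_left(struct avl_node *x)
{
    assert(x != NULL && x->right != NULL);
    struct avl_node *y = x->right;
    x->right = y->left;
    y->left = x;
    update_height(x);
    update_height(y);
    return y;
}

static struct avl_node *rotate_right(struct avl_node *y)
{
    assert(y != NULL && y->left != NULL);
    struct avl_node *x = y->left;
    y->left = x->right;
    x->right = y;
    update_height(y);
    update_height(x);
    return x;
}

/* Steps 6 and 7: refresh the height and rotate if the node is out of balance. */
static struct avl_node *rebalance(struct avl_node *n)
{
    update_height(n);
    if (balance(n) > 1) {
        if (balance(n->left) < 0)
            n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (balance(n) < -1) {
        if (balance(n->right) > 0)
            n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}

/* Delete key (if present) and return the new root of the subtree. */
struct avl_node *avl_delete(struct avl_node *root, int key)
{
    if (root == NULL)
        return NULL;                                /* key not in the tree */
    if (key < root->key) {
        root->left = avl_delete(root->left, key);
    } else if (key > root->key) {
        root->right = avl_delete(root->right, key);
    } else if (root->left == NULL || root->right == NULL) {
        /* Step 2: leaf or one child, so splice the node out. */
        struct avl_node *child = root->left ? root->left : root->right;
        free(root);
        return child;
    } else {
        /* Steps 3 to 5: copy the successor's key, then delete the successor. */
        struct avl_node *succ = root->right;
        while (succ->left != NULL)
            succ = succ->left;
        root->key = succ->key;
        root->right = avl_delete(root->right, succ->key);
    }
    return rebalance(root);                         /* runs on the way back up */
}
```

Unlike insertion, a single deletion may trigger rotations at more than one node on the path back to the root, which is why the rebalancing check is applied at every level as the recursion returns.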

Time Complexity
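- Search, insert, and delete:
oO(log n), where n is the number of nodes in the tree.
oThe bound is the same even if rebalancing is needed, because rotations only happen along
the path from the changed node to the root, and each rotation takes constant time.
- A full tree traversal visits every node, so it takes O(n).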
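
To see why the height stays logarithmic, it helps to count the fewest nodes an AVL tree of a given height can have. The sparsest tree of height h has a root, one subtree of height h-1 and one of height h-2, which gives the recurrence N(h) = 1 + N(h-1) + N(h-2). The sketch below evaluates it; the function names are hypothetical and height is counted in levels (empty = 0, single node = 1).

```c
/* Fewest nodes an AVL tree of height h can contain: N(h) = 1 + N(h-1) + N(h-2). */
unsigned long avl_min_nodes(int h)
{
    unsigned long prev = 0, cur = 1;        /* N(0) = 0, N(1) = 1 */
    if (h <= 0)
        return 0;
    for (int i = 2; i <= h; i++) {
        unsigned long next = 1 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}

/* Tallest an AVL tree with n nodes can be: the largest h with N(h) <= n. */
int avl_max_height(unsigned long n)
{
    int h = 0;
    while (avl_min_nodes(h + 1) <= n)
        h++;
    return h;
}
```

N(h) grows like the Fibonacci numbers, so it grows exponentially with h, and the height therefore grows only logarithmically with n. For example, avl_max_height(1000000) is 28, compared with 20 levels for a perfectly filled tree of that size and 1,000,000 levels for a degenerate BST.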

Space Complexity
- Each node in an AVL tree contains data, pointers to its left and right children, and a
balance factor.
- The balance factor only takes one of three values (-1, 0, or 1), so it needs just two bits
of information per node. Many implementations store the height of the node instead.
- AVL trees also require space for a pointer to the root node and any temporary values used
during operations like insertion or deletion.
oThe space required for these is negligible compared to the space required for the nodes
themselves.
- Space complexity is O(n), making them a space-efficient data structure.

Advantages
1. Efficient operations
oAVL trees have a guaranteed logarithmic time complexity for operations such as
insertion, deletion, and search, making them efficient for large datasets.
2. Balanced structure
oAVL trees maintain a balanced structure, ensuring that the height stays logarithmic and
the operations stay efficient.
3. Self-balancing
oAVL trees are self-balancing, which means that after an operation is performed, the
tree is automatically rebalanced, eliminating the need for manual rebalancing.

Disadvantages
1. Space overhead
oAVL trees store a balance factor (or height) in every node on top of the data and child
pointers, which can add noticeable overhead for large datasets.
2. Complex implementation
oImplementing AVL trees can be complex, requiring careful handling of balance
factors and rotation operations.

When to use
- AVL trees are a good choice when the dataset is large and efficient search, delete, and
insert operations are needed, or when the application requires real-time or frequent
updates, like a real-time stock market application.
- AVL trees may not be the best choice for small datasets, static datasets, limited memory,
or when a simple implementation is required.
