What Is Dynamic Memory Allocation & Garbage Collection
Whenever a new node is created, memory is allocated by the system. This memory is taken from a
list of memory locations that are free, i.e. not allocated. This list is called the AVAIL list.
Similarly, whenever a node is deleted, the deleted space becomes reusable and is added to the
AVAIL list, so that it can be used for future allocations.
When memory is allocated during compilation time, it is called ‘Static Memory Allocation’. This
memory is fixed and cannot be increased or decreased after allocation. If more memory is allocated
than required, memory is wasted; if less memory is allocated than required, the program will not
run successfully. So the exact memory requirements must be known in advance.
When memory is allocated during run/execution time, it is called ‘Dynamic Memory Allocation’.
This memory is not fixed and is allocated according to our requirements, so there is no wastage of
memory and no need to know the exact memory requirements in advance.
Whenever a node is deleted, some memory space becomes reusable, and this space should be made
available for future use. One way to do this is to insert the freed space into the availability list
immediately. But this method may be time consuming for the operating system, so another method,
called ‘Garbage Collection’, is used instead. In this method the OS collects the deleted space onto
the availability list from time to time. The process happens in two steps. In the first step, the OS
goes through all the lists and tags all those cells which are currently in use. In the second step, the
OS goes through all the lists again, collects untagged space, and adds this collected space to the
availability list. Garbage collection may occur when only a small amount of free space is left in the
system, when no free space is left at all, or when the CPU is idle and has time to do the collection.
(c) Overflow & Underflow
Overflow happens at the time of insertion. If we have to insert new data into the data structure, but
there is no free space, i.e. the availability list is empty, this situation is called ‘Overflow’. The
programmer can handle this situation by printing an OVERFLOW message.
Underflow happens at the time of deletion. If we have to delete data from the data structure, but
there is no data in it, i.e. the data structure is empty, this situation is called ‘Underflow’. The
programmer can handle this situation by printing an UNDERFLOW message.
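Both checks are easy to demonstrate with a fixed-size array stack; the sketch below is illustrative
(the array size, messages, and names are not from the original text):

#include <stdio.h>

#define MAX 100
int stack[MAX];
int top = -1;

void push(int x) {
    if (top == MAX - 1) {           /* no free space left: overflow */
        printf("OVERFLOW\n");
        return;
    }
    stack[++top] = x;
}

int pop(void) {
    if (top == -1) {                /* structure is empty: underflow */
        printf("UNDERFLOW\n");
        return -1;
    }
    return stack[top--];
}

int main(void) {
    pop();                          /* prints UNDERFLOW */
    push(7);
    printf("%d\n", pop());          /* prints 7 */
    return 0;
}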
Resources are always at a premium. We have always strived for better utilization of resources;
that is the premise of our progress, and the concept of memory allocation is part of this pursuit.
Memory has to be allocated to the variables that we create, so that those variables actually come
into existence. There is a gap between how we think this happens and how it actually happens.
When we think of creating something, we imagine creating it from scratch, but this isn't what
happens when a computer creates a variable ‘X’; to the computer it is more like an allocation:
the computer simply assigns one of many pre-existing memory cells to X. It's like someone named
‘RAJESH’ being allocated a hotel room out of a lot of free, pre-existing rooms. This is essentially
how the computer allocates memory.
Now, what is Static Memory Allocation? When we declare variables, we are preparing all the
variables that will be used, so that the compiler knows that each variable is an intended part of the
program and not just a rogue symbol floating around. So, when we declare variables, the compiler
allocates them to their rooms (recall the hotel analogy). Notice that this is done before the program
executes; you can't allocate variables by this method while the program is running.
void fun()
{
    int a;      // allocated statically
}

int main()
{
    int b;      // allocated statically
    int c[10];  // a statically allocated array
    return 0;
}
Why do we need to introduce another allocation method if this one gets the job done? Why would
we need to allocate memory while the program is executing? Because, even though it isn't blatantly
visible, not being able to allocate memory at run time precludes flexibility and compromises space
efficiency. Especially in cases where the input isn't known beforehand, we suffer from inefficient
storage use and from a lack or excess of slots for data (given an array or similar data structure to
store entries). So here we define Dynamic Memory Allocation: the mechanism by which
storage/memory/cells can be allocated to variables during run time is called Dynamic Memory
Allocation (not to be confused with DMA, direct memory access). As we have seen, it allocates
memory during run time, which enables us to use as much storage as we want, without worrying
about any wastage.
Dynamic memory allocation is the process of assigning memory space during execution time, i.e.
at run time.
Dynamic memory allocation is the process of assigning the memory space during the execution
time or the run time.
Reasons and Advantage of allocating memory dynamically:
1. When we do not know how much amount of memory would be needed for the program
beforehand.
2. When we want data structures without any upper limit of memory space.
3. When you want to use your memory space more efficiently.Example: If you have allocated
memory space for a 1D array as array[20] and you end up using only 10 memory spaces then
the remaining 10 memory spaces would be wasted and this wasted memory cannot even be
utilized by other program variables.
4. Dynamically created lists insertions and deletions can be done very easily just by the
manipulation of addresses whereas in case of statically allocated memory insertions and
deletions lead to more movements and wastage of memory.
5. When you want you to use the concept of structures and linked list in programming, dynamic
memory allocation is a must.
For example (C++):

int main()
{
    // Below variables are allocated memory
    // dynamically.
    int *ptr1 = new int;
    int *ptr2 = new int[10];

    delete ptr1;     // release the single int
    delete[] ptr2;   // release the array
    return 0;
}
There are two types of available memory: the stack and the heap. Static memory allocation can
only be done on the stack, whereas dynamic memory allocation can be done on both the stack and
the heap. An example of dynamic allocation done on the stack is recursion, where the function
activations are pushed onto the call stack in order of their occurrence and popped off one by one
on reaching the base case. An example of dynamic memory allocation on the heap is:
int main()
{
    // Below variables are allocated memory
    // dynamically on heap.
    int *ptr1 = new int;
    int *ptr2 = new int[10];

    // Dynamically allocated memory must be released explicitly.
    delete ptr1;
    delete[] ptr2;
    return 0;
}
2. A static array has fixed size. We cannot increase its size to handle situations requiring more
elements. As a result, we will tend to declare larger arrays than required, leading to wastage of
memory. Also, when fewer array elements are required, we cannot reduce array size to save
memory.
3. It is not possible (or efficient) to create advanced data structures such as linked lists, trees and
graphs, which are essential in most real-life programming situations.
The C language provides a very simple solution to overcome these limitations: dynamic memory
allocation in which the memory is allocated at run-time, i. e., during the execution of a program.
Dynamic memory management involves the use of pointers and four standard library functions,
namely, malloc, calloc, realloc and free. The first three functions are used to allocate memory,
whereas the last function is used to return memory to the system (also called freeing/deallocating
memory). The pointers are used to point to the blocks of memory allocated dynamically.
When called, a memory allocation function allocates a contiguous block of memory of specified
size from the heap (the main memory of a computer) and returns its address. This address is
stored in a pointer variable so as to access that memory block.
The memory allocated dynamically remains with the program as long as we do not explicitly
return it to the system using the free function. Thus, when such memory is no longer required by
our program, we should return it to the system. This is an important responsibility of the
programmer; failure to do so means that this memory will not be available to any program
running on that machine, including subsequent executions of our program. Thus, ill-designed
programs (that do not free all allocated memory) will continue to eat up heap space, progressively
reducing the computer's data processing ability.
This is not desirable on any computer/device. Continued execution of such programs will
eventually lead to system hangs due to unavailability of memory, and the system must be rebooted.
Such a reboot may not be feasible on computers used in critical applications, e. g., servers run by
Google, Yahoo, MSN, etc. Hence, a programmer should carefully free all allocated memory
blocks.
The heap is managed by the operating system and is allocated (on demand) to running programs.
The heap is much larger than the program’s local memory used to hold program data. Thus, we
can create much larger arrays and other data structures in the heap. Note that allocation on the
heap is not sequential, i. e., memory blocks can be allocated anywhere. As a result, heap memory
is usually fragmented. The memory manager in the operating system decides the optimal location
for the allocation of a particular memory block.
Standard library functions for dynamic memory management
Recall that the C language provides four functions for dynamic memory management, namely,
malloc, calloc, realloc and free. These functions are declared in the stdlib.h header file. They are
summarized in the table below and are described afterwards.

Function   Usage              Description
malloc     malloc(sz)         Allocate a block of size sz bytes from the memory heap and return a
                              pointer to the allocated block
calloc     calloc(n, sz)      Allocate a block of size n x sz bytes from the memory heap, initialize
                              it to zero and return a pointer to the allocated block
realloc    realloc(blk, sz)   Adjust the size of the memory block blk allocated on the heap to sz,
                              copy the contents to a new location if necessary, and return a pointer
                              to the allocated block
free       free(blk)          Free the block of memory blk allocated from the memory heap
The malloc, calloc and realloc functions allocate a contiguous block of memory from the heap. If
memory allocation is successful, they return a pointer to the allocated block (i. e., the starting
address of the block) as a void (i. e., typeless) pointer; otherwise, they return a NULL pointer.
The malloc function
The malloc (memory allocate) function is used to allocate a contiguous block of memory from
heap. If the memory allocation is successful, it returns a pointer to the allocated block as a void
pointer; otherwise, it returns a NULL pointer. The prototype of this function is given below.
void *malloc(size_t size);
The malloc function has only one argument, the size (in bytes) of the memory block to be
allocated. Note that size_t is a type, also defined in stdlib.h, which is used to declare the sizes of
memory objects and repeat counts. We should be careful while using the malloc function as the
allocated memory block is uninitialized, i. e., it contains garbage values.
As different C implementations may have differences in data type sizes, it is a good idea to use the
sizeof operator to determine the size of the desired data type and hence the size of the memory
block to be allocated. Further, we should typecast the void pointer returned by these functions to a
pointer of appropriate type. Thus, we can dynamically allocate a memory block to store 50
integers as shown below.
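For example (a minimal fragment, to be placed inside a function; the pointer name pa is reused in
the fuller listing that follows):

int *pa;                                   /* pointer to the block */
pa = (int*) malloc(50 * sizeof(int));      /* space for 50 ints */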
This is often sufficient to allocate a memory block, particularly in small toy-like programs.
However, in real applications it is a good idea to ensure, before continuing with program
execution, that the memory allocation was successful, i. e., that the pointer value returned is not
NULL, as shown below.
pa = (int*) malloc(50 * sizeof(int)); /* allocate memory */
if (pa == NULL) {
    printf("Error: Out of memory ...\n");
    exit(1); /* error code 1 is used here to indicate
                an out-of-memory situation */
}
/* continue to use pa as an array from here onwards */
Note that once memory is allocated, this dynamically allocated array can be used as a regular
array. Thus, the ith element of array pa in the above example can be accessed as pa[i].
Note that we can combine the memory allocation statement with the NULL test to make the code
concise (sacrificing readability) as shown below.
if ((pa = (int*) malloc(50 * sizeof(int))) == NULL) {
    printf("Error: Out of memory ...\n");
    exit(1);
}
If we use this approach to write concise programs, we are likely to make mistakes, at least in the
beginning. Hence, observe carefully the use of parentheses in the if statement. You will soon
realize that it is not as difficult as it appears to be. Since the assignment operator has lower
precedence than the equality operator, the memory allocation statement is first enclosed in its own
pair of parentheses and only then compared with the NULL value within the parentheses of the if
statement, as illustrated above.
The calloc function
The calloc function is similar to malloc. However, it has two parameters that specify the number
of items to be allocated and the size of each item, as shown below.
void *calloc(size_t n_items, size_t size);
Another difference is that calloc initializes the allocated memory block to zero. This is useful in
several situations where we require arrays initialized to zero. It is also useful while allocating a
pointer array, to ensure that the pointers will not contain garbage values.
The realloc function
The realloc (reallocate) function is used to adjust the size of a dynamically allocated memory
block. If required, the block is reallocated to a new location and the existing contents are copied to
it. Its prototype is given below.
void *realloc(void *block, size_t size);
Here, block points to a memory block already allocated using one of the memory allocation
functions (malloc, calloc or realloc). If successful, this function returns the address of the
reallocated block; or NULL otherwise.
The free function
The free function is used to deallocate a memory block allocated using one of the memory
allocation functions (malloc, calloc or realloc). The deallocation actually returns that memory
block to the system heap. The prototype of free function is given below.
void free (void *block);
When a dynamically allocated block is no longer required in the program, we must return its
memory to the system.
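As a recap, here is a short, self-contained sketch that exercises all four functions (the sizes and
values are arbitrary, chosen only for illustration):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *a = (int *) calloc(10, sizeof(int));        /* 10 ints, zero-initialized */
    if (a == NULL) {
        printf("Error: Out of memory ...\n");
        exit(1);
    }

    int *tmp = (int *) realloc(a, 20 * sizeof(int)); /* grow the block to 20 ints */
    if (tmp == NULL) {
        free(a);                                     /* still our responsibility */
        exit(1);
    }
    a = tmp;

    a[19] = 42;                                      /* use like a regular array */
    free(a);                                         /* return the block to the heap */
    return 0;
}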
Heap allocation is a necessity for modern programming tasks, and so is automatic reclamation
of heap- allocated memory. However, memory management comes with significant costs and has
implications for how we generate code.
One way to avoid the cost of heap allocation is to stack-allocate objects. If an object allocated
during execution of some function always becomes unreachable when the function returns, the
object can be allocated on the function’s stack frame. This requires an escape analysis to ensure
it is safe. In fact, we can view the escape analysis used to stack-allocate activation records in
languages with higher-order functions as an important special case of this optimization.
If an object will never be used again, it is safe to reclaim it. However, whether an object will
be used again is an undecidable problem, so we settle for a conservative approximation. An
object is garbage if it cannot be reached by traversing the object graph, starting from the
immediate references available to the current computation(s): registers, stack, and global
variables.
A garbage collector automatically identifies and reclaims the memory of these garbage objects.
1.1 Linear allocation
To collect garbage, we first have to create it. One strategy for heap management is linear
allocation, in which the heap is organized so that all used memory is to one side of a next pointer.
This is depicted in Figure 1. An allocation request of n bytes is satisfied by bumping up the next
pointer by n and returning its previous value. If the next pointer exceeds the limit pointer, the
non-garbage objects are compacted into the first part of the memory arena so allocation can
resume.
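A minimal sketch of this bump-pointer scheme, assuming next and limit delimit the arena and
leaving compaction abstract:

#include <stddef.h>

static char *next, *limit;        /* bounds of the arena; set at startup */

static void collect_and_compact(void) {
    /* hypothetical: compact live objects to the front and reset next */
}

void *allocate(size_t n) {
    if (next + n > limit)
        collect_and_compact();    /* reclaim space before retrying */
    void *p = next;               /* previous value of next is the result */
    next += n;                    /* bump the pointer by n bytes */
    return p;
}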
Linear allocation makes allocation very cheap, and it also tends to place related objects near each
other in memory, yielding good locality. For functional languages, which do a lot of allocation, it
is a good choice. It is commonly used with copying collectors, which perform compaction.
1.2 Freelist allocation
Another standard organization of memory is to manage free blocks of memory explicitly. The
free memory blocks form a linked list in which the first word of each block points to the next
free block (it is therefore an endogenous linked list). An allocation request is satisfied by
following the freelist pointer and finding a block big enough to hold the requested object.
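A first-fit sketch of such an allocator (the field order and names are illustrative, not prescribed by
the text):

#include <stddef.h>

/* A free area; its first words hold the bookkeeping. */
struct free_block {
    size_t size;                /* size of this free area, in bytes */
    struct free_block *next;    /* next free area (the list is endogenous) */
};

static struct free_block *freelist;   /* head of the freelist */

void *freelist_alloc(size_t n) {
    struct free_block **prev = &freelist;
    for (struct free_block *b = freelist; b != NULL; prev = &b->next, b = b->next) {
        if (b->size >= n) {     /* first fit: take the first block big enough */
            *prev = b->next;    /* unlink it from the freelist */
            return b;
        }
    }
    return NULL;                /* no block large enough: external fragmentation */
}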
The basic freelist approach wastes space because of two problems. It has internal fragmentation,
in which objects may be allocated a larger block of memory than they requested, and external
fragmentation, in which the available memory is split among free memory blocks that are too
small to accommodate a request.
We can speed up allocations and reduce internal fragmentation by having multiple freelists
indexed by the size of the request. Typically small sizes are covered densely, and sizes grow
geometrically after some point. Note, however, that when sizes grow as powers of two, on
average about 30% of the space is wasted.
Figure 3: Mutator and GC. Garbage objects shown in gray.
External fragmentation can be reduced by merging adjacent free blocks. This requires
additional space for each memory block to keep track of its neighbors. The space may be in the
memory block itself, or in an external data structure that keeps track of whether neighbors are in
use (e.g., as a bitmap). The Berkeley allocator’s “buddy system” is a neat example of the latter,
with power-of-two sized memory blocks that can be merged to form a block of the next size up.
The standard Linux malloc does not use a buddy system, however.
These days we expect the language to collect garbage for us. This is more complex than we
might expect, since modern language implementations contain sophisticated garbage collectors.
Garbage collectors can do a better job if they can take advantage of help from the compiler, so it
is worth understanding how GC and compilers interact.
An ideal garbage collector would have the following attributes:
• Fully automatic.
• Low overhead in time and space.
• Pause times in which the program waits for the collector should be short.
• Safe: only garbage is reclaimed.
• Precise: almost all garbage is collected, and reasonably soon.
• Parallel: able to take advantage of additional cores.
• Simple.
Abstractly, we can think of a garbage collector as a thread or threads running perhaps
concurrently with the actual compute thread(s). This is shown in Figure 3. We refer to the threads
doing real computations as the mutator, since they are changing the object graph that the garbage
collector is trying to manage.
3 Reference counting
Reference counting generates a lot of write traffic to memory and hurts performance. This
traffic is particularly a problem in the case of objects that are shared between multiple cores. It is
critical that reference counts be updated atomically and that all cores see increments to reference
counts.
The overhead of writes to update reference counts can be reduced by avoiding updating
reference counts. The key to this optimization is to notice that it is not necessary that the
reference count exactly match the number of incoming references at all times. It is enough to
enforce a weaker invariant:
• For safety, the reference count should only be decremented to zero if there are no incoming
references.
• For precision, an object with no incoming references should eventually have a reference
count of zero.
This weaker invariant implies that it is not necessary to update the reference count to an
object over some scope during which an additional reference is known to exist, if it is known that
another reference to the object exists and will prevent its count from going to zero. Also, it is
possible to defer decrementing the reference count of an object, perhaps until the time of a later
increment, in which case both updates can be eliminated. And multiple increments and
decrements to the same object can be consolidated. These are all helpful optimizations when
generating code that uses reference counting. Researchers have reported reducing reference
counting overhead by roughly 80% through such optimizations.
Note that doing the decrement first could be a disaster if x and y happen to be aliases!
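For instance, a hedged sketch of a reference-counted assignment *x = y (Obj, rc, and reclaim are
illustrative names, not from the text) shows why the increment must come first:

#include <stdlib.h>

typedef struct Obj { int rc; /* ...payload... */ } Obj;

static void reclaim(Obj *o) { free(o); }   /* stand-in for real reclamation */

static void release(Obj *o) {
    if (--o->rc == 0)
        reclaim(o);
}

/* Performs *x = y under reference counting. Incrementing first is essential:
   if *x and y are aliases with rc == 1, decrementing first would free the
   object before the increment could save it. */
void rc_assign(Obj **x, Obj *y) {
    y->rc++;          /* increment the new referent first */
    release(*x);      /* then decrement the old one */
    *x = y;
}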
Another interesting optimization based on reference counting is to reuse existing allocated
space for new allocations. If it can be determined statically that the number of references to a
particular object is zero at a certain point in the code, that allocated object can be reused for other
allocations that happen in the same scope.
To identify cycles, we need an algorithm that traverses the object graph. Such algorithms, of
which there are many, can be viewed as instances of Dijkstra's tricolor graph traversal algorithm.
In this algorithm, there are three kinds (or colors) of objects. White objects are the unvisited
objects that have not yet been reached in the traversal. Gray objects have been reached, but may
have successor objects that have not been reached. Black objects are completed objects. The key
invariant of the algorithm is that black objects can never point to white objects. The algorithm
starts with all nodes colored white. It then proceeds as follows:
1. The roots of garbage collection are colored gray.
2. While any gray object remains, pick a gray object, color its white successors gray, and then
color the object black.
3. All objects are now black or white, and by the invariant, white objects are not reachable
from black objects, and are therefore garbage. Reclaim them.
4. Optionally compact black objects.
5 Finding pointers
To be able to traverse objects in the heap, the collector needs to be able to find pointers within
objects, on the stack, and in registers, and to be able to follow them to the corresponding heap
objects. Several techniques are used.
In this approach, some number of bits in each word are reserved to indicate what the type of the
information is. At a minimum, one bit indicates whether the word is a pointer or not. It's
convenient to use the low bit, setting it to one for pointers and to zero for other information such
as integers. With this representation, a number n is represented as 2n, allowing addition and
subtraction to be performed as before. Accesses via pointers must have their offsets adjusted. For
example, the address 0x34 (hexadecimal) would be represented as 0x35. A memory operand
4(%rsi) would become 3(%rsi) to compensate for the one’s bit being set in %rsi.
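A sketch of such low-bit tagging (the tag convention follows the text; all names are assumptions):

#include <stdint.h>

typedef uintptr_t word;

/* Low bit 1 = pointer, 0 = integer, so an integer n is stored as 2n. */
static inline word     tag_int(intptr_t n) { return (word)n << 1; }
static inline intptr_t untag_int(word w)   { return (intptr_t)w >> 1; }  /* arithmetic shift assumed */
static inline word     tag_ptr(void *p)    { return (word)(uintptr_t)p | 1; }
static inline void    *untag_ptr(word w)   { return (void *)(w & ~(word)1); }
static inline int      is_ptr(word w)      { return (int)(w & 1); }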
To avoid using space in each word, an alternative is to store elsewhere the information about
which words are pointers. In an OO language, this information can be included in the dispatch
table and shared among all objects of the same type. The compiler also needs to compute which
words on the stack are pointers and record it for use by the garbage collector. The program
counter for each stack frame is saved in the next stack frame down, so it is possible to use the
program counter to look up the appropriate stack frame description for that program point. Note
that in general the stack frame description may change during a procedure, so there may be
multiple such table entries per procedure.
6 Mark and sweep
Mark and sweep is a classic GC technique. It uses a depth-first traversal of the object graph (the
mark phase), followed by a linear scan of the heap to reclaim unmarked objects (the sweep). The
mark phase is an instance of the tricolor algorithm, implemented recursively:
visit(o) {
    if (o is not white) return
    color o gray
    foreach (o' such that o -> o') visit(o')
    color o black
}
The marked objects are the non-white objects; gray objects are those whose traversal subtrees are
not yet complete. Once the mark phase is done, the sweep phase scans through the heap to find
unmarked (white) blocks. These blocks are added to the freelist.
One problem with mark-and-sweep is that when the recursive traversal is implemented
naively, the recursion depth and hence the stack space is unbounded. The solution is to convert
the algorithm into an iterative one. On returning from a call to visit, the state to be restored
consists of the object being scanned in the foreach loop. The pointer to this predecessor object
can be stored into the current object, which is called pred in the following iterative algorithm:
while (o.pred != null) {
    color o gray
    foreach (o' such that o -> o') {
        if (o' is white) {
            o'.pred = o
            o = o'
            continue
        }
    }
    color o black
    o = o.pred
}
To avoid the overhead of storing this extra pointer, the Deutsch–Schorr–Waite pointer reversal
technique overwrites the pointer to the next object o' with this predecessor pointer, and restores it
when the traversal of o' completes. The only extra storage needed per object is enough bits to
indicate which pointer within the object is being traversed (and hence has been overwritten). The
visit() procedure is modified to return the current object, so that the overwritten pointer to it can
be restored in the caller. However, pointer reversal makes the heap unusable for computation, so
the mutator must be paused during the mark phase.
Mark-and-sweep is famous for introducing long garbage collection pauses, though
incremental versions exist.
Compacting the heap improves locality and eliminates external fragmentation. During the sweep
phase, it is possible to compute and record new compacted locations for each of the live objects.
A second pass over the heap can then adjust all pointers stored in objects to point to the new
locations of their referents. The objects can then be copied down to their new locations in a third
pass. Of course, three passes over the heap is expensive and poor cache performance can be
expected. Copying can be done at the same time as updating all the pointers if the new object
locations are stored elsewhere than in the objects themselves, eliminating one pass.
8 Copying collection
Copying collection can be a better way to achieve efficient compacting collection. The idea is to
split memory into two equal-sized portions; the mutator uses only half the memory at any given
time. Linear allocation is used in the half of memory that is active. When the next pointer
reaches the limit pointer, the half that is in use becomes the from-space and the other half
becomes the to-space. The from-space is traversed, copying reached objects into to-space.
Objects that are copied to to-space must have each pointer rewritten to point to the
corresponding to-space object; this is achieved by storing a forwarding pointer in the object,
pointing from the from-space object to the corresponding to-space object.
Depth-first traversal can be used to perform copying, but has the problem of unbounded stack
usage. An alternative that does not require stack space is Cheney’s algorithm, which uses
breadth-first traversal. As shown in Figure 5, two pointers are maintained into the to-space. The
first is the scan pointer, which separates objects that have been copied and completely fixed to
point to to-space (black objects) from objects that have been copied but not yet fixed (gray
objects). The second pointer is the next pointer, where new objects are copied. The collector
starts by copying root objects to to-space. It then repeatedly takes the first object after the scan
pointer and fixes all of its pointers to point to to-space. If its pointers point to an object that has
already been copied, the forwarding pointer is used. Otherwise, the object is copied to to-space at
next. Once all the pointers are fixed, the object is black, and the scan pointer is adjusted to point
to the next object.

[Figure 5: Cheney's algorithm. The scan and next pointers divide to-space into black (copied and
fixed) and gray (copied but not yet fixed) objects; forwarding pointers lead from from-space
objects to their to-space copies.]

When the scan and next pointers meet, any uncopied objects left in from-space are
garbage. On the next garbage collection, the roles of from-space and to-space swap. One nice
property of this algorithm is that the work done is proportional to the number of live objects.
Breadth-first traversal is simple but has the unfortunate property of destroying locality in the
to-space. A hybrid approach augments breadth-first traversal with limited depth-first traversal: as
each object is copied to to-space, a depth-first traversal to limited depth is performed from that
object, copying all reached objects. This helps keep objects in memory near the objects they point
to.
To avoid having long pause times, it is helpful to have the collector doing work at the same time
as the mutator. A concurrent collector runs concurrently with the mutator and therefore needs to
synchronize with the mutator to avoid either breaking the other. An incremental collector is one
that can run periodically when permitted by the mutator, making some progress toward garbage
collection without completing a GC. Typically an incremental collector runs on each allocation
request.
The danger of both concurrent and incremental collection is that invariants of the collector
may be broken by the mutator. In particular, the key invariant of the tricolor algorithm—that
black objects cannot point to white objects—might be broken if the mutator stores a white object
the collector hasn’t seen into a black object that the collector thinks it is done with.
There are two standard ways to prevent this invariant from being broken by the mutator: read
barriers and write barriers, which are generated by the compiler as part of code generation.
• Read barriers. The mutator checks for white objects as it reads objects. If a white object is
encountered, it is clearly reachable, and the mutator colors the object gray so that it is safe
to store it anywhere. Since the collector must know about all gray objects, this means in
practice that the reached object is put onto a queue that the collector is reading from. The
objects considered gray include this queue.
An example of this approach is Baker’s algorithm for concurrent copying collection. Rather
than pausing the mutator during all of collection, the mutator is paused only to copy the
root objects to to-space. The mutator then continues executing during collection, using
only to-space objects. However, the mutator may follow a pointer from a gray object back
to from-space. The read barrier then checks for from-space pointers. A from-space pointer
to an already-copied object is corrected to a to-space pointer using the forwarding pointer; a
from-space pointer to an object not copied yet triggers copying.
• Write barriers.
With write barriers, the mutator checks directly whether it is storing a pointer into a black
object; if it is, the object is colored gray, causing the collector to rescan it. This approach
requires less tight coordination between the mutator and the collector, and the barrier is less
expensive because writes are less frequent than reads, and because the write barrier only
affects pointer updates, not stores of other types such as integers.
This approach was pioneered by the “replication-based copying” garbage collection
technique of Nettles and O’Toole. The mutator runs in from-space, unlike in Baker’s
algorithm, and allocates new objects there. Updated objects in from-space might contain
pointers to objects that need to be saved, so the updates need to be propagated to to-space
and those pointers need to be followed before swapping the spaces. Therefore, these objects
are added to a queue that the garbage collector processes before declaring GC complete.
Once GC completes, the mutator and collector synchronize so the roots can be switched to
point to to-space. Since the mutator runs on from-space objects, an extra header word is
needed per object to store the forwarding pointer to to-space.
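As a concrete illustration, here is a hedged sketch of such a write barrier (the object layout and the
worklist_push hook are assumptions, not from the text):

typedef enum { WHITE, GRAY, BLACK } Color;

typedef struct Obj {
    Color color;
    struct Obj *fields[4];   /* illustrative fixed object layout */
} Obj;

void worklist_push(Obj *o);  /* assumed collector hook: queue object for (re)scanning */

/* The compiler routes every pointer store through this barrier; stores of
   non-pointer data (e.g., integers) need no barrier. */
void write_field(Obj *holder, int i, Obj *target) {
    if (holder->color == BLACK) {
        holder->color = GRAY;    /* re-gray the holder ... */
        worklist_push(holder);   /* ... so the collector rescans it */
    }
    holder->fields[i] = target;
}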
Using either read or write barriers, the pause time caused by incremental garbage collectors
can be reduced to the point where it is imperceptible.
Most objects die young. This observation motivates generational garbage collection. Effort
expended traversing long-lived objects is likely to be a waste. It makes sense to instead focus
garbage collection effort on recently created objects that are likely to be garbage.
Generational collection of the youngest generation works particularly well with copying
collection. Since the youngest generation is small, the 2× space overhead is not a problem. Since
copying collection does work proportional to the live objects, and the youngest generation is
likely mostly dead, collection is also fast. The other benefits of copying collection also apply:
good locality from compaction, and fast linear allocation. For older generations, copying
collection is usually a bad idea: it causes needless copying, increases the total memory footprint
of the system, and does not bring significant compaction benefits.
When doing a generational collection, pointers from older generations into the generation(s)
being collected are treated as roots. Fortunately, it is atypical for older generations to point to
younger generations, because the older-generation object is pointing to an object that was created
later. These older-generation objects are called the remembered set for the younger generation.
Objects become part of the remembered set only when they are updated imperatively to point to
the younger generation. The remembered set is therefore tracked by using a write barrier to
detect pointer updates that affect the remembered set.
To reduce the overhead of tracking the remembered set, card tables are often used. The
insight is that it is not necessary to track the remembered set precisely—any superset of the
remembered set suffices for safety, though we would like to avoid including very many extra
objects. A card table is a bit vector mapping older-generation heap regions to bits. A 0-bit in the
vector means no remembered-set objects are in that heap region; a 1-bit means there are some
remembered-set objects in the region. The card table is a compact representation of the
remembered set and it can updated very quickly when necessary. When collecting the younger
generation, all objects in the heap sections marked “1” must be scanned, conservatively, because
they might contain a pointer to the younger generation. Card table bits can be cleared during this
scan if no younger-generation pointers are found.
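A sketch of card marking (the 512-byte card size and all names here are assumptions):

#include <stdint.h>

#define CARD_SHIFT 9                 /* 512-byte cards */
#define OLD_GEN_SIZE (1u << 20)      /* illustrative old-generation size */

static uint8_t card_table[OLD_GEN_SIZE >> CARD_SHIFT];
static uintptr_t old_gen_base;       /* set when the old generation is created */

/* Write barrier: mark the card containing the updated old-generation object
   as dirty; the collector later scans all objects on dirty cards. */
void remember(void *updated_obj) {
    card_table[((uintptr_t)updated_obj - old_gen_base) >> CARD_SHIFT] = 1;
}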
Another approach to tracking the remembered set (and implementing other write barriers) is
to use the virtual memory system to trap writes to older generations. The pages containing older
generations are marked read-only, triggering a page fault when a write to an old-generation
object is attempted. The page fault handler adds the old-generation object to the remembered set
if it is updated to point to a younger-generation object.
[Figure: The train algorithm. Cars are processed in order.]
11 Mature object GC
The heap is divided into multiple trains containing some number of cars. Cars are parts
of the heap that are garbage-collected together. They are the units of GC. However, each
train can be garbage-collected independently. Given a train to be garbage-collected, the
first test is whether it has any incoming edges. If not, the whole train is garbage-collected.
Of course, keeping track of incoming edges requires a remembered set and therefore a
write barrier.
Given that the train has some incoming edges, the algorithm selects the first car from
the train. Using the remembered set for the car, all objects in the car that are referenced
externally are evacuated to the last car of a train containing a reference to the object (which
may or may not be the same train). If the destination train’s last car is too full, a new car is
created there as needed.
In addition, a traversal is done from these evacuated objects to find all other objects in
the same car reachable from them; these other objects are evacuated to the same train (and
car, if possible). After evacuating objects, all remaining objects are garbage. The car is
destroyed along with its remaining objects.
Since only one car is collected at a time, GC pause time is very short.
By moving objects to the same car as their predecessors, the train algorithm tends to
migrate cycles into the same train car. This allows them to be garbage-collected efficiently.
Sparse Matrix
A matrix can be defined as a two-dimensional array having 'm' rows and 'n' columns,
representing an m*n matrix. Sparse matrices are those matrices that have the majority of their
elements equal to zero. In other words, a sparse matrix can be defined as a matrix that has a
greater number of zero elements than non-zero elements.
We could also use an ordinary matrix to store the elements in memory; then why do we need a
sparse matrix representation? The following are its advantages:
o Storage: Since a sparse matrix contains fewer non-zero elements than zeros, less memory
  can be used by storing only the non-zero elements.
o Computing time: When searching a sparse matrix, we need to traverse only the non-zero
  elements rather than all the elements. Logically designing the data structure to traverse only
  the non-zero elements saves computing time.
Representing a sparse matrix by a 2D array leads to the wastage of lots of memory: there is no
point in storing the zeros alongside the non-zero elements. To avoid such wastage, we can store
only the non-zero elements, which reduces both the traversal time and the storage space.
The non-zero elements can be stored with triples, i.e., rows, columns, and value. The sparse
matrix can be represented in the following ways:
o Array representation
o Linked list representation
Array Representation
A 2D array with three rows, named Row, Column, and Value, can be used to represent a sparse
matrix.
Let's understand the sparse matrix using array representation through an example.
As we can observe above, the sparse matrix is represented using triplets, i.e., row, column,
and value. In the above sparse matrix, there are 13 zero elements and 7 non-zero elements.
This sparse matrix occupies 5*4 = 20 memory spaces. If the size of the sparse matrix is
increased, the wastage of memory space also increases. The above sparse matrix can be
represented in the tabular form shown below:

Row   Column   Value
0     1        4
0     3        5
1     2        3
1     3        6
2     2        2
3     0        2
3     1        3

In the above table structure, the first column represents the row number, the second column
represents the column number, and the third column represents the non-zero value at
index (row, column). The size of the table depends upon the number of non-zero elements in
the sparse matrix. The above table occupies (7 * 3) = 21 memory spaces, which is more than the
sparse matrix itself. But consider the case where the matrix is 8*8 and there are only 8 non-zero
elements: the space occupied by the full matrix would be 8*8 = 64, whereas the space occupied
by the table represented using triplets would be only 8*3 = 24.
In the 0th row and 1st column, value 4 is stored. In the 0th row and 3rd column, value 5 is
stored. In the 1st row and 2nd column, value 3 is stored. In the 1st row and 3rd column, value 6
is stored. In the 2nd row and 2nd column, value 2 is stored. In the 3rd row and 0th column,
value 2 is stored. In the 3rd row and 1st column, value 3 is stored.
#include <stdio.h>

int main()
{
    // Sparse matrix having size 4*5
    int sparse_matrix[4][5] =
    {
        {0, 0, 7, 0, 9},
        {0, 0, 5, 7, 0},
        {0, 0, 0, 0, 0},
        {0, 2, 3, 0, 0}
    };

    // Count of non-zero elements
    int size = 0;
    for(int i = 0; i < 4; i++)
    {
        for(int j = 0; j < 5; j++)
        {
            if(sparse_matrix[i][j] != 0)
            {
                size++;
            }
        }
    }

    // Defining final matrix
    int matrix[3][size];
    int k = 0;

    // Computing final matrix
    for(int i = 0; i < 4; i++)
    {
        for(int j = 0; j < 5; j++)
        {
            if(sparse_matrix[i][j] != 0)
            {
                matrix[0][k] = i;
                matrix[1][k] = j;
                matrix[2][k] = sparse_matrix[i][j];
                k++;
            }
        }
    }

    // Displaying the final matrix
    for(int i = 0; i < 3; i++)
    {
        for(int j = 0; j < size; j++)
        {
            printf("%d ", matrix[i][j]);
            printf("\t");
        }
        printf("\n");
    }
    return 0;
}
Output
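Since the program is deterministic, it prints the 3 x 6 triplet matrix: the first row holds the row
indices, the second row the column indices, and the third row the values of the six non-zero
elements.

0    0    1    1    3    3
2    4    2    3    1    2
7    9    5    7    2    3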
Linked List Representation
In linked list representation, a linked list data structure is used to represent the sparse matrix.
Each node consists of four fields, whereas in array representation there are three fields, i.e.,
row, column, and value. The following are the fields in the linked list:
o Row: It stores the row index of the non-zero element.
o Column: It stores the column index of the non-zero element.
o Value: It is the value of the non-zero element which is located at the index (row, column).
o Next node: It stores the address of the next node.
Let's understand the sparse matrix using linked list representation through an example.
In the above figure, the sparse matrix is represented in linked list form. In each node, the first
field represents the index of the row, the second field represents the index of the column, the
third field represents the value, and the fourth field contains the address of the next node.
The most natural representation is to use a two-dimensional array A[m][n] and access the
element of the ith row and jth column as A[i][j]. If a large number of the elements of the matrix
are zero, it is called a sparse matrix.
struct snode {
    int row, col, val;
    struct snode *next;
};
So a sparse matrix can be represented using a list of such nodes, one per non–zero element of
the matrix. For example, consider the sparse matrix
If the procedure sadd is applied to the above linked list representations then we get the
resultant list.
This matrix is an addition of the matrices of a and b, respectively.
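The sadd procedure itself is not listed here; a minimal sketch of such a merge-based addition,
assuming both lists are kept sorted in row-major order (this is an illustration, not the original
sadd; error handling is omitted):

#include <stdlib.h>

/* Appends a new triplet node after tail and returns the new tail. */
static struct snode *append(struct snode *tail, int row, int col, int val) {
    struct snode *n = malloc(sizeof *n);
    n->row = row; n->col = col; n->val = val; n->next = NULL;
    tail->next = n;
    return n;
}

struct snode *sadd(struct snode *a, struct snode *b) {
    struct snode head = {0}, *tail = &head;      /* dummy head simplifies appends */
    while (a != NULL && b != NULL) {
        int cmp = (a->row != b->row) ? a->row - b->row : a->col - b->col;
        if (cmp < 0)      { tail = append(tail, a->row, a->col, a->val); a = a->next; }
        else if (cmp > 0) { tail = append(tail, b->row, b->col, b->val); b = b->next; }
        else {                                    /* same position: add the values */
            if (a->val + b->val != 0)
                tail = append(tail, a->row, a->col, a->val + b->val);
            a = a->next; b = b->next;
        }
    }
    for (; a != NULL; a = a->next) tail = append(tail, a->row, a->col, a->val);
    for (; b != NULL; b = b->next) tail = append(tail, b->row, b->col, b->val);
    return head.next;
}

Each iteration advances at least one of the two lists, which is why the iteration count is bounded
by the total number of non-zero terms, as noted below.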
Points to Remember
1. If the sparse matrices to be added have m and n non-zero terms, respectively, then the
   linked list representations of these sparse matrices contain m and n nodes, respectively.
2. Since sadd traverses each of these lists sequentially, the maximum number of iterations
   that sadd will make is no more than m + n. So the computation time of sadd is O(m + n).
A sparse matrix can be represented by using TWO representations, which are as follows...
Triplet Representation
Linked Representation
Triplet Representation:
In this representation, we consider only the non-zero values, along with their row and column
index values. In this representation, the 0th row stores the total number of rows, the total number
of columns and the total number of non-zero values in the matrix.
For example, consider a matrix of size 5 X 6 containing 6 non-zero values. This matrix can be
represented as shown in the image...
In the above example matrix, there are only 6 non-zero elements (namely 9, 8, 4, 2, 5 & 2) and
the matrix size is 5 X 6. We represent this matrix as shown in the above image. Here the first row
of the table on the right side is filled with the values 5, 6 & 6, which indicates that it is a sparse
matrix with 5 rows, 6 columns & 6 non-zero values. The second row is filled with 0, 4 & 9,
which indicates that the value in the matrix at the 0th row, 4th column is 9. The remaining
non-zero values follow the same pattern.
Linked Representation:
In linked representation, we use linked list data structure to represent a sparse matrix. In this
linked list, we use two different nodes namely header node and element node. Header node
consists of three fields and element node consists of five fields as shown in the image.
A storage volume is the basic unit of storage, such as allocated space on a disk or a single
tape cartridge. A storage pool is a collection of storage volumes. The server uses the storage
volumes to store backed-up, archived, or space-managed files. The group of storage pools
that you set up for the TSM server to use is called server storage. Storage pools can be
arranged in a storage hierarchy.
The server has two types of storage pools that serve different purposes: primary storage pools
and copy storage pools.
When a user tries to restore, retrieve, recall, or export file data, the requested file is obtained
from a primary storage pool if possible. Primary storage pool volumes are always located
onsite.
A primary storage pool can use random access storage (DISK device class) or sequential
access storage (for example, tape or FILE device classes).
The server has a default DISKPOOL storage pool that uses random access disk storage. You
can easily create other disk storage pools and storage pools that use tape and other sequential
access media by using Device Configuration Wizard in the TSM Console.
The server does not require separate storage pools for archived, backed-up, or space-managed
files. However, you may want to have a separate storage pool for space-managed files.
Clients are likely to require fast access to their space-managed files. Therefore, you may want
to have those files stored in a separate storage pool that uses your fastest disk storage.
Copy Storage Pool
When an administrator backs up a primary storage pool, the data is stored in a copy storage
pool. See Backing Up Storage Pools for details.
A copy storage pool can use only sequential access storage (for example, a tape device class
or FILE device class).
The copy storage pool provides a means of recovering from disasters or media failures. For
example, when a client attempts to retrieve a file and the server detects an error in the file
copy in the primary storage pool, the server marks the file as damaged. At the next attempt to
access the file, the server obtains the file from a copy storage pool. For details, see Restoring
Storage Pools, Using Copy Storage Pools to Improve Data Availability, Recovering a Lost or
Damaged Storage Pool Volume, and Maintaining the Integrity of Files.
You can move copy storage pool volumes offsite and still have the server track the volumes.
Moving copy storage pool volumes offsite provides a means of recovering from an onsite
disaster.
Figure 17 shows one way to set up server storage. In this example, the storage defined for the
server includes:
Three disk storage pools, which are primary storage pools: ARCHIVE, BACKUP,
and HSM
One primary storage pool that consists of tape cartridges
One copy storage pool that consists of tape cartridges
Policies defined in management classes direct the server to store files from clients in the
ARCHIVE, BACKUP, or HSM disk storage pools. For each of the three disk storage pools,
the tape primary storage pool is next in the hierarchy. As the disk storage pools fill, the server
migrates files to tape to make room for new files. Large files may go directly to tape. For
more information about setting up a storage hierarchy, see Overview: The Storage Pool
Hierarchy.
You can back up all four of the primary storage pools to the one copy storage pool. For more
information on backing up primary storage pools, see Backing Up Storage Pools.
To set up this server storage hierarchy, do the following:
1. Define the three disk storage pools, or use the three default storage pools that are
defined when you install the server. Add volumes to the disk storage pools if you have
not already done so.
2. Define policies that direct the server to initially store files from clients in the disk
storage pools. To do this, you define or change management classes and copy groups
so that they point to the storage pools as destinations. Then activate the changed
policy. See Changing Policy for details.
3. Attach one or more tape devices, or a tape library, to your server system. Use Device
Configuration Wizard in the TSM Console to configure the device. See Chapter 5,
Configuring Storage Devices for more information. For detailed information on
defining a storage pool, see Defining or Updating Primary Storage Pools.
4. Update the disk storage pools so that they point to the tape storage pool as the next
storage pool in the hierarchy. See Example: Updating Storage Pools.
5. Define a copy storage pool. This storage pool can use the same tape device or a
   different tape device as the primary tape storage pool. See Defining a Copy Storage
   Pool.
6. Set up administrative schedules or a script to back up the disk storage pools and the
tape storage pool to the copy storage pool. Send the volumes offsite for safekeeping.
See Backing Up Storage Pools.
The objective in the implementation of a storage pool is to make the running times of the
Acquire and Release operations as small as possible. Ideally, both operations run in
constant time. In this section, we present a storage pool implementation that uses a singly-
linked list to keep track of the unused areas of memory. The consequence of using this
approach is that the running times are not ideal.
There are several requirements that the implementation of a storage pool must satisfy: it must
somehow keep track of the blocks of memory that have been allocated as well as the areas of
memory that remain unallocated.
For example, in order to implement the Acquire operation, we must have the means to locate
an unused area of memory of sufficient size in order to satisfy the request. The approach
taken in this section is to use a singly-linked list to keep track of the free areas in the pool.
In addition to keeping track of the free areas, it is necessary to keep track of the size of each
block that is allocated. This is necessary because the Release operation takes only a pointer to
the block of memory to be released. I.e., the size of the block is not provided as an argument
to the Release function.
Where should we keep track of this extra information? It turns out that the usual approach is
to keep the necessary information in the storage pool itself. An area that has not been
allocated to a user is available for use by the pool itself. Specifically, the nodes of the linked
list of free areas themselves occupy the free areas.
We implement the storage pool as an array of Blocks. A sequence of consecutive, contiguous
blocks in the array constitutes an area. Only the first block in each area is used to keep track of
the entire area.
An area which has been allocated is said to be reserved . The first word of the first block in
the area is used to keep track of the length of the area (in blocks). The remaining memory
locations in the area are given up to the user.
An area which has not been allocated is said to be free . The first word of the first block in the
area is used to keep track of the length of the area (in blocks). All of the free areas are linked
together in a singly-linked list, known as the free list . The second word of the first block in
the area contains a pointer to the next free area in the free list. For reasons explained below,
we keep the free list sorted by the address of areas contained therein.
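In C, the block layout just described might look like this (the names are assumptions; only the
first block of an area uses these fields, and the next field is meaningful only while the area is
free):

struct Block {
    unsigned length;        /* first word: length of the area, in blocks */
    struct Block *next;     /* second word: pointer to the next free area
                               (used only while the area is on the free list) */
};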
Input and Output
Input:
This algorithm will take the maze as a matrix.
In the matrix, the value 1 indicates the free space and 0 indicates the wall or blocked area.
In this diagram, the top-left circle indicates the starting point and the bottom-right circle
indicates the ending point.
Output:
It will display a matrix. From that matrix, we can find the path of the rat to reach the
destination point.
Algorithm
isValid(x, y)
Input: x and y point in the maze.
Output: True if the (x,y) place is valid, otherwise false.
Begin
   if x and y are in range and the (x, y) place is not blocked, then
      return true
   return false
End
solveRatMaze(x, y)
Input: The starting point x and y.
Output: The path to follow by the rat to reach the destination, otherwise false.
Begin
   if (x, y) is the bottom right corner, then
      mark the place as 1
      return true
   if isValid(x, y) = true, then
      mark the (x, y) place as 1
      if solveRatMaze(x+1, y) = true, then //for forward movement
         return true
      if solveRatMaze(x, y+1) = true, then //for down movement
         return true
      mark (x, y) as 0 when backtracking
      return false
   return false
End
Rat in a Maze
Let us discuss Rat in a Maze as another example problem that can be solved using
Backtracking.
A Maze is given as an N*N binary matrix of blocks, where the source block is the uppermost left
block, i.e., maze[0][0], and the destination block is the lowermost right block, i.e., maze[N-1][N-1].
A rat starts from the source and has to reach the destination. The rat can move only in two
directions: forward and down.
In the maze matrix, 0 means the block is a dead end and 1 means the block can be used in the
path from source to destination. Note that this is a simple version of the typical Maze
problem. For example, a more complex version can be that the rat can move in 4 directions
and a more complex version can be with a limited number of moves.
Following is an example maze.
Gray blocks are dead ends (value = 0).
Following is a binary matrix representation of the above maze.
{1, 0, 0, 0}
{1, 1, 0, 1}
{0, 1, 0, 0}
{1, 1, 1, 1}
Following is a maze with highlighted solution path.
Following is the solution matrix (output of program) for the above input matrix.
{1, 0, 0, 0}
{1, 1, 0, 0}
{0, 1, 0, 0}
{0, 1, 1, 1}
All entries in solution path are marked as 1.
Backtracking Algorithm: Backtracking is an algorithmic technique for solving problems
recursively by trying to build a solution incrementally, one piece at a time, removing those
solutions that fail to satisfy the constraints of the problem at any point in time (where "time"
refers to the time elapsed until reaching any level of the search tree).
Approach: Form a recursive function, which will follow a path and check if the path reaches
the destination or not. If the path does not reach the destination then backtrack and try other
paths.
Algorithm:
1. Create a solution matrix, initially filled with 0's.
2. Create a recursive function which takes the initial matrix, the output matrix and the position
   of the rat (i, j).
3. If the position is out of the matrix or the position is not valid, then return.
4. Mark the position output[i][j] as 1 and check if the current position is the destination or not.
   If the destination is reached, print the output matrix and return.
5. Recursively call for positions (i+1, j) and (i, j+1).
6. Unmark position (i, j), i.e., output[i][j] = 0.
Complexity Analysis:
Time Complexity: O(2^(n^2)).
The recursion can run upper-bound 2^(n^2) times.
Space Complexity: O(n^2).
Output matrix is required so an extra space of size n*n is needed.
A maze is in the form of a 2D matrix in which some cells/blocks are blocked. One of the cells
is termed as a source cell, from where we have to start. And another one of them is termed as
a destination cell, where we have to reach. We have to find a path from the source to the
destination without moving into any of the blocked cells. A picture of an unsolved maze is
shown below, where grey cells denote the dead ends and white cells denote the cells which
can be accessed.
To solve these types of puzzle, we first start with the source cell and move in a direction
where the path is not blocked. If the path taken makes us reach the destination, then the
puzzle is solved. Otherwise, we come back and change our direction of the path taken.
If this is your very first foray into Backtracking, fear not — it’s mine, too! Let’s tackle it
together — and try not to lose our sanity in the process. So many things in this world would
never have come into existence if there hadn’t been a problem that needed solving. This truth
applies to almost everything, but boy, is it obvious in the world of computer science.
So, it’s a big “YES”, but I do think that it’s unique in one way: computer science’s
innovations rely and build upon its own abstractions. But what leads to the need for
Backtracking algorithms?
What is Backtracking?
Depth First Search (DFS) uses the concept of backtracking at its very core. In DFS, we
basically try exploring all the paths from the given node recursively until we reach the goal.
After we search a particular branch of a tree in DFS, we can end up in one of two states: we
reach the goal and stop, or we hit a dead end and must backtrack.
In recursion, a function calls itself until it reaches a base case. In backtracking, we use
recursion in order to explore all the possibilities until we get the best result for the problem.
What is backtracking in stack?
Recursion uses a call stack to keep the state of each recursive call and then pops it as the
function ends. We can eliminate recursion by using our own stack to do the same thing.
Always turn the same way (left or right) when you are in a maze and you will find your way to
the destination: if you always turn in the same direction, you will eventually find the exit or
whatever the maze is about finding.
It is easy to implement this kind of processing by constructing a tree of choices being made,
called the state-space tree. Its root represents an initial state before the search for a problem
begins. The nodes of the first
level in the tree represent the options made for the first component of a solution, the nodes of
the second level show the choices for the second component, and so on.
In most cases, a state-space tree for a backtracking algorithm is constructed in the
manner of depth-first search. If the current node is promising, its child is generated
by adding the first remaining legitimate choice for the next component of a solution, and the
processing moves to this child.
If the current node turns out to be non-promising, the algorithm recursively backtracks to the
node’s parent to
consider the next possible option for its last component; if there is no such available choice, it
backtracks one more level up the tree, and so on. In the end, if the algorithm reaches a
complete solution to the problem, it either stops (if just one solution is required) or continues
searching for other possible solutions.
Backtracking is one of the famous algorithms because of its simplicity and elegance; it
doesn't always have great performance, but the branch-pruning part is really amazing and
gives you a sense of the performance gains as you code.
Code:
Step 1: Include the header file, define the original maze and initialise the solution matrix.
Step 2: Assign ‘0’ to all values of the solution matrix and call the ‘solveMaze’ function.
Step 3: Define the ‘solveMaze’ function, which is the main implementation of the
backtracking algorithm.
Step 4: Define the ‘printMaze’ and ‘printSolution’ functions.
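Putting these steps together, a minimal sketch in C (the 4x4 maze values are an assumption
for illustration; the names SIZE, solveMaze, printMaze and printSolution follow the
explanation given below):
#include <stdio.h>
#define SIZE 4
/* Step 1: the original maze; 1 = open cell, 0 = dead end. */
int maze[SIZE][SIZE] = {
   {1, 0, 0, 0},
   {1, 1, 0, 1},
   {0, 1, 0, 0},
   {1, 1, 1, 1}
};
/* Step 2: the solution matrix; globals are zero-initialised in C. */
int solution[SIZE][SIZE];
void printMaze() {
   for (int r = 0; r < SIZE; r++) {
      for (int c = 0; c < SIZE; c++)
         printf("%d ", maze[r][c]);
      printf("\n");
   }
}
void printSolution() {
   for (int r = 0; r < SIZE; r++) {
      for (int c = 0; c < SIZE; c++)
         printf("%d ", solution[r][c]);
      printf("\n");
   }
}
/* Step 3: returns 1 if a path from (r, c) to the destination exists. */
int solveMaze(int r, int c) {
   /* Reject positions outside the matrix and blocked cells. */
   if (r < 0 || c < 0 || r >= SIZE || c >= SIZE || maze[r][c] == 0)
      return 0;
   solution[r][c] = 1;                  /* tentatively mark the cell */
   if (r == SIZE - 1 && c == SIZE - 1)  /* destination reached */
      return 1;
   if (solveMaze(r + 1, c) || solveMaze(r, c + 1))
      return 1;
   solution[r][c] = 0;                  /* backtrack: unmark the cell */
   return 0;
}
int main() {
   printf("Original Maze:\n");
   printMaze();
   if (solveMaze(0, 0)) {
      printf("Solution Path:\n");
      printSolution();
   }
   return 0;
}
If solveMaze(0, 0) returns 0, no down-and-right path exists and the solution matrix is left
all zeros.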
Output:
Original Maze: 0 means the block is a dead end and 1 means the block can be used in the
path from source to destination.
Solution Path: Following ‘1’ from the source node to the destination node defines the final
solution.
Explanation of the Code:
o printMaze(): This function simply prints the original maze matrix.
o printSolution(): This function simply prints the solution matrix derived after
applying the backtracking algorithm.
o solveMaze(): This is the main function, where we implement our backtracking
algorithm. First, we check whether our cell/block is the destination cell or not, i.e.
(r == SIZE-1) and (c == SIZE-1). If it is the destination cell, then our maze is already
solved. If not, then we check whether it is a valid cell to move to or not. A valid cell
must lie within the 2D matrix, i.e. its indices must be between 0 and SIZE-1
(r >= 0 && c >= 0 && r < SIZE && c < SIZE), and it must not be a blocked cell.
Complexity Analysis:
o Time Complexity: O(2^(n^2)). The recursion can run at most 2^(n^2) times.
o Space Complexity: O(n^2). The output matrix requires extra space of size n*n.
Real-Life Applications:
o Puzzles such as the eight queens puzzle, crosswords, verbal arithmetic, Sudoku and
Peg Solitaire.
o Combinatorial optimisation problems such as parsing and the knapsack problem.
Pros and Cons of using Backtracking algorithm:
Pros:
o It’s very intuitive to code. It’s almost like a small kid trying to solve the problem.
o It is a step-by-step representation of a solution to a given problem, which is very easy
to understand.
o It is easy to first develop an algorithm, and then convert it into a flowchart and then
into a computer program.
o It is very easy to implement and contains few lines of code. Almost all backtracking
code boils down to a short recursive function.
o It has a definite procedure.
Cons:
o More optimal algorithms for the given problem may exist.
o Very time-inefficient in a lot of cases when the branching factor is large.
o Large space complexity: because we use recursion, information for every call is
stored on the stack.
Conclusion:
o Backtracking is like permutation on steroids: when a pattern doesn't match, we try
another one until no options remain or a matching pattern is found.
o Backtracking can solve almost any problem, e.g. chess (the famous 8 queens problem) or
Sudoku (the complete solution set), due to its brute-force nature (analogous to permutation).
o Backtracking requires recursion, which can be a weakness, because CPU stack
space is limited and can be consumed quickly by deep recursion.
o Backtracking can take exponential time in the worst case, oh no!
o The order of exploration can feel non-deterministic unless you trace it.
o Backtracking is hard for a human to simulate by hand.
o Backtracking can rarely be tail-call optimised.
Array Representation
An array is a container which can hold a fixed number of items, and these items should be of
the same type. Most of the data structures make use of arrays to implement their algorithms.
Following are the important terms to understand the concept of Array.
Element − Each item stored in an array is called an element.
Index − Each location of an element in an array has a numerical index, which is used
to identify the element.
Array Representation
Arrays can be declared in various ways in different languages. For illustration, let's take C
array declaration.
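A minimal sketch of such a declaration (the element values here are assumptions, chosen so
that the element at index 6 is 9, matching the points below):
int LA[10] = {35, 33, 42, 10, 14, 19, 9, 44, 26, 31};
/* index:     0   1   2   3   4   5  6   7   8   9  */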
As per the above illustration, following are the important points to be considered.
Index starts with 0.
Array length is 10 which means it can store 10 elements.
Each element can be accessed via its index. For example, we can fetch an element at
index 6 as 9.
Basic Operations
Following are the basic operations supported by an array.
Traverse − Print all the array elements one by one.
Insertion − Adds an element at the given index.
Deletion − Deletes an element at the given index.
Search − Searches an element using the given index or by the value.
Update − Updates an element at the given index.
In C, when an array with static storage duration is declared with a size but not explicitly
initialized, its elements take the default (zero) value for their type:
Data Type     Default Value
bool          false
char          0
int           0
float         0.0
double        0.0
wchar_t       0
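A quick sketch demonstrating this for an array with static storage duration (the name arr is
illustrative):
#include <stdio.h>
static int arr[5];   /* static storage duration: elements default to 0 */
int main() {
   for (int i = 0; i < 5; i++)
      printf("arr[%d] = %d\n", i, arr[i]);   /* each prints 0 */
   return 0;
}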
Traverse Operation
This operation is to traverse through the elements of an array.
Example
The following program traverses and prints the elements of an array:
#include <stdio.h>
int main() {
   int LA[] = {1,3,5,7,8};
   int n = 5;
   printf("The original array elements are :\n");
   for (int i = 0; i < n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
When we compile and execute the above program, it produces the following result −
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
Insertion Operation
Insert operation is to insert one or more data elements into an array. Based on the
requirement, a new element can be added at the beginning, at the end, or at any given index
of the array. Here, we see a practical implementation of the insertion operation, where we
add an element at a given index k of the array −
Algorithm
Let LA be a linear array with N elements and K a positive integer such that K <= N.
Following is the algorithm to insert an element ITEM at the Kth position of LA.
1. Start
2. Set J = N
3. Set N = N + 1
4. Repeat steps 5 and 6 while J >= K
5. Set LA[J + 1] = LA[J]
6. Set J = J - 1
7. Set LA[K] = ITEM
8. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>
int main() {
   int LA[6] = {1,3,5,7,8};   /* one spare slot for the new element */
   int item = 10, k = 3, n = 5;
   int j = n - 1;
   printf("The original array elements are :\n");
   for (int i = 0; i < n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   /* Shift elements from index k onwards one place to the right. */
   while (j >= k) {
      LA[j+1] = LA[j];
      j = j - 1;
   }
   LA[k] = item;   /* place the new element at index k */
   n = n + 1;
   printf("The array elements after insertion :\n");
   for (int i = 0; i < n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   return 0;
}
Deletion Operation
Deletion refers to removing an existing element from the array and reorganizing all
elements of the array.
Algorithm
Let LA be a linear array with N elements and K a positive integer such that K <= N.
Following is the algorithm to delete the element available at the Kth position of LA.
1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>
int main() {
   int LA[] = {1,3,5,7,8};
   int k = 3, n = 5;
   int j = k;
   /* Shift every element after the Kth (1-based) position one place left. */
   while (j < n) {
      LA[j-1] = LA[j];
      j = j + 1;
   }
   n = n - 1;
   printf("The array elements after deletion :\n");
   for (int i = 0; i < n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   return 0;
}
Output
The array elements after deletion :
LA[0] = 1
LA[1] = 3
LA[2] = 7
LA[3] = 8
Search Operation
You can perform a search for an array element based on its value or its index.
Algorithm
Let LA be a linear array with N elements and K a positive integer such that K <= N.
Following is the algorithm to find an element with the value ITEM using sequential search.
1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal to ITEM THEN GOTO STEP 6
5. Set J = J + 1
6. PRINT J + 1, ITEM (positions are counted from 1)
7. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>
int main() {
   int LA[] = {1,3,5,7,8};
   int item = 5, n = 5, j = 0;
   printf("The original array elements are :\n");
   for (int i = 0; i < n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   /* Sequential search: advance until the item is found. */
   while (j < n && LA[j] != item)
      j = j + 1;
   printf("Found element %d at position %d\n", item, j + 1);
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
Found element 5 at position 3
Update Operation
Update operation refers to updating an existing element of the array at a given index.
Algorithm
Let LA be a linear array with N elements and K a positive integer such that K <= N.
Following is the algorithm to update the element available at the Kth position of LA.
1. Start
2. Set LA[K-1] = ITEM
3. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>
int main() {
   int LA[] = {1,3,5,7,8};
   int k = 3, item = 10, n = 5, i;
   printf("The original array elements are :\n");
   for (i = 0; i < n; i++) printf("LA[%d] = %d \n", i, LA[i]);
   LA[k-1] = item;   /* overwrite the Kth (1-based) element */
   printf("The array elements after updation :\n");
   for (i = 0; i < n; i++) printf("LA[%d] = %d \n", i, LA[i]);
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after updation :
LA[0] = 1
LA[1] = 3
LA[2] = 10
LA[3] = 7
LA[4] = 8