DSA CAT 2 QP & KEY
Course Outcomes:
CO3 To realize the properties of tree data structure and its importance in searching large databases.
CO4 To understand graph data structure and its applications.
Knowledge Level: K1-Knowledge, K2-Understand, K3-Apply, K4-Analyze, K5-Synthesis, K6-Evaluate
ANSWER KEY
PART A
1. Define tree.
A tree is a non-linear data structure: there is no linear relation between the data items.
It can be defined as a finite set of one or more nodes.
There is a special node designated as the root node.
8. Define set.
A set is an abstract data type that can store certain values, without any particular order and without repeated
values. It is a computer implementation of the mathematical concept of a finite set. Unlike most other
collection types, rather than retrieving a specific element from a set, one typically tests a value for
membership in a set.
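The two properties named above (no repeated values, membership testing) can be illustrated with Python's built-in set type. This is only a small sketch; the course codes and variable names are made-up sample data.

```python
# Hypothetical sample data; the names below are illustrative only.
course_codes = set()
for code in ["CS201", "CS305", "CS201"]:
    course_codes.add(code)          # the repeated "CS201" is stored only once

has_cs305 = "CS305" in course_codes   # membership test on a stored value
has_cs999 = "CS999" in course_codes   # membership test on a missing value

print(len(course_codes), has_cs305, has_cs999)   # prints: 2 True False
```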
PART B
11. Explain B+ Trees.
A B-tree is very efficient with respect to search and modification operations that involve a single
record. For example, let m = 100 and N = 1000000; in this case the search for a key requires at most 4
disk accesses.
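The figure of 4 disk accesses can be checked with a short calculation. The sketch below assumes that m is the order of the tree (maximum number of children per node), that every non-root internal node has at least ceil(m/2) children, that the root has at least 2 children, and that each level visited costs one disk access. Under these assumptions a tree with h levels holds at least 2 * ceil(m/2)^(h-1) - 1 keys. The function name is illustrative.

```python
def max_btree_levels(order, num_keys):
    """Largest number of levels a B-tree of the given order can have
    while holding num_keys keys (num_keys >= 1)."""
    min_children = (order + 1) // 2      # ceil(order / 2)
    levels = 1
    # A tree with (levels + 1) levels needs at least
    # 2 * min_children**levels - 1 keys; grow while that still fits.
    while 2 * min_children ** levels - 1 <= num_keys:
        levels += 1
    return levels

print(max_btree_levels(100, 1_000_000))   # prints: 4
```

With m = 100 the minimum fan-out is 50, so a fifth level would need at least 2 * 50^4 - 1 = 12499999 keys, which is more than N.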
A B-tree, however, is not particularly suited to sequential operations or to range searches: the
retrieval of the next key value may require accessing a large number of nodes.
To address this problem, a variation of the B-tree structure has been proposed, known as the B+-tree.
Main idea
In a B-tree, the key values have two functions:
Separators, to determine the path to follow during the search
Key values, to allow access to the information associated with them (that is, the pointers to the data)
In a B+-tree such functions are kept distinct:
The leaf nodes contain all the key values (and the associated information)
The internal nodes (organized as a B-tree) store some separators, whose only function is to
determine the path to follow when searching for a key value
In addition the leaf nodes are linked in a list, in order to efficiently support range searches or
ACHARIYA
COLLEGE OF ENGINEERING TECHNOLOGY
(Approved by AICTE New Delhi & Affiliated to Pondicherry University)
An ISO 9001 : 2008 Certified Institution
DEPARTMENT OF ARTIFICIAL INTELLIGENCE & DATA SCIENCE
sequential searches (there is also a pointer to the first element of this list to support fast access to
the minimum key value)
This entails partial duplication of the keys: a key value may appear both in a leaf and as a separator
The index entries (keys and data references) are stored only in the leaf nodes
A search for a given key value must always reach a leaf node
The subtree on the left side of a separator contains key values that are lower than the separator; the
subtree on the right side of a separator contains key values that are greater than or equal to the
separator
In the case of alphanumeric keys, one can reduce the space requirements by using separators that
have reduced lengths
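The search rule and the linked leaves described above can be sketched as follows. This is a minimal illustration, not a full B+-tree: the tree is built by hand, and the class and function names (Leaf, Internal, find_leaf, search, range_search) are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

class Leaf:
    def __init__(self, keys, values):
        self.keys = keys          # sorted key values
        self.values = values      # associated information
        self.next = None          # link to the next leaf in key order

class Internal:
    def __init__(self, separators, children):
        self.separators = separators   # sorted separators
        self.children = children       # len(children) == len(separators) + 1

def find_leaf(node, key):
    # Keys lower than a separator go left; keys >= the separator go right.
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.separators, key)]
    return node

def search(root, key):
    leaf = find_leaf(root, key)        # every search ends in a leaf
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None

def range_search(root, low, high):
    # Locate the leaf for `low`, then follow the leaf links.
    leaf, result = find_leaf(root, low), []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return result
            if k >= low:
                result.append((k, v))
        leaf = leaf.next
    return result

# Hand-built example tree with made-up keys and values.
leaf1, leaf2, leaf3 = Leaf([5, 10], ["a", "b"]), Leaf([20, 30], ["c", "d"]), Leaf([40, 50], ["e", "f"])
leaf1.next, leaf2.next = leaf2, leaf3
root = Internal([20, 40], [leaf1, leaf2, leaf3])

print(search(root, 30))             # prints: d
print(range_search(root, 10, 40))   # prints: [(10, 'b'), (20, 'c'), (30, 'd'), (40, 'e')]
```

Note that the separators 20 and 40 also appear in the leaves, which is the partial duplication of keys typical of a B+-tree.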
B+Tree Insertion:
Insert at bottom level
If leaf page overflows, split page and copy middle element to next index page
If index page overflows, split page and move middle element to next index page
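The difference between copying and moving the middle element can be sketched on plain key lists. This is a simplified illustration that assumes a page is just a sorted Python list of keys; the function names are hypothetical, and data pointers and parent updates are omitted.

```python
def split_leaf_page(keys):
    """Split an overflowing leaf page. The middle key is COPIED up:
    it stays in the right leaf and is also returned as the new separator."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]

def split_index_page(keys):
    """Split an overflowing index page. The middle key is MOVED up:
    it is removed from both halves and returned for the parent."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]

print(split_leaf_page([5, 10, 15, 20, 25]))     # prints: ([5, 10], [15, 20, 25], 15)
print(split_index_page([10, 20, 30, 40, 50]))   # prints: ([10, 20], [40, 50], 30)
```

The leaf split keeps 15 in the right leaf because all key values must remain in the leaves; the index split does not need to keep 30, since separators only guide the search.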
B+Tree Deletion:
Delete key and data from leaf page
If leaf page underflows, merge with sibling and delete the key between them
If index page underflows, merge with sibling and move down the key between them
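The two merge cases can be sketched in the same simplified style, again assuming pages are sorted Python lists of keys. The function names are hypothetical, and redistribution of keys between siblings is not covered.

```python
def merge_leaf_pages(separators, leaves, i):
    """Merge leaves[i] with its right sibling leaves[i + 1].
    The separator between them (separators[i]) is deleted from the parent."""
    merged = leaves[i] + leaves[i + 1]
    new_separators = separators[:i] + separators[i + 1:]
    new_leaves = leaves[:i] + [merged] + leaves[i + 2:]
    return new_separators, new_leaves

def merge_index_pages(left_keys, parent_key, right_keys):
    """Merge two index pages. The parent key between them moves DOWN
    into the merged page."""
    return left_keys + [parent_key] + right_keys

# Leaf [20] has underflowed and is merged with its left sibling [5, 10].
print(merge_leaf_pages([20, 40], [[5, 10], [20], [40, 50]], 0))
# prints: ([40], [[5, 10, 20], [40, 50]])
print(merge_index_pages([10], 30, [50]))   # prints: [10, 30, 50]
```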
Search in a trie:
Follow links corresponding to each character in the key.
Search hit: node where search ends has a non-null value.
Search miss: reach null link or node where search ends has null value.
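The search rules above can be sketched as follows. This is a minimal illustration that assumes each node keeps its links in a dictionary keyed by character; TrieNode, put and get are illustrative names.

```python
class TrieNode:
    def __init__(self):
        self.value = None     # None plays the role of the "null value"
        self.children = {}    # character -> child node (a missing entry is a null link)

def put(root, key, value):
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.value = value

def get(root, key):
    node = root
    for ch in key:                      # follow the link for each character
        node = node.children.get(ch)
        if node is None:
            return None                 # search miss: reached a null link
    return node.value                   # hit if non-null, miss if None

root = TrieNode()
put(root, "sea", 1)
put(root, "shells", 2)

print(get(root, "sea"))     # prints: 1     (search hit)
print(get(root, "she"))     # prints: None  (search ends at a node with null value)
print(get(root, "shore"))   # prints: None  (null link reached)
```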
ADJACENCY MATRIX:
The adjacency matrix A of a graph G = (V,E) with n vertices is an n x n matrix such that
Aij = 1, if there is an edge from Vi to Vj
Aij = 0, if there is no such edge.
Adjacency matrix for a Directed Graph:
Example:
A1,2 = 1, since there is an edge from V1 to V2
Similarly A1,3 = 1, since there is an edge from V1 to V3
Advantages:
1. Simple to implement
Disadvantages:
1. Takes O(n^2) space to represent the graph
2. Takes O(n^2) time to solve most problems
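The adjacency matrix representation can be sketched as follows. Only the edges V1 to V2 and V1 to V3 come from the example; the remaining edges and the function name are assumptions added to make the sample graph complete.

```python
def build_adjacency_matrix(n, edges):
    """Return the n x n adjacency matrix of a directed graph
    whose vertices are numbered 1..n."""
    matrix = [[0] * n for _ in range(n)]      # O(n^2) space, all entries 0
    for i, j in edges:
        matrix[i - 1][j - 1] = 1              # edge from Vi to Vj
    return matrix

# (1, 2) and (1, 3) are from the example; (2, 4) and (3, 4) are hypothetical.
edges = [(1, 2), (1, 3), (2, 4), (3, 4)]
matrix = build_adjacency_matrix(4, edges)
for row in matrix:
    print(row)
# prints:
# [0, 1, 1, 0]
# [0, 0, 0, 1]
# [0, 0, 0, 1]
# [0, 0, 0, 0]
```

Testing whether a given edge exists is a single lookup, but the matrix always uses n * n entries regardless of how few edges the graph has.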
ADJACENCY LIST
In this representation, we store a graph as a linked structure. We store all vertices in a list; then, for each
vertex, we have a linked list of its adjacent vertices.
Disadvantages:
1. It takes O(n) time to determine whether there is an arc from vertex i to vertex j, since there can be O(n)
vertices on the adjacency list for vertex i.
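The adjacency list representation and its edge-lookup cost can be sketched as follows. For brevity the sketch uses Python lists in place of linked lists, which is an assumption of the example; the function names and the sample edges are illustrative.

```python
def build_adjacency_list(n, edges):
    """Map each vertex 1..n of a directed graph to the list of its adjacent vertices."""
    adjacency = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        adjacency[i].append(j)
    return adjacency

def has_edge(adjacency, i, j):
    # Scans the list for vertex i, so the cost grows with the number
    # of vertices adjacent to i (O(n) in the worst case).
    return j in adjacency[i]

# Hypothetical sample graph.
adjacency = build_adjacency_list(4, [(1, 2), (1, 3), (2, 4), (3, 4)])
print(adjacency)                  # prints: {1: [2, 3], 2: [4], 3: [4], 4: []}
print(has_edge(adjacency, 1, 3))  # prints: True
print(has_edge(adjacency, 4, 1))  # prints: False
```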
14. Explain transitive closure with examples
A problem related to the all pairs shortest path problem is that of determining, for every pair of vertices i, j in G, the
existence of a path from i to j. Two cases are of interest: one when all path lengths (i.e., the number of edges on the
path) are required to be positive, and the other when path lengths are to be nonnegative. If A is the adjacency matrix
of G, then the matrix A+ having the property A+(i,j) = 1 if there is a path of length > 0 from i to j and 0 otherwise is
called the transitive closure matrix of G. The matrix A* with the property A*(i,j) = 1 if
there is a path of length ≥ 0 from i to j and 0 otherwise is the reflexive transitive closure matrix of G.
The figure shows A+ and A* for a digraph. Clearly, the only difference between A* and A+ is in the terms on the diagonal.
A+(i,i) = 1 iff there is a cycle of length ≥ 1 containing vertex i, while A*(i,i) is always one as there is a path of length 0 from i
to i. If we use algorithm ALL_COSTS with COST(i,j) = 1 if <i,j> is an edge in G and COST(i,j) = +∞
if <i,j> is not in G, then we can easily obtain A+ from the final matrix A by letting A+(i,j) = 1 iff A(i,j) < +∞. A*
can be obtained from A+ by setting all diagonal elements equal to 1. The total time is O(n^3). Some simplification is
achieved by slightly modifying the algorithm. In this modification the computation of line 9 of ALL_COSTS
becomes A(i,j) ← A(i,j) or (A(i,k) and A(k,j)) and COST(i,j) is just the adjacency matrix of G. With this modification,
A need only be a bit matrix and then the final matrix A will be A+.
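The bit-matrix modification described above can be sketched as follows. This is an illustration of the update A(i,j) ← A(i,j) or (A(i,k) and A(k,j)), not a transcription of ALL_COSTS; the function names and the sample digraph are assumptions made for the example.

```python
def transitive_closure(adjacency_matrix):
    """Return A+ as a matrix of booleans: A+[i][j] is True iff there is
    a path of length > 0 from vertex i to vertex j."""
    n = len(adjacency_matrix)
    a = [[bool(x) for x in row] for row in adjacency_matrix]
    for k in range(n):                 # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                a[i][j] = a[i][j] or (a[i][k] and a[k][j])
    return a

def reflexive_transitive_closure(adjacency_matrix):
    """Return A*: A+ with every diagonal element set to True."""
    a = transitive_closure(adjacency_matrix)
    for i in range(len(a)):
        a[i][i] = True
    return a

# Hypothetical digraph on vertices 0..3 with edges 0->1, 1->2, 2->1, 2->3.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 0, 0]]
a_plus = transitive_closure(adj)
a_star = reflexive_transitive_closure(adj)

print([a_plus[i][i] for i in range(4)])   # prints: [False, True, True, False]
print([a_star[i][i] for i in range(4)])   # prints: [True, True, True, True]
```

Vertices 1 and 2 lie on a cycle, so their diagonal entries in A+ are set; vertices 0 and 3 do not, so A+ and A* differ exactly at those two diagonal positions. The three nested loops give the O(n^3) running time.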