BDA LabManual 2024-25
EXPERIMENT NO: 1
ATHARVA EDUCATIONAL TRUST'S
ATHARVA COLLEGE OF ENGINEERING
(Approved by AICTE, Recognized by Government of Maharashtra
& Affiliated to University of Mumbai - Estd. 1999 - 2000)
ISO 2100:2018 ISO 14001:2015 ISO 9001:2015
NAAC Accredited A+ Grade
EXPERIMENT NO: 2
Theory:
NoSQL databases have grown in popularity with the rise of Big Data applications. In
comparison to relational databases, NoSQL databases are much cheaper to scale, capable
of handling unstructured data, and better suited to current agile development approaches.
The advantages of NoSQL technology are compelling but the thought of replacing a
legacy relational system can be daunting. To explore the possibilities of NoSQL in your
enterprise, consider a small-scale trial of a NoSQL database like MongoDB. NoSQL
databases are typically open source so you can download the software and try it out for
free. From this trial, you can assess the technology without great risk or cost to your
organization.
Commands of Neo4j:
1) Create Clause
CREATE (sample)
CREATE (sample1),(sample2)
CREATE (Vidya:Student)
2) Creating Relationships
CREATE (Vivek1)-[r:STUDENT_OF]->(MU)
CREATE (Ind:Country {name:"India"})
CREATE (Vivek2)-[r:STUDENT_OF]->(MU1)
CREATE (MU1)-[r1:UNIVERSITY_OF]->(Ind)
3) Merge
4) Delete
Sample Output:
EXPERIMENT NO: 3
EXPERIMENT NO: 6
EXPERIMENT NO: 5
Theory:
MapReduce
MapReduce is a style of computing that has been implemented in several systems,
including Google’s internal implementation (simply called MapReduce) and the popular
open-source implementation Hadoop, which can be obtained, along with the HDFS file
system, from the Apache Foundation. You can use an implementation of MapReduce to
manage many large-scale computations in a way that is tolerant of hardware faults. All
you need to write are two functions, called Map and Reduce, while the system manages
the parallel execution and the coordination of tasks that execute Map or Reduce, and also
deals with the possibility that one of these tasks will fail to execute. In brief, a MapReduce
computation executes as follows:
1. Some number of Map tasks each are given one or more chunks from a distributed file
system. These Map tasks turn the chunk into a sequence of key-value pairs. The way
key-value pairs are produced from the input data is determined by the code written by the
user for the Map function.
2. The key-value pairs from each Map task are collected by a master controller and sorted
by key. The keys are divided among all the Reduce tasks, so all key-value pairs with the
same key wind up at the same Reduce task.
3. The Reduce tasks work on one key at a time, and combine all the values associated
with that key in some way. The manner of combination of values is determined by the
code written by the user for the Reduce function.
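The three steps above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not Hadoop itself: the names map_fn, shuffle, reduce_fn and run_mapreduce are made up for this sketch, and word counting is used as an assumed example task.

```python
from collections import defaultdict


def map_fn(chunk):
    """Map: turn one chunk of text into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the values by key and sort by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_fn(key, values):
    """Reduce: combine all the values that share one key."""
    return key, sum(values)


def run_mapreduce(chunks):
    """Run Map on every chunk, shuffle, then run Reduce once per key."""
    mapped = [pair for chunk in chunks for pair in map_fn(chunk)]
    return [reduce_fn(key, values) for key, values in shuffle(mapped)]


if __name__ == "__main__":
    chunks = ["big data big", "data lab"]
    print(run_mapreduce(chunks))  # [('big', 2), ('data', 2), ('lab', 1)]
```

In a real cluster the chunks live on different machines and the shuffle moves data over the network; here everything runs in one process so the data flow is easy to follow.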
Matrix Multiplication
Suppose we have an n×n matrix M, whose element in row i and column j is denoted
by m_ij. Suppose we also have a vector v of length n, whose jth element is v_j. Then the
matrix-vector product is the vector x of length n, whose ith element x_i is given by
$x_i = \sum_{j=1}^{n} m_{ij} v_j$
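The matrix-vector product can be written as a Map step and a Reduce step. The sketch below assumes the matrix is stored as (i, j, value) triples with 0-based indices and that the whole vector v fits in memory. The function names are illustrative only.

```python
from collections import defaultdict


def map_entry(i, j, m_ij, v):
    """Map: each matrix entry m_ij produces the pair (i, m_ij * v_j)."""
    return i, m_ij * v[j]


def matvec_mapreduce(entries, v):
    """entries: iterable of (i, j, m_ij) triples; v: list of length n."""
    groups = defaultdict(list)
    for i, j, m_ij in entries:
        key, value = map_entry(i, j, m_ij, v)
        groups[key].append(value)  # shuffle: group the terms by row index i
    # Reduce: sum the terms of each row; rows with no stored entry give 0.
    return [sum(groups[i]) for i in range(len(v))]


if __name__ == "__main__":
    # M = [[1, 2], [3, 4]] stored as triples, v = [5, 6]
    entries = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
    print(matvec_mapreduce(entries, [5, 6]))  # [17, 39]
```

The key is the row index i, so all terms of one output element x_i meet at the same Reduce task.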
Program
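Matrix_Mapper.py
#!/usr/bin/env python
import sys

# Each input line is expected as: matrix name, row, column, value
# (for example "a,0,1,5"); this matches how the reducer reads the fields.
for line in sys.stdin:
    line = line.strip()
    entry = line.split(",")
    key = entry[0]
    value = line
    # Emit the matrix name as the key and the full record as the value.
    if key == 'a':
        print('{0}\t{1}'.format(key, value))
    elif key == 'b':
        print('{0}\t{1}'.format(key, value))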
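Matrix_Reducer.py
#!/usr/bin/env python
import sys

a = {}  # entries of matrix a, keyed by (row, column)
b = {}  # entries of matrix b, keyed by (row, column)

for input_line in sys.stdin:
    input_line = input_line.strip()
    if not input_line:
        continue  # skip blank lines
    this_key, value = input_line.split("\t", 1)
    v = value.split(",")
    if this_key == 'a':
        a[(int(v[1]), int(v[2]))] = int(v[3])
    elif this_key == 'b':
        b[(int(v[1]), int(v[2]))] = int(v[3])

# Assumption: the listing stops after the input loop. The lines below are one
# possible completion; the matrix sizes are inferred from the indices read.
rows = sorted({i for i, _ in a})
inner = sorted({j for _, j in a})
cols = sorted({k for _, k in b})
for i in rows:
    for k in cols:
        total = sum(a.get((i, j), 0) * b.get((j, k), 0) for j in inner)
        print('({0},{1})\t{2}'.format(i, k, total))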
Output
(0,0) 30
(0,1) 51
(0,2) -5
(0,3) 15
(0,4) 14
(1,0) 15
(1,1) -12
(1,2) -25
(1,3) 12
(1,4) 28
(2,0) 50
(2,1) 65
(2,2) 5
(2,3) 33
(2,4) -26
(3,0) -5
(3,1) 2
(3,2) -3
(3,3) -6
(3,4) 16
EXPERIMENT NO: 5
EXPERIMENT NO: 6
Theory:
Filtering Streams
In the technique known as Bloom filtering, we use main memory as a bit array. A Bloom
filter consists of:
1. An array of n bits, initially all 0’s.
2. A collection of hash functions h1, h2, . . . , hk. Each hash function maps “key” values
to n buckets, corresponding to the n bits of the bit-array.
3. A set S of m key values.
The purpose of the Bloom filter is to allow through all stream elements whose keys are in
S, while rejecting most of the stream elements whose keys are not in S. To initialize the
bit array, begin with all bits 0. Take each key value in S and hash it using each of the k
hash functions. Set to 1 each bit that is hi(K) for some hash function hi and some key
value K in S. To test a key K that arrives in the stream, check that all of
h1(K), h2(K), . . . , hk(K) are 1’s in the bit-array. If all are 1’s, then let the stream element
through. If one or more of these bits are 0, then K could not be in S, so reject the stream
element.
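The initialize-and-test procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the k hash functions are simulated by salting SHA-256 with the function index, and the class and method names (BloomFilter, add, might_contain, filter_stream) are chosen for this sketch only.

```python
import hashlib


class BloomFilter:
    def __init__(self, n_bits, k):
        self.n_bits = n_bits        # n: size of the bit array
        self.k = k                  # k: number of hash functions
        self.bits = [0] * n_bits    # begin with all bits 0

    def _positions(self, key):
        """Yield h1(key), ..., hk(key) as positions in the bit array."""
        for i in range(self.k):
            salted = "{}:{}".format(i, key).encode("utf-8")
            yield int(hashlib.sha256(salted).hexdigest(), 16) % self.n_bits

    def add(self, key):
        """Set to 1 every bit that some hash function maps the key to."""
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """True if all k bits are 1 (the key may be in S), else False."""
        return all(self.bits[pos] for pos in self._positions(key))


def filter_stream(stream, bloom):
    """Let through only the stream elements that pass the filter."""
    return [key for key in stream if bloom.might_contain(key)]


if __name__ == "__main__":
    bloom = BloomFilter(n_bits=100, k=4)
    for key in ["bloom", "blossom", "bonus"]:
        bloom.add(key)
    print(filter_stream(["bloom", "bonus", "nuke"], bloom))
```

A key in S always passes, because all of its bits were set during initialization. A key not in S is usually rejected, but it can pass as a false positive if other keys happen to have set all of its bits.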
Example:
Program
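#include <bits/stdc++.h>
#define ll long long
using namespace std;

// hash 1: sum of the character codes, modulo the array size
int h1(string s, int arrSize)
{
    ll int hash = 0;
    for (int i = 0; i < s.size(); i++)
    {
        hash = (hash + ((int)s[i]));
        hash = hash % arrSize;
    }
    return hash;
}

// hash 2: characters weighted by powers of 19
int h2(string s, int arrSize)
{
    ll int hash = 1;
    for (int i = 0; i < s.size(); i++)
    {
        hash = hash + pow(19, i) * s[i];
        // Assumption: the listing is cut off after the line above; the rest
        // of h2 is completed following the pattern of the other hashes.
        hash = hash % arrSize;
    }
    return hash % arrSize;
}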
// hash 3: polynomial rolling hash with multiplier 31
int h3(string s, int arrSize)
{
    ll int hash = 7;
    for (int i = 0; i < s.size(); i++)
    {
        hash = (hash * 31 + s[i]) % arrSize;
    }
    return hash % arrSize;
}

// hash 4
int h4(string s, int arrSize)
{
    ll int hash = 3;
    int p = 7;
    for (int i = 0; i < s.size(); i++) {
        hash += hash * 7 + s[0] * pow(p, i);
        hash = hash % arrSize;
    }
    return hash;
}

// lookup operation
bool lookup(bool* bitarray, int arrSize, string s)
{
    int a = h1(s, arrSize);
    int b = h2(s, arrSize);
    int c = h3(s, arrSize);
    int d = h4(s, arrSize);
    // s may be present only if all four bits are set
    if (bitarray[a] && bitarray[b] && bitarray[c]
        && bitarray[d])
        return true;
    else
        return false;
}

// insert operation
void insert(bool* bitarray, int arrSize, string s)
{
    // check if the element is already present or not
    if (lookup(bitarray, arrSize, s))
        cout << s << " is Probably already present" << endl;
    else
    {
        int a = h1(s, arrSize);
        int b = h2(s, arrSize);
        int c = h3(s, arrSize);
        int d = h4(s, arrSize);
        bitarray[a] = true;
        bitarray[b] = true;
        bitarray[c] = true;
        bitarray[d] = true;
    }
}

// Driver Code
int main()
{
    bool bitarray[100] = { false };
    int arrSize = 100;
    string sarray[33]
        = { "abound", "abounds", "abundance",
            "abundant", "accessible", "bloom",
            "blossom", "bolster", "bonny",
            "bonus", "bonuses", "coherent",
EXPERIMENT NO: 7
Aim: Social Network Analysis using R (for example: Community Detection Algorithm)
Theory
Social network analysis is the process of exploring or examining a social structure by
using graph theory. It is used for measuring and analyzing the structural properties of a
network. It helps to measure relationships and flows between groups, organizations and
other connected entities.
● A network is represented as a graph which shows links between each vertex and
its neighbors
● A line indicating a link between vertices is called an edge
● A group of vertices that are mutually reachable by following edges on the graph is
called a component
● The edges followed from one vertex to another are called a path
● R software
● Package
○ igraph
○ sna(social network analysis)
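The graph terms listed above (vertex, edge, path, component) can be illustrated with a small Python sketch. The lab itself uses R with igraph; this block is only an assumed stand-alone illustration, and the helper names build_graph, shortest_path and components are made up for it.

```python
from collections import deque


def build_graph(edges):
    """Store an undirected graph as {vertex: set of neighbouring vertices}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the vertices on a path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nb in sorted(graph[node]):
            if nb not in parents:
                parents[nb] = node
                queue.append(nb)
    return None  # goal is not reachable from start


def components(graph):
    """Return the groups of mutually reachable vertices, each sorted."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        result.append(sorted(comp))
    return result


if __name__ == "__main__":
    g = build_graph([("A", "B"), ("B", "C"), ("D", "E")])
    print(components(g))               # [['A', 'B', 'C'], ['D', 'E']]
    print(shortest_path(g, "A", "C"))  # ['A', 'B', 'C']
```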
A community, with respect to graphs, can be defined as a subset of nodes that are densely
connected to each other and loosely connected to the nodes in the other communities in
the same graph. Detecting communities in a network is one of the most important tasks in
network analysis. In a large-scale network such as an online social network, we could
have millions of nodes and edges. Detecting communities in such networks becomes a
herculean task. Therefore, we need community detection algorithms that can partition the
network into multiple communities.
Under the Girvan-Newman algorithm, the communities in a graph are discovered
by iteratively removing the edges of the graph based on their edge betweenness
centrality: the edge with the highest edge betweenness is removed first.
The edge betweenness centrality (EBC) can be defined as the number of shortest paths
that pass through an edge in a network. Each and every edge is given an EBC score based
on the shortest paths among all the nodes in the graph.
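The idea of removing the edge with the highest edge betweenness can be sketched in plain Python. This is a simplified illustration, not the igraph implementation used in the program: it assumes an undirected, unweighted graph without self-loops, it stops after the first split into more components, and the function names are chosen for this sketch.

```python
from collections import deque


def edge_betweenness(graph):
    """EBC of every edge, computed with one BFS per source vertex."""
    score = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for source in graph:
        dist, sigma, preds = {source: 0}, {source: 1.0}, {source: []}
        order, queue = [], deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb], sigma[nb], preds[nb] = dist[node] + 1, 0.0, []
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    sigma[nb] += sigma[node]   # count shortest paths to nb
                    preds[nb].append(node)
        delta = {node: 0.0 for node in order}
        for node in reversed(order):           # farthest vertices first
            for p in preds[node]:
                credit = sigma[p] / sigma[node] * (1.0 + delta[node])
                score[frozenset((p, node))] += credit
                delta[p] += credit
    # Every pair of vertices was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(graph):
    """Return the sets of mutually reachable vertices."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        result.append(comp)
    return result


def girvan_newman_split(edges):
    """Remove highest-EBC edges until the graph gains a component."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    start_count = len(components(graph))
    while len(components(graph)) == start_count:
        scores = edge_betweenness(graph)
        if not scores:
            break                      # no edges left to remove
        u, v = max(scores, key=scores.get)
        graph[u].discard(v)
        graph[v].discard(u)
    return sorted(sorted(c) for c in components(graph))


if __name__ == "__main__":
    # Two triangles joined by the single bridge edge 3-4.
    edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
    print(girvan_newman_split(edges))  # [[1, 2, 3], [4, 5, 6]]
```

In the example, all 9 shortest paths between the two triangles pass through edge 3-4, so it has the highest score and is removed first, leaving the two triangles as communities.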
Program
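# Load the igraph package used by the functions below
library(igraph)
# Assumption: g1 is an igraph graph and y is a two-column data frame of
# edges (from, to); both are expected to be created before this listing.
# Network measures
degree(g1, mode='all')
degree(g1, mode='in')
degree(g1, mode='out')
# Create network
net <- graph.data.frame(y,directed=F)
V(net)$label <- V(net)$name
V(net)$degree <- degree(net)
# Network diagram
plot(net)
# Assumption: 'as' holds the authority scores of net; it is not defined in
# this listing, so it is computed here.
as <- authority_score(net)$vector
plot(net,
     vertex.size=as*30,
     main = 'Authorities',
     vertex.color = rainbow(52),
     edge.arrow.size=0.1,
     layout = layout.kamada.kawai)
par(mfrow=c(1,1))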
# Community detection
net <- graph.data.frame(y, directed = F)
cnet <- cluster_edge_betweenness(net)
plot(cnet,net,vertex.size=10,vertex.label.cex=0.8)
Output
EXPERIMENT NO: 8
Mini Project: One real life large data application to be implemented (Use standard
- Streaming data analysis – use Flume for data capture, Hive/PySpark for analysis
of Twitter data, chat data, weblog analysis etc.
Examples :