BDA Module Wise Important Questions
A quick reference for students to all the documents and numericals for the subject
Revision of all the topics, module-wise (supported with good examples related to real-world applications, graphs, tables, and assumptions suitable to the questions)
1: Introduction to
https://docs.google.com/document/d/16Gdf3qg2IFm07JqT7p-11noLwHX4bCSM/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
8. Difference between SQL and NoSQL
https://docs.google.com/document/d/1OCdkF-Wl8afBuXaEOYj81UBQWsvAXgU-v9qdWlKMW_Q/edit?usp=sharing
3. MapReduce Paradigm

Introduction
Matrix Multiplications (any method can be asked):
1) Matrix-vector multiplication
2) 1-step matrix-matrix multiplication
3) 2-step matrix-matrix multiplication
(pseudocodes)
https://drive.google.com/file/d/1akJU_l1T-X_Bg7usCk5B-JWNCYDWL9M9/view?usp=sharing
https://docs.google.com/document/d/15yPb7OyTGT8QSNdYRXfaQ-yX6nEqJFor/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
https://drive.google.com/file/d/15nHeLdkHkKm8xb492OuFseOORyBUBcsG/view?usp=sharing
MapReduce: Word count problems
https://drive.google.com/file/d/1J43oANUTDboht12Tizz6w_GEUEjr8f2z/view?usp=sharing

1. Relational-Algebra Operations - join, union, intersection, Grouping and Aggregation
https://docs.google.com/presentation/d/1jeqEZ-GEfvN4qYHcw4FF32Be2O0m2KIc/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
https://medium.com/swlh/relational-operations-using-mapreduce-f49e8bd14e31#:~:text=For%20example%2C%20If%20a%20relation,values%20and%20output%20the%20result.
2. Grouping by Key
3. Shuffle, Sort and Reduce Tasks
4. Combiners
5. Details of MapReduce Execution
6. Coping With Node Failures
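
For the word count and matrix-vector multiplication problems listed in this module, the map, group-by-key, and reduce steps can be sketched in a single Python process. This is only a study aid, not Hadoop code: `run_mapreduce`, `wc_map`, `make_mv_map`, and the sample inputs are hypothetical names and data chosen for illustration, and the vector is assumed to fit in memory on every mapper.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce round: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):      # Map task emits (key, value) pairs
            groups[key].append(value)          # Grouping by key (the shuffle)
    # Keys are visited in sorted order, mimicking the sort before the reduce tasks
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


# --- Word count ---
def wc_map(line):
    for word in line.lower().split():
        yield word, 1


def wc_reduce(word, counts):
    return sum(counts)


# --- Matrix-vector multiplication ---
def make_mv_map(v):
    """Build a mapper for sparse entries (i, j, m_ij); assumes v fits in memory."""
    def mv_map(entry):
        i, j, m_ij = entry
        yield i, m_ij * v[j]                   # key is the row index i
    return mv_map


def mv_reduce(i, products):
    return sum(products)                       # x_i = sum over j of m_ij * v_j


word_counts = run_mapreduce(["big data is big", "data streams"], wc_map, wc_reduce)

M = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]   # the matrix [[1, 2], [3, 4]]
v = [5, 6]
mv_result = run_mapreduce(M, make_mv_map(v), mv_reduce)

print(word_counts)
print(mv_result)
```

The same `run_mapreduce` helper can be reused for the relational-algebra operations by changing only the mapper and reducer, for example emitting the join attribute as the key.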
4. Mining Big Data Streams

1) The Bloom Filter: (whether the value is present or not)
https://docs.google.com/presentation/d/1ku40VHl8xGJMp71EiKOAKJ5MTrjSXXu0/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
2) Flajolet-Martin algorithm (FM algorithm): (number of distinct elements)
https://docs.google.com/presentation/d/1ktgOTlkuz0B2XeaIYC8K0laS27VAQPOW/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
https://drive.google.com/file/d/1s9DzRjMPFhLVu91izW_UZDp-1WekEp2y/view?usp=sharing
https://docs.google.com/presentation/d/1cg_7NGbuFmT00fl3W_HIdMH3-6xtmPX8tvY2QSORJ6A/edit?usp=sharing

1. The Stream Data Model: A Data-Stream-Management System (DSMS)
2. Difference between DSMS and DBMS
3. Issues in Stream Processing
4. Sampling Techniques
5. (FM algorithm, DGIM, Bloom filter)
https://docs.google.com/presentation/d/1l2380QtA8XPsUGDzX4MaxWAtK4Gt1NUB/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
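
The Bloom filter and Flajolet-Martin topics in this module can be illustrated with a minimal Python sketch. The hash construction (SHA-256 with a seed prefix), the filter size, and the names `BloomFilter`, `trailing_zeros`, and `fm_estimate` are assumptions made for illustration; exam numericals usually supply their own small hash functions, such as `h(x) = (ax + b) mod m`.

```python
import hashlib


def seeded_hash(item, seed):
    """Deterministic 64-bit hash; the seed selects one of several hash functions."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class BloomFilter:
    """Answers 'possibly present' or 'definitely not present'."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        return [seeded_hash(item, s) % self.num_bits for s in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False negatives are impossible; false positives are possible
        return all(self.bits[pos] for pos in self._positions(item))


def trailing_zeros(n):
    """Number of 0 bits at the right end of n (0 is treated as having none here)."""
    if n == 0:
        return 0
    return (n & -n).bit_length() - 1


def fm_estimate(stream, seed=0):
    """Flajolet-Martin with one hash: estimate distinct count as 2^R,
    where R is the largest tail length (trailing zeros) seen."""
    max_tail = 0
    for item in stream:
        max_tail = max(max_tail, trailing_zeros(seeded_hash(item, seed)))
    return 2 ** max_tail


bf = BloomFilter()
for word in ["apple", "banana"]:
    bf.add(word)

print(bf.might_contain("apple"))
print(fm_estimate(["a", "b", "a", "c", "b"]))
```

A single hash function gives a coarse estimate that is always a power of two; in practice several hash functions are combined, for example by taking the median of group averages.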
5. Big Data Mining Algorithms

1) Algorithm of Park, Chen, and Yu (PCY algorithm for frequent items)
https://docs.google.com/presentation/d/1Ohd9p4evmdxSckaG-J-Ap3j7Htnyazc5/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
https://drive.google.com/file/d/1s9vOThD7tX8Ix5VkFKsnJddnQHHmzK1t/view?usp=sharing
2) Distance measure problems: Jaccard distance, cosine distance, Hamming distance, edit distance
https://docs.google.com/presentation/d/1pBxucVb7NvGwvzrV-y7CZ-nAu2ym5UvF/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true

1. CURE algorithm
2. SON algorithm
3. Canopy Clustering
https://drive.google.com/file/d/1tBWgvUnYNZEZNA9jiBom0yJi6Cua-bxN/view?usp=sharing
4. PCY algorithm problem
https://drive.google.com/file/d/1miG1Pr2QCqV5jHZ7HsFM5uzEixp1tqe1/view?usp=sharing
5. Multistage and multi-hash algorithm:
https://docs.google.com/document/d/11CxoSopZA4aR1Sh6x7Ee69mVOtESHVR2CUtvCqqEpLU/edit?usp=sharing
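
For the distance measure problems listed in this module, the four measures can be checked with a short Python sketch. Two conventions are assumed here and may differ from the linked notes: cosine distance is reported as an angle in degrees, and edit distance counts insertions and deletions only (so it equals the two lengths minus twice the longest common subsequence). If substitutions are also allowed, the edit distance can be smaller.

```python
import math


def jaccard_distance(a, b):
    """1 minus |intersection| / |union| of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)


def cosine_distance(x, y):
    """Angle between two non-zero vectors, in degrees."""
    dot = sum(p * q for p, q in zip(x, y))
    norms = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    cosine = max(-1.0, min(1.0, dot / norms))   # guard against rounding drift
    return math.degrees(math.acos(cosine))


def hamming_distance(x, y):
    """Number of positions in which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(p != q for p, q in zip(x, y))


def edit_distance(s, t):
    """Insert/delete-only edit distance: len(s) + len(t) - 2 * LCS(s, t)."""
    prev = [0] * (len(t) + 1)                   # LCS lengths for the previous row
    for ch in s:
        curr = [0]
        for j, other in enumerate(t, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return len(s) + len(t) - 2 * prev[-1]


print(jaccard_distance({1, 2, 3}, {2, 3, 4}))
print(cosine_distance([1, 0], [0, 1]))
print(hamming_distance("10101", "11110"))
print(edit_distance("abcde", "acfdeg"))
```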
6. Big Data Analytics Applications

1) PageRank using teleportation (damping factor beta)
https://drive.google.com/file/d/1D8HpTMTubxGcfJE0xJmhjKi0Njzr8zeC/view?usp=sharing
https://drive.google.com/file/d/1G2VxVrkTd9R5xundYVZT0zGFjLAU68b2/view?usp=sharing
sharing
Correct answer of Jan 2023 question
https://drive.google.com/file/d/182qjH6Nkqqzb9f6lPnKnrYq1uKGuE3Ql/view?usp=sharing
3) Girvan-Newman algorithm - community finding
https://docs.google.com/presentation/d/1AEr84eiDQgds3dAmAY6FiyH7XOjvOMQV/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
Problem
https://drive.google.com/file/d/1EnjiyOl1dxkhmSl_vJPMxt5EPp97RuNm/view?usp=sharing
4) Clique Percolation and community finding algorithm
https://drive.google.com/file/d/1yAfUPcfuIPLIf1WidOJJdpFncY7QEwp6/view?usp=sharing
https://drive.google.com/file/d/1unoDwqxxDM0_L9ytBudYcmYTA57bme0r/view?usp=sharing

1. Structure of the web - Bow-tie structure
2. Content-Based Recommendations & Collaborative Filtering
https://docs.google.com/presentation/d/1ssk5AIyKsAnEBjJXzmmvrahAyB8KoQ4j/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
3. HITS algorithm - hubs and authority explanation
B6-WlW6Fs_DGJlbtq8MP8BKXA8-9trNT/edit?usp=sharing&ouid=111378912346065138505&rtpof=true&sd=true
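
For the PageRank with teleportation problems in this module, the iteration can be sketched in Python as below. The update used is new_rank = beta * M * rank + (1 - beta) / n. The value beta = 0.85, the fixed iteration count, the even redistribution of rank from dead ends, and the sample graph are all assumptions for illustration; exam questions normally state beta and the number of iterations explicitly.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Power iteration with teleportation.

    links maps each page to the list of pages it links to; every target
    is assumed to also appear as a key of links.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}          # start from the uniform vector

    for _ in range(iterations):
        # Teleport share: every page receives (1 - beta) / n
        new_rank = {node: (1.0 - beta) / n for node in nodes}
        for node in nodes:
            out_links = links[node]
            if out_links:
                share = beta * rank[node] / len(out_links)
                for target in out_links:
                    new_rank[target] += share
            else:
                # Dead end: spread its rank evenly so the total stays 1
                share = beta * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}     # hypothetical 3-page web
ranks = pagerank(graph)
for page, score in ranks.items():
    print(page, round(score, 4))
```

For hand-worked numericals, running the function with `iterations=1` or `iterations=2` gives intermediate vectors that can be compared against each step of the written solution.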