University of Mumbai Examination 2020 Under Cluster - 4 - (Lead College: PCE-New Panvel)

This document contains a past exam for a Big Data and Analytics course at the University of Mumbai. The exam contains 20 multiple choice questions testing concepts related to Hadoop, MapReduce, NoSQL databases, data streams, and graph algorithms like PageRank. It also has two descriptive sections (answer any two of three questions in each) covering Hadoop ecosystem components, relational algebra operations in MapReduce, NoSQL data architectures, distance metrics, the DGIM algorithm, and PageRank.

Uploaded by

yo fire


University of Mumbai

Examination 2020 under cluster _4_ (Lead College: PCE-New Panvel)

Program: BE Computer Engineering
Curriculum Scheme: Rev 2016
Examination: BE Semester VII
Course Code: CSDLO7032 Course Name: Big Data & Analytics
Time: 2 hours Max. Marks: 80
=====================================================================
Q1. Choose the correct option for the following questions. All the questions are
compulsory and carry equal marks.

1.(s) Which of the following tools is designed for efficiently transferring bulk data
between Apache Hadoop and structured datastores such as relational databases?

Option A: Apache Sqoop

Option B: Pig
Option C: Mahout
Option D: Flume

2.(s) Point out the wrong statement

Option A: Replication Factor can be configured at a cluster level (Default is set to 3) and
also at a file level
Option B: Block Report from each DataNode contains a list of all the blocks that are stored
on that DataNode
Option C: User data is stored on the local file system of DataNodes
Option D: DataNode is aware of the files to which the blocks stored on it belong to

3.D Point out the correct statement:

Option A: DataNode is the slave/worker node and holds the user data in the form of Data
Blocks
Option B: Each incoming file is broken into 32 MB by default
Option C: Data blocks are replicated across different nodes in the cluster to ensure a low
degree of fault tolerance
Option D: DataNode is master node and holds the meta data details

4. The output of a mapper task is

Option A: The Key-value pair of all the records of the dataset.
Option B: The Key-value pair of all the records from the input split processed by the mapper
Option C: Only the sorted Keys from the input split
Option D: The number of rows processed by the mapper task

5.M Which of the following operations can’t use the Reducer as a combiner?

Option A: Group by Minimum
Option B: Group by Maximum
Option C: Group by Count
Option D: Group by Average
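
To see why averaging behaves differently from the other aggregates listed here, the sketch below compares a combiner that reuses the reducer's averaging logic with one that forwards (sum, count) pairs instead. The split contents and the helper name `mean` are illustrative assumptions, not part of the exam paper.

```python
# Hypothetical values for a single key, spread over two mapper input splits.
splits = [[1, 2, 3], [4, 5]]

def mean(values):
    """Plain arithmetic mean, standing in for an averaging reducer."""
    return sum(values) / len(values)

all_values = [v for split in splits for v in split]
true_mean = mean(all_values)

# Reusing the averaging reducer as a combiner: each split emits its own
# mean, and the reducer then averages those means.
naive = mean([mean(split) for split in splits])

# A combiner that emits (sum, count) pairs lets the reducer recover the mean.
partials = [(sum(split), len(split)) for split in splits]
correct = sum(s for s, _ in partials) / sum(c for _, c in partials)

# Minimum is unaffected by pre-aggregation: the min of per-split mins
# equals the global min.
min_via_combiner = min(min(split) for split in splits)

print(true_mean, naive, correct, min_via_combiner)
```

With these values the mean of per-split means differs from the true mean because the splits have different sizes, while the (sum, count) variant and the minimum agree with the direct computation.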

6.s Which of the following is a wrong statement for a document store?

Option A: Documents can contain many different key-value pairs, or key-array pairs, or
even nested documents
Option B: When compared to relational databases, Document stores are more scalable and
provide superior performance
Option C: It requires schema to be defined before you can add data
Option D: Secondary indices are available in Document store

7.D Which architecture is more suitable for NoSQL?

Option A: Shared Nothing
Option B: Shared Memory
Option C: Shared Disk
Option D: Shared All

8.s Which one is not sampling in a data stream?

Option A: Reservoir Sampling

Option B: Biased Reservoir Sampling
Option C: Concise Sampling
Option D: Cosin Sampling

9.m Define exponentially decaying window by

Option A: $\sum_{i=0}^{t-1} a_{t} (1-c)^{i}$

Option B: $\sum_{i=0}^{t-1} a_{t} (1-c)^{i}$

Option C: $\sum_{i=0}^{t} a_{t-1} (1-c)^{i}$

Option D: $\sum_{i=0}^{t-1} a_{t-1} (1-c)^{i}$

10.s A Bloom filter has 5 bits, initially 0 0 0 0 0, and uses 2 hash functions
h1(x) = x mod 5 and h2(x) = (2x+3) mod 5. What are the filter bits
after 9 followed by 11 is inserted?

Option A: 01001
Option B: 10001
Option C: 11001
Option D: 00001
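
The insertion in question 10 can be traced with a minimal sketch. It assumes bit positions are numbered 0 to 4 from left to right, which the question does not state, and the function name `bloom_insert` is hypothetical.

```python
def bloom_insert(bits, x):
    """Set the bits selected by h1(x) = x mod 5 and h2(x) = (2x + 3) mod 5."""
    for position in (x % 5, (2 * x + 3) % 5):
        bits[position] = 1

# Five-bit filter, all zeros, as in the question.
bits = [0] * 5
for value in (9, 11):
    bloom_insert(bits, value)

# Assumption: position 0 is written as the leftmost bit.
filter_string = "".join(str(b) for b in bits)
print(filter_string)  # 11001
```

Inserting 9 sets positions 4 and 1; inserting 11 sets positions 1 and 0, so three distinct bits end up set.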

11.d A stream query that is supplied to the DSMS before any relevant data has
arrived is called a

Option A: Continuous Queries

Option B: One time Queries
Option C: Adhoc Queries
Option D: Predefined Queries

12.s The angle between two points in Cosine Distance will range from
Option A: 0 to 90 degrees
Option B: 0 to 180 degrees
Option C: 0 to 360 degrees
Option D: 90 to 180 degrees

13.D Which of the following steps is not performed in the second phase of the CURE algorithm?

Option A: clustering the remaining points and output the final cluster
Option B: merge two clusters if they have a pair of representative points, one from each
cluster, that are sufficiently close.
Option C: Move each of the representative points a fixed fraction of the distance between its
location and the centroid of its cluster.
Option D: Each point P is brought from secondary storage and compared with the
representative points

14.M For the distance function, the triangle inequality guarantees the function is well-
behaved. Which of the following shows correct distance function for triangle
inequality?

Option A: d(x,y) = d( x, y) + d( z)
Option B: d(x,y) = d( x,y) + d(x,z)
Option C: d(x,y) = d(x,z) + d(z,y)
Option D: d(x) = d(y) + d(z)

15.s Find the correct Hamming distance between X=111111101 and Y=000111111

Option A: 4
Option B: 5
Option C: 3
Option D: 2
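
The Hamming distance in question 15 can be checked by counting the positions where the two strings differ. This is a small sketch; the function name `hamming_distance` is an assumption for illustration.

```python
def hamming_distance(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))

x = "111111101"
y = "000111111"
distance = hamming_distance(x, y)
print(distance)  # 4
```

The strings differ in the first three positions and in the eighth position.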

16.D The process of identifying similar users and recommending what similar users like
is called _________ .

Option A: Content Based Recommendation System

Option B: Collaborative Filtering
Option C: Hybrid Recommendation System
Option D: Nearest Neighbor Search

17.M The modified equation for calculating PageRank is

Option A:
Option B:
Option C:
Option D:

18.m A “clique” in a graph is a __________ .

Option A: simple sub-graph

Option B: null graph
Option C: trivial sub-graph
Option D: fully connected sub-graph

19.s The _______ consists of pages that could reach the SCC by following links, but
were not reachable from the SCC.

Option A: out-component
Option B: in-component
Option C: Tendrils
Option D: Tubes

20.D The problems of dead end and spider traps are solved by a method called
__________

Option A: Stochastic Matrix

Option B: Substochastic Matrix
Option C: Taxation
Option D: Transition Matrix
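
As an illustration of how taxation handles a spider trap, the sketch below iterates the common textbook update v' = βMv + (1 − β)e/n, where a surfer follows a link with probability β and jumps to a random page otherwise. The four-page graph, β = 0.8, the iteration count, and the function name are all illustrative assumptions. The sketch also assumes every page has at least one out-link, so it does not cover dead ends.

```python
def pagerank_with_taxation(links, beta=0.8, iterations=50):
    """Iterate v' = beta * M * v + (1 - beta) / n on an adjacency dict.

    Assumes every page has at least one out-link (no dead ends).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page first receives its share of the "tax".
        new_rank = {page: (1.0 - beta) / n for page in pages}
        # Each page then passes beta times its rank evenly along its out-links.
        for source, targets in links.items():
            share = beta * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: C links only to itself, which makes it a spider trap.
links = {
    "A": ["B", "C", "D"],
    "B": ["A", "D"],
    "C": ["C"],
    "D": ["B", "C"],
}
ranks = pagerank_with_taxation(links)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

Without the (1 − β)/n term the trap page C would eventually absorb all the rank; with it, C still ranks highest but the other pages keep a nonzero share.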

Q2 Solve any Two out of Three, 10 marks each (20 Marks)

A. Explain briefly the components of the Hadoop Ecosystem with a neat diagram.
B. With appropriate examples, explain how these relational algebra operators are
solved using MapReduce functions: (i) Selection (ii) Projection (iii) Joins
C. Explain different NoSQL data architecture patterns.

Q3 Solve any Two out of Three, 10 marks each (20 Marks)

A. Find the Manhattan distance (L1-norm) and Euclidean distance (L2-norm) for the
following points: X1 = (1, 2, 2), X2 = (2, 5, 3)
B. Explain the working of the DGIM algorithm to count the number of 1's (ones) in a
data stream.
C. Explain PageRank with an example. Can a website's PageRank ever increase? What
are its chances of decreasing?
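
For the two points in Q3 part A, both norms can be computed directly from the coordinate differences. This is a minimal sketch; the function names are assumptions for illustration.

```python
import math

x1 = (1, 2, 2)
x2 = (2, 5, 3)

def manhattan(p, q):
    """L1-norm: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    """L2-norm: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

l1 = manhattan(x1, x2)
l2 = euclidean(x1, x2)
print(l1)  # 5
print(l2)  # sqrt(11), roughly 3.317
```

The coordinate differences are 1, 3, and 1, giving 1 + 3 + 1 for the L1-norm and the square root of 1 + 9 + 1 for the L2-norm.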
