BigData Questions

The document provides an overview of Big Data and Hadoop, including questions and answers related to their concepts, functionalities, and components. Key topics include the 4 V's of Big Data, Hadoop's ecosystem, MapReduce programming model, and various data management aspects. It also covers specific tools and languages associated with Hadoop, such as Hive and Pig.

Uploaded by

Gaurav Rahane

Topic / Module: Big Data overview

Q. No. 1
Question:
Which of the following is not true about Big Data?
Answer Choices
A: Hadoop ecosystem handles Big Data
B: It is represented by 4 V's
C: It references OLTP system
D: It references OLAP system.

Answer:C

Q. No. 2
Question:
Which of the following is not true about Hadoop?
Answer Choices
A: It is a distributed parallel processing ecosystem.
B: It is ideally a Datawarehouse solution
C: It can replace RDBMS systems completely
D: It is a file system

Answer:C

Q. No. 3
Question:
Which one of the following is not among the 4 V's of Big Data?
Answer Choices
A) Volume – Scale of data
B) Velocity – Analysis of streaming data
C) Variety – Different forms of data
D) Volatile – Synchronization of data

Answer:D

Q. No. 4
Question:
Which one of the following is not a Hadoop distribution?
Answer Choices
A) MapR
B) Cloudera
C) Hortonworks
D) MapReduce

Answer:D
Q. No. 5
Question:
Which one of the following is not a part of Hadoop's Ecosystem
Answer Choices
A) HDFS
B) MapReduce
C) HBase
D) MongoDB

Answer:D
Q. No. 6
Question:
Hadoop is a framework that works with a variety of related tools. Common cohorts
include:
A) MapReduce, Hive and HBase
B) MapReduce, MySQL and Google Apps
C) MapReduce, Hummer and Iguana
D) MapReduce, Heron and Trumpet

Answer:A
Q. No. 7
Question:
__________ can best be described as a programming model used to develop Hadoop-
based applications that can process massive amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) All of the mentioned
Answer:a
Q. No. 8
Question:
__________ can best be described as a programming model used to develop Hadoop-
based applications that can process massive amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) All of the mentioned
Answer:a
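
The MapReduce model named in this question can be illustrated with a tiny in-memory word count. This is only a sketch of the idea in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) are made up for illustration, and a real job would run the map and reduce steps on many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Shuffle/sort: group all values by key and return keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reduce: consolidate all counts emitted for one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle/sort and reduce over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle_and_sort(mapped))
```

Calling `word_count(["big data", "Big Hadoop"])` counts "big" twice because the map step lower-cases each word before emitting it.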

Q. No. 9
Question:
Point out the correct statement :
a) Hive is not a relational database, but a query engine that supports the parts of SQL
specific to querying data
b) Hive is a relational database with SQL support
c) Pig is a relational database with SQL support
d) All of the mentioned
Answer : a

Q. No. 10
Question:
The Pig Latin scripting language is not only a higher-level data flow language but also
has operators similar to :
a) SQL
b) JSON
c) XML
d) All of the mentioned
Answer : a

Q. No. 11
Question:
A ________ node acts as the Slave and is responsible for executing a Task assigned to
it by the JobTracker.
a) MapReduce
b) Mapper
c) TaskTracker
d) JobTracker
Answer : c

Q. No. 12
Question:
Point out the correct statement :
a) MapReduce tries to place the data and the compute as close as possible
b) Map Task in MapReduce is performed using the Mapper() function
c) Reduce Task in MapReduce is performed using the Map() function
d) All of the mentioned
Answer : a

Q. No. 13
Question:
_________ function is responsible for consolidating the results produced by each of the
Map() functions/tasks.
a) Reduce
b) Map
c) Reducer
d) All of the mentioned
Answer : a

Q. No. 14
Question:
_________ is the default Partitioner for partitioning key space.
a) HashPar
b) Partitioner
c) HashPartitioner
d) None of the mentioned
Answer : c
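
Hadoop's default partitioner, HashPartitioner, picks a reducer by masking off the sign bit of the key's hash code and taking it modulo the number of reduce tasks. The sketch below mirrors that arithmetic in Python. The hash function is an assumption made for illustration: it reproduces Java's `String.hashCode` for ASCII strings, whereas real Hadoop key types such as `Text` compute their own hash codes.

```python
def java_string_hash(s):
    """Illustrative stand-in for hashCode(): Java's String.hashCode as a signed 32-bit int."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep only the low 32 bits
    return h - 0x100000000 if h >= 0x80000000 else h


def get_partition(key, num_reduce_tasks):
    """Clear the sign bit, then take the remainder to pick a reducer index."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks
```

Because the result depends only on the key, every record with the same key is sent to the same reducer.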

Q. No. 15
Question:
Input to the _______ is the sorted output of the mappers.
a) Reducer
b) Mapper
c) Shuffle
d) All of the mentioned
Answer : a

Q. No. 16
Question:
Point out the wrong statement :
a) Reducer has 2 primary phases
b) Increasing the number of reduces increases the framework overhead, but increases
load balancing and lowers the cost of failures
c) It is legal to set the number of reduce-tasks to zero if no reduction is desired
d) The framework groups Reducer inputs by keys (since different mappers may have
output the same key) in sort stage
Answer : a

Q. No. 17
Question:
Which of the following phases occur simultaneously ?
a) Shuffle and Sort
b) Reduce and Sort
c) Shuffle and Map
d) All of the mentioned
Answer : a

Q. No. 18
Question:
_________ is the primary interface for a user to describe a MapReduce job to the
Hadoop framework for execution.
a) Map Parameters
b) JobConf
c) MemoryConf
d) None of the mentioned
Answer : b

Q. No. 19
Question:
Which of the following phases occur simultaneously ?
a) Shuffle and Sort
b) Reduce and Sort
c) Shuffle and Map
d) All of the mentioned
Answer: a

Q. No. 20
Question:
The need for data replication can arise in various scenarios like :
a) Replication Factor is changed
b) DataNode goes down
c) Data Blocks get corrupted
d) All of the mentioned
Answer :d

Q. No. 21
Question:
________ is the slave/worker node and holds the user data in the form of Data Blocks.
a) DataNode
b) NameNode
c) Data block
d) Replication
Answer :a

Q. No. 22
Question:
The daemons associated with the MapReduce phase are ________ and task-trackers.
a) job-tracker
b) map-tracker
c) reduce-tracker
d) All of the mentioned
Answer :a

Q. No. 23
Question:
The JobTracker pushes work out to available _______ nodes in the cluster, striving to
keep the work as close to the data as possible
a) DataNodes
b) TaskTracker
c) ActionNodes
d) All of the mentioned
Answer :b

Q. No. 24
Question:
InputFormat class calls the ________ function and computes splits for each file and
then sends them to the jobtracker.
a) puts
b) gets
c) getSplits
d) All of the mentioned
Answer :c

Q. No. 25
Question:
InputFormat class calls the ________ function and computes splits for each file and
then sends them to the jobtracker.
a) puts
b) gets
c) getSplits
d) All of the mentioned
Answer :c
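
The idea behind computing splits can be sketched as follows. This is a simplified illustration loosely modeled on how file-based input formats divide a file into byte ranges; the function names and default values are assumptions, and it ignores block locations, non-splittable files and empty files.

```python
SPLIT_SLOP = 1.1  # assumption: the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Clamp the block size between the configured minimum and maximum split size."""
    return max(min_size, min(max_size, block_size))


def get_splits(file_length, block_size):
    """Return (offset, length) pairs that together cover file_length bytes."""
    split_size = compute_split_size(block_size)
    splits = []
    remaining = file_length
    # Cut full-size splits while clearly more than one split's worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits.append((file_length - remaining, split_size))
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining > 0:
        splits.append((file_length - remaining, remaining))
    return splits
```

With a block size of 128, a 300-byte file yields three splits, while a 140-byte file stays as one split because the leftover 12 bytes fall within the slop allowance.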

Q. No. 26
Question:
On a tasktracker, the map task passes the split to the createRecordReader() method on
InputFormat to obtain a _________ for that split.
a) InputReader
b) RecordReader
c) OutputReader
d) None of the mentioned
Answer :b

Q. No. 27
Question:
The default InputFormat is __________, which treats each line of input as a new value;
the associated key is the byte offset of that line.
a) TextFormat
b) TextInputFormat
c) InputFormat
d) All of the mentioned
Answer :b
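
The record layout described in this question, where the key is a byte offset and the value is a line of text, can be sketched in a few lines of Python. The function name `text_records` is hypothetical, and the sketch reads from an in-memory byte string rather than from HDFS.

```python
def text_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is where the line starts; the value is the line without its terminator.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)
```

For the input `b"big data\nhadoop\n"`, the second record has key 9 because the first line occupies 9 bytes including its newline.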

Q. No. 28
Question:
__________ controls the partitioning of the keys of the intermediate map-outputs.
a) Collector
b) Partitioner
c) InputFormat
d) None of the mentioned
Answer :b

Q. No. 29
Question:
Output of the mapper is first written on the local disk for sorting and _________
process.
a) shuffling
b) secondary sorting
c) forking
d) reducing
Answer :a

Q. No. 30
Question:
The __________ is a framework-specific entity that negotiates resources from the
ResourceManager
a) NodeManager
b) ResourceManager
c) ApplicationMaster
d) All of the mentioned
Answer :c

Q. No. 31
Question:
Apache Hadoop YARN stands for :
a) Yet Another Reserve Negotiator
b) Yet Another Resource Network
c) Yet Another Resource Negotiator
d) All of the mentioned
Answer :c

Q. No. 32
Question:
The ____________ is the ultimate authority that arbitrates resources among all the
applications in the system.
a) NodeManager
b) ResourceManager
c) ApplicationMaster
d) All of the mentioned
Answer :b

Q. No. 33
Question:
The __________ is responsible for allocating resources to the various running
applications subject to familiar constraints of capacities, queues etc.
a) Manager
b) Master
c) Scheduler
d) None of the mentioned
Answer :c

Q. No. 34
Question:
ZooKeeper allows distributed processes to coordinate with each other through registers,
known as :
a) znodes
b) hnodes
c) vnodes
d) rnodes
Answer :a

Q. No. 35
Question:
ZooKeeper allows distributed processes to coordinate with each other through registers,
known as :
a) znodes
b) hnodes
c) vnodes
d) rnodes
Answer :a

Q. No. 36
Question:
In Hive SerDe stands for

A - Serialize and Deserialize

B - serializer and Deserializer

C - Serialize and Destruct

D - serve and destruct

Answer :B

Q. No. 37
Question:

To select all columns starting with the word 'Sell' from the table GROSS_SELL, the query
is

A - select '$Sell*' from GROSS_SELL

B - select 'Sell*' from GROSS_SELL

C - select 'sell.*' from GROSS_SELL

D - select 'sell[*]' from GROSS_SELL

Answer :C
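
The selected option treats `sell.*` as a regular expression matched against column names. The sketch below shows only that matching idea in Python; it is not Hive code, the column names are invented, and the case-insensitive match is an assumption made for the example.

```python
import re


def select_columns(columns, pattern):
    """Return the column names that match the whole regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in columns if regex.fullmatch(name)]


# Hypothetical column names, for illustration only.
columns = ["sell_price", "Sell_qty", "buy_price"]
matching = select_columns(columns, "sell.*")
```

Here `matching` contains the two columns whose names begin with "sell", and `buy_price` is left out.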

Q. No. 38
Question:
Which of the following hints is used to optimize join queries?

A - /* joinlast(table_name) */

B - /* joinfirst(table_name) */

C - /* streamtable(table_name) */

D - /* cacheable(table_name) */

Answer :C

Q. No. 39
Question:

The drawback of managed tables in Hive is

A - they are always stored under default directory

B - They cannot grow bigger than a fixed size of 100GB

C - They can never be dropped

D - They cannot be shared with other applications

Answer:D

Q. No. 40
Question:
In case of one large table and 2 small tables, for an optimized query performance

A - The largest one should be cached to memory and small ones should be streamed

B - The small Ones should be cached and large one should be streamed

C - All of the table should be cached

D - All the tables should be streamed.

Answer:B

Q. No. 41
Question:

What are collection data types in Pig

A - Tuple

B - Bag

C - Map

D - All

Answer:D

Q. No. 42
Question:
What are collection data types in Pig

A - Tuple

B - Bag

C - Map

D - All

Answer:D

Q. No. 43
Question:

How are fields referred to in Pig?

A – By Names

B – By Positional Notation

C - Both

D - None

Answer:C

Q. No. 44
Question:
How is a Bag denoted in Pig?

A - {}

B - []

C - ()

D - <>

Answer: A
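
The Pig collection types and field references covered in the last few questions have rough Python analogues, sketched below. The analogy is loose (a Pig bag is unordered, while a Python list is ordered), and the `Student` schema and its values are invented for illustration.

```python
from collections import namedtuple

# Hypothetical schema, for illustration only.
Student = namedtuple("Student", ["name", "age"])

# Tuple: an ordered set of fields, written (asha, 21) in Pig.
student = Student("asha", 21)

# A field can be read by position (like $0 in Pig) or by name.
by_position = student[0]
by_name = student.name

# Bag: a collection of tuples, written {(asha, 21), (ravi, 23)} in Pig.
bag = [Student("asha", 21), Student("ravi", 23)]

# Map: key/value pairs, written [city#Pune] in Pig.
info = {"city": "Pune"}
```

Both `by_position` and `by_name` refer to the same field, which is the point of the "Both" answer to the field-reference question.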

Total Number of Questions Generated: 44
