
Thecodingshef: Unit 2 Big Data MCQ Aktu

Uploaded by

Ahmed Mohamed
Copyright
© All Rights Reserved


Unit 2 Big Data MCQ AKTU


Hey guys, if you are preparing for the Big Data subject for the AKTU exams, these are the important Big Data MCQs, so you must go through these questions.

Note: Do not depend only on these MCQs. These questions will surely help you in the AKTU exams, so do them first, and after that, if you have enough time, study all the core topics as well.

You can also check out these subjects as well: Data Compression, Machine Learning, Image Processing, Computer Network

Unit 2: Big Data MCQ


1. Which one of the following is false about Hadoop?

a. It is a distributed framework

b. The main algorithm used in it is Map Reduce

c. It runs with commodity hardware

d. All are true

Answer: (d)

2. What license is Apache Hadoop distributed under?

a. Apache License 2.0

b. Shareware

c. Mozilla Public License

d. Commercial

Answer: (a)

3. Which of the following platforms does Apache Hadoop run on?

a. Bare metal

b. Unix-like

c. Cross-platform

d. Debian

Answer: (c)

4. Apache Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ________ storage on hosts.

a. Standard RAID levels

b. RAID

c. ZFS

d. Operating system

Answer: (b)

5. Hadoop works in

a. master-worker fashion

b. master-slave fashion

c. worker/slave fashion

d. All of the mentioned

Answer: (b)

6. Which type of data can Hadoop deal with?

a. Structured

b. Semi-structured

c. Unstructured

d. All of the above

Answer: (d)

7. Which statement is false about Hadoop?

a. It runs with commodity hardware

b. It is a part of the Apache project sponsored by the ASF

c. It is best for live streaming of data

d. None of the above

Answer: (c)

8. As compared to RDBMS, Apache Hadoop

a. Has higher data integrity

b. Does ACID transactions

c. Is suitable for read and write many times

d. Works better on unstructured and semi-structured data.

Answer: (d)

9. Hadoop can be used to create distributed clusters, based on commodity servers, that provide low-cost processing and storage for unstructured data.

a. True

b. False

Answer: (a)

10. ______ is a framework for performing remote procedure calls and data serialization.

a. Drill

b. BigTop

c. Avro

d. Chukwa

Answer: (c)

11. IBM and ________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.

a. Google Latitude

b. Android (operating system)

c. Google Variations

d. Google

Answer: (d)

12. What was Hadoop written in?

a. Java (software platform)

b. Perl

c. Java (programming language)

d. Lua (programming language)

Answer: (c)

13. Apache _______ is a serialization framework that produces data in a compact binary format.

a. Oozie

b. Impala

c. Kafka

d. Avro

Answer: (d)

14. Avro schemas describe the format of the message and are defined using ______________

a. JSON

b. XML

c. JS

d. All of the mentioned

Answer: (a)
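
Note: As a rough illustration of questions 13 and 14, an Avro schema is itself a JSON document. The sketch below only uses Python's standard json module to read such a schema; the record name, namespace and fields are made up for the example, and no Avro library is involved.

```python
import json

# Hypothetical record schema, declared the way Avro schemas are: as JSON text.
USER_SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "namespace": "example.avro",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "favorite_number", "type": ["null", "int"], "default": null}
  ]
}
"""


def field_names(schema_json):
    """Parse the schema text as ordinary JSON and list its field names."""
    schema = json.loads(schema_json)
    return [field["name"] for field in schema["fields"]]


print(field_names(USER_SCHEMA_JSON))
```

Because the schema is plain JSON, any JSON parser can inspect it; the compact binary encoding applies to the data, not to the schema text.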

15. In which languages can you code in Hadoop?

a. Java

b. Python

c. C++

d. All of the above

Answer: (d)

16. All of the following accurately describe Hadoop, EXCEPT

a. Open source

b. Real-time

c. Java-based

d. Distributed computing approach

Answer: (b)

17. __________ has the world’s largest Hadoop cluster.


a. Apple

b. Datamatics

c. Facebook

d. None of the mentioned

Answer: (c)

18. Which among the following is the default OutputFormat?

a. SequenceFileOutputFormat

b. LazyOutputFormat

c. DBOutputFormat

d. TextOutputFormat

Answer: (d)

19. Which of the following is not an input format in Hadoop?

a. ByteInputFormat

b. TextInputFormat

c. SequenceFileInputFormat

d. KeyValueInputFormat

Answer: (a)

20. What is the correct sequence of data flow in MapReduce?

a. InputFormat

b. Mapper

c. Combiner

d. Reducer

e. Partitioner

f. OutputFormat

a. abcdfe

b. abcedf

c. acdefb

d. abcdef

Answer: (b)
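
Note: The order InputFormat, Mapper, Combiner, Partitioner, Reducer, OutputFormat can be sketched as a tiny in-memory word count. This is a toy simulation in plain Python under simplifying assumptions (the combiner runs once per input line, and the partitioner is a made-up byte-sum hash); none of the function names below are Hadoop APIs.

```python
from collections import defaultdict


def input_format(text):
    """InputFormat step: turn raw text into (byte offset, line) records."""
    offset = 0
    for line in text.splitlines(keepends=True):
        yield offset, line.rstrip("\n")
        offset += len(line.encode("utf-8"))


def mapper(_offset, line):
    """Mapper step: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


def combiner(pairs):
    """Combiner step: pre-aggregate map output locally (here: per line)."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    """Partitioner step: pick a reducer for the key (toy deterministic hash)."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Reducer step: consolidate all values seen for one key."""
    return key, sum(values)


def output_format(results):
    """OutputFormat step: render tab-separated key/value lines."""
    return "\n".join(f"{key}\t{value}" for key, value in results)


def run_job(text, num_reducers=2):
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for offset, line in input_format(text):
        for key, value in combiner(mapper(offset, line)):
            partitions[partitioner(key, num_reducers)][key].append(value)
    results = []
    for partition in partitions:
        for key in sorted(partition):  # keys are sorted before reducing
            results.append(reducer(key, partition[key]))
    return output_format(sorted(results))


print(run_job("a b a\nb c\n"))
```

The partitioner sits between the combiner and the reducer because it decides which reducer receives each key.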

21. Which InputFormat uses the tab character (‘\t’)?

a. KeyValueTextInputFormat

b. TextInputFormat

c. FileInputFormat

d. SequenceFileInputFormat

Answer: (a)

Which among the following is true about SequenceFileInputFormat?

a. Key- byte offset. Value- It is the contents of the line

b. Key- Everything up to tab character. Value- Remaining part of the line after tab character

c. Key and value- Both are user-defined

d. None of the above

Answer: (c)

22. Which is key and value in TextInputFormat?

a. Key- byte offset. Value- It is the contents of the line

b. Key- Everything up to tab character. Value- Remaining part of the line after tab character

c. Key and value- Both are user-defined

d. None of the above

Answer: (a)
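
Note: The two key/value conventions from questions 21 and 22 can be imitated in a few lines of Python. This is only a sketch of the record shapes (byte offset plus line, versus text before and after the first tab); the function names are invented for the example and are not Hadoop classes.

```python
def text_input_records(data):
    """TextInputFormat style: key = byte offset of the line, value = line contents."""
    records = []
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        records.append((offset, raw_line.rstrip(b"\r\n").decode("utf-8")))
        offset += len(raw_line)  # the next key is the byte position of the next line
    return records


def key_value_text_records(data, separator="\t"):
    """KeyValueTextInputFormat style: key = text before the first tab, value = the rest."""
    records = []
    for line in data.decode("utf-8").splitlines():
        key, _, value = line.partition(separator)
        records.append((key, value))
    return records


sample = b"id1\talpha\nid2\tbeta\n"
print(text_input_records(sample))
print(key_value_text_records(sample))
```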

23. Which of the following are Built-In Counters in Hadoop?

a. FileSystem Counters

b. FileInputFormat Counters

c. FileOutputFormat Counters

d. All of the above

Answer: (d)

24. Which of the following is not an output format in Hadoop?

a. TextOutputFormat

b. ByteOutputFormat

c. SequenceFileOutputFormat

d. DBOutputFormat

Answer: (b)

25. Is it mandatory to set input and output type/format in Hadoop MapReduce?

a. Yes

b. No

Answer: (b)

26. The parameters for Mappers are:

a. Text (input)

b. LongWritable (input)

c. Text (intermediate output)

d. All of the above

Answer: (d)

27. For a 514 MB file, how many InputSplits will be created?

a. 4

b. 5

c. 6

d. 10

Answer: (b)
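
Note: The arithmetic behind question 27 can be sketched as a ceiling division. This assumes the split size equals a 128 MB block size (the Hadoop 2.0 default mentioned in question 71) and that every leftover piece becomes its own split; a real InputFormat may treat a small final remainder differently, so treat this as the textbook calculation only.

```python
import math


def count_input_splits(file_size_mb, split_size_mb=128):
    """Textbook estimate: one split per full or partial split-sized chunk."""
    return math.ceil(file_size_mb / split_size_mb)


# 514 MB = 4 full 128 MB chunks + a 2 MB remainder
print(count_input_splits(514))
```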

28. Which among the following is used to provide multiple inputs to Hadoop?

a. MultipleInputs class

b. MultipleInputFormat

c. FileInputFormat

d. DBInputFormat

Answer: (a)

29. The Mapper implementation processes one line at a time via _________ method.

a. map

b. reduce

c. mapper

d. reducer

Answer: (a)

30. The Hadoop MapReduce framework spawns one map task for each __________ generated by the InputFormat for the job.

a. OutputSplit

b. InputSplit

c. InputSplitStream

d. All of the mentioned

Answer: (b)

31. __________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.

a. MapReduce

b. Mahout

c. Oozie

d. All of the mentioned

Answer: (a)

32. ___________ part of the MapReduce is responsible for processing one or more chunks of data and producing the output results.

a. Maptask

b. Mapper

c. Task execution

d. All of the mentioned

Answer: (a)

33. ________ function is responsible for consolidating the results produced by each of the Map() functions/tasks.

a. Map

b. Reduce

c. Reducer

d. Reduced

Answer: (b)

34. The number of maps is usually driven by the total size of

a. task

b. output

c. input

d. none

Answer: (c)

35. The right number of reduces seems to be ________ multiplied by (number of nodes × maximum number of containers per node):

a. 0.65

b. 0.55

c. 0.95

d. 0.68

Answer: (c)
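
Note: The 0.95 in question 35 is a multiplier, not a count. The Hadoop MapReduce tutorial suggests 0.95 or 1.75 times (number of nodes × maximum containers per node). A minimal sketch of that rule of thumb follows; the function name and the rounding to the nearest integer are assumptions made for the example.

```python
def suggested_reduces(nodes, containers_per_node, factor=0.95):
    """Rule-of-thumb reduce count: factor * (nodes * containers per node)."""
    # Rounding to the nearest whole task is an assumption of this sketch.
    return round(factor * (nodes * containers_per_node))


print(suggested_reduces(10, 8))               # with the 0.95 factor
print(suggested_reduces(10, 8, factor=1.75))  # with the 1.75 factor
```

With 0.95 all reduces can start as soon as the maps finish; with 1.75 faster nodes run a second wave of reduces, which tends to balance load better.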

36. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they are alive.

a. Partitioner

b. OutputCollector

c. Reporter

d. All of the mentioned

Answer: (c)

37. How many major components are there in Hadoop 2.0?

a. 2

b. 3

c. 4

d. 5

Answer: (b)

38. Which of the statements is true about Pig?

a. Pig is also a data warehouse system used for analysing the Big Data stored in the HDFS

b. It uses the Data Flow Language for analysing the data

c. a and b

d. Relational Database Management System

Answer: (c)

39. Which of the following platforms does Hadoop run on?

a. Bare metal

b. Debian

c. Cross-platform

d. Unix-like

Answer: (c)

40. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.

a. Machine learning

b. Pattern recognition

c. Statistical classification

d. Artificial intelligence

Answer: (a)

41. Which node serves as the master, of which there is only one per cluster?

a. Data Node

b. NameNode

c. Data block

d. Replication

Answer: (b)

42. HDFS is organized as

a. master-worker

b. master node and slave node

c. worker/slave

d. all of the mentioned

Answer: (b)

43. The NameNode used when the primary NameNode fails is the

a. Rack

b. Data node

c. Secondary node

d. None of the mentioned

Answer: (c)

44. Which of the following scenarios may not be a good fit for HDFS?

a. HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file

b. HDFS is suitable for storing data related to applications requiring low latency data access

c. HDFS is suitable for storing data related to applications requiring low latency data access

d. None of the mentioned

Answer: (a)

45. The need for data replication can arise when:

a. Replication Factor is changed

b. DataNode goes down

c. Data Blocks get corrupted

d. All of the mentioned

Answer: (d)

46. HDFS uses only one language for implementation:

a. C++

b. Java

c. Scala

d. None of the Above

Answer: (d)

47. In YARN, which node is responsible for managing the resources?

a. Data Node

b. NameNode

c. Resource Manager

d. Replication

Answer: (c)

48. As the Hadoop framework is implemented in Java, MapReduce applications are required to be written in the Java language.

a. True

b. False

Answer: (b)
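
Note: One way non-Java code takes part in MapReduce is Hadoop Streaming, where the mapper is any program that reads lines on standard input and writes tab-separated key/value lines on standard output. Below is a minimal sketch of a word-count mapper's core logic in Python; the function name is hypothetical, and in a real streaming job the lines would come from sys.stdin rather than a list.

```python
def map_lines(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


# In a streaming job the input would be sys.stdin; a list stands in for it here.
for record in map_lines(["big data", "big cluster"]):
    print(record)
```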

49. _________ maps input key/value pairs to a set of intermediate key/value pairs.

a. Mapper

b. Reducer

c. Both Mapper and Reducer

d. None of the mentioned

Answer: (a)

50. The number of maps is usually driven by the total size of ___________

a. Inputs

b. Outputs

c. Tasks

d. None of the mentioned

Answer: (a)

51. Which file system is used by HBase?

a. Hive

b. Impala

c. Hadoop

d. Scala

Answer: (c)

52. The information mapping data blocks with their corresponding files is stored in

a. Namenode

b. Datanode

c. Job Tracker

d. Task Tracker

Answer: (a)
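
Note: The NameNode's bookkeeping (which blocks make up a file, and which DataNodes hold each block) can be pictured with a small in-memory model. The class below is a hypothetical toy, not HDFS code; it also shows why a DataNode going down creates a need for re-replication.

```python
class ToyNameNode:
    """Hypothetical model: file path -> block ids -> names of DataNodes holding them."""

    def __init__(self, replication=3):
        self.replication = replication
        self.file_blocks = {}      # path -> list of block ids
        self.block_locations = {}  # block id -> set of DataNode names

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set())

    def report_replica(self, block_id, datanode):
        """Record that a DataNode holds a copy of the block."""
        self.block_locations[block_id].add(datanode)

    def datanode_down(self, datanode):
        """Forget every replica that lived on the failed DataNode."""
        for holders in self.block_locations.values():
            holders.discard(datanode)

    def under_replicated(self):
        """Blocks with fewer live replicas than the replication factor."""
        return sorted(block for block, holders in self.block_locations.items()
                      if len(holders) < self.replication)


namenode = ToyNameNode(replication=2)
namenode.add_file("/logs/a.txt", ["blk_1", "blk_2"])
for block, node in [("blk_1", "dn1"), ("blk_1", "dn2"), ("blk_2", "dn2"), ("blk_2", "dn3")]:
    namenode.report_replica(block, node)
namenode.datanode_down("dn1")
print(namenode.under_replicated())
```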

53. In HDFS the files cannot be

a. read

b. deleted

c. executed

d. archived

Answer: (c)

54. The datanode and namenode are, respectively, which of the following?

a. Slave and Master nodes

b. Master and Worker nodes

c. Both worker nodes

d. Both master nodes

Answer: (a)

55. Hadoop is a framework that works with a variety of related tools. Common cohorts include

a. MapReduce, Hive and HBase

b. MapReduce, MySQL and Google Apps

c. MapReduce, Hummer and Iguana

d. MapReduce, Heron and Trumpet

Answer: (a)

56. What was Hadoop named after?

a. Creator Doug Cutting's favorite circus act

b. The toy elephant of Cutting's son

c. Cutting's high school rock band

d. A sound Cutting's laptop made during Hadoop's development

Answer: (b)

57. All of the following accurately describe Hadoop, EXCEPT:

a. Open source

b. Java-based

c. Distributed computing approach

d. Real-time

Answer: (d)

58. Hive also supports custom extensions written in

a. C

b. C#

c. C++

d. Java

Answer: (d)

59. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to:

a. JSON

b. XML

c. SQL

d. jQuery

Answer: (c)

60. In comparison to a relational DBMS, Hadoop

a. Has higher data integrity

b. Does ACID transactions

c. Is suitable for read and write many times

d. Works better on unstructured and semi-structured data.

Answer: (d)

61. The files in HDFS are meant for

a. Low latency data access

b. Multiple writers and modifications at

arbitrary offsets.

c. Only append at the end of file

d. Writing into a file only once.

Answer: (c)

62. The main role of the secondary namenode is to

a. Copy the filesystem metadata from primary namenode.

b. Copy the filesystem metadata from NFS stored by primary namenode

c. Monitor if the primary namenode is up and running.

d. Periodically merge the namespace image with the edit log.

Answer: (d)
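
Note: Periodically merging the namespace image with the edit log (a checkpoint) can be pictured with a toy model: the image is a snapshot of the namespace, and the edit log is the list of changes made since that snapshot. The sketch below uses a Python set of paths and made-up "create"/"delete" operations; the real on-disk formats are far richer.

```python
def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace (a set of paths)."""
    operation, path = edit
    if operation == "create":
        namespace.add(path)
    elif operation == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown edit operation: {operation}")


def checkpoint(fsimage, edit_log):
    """Merge the edit log into a copy of the image; return (new image, emptied log)."""
    merged = set(fsimage)  # copy, so the old image stays intact until the merge is done
    for edit in edit_log:
        apply_edit(merged, edit)
    return merged, []


old_image = {"/a"}
edits = [("create", "/b"), ("delete", "/a")]
new_image, new_log = checkpoint(old_image, edits)
print(sorted(new_image), new_log)
```

After a checkpoint the edit log starts out short again, which keeps a later restart from replaying a long history.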

63. The MapReduce algorithm contains three important tasks, namely __________.

a. Splitting, mapping, reducing

b. Scanning, mapping, reduction

c. Map, Reduction, decluttering

d. Cleaning, Map, Reduce

Answer: (a)

64. In how many stages does the MapReduce program execute?

a. 2

b. 3

c. 4

d. 5

Answer: (d)

65. What is the function of the Mapper in MapReduce?

a. Splitting the Data File

b. Job

c. Scanning the subblock of files

d. PayLoad

Answer: (c)

66. Although the Hadoop framework is implemented in Java, MapReduce applications need to be written in _______

a. C

b. C#

c. Java

d. None of the above

Answer: (d)

67. What is the meaning of commodity hardware in Hadoop?

a. Very cheap hardware

b. Industry standard hardware

c. Discarded hardware

d. Low specifications industry grade hardware

Answer: (d)

68. Which of the following are true for Hadoop?

a. It’s a tool for Big Data analysis

b. It supports structured and unstructured data analysis

c. It aims for vertical scaling out/in scenarios

d. Both (a) and (b)

Answer: (d)

69. Which of the following are the core components of Hadoop 2.0?

a. HDFS

b. Map Reduce

c. YARN

d. All of the above

Answer: (d)

70. Programming Language is used for real time queries.

a. TRUE

b. FALSE

Answer: (b)

71. What is the default HDFS block size for Hadoop 2.0?

a. 32 MB

b. 128 MB

c. 128 KB

d. 64 MB

Answer: (b)

72. Which of the following phases occur simultaneously?

a. Shuffle and Sort

b. Reduce and Sort

c. Shuffle and Map

d. All of the mentioned

Answer: (a)

73. Major Components of Hadoop 1.0 are:

a. HDFS and MapReduce

b. Map Reduce, HDFS and YARN

c. YARN and HDFS

d. None of the above

Answer: (a)

Copyright © TheCodingshef 2020
