BDA Question BANK

The document outlines the syllabus for a course on Big Data Analytics, including multiple-choice questions, short answer questions, and detailed topics for examination. It covers fundamental concepts such as Hadoop, MapReduce, HDFS, and MongoDB, along with their applications and advantages. Additionally, it includes practical laboratory components and assessments related to big data technologies.

Uploaded by

aburoobhastudy

20AIPC602 - BIG DATA ANALYTICS WITH LABORATORY
UNIT - I

Unit - I / Part - A / 1 Mark / MCQ


Sl. No. Questions Marks Split-up K-Level CO
1. To handle real-time analytics, IBM's big data strategy includes ____________ as a key component. 1 K1 CO1
a) Traditional databases b) Batch processing c) Stream processing d) Static methods
2. What is the primary advantage of using Hadoop for data analysis over traditional systems? 1 K2 CO1
a) Real-time data processing b) Lower hardware requirements
c) Ability to analyse small-scale datasets
d) Efficient handling of large-scale datasets
3. Which challenge of big data involves the rapid generation and continuous flow of data, demanding real-time processing capabilities? 1 K1 CO1
a) Volume b) Variety c) Velocity d) Veracity
4. _____ features a SQL-like interface with the HQL language, which works similarly to SQL and automatically translates queries into MapReduce jobs. 1 K2 CO1
a) Pig b) Hive c) Oozie d) Spark
5. ______ analytics is the use of data to understand past and current business performance and make informed decisions. 1 K2 CO1
a) Descriptive b) Predictive c) Prescriptive d) Diagnostic
6. ___________ analytics finds the best set of prices to maximize sales revenue. 1 K2 CO1
a) Descriptive b) Predictive c) Prescriptive d) Diagnostic
7. Hadoop supports which type of data? 1 K2 CO1
a) Structured data b) Unstructured data c) Semi-structured data
d) All of the mentioned
8. ________ is a high-level scripting language used to execute queries on large datasets within Hadoop. 1 K2 CO1
a) Oozie b) Pig c) Spark d) MapReduce
9. The ________ Hadoop component is used for workflow and scheduling services. 1 K2 CO1
a) MapReduce b) HDFS c) YARN d) Oozie
10. ____________ is used for programming-based data processing and ____________ is used for in-memory data processing. 1 K2 CO1
a) MapReduce, Spark b) Spark, MapReduce
c) HBase, Spark d) YARN, Oozie
11. The ________ class is provided by the Aggregate package. 1 K2 CO1
a) Reducer b) Mapper c) Aggregator d) All of the mentioned
12. Which of the following is correct? 1 K2 CO1
1) Hadoop Streaming is a utility that comes with the Hadoop distribution
2) The utility allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer
a) Option 1 b) Option 2 c) Both 1 & 2 d) None
13. OutputCollector is a generalization of the facility provided by the MapReduce framework to collect data output by the ________. 1 K2 CO1
a) Mapper b) Reducer c) Mapper or Reducer d) None
14. The Reducer has _______ primary phases. 1 K2 CO1
a) 3 b) 2 c) 4 d) 5
15. Which of the following are true about the Hadoop Distributed File System (HDFS)? 1 K2 CO1
a) It is one of the largest Apache projects and the primary storage system of Hadoop
b) It employs a NameNode and DataNode architecture
c) It is a distributed file system able to store large files, running over a cluster of commodity hardware
d) All of the mentioned
16. __________ represents a MapReduce job configuration. 1 K2 CO1
a) JobConf b) MapParameters c) MemoryConf d) All of the mentioned

Unit-I / Part-A / 2 Marks


Sl. No. Questions Marks Split-up K-Level CO
1. What do you mean by analytics? 2 K2 CO1
2. List some common examples of descriptive analytics. 2 K1 CO1
3. Name some of the common techniques used for diagnostic analytics. 2 K1 CO1
4. How is digital data distributed in the real world? 2 K1 CO1
5. What are the sources of structured data? 2 K2 CO1
6. List some of the characteristics of unstructured data. 2 K1 CO1
7. What are the advantages and disadvantages of unstructured data? 2 K1 CO1
8. What could be a possible solution for storing unstructured data? 2 K1 CO1
9. What are the problems faced in storing semi-structured data? 2 K1 CO1
10. Define Big Data. 2 K1 CO1
11. What are the 6 V's of Big Data? 2 K1 CO1
12. What is Veracity? 2 K1 CO1
13. List some of the tools used in Big Data Analytics. 2 K1 CO1
14. What is MapReduce? 2 K1 CO1
15. Write a brief note on YARN. 2 K1 CO1


Unit-I / Part-B / 10 Marks


Sl. No. Questions Marks Split-up K-Level CO
1. Explain HDFS in detail. 10 K2 CO1
2. Briefly discuss MapReduce and YARN. 10 K2 CO1
3. What are the advantages of Hadoop? Explain the Hadoop architecture and its components with a proper diagram. 10 K2 CO1
4. Explain the different types of analytics with relevant scenarios and examples. 10 K2 CO1
5. How is digital data distributed in the real world? Explain the advantages and disadvantages of each type, with possible solutions for storing and accessing each type. 10 K2 CO1
6. Write brief notes on the following: K2 CO1
i) Data analysis with Hadoop 5
ii) Hadoop Streaming 5
7. Explain the Hadoop ecosystem and its components in detail, with a proper architecture diagram. 10 K2 CO1
8. Write brief notes on the following: K2 CO1
i) Importance of Big Data Analytics 5
ii) Apache Hadoop 5
9. What is big data analytics? Explain the 6 V's of big data. Briefly discuss applications of big data. 10 K2 CO1
10. Relate how IBM incorporates the big data strategy. 10 K2 CO1


Unit - II / Part - A / 1 Mark / MCQ


Sl. No. Questions Marks Split-up K-Level CO
1. Which MongoDB operator is used to update the value of an element within an array at a specific position? 1 K2 CO2
a) $set b) $update c) $position d) array Update
2. In MongoDB, how would you create an index on an array field 'tags' to improve query performance? 1 K2 CO2
a) db.collection.createIndex({"tags": "1"})
b) db.collection.addIndex({"tags", "ascending"})
c) db.collection.ensureIndex({"tags": "asc"})
d) db.collection.index({"tags": "asc"})
3. Which MongoDB aggregation stage is used to group documents by a specific field and calculate aggregate values? 1 K1 CO2
a) $match b) $project c) $group d) $sort
4. In MongoDB, can a single collection have multiple indexes? 1 K1 CO2
a) No, MongoDB only allows one index per collection
b) Yes, a collection can have multiple indexes, each serving a different query pattern
c) Yes, but additional indexes are only allowed for system collections
d) Only if the collection is sharded
5. CQL stands for 1 K1 CO2
a) Cassandra Query Language
b) Cassandra Queue Language
c) Collection Quest Language
d) Cassandra Query Limit
6. Systems with _____ are known to have achieved replica convergence. 1 K1 CO2
a) Strong Consistency
b) Eventual Consistency
c) Both Eventual & Strong Consistency
d) None of the mentioned
7. The __________ command enables or disables tracing in the cqlsh command prompt. 1 K2 CO2
a) TRACING b) HELP c) SHOW HOST d) SOURCE
8. Cassandra was initially developed at _______ to power the inbox search feature. 1 K2 CO2
a) Facebook b) Twitter c) WhatsApp d) Instagram
9. The ______ method builds the cluster with the given contact points. 1 K2 CO2
a) Cluster init() b) Cluster start() c) Cluster run() d) Cluster build()
10. Which of the following are correct ways to create a table having one or more counter columns? 1 K2 CO2
1) Use CREATE TABLE to define the counter and non-counter columns
2) Use all non-counter columns as part of the PRIMARY KEY definition
a) Option 1 b) Option 2 c) Both option 1 and option 2 d) None

11. What is the correct syntax for deleting all documents with the "color" field equal to "red" in MongoDB? 1 K1 CO2
a) db.colorDB.deleteMany({color: "red"})
b) db.colorDB.DELETE({COLOR="red"})
c) db.deleteMany({color: "red"})
d) deleteMany({color: "red"})
12. What are the arguments required for updating a document in MongoDB? 1 K1 CO2
a) Update filter b) Update action c) Update filter, update action
d) Update condition
13. Which data type of MongoDB is used for implementing embedded documents? 1 K1 CO2
a) Boolean b) Double c) Object d) String
14. Which of the following is not a feature of NoSQL databases? 1 K2 CO2
a) Flexible schema b) Vertical scaling c) Horizontal scaling d) Fast queries
15. Which of the following statements about MongoDB indexes is true? 1 K2 CO2
a) Indexes are created automatically for all collections
b) Indexes can only be created on the _id field of a collection
c) Indexes can significantly speed up query execution
d) Indexes are only used for sorting data
16. What is the primary purpose of creating indexes in MongoDB? 1 K2 CO2
a) To store data in MongoDB collections
b) To enforce data schema validation
c) To improve query performance
d) To encrypt data in the database

17. How can you limit the number of documents returned by a MongoDB query using a cursor? 1 K1 CO2
a) By calling the limit() method on the cursor
b) By specifying a limit in the query itself
c) By using the count() method on the cursor
d) We cannot limit the number of documents with a cursor


Unit-II / Part-A / 2 Marks


Sl. No. Questions Marks Split-up K-Level CO
1. Compare and contrast NoSQL and relational databases. 2 K2 CO2

2. What are the features of MongoDB? 2 K2 CO2

3. Define indexes in MongoDB. 2 K2 CO2

4. Mention the significance of the covered query. 2 K2 CO2

5. How does MongoDB provide concurrency? 2 K2 CO2

6. What is meant by sharding and aggregation in MongoDB? 2 K2 CO2

7. What is a replica set? 2 K2 CO2

8. What is the CRUD operation process? 2 K2 CO2

9. Write the Data types supported by MongoDB. 2 K2 CO2

10. State any two query features of MongoDB. 2 K2 CO2

11. What are the main components of a MapReduce job? 2 K2 CO2

12. What is shuffling and sorting in MapReduce? 2 K2 CO2

13. What are the identity mapper and the chain mapper? 2 K2 CO2

14. How do you create an index in MongoDB? Give an example. 2 K2 CO2

15. Give the differences between the MongoDB and Cassandra NoSQL databases. 2 K2 CO2

16. Name the classifications of the different data types in Cassandra. 2 K2 CO2

17. What are keyspaces? Give the syntax to create them. 2 K2 CO2

18. What are the various CRUD operations in Cassandra? 2 K2 CO2

19. List the various collection types in Cassandra. 2 K2 CO2

20. How is the TTL command used? Give example syntax. 2 K2 CO2

Unit-II / Part-B / 10 Marks
Sl. No. Questions Marks Split-up K-Level CO
1. Explain the CRUD operations in MongoDB with examples. 10 K2 CO2

2. Discuss the functions of the MongoDB query language and database commands. 10 K2 CO2

3. Discuss the functions of Group By, partitioning and combining, using one example for each. 10 K2 CO2

4. Describe the MapReduce execution steps with a neat diagram. 10 K2 CO2

5. How does the Hadoop MapReduce data flow work for a word count program? Give an example. 10 K2 CO2

6. Explain the various operations of cursors in MongoDB with examples. 10 K2 CO2

7. Describe the various Index types and their properties. 10 K2 CO2

8. Explain the various CRUD operations in Cassandra with examples. 10 K2 CO2

9. (i) With examples, explain Collections and Counters. 10 K2 CO2

(ii) How is the Time To Live field used in real time? Explain with examples.

10. Explain various features and functions of MongoDB with examples. 10 K2 CO2

11. (i) Describe the working of the MapReduce algorithm. 5 K2 CO2

(ii) Write MapReduce code for counting occurrences of specific words in the input text files. Also write the commands to compile and run the code. 5

12. Discuss the different types and formats of MapReduce with an example for each. 10 K2 CO2

13. Explain the various CQLSH commands with syntax and examples. 10 K2 CO2

14. (i) Describe the various features of Cassandra. 5 K2 CO2

(ii) How is data imported and exported in MongoDB and Cassandra NoSQL databases? 5
