
BDA - Assignment and Submission Guidelines PDF

This document outlines an assignment for a Big Data Analytics course. It includes 6 units covering topics such as introduction to big data, Hadoop, HDFS, Hive, HBase, Pig, ZooKeeper, Spark, NoSQL and MongoDB. Students are instructed to answer one question from each unit, chosen by the last digit of their enrollment number, for a total of 6 questions. Depending on that digit, this is one of the 1st to 5th questions in units 1, 5 and 6, and one of the 1st to 10th questions in units 2, 3 and 4. The assignment is to be submitted in soft copy by November 7th, 2019 and accounts for 5 marks out of the 30 'M' component marks.

Uploaded by

jerry tom

A D Patel Institute of Technology

Department of Information Technology


Sem-7 AY: 2020-21
2171607 – Big Data Analytics
ASSIGNMENT
Unit-1: INTRODUCTION TO BIG DATA
[1] What is Big Data? What are the challenges with big data? Explain three basic types of
Big Data in brief.
[2] Define Big Data Analytics. Explain 5 applications of Big Data Analytics.
[3] Explain the architecture of HDFS.
[4] What is distributed file system? Explain important features of HDFS.
[5] Explain the ‘Four V Characteristics’ of Big Data with a suitable example.
Unit-2: INTRODUCTION TO HADOOP AND HADOOP ARCHITECTURE
[1] What is Hadoop? Explain the important components of Hadoop with a suitable diagram.
[2] Explain the working of MapReduce using the ‘WordCount’ example.
[3] List the components of the Hadoop ecosystem and briefly describe their functionality.
[4] What is data serialization? How is data serialized in Hadoop? How does Hadoop
serialization differ from Java serialization?
[5] What is scheduling? Explain any three schedulers used in Apache Hadoop.
[6] Explain the following terms with reference to the working of Hadoop:
InputSplit, InputFormat, Shuffle, Sort, Reducer, Combiner, OutputFormat,
RecordWriter, Distributed Cache
[7] How is data moved in and out of Hadoop? Explain in brief.
[8] Explain the features and key advantages of Hadoop.
[9] Differentiate the following:
RDBMS vs. Hadoop
[10] Explain the following important daemons of HDFS and MapReduce:
DataNode, NameNode and Secondary NameNode
JobTracker and TaskTracker
Unit-3: HDFS, HIVE AND HIVEQL, HBASE
[1] What is Hive? Explain its architecture and its working with suitable diagrams.
[2] Differentiate the following:
Hive vs. RDBMS
HDFS vs. HBase
HBase vs. RDBMS
[3] Differentiate the following:
Apache Pig vs. MapReduce
Pig vs. SQL
Pig vs. Hive
[4] What is Pig? Why do we need Pig? Explain the architecture of Pig and its data model.
[5] What is ZooKeeper? How does it help in monitoring a cluster?
[6] Why is Apache ZooKeeper useful? Explain the architecture of Apache ZooKeeper.
[7] Explain the following terms with reference to Apache ZooKeeper:
Ensemble, Leader, Znodes, Sessions, Watches
[8] Explain the workflow of Apache ZooKeeper with a suitable diagram.
[9] What is the importance of HBase? Explain the data model supported by HBase. Also
differentiate between row-oriented and column-oriented databases.
[10] Explain important components of HDFS.
Unit-4: SPARK
[1] What is Spark? Explain its important features.
[2] Explain important components of Spark.
[3] Explain RDD in detail.
[4] How is Spark faster than MapReduce?
[5] Explain the interactive and iterative architecture of MapReduce and Spark.
[6] Explain the architecture of Spark Streaming.
[7] Explain various data types supported by MLlib.
[8] Explain any five machine learning functionalities supported by MLlib.
[9] Explain the working of the WordCount example using Spark.
[10] Explain the types of operations supported by RDD. Explain both in brief.
Unit-5: NoSQL
[1] What is NoSQL? Where is it used?
[2] Explain various types of NoSQL databases with suitable examples.
[3] Explain the use of NoSQL databases in industry.
[4] Differentiate between SQL and NoSQL.
[5] Explain NewSQL with its characteristics, advantages and drawbacks.
Unit-6: DATABASE FOR MODERN WEB
[1] What is MongoDB? Explain the important features of MongoDB.
[2] Compare MongoDB with RDBMS. Mention advantages of MongoDB over RDBMS.
[3] Explain the following terms with reference to MongoDB, with suitable examples:
Cursor, Indexes, MongoImport, MongoExport
[4] Explain CRUD operations in MongoDB.
[5] Explain the significance of the following methods with reference to the MongoDB Query
Language:
find(), pretty(), count(), limit(), skip(), sort(), update(), insert(), save()

Submission Guidelines:

1. All students are required to answer one question from each unit, as per the guidelines in
Sr. No. 2 below. (Hence, a total of 6 questions are to be answered. The rest should be studied for
exam preparation.)
2. The question to be answered from each unit is selected based on the last digit of your
Enrolment number, i.e.:
a. If your Enr.No. ends with 1 then answer 1st question from each unit.
b. If your Enr.No. ends with 2 then answer 2nd question from each unit.
c. If your Enr.No. ends with 3 then answer 3rd question from each unit.
d. If your Enr.No. ends with 4 then answer 4th question from each unit.
e. If your Enr.No. ends with 5 then answer 5th question from each unit.
f. If your Enr.No. ends with 6 then answer 1st question from units-1,5,6 and 6th question
from units-2,3,4 respectively.
g. If your Enr.No. ends with 7 then answer 2nd question from units-1,5,6 and 7th question
from units-2,3,4 respectively.
h. If your Enr.No. ends with 8 then answer 3rd question from units-1,5,6 and 8th question
from units-2,3,4 respectively.
i. If your Enr.No. ends with 9 then answer 4th question from units-1,5,6 and 9th question
from units-2,3,4 respectively.
j. If your Enr.No. ends with 0 then answer 5th question from units-1,5,6 and 10th question
from units-2,3,4 respectively.
3. The assignment is to be submitted by 7th November, 2019 in soft copy through Microsoft
Teams.
4. The assignment carries a weightage of 5 marks out of the 30 ‘M’ component marks.
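
The selection rule in guideline 2 can be checked with a short script. The following is only a sketch, not part of the official guidelines: the function name questions_for and the sample enrolment number are hypothetical, and it assumes the unit sizes listed above (5 questions in units 1, 5 and 6; 10 questions in units 2, 3 and 4).

```python
from typing import Dict

# Units with 5 questions, per the question list above (assumption taken from the sheet).
SHORT_UNITS = (1, 5, 6)


def questions_for(enrolment_no: str) -> Dict[int, int]:
    """Map each unit number (1-6) to the question number to answer.

    Hypothetical helper illustrating guideline 2; not an official tool.
    """
    digit = int(enrolment_no.strip()[-1])
    # Units 2, 3 and 4: the last digit is the question number, with 0 meaning the 10th.
    long_q = 10 if digit == 0 else digit
    # Units 1, 5 and 6: digits 6, 7, 8, 9, 0 wrap around to questions 1 to 5.
    short_q = long_q if long_q <= 5 else long_q - 5
    return {
        unit: (short_q if unit in SHORT_UNITS else long_q)
        for unit in range(1, 7)
    }


if __name__ == "__main__":
    # "170010116047" is a made-up enrolment number ending in 7.
    for unit, question in questions_for("170010116047").items():
        print(f"Unit-{unit}: answer question [{question}]")
```

For an enrolment number ending in 7, this selects question 2 from units 1, 5 and 6 and question 7 from units 2, 3 and 4, which matches case (g) above.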
