Big Data Computing - Week-5

The document outlines the submission details and questions from Week 5 of the NPTEL Big Data Computing course assignment. It includes various questions related to distributed graph processing frameworks, data processing frameworks, and specific use cases for Big Data tools, with correct answers indicated for each question. The assignment was submitted on September 25, 2024, before the deadline.

Week 5: Assignment 5

The due date for submitting this assignment has passed.
Due on 2024-09-25, 23:59 IST.
Assignment submitted on 2024-09-25, 21:40 IST

1) What distributed graph processing framework operates on top of Spark? (1 point)
MLlib
GraphX
Spark streaming
ALL
Yes, the answer is correct.
Score: 1
Accepted Answers:
GraphX

2) Which of the following frameworks is best suited for fast, in-memory data processing and supports advanced analytics such as machine learning and graph processing? (1 point)
Apache Hadoop MapReduce
Apache Flink
Apache Storm
Apache Spark
Yes, the answer is correct.
Score: 1
Accepted Answers:
Apache Spark

3) A financial institution needs to analyze historical stock market data to predict market trends and make investment decisions. Which Big Data processing framework is best suited for this scenario? (1 point)
Apache Spark
Apache Storm
Hadoop MapReduce
Apache Flume
Yes, the answer is correct.
Score: 1
Accepted Answers:
Apache Spark

4) A telecommunications company needs to process real-time call logs from millions of subscribers to detect network anomalies. Which combination of Big Data tools would be appropriate for this use case? (1 point)
Apache Hadoop and Apache Pig
Apache Kafka and Apache HBase
Apache Spark and Apache Hive
Apache Storm and Apache Pig
No, the answer is incorrect.
Score: 0
Accepted Answers:
Apache Kafka and Apache HBase

5) Many people use Kafka as a substitute for which type of solution? (1 point)
log aggregation
compaction
collection
all of the mentioned
Yes, the answer is correct.
Score: 1
Accepted Answers:
log aggregation

6) Which of the following features of Resilient Distributed Datasets (RDDs) in Apache Spark contributes to their fault tolerance? (1 point)
DAG (Directed Acyclic Graph)
In-memory computation
Lazy-evaluation
Lineage information
Yes, the answer is correct.
Score: 1
Accepted Answers:
Lineage information
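
As a rough illustration of why lineage gives fault tolerance, the sketch below models a dataset that remembers the transformations that produced it, so lost results can be recomputed from the parent data. It is a pure-Python toy, not Spark's API: `ToyRDD`, `lose_partition`, and the sample numbers are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: it records how it was derived."""
    source: Optional[List[int]] = None      # base data, set only on a root dataset
    parent: Optional["ToyRDD"] = None       # lineage: the dataset this one came from
    transform: Optional[Callable[[List[int]], List[int]]] = None
    cache: Optional[List[int]] = None       # materialized result, may be lost

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # Record the step instead of running it immediately.
        return ToyRDD(parent=self, transform=lambda rows: [fn(r) for r in rows])

    def filter(self, pred: Callable[[int], bool]) -> "ToyRDD":
        return ToyRDD(parent=self, transform=lambda rows: [r for r in rows if pred(r)])

    def collect(self) -> List[int]:
        # If the result is missing, rebuild it by replaying the lineage.
        if self.cache is None:
            if self.parent is None:
                self.cache = list(self.source)
            else:
                self.cache = self.transform(self.parent.collect())
        return list(self.cache)

    def lose_partition(self) -> None:
        # Simulate a failure that wipes the materialized data.
        self.cache = None


base = ToyRDD(source=[1, 2, 3, 4, 5])
evens_squared = base.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

first = evens_squared.collect()
evens_squared.lose_partition()          # simulated node failure
recovered = evens_squared.collect()     # recomputed from the recorded lineage
```

In this toy model nothing is replicated: recovery works only because each dataset keeps a reference to its parent and the function applied to it.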

7) Point out the correct statement. (1 point)
Hadoop does need specialized hardware to process the data
Hadoop allows live stream processing of real-time data
In the Hadoop MapReduce programming framework output files are divided into lines or records
None of the mentioned
Yes, the answer is correct.
Score: 1
Accepted Answers:
In the Hadoop MapReduce programming framework output files are divided into lines or records

8) Which of the following statements about Apache Pig is true? (1 point)
Pig Latin scripts are compiled into HiveQL for execution.
Pig is primarily used for real-time stream processing.
Pig Latin provides a procedural data flow language for ETL tasks.
Pig uses a schema-on-write approach for data storage.
Yes, the answer is correct.
Score: 1
Accepted Answers:
Pig Latin provides a procedural data flow language for ETL tasks.
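
To make "procedural data flow" concrete, here is a minimal Python sketch of an ETL pipeline written as a sequence of named steps, each consuming the previous step's output. This is not Pig Latin and does not use Pig. The comments only point to the Pig Latin operators (FILTER, GROUP, FOREACH) that each step loosely resembles, and the records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical input records (student, course, score); not from the assignment.
records = [
    ("asha", "math", 82),
    ("ravi", "math", 45),
    ("asha", "physics", 91),
    ("ravi", "physics", 67),
]

# Step 1 (loosely like FILTER): keep only passing scores.
passing = [r for r in records if r[2] >= 50]

# Step 2 (loosely like GROUP ... BY): group the remaining scores by student.
grouped = defaultdict(list)
for student, course, score in passing:
    grouped[student].append(score)

# Step 3 (loosely like FOREACH ... GENERATE): emit one result per group.
averages = {student: sum(scores) / len(scores) for student, scores in grouped.items()}
```

Each intermediate result has a name and is built step by step, in contrast to a single declarative query that states only the desired output.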

9) An educational institution wants to analyze student performance data stored in HDFS and generate personalized learning recommendations. Which Hadoop ecosystem components should be used? (1 point)
Apache HBase for storing student data and Apache Pig for processing.
Apache Kafka for data streaming and Apache Storm for real-time analytics.
Hadoop MapReduce for batch processing and Apache Hive for querying.
Apache Spark for data processing and Apache Hadoop for storage.
Yes, the answer is correct.
Score: 1
Accepted Answers:
Apache Spark for data processing and Apache Hadoop for storage.

10) A company is analyzing customer behavior across multiple channels (web, mobile app, social media) to personalize marketing campaigns. Which technology is best suited to handle this type of data processing? (1 point)
Hadoop MapReduce
Apache Kafka
Apache Spark
Apache Hive
Yes, the answer is correct.
Score: 1
Accepted Answers:
Apache Spark
