Big Data Assignment 3 & 4

The document outlines assignments for B. Tech students in the Big Data course, covering topics such as Hadoop Distributed File System (HDFS) architecture, file writing processes, limitations of HDFS, and the Hadoop ecosystem's architecture including YARN. It also includes comparisons of Hadoop file formats and the role of NoSQL databases, specifically focusing on MongoDB indexing. Students are required to explain various components and their interactions, as well as evaluate performance and scalability improvements in Hadoop.

Uploaded by

AJAY PASWAN

RAJKIYA ENGINEERING COLLEGE, BANDA

Department of Information Technology


Assignment-3
B. Tech 3rd Year / VI Semester (2024-25)
Big Data (BCS-061)

1. Explain the architecture of the Hadoop Distributed File System (HDFS). Discuss the
roles of the NameNode, DataNodes, and Secondary NameNode, and explain how the
system ensures high availability and fault tolerance.
2. Describe the process of writing a file to HDFS from the client’s perspective. Include
in your answer the steps involved in block placement and replication, and the
interactions between the client, NameNode, and DataNodes.
3. Discuss the limitations of HDFS in terms of small-file storage and random access.
What are the causes of these limitations, and what strategies or tools (such as Hadoop
Archive (HAR) files or HBase) can be used to mitigate them?
4. Explain the architecture of the Hadoop ecosystem in detail. How do components
such as HDFS, YARN, and MapReduce interact with each other to
enable distributed data processing? Include the roles of the NameNode, DataNode,
ResourceManager, and NodeManager in your explanation.
5. Compare and contrast the different Hadoop file formats (e.g., Text, SequenceFile,
Avro, Parquet, ORC). In which scenarios would each format be most appropriate?
How do these formats affect storage efficiency, schema evolution, and data
processing speed in a Hadoop environment?
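
As a rough illustration for questions 2 and 3 above, the sketch below estimates how many blocks and block replicas a file occupies, and why many small files put more load on the NameNode than one large file of the same total size. It assumes the common defaults of a 128 MB block size (dfs.blocksize) and a replication factor of 3 (dfs.replication); both are configurable, and the function names are hypothetical helpers, not part of any Hadoop API.

```python
# Assumed defaults; real clusters may configure different values.
BLOCK_SIZE = 128 * 1024 * 1024  # dfs.blocksize, in bytes
REPLICATION = 3                 # dfs.replication


def block_count(file_size, block_size=BLOCK_SIZE):
    """Number of blocks one file is split into (integer ceiling division)."""
    return (file_size + block_size - 1) // block_size


def replica_count(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Total block replicas stored across DataNodes for one file."""
    return block_count(file_size, block_size) * replication


def namespace_objects(file_sizes, block_size=BLOCK_SIZE):
    """Simplified count of entries the NameNode tracks: one per file
    plus one per block (directories are ignored in this sketch)."""
    return len(file_sizes) + sum(block_count(s, block_size) for s in file_sizes)


GIB = 1024 ** 3
MIB = 1024 ** 2

one_large_file = [GIB]             # 1 GiB stored as a single file
many_small_files = [MIB] * 1024    # the same 1 GiB stored as 1024 files

print(block_count(GIB), replica_count(GIB))
print(namespace_objects(one_large_file), namespace_objects(many_small_files))
```

Under these assumptions the same volume of data needs far more NameNode entries when stored as small files, which is the effect that tools such as HAR files aim to reduce.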

RAJKIYA ENGINEERING COLLEGE, BANDA

Department of Information Technology

Assignment-4
B. Tech 3rd Year / VI Semester (2024-25)
Big Data (BCS-061)

1. What is YARN in the Hadoop ecosystem? Describe its architecture in detail, including the
roles of the ResourceManager, NodeManager, and ApplicationMaster, and how resource
allocation and job scheduling are managed.

2. Compare and contrast the traditional MapReduce processing model with the YARN-based
architecture. How does YARN enhance the performance and scalability of Hadoop? Provide
examples where YARN’s features significantly improve job execution.

3. Discuss the integration of Hadoop ecosystem tools (such as Hive, Pig, Spark, and HBase)
with YARN. How does YARN act as a generic resource management layer for these tools,
and what are the implications for multi-tenant workloads in a Hadoop cluster?

4. Discuss the four main types of NoSQL databases (Document, Key-Value, Column-Family,
and Graph).

5. Evaluate the role of indexing in MongoDB and its impact on query performance.
Discuss the different types of indexes available in MongoDB (e.g., single field, compound,
multikey, text, geospatial).
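
For question 5 above, the following toy model may help show why an index affects query performance. It is not MongoDB's implementation or API: it only contrasts a full collection scan with a lookup through a sorted single-field index, and counts how many documents each approach has to examine. The documents, field names, and the SingleFieldIndex class are made up for illustration.

```python
import bisect

# Hypothetical collection: 1000 student documents with a unique roll_no.
BRANCHES = ["IT", "CSE", "ME"]
docs = [{"_id": i, "roll_no": 1000 + i, "branch": BRANCHES[i % 3]}
        for i in range(1000)]


def collection_scan(docs, field, value):
    """Without an index: every document is examined."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined


class SingleFieldIndex:
    """Sorted (key, position) entries, searched with binary search."""

    def __init__(self, docs, field):
        self.docs = docs
        self.entries = sorted((d[field], pos) for pos, d in enumerate(docs))
        self.keys = [key for key, _ in self.entries]

    def find(self, value):
        # Locate the run of equal keys, then fetch only those documents.
        lo = bisect.bisect_left(self.keys, value)
        hi = bisect.bisect_right(self.keys, value)
        matches = [self.docs[pos] for _, pos in self.entries[lo:hi]]
        return matches, len(matches)


index = SingleFieldIndex(docs, "roll_no")
scan_hits, scanned = collection_scan(docs, "roll_no", 1742)
index_hits, examined = index.find(1742)
print(scanned, examined)
```

In a real MongoDB deployment the comparable figure is the number of documents examined reported by a query's explain output; the sketch only mirrors the idea that an index lets the query touch matching documents instead of the whole collection.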
