
Lovely Professional University: Assignment-1 SET-A

This document contains an assignment for a Big Data course with three questions. Question 1 asks the student to calculate the time required to retrieve 100TB of data from a hard drive with a read rate of 100MB/s and two channels, and suggest how to minimize the retrieval time. Question 2 asks the student to show the steps and screenshots for installing Apache Hadoop on a single node cluster. Question 3 has two parts that ask the student to copy a file to HDFS and display it, and display a local file, providing screenshots.

LOVELY PROFESSIONAL UNIVERSITY

Assignment-1 SET-A

Deadline of Submission: 07-Sep-2020
Course Code: INT576

Q1: An organization X holds 100 TB of data, and its available HDD has an access rate of 100 MB/s.
The HDD has two channels for data storage and retrieval. As a Big Data Engineer, calculate the
time required to retrieve this data with the given features. Also, propose how organization X can
retrieve this data in minimum time, and state the requirements for implementing your
proposal. (10)

(Hint: Data is read from the two channels in parallel. You therefore need to calculate the time for
half of the data only, because the other half is read at the same time via the second channel.)
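
The arithmetic behind the hint can be sketched as follows. This is a minimal illustration, not a model answer. It assumes decimal units (1 TB = 1,000,000 MB) and assumes each channel sustains the full 100 MB/s on its own; the assignment states neither explicitly. The function name `retrieval_time_seconds` and its parameters are hypothetical, introduced only for this example.

```python
def retrieval_time_seconds(data_tb, rate_mb_per_s, channels, mb_per_tb=1_000_000):
    """Time to read `data_tb` terabytes when the data is split evenly across
    `channels` parallel channels, each reading at `rate_mb_per_s`.

    Assumption: decimal units (1 TB = 1,000,000 MB). Pass
    mb_per_tb=1024 * 1024 to use binary units instead.
    """
    total_mb = data_tb * mb_per_tb
    # Each channel reads only its own share; all shares are read simultaneously.
    per_channel_mb = total_mb / channels
    return per_channel_mb / rate_mb_per_s


seconds = retrieval_time_seconds(data_tb=100, rate_mb_per_s=100, channels=2)
print(f"{seconds:.0f} s = {seconds / 3600:.1f} h = {seconds / 86400:.2f} days")
```

Under these assumptions, each channel reads 50 TB, which takes 500,000 seconds (about 138.9 hours, or roughly 5.8 days). Varying `channels` shows how the time shrinks as more reads run in parallel, which is one way to explore the second part of the question.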

Q2: Show the steps and commands used to install Apache Hadoop as a single-node cluster on
your machine. Each step must be supported with screenshots of your machine showing your name in
the terminal. Explain the function of each file used in the configuration of Apache Hadoop. (10)

Q3: i) Create a text file named temp.txt and save it in the local file system. Write a Hadoop command
to copy this file into HDFS, then display the file from HDFS only. Support your answer with a
screenshot of the CLI fetching the text file from HDFS.

Q3: ii) Create a text file named test.txt and save it in the local file system. Write a Hadoop command
to display the contents of this file. Support your answer with a screenshot of the CLI fetching the
text file.
