Assignment Week 1

Uploaded by pali.rajtrader

Assignment
Week 1: Getting Started with Big Data and Understanding HDFS Concepts along with Linux Commands
IMPORTANT
Self-assessment enables students to develop:
1. A sense of responsibility for their own learning and the ability & desire to continue learning,
2. Self-knowledge & capacity to assess their own performance critically & accurately, and
3. An understanding of how to apply their knowledge and abilities in different contexts.

All assignments are for self-assessment. Solutions will be released in the subsequent
week. Once a solution is out, evaluate yourself.

No discussions or queries about assignment questions are allowed on the Slack channel.

Note: You can raise your doubts once the solution is released.
Question 1

1. Create two directories, 'dir1' and 'dir2', using a single HDFS command inside the
home directory of HDFS in the Cloudera VM. dir2 should be a subdirectory of dir1.
2. Verify that the two folders have been created at the above path.
3. Inside dir2, create an empty file, file1.txt.
4. Create a file file2.txt in the local filesystem with some text inside it: echo "Hello" >> file2.txt
5. Copy file2.txt from local to HDFS inside dir2 (-put / -copyFromLocal).
6. List the subdirectories and files inside dir1 recursively.
7. List the files inside dir2, sorted by size, with sizes displayed in KB/MB
rather than bytes.
8. Rename the file file2.txt to file3.txt.
9. Remove the directory dir1 using a single command.
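The nine steps above can be sketched as the following command sequence. This is a sketch, assuming a working HDFS client such as the one in the Cloudera VM; note that the -S flag for sorting -ls output by size is only available in newer Hadoop releases (2.8+), so on older VMs `hadoop fs -du -h dir1/dir2` is an alternative for step 7.

```shell
hadoop fs -mkdir -p dir1/dir2            # 1. both directories with one command
hadoop fs -ls -R dir1                    # 2. verify they were created
hadoop fs -touchz dir1/dir2/file1.txt    # 3. empty file inside dir2
echo "Hello" > file2.txt                 # 4. local file with some text
hadoop fs -put file2.txt dir1/dir2/      # 5. copy local -> HDFS
hadoop fs -ls -R dir1                    # 6. recursive listing of dir1
hadoop fs -ls -h -S dir1/dir2            # 7. by size, human-readable (Hadoop 2.8+)
hadoop fs -mv dir1/dir2/file2.txt dir1/dir2/file3.txt   # 8. rename
hadoop fs -rm -r dir1                    # 9. remove dir1 and everything in it
```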
Question 2

Suppose a file of size 514 MB is stored in HDFS (Hadoop 2.x) using the default block
size configuration and the default replication factor:

• How many blocks will be created in total? 5 blocks will be created (514 = 4 × 128 + 2).

• What will be the size of each block? The first four blocks will be 128 MB each; the
fifth block will hold only the remaining 2 MB, since an HDFS block occupies only as
much space as the data it stores.
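The answer above can be checked with a little shell arithmetic:

```shell
# Worked check: a 514 MB file split into Hadoop 2.x default 128 MB blocks.
FILE_MB=514
BLOCK_MB=128
FULL_BLOCKS=$((FILE_MB / BLOCK_MB))                     # 4 full 128 MB blocks
LAST_BLOCK_MB=$((FILE_MB % BLOCK_MB))                   # 2 MB in the last block
TOTAL_BLOCKS=$(((FILE_MB + BLOCK_MB - 1) / BLOCK_MB))   # ceiling division -> 5
echo "$TOTAL_BLOCKS blocks: $FULL_BLOCKS x ${BLOCK_MB} MB + 1 x ${LAST_BLOCK_MB} MB"
```

With the default replication factor of 3, each of these 5 blocks is stored three times, but the logical block count stays 5.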
Question 3

1. Create a directory named 'test' inside the home directory of the local filesystem.
2. Create a few empty files inside the test directory, namely a.pdf, b.html, c.xml.
3. List the files in reverse alphabetical order of file name.
4. Display only the file that ends with the .html extension, by using grep on the listing.
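A minimal sketch of these four steps on the local filesystem:

```shell
mkdir -p "$HOME/test"       # 1. directory 'test' in the home directory
cd "$HOME/test"
touch a.pdf b.html c.xml    # 2. three empty files
ls -r                       # 3. reverse alphabetical: c.xml, b.html, a.pdf
ls | grep '\.html$'         # 4. only b.html matches
```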
Question 4

1. Create two new text files, file1 and file2, with the following content using the cat
command in your Linux home directory:
• file1: This is from file1
• file2: This is from file2
2. Display the contents of file1 and file2 using the cat command.
3. Concatenate the contents of the two files into a new file file3 and
display the result: cat file1 file2 > file3
4. Count the number of lines and the number of words in file3: wc -l file3 and
wc -w file3 (note that cat -n file3 only numbers the lines; it does not count anything).
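The whole exercise can be sketched as follows; here-documents stand in for the interactive `cat > file` usage:

```shell
cd "$HOME"
cat > file1 <<'EOF'
This is from file1
EOF
cat > file2 <<'EOF'
This is from file2
EOF
cat file1 file2            # 2. display both files
cat file1 file2 > file3    # 3. concatenate into file3
cat file3                  # ...and display the result
wc -l file3                # 4. line count: 2
wc -w file3                #    word count: 8
```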
Question 5

Create a text file myfile.txt with 5 lines in the home directory of the local filesystem:
1. Display the last 3 lines of that file: tail -n 3 myfile.txt

2. Display all lines of that file except the first line: tail -n +2 myfile.txt
(for a 5-line file, tail -n 4 myfile.txt also works, but tail -n +2 is correct
for a file of any length)

tail: prints the last 10 lines of a text file by default

head: prints the first 10 lines of a text file by default
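A quick sketch using printf to build the 5-line file:

```shell
cd "$HOME"
printf 'line1\nline2\nline3\nline4\nline5\n' > myfile.txt
tail -n 3 myfile.txt     # 1. last three lines: line3..line5
tail -n +2 myfile.txt    # 2. everything from line 2 onward: line2..line5
```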


Question 6

The getmerge command in Hadoop merges files existing in the HDFS file
system into a single file in the local file system.
1. Use the help for the getmerge command to see the arguments it takes.
2. Create file1.txt locally with the contents "Hello, this is from file1", and
create file2.txt locally with the contents "Hello, this is from file2".
3. Copy file1.txt and file2.txt into an HDFS location inside the HDFS home directory.
4. Use the getmerge command to merge the contents of the two files present in HDFS
and put the merged content into a single local destination file named filenew.txt:
hadoop fs -getmerge user/cloudera/Hadoop_File/file1.txt user/cloudera/Hadoop_File/file2.txt filenew.txt
(Run the command from the local directory where the merged file should be saved;
the destination file name is set at merge time.)

5. Display the merged contents of the file filenew.txt.
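The steps above can be sketched end to end. This assumes a working HDFS client; 'mergedir' is an assumed HDFS directory name used for illustration, and getmerge is given the directory, which merges every file inside it (it also accepts explicit file paths as in step 4 above):

```shell
hadoop fs -help getmerge                       # 1. show the arguments getmerge takes
echo "Hello, this is from file1" > file1.txt   # 2. create the two local files
echo "Hello, this is from file2" > file2.txt
hadoop fs -mkdir -p mergedir                   # 3. copy both into an HDFS directory
hadoop fs -put file1.txt file2.txt mergedir
hadoop fs -getmerge mergedir filenew.txt       # 4. merge to one local file
cat filenew.txt                                # 5. display the merged contents
```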

You might also like