Assignment Week 1
All assignments are for self-assessment. Solutions will be released in the following
week. Once the solution is out, evaluate yourself.
Note: You can raise your doubts once the solution is released.
Question 1
1. Create two directories ‘dir1’ and ‘dir2’ using a single hdfs command inside the home
directory of hdfs in the Cloudera VM. dir2 should be a subdirectory of dir1.
2. Verify that the two folders have been created in the above path.
3. Inside dir2, create an empty file, file1.txt.
4. Create a file file2.txt in the local filesystem with some text inside it:
echo "Hello" >> file2.txt
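The four steps of Question 1 can be sketched as follows. This is a minimal sketch, assuming the hdfs client is on the PATH (as it is expected to be on the Cloudera VM) and that relative HDFS paths resolve to the user's HDFS home directory. The guard around the HDFS steps is an addition, not part of the assignment, so that the snippet is harmless on a machine without Hadoop.

```shell
# Steps 1-3 need an HDFS client; they are skipped if 'hdfs' is not installed (assumption).
if command -v hdfs >/dev/null 2>&1; then
    # Step 1: -p creates missing parents, so dir1 and dir1/dir2 are made by one command.
    hdfs dfs -mkdir -p dir1/dir2
    # Step 2: -R lists recursively, showing dir2 nested under dir1.
    hdfs dfs -ls -R dir1
    # Step 3: -touchz creates a zero-length file inside dir2.
    hdfs dfs -touchz dir1/dir2/file1.txt
fi

# Step 4: purely local; append some text to file2.txt in the current directory.
echo "Hello" >> file2.txt
cat file2.txt
```

Running `hdfs dfs -ls dir1/dir2` afterwards should show file1.txt with a size of 0.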
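Question 2
Suppose there is a file of size 514 MB stored in HDFS (Hadoop 2.x) using the default block
size configuration and default replication factor:
• What will be the size of each block? The default block size in Hadoop 2.x is 128 MB, so the
file is split into five blocks: four of 128 MB each and a final block holding the remaining 2 MB.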
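The block arithmetic for the 514 MB file can be checked with a few lines of shell. This is a sketch under the stated defaults: a 128 MB block size and a replication factor of 3. The variable names are illustrative only.

```shell
file_mb=514      # file size from the question
block_mb=128     # default HDFS block size in Hadoop 2.x
replication=3    # default HDFS replication factor

full_blocks=$(( file_mb / block_mb ))       # completely filled blocks
last_block_mb=$(( file_mb % block_mb ))     # size of the final, partial block
blocks=$(( full_blocks + (last_block_mb > 0 ? 1 : 0) ))
replicas=$(( blocks * replication ))        # block copies stored across the cluster
stored_mb=$(( file_mb * replication ))      # raw storage consumed by all copies

echo "blocks=$blocks (full=$full_blocks, last=${last_block_mb} MB)"
echo "replicas=$replicas, raw storage=${stored_mb} MB"
```

The last block occupies only the 2 MB it actually holds, not a full 128 MB.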
Question 3
1. Create two new text files, file1 and file2, with the following content using the cat command
in your Linux home directory:
• file1: This is from file1
• file2: This is from file2
2. Display the contents of file1 and file2 using the cat command.
3. Concatenate the contents of the two files, put them into a new file file3, and
display the result: cat file1 file2 > file3, then cat file3
4. Count the number of lines and the number of words in file3: wc -l -w file3
(cat -n file3 only numbers the lines; it does not count lines or words.)
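The whole of Question 3 can be run as one short script. This is a sketch that writes the files into the current directory and uses here-documents so that it runs non-interactively. Typing `cat > file1`, entering the text, and pressing Ctrl-D does the same thing by hand.

```shell
# Step 1: create the two files with cat, feeding the content through a here-document.
cat > file1 <<'EOF'
This is from file1
EOF
cat > file2 <<'EOF'
This is from file2
EOF

# Step 2: display both files.
cat file1 file2

# Step 3: concatenate into file3 and display it.
cat file1 file2 > file3
cat file3

# Step 4: wc prints the line count, then the word count, then the file name.
wc -l -w file3
```

With the content above, file3 has 2 lines and 8 words.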
Question 5
Create a text file myfile.txt with 5 lines in the home directory of the local filesystem:
1. Display the last 3 lines of that file: tail -n 3 myfile.txt
2. Display all lines of that text file except the first line: tail -n 4 myfile.txt
(for a file of any length, tail -n +2 myfile.txt starts the output at line 2)
tail: prints the last 10 lines of a text file by default.
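The tail commands can be tried on a small test file. This is a sketch that creates myfile.txt in the current directory with five placeholder lines; the line contents are made up for illustration.

```shell
# Create a 5-line file (placeholder content).
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > myfile.txt

# Last 3 lines: line 3, line 4, line 5.
tail -n 3 myfile.txt

# Everything except the first line, two ways:
tail -n 4 myfile.txt    # works only because the file has exactly 5 lines
tail -n +2 myfile.txt   # starts at line 2, whatever the file length
```

The `+2` form is the safer habit, because it does not depend on knowing the number of lines in advance.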
The getmerge command in Hadoop is for merging files existing in the HDFS file
system into a single file in the local file system.
1. Use the help for the getmerge command to see the arguments it takes.
2. Create file1.txt in local with contents “Hello, this is from file1”.
Create file2.txt in local with contents “Hello, this is from file2”.
3. Copy file1.txt and file2.txt into an HDFS location inside the home directory in HDFS.
4. Use the getmerge command to merge the contents of the two files present in HDFS
and put the merged content into a single local destination file named filenew.txt.
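hadoop fs -getmerge /user/cloudera/Hadoop_File/file1.txt /user/cloudera/Hadoop_File/file2.txt filenew.txt
The first two arguments are the HDFS source files (1) and (2); the last argument is the name of the merged file. Change to the local directory where the merged file should be saved before running the command.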
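The getmerge steps can be sketched end to end as follows. This is a minimal sketch, assuming the hadoop client is on the PATH and that relative HDFS paths resolve to the HDFS home directory. The directory name merge_demo is hypothetical, and the guard is an addition so the snippet is harmless on a machine without Hadoop.

```shell
# Step 2: create the two local files.
echo "Hello, this is from file1" > file1.txt
echo "Hello, this is from file2" > file2.txt

# The remaining steps need a Hadoop client; skip them if it is not installed (assumption).
if command -v hadoop >/dev/null 2>&1; then
    # Step 1: print the usage text for getmerge.
    hadoop fs -help getmerge
    # Step 3: copy both files into an HDFS directory (merge_demo is a hypothetical name).
    hadoop fs -mkdir -p merge_demo
    hadoop fs -put -f file1.txt file2.txt merge_demo/
    # Step 4: merge the two HDFS files into one file in the current local directory.
    hadoop fs -getmerge merge_demo/file1.txt merge_demo/file2.txt filenew.txt
    cat filenew.txt
fi
```

Passing the directory itself as the source (`hadoop fs -getmerge merge_demo filenew.txt`) merges every file in it.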