HDFS Shell Commands On AWS
AMAZON EC2
Step 1: Start your AWS EC2 instance by logging in to your AWS Management Console.
Step 2: Make sure that your instance is fully up and running after Step 1. Next, log in to
Cloudera Manager, which is available at the following link:
http://<public ip>:7180 [Here, <public ip> is the public IP of your machine, such as 34.239.199.30.]
By default, both the username and the password for Cloudera Manager are ‘admin’. (If you
have changed them manually, then use the new credentials.)
Step 4: Connect to AWS EC2 (via PuTTY, etc.). You learnt how to connect to AWS EC2 in
the previous modules.
IMPORTANT INSTRUCTIONS
● The following notations have been used throughout the file:
BASIC COMMANDS
● To check the commands that are available in HDFS, run either of
the following commands:
hadoop fs -help or hadoop dfs -help
(Note: 'hadoop dfs' is deprecated; 'hdfs dfs' is the preferred equivalent in newer releases.)
● To list the files in HDFS, use the ‘ls’ command.
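A minimal sketch of the ‘ls’ command (these commands need a running HDFS cluster, and the /user path is only an illustration):

```shell
# List the contents of the HDFS root directory
hadoop fs -ls /

# List the contents of a specific HDFS directory (illustrative path)
hadoop fs -ls /user
```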
Note: As seen above, trying to create a directory in HDFS using the root user gave us
an error. This error occurred because, in this default setup, a directory can be created in
HDFS only by the hdfs user. So now, switch to the hdfs user. Please note that there is a
space between ‘-’ and ‘hdfs’ in the command used below.
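The user switch and directory creation described above can be sketched as follows (the directory name /demo is an illustration, not from the original):

```shell
# Switch from the root user to the hdfs user (note the spaces around the hyphen)
su - hdfs

# As the hdfs user, create a directory inside HDFS (illustrative name)
hadoop fs -mkdir /demo
```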
Now, as seen above, the owner of the newly created directory is hdfs. To send a file
from any other user into an HDFS directory, the owner of that directory should be
changed to the user sending the file. For example, if you have to send a file from the
root user to a directory inside HDFS, the owner of that particular directory should be
changed to root.
● To change the owner of the directory created from hdfs to root, run the following
command:
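A sketch of the ownership change, assuming the directory created earlier is named /demo (an illustrative name, not from the original):

```shell
# As the hdfs user, change the owner of the HDFS directory from hdfs to root
hadoop fs -chown -R root /demo

# Confirm the new owner in the listing (third column)
hadoop fs -ls /
```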
● You can see that the owner has changed from hdfs to root.
● Note: In these commands, the ‘-p’ argument of the mkdir command means that it will create the
parent directories as well if they do not already exist. Similarly, the ‘-R’ argument of the chown and
chmod commands means that the command is applied recursively to all files and subdirectories of
the directory mentioned in the command.
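For instance, the two arguments described in the note can be used as below (all paths are illustrative):

```shell
# Create nested directories in one go; without -p this would fail
# if /demo did not already exist
hadoop fs -mkdir -p /demo/input/2024

# Recursively change the owner of /demo and everything under it
hadoop fs -chown -R root /demo

# Recursively change the permissions of /demo and everything under it
hadoop fs -chmod -R 755 /demo
```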
● Now, use the ‘exit’ command to shift from the hdfs user to the root user.
● You can also use the “vi test.txt” command to create a text file using vi if you prefer.
Keep in mind that you will have to enter Insert mode by pressing ‘i’, then write
into the file, then press “Esc” to go back to command mode, and then
type “:wq!” to save and exit vi.
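If you would rather not use vi, the same file can be created non-interactively, for example:

```shell
# Create test.txt with a single line of sample text
echo "This is a sample line for HDFS practice." > test.txt
```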
● Now verify whether the file has been created or not using the ‘ls’ command.
[root@ip-10-0-0-14 ~]# ls
test.txt
● We can verify whether the file has been copied as shown below:
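The copy into HDFS and its verification can be sketched as follows, assuming the target directory is /demo (an illustrative name, not from the original):

```shell
# Copy test.txt from the local file system into HDFS
hadoop fs -put test.txt /demo

# Verify that the file now appears in the HDFS listing
hadoop fs -ls /demo
```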
● Now, check the content of the file, using the ‘cat’ command.
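For example (the /demo path is illustrative):

```shell
# Print the contents of the file stored in HDFS to the terminal
hadoop fs -cat /demo/test.txt
```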
● First, we create a new directory named testing using the ‘mkdir’ command.
● Now, we will copy the file from the HDFS to the local system using the ‘get’
command.
● Now, let us verify the same by navigating to the new directory using the ‘cd’ command.
Then use the ‘ls’ command and verify whether your file is present or not.
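The steps above can be sketched end to end (the HDFS path /demo is illustrative):

```shell
# Create a new directory named 'testing' on the local file system
mkdir testing

# Copy the file from HDFS into the local 'testing' directory
hadoop fs -get /demo/test.txt testing/

# Navigate into the new directory and verify that the file is present
cd testing
ls
```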
● Please note that these steps are just for practice and are not part of a
regular workflow.
● As we know, the default replication factor in HDFS is 3. Now, we will use the ‘setrep’
command to change it to any desired value. In this case, we set the
replication factor of the file test.txt to 6.
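The replication change can be sketched as follows, assuming test.txt was copied to /demo (an illustrative path):

```shell
# Set the replication factor of the file test.txt to 6
hadoop fs -setrep 6 /demo/test.txt

# The second column of the listing shows the replication factor
hadoop fs -ls /demo
```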
● Now, let us verify the same using the NameNode browser. As mentioned earlier, we
can access it at http://<public ip>:50070. Then click on ‘Utilities’, followed by
‘Browse File System’. Locate the file to verify whether the replication factor has been
set to 6.