Hadoop Practical Commands
1. Start Hadoop
sbin/start-all.sh
(start-all.sh is deprecated in recent Hadoop releases; running sbin/start-dfs.sh followed by sbin/start-yarn.sh is equivalent.)
2. Check that all Hadoop daemons are running
jps
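On a healthy single-node setup, jps should list the Hadoop daemons alongside itself; the process IDs below are only illustrative:
2401 NameNode
2563 DataNode
2790 SecondaryNameNode
2988 ResourceManager
3190 NodeManager
3412 Jps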
3. ls: list files in HDFS. With no path it lists your HDFS home directory (/user/<username>); pass a path such as / to list elsewhere.
hdfs dfs -ls
4. mkdir: make a directory in HDFS
hdfs dfs -mkdir <folder name>
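For example, to create the /input directory used in the word-count steps below:
hdfs dfs -mkdir /input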
5. touchz: create an empty file in HDFS.
hdfs dfs -touchz <file_path>
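For example (the path here is just illustrative):
hdfs dfs -touchz /input/empty.txt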
6. copyFromLocal (or put): copy files/folders from the local file system (the files present on the OS) into HDFS. This is the command you use to load data into HDFS before processing it.
hdfs dfs -copyFromLocal <local file path> <dest(present on hdfs)>
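For example, to copy the local file inputdata.txt (created in step 7 below) into the /input directory on HDFS:
hdfs dfs -copyFromLocal inputdata.txt /input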
7. Use an editor to create a file on the local file system
sudo nano inputdata.txt
(sudo is only needed if you lack write permission in the current directory.)
8. cat: print the contents of a file stored in HDFS (copy it there first, as in step 6).
hdfs dfs -cat inputdata.txt
--- Execution of MapReduce Word Count with Java ----
Create a text input file as the hduser user
sudo nano t1.txt (enter some lines of text in it)
Now copy it from the local file system to HDFS
hdfs dfs -copyFromLocal t1.txt /input
Go to the directory where the example JARs are located
cd /usr/local/hadoop/share/hadoop/mapreduce
Execute the MapReduce word count
hadoop jar hadoop-mapreduce-examples-3.3.0.jar wordcount /input /output
(Give a new output directory name every time; the job fails if the output directory already exists.)
Check your output by executing 2 commands
hdfs dfs -ls /output
hdfs dfs -cat /output/part-r-00000
(-ls shows the output files; -cat prints the actual word counts.)
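For instance, if t1.txt contained the line "hello world hello", part-r-00000 would hold one tab-separated word/count pair per line:
hello	2
world	1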
--- Execution of MapReduce Word Count with Python ----
Run the following commands. First, test mapper.py and reducer.py locally with a shell pipeline (minimal versions of both scripts are sketched at the end of this section):
cat t1.txt | python3 mapper.py | sort -k1,1 | python3 reducer.py
Then submit the job to the cluster through Hadoop Streaming:
hadoop jar /home/hduser/Downloads/jar_files/hadoop-streaming-3.2.1.jar -input /input/t1.txt -output /output1 -mapper /home/hduser/mapper.py -reducer /home/hduser/reducer.py
(Note: change the name of the output directory every time you run this command.)
Check your output by executing 2 commands
hdfs dfs -ls /output1
hdfs dfs -cat /output1/part-00000
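For reference, here is a minimal word-count mapper.py and reducer.py of the kind the streaming job above assumes (your actual scripts may differ; these just follow the standard Hadoop Streaming pattern of tab-separated key/value lines on stdin/stdout):

mapper.py:
#!/usr/bin/env python3
# Emit "word<TAB>1" for every word read from stdin.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")

reducer.py:
#!/usr/bin/env python3
# Sum the counts for each word. The input is sorted by key,
# so all lines for a given word arrive consecutively.
import sys

current_word = None
current_count = 0

for line in sys.stdin:
    word, _, count = line.strip().partition("\t")
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word = word
        current_count = int(count)

if current_word is not None:
    print(f"{current_word}\t{current_count}")

Both scripts need the #!/usr/bin/env python3 line and execute permission (chmod +x mapper.py reducer.py) so that Hadoop Streaming can run them on the worker nodes.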