Bigdata Lab File
Experiment 1
Objective: In this practical, you will learn how to download, install, and configure Hadoop, one of the most
popular distributed storage and processing frameworks. You will also gain an understanding of different
Hadoop modes, explore startup scripts, and work with configuration files.
Prerequisites:
• A Linux-based operating system (e.g., Ubuntu) or access to a virtual machine with Linux installed.
• Basic command-line skills.
• Java Development Kit (JDK) 8 or higher installed.
• Familiarity with basic Linux commands.
Materials:
• A computer with internet access.
• Hadoop distribution (Hadoop can be downloaded from the official Apache Hadoop website).
Procedure:
1. Downloading and Installing Hadoop:
1.1. Open a terminal on your Linux system.
1.2. Download the Hadoop distribution from the official Apache Hadoop website:
https://hadoop.apache.org/releases.html.
1.3. Choose a stable version and download the binary distribution. For example, you can use wget
or curl to download the archive:
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
1.4. Extract the downloaded Hadoop archive:
tar -xzvf hadoop-3.3.1.tar.gz
1.5. Move the extracted Hadoop directory to a suitable location (e.g., /usr/local):
sudo mv hadoop-3.3.1 /usr/local
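With the files in place, Hadoop's scripts need JAVA_HOME to point at your JDK, and it is convenient to put the Hadoop binaries on the PATH. A minimal sketch, assuming the JDK lives at /usr/lib/jvm/java-8-openjdk-amd64 and Hadoop was moved to /usr/local/hadoop-3.3.1 as above; in practice these lines go in ~/.bashrc (JAVA_HOME can also be set in etc/hadoop/hadoop-env.sh):

```shell
# Point Hadoop at the JDK (the path below is an assumption --
# find yours with: readlink -f "$(which java)")
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Location Hadoop was moved to in step 1.5
export HADOOP_HOME=/usr/local/hadoop-3.3.1

# Make the hadoop/hdfs commands and the start/stop scripts available
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After reloading the shell (source ~/.bashrc), running hadoop version should print the installed release.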
4. Configuration Files:
4.1. Explore other configuration files in the etc/hadoop directory, such as mapred-site.xml and
yarn-site.xml. Understand their purposes and how they affect Hadoop behavior.
4.2. Modify the configuration files to change various Hadoop settings. For example, increase the replication
factor in the hdfs-site.xml file.
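For example, the replication factor is controlled by the standard dfs.replication property in etc/hadoop/hdfs-site.xml (its default is 3; the value 4 below is only illustrative):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>4</value>
  </property>
</configuration>
```

Changes to hdfs-site.xml take effect after the HDFS daemons are restarted.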
5. Conclusion:
By completing this practical, you have learned how to download, install, and configure Hadoop. You've also
gained an understanding of different Hadoop modes, worked with startup scripts, and explored
configuration files. This knowledge is essential for working with Hadoop and distributed data processing.
Experiment 2
Objective: In this practical, you will learn how to perform basic file management tasks in Hadoop, such as adding files
and directories, retrieving files, and deleting files using Hadoop Distributed File System (HDFS).
Prerequisites:
• Hadoop installed and configured (you can refer to the previous practical for installation and configuration).
• Familiarity with basic Hadoop commands (e.g., hadoop fs, hdfs dfs).
Materials:
Procedure:
1. Adding Files and Directories:
1.2. Use the hadoop fs or hdfs dfs command to add a local file to HDFS. For example, to add a file named
example.txt from your local system to HDFS:
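A sketch of the commands, assuming the target directory /user/$USER (any HDFS path works; hadoop fs and hdfs dfs are interchangeable for these operations):

```shell
# Create a target directory in HDFS (if it does not already exist)
hdfs dfs -mkdir -p /user/$USER

# Copy example.txt from the local filesystem into HDFS
hdfs dfs -put example.txt /user/$USER/

# Confirm the upload by listing the directory
hdfs dfs -ls /user/$USER
```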
2. Retrieving Files:
2.1. Retrieve a file from HDFS to your local filesystem using the hadoop fs or hdfs dfs command. For example, to
retrieve example.txt from HDFS to your local directory:
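Assuming the file was stored under /user/$USER as in step 1, the retrieval command is:

```shell
# Copy example.txt from HDFS into the current local directory
hdfs dfs -get /user/$USER/example.txt .
```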
2.2. Verify that the file has been copied to your local directory.
3. Deleting Files:
3.1. Use the hadoop fs or hdfs dfs command to delete a file in HDFS. For example, to delete example.txt:
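Again assuming the /user/$USER path from the earlier steps:

```shell
# Delete example.txt from HDFS (moved to the HDFS trash if trash is enabled)
hdfs dfs -rm /user/$USER/example.txt

# To delete a directory and its contents, add the recursive flag
# (the directory name old-data below is hypothetical)
hdfs dfs -rm -r /user/$USER/old-data
```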
4. Conclusion:
By completing this practical, you have learned how to perform basic file management tasks in Hadoop using HDFS.
You can add files and directories, retrieve files, and delete files as needed. These fundamental file management skills
are crucial when working with Hadoop for distributed data storage and processing.
Experiment 3
Experiment 4
Experiment 5
Experiment 6
Experiment 7
Experiment 8