Bigdata Lab File

A university lab file covering the basic practicals needed to become familiar with the subject.

Uploaded by

Grette
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views

Bigdata Lab File

University lab file that contains all basic practicals one needs to go through to get to know the subject.

Uploaded by

Grette
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20


Experiment 1

Objective: In this practical, you will learn how to download, install, and configure Hadoop, one of the most
popular distributed storage and processing frameworks. You will also gain an understanding of different
Hadoop modes, explore startup scripts, and work with configuration files.
Prerequisites:
• A Linux-based operating system (e.g., Ubuntu) or access to a virtual machine with Linux installed.
• Basic command-line skills and familiarity with common Linux commands.
• Java Development Kit (JDK) 8 or higher installed.
Materials:
• A computer with internet access.
• Hadoop distribution (Hadoop can be downloaded from the official Apache Hadoop website).
Procedure:
1. Downloading and Installing Hadoop:
1.1. Open a terminal on your Linux system.
1.2. Download the Hadoop distribution from the official Apache Hadoop website:
https://hadoop.apache.org/releases.html.
1.3. Choose the latest stable version and download the binary distribution, for example with wget or
curl (older releases such as 3.3.1 are served from the Apache archive rather than the mirrors):
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
1.4. Extract the downloaded Hadoop archive:
tar -xzvf hadoop-3.3.1.tar.gz
1.5. Move the extracted Hadoop directory to a suitable location (e.g., /usr/local):
sudo mv hadoop-3.3.1 /usr/local
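1.6. Optionally, make the hadoop and hdfs commands available from any directory by adding them to your
shell environment. A minimal sketch, assuming the /usr/local/hadoop-3.3.1 installation path used above;
append these lines to ~/.bashrc and run source ~/.bashrc to apply them to the current session:
export HADOOP_HOME=/usr/local/hadoop-3.3.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin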

2. Understanding Different Hadoop Modes:


Hadoop can operate in three different modes:
• Local (Standalone) Mode: the default mode; Hadoop runs as a single Java process without HDFS,
which is useful for debugging.
• Pseudo-Distributed Mode: every daemon runs on a single machine, simulating a cluster; useful for
development and testing.
• Fully-Distributed Mode: deploys Hadoop on a cluster of multiple machines.
2.1. Open the Hadoop configuration file, hadoop-env.sh, located in the etc/hadoop directory and set the
JAVA_HOME environment variable to point to your JDK installation:
export JAVA_HOME=/path/to/your/jdk
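If you are unsure where your JDK lives, on most Linux systems you can resolve it from the java binary;
the following prints the full path to the java executable, from which you drop the trailing /bin/java
to obtain JAVA_HOME:
readlink -f $(which java)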
2.2. Explore the core-site.xml and hdfs-site.xml configuration files in the etc/hadoop directory.
Understand how they define properties such as the default filesystem URI and NameNode/DataNode storage
settings; a minimal example follows.
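For reference, a minimal pseudo-distributed setup might look like the following; port 9000 and a
replication factor of 1 are common single-node choices, not requirements. In core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
And in hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>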
3. Startup Scripts:
3.1. Navigate to the sbin directory in your Hadoop installation (e.g., /usr/local/hadoop-3.3.1/sbin).
3.2. If this is the first time you are starting HDFS, format the NameNode once with
hdfs namenode -format (this initializes the HDFS metadata directories). Then run the following command
to start the Hadoop NameNode and DataNode in pseudo-distributed mode:
./start-dfs.sh
3.3. Open a web browser and visit the Hadoop NameNode web interface at http://localhost:9870 to check
the cluster status.
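You can also verify that the daemons are running with the JDK's jps tool; in pseudo-distributed mode
you should typically see NameNode, DataNode, and SecondaryNameNode listed:
jps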
3.4. Use the following command to stop the Hadoop cluster:
./stop-dfs.sh

4. Configuration Files:
4.1. Explore other configuration files in the etc/hadoop directory, such as mapred-site.xml and
yarn-site.xml. Understand their purposes and how they affect Hadoop behavior.
4.2. Modify the configuration files to change various Hadoop settings. For example, increase the
replication factor in the hdfs-site.xml file, as sketched below.
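As an illustration, raising dfs.replication to 2 asks HDFS to keep two copies of each block (only
meaningful once at least two DataNodes are available); restart the daemons with ./stop-dfs.sh and
./start-dfs.sh for the change to take effect:
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>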

5. Conclusion:
By completing this practical, you have learned how to download, install, and configure Hadoop. You've also
gained an understanding of different Hadoop modes, worked with startup scripts, and explored
configuration files. This knowledge is essential for working with Hadoop and distributed data processing.
Experiment 2

Objective: In this practical, you will learn how to perform basic file management tasks in Hadoop, such as adding files
and directories, retrieving files, and deleting files using the Hadoop Distributed File System (HDFS).

Prerequisites:

• Hadoop installed and configured (you can refer to the previous practical for installation and configuration).

• A running Hadoop cluster in Pseudo-Distributed or Fully-Distributed mode.

• Familiarity with basic Hadoop commands (e.g., hadoop fs, hdfs dfs).

Materials:

• Access to a Hadoop cluster.

Procedure:

1. Adding Files and Directories:

1.1. Open a terminal on your local machine.

1.2. Use the hadoop fs or hdfs dfs command to add a local file to the HDFS. For example, to add a file named
example.txt from your local system to HDFS:

hadoop fs -copyFromLocal /path/to/local/example.txt /user/yourusername/

1.3. Check if the file has been successfully copied to HDFS:

hadoop fs -ls /user/yourusername/

1.4. To create a directory in HDFS, use the following command:

hadoop fs -mkdir /user/yourusername/new_directory
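If intermediate directories do not exist yet, the -p flag creates them as needed. A hypothetical
example with a nested path:

hadoop fs -mkdir -p /user/yourusername/new_directory/sub_directory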

2. Retrieving Files:

2.1. Retrieve a file from HDFS to your local filesystem using the hadoop fs or hdfs dfs command. For example, to
retrieve example.txt from HDFS to your local directory:

hadoop fs -copyToLocal /user/yourusername/example.txt /path/to/local/

2.2. Verify that the file has been copied to your local directory.
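For example, list the retrieved file on the local filesystem:

ls -l /path/to/local/example.txt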

3. Deleting Files:

3.1. Use the hadoop fs or hdfs dfs command to delete a file in HDFS. For example, to delete example.txt:

hadoop fs -rm /user/yourusername/example.txt
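To delete a directory together with its contents, add the -r flag; note that, depending on cluster
configuration, deleted files may first be moved to a trash directory, which the -skipTrash option
bypasses:

hadoop fs -rm -r /user/yourusername/new_directory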

3.2. Confirm that the file has been deleted:

hadoop fs -ls /user/yourusername/

4. Conclusion:

By completing this practical, you have learned how to perform basic file management tasks in Hadoop using HDFS.
You can add files and directories, retrieve files, and delete files as needed. These fundamental file management skills
are crucial when working with Hadoop for distributed data storage and processing.
Experiment 3
Experiment 4
Experiment 5
Experiment 6
Experiment 7
Experiment 8
