
EXPERIMENT 1

Aim: Install Apache Hadoop

Theory: Hadoop is a Java-based programming framework that supports the processing and storage of extremely large datasets on a cluster of inexpensive machines. It was the first major open-source project in the big data field and is sponsored by the Apache Software Foundation.

Hadoop 2.7.3 comprises four main layers:


• Hadoop Common is the collection of utilities and libraries that support the other Hadoop modules.
• HDFS, which stands for Hadoop Distributed File System, is responsible for persisting data to disk.
• YARN, short for Yet Another Resource Negotiator, schedules and manages the cluster's compute resources; it is often described as the "operating system" of Hadoop.
• MapReduce is the original processing model for Hadoop clusters. It distributes (maps) work across the nodes of the cluster, then collects and reduces the results from the nodes into a response to the query. Many other processing models are available for the 2.x versions of Hadoop; a one-line command-line analogy of the map/shuffle/reduce flow is sketched below.
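As a rough single-machine analogy of the map → shuffle → reduce flow (a sketch only, assuming a local text file input.txt; this is not how Hadoop actually executes jobs), word counting can be mimicked with a Unix pipeline:

cat input.txt | tr -s ' ' '\n' | sort | uniq -c
# tr splits each line into one word per line (map), sort groups identical words
# together (shuffle), and uniq -c counts each group (reduce).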

Installation:

Prerequisites:
Step 1: Install Java 8.
Hadoop requires Java, so verify the installation with java -version. On the test machine the command reported:

OpenJDK version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~16.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

This output verifies that OpenJDK has been installed successfully. Note: set the JAVA_HOME environment variable to the JDK installation path.
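A minimal sketch of this step, assuming an Ubuntu 16.04 machine with the default repositories (matching the build string above); the package name and JDK path may differ on other systems:

sudo apt-get update
sudo apt-get install -y openjdk-8-jdk   # install OpenJDK 8
java -version                           # should report a 1.8.0_xx build
# Point JAVA_HOME at the JDK; the path below is an assumption, check it with:
#   readlink -f $(which java)
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' >> ~/.bashrc
source ~/.bashrc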

Step2: Installing Hadoop


With Java in place, we'll visit the Apache Hadoop Releases page to find the most recent stable release, and follow the link for the binary of that release:

Download Hadoop from www.hadoop.apache.org
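A minimal sketch of the download and unpacking step, assuming the 2.7.3 binary tarball from the Apache archive and /usr/local/hadoop as the install location (both are assumptions; use the release and paths that match your setup):

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -xzf hadoop-2.7.3.tar.gz                  # unpack the release
sudo mv hadoop-2.7.3 /usr/local/hadoop        # assumed install location
# Make the Hadoop commands available on the PATH
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> ~/.bashrc
source ~/.bashrc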



Fig 1: Apache Hadoop: Official Download Page

Fig 2

Procedure to Run Hadoop

1. Install Apache Hadoop 2.2.0 in Microsoft Windows OS: If Apache Hadoop 2.2.0 is not already installed, follow the post "Build, Install, Configure and Run Apache Hadoop 2.2.0 in Microsoft Windows OS".

2. Start HDFS (Namenode and Datanode) and YARN (Resource Manager and Node Manager).
Run the following commands.
Command Prompt
C:\Users\abhijitg>cd c:\hadoop
c:\hadoop>sbin\start-dfs
c:\hadoop>sbin\start-yarn
starting yarn daemons

The Namenode, Datanode, Resource Manager and Node Manager will be started within a few minutes, and the single-node (pseudo-distributed mode) cluster will then be ready to execute Hadoop MapReduce jobs.
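Note: before HDFS is started for the very first time, the NameNode has to be formatted (this is covered in the post referenced in step 1). A sketch, assuming Hadoop is unpacked at c:\hadoop as in the commands above:

c:\hadoop>bin\hdfs namenode -format
rem Creates an empty HDFS namespace; run this only once, before the first start-dfs.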



Fig 3: Namenode and Datanode

Fig 4: Resource Manager & Node Manager

3. Run the wordcount MapReduce job
Now we'll run the wordcount MapReduce job available in %HADOOP_HOME%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.2.0.jar.
Create a text file with some content; we'll pass this file as input to the wordcount MapReduce job for counting words.
C:\file1.txt
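A sketch of creating the input file, assuming the two-line content shown by the hdfs dfs -cat command further below (any text will do):

C:\>echo Install Hadoop> C:\file1.txt
C:\>echo Run Hadoop Wordcount Mapreduce Example>> C:\file1.txt
C:\>type C:\file1.txt
rem type prints the file so you can confirm its content before copying it into HDFS.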



Fig 5

Create a directory (say 'input') in HDFS to hold the text files (say 'file1.txt') to be used for counting words.

C:\Users\abhijitg>cd c:\hadoop
C:\hadoop>bin\hdfs dfs -mkdir input

Copy the text file (say 'file1.txt') from the local disk to the newly created 'input' directory in HDFS.

C:\hadoop>bin\hdfs dfs -copyFromLocal c:/file1.txt input

Check the content of the copied file.

C:\hadoop>hdfs dfs -ls input


Found 1 items
-rw-r--r-- 1 ABHIJITG supergroup 55 2014-02-03 13:19 input/file1.txt

C:\hadoop>bin\hdfs dfs -cat input/file1.txt


Install Hadoop
Run Hadoop Wordcount Mapreduce Example

Run the wordcount MapReduce job provided in %HADOOP_HOME%\share\hadoop\


mapreduce\hadoop-mapreduce-examples-2.2.0.jar
C:\hadoop>bin\yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-
2.2.0.jar wordcount input output
14/02/03 13:22:02 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/02/03 13:22:03 INFO input.FileInputFormat: Total input paths to process : 1
14/02/03 13:22:03 INFO mapreduce.JobSubmitter: number of splits:1
::
14/02/03 13:22:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1391412385921_0002
14/02/03 13:22:04 INFO impl.YarnClientImpl: Submitted application application_1391412385921_0002 to ResourceManager at /0.0.0.0:8032
14/02/03 13:22:04 INFO mapreduce.Job: The url to track the job: http://ABHIJITG:8088/proxy/application_1391412385921_0002/
14/02/03 13:22:04 INFO mapreduce.Job: Running job: job_1391412385921_0002
14/02/03 13:22:14 INFO mapreduce.Job: Job job_1391412385921_0002 running in uber
mode : false
14/02/03 13:22:14 INFO mapreduce.Job: map 0% reduce 0%
14/02/03 13:22:22 INFO mapreduce.Job: map 100% reduce 0%
14/02/03 13:22:30 INFO mapreduce.Job: map 100% reduce 100%
14/02/03 13:22:30 INFO mapreduce.Job: Job job_1391412385921_0002 completed
successfully
14/02/03 13:22:31 INFO mapreduce.Job: Counters: 43
File System Counters



FILE: Number of bytes read=89
FILE: Number of bytes written=160142
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0

HDFS: Number of bytes read=171


HDFS: Number of bytes written=59
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=5657
Total time spent by all reduces in occupied slots (ms)=6128
Map-Reduce Framework
Map input records=2
Map output records=7
Map output bytes=82
Map output materialized bytes=89
Input split bytes=116
Combine input records=7
Combine output records=6
Reduce input groups=6
Reduce shuffle bytes=89
Reduce input records=6
Reduce output records=6
Spilled Records=12
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=145
CPU time spent (ms)=1418
Physical memory (bytes) snapshot=368246784
Virtual memory (bytes) snapshot=513716224
Total committed heap usage (bytes)=307757056
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0



WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=55
File Output Format Counters
Bytes Written=59

http://abhijitg:8088/cluster

Fig 6: Hadoop Installed
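To confirm the result, list the job's output directory and print the reducer output; a sketch, assuming the default part-file name and the two-line input shown above (the counts follow from that content and should look roughly like this):

C:\hadoop>bin\hdfs dfs -ls output
C:\hadoop>bin\hdfs dfs -cat output/part-r-00000
Example 1
Hadoop  2
Install 1
Mapreduce       1
Run     1
Wordcount       1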

Result: We installed Hadoop as a single-node (pseudo-distributed mode) cluster and verified the installation by running the wordcount example program bundled with it.
