EAST CAMPUS
Geeta Colony, New Delhi-110031
PRACTICAL FILE
Experiment - 2
AIM: To set up Hadoop in three operating modes: (a) standalone, (b) pseudo-distributed, (c) fully distributed.
DESCRIPTION:
Hadoop is written in Java, so you will need Java (v6 or later) installed on your machine. Sun's Java Development Kit is the one most widely used with Hadoop, although others have been reported to work.
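To confirm the Java installation before going further, it can be checked from a shell (a minimal sketch; the readlink line assumes a Linux system where the java binary on the PATH is a symlink into the JDK):
$ java -version                  # prints the installed Java version; Hadoop needs v6 or later
$ readlink -f $(which java)      # resolves the real JDK path, useful later for JAVA_HOME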
Hadoop runs on Unix and on Windows. Linux is the only supported production platform, but other flavours of Unix (including Mac OS X) can be used to run Hadoop for development. Windows is supported only as a development platform, and additionally requires Cygwin. During the Cygwin installation you should include the OpenSSH package if you plan to run Hadoop in pseudo-distributed mode.
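Pseudo-distributed and fully distributed modes also need password-less SSH, so that the start-up scripts can log in to each node (including localhost) without prompting. A minimal sketch, assuming an Ubuntu machine with the OpenSSH tools:
$ sudo apt-get install ssh                          # install the SSH client and server
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa          # generate a key pair with an empty passphrase
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys   # authorize the key for logins to localhost
$ ssh localhost                                     # should now connect without a password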
ALGORITHM:
a) STANDALONE MODE: Out of the box, Hadoop is configured to run in standalone (local) mode, in which it runs as a single Java process against the local file system, so nothing beyond unpacking Hadoop and setting the Java path is required.
b) STEPS INVOLVED IN INSTALLING HADOOP IN PSEUDO-DISTRIBUTED MODE
1. First configure the hadoop-env.sh file by changing the Java path (set JAVA_HOME to the JDK directory).
2. Configure core-site.xml, which contains a property tag with a name and a value: set the name to fs.defaultFS and the value to hdfs://localhost:9000 (see the sketch after this list).
3. Similarly configure yarn-site.xml.
4. Format the NameNode, then run start-dfs.sh and start-yarn.sh so that the daemons such as the NameNode and DataNode start.
5. Create a directory with the command hdfs dfs -mkdir /csedir, enter some data into a file name.txt, copy it from the local file system into Hadoop with hdfs dfs -copyFromLocal name.txt /csedir/, and run the sample wordcount jar to check whether pseudo-distributed mode is working or not.
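The configuration edits and the smoke test from the steps above can be carried out as follows. This is a minimal sketch: it assumes HADOOP_HOME points at the Hadoop installation, that $HADOOP_HOME/bin and $HADOOP_HOME/sbin are on the PATH, that the JDK lives at /usr/lib/jvm/default-java (adjust to your system), and that the examples jar name matches your Hadoop version.
$ cd $HADOOP_HOME/etc/hadoop
$ cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
$ echo "export JAVA_HOME=/usr/lib/jvm/default-java" >> hadoop-env.sh   # assumed JDK path
$ hdfs namenode -format              # one-time format of the NameNode
$ start-dfs.sh                       # starts NameNode, DataNode, SecondaryNameNode
$ start-yarn.sh                      # starts ResourceManager, NodeManager
$ hdfs dfs -mkdir /csedir
$ hdfs dfs -copyFromLocal name.txt /csedir/
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
      wordcount /csedir /wcout       # run the sample wordcount job
$ hdfs dfs -cat /wcout/part-r-00000  # display the word counts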
c) STEPS INVOLVED IN INSTALLING HADOOP IN FULLY DISTRIBUTED MODE
1. Configure the configuration files on every node to name the master and slave nodes:
$ cd $HADOOP_HOME/etc/hadoop
$ nano core-site.xml
$ nano hdfs-site.xml
2. Add the host names of the slave nodes to the file slaves and save it (see the sketch after this list):
$ nano slaves
3. On the master node, format the NameNode and start the daemons:
$ hdfs namenode -format
$ start-dfs.sh
$ start-yarn.sh
4. End
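The slaves file simply lists one worker host name per line, and the master must have password-less SSH access to each of them. A minimal sketch with hypothetical host names (slave1, slave2 and user are placeholders; substitute your machines and user name):
$ cat slaves                 # in $HADOOP_HOME/etc/hadoop on the master
slave1
slave2
$ ssh-copy-id -i ~/.ssh/id_rsa.pub user@slave1   # repeat for each slave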
INPUT FORMAT:
ubuntu@localhost> jps
OUTPUT FORMAT:
If the daemons have started correctly, the jps listing should show entries such as NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager.
DESCRIPTION:
HDFS is a scalable distributed file system designed to scale to petabytes of data while running on top of the underlying file system of the operating system. HDFS keeps track of where data resides in the network by associating the name of a rack or network switch with each data set. This allows Hadoop to efficiently schedule tasks on the nodes that contain the data, or on those nearest to it, optimizing bandwidth utilisation. Hadoop provides a set of command-line utilities that work similarly to the Linux file commands and serve as the primary interface to HDFS. We are going to have a look at HDFS by interacting with it from the command line. We will look at the most common file management tasks in Hadoop, which include:
a) adding files and directories to HDFS
b) retrieving files from HDFS to the local file system
c) deleting files from HDFS
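Each of these tasks maps onto a subcommand of the hadoop fs utility; as a quick orientation, here is a minimal sketch (the /user/demo path and local.txt are placeholder names):
hadoop fs -ls /user                     # list a directory, analogous to Linux ls
hadoop fs -mkdir /user/demo             # (a) create a directory
hadoop fs -put local.txt /user/demo     # (a) add a local file to HDFS
hadoop fs -get /user/demo/local.txt .   # (b) retrieve a file to the local file system
hadoop fs -rm /user/demo/local.txt      # (c) delete a file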
ALGORITHM:
1) Adding Files and Directories to HDFS:
Before you can run a Hadoop program on data stored in HDFS, you need to put the data into HDFS first. Let's create a directory and put a file in it. HDFS has a default working directory of /user/$USER, where $USER is your login user name. This directory isn't automatically created for you, though, so let's create it with the mkdir command. For the purpose of illustration we use chuck; you should substitute your own user name in the example commands.
hadoop fs -mkdir /user/chuck
hadoop fs -put example.txt
hadoop fs -put example.txt /user/chuck
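To confirm the upload, the directory can be listed with the standard ls subcommand:
hadoop fs -ls /user/chuck    # example.txt should appear in the listing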
2) Retrieving Files from HDFS:
The Hadoop get command copies files from HDFS back to the local file system. To retrieve example.txt, we can run the following commands (cat is also shown, which prints the file's contents instead of copying it):
hadoop fs -get example.txt .
hadoop fs -cat example.txt
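3) Deleting Files from HDFS:
Task (c) from the description above is deletion, which uses the standard rm subcommand (the recursive form is shown as a usage example; use it with care):
hadoop fs -rm example.txt
hadoop fs -rm -r /user/chuck/csedir     # hypothetical directory; -r deletes recursively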
EXPECTED OUTPUT: