BDA Lab Manual

This manual covers installing Hadoop in its three operating modes: standalone, pseudo-distributed, and fully distributed. It gives detailed steps for standalone and pseudo-distributed installation, including downloading and extracting the software, configuring environment variables for Java and Hadoop, and running a sample wordcount program, followed by basic HDFS file management tasks. It also includes Java implementations of common data structures.


1. Implement the following data structures in Java:
a) Linked Lists  b) Stacks  c) Queues  d) Sets  e) Maps

a) LINKED LISTS:
import java.util.*;

public class LinkedListExample {
    public static void main(String args[]) {
        /* Linked list declaration */
        LinkedList<String> linkedlist = new LinkedList<String>();
        /* add(element) appends elements to the linked list */
        linkedlist.add("Item1");
        linkedlist.add("Item5");
        linkedlist.add("Item3");
        linkedlist.add("Item6");
        linkedlist.add("Item2");
        /* Display linked list content */
        System.out.println("Linked List Content: " + linkedlist);
        /* Add first and last elements */
        linkedlist.addFirst("First Item");
        linkedlist.addLast("Last Item");
        System.out.println("LinkedList Content after addition: " + linkedlist);
        /* Get and set values by index */
        Object firstvar = linkedlist.get(0);
        System.out.println("First element: " + firstvar);
        linkedlist.set(0, "Changed first item");
        Object firstvar2 = linkedlist.get(0);
        System.out.println("First element after update by set method: " + firstvar2);
        /* Remove first and last elements */
        linkedlist.removeFirst();
        linkedlist.removeLast();
        System.out.println("LinkedList after deletion of first and last element: " + linkedlist);
        /* Insert at a position and remove from a position */
        linkedlist.add(0, "Newly added item");
        linkedlist.remove(2);
        System.out.println("Final Content: " + linkedlist);
    }
}
Result:

Linked List Content: [Item1, Item5, Item3, Item6, Item2]
LinkedList Content after addition: [First Item, Item1, Item5, Item3, Item6, Item2, Last Item]
First element: First Item
First element after update by set method: Changed first item
LinkedList after deletion of first and last element: [Item1, Item5, Item3, Item6, Item2]
Final Content: [Newly added item, Item1, Item3, Item6, Item2]
b) STACKS:

package stacks;

public class MyStack {
    private int maxSize;        // capacity of the stack
    private long[] stackArray;  // backing array
    private int top;            // index of the top element, -1 when empty

    public MyStack(int s) {
        maxSize = s;
        stackArray = new long[maxSize];
        top = -1;
    }
    public void push(long j) {
        stackArray[++top] = j;    // assumes the caller checks isFull() first
    }
    public long pop() {
        return stackArray[top--]; // assumes the caller checks isEmpty() first
    }
    public long peek() {
        return stackArray[top];
    }
    public boolean isEmpty() {
        return (top == -1);
    }
    public boolean isFull() {
        return (top == maxSize - 1);
    }
    public static void main(String[] args) {
        MyStack theStack = new MyStack(10);
        theStack.push(10);
        theStack.push(20);
        theStack.push(30);
        theStack.push(40);
        theStack.push(50);
        while (!theStack.isEmpty()) {
            long value = theStack.pop();
            System.out.print(value);
            System.out.print(" ");
        }
        System.out.println("");
    }
}

Result
50 40 30 20 10
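
Note that push() above does not guard against overflow: pushing an eleventh value onto a MyStack(10) would throw ArrayIndexOutOfBoundsException. A defensive variant (a sketch, not part of the original listing) checks capacity first:

public void push(long j) {
    if (isFull()) {
        // hypothetical guard; the listing above relies on the caller checking isFull()
        throw new IllegalStateException("stack is full");
    }
    stackArray[++top] = j;
}

pop() can guard against underflow with the matching isEmpty() check.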

c) QUEUES:

package queues;
import java.util.*;

/* A generic FIFO queue backed by a LinkedList */
class GenQueue<E> {
    private LinkedList<E> list = new LinkedList<E>();

    public void enqueue(E item) {
        list.addLast(item);
    }
    public E dequeue() {
        return list.poll();
    }
    public boolean hasItems() {
        return !list.isEmpty();
    }
    public int size() {
        return list.size();
    }
    /* Drain another queue of a compatible element type into this one */
    public void addItems(GenQueue<? extends E> q) {
        while (q.hasItems())
            list.addLast(q.dequeue());
    }
}

public class GenQueueTest {
    public static void main(String[] args) {
        GenQueue<Employee> empList = new GenQueue<Employee>();
        GenQueue<HourlyEmployee> hList = new GenQueue<HourlyEmployee>();
        hList.enqueue(new HourlyEmployee("Trump", "Donald"));
        hList.enqueue(new HourlyEmployee("Gates", "Bill"));
        hList.enqueue(new HourlyEmployee("Forbes", "Steve"));
        empList.addItems(hList);
        while (empList.hasItems()) {
            Employee emp = empList.dequeue();
            System.out.println(emp.firstName + " " + emp.lastName);
        }
    }
}

class Employee {
    public String lastName;
    public String firstName;

    public Employee() {
    }
    public Employee(String last, String first) {
        this.lastName = last;
        this.firstName = first;
    }
    public String toString() {
        return firstName + " " + lastName;
    }
}

class HourlyEmployee extends Employee {
    public double hourlyRate;

    public HourlyEmployee(String last, String first) {
        super(last, first);
    }
}
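
Result (GenQueue is FIFO, so the names come out in insertion order):

Donald Trump
Bill Gates
Steve Forbes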

d) SETS:

package sets;
import java.util.*;

public class SetDemo {
    public static void main(String args[]) {
        int count[] = {34, 22, 10, 60, 30, 22};
        Set<Integer> set = new HashSet<Integer>();
        try {
            /* Only the first five values are added; the duplicate 22
             * at the end would be rejected by the set anyway */
            for (int i = 0; i < 5; i++) {
                set.add(count[i]);
            }
            System.out.println(set);

            /* A TreeSet keeps its elements in sorted order */
            TreeSet<Integer> sortedSet = new TreeSet<Integer>(set);
            System.out.println("The sorted list is:");
            System.out.println(sortedSet);

            System.out.println("The first element of the set is: " + sortedSet.first());
            System.out.println("The last element of the set is: " + sortedSet.last());
        } catch (Exception e) {
            e.printStackTrace(); // do not swallow exceptions silently
        }
    }
}
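
Result (the HashSet line may print in a different order on another run; the TreeSet lines are deterministic):

[34, 22, 10, 60, 30]
The sorted list is:
[10, 22, 30, 34, 60]
The first element of the set is: 10
The last element of the set is: 60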

e) MAPS:

package maps;
import java.awt.Color;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/**
 * This program demonstrates a map that maps names to colors.
 */
public class MapDemo {
    public static void main(String[] args) {
        Map<String, Color> favoriteColors = new HashMap<String, Color>();
        favoriteColors.put("Juliet", Color.BLUE);
        favoriteColors.put("Romeo", Color.GREEN);
        favoriteColors.put("Adam", Color.RED);
        favoriteColors.put("Eve", Color.BLUE);
        // Print all keys and values in the map
        Set<String> keySet = favoriteColors.keySet();
        for (String key : keySet) {
            Color value = favoriteColors.get(key);
            System.out.println(key + " : " + value);
        }
    }
}
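
Result (HashMap iteration order is not guaranteed; one possible run, with java.awt.Color printing its RGB components):

Romeo : java.awt.Color[r=0,g=255,b=0]
Adam : java.awt.Color[r=255,g=0,b=0]
Juliet : java.awt.Color[r=0,g=0,b=255]
Eve : java.awt.Color[r=0,g=0,b=255]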

(i) Perform setting up and installing Hadoop in its three operating modes:

 Standalone
 Pseudo-distributed
 Fully distributed

HADOOP INSTALLATION - STANDALONE MODE

After downloading Hadoop, it is configured in standalone mode by default and can be run as a single Java process.
Hadoop requires Java, so Java must be installed first.
 JAVA INSTALLATION
Step 1: Place the software in the system Downloads folder:
i. hadoop-2.7.2.tar.gz
ii. jdk-8u77-linux-i586.tar.gz
Step 2: Extract both archives in the Downloads folder and rename hadoop-2.7.2 as hadoop.
Step 3: Update Ubuntu.
syntax: user@user-ThinkCentre-E73:~$ sudo apt update
Step 4: Install the OpenSSH server.
syntax: user@user-ThinkCentre-E73:~$ sudo apt-get install openssh-server
Step 5: Enter the password in the terminal.

Step 6: Follow the instructions prompted.

Step 7: To check the localhost connection, type ssh localhost in the terminal.
syntax: user@user-ThinkCentre-E73:~$ ssh localhost

Step 8: After connecting to localhost, type exit.

Step 9: Open the .bashrc file.
syntax: user@user-ThinkCentre-E73:~$ sudo gedit .bashrc
Step 10: Add these two lines to the end of the file:

export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
export PATH=$PATH:$JAVA_HOME/bin

Step 11: Now apply the changes to the current running shell.
Either close and reopen the terminal, or run:
user@user-ThinkCentre-E73:~$ source ~/.bashrc
Step 12: Verify the Java path with echo $JAVA_HOME (environment variable names are case-sensitive, so JAVA_HOME, not java_home).
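For example (the path shown assumes the install location used above):

user@user-ThinkCentre-E73:~$ echo $JAVA_HOME
/home/user/Downloads/jdk1.8.0_77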
 HADOOP INSTALLATION

Step 13: Open the hadoop/etc/hadoop/hadoop-env.sh file and add this line at the end
of the file:
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
Save and exit.
Step 14: Open the .bashrc file using the following command:
user@user-ThinkCentre-E73:~$ sudo gedit .bashrc

At the end of the file add these two lines:

export PATH=$PATH:/home/user/Downloads/hadoop/bin
export PATH=$PATH:/home/user/Downloads/hadoop/sbin

Save and exit.
Step 15: Now close the terminal and open it again.

To verify the Hadoop installation, run:

user@user-ThinkCentre-E73:~$ hadoop
This prints the command usage; hadoop version prints the installed version (2.7.2 here).

 EXECUTION OF WORDCOUNT PROGRAM IN STANDALONE MODE

Step 16: Create a directory, change into it, and create two text files f1.txt and f2.txt.
Make sure the two files have some common words.
user@user-ThinkCentre-E73:~$ mkdir input
user@user-ThinkCentre-E73:~$ cd input

f1.txt

user@user-ThinkCentre-E73:~/input$ vi f1.txt
inception
requirements
analysis
design
development
implementation
testing
deployment

f2.txt

user@user-ThinkCentre-E73:~/input$ vi f2.txt
inception
elaboration
elicitation
implementation
inception
development
testing

Step 17: Go back to the home directory.

user@user-ThinkCentre-E73:~/input$ cd ..

Step 18: Now execute the wordcount program using the following command:
hadoop jar /home/user/Downloads/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input op

It displays the output (sorted by word) as:
user@user-ThinkCentre-E73:~$ cat op/*
analysis 1
deployment 1
design 1
development 2
elaboration 1
elicitation 1
implementation 2
inception 3
requirements 1
testing 2
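
The examples jar invoked above ships the classic WordCount job. For reference, a minimal sketch of the same mapper/reducer pair, written against the Hadoop 2.x MapReduce API (essentially the stock Apache example), looks like this:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Mapper: emit (word, 1) for every token in the input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }
    // Reducer: sum the counts emitted for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            result.set(sum);
            context.write(key, result);
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // combine locally to cut shuffle traffic
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}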

HADOOP INSTALLATION - PSEUDO-DISTRIBUTED MODE

 It is a distributed simulation on a single machine.
 This mode is useful for development.
 Each Hadoop daemon (HDFS, YARN, MapReduce, etc.) runs as a separate Java process.

Steps for Hadoop installation in pseudo-distributed mode:

1. Download Java software
2. Download Hadoop software
3. Install Java
4. Install ssh
5. Set up ssh certificate
6. Install Hadoop
7. Configure Hadoop
   a) bashrc
   b) hadoop-env.sh
   c) core-site.xml
   d) hdfs-site.xml
   e) mapred-site.xml.template
8. Format the Hadoop filesystem
9. Start Hadoop
10. Testing / running
11. Stopping Hadoop

Step 1: Downloading Java

Hyperlink:
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
File name: jdk-8u77-linux-i586.tar.gz

Step 2: Downloading the Hadoop software

Hyperlink:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html
(or)
Hyperlink:
http://www.apache.org/dyn/closer.cgi/hadoop/common/
File name: hadoop-2.7.2.tar.gz

Step 3: INSTALLING JAVA
 Extract the JDK archive into the Downloads folder.
 Add the below 2 lines to the .bashrc file:
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
export PATH=$PATH:$JAVA_HOME/bin

Step 4: Installing SSH

At the prompt type the below command to install the ssh server:
user@user-ThinkCentre-E73:~$ sudo apt-get install openssh-server

Step 5: Set up the ssh certificate

To generate a key for secured data transmission, at the $ prompt type:
$ ssh-keygen -t rsa

Copy the generated public key into the authorized keys file:

$ cat /home/user/.ssh/id_rsa.pub >> /home/user/.ssh/authorized_keys
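
After this, logging in to the local machine should no longer prompt for a password (assuming an empty passphrase was chosen at key generation); Hadoop's start/stop scripts rely on this passwordless login:

$ ssh localhost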

Step 6: Install Hadoop

 Extract the Hadoop tar.gz file into the Downloads folder and rename it as hadoop.
 Add the below 3 lines to the .bashrc file:
export HADOOP_HOME=/home/user/Downloads/hadoop
export PATH=$PATH:/home/user/Downloads/hadoop/bin
export PATH=$PATH:/home/user/Downloads/hadoop/sbin
Step 7: Configure Hadoop
The 3 important configuration files,
 core-site.xml
 hdfs-site.xml
 mapred-site.xml
are compulsory for the pseudo-distributed mode of installation, and
 hadoop-env.sh is optional.
NOTE: all these files reside in the hadoop/etc/hadoop folder.

Step 7a) Configure hadoop-env.sh

Add these 2 lines at the end of the file:
export HADOOP_HOME=/home/user/Downloads/hadoop
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
Step 7b) Configure core-site.xml
Open the core-site.xml file and add these lines:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

Step 7c) Configure the hdfs-site.xml file

Open hdfs-site.xml and type the following at the end of the file:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Step 7d) Configure mapred-site.xml

 In the hadoop/etc/hadoop folder, mapred-site.xml.template exists.
 So rename the file as mapred-site.xml.
 Now open the file and type the following at the end of the file:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
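
Note: mapred.job.tracker is the older MRv1 property. On Hadoop 2.x installations that run MapReduce on YARN (as the ResourceManager URL in step 10 suggests), the usual setting is instead:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>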

Step 8: Format the Hadoop filesystem

syntax: $ hadoop namenode -format
Step 9: Start Hadoop
$ start-all.sh

Step 10: Testing / running

Browse the web interfaces for the NameNode and the JobTracker.
By default they are available at:
NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/
ResourceManager - http://localhost:8088/
Step 11: Stopping Hadoop
$ stop-all.sh
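
While the daemons are running, the jps command gives a quick health check; in pseudo-distributed mode a typical listing (process ids will differ from machine to machine) includes:

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager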

 Execution of Wordcount Program in Pseudo-Distributed Mode

1. Create a directory on the local file system.
Command: $ mkdir input
2. Change into that directory.
Command: $ cd input

3. Create a file with some data, i.e. a list of words:

$ vi f1.txt
inception
requirements
analysis
design
development
implementation
testing
deployment

4. Create another file in the same directory with some other words, repeating some
words from the first file:
$ vi f2.txt
inception
elaboration
elicitation
implementation
inception
development
testing

5. Create a directory on the HDFS file system.

Command: $ hadoop fs -mkdir /kits
6. Create another directory on the HDFS file system inside the above created directory.
Command: $ hadoop fs -mkdir /kits/input

7. Transfer the two local files into the HDFS file system.
$ hadoop fs -put /home/user/input/f1.txt /kits/input/f1.txt
$ hadoop fs -put /home/user/input/f2.txt /kits/input/f2.txt

8. Run the wordcount program.

Command: $ hadoop jar /home/user/Downloads/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /kits/input /kits/output

9. Transfer the output directory from HDFS to the local file system.

Command: $ hadoop fs -get /kits/output /home/user/output

10. Display the output.

Command: $ cat /home/user/output/*

analysis 1
deployment 1
design 1
development 2
elaboration 1
elicitation 1
implementation 2
inception 3
requirements 1
testing 2

Week 5:
3. Implement the following file management tasks in Hadoop
 Adding files and directories
 Retrieving files
 Deleting files

i) ADDING FILES AND DIRECTORIES:

a) Creation of directories and sub-directories:
To create a directory in the Hadoop file system (HDFS):
syntax:
$ hadoop fs -mkdir directory-name
example:
qisit@qisit-Vostro-1500:~$ hadoop fs -mkdir /kits
To create a sub-directory in the Hadoop file system (HDFS):

syntax:
$ hadoop fs -mkdir path
example:
qisit@qisit-Vostro-1500:~$ hadoop fs -mkdir /kits/cse
To view the directories created:
syntax:
$ hadoop fs -ls path
example:
$ hadoop fs -ls /kits
Found 1 items
drwxr-xr-x - qisit supergroup 0 2016-06-17 14:45 /kits/cse

Creation of files in HDFS (the Hadoop file system):

Files cannot be created directly in HDFS.
Instead, we create files in the local file system and then move/copy them from the
local file system to HDFS.

Creating files in the local file system:

We use the gedit or vi editor to create a file in the local file system.
Syntax:
gedit filename (or) vi filename
example:
qisit@qisit-Vostro-1500:~$ gedit input.txt
qisit@qisit-Vostro-1500:~$ vi input1.txt

To see the content of a file:


qisit@qisit-Vostro-1500:~$ cat input.txt
welcome to KITS engineering college.
Branches:
CSE
ECE
EEE
CIVIL
MECHANICAL

qisit@qisit-Vostro-1500:~$ cat input1.txt


welcome to bigdata lab.
modules of hadoop:
hdfs
hbase
pig
hive

To view the list of files in the local file system


Syntax:
$ ls
Example: qisit@qisit-Vostro-1500:~$ ls
Desktop dsjava input1 jhansi Music Templates
Documents examples.desktop input1.txt jout Pictures Videos
Downloads input input.txt kits Public
Now, we have to copy/move files from the local file system to HDFS.
There are two ways to do this:

i) copyFromLocal: This command copies files from the local file system to HDFS.
Syntax:
$hadoop fs -copyFromLocal <localsystempath> <HDFSPath>
Example:
hadoop fs -copyFromLocal /home/qisit/input.txt /kits/cse

ii) put: This command is also used to copy files from the local file system to HDFS.

Syntax:
$hadoop fs -put <localsystempath> <HDFSPath>
Example:
hadoop fs -put /home/qisit/input1.txt /kits/cse
To view files in HDFS:
Example: qisit@qisit-Vostro-1500:~$ hadoop fs -ls -R /kits
drwxr-xr-x - qisit supergroup 0 2016-06-17 15:27 /kits/cse
-rw-r--r-- 1 qisit supergroup 0 2016-06-17 15:21 /kits/cse/input.txt
-rw-r--r-- 1 qisit supergroup 78 2016-06-17 15:27 /kits/cse/input1.txt

ii) RETRIEVING FILES FROM HDFS:

There are two ways to retrieve files from HDFS to the local file system (LFS).
copyToLocal: This command copies files from HDFS to the LFS.
Syntax:
hadoop fs -copyToLocal <HDFS Path> <LFS Path>
Example:
qisit@qisit-Vostro-1500:~$ hadoop fs -copyToLocal /kits/cse/input.txt /home/qisit/

qisit@qisit-Vostro-1500:~$ ls

Desktop dsjava input1 jout Pictures Videos


Documents examples.desktop input.txt kits Public
Downloads input jhansi Music Templates
get: This command is also used to copy files from HDFS to the LFS.
Syntax:
hadoop fs -get <HDFS Path> <LFS Path>
Example:
qisit@qisit-Vostro-1500:~$ hadoop fs -get /kits/cse/input1.txt /home/qisit/

qisit@qisit-Vostro-1500:~$ ls
Desktop dsjava input1 jhansi Music Templates
Documents examples.desktop input1.txt jout Pictures Videos
Downloads input input.txt kits Public
iii) DELETING FILES:
To delete a file in HDFS:
Syntax:
$hadoop fs -rm filepath
Example:
qisit@qisit-Vostro-1500:~$ hadoop fs -rm /kits/cse/input.txt
Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /kits/cse/input.txt
To delete directories in HDFS:
Syntax:
$hadoop fs -rmr directorypath
(-rmr is deprecated in Hadoop 2.x; hadoop fs -rm -r is the current form)
Example:
qisit@qisit-Vostro-1500:~$ hadoop fs -rmr /kits

Deletion interval = 0 minutes, Emptier interval = 0 minutes.


Deleted /kits
