BDA - LAB Manual

Ex No:01 Roll no:

Date: Page no:

WORKING WITH HDFS COMMANDS

AIM:
To work with HDFS commands.

PROCEDURE:

Step 1: Open the command prompt and start Hadoop by typing the following command at the
specified path, then press Enter.
C:\hadoop-2.8.0\sbin>start-all
Step 2: Open a new command prompt and start working with HDFS commands.
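Optional check (an addition, not part of the original procedure): before running HDFS commands, the Hadoop daemons can be verified with the JDK's jps tool. Each line of its output is a process id followed by a daemon name; on a working single-node setup the listing should include NameNode, DataNode, ResourceManager and NodeManager.
C:\Users\CSE>jps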

COMMANDS:

Objective: To print the Hadoop version.


Act:
C:\Users\CSE>hadoop version
Hadoop 2.8.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r
91f2b7a13d1e97be65db92ddabc627cc29ac0009
Compiled by jdu on 2017-03-17T04:12Z
Compiled with protoc 2.5.0
From source with checksum 60125541c2b3e266cbf3becc5bda666
This command was run using /C:/hadoop-2.8.0/share/hadoop/common/hadoop-common-2.8.0.jar

Objective: To create directories named 'bigdata' and 'analytics' in HDFS.


Act:
C:\Users\CSE>hdfs dfs -mkdir /bigdata
C:\Users\CSE>hdfs dfs -mkdir /analytics

Objective: To get the list of directories at the root of HDFS.


Act:
C:\Users\CSE>hdfs dfs -ls /
Found 2 items
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /analytics
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /bigdata
Objective: To copy a file from local file system to HDFS.
Act:
C:\Users\CSE>hdfs dfs -put C:/word.txt /bigdata/word.txt

Objective: To copy a file from local file system to HDFS via copyFromLocal command.
Act:
C:\Users\CSE>hdfs dfs -copyFromLocal C:/salary.txt /bigdata/salary.txt
Objective: To get the list of complete directories and files of HDFS.
Act:
C:\Users\CSE>hdfs dfs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /analytics
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:51 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To display the contents of an HDFS file by the name word.txt on console.
Act:
C:\Users\CSE>hdfs dfs -cat /bigdata/word.txt
Hadoop,action,BigData,Export,update,visualization,word,count,file,mongodb,aggregate,hdfs,ins
ert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,planning,administration,update,file
,mongodb,dataset,Hadoop,MapReduce,food,e-health,cancer,diagnosis

Objective: To copy a file from one directory to another on HDFS.


Act:
C:\Users\CSE>hdfs dfs -cp /bigdata/salary.txt /analytics/salary.txt

C:\Users\CSE>hdfs dfs -ls -R /


drwxr-xr-x - CSE supergroup 0 2019-09-02 14:56 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:54 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /bigdata/student.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To move a file from one directory to another on HDFS.


Act:
C:\Users\CSE>hdfs dfs -mv /bigdata/student.txt /analytics/student.txt

C:\Users\CSE>hdfs dfs -ls -R /


drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /analytics/student.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To create a file in HDFS with file size 0 bytes.


Act:
C:\Users\CSE>hdfs dfs -touchz /bigdata/sample.txt

C:\Users\CSE>hdfs dfs -ls -R /


drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /analytics/student.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 15:00 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 0 2019-09-02 15:00 /bigdata/sample.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To count the number of directories, files, and bytes under the specified path.
Act:
C:\Users\CSE>hdfs dfs -count /
3 5 16510 /

Objective: To show the last 1 KB of the file on console or stdout.


Act:
C:\Users\CSE>hdfs dfs -tail /bigdata/salary.txt
H_CLERK 3900 123 50
194 Samuel McCain F SMCCAIN 650.501.3876 01-JUL-06 SH_CLERK 3200
123 50
195 Vance Jones M VJONES 650.501.4876 17-MAR-07 SH_CLERK 2800
123 50
196 Alana Walsh M AWALSH 650.507.9811 24-APR-06 SH_CLERK 3100
124 50
197 Kevin Feeney M KFEENEY 650.507.9822 23-MAY-06 SH_CLERK 3000
124 50
198 Donald OConnell M DOCONNEL 650.507.9833 21-JUN-07
SH_CLERK 2600
124 50
199 Douglas Grant F DGRANT 650.507.9844 13-JAN-08 SH_CLERK 2600
124 50
200 Jennifer Whalen M JWHALEN 515.123.4444 17-SEP-03 AD_ASST
4400 101 10
201 Michael Hartstein M MHARTSTE 515.123.5555 17-FEB-04
MK_MAN 13000 100
20
202 Pat Fay M PFAY 603.123.6666 17-AUG-05 MK_REP 6000 201
20
203 Susan Mavris F SMAVRIS 515.123.7777 07-JUN-02 HR_REP 6500 101
40
204 Hermann Baer M HBAER 515.123.8888 07-JUN-02 PR_REP 10000 101
70
205 Shelley Higgins M SHIGGINS 515.123.8080 07-JUN-02 AC_MGR
12008 101 110
206 William Gietz M WGIETZ 515.123.8181 07-JUN-02 AC_ACCOUNT 8300
205 110
7 NIKHIL RANJAN M NIKSR 515.123.4568 21-SEP-11 AD_VP 20400
100 90

Objective: To show disk usage, in bytes, for all the files present on the path specified.
Act:
C:\Users\CSE>hdfs dfs -du /bigdata
8092 /bigdata/salary.txt
0 /bigdata/sample.txt
258 /bigdata/word.txt

Objective: To empty the trash in the Hadoop file system.


Act:
C:\Users\CSE>hdfs dfs -expunge

Objective: To append the content of a local file to the specified destination file on HDFS. The
destination file will be created if it does not exist. If the local file is specified as -, the input is
read from stdin.
Act:
C:\Users\CSE>hdfs dfs -appendToFile C:/word.txt /analytics/student.txt

C:\Users\CSE>hdfs dfs -cat /analytics/student.txt


1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,bob,33Hadoop,action,Big Data,Export,update,visualization,word,count,file,mongodb,
aggregate,hdfs,insert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,
planning,administration,update,file,mongodb,dataset,Hadoop,MapReduce,food,
e-health,cancer,diagnosis

C:\Users\CSE>hdfs dfs -appendToFile - /analytics/student.txt

NoSQL database used here is MongoDB
(The line above is typed as standard input; on Windows, press Ctrl+Z followed by Enter to end the input.)

C:\Users\CSE>hdfs dfs -cat /analytics/student.txt


1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,bob,33Hadoop,action,Big Data,Export,update,visualization,word,count,file,mongodb,
aggregate,hdfs,insert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,
planning,administration,update,file,mongodb,dataset,Hadoop,MapReduce,food,
e-health,cancer,diagnosis
NoSQL database used here is MongoDB

Objective: To remove an empty directory from HDFS.


Act:
C:\Users\CSE>hdfs dfs -rmdir /lab

Objective: To remove a directory and its contents from HDFS.


Act:
C:\Users\CSE>hdfs dfs -rm -r /analytics
Deleted /analytics

C:\Users\CSE>hdfs dfs -ls -R /


drwxr-xr-x - CSE supergroup 0 2019-09-02 15:00 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 0 2019-09-02 15:00 /bigdata/sample.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To remove a file from HDFS.


Act:
C:\Users\CSE>hdfs dfs -rm /bigdata/sample.txt
Deleted /bigdata/sample.txt
C:\Users\CSE>hdfs dfs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 15:14 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt
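For reference (an addition, not part of the original exercise), the same file operations can be performed from Java through the HDFS FileSystem API that the later MapReduce exercises build on. The following is a minimal sketch, assuming the Hadoop 2.8.0 client libraries are on the classpath and fs.defaultFS points at the running cluster; the paths reuse the ones from the commands above.

package com.app;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class HdfsDemo
{
public static void main(String[] args) throws Exception
{
Configuration conf = new Configuration();   // reads core-site.xml / hdfs-site.xml from the classpath
FileSystem fs = FileSystem.get(conf);       // handle to HDFS
fs.mkdirs(new Path("/bigdata"));            // equivalent of: hdfs dfs -mkdir /bigdata
fs.copyFromLocalFile(new Path("C:/word.txt"), new Path("/bigdata/word.txt"));   // hdfs dfs -put
for (FileStatus status : fs.listStatus(new Path("/bigdata")))   // hdfs dfs -ls /bigdata
{
System.out.println(status.getPath());
}
IOUtils.copyBytes(fs.open(new Path("/bigdata/word.txt")), System.out, 4096, false);   // hdfs dfs -cat
fs.close();
}
}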

RESULT:
Thus working with HDFS commands has been completed successfully.
Ex No:02 Roll no:
Date: Page no:

FILE MANAGEMENT TASKS IN HADOOP

AIM:
To perform the file management tasks in Hadoop.

PROCEDURE:

Step 1: Open the command prompt and start Hadoop by typing the following command at the
specified path, then press Enter.
C:\hadoop-2.8.0\sbin>start-all
Step 2: Open a new command prompt and start performing file management tasks in Hadoop.

COMMANDS:

Objective: To create directories named 'bigdata' and 'analytics' in HDFS.


Act:
C:\Users\CSE>hadoop fs -mkdir /bigdata
C:\Users\CSE>hadoop fs -mkdir /analytics

Objective: To get the list of directories at the root of HDFS.


Act:
C:\Users\CSE>hadoop fs -ls /
Found 2 items
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /analytics
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /bigdata

Objective: To copy a file from local file system to HDFS.


Act:
C:\Users\CSE>hadoop fs -put C:/word.txt /bigdata/word.txt

Objective: To copy a file from local file system to HDFS via copyFromLocal command.
Act:
C:\Users\CSE>hadoop fs -copyFromLocal C:/salary.txt /bigdata/salary.txt

Objective: To get the list of complete directories and files of HDFS.


Act:
C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 13:57 /analytics
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:51 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To display the contents of an HDFS file by the name word.txt on console.
Act:
C:\Users\CSE>hadoop fs -cat /bigdata/word.txt
Hadoop,action,BigData,Export,update,visualization,word,count,file,mongodb,aggregate,hdfs,ins
ert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,planning,administration,update,file
,mongodb,dataset,Hadoop,MapReduce,food,e-health,cancer,diagnosis

Objective: To copy a file from one directory to another on HDFS.


Act:
C:\Users\CSE>hadoop fs -cp /bigdata/salary.txt /analytics/salary.txt

C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:56 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:54 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /bigdata/student.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To move a file from one directory to another on HDFS.


Act:
C:\Users\CSE>hadoop fs -mv /bigdata/student.txt /analytics/student.txt

C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /analytics/student.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To create a file in HDFS with file size 0 bytes.


Act:
C:\Users\CSE>hadoop fs -touchz /bigdata/sample.txt
C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 14:58 /analytics
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:56 /analytics/salary.txt
-rw-r--r-- 1 CSE supergroup 68 2019-09-02 14:54 /analytics/student.txt
drwxr-xr-x - CSE supergroup 0 2019-09-02 15:00 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 0 2019-09-02 15:00 /bigdata/sample.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To show the last 1 KB of the file on console or stdout.


Act:
C:\Users\CSE>hadoop fs -tail /bigdata/salary.txt
197 Kevin Feeney M KFEENEY 650.507.9822 23-MAY-06 SH_CLERK 3000
124 50
198 Donald OConnell M DOCONNEL 650.507.9833 21-JUN-07
SH_CLERK 2600
124 50
199 Douglas Grant F DGRANT 650.507.9844 13-JAN-08 SH_CLERK 2600
124 50
200 Jennifer Whalen M JWHALEN 515.123.4444 17-SEP-03 AD_ASST
4400 101 10
201 Michael Hartstein M MHARTSTE 515.123.5555 17-FEB-04
MK_MAN 13000 100
20
202 Pat Fay M PFAY 603.123.6666 17-AUG-05 MK_REP 6000 201
20
203 Susan Mavris F SMAVRIS 515.123.7777 07-JUN-02 HR_REP 6500 101
40
204 Hermann Baer M HBAER 515.123.8888 07-JUN-02 PR_REP 10000 101
70
205 Shelley Higgins M SHIGGINS 515.123.8080 07-JUN-02 AC_MGR
12008 101 110
206 William Gietz M WGIETZ 515.123.8181 07-JUN-02 AC_ACCOUNT 8300
205 110
7 NIKHIL RANJAN M NIKSR 515.123.4568 21-SEP-11 AD_VP 20400
100 90

Objective: To show disk usage, in bytes, for all the files present on the path specified.
Act:
C:\Users\CSE>hadoop fs -du /bigdata
8092 /bigdata/salary.txt
0 /bigdata/sample.txt
258 /bigdata/word.txt

Objective: To append the content of a local file to the specified destination file on HDFS. The
destination file will be created if it does not exist. If the local file is specified as -, the input is
read from stdin.
Act:
C:\Users\CSE>hadoop fs -appendToFile C:/word.txt /analytics/student.txt

C:\Users\CSE>hadoop fs -cat /analytics/student.txt


1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,bob,33Hadoop,action,Big Data,Export,update,visualization,word,count,file,mongodb,
aggregate,hdfs,insert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,
planning,administration,update,file,mongodb,dataset,Hadoop,MapReduce,food,
e-health,cancer,diagnosis

C:\Users\CSE>hadoop fs -appendToFile - /analytics/student.txt

NoSQL database used here is MongoDB
(The line above is typed as standard input; on Windows, press Ctrl+Z followed by Enter to end the input.)

C:\Users\CSE>hadoop fs -cat /analytics/student.txt


1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,bob,33Hadoop,action,Big Data,Export,update,visualization,word,count,file,mongodb,
aggregate,hdfs,insert,MapReduce,dataset,tableau,tool,action,Hadoop,count,hdfs,
planning,administration,update,file,mongodb,dataset,Hadoop,MapReduce,food,
e-health,cancer,diagnosis
NoSQL database used here is MongoDB

Objective: To remove an empty directory from HDFS.


Act:
C:\Users\CSE>hadoop fs -rmdir /lab

Objective: To remove a directory and its contents from HDFS.


Act:
C:\Users\CSE>hadoop fs -rm -r /analytics
Deleted /analytics

C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 15:00 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 0 2019-09-02 15:00 /bigdata/sample.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

Objective: To remove a file from HDFS.


Act:
C:\Users\CSE>hadoop fs -rm /bigdata/sample.txt
Deleted /bigdata/sample.txt

C:\Users\CSE>hadoop fs -ls -R /
drwxr-xr-x - CSE supergroup 0 2019-09-02 15:14 /bigdata
-rw-r--r-- 1 CSE supergroup 8092 2019-09-02 14:51 /bigdata/salary.txt
-rw-r--r-- 1 CSE supergroup 258 2019-09-02 14:49 /bigdata/word.txt

RESULT:
Thus the file management tasks in Hadoop have been performed successfully.
Ex No:03 Roll no:
Date: Page no:

MAP REDUCE PROGRAM FOR RETRIEVING THE SUM AND AVERAGE SALARY
OF EMPLOYEES IN EVERY DEPARTMENT

AIM:
To write a Map Reduce program for retrieving the sum and average salary of employees
in every department.

PROCEDURE:
Step 1: Open Eclipse and select File -> New -> Java Project -> (Name it -MRDemo) -> Finish.
Step 2: Right Click the project name and select New -> Package (Name it - com.app) -> Finish.
Step 3: Right Click the package name and select New -> Class (Name it - SumAvgSalary).
Step 4: Add the following reference libraries by right clicking the project name and select
properties -> Java Build Path -> Add External JARs
• hadoop-core-0.20.0.jar
• org.apache.commons.cli-1.2.0.jar
Step 5: Write the Map Reduce program in SumAvgSalary.java file and save the file.
Step 6: Make the project jar file by right clicking the project name and select Export -> Select
export destination as Jar File under Java ->click next (Give the JAR file name as
MRDemo.jar) ->Finish.
Step 7: Open the command prompt and start Hadoop by typing the following command at the
specified path, then press Enter.
C:\hadoop-2.8.0\sbin>start-all
Step 8: Move the input file into HDFS by typing the following in the command prompt
hadoop fs -put C:/salary.txt /salary.txt
Step 9: Run the jar file by typing the following in the command prompt
hadoop jar MRDemo.jar com.app.SumAvgSalary /salary.txt /salarysumavg
Step 10: Check the output by typing the following in the command prompt
hadoop fs -ls /salarysumavg
hadoop fs -cat /salarysumavg/part-r-00000

PROGRAM:
SumAvgSalary.java:

package com.app;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.LongWritable;
public class SumAvgSalary
{
public static void main(String[] args) throws IOException, InterruptedException,
ClassNotFoundException
{
Job job=new Job();
job.setJobName("SumAvgSalary");
job.setJarByClass(SumAvgSalary.class);
job.setMapperClass(SumAvgSalaryMap.class);
job.setReducerClass(SumAvgSalaryRed.class);
// declare the map and reduce output types explicitly: the mapper emits FloatWritable values, the reducer emits Text values
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(FloatWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path("/salary.txt"));
FileOutputFormat.setOutputPath(job, new Path("/salarysumavg"));
System.exit(job.waitForCompletion(true)?0:1);
}
public static class SumAvgSalaryMap extends Mapper<LongWritable, Text, Text,
FloatWritable>
{
public void map(LongWritable key, Text empRecord, Context con) throws
IOException, InterruptedException
{
String[] word = empRecord.toString().split("\\t");
String un = word[7];
Float salary = Float.parseFloat(word[8]);
con.write(new Text(un), new FloatWritable(salary));
}
}
public static class SumAvgSalaryRed extends Reducer<Text, FloatWritable, Text, Text>
{
public void reduce(Text key, Iterable<FloatWritable> valueList, Context con)
throws IOException, InterruptedException
{
Float total = (float) 0;
int count=0;
for (FloatWritable var : valueList)
{
total += var.get();
count++;
}
Float avg = total / count;
String out = "Total: " + total + " " + "Average: " + avg;
con.write(key, new Text(out));
}
}
}
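The mapper above assumes that each salary.txt record is tab-separated, with the job id in field 8 (index 7) and the salary in field 9 (index 8); this matches the record layout visible in the -tail output of Ex No:01. The following stand-alone sketch illustrates only that parsing step; the sample line and its tab separators are an assumption made for illustration, not part of the original program.

package com.app;
public class ParseSalaryRecord
{
public static void main(String[] args)
{
// One record in the layout shown by hdfs dfs -tail /bigdata/salary.txt, written with explicit tabs between fields
String record = "194\tSamuel\tMcCain\tF\tSMCCAIN\t650.501.3876\t01-JUL-06\tSH_CLERK\t3200\t123\t50";
String[] word = record.split("\\t");
String un = word[7];                        // grouping key emitted by the mapper
Float salary = Float.parseFloat(word[8]);   // value emitted by the mapper
System.out.println(un + " -> " + salary);   // prints: SH_CLERK -> 3200.0
}
}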

INPUT:

OUTPUT:

RESULT:
Thus the Map Reduce program for retrieving the sum and average salary of employees in
every department has been written, executed and the output is verified successfully.
Ex No:04 Roll no:
Date: Page no:

MAP REDUCE PROGRAM FOR FINDING THE UNIT WISE SALARY

AIM:
To write a Map Reduce program for finding the unit wise salary.

PROCEDURE:

Step 1: Open Eclipse and select File -> New -> Java Project -> (Name it -MRDemo) -> Finish.
Step 2: Right Click the project name and select New -> Package (Name it -com.app) -> Finish.
Step 3: Right Click the package name and select New -> Class (Name it - UnitWiseSalary).
Step 4: Add the following reference libraries by right clicking the project name and select
properties -> Java Build Path -> Add External JARs
• hadoop-core-0.20.0.jar
• org.apache.commons.cli-1.2.0.jar
Step 5: Write the Map Reduce program in UnitWiseSalary.java file and save the file.
Step 6: Make the project jar file by right clicking the project name and select Export -> Select
export destination as Jar File under Java ->click next (Give the JAR file name as
MRDemo.jar) ->Finish.
Step 7: Open the command prompt and start Hadoop by typing the following command at the
specified path, then press Enter.
C:\hadoop-2.8.0\sbin>start-all
Step 8: Move the input file into HDFS by typing the following in the command prompt
hadoop fs -put C:/salary.txt /salary.txt
Step 9: Run the jar file by typing the following in the command prompt
hadoop jar MRDemo.jar com.app.UnitWiseSalary /salary.txt /salarysum
Step 10: Check the output by typing the following in the command prompt
hadoop fs -ls /salarysum
hadoop fs -cat /salarysum/part-r-00000

PROGRAM:
UnitWiseSalary.java:

package com.app;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.LongWritable;

public class UnitWiseSalary


{
public static void main(String[] args) throws IOException, InterruptedException,
ClassNotFoundException
{
Job job=new Job();
job.setJobName("UnitWiseSalary");
job.setJarByClass(UnitWiseSalary.class);
job.setMapperClass(UnitWiseSalaryMap.class);
job.setReducerClass(UnitWiseSalaryRed.class);
// declare the map and reduce output types explicitly: the mapper emits FloatWritable values, the reducer emits Text values
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(FloatWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path("/salary.txt"));
FileOutputFormat.setOutputPath(job, new Path("/salarysum"));
System.exit(job.waitForCompletion(true)?0:1);
}
public static class UnitWiseSalaryMap extends Mapper<LongWritable, Text, Text,
FloatWritable>
{
public void map(LongWritable key, Text empRecord, Context con)
throws IOException, InterruptedException
{
String[] word = empRecord.toString().split("\\t");
String un = word[7];
Float salary = Float.parseFloat(word[8]);
con.write(new Text(un), new FloatWritable(salary));
}
}
public static class UnitWiseSalaryRed extends Reducer<Text, FloatWritable, Text, Text>
{
public void reduce(Text key, Iterable<FloatWritable> valueList, Context con)
throws IOException, InterruptedException
{
Float total = (float) 0;
for (FloatWritable var : valueList)
{
total += var.get();
}
String out = "Total: " + total;
con.write(key, new Text(out));
}
}
}
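Note that Step 9 passes /salary.txt and /salarysum on the command line, while the driver above (like the one in Ex No:03) hard-codes those paths, so the extra arguments are ignored. A small optional variant, shown here as an assumption rather than as part of the original program, would honour the command-line arguments by replacing the two path lines in main() with:

FileInputFormat.addInputPath(job, new Path(args[0]));     // first argument: HDFS input file
FileOutputFormat.setOutputPath(job, new Path(args[1]));   // second argument: HDFS output directory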

INPUT:

OUTPUT:

RESULT:
Thus the Map Reduce program for finding the unit wise salary has been written, executed
and the output is verified successfully.
Ex No:05 Roll no:
Date: Page no:

CRUD (CREATE, READ, UPDATE AND DELETE) OPERATIONS IN MONGODB

AIM:
To demonstrate CRUD (Create, Read, Update and Delete) operations in MongoDB.

PROCEDURE:
Step 1: Open a command prompt and navigate to the bin directory present in the MongoDB
installation folder. Installation folder is C:\Program Files\MongoDB\Server\4.2.
Step 2: To get into the MongoDB shell, type the command “mongo.exe” in the command
prompt.

CRUD OPERATIONS:
CREATE:
To create a collection in a database, use db.createCollection() method. Collection can
also be created automatically, when some document is inserted.
>db.createCollection("Students")
{ "ok" : 1 }

INSERT:
To insert a document into a collection, use insert() method.
>db.Students.insert({_id:1,StudName:"Michelle Jacintha",Grade:"VII",Hobbies:"Internet
Surfing"});
WriteResult({ "nInserted" : 1 })
>db.Students.insert({_id:2,StudName:"Mabel Mathews",Grade:"VII",Hobbies:"Baseball"});
WriteResult({ "nInserted" : 1 })
>db.Students.insert({_id:3,StudName:"Aryan David",Grade:"VII",Hobbies:"Skatting"});
WriteResult({ "nInserted" : 1 })
>db.Students.insert({_id:4,StudName:"Herseh Gibbs",Grade:"VII",Hobbies:"Graffiti"});
WriteResult({ "nInserted" : 1 })

READ:
To read or display the documents from a collection, use find() method. The pretty()
method is used to format the result.
>db.Students.find().pretty();
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Skatting"
}
{
"_id" : 4,
"StudName" : "Herseh Gibbs",
"Grade" : "VII",
"Hobbies" : "Graffiti"
}

INSERT USING UPDATE:


Documents can also be inserted into a collection using update() method with upsert set to
true. Upsert means Update else insert.
>db.Students.update({_id:3,StudName:"Aryan David",Grade:"VII"},{$set:{Hobbies:"Chess"}},
{upsert:true});
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

READ AFTER INSERT USING UPDATE:


>db.Students.find({_id:3});
{ "_id" : 3, "StudName" : "Aryan David", "Grade" : "VII", "Hobbies" : "Chess" }

INSERT USING SAVE:


Documents can also be inserted into a collection using save() method.
>db.Students.save({_id:5,StudName:"Vamsi Bapat",Grade:"VII",Hobbies:"Cricket"})
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : 5 })

READ AFTER INSERT USING SAVE:


>db.Students.find({_id:5});
{ "_id" : 5, "StudName" : "Vamsi Bapat", "Grade" : "VII", "Hobbies" : "Cricket" }

READ:
>db.Students.find().pretty();
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : 4,
"StudName" : "Herseh Gibbs",
"Grade" : "VII",
"Hobbies" : "Graffiti"
}
{
"_id" : 5,
"StudName" : "Vamsi Bapat",
"Grade" : "VII",
"Hobbies" : "Cricket"
}

READ:
To find a document wherein the “StudName” has the value “Aryan David”.
>db.Students.find({StudName:"Aryan David"}).pretty();
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}

READ:
To display only the StudName from all the documents of the Students collection. The
identifier “_id” can be suppressed by specifying it as 0.
>db.Students.find({},{StudName:1,_id:0});
{ "StudName" : "Michelle Jacintha" }
{ "StudName" : "Mabel Mathews" }
{ "StudName" : "Aryan David" }
{ "StudName" : "Herseh Gibbs" }
{ "StudName" : "Vamsi Bapat" }

READ:
To find those documents where the Grade is set to “VII”. The relational operator $eq can
be used.
>db.Students.find({Grade:{$eq:'VII'}}).pretty();
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : 4,
"StudName" : "Herseh Gibbs",
"Grade" : "VII",
"Hobbies" : "Graffiti"
}
{
"_id" : 5,
"StudName" : "Vamsi Bapat",
"Grade" : "VII",
"Hobbies" : "Cricket"
}

READ:
To find those documents where Hobbies is not equal to “Baseball”. The relational
operator $ne can be used.
>db.Students.find({Hobbies:{$ne:'Baseball'}}).pretty();
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : 4,
"StudName" : "Herseh Gibbs",
"Grade" : "VII",
"Hobbies" : "Graffiti"
}
{
"_id" : 5,
"StudName" : "Vamsi Bapat",
"Grade" : "VII",
"Hobbies" : "Cricket"
}
READ:
To find those documents where Hobbies is set either to “Chess” or to “Skating”.
The operator $in can be used.
>db.Students.find({Hobbies:{$in:['Chess','Skating']}});
{ "_id" : 3, "StudName" : "Aryan David", "Grade" : "VII", "Hobbies" : "Chess" }

READ:
To find those documents where Hobbies is set neither to “Chess” nor to “Skating”.
The operator $nin can be used.
>db.Students.find({Hobbies:{$nin:['Chess','Skating']}});
{ "_id" : 1, "StudName" : "Michelle Jacintha", "Grade" : "VII", "Hobbies" : "Internet Surfing" }
{ "_id" : 2, "StudName" : "Mabel Mathews", "Grade" : "VII", "Hobbies" : "Baseball" }
{ "_id" : 4, "StudName" : "Herseh Gibbs", "Grade" : "VII", "Hobbies" : "Graffiti" }
{ "_id" : 5, "StudName" : "Vamsi Bapat", "Grade" : "VII", "Hobbies" : "Cricket" }

DELETE:
To delete a document where the “_id” is set to 4.
>db.Students.remove({_id:4});
WriteResult({ "nRemoved" : 1 })

READ AFTER DELETE:


>db.Students.find().pretty();
{
"_id" : 1,
"StudName" : "Michelle Jacintha",
"Grade" : "VII",
"Hobbies" : "Internet Surfing"
}
{
"_id" : 2,
"StudName" : "Mabel Mathews",
"Grade" : "VII",
"Hobbies" : "Baseball"
}
{
"_id" : 3,
"StudName" : "Aryan David",
"Grade" : "VII",
"Hobbies" : "Chess"
}
{
"_id" : 5,
"StudName" : "Vamsi Bapat",
"Grade" : "VII",
"Hobbies" : "Cricket"
}

DELETE:
To delete all documents from the collection “Students”.
>db.Students.remove({});
WriteResult({ "nRemoved" : 4 })

READ AFTER DELETE:


>db.Students.find().pretty();
>

RESULT:
Thus the CRUD (Create, Read, Update and Delete) operations in MongoDB have been
executed and the output is verified successfully.
Ex No:06 Roll no:
Date: Page no:

IMPORT, EXPORT AND AGGREGATION IN MONGODB


AIM:
To demonstrate import, export and aggregation in MongoDB.

PROCEDURE:
Step 1: Pick any public dataset from the site www.kdnuggets.com with at least two numeric
columns, convert it into CSV format, name the file sample.csv and save it in the C drive.
Step 2: Open a command prompt and navigate to the MongoDB installation folder. Installation
folder is C:\Program Files\MongoDB\Server\4.2.
Step 3: Use mongoimport to import data from the CSV file into the MongoDB collection
“sample” in the test database.
mongoimport --db test --collection sample --type csv --headerline --file C:\sample.csv
Step 4: Navigate to the bin directory present in the MongoDB installation folder specified above.
Step 5: To get into the MongoDB shell, type the command mongo.exe in the command prompt.
Step 6: Compute the average of the values in the second numeric column using aggregate
operations in MongoDB.
Step 7: Type exit and come out of the MongoDB shell.
Step 8: Use mongoexport to export the documents of the sample collection in the test database
into a CSV format file.

IMPORT IN MONGODB:
At the command prompt, execute the following command:
mongoimport --db test --collection sample --type csv --headerline --file C:\sample.csv
INPUT:
OUTPUT:

AGGREGATION IN MONGODB:
The following pipeline groups the documents of the sample collection by _unit_id and computes
the average of the choose_one field for each group:

db.sample.aggregate({$group:{_id:"$_unit_id",Tot_choose_one:{$avg:"$choose_one"}}});

OUTPUT:
EXPORT IN MONGODB:
At the command prompt, execute the following command (the --fieldFile option expects a file
listing the field names to export, one per line):
mongoexport --db test --collection sample --fieldFile C:\sample.xlsx --out d:\output.txt

OUTPUT:

RESULT:
Thus the import, export and aggregation in MongoDB have been executed and the output
is verified successfully.
Ex No:07 Roll no:
Date: Page no:

TOP N AND BOTTOM N VIEW ON THE WORKSHEET USING TABLEAU VISUALIZATION TOOL

AIM:
To demonstrate the Top N and Bottom N view on the worksheet for a dataset using
Tableau visualization tool.

PROCEDURE:

Step 1: Open Tableau visualization tool by clicking on the Tableau icon in the desktop.

Step 2: Load the data into Tableau by choosing the Sample-Superstore.xls file present under
the Saved Data Sources menu on the home page. A worksheet opens where the chart is created
and displayed.
Step 3: Drag the dimension Product Name under Product to the Rows shelf and the measure
Profit to the Columns shelf. Choose the chart type as Bar in the Marks section. Tableau displays
the resulting chart.
Step 4: Right click on the field Product Name and select sort. Choose Sort by as field, Sort Order
as Descending.
Step 5: Right click on Top Customers under Parameters and click Edit. Give the name as Top
and Bottom Products, change the current value to 10 and click OK.
Step 6: Right click on the field Product Name and select Filter. In the General tab, choose the
Use all radio option. In the Top tab, choose the second radio option, By Field. In the second
drop-down, choose the Top and Bottom Products parameter, click Apply and then OK. The top
10 products by profit are now shown in the chart.
Step 7: Right click on the filter Product Name under Filter section, select Create Set. Name the
set as Top 10 Products. In the Top tab, choose the second radio option By Field. In the first drop-
down, choose Top and in second drop-down, choose the Top and Bottom Products, click OK.
Step 8: Right click on the filter Product Name under Filter section, select Create Set. Name the
set as Bottom 10 Products. In the Top tab, choose the second radio option By Field. In the first
drop-down, choose Bottom and in second drop-down, choose the Top and Bottom Products,
click OK.
Step 9: Right click on the Top 10 Products set under Sets and select Create Combined Set. Name
the set as Top 10 and Bottom 10 Profit Products. In the second drop-down, select Bottom 10
Products and click OK.
Step 10: Right click on the Product Name in the Filters section and select remove.
Step 11: Drag Top 10 and Bottom 10 Profit Products Set from Sets section to the Filters section.
Step 12: Now click on the presentation icon or press F7 to view the Top 10 and Bottom 10
Products based on the Profit in a single view.

RESULT:
Thus the Top N and Bottom N view on the worksheet for a dataset using Tableau
visualization tool has been demonstrated and the output is verified successfully.
Ex No:08 Roll no:
Date: Page no:

MAP REDUCE PROGRAM FOR WORD COUNTER

AIM:
To write a Map Reduce program for counting the occurrences of similar words in a file.

PROCEDURE:
Step 1: Open Eclipse and select File -> New -> Java Project -> (Name it -MRDemo) -> Finish.
Step 2: Right Click the project name and select New -> Package (Name it - com.app) -> Finish.
Step 3: Right Click the package name and select New -> Class (Name it - WordCounter).
Step 4: Add the following reference libraries by right clicking the project name and select
properties -> Java Build Path -> Add External JARs
• hadoop-core-0.20.0.jar
• org.apache.commons.cli-1.2.0.jar
Step 5: Write the Map Reduce program in WordCounter.java file and save the file.
Step 6: Make the project jar file by right clicking the project name and select Export -> Select
export destination as Jar File under Java ->click next (Give the JAR file name as
MRDemo.jar) ->Finish.
Step 7: Open the command prompt and start Hadoop by typing the following command at the
specified path, then press Enter.
C:\hadoop-2.8.0\sbin>start-all
Step 8: Move the input file into HDFS by typing the following in the command prompt
hadoop fs -put C:/word.txt /word.txt
Step 9: Run the jar file by typing the following in the command prompt
hadoop jar MRDemo.jar com.app.WordCounter /word.txt /wordcount
Step 10: Check the output by typing the following in the command prompt
hadoop fs -ls /wordcount
hadoop fs -cat /wordcount/part-r-00000

PROGRAM:
WordCounter.java:

package com.app;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.LongWritable;
public class WordCounter
{
public static void main(String [] args) throws IOException, InterruptedException,
ClassNotFoundException
{
Job job=new Job();
job.setJobName("WordCounter");
job.setJarByClass(WordCounter.class);
job.setMapperClass(WordCounterMap.class);
job.setReducerClass(WordCounterRed.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path("/word.txt"));
FileOutputFormat.setOutputPath(job, new Path("/wordcount"));
System.exit(job.waitForCompletion(true)?0:1);
}
public static class WordCounterMap extends Mapper<LongWritable, Text, Text,
IntWritable>
{
@Override
public void map(LongWritable key, Text value, Context context) throws
IOException, InterruptedException
{
String[] words=value.toString().split(",");
for(String word: words )
{
context.write(new Text(word), new IntWritable(1));
}
}
}
public static class WordCounterRed extends Reducer<Text, IntWritable, Text,
IntWritable>
{
@Override
public void reduce(Text word, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException
{
Integer count=0;
for (IntWritable val : values)
{
count += val.get();
}
context.write(word, new IntWritable(count));
}
}
}
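As an optional cross-check (an addition, not part of the original procedure), the same counting logic can be tried locally on a comma-separated line before submitting the job; the sample line below is a shortened, illustrative subset of the word.txt contents shown in Ex No:01.

package com.app;
import java.util.HashMap;
import java.util.Map;
public class LocalWordCount
{
public static void main(String[] args)
{
String line = "Hadoop,action,BigData,action,Hadoop,count,Hadoop";   // illustrative subset of word.txt
Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : line.split(","))
{
Integer old = counts.get(word);
counts.put(word, old == null ? 1 : old + 1);   // same "sum of ones per word" idea as the reducer
}
System.out.println(counts);   // e.g. {BigData=1, action=2, count=1, Hadoop=3}
}
}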

INPUT:

OUTPUT:

RESULT:
Thus the Map Reduce program for finding the occurrences of similar words in a file has
been written, executed and the output is verified successfully.
