
Ujjain Engineering College, Ujjain

Indore Road, Ujjain, Madhya Pradesh 456010

Department of Computer Science and Engineering


Big Data Lab
LAB FILE
Session 2023-24
B.Tech VII SEMESTER

Name of Faculty: Rekha Singh
Name of Student: Ashwini Soni
Enrollment No.: 0701CS201009
UJJAIN ENGINEERING COLLEGE, UJJAIN
Indore Road, Ujjain Madhya Pradesh, 456010
Department of Computer Science and Engineering

Subject : Big Data Lab


Semester : VII Semester
Name of Student : Ashwini Soni
Enrollment no. : 0701CS201009
List of Experiments

S.No.  Aim  Date of Experiment  Date of Submission  Signature  Remark

1. Install Hadoop 3 for Single Node
2. HDFS Basic Commands
   Write HDFS commands to perform the following operations.
   1. Create a directory in HDFS at given path(s).
   2. List the contents of a directory.
   3. Upload and download a file in HDFS.
   4. See contents of a file.
   5. Copy a file from source to destination.
   6. Copy a file from/to local file system to HDFS.
   7. Move a file from source to destination.
   8. Remove a file or directory in HDFS.
   9. Display last few lines of a file.
3. MapReduce
   Write a program to count words with its frequency (write custom Mapper, Reducer and Driver classes).
4. Hive Installation step by step.
5. Hive basic queries –
   1. Write a query to count words with its frequency using Hive.
   2. Create a managed table Student with columns roll, name, address, city, state and load data into it.
   3. Create a managed table Result with columns roll, marks and load data into it.
EXPERIMENT-1
AIM :- Install Hadoop 3 for Single Node.

Prerequisites:
1. JAVA - Java JDK (installed)

2. HADOOP - Hadoop package (downloaded)

Step 1: Verify that Java is installed


javac -version
Step 2: Extract Hadoop at C:\Hadoop

Step 3: Setting up the HADOOP_HOME variable


Use the Windows environment variable settings to set the Hadoop path.

Step 4: Set JAVA_HOME variable


Use the Windows environment variable settings to set the Java path.
Step 5: Set Hadoop and Java bin directory path

Step 6: Hadoop Configuration :


For Hadoop configuration we need to modify the five files listed below and create two folders:
1. core-site.xml
2. mapred-site.xml
3. hdfs-site.xml
4. yarn-site.xml
5. hadoop-env.cmd
6. Create two folders, datanode and namenode
Step 6.1: core-site.xml configuration
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Step 6.2: mapred-site.xml configuration
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Step 6.3: hdfs-site.xml configuration
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>C:\hadoop-2.8.0\data\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>C:\hadoop-2.8.0\data\datanode</value>
</property>
</configuration>
Step 6.4: yarn-site.xml configuration
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
Step 6.5: hadoop-env.cmd configuration
set "JAVA_HOME=C:\Java" (where C:\Java is the path to the installed JDK folder, e.g. jdk1.8.0)
Step 6.6: Create datanode and namenode folders
1. Create folder "data" under "C:\Hadoop-2.8.0"
2. Create folder "datanode" under "C:\Hadoop-2.8.0\data"
3. Create folder "namenode" under "C:\Hadoop-2.8.0\data"
Step 7: Format the namenode folder

Open a command window (cmd) and type the command "hdfs namenode -format"

Step 8: Testing the setup


Open a command window (cmd) and type the command "start-all.cmd"

Step 8.1: Testing the setup:


Ensure that the NameNode, DataNode, ResourceManager and NodeManager daemons are running.
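A quick way to verify this (assuming the JDK's bin directory is on the PATH) is to run the jps command in a new cmd window; it should list NameNode, DataNode, ResourceManager and NodeManager:

jps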
Step 9: Open: http://localhost:8088 (the YARN ResourceManager web UI)
Step 10:
Open: http://localhost:50070 (the NameNode web UI; note that on Hadoop 3.x the default port is 9870)
EXPERIMENT-2
AIM :- HDFS Basic Commands
Write HDFS Commands to perform following operations.
HDFS is the primary or major component of the Hadoop ecosystem which is responsible for
storing large data sets of structured or unstructured data across various nodes and thereby
maintaining the metadata in the form of log files. To use the HDFS commands, first you need
to start the Hadoop services using the following command: sbin/start-all.sh

1. Create a directory in HDFS at given path(s).


mkdir: To create a directory. In Hadoop DFS there is no home directory by default, so let's
first create one.
Syntax:
hdfs dfs -mkdir /path
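For example (the directory name /geeks is illustrative):
hdfs dfs -mkdir /geeks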

2. List the contents of a directory.


ls: This command is used to list all the files in a directory. Use -ls -R for a recursive
listing; it is useful when we want the hierarchy of a folder.
Syntax:
hdfs dfs -ls /path
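For example, to list the root of HDFS:
hdfs dfs -ls /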

3. Upload and download a file in HDFS.


put: To copy files/folders from local file system to hdfs store. This is the most important
command. Local filesystem means the files present on the OS.
Syntax:
hdfs dfs -put <localsrc> <dest>
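For example (the local file name and target directory are illustrative):
hdfs dfs -put data.txt /geeks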
4. See contents of a file
cat: To print file contents.
Syntax:
hadoop fs -cat /path_to_file_in_hdfs
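For example (the path is illustrative):
hadoop fs -cat /geeks/data.txt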

5. Copy a file from source to destination


cp: This command is used to copy files within HDFS. Let's copy the
folder geeks to geeks_copied.
Syntax:
hadoop fs -cp <src> <dest>
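For example (both paths are illustrative):
hadoop fs -cp /geeks/data.txt /geeks_copied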

6. Copy a file from/To Local file system to HDFS


copyFromLocal (or put): To copy files/folders from the local file system to HDFS.
copyToLocal (or get): To copy files/folders from HDFS to the local file system.
Syntax:
hdfs dfs -copyFromLocal <localsrc> <hdfs dest>
hadoop fs -copyToLocal <hdfs source> <localdst>
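For example (the HDFS and local paths are illustrative):
hdfs dfs -copyFromLocal data.txt /geeks
hadoop fs -copyToLocal /geeks/data.txt /home/ashwini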

7. Move file from source to destination.


mv: This command is used to move a file from a source to a destination within HDFS; unlike
cp, the source is removed after the move.
Syntax:
hdfs dfs -mv <src> <dest>
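For example (both paths are illustrative):
hdfs dfs -mv /geeks/data.txt /geeks_copied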
8. Remove a file or directory in HDFS.
rm: This command is similar to the UNIX rm command, and it is used for removing a file
from the HDFS file system. The option -rm -r can be used to delete files and directories recursively.
Syntax:
hdfs dfs -rm -r /filename
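For example, to remove the illustrative directory created earlier along with its contents:
hdfs dfs -rm -r /geeks_copied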

9. Display last few lines of a file.


tail: The Hadoop fs shell tail command shows the last 1 KB of a file on the console or stdout.
For example, we can use it to display the last 1 KB of a file 'test' present in a directory on the
HDFS filesystem.
The -f option shows appended data as the file grows.
Syntax:

hdfs dfs -tail /file
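For example (the path is illustrative):
hdfs dfs -tail /geeks/data.txt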


EXPERIMENT-3
AIM :- MapReduce
Write a program to count words with its frequency (Write custom
Mapper,Reducer and Driver classes)
In the MapReduce word count example, we find out the frequency of each word. Here, the role of
the Mapper is to emit a (word, 1) key-value pair for every word in the input, and the role of the
Reducer is to aggregate the values for each key by summing them. So, everything is represented in the form of key-value pairs.

Steps to execute MapReduce word count example:

• Create a text file in your local machine and write some text into it.

$ nano data.txt

• Check the text written in the data.txt file.

$ cat data.txt
In this example, we find out the frequency of each word exists in this text file.

• Create a directory in HDFS where the text file will be kept.

$ hdfs dfs -mkdir /test

• Upload the data.txt file on HDFS in the specific directory.

$ hdfs dfs -put /home/codegyani/data.txt /test

1.Driver Code (WCDriver.java)


import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class WCDriver extends Configured implements Tool {
public int run(String args[]) throws IOException
{
if (args.length < 2)
{
System.out.println("Please give valid inputs");
return -1;
}
JobConf conf = new JobConf(WCDriver.class);
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
conf.setMapperClass(WCMapper.class);
conf.setReducerClass(WCReducer.class);
conf.setMapOutputKeyClass(Text.class);
conf.setMapOutputValueClass(IntWritable.class);
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
JobClient.runJob(conf);
return 0;
}
// Main Method
public static void main(String args[]) throws Exception
{
int exitCode = ToolRunner.run(new WCDriver(), args);
System.out.println(exitCode);
}
}

2.Mapper Code(WCMapper.java)
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class WCMapper extends MapReduceBase implements
Mapper<LongWritable,
Text, Text, IntWritable> {
// Map function
public void map(LongWritable key, Text value, OutputCollector<Text,
IntWritable> output, Reporter rep) throws IOException {
String line = value.toString();
// Splitting the line on spaces
for (String word : line.split(" ")) {
if (word.length() > 0) {
output.collect(new Text(word), new IntWritable(1));
}}
}}

3. Reducer Code (WCReducer.java)
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
public class WCReducer extends MapReduceBase implements Reducer<Text,
IntWritable, Text, IntWritable> {
// Reduce function
public void reduce(Text key, Iterator<IntWritable> value,
OutputCollector<Text, IntWritable> output,
Reporter rep) throws IOException
{
int count = 0;
// Counting the frequency of each word
while (value.hasNext())
{
IntWritable i = value.next();
count += i.get();
}
output.collect(key, new IntWritable(count));
}}
Create the jar file of this program and name it countworddemo.jar.
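One possible way to build the jar (a minimal sketch; the class output directory is illustrative, and it assumes the hadoop command is on the PATH so that hadoop classpath can supply the required libraries):

$ javac -classpath "$(hadoop classpath)" -d wc_classes WCDriver.java WCMapper.java WCReducer.java
$ jar -cvf countworddemo.jar -C wc_classes .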
Run the jar file: hadoop jar /home/codegyani/countworddemo.jar WCDriver /test/data.txt
/r_output
The output is stored in /r_output/part-00000

Now execute the command to see the output.


hdfs dfs -cat /r_output/part-00000
EXPERIMENT-4
AIM :- Hive Installation step by step.
1. Prerequisites
1. Hardware Requirement
* RAM — Min. 8GB, if you have SSD in your system then 4GB RAM would also
work.
* CPU — Min. Quad core, with at least 1.80GHz
2. JRE 1.8 — Offline installer for JRE
3. Java Development Kit — 1.8
4. Unzipping software such as 7-Zip or WinRAR
* I will be using 64-bit Windows for the process; please check and download
the version supported by your system (x86 or x64) for all the software.
5. Hadoop
* I am using Hadoop-2.9.2, you can also use any other STABLE version for
Hadoop.
6. MySQL (with MySQL Workbench)
7. Download Hive zip
* I am using Hive-3.1.2, you can also use any other STABLE version for
Hive.

Fig 1:- Download Hive-3.1.2


2. Unzip and Install Hive
After Downloading the Hive, we need to Unzip the apache-hive-3.1.2-bin.tar.gz file.

Fig 2:- Extracting Hive Step-1


Once extracted, we would get a new file apache-hive-3.1.2-bin.tar
Now, once again we need to extract this tar file.

Fig 3:- Extracting Hive Step-2


• Now we can organize our Hive installation: create a folder and move the
final extracted files into it, e.g.:-

Fig 4:- Hive Directory


• Please note while creating folders, DO NOT ADD SPACES IN BETWEEN THE
FOLDER NAME.(it can cause issues later)
• I have placed my Hive in D: drive you can use C: or any other drive also.
3. Setting Up Environment Variables
Another important step in setting up a work environment is to set your Systems
environment variable.
To edit environment variables, go to Control Panel > System > click on the “Advanced
system settings” link
Alternatively, We can Right click on This PC icon and click on Properties and click on
the “Advanced system settings” link

Fig. 5:- Path for Environment Variable

Fig. 6:- Advanced System Settings Screen


3.1 Setting HIVE_HOME
• Open environment Variable and click on “New” in “User Variable”
Fig. 7:- Adding Environment Variable
• On clicking “New”, we get below screen.

Fig. 8:- Adding HIVE_HOME


• Now as shown, add HIVE_HOME in variable name and path of Hive in Variable
Value.
• Click OK and we are half done with setting HIVE_HOME.
3.2 Setting Path Variable
• Last step in setting Environment variable is setting Path in System Variable.

Fig. 9:- Setting Path Variable


• Select Path variable in the system variables and click on “Edit”.
Fig. 10:- Adding Path
• Now we need to add these paths to Path Variable :-
* %HIVE_HOME%\bin
• Click OK and OK. & we are done with Setting Environment Variables.
3.3 Verify the Paths
• Now we need to verify that what we have done is correct and reflecting.
• Open a NEW Command Window
• Run following commands
echo %HIVE_HOME%
4. Editing Hive
Once we have configured the environment variables next step is to configure Hive. It has
7 parts:-
4.1 Replacing bins
First step in configuring the hive is to download and replace the bin folder.
* Go to this GitHub Repo and download the bin folder as a zip.
* Extract the zip and replace all the files present under bin folder to
%HIVE_HOME%\bin
Note:- If you are using different version of HIVE then please search for its respective bin
folder and download it.
4.2 Creating File Hive-site.xml
Now we need to create the Hive-site.xml file in hive for configuring it :-
(We can find these files in Hive -> conf -> hive-default.xml.template)
We need to copy the hive-default.xml.template file and paste it in the same location and
rename it to hive-site.xml. This will act as our main Config file for Hive.

Fig. 11:- Creating Hive-site.xml


4.3 Editing Configuration Files
4.3.1 Editing the Properties
Now open the newly created hive-site.xml and edit the following properties:
<property>
<name>hive.metastore.uris</name>
<value>thrift://<Your IP Address>:9083</value>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value><Your drive Folder>/${hive.session.id}_resources</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/mydir</value>
</property>
Replace the value for <Your IP Address> with the IP address of your system and
replace <Your drive Folder> with the Hive folder path.
4.3.2 Removing Special Characters
This is a short step: we need to remove all the &#8; special characters present in the hive-
site.xml file (they make the XML invalid).
4.3.3 Adding few More Properties
Now we need to add the following properties as it is in the hive-site.xml File.
<property>
<name>hive.querylog.location</name>
<value>$HIVE_HOME/iotmp</value>
<description>Location of Hive run time structured log file</description>
</property><property>
<name>hive.exec.local.scratchdir</name>
<value>$HIVE_HOME/iotmp</value>
<description>Local scratch space for Hive jobs</description>
</property><property>
<name>hive.downloaded.resources.dir</name>
<value>$HIVE_HOME/iotmp</value>
<description>Temporary local directory for added resources in the remote file
system.</description>
</property>
Great..!!! We are almost done with the Hive part, for configuring MySQL database as
Metastore for Hive, we need to follow below steps:-
4.4 Creating Hive User in MySQL
The next important step in configuring Hive is to create users for MySQL.
These Users are used for connecting Hive to MySQL Database for reading and writing
data from it.
Note:- You can skip this step if you created the hive user during SQOOP installation.
• Firstly, we need to open the MySQL Workbench and open the workspace(default
or any specific, if you want). We will be using the default workspace only for
now.
Fig 12:- Open MySQL Workbench
• Now Open the Administration option in the Workspace and select Users and
privileges option under Management.

Fig 13:- Opening Users and Privileges


• Now select the Add Account option and create a new user with Login
Name as hive, Limit to Host Mapping as localhost and a Password of
your choice.

Fig 14:- Creating Hive User


• Now we have to define the roles for this user under Administrative Roles and
select DBManager ,DBDesigner and BackupAdmin Roles

Fig 15:- Assigning Roles


• Now we need to grant schema privileges for the user by using Add Entry option
and selecting the schemas we need access to.

Fig 16:- Schema Privileges


I am using schema matching pattern as %_bigdata% for all my bigdata related schemas.
You can use other 2 options also.
After clicking OK we need to select All the privileges for this schema.
Fig 17:- Select All privileges in the schema
• Click Apply and we are done with the creating Hive user.
4.5 Granting permission to Users
Once we have created the user hive the next step is to Grant All privileges to this user for
all the Tables in the previously selected Schema.
• Open the MySQL cmd Window. We can open it by using the Window’s Search
bar.
Fig 18:- MySQL cmd
• Upon opening it will ask for your root user password(created while setting up
MySQL).
• Now we need to run the below command in the cmd window.
grant all privileges on test_bigdata.* to 'hive'@'localhost';
where test_bigdata is your schema name and 'hive'@'localhost' is the user name
@ host name.
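Optionally, you can reload the grant tables afterwards so that the new privileges take effect immediately (a standard MySQL command, not specific to Hive):
flush privileges;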
4.6 Creating Metastore
Now we need to create our own metastore for Hive in MySQL.
Firstly, we need to create a database for the metastore in MySQL, or we can use the one
used in the previous step (test_bigdata in my case).
Now navigate to the path below:
hive -> scripts -> metastore -> upgrade -> mysql and execute the file hive-schema-
3.1.0.mysql.sql in MySQL against your database.
Note:- If you are using a different database, select the corresponding folder inside the upgrade
folder and execute its hive-schema file.
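For example, from the MySQL command line (the path is illustrative and depends on where Hive was extracted; test_bigdata is the metastore database used earlier):
mysql> use test_bigdata;
mysql> source D:/hive/apache-hive-3.1.2-bin/scripts/metastore/upgrade/mysql/hive-schema-3.1.0.mysql.sql;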
4.7 Adding a Few More Properties (Metastore related Properties)
Finally, we need to open our hive-site.xml file once again and make some changes there.
These are related to the Hive metastore, which is why they were not added at the start, so as to
keep the different sets of properties distinct.
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/<Your
Database>?createDatabaseIfNotExist=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL
flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://localhost:9000/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value><Hive Password></value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.schema.autoCreateSchema</name>
<value>true</value>
</property>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>datanucleus.schema.validateTables</name>
<value>true</value>
<description>validates existing schema against code. turn this on if you want to verify
existing schema</description>
</property>
Replace the value for <Hive Password> with the hive user password that we created in
MySQL user creation. And <Your Database> with the database that we used for
metastore in MySQL.
5. Starting Hive
5.1 Starting Hadoop
Now we need to open a new Command Prompt (remember to run it as administrator to
avoid permission issues) and execute the command below:
start-all.cmd
Fig. 19:- start-all.cmd
All the 4 daemons should be UP and running.
5.2 Starting Hive Metastore
Open a cmd window, run below command to start the Hive metastore.
hive --service metastore

Fig 20:- Starting Hive Metastore


5.3 Starting Hive
Now open a new cmd window and run the below command to start Hive
hive
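Once the Hive shell opens, a simple query can be used as a quick check that Hive can talk to the metastore, for example:
hive> show databases;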
EXPERIMENT-5
AIM :- Hive basic queries –
I) Write a query to count words with its frequency using hive.
Input-
This is my first hive tutorial, which is known as hello world program in big data , big
data technologies are now on demand.
Hive Query

Step 1. Create a table in hive


hive> create table feedback(comments string);

Step 2. Load data from the sample file


Syntax:

hive> load data local inpath '/home/ashwini/hadoop_data/comments.txt' into table feedback;

Step 3. Convert comments into an array


hive> select split(comments,' ') from feedback;

Step 4. Use table generation udf


hive> select explode( split(comments,' ')) from feedback;

The output of the above explode with split function is


This
is
my
first
hive
tutorial,
which
is
known
as
hello
world
program
in
big
data
,
big
data
technologies
are
now
on
demand.

Step 5. Final step


hive> select word, count(*) from (select explode(split(comments,' ')) as word from feedback) temp group by word;

Output -
, 1
This 1
are 1
as 1
big 2
data 2
demand. 1
first 1
hello 1
hive 1
in 1
is 2
known 1
my 1
now 1
on 1
program 1
technologies1
tutorial, 1
which 1
world 1

II) Create a managed table Student with columns roll, name, address, city, state
and load data into it.
Creating DataBase in Hive

CREATE DATABASE student_detail;

SHOW DATABASES;

USE student_detail;

Creating Table in Hive

CREATE TABLE IF NOT EXISTS student(roll_no INT, name STRING, address STRING, city STRING, state STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
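A sample line of student.txt matching this comma-delimited schema could look like the following (the values are illustrative):
1,Ashwini,Indore Road,Ujjain,MP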
Load Data into table

load data inpath '/hdoop/student.txt' overwrite into table student;


select * from student;

III) Create a managed table Result with columns roll, marks and load data into
it.

Creating DataBase in Hive

CREATE DATABASE student;

SHOW DATABASES;

USE student;
Creating Table in Hive

CREATE TABLE IF NOT EXISTS result(roll_no INT, marks FLOAT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
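A sample line of Result.txt matching this schema could look like the following (the values are illustrative):
1,85.5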

Load Data into table

load data inpath '/hadoop/Result.txt' overwrite into table result;


select * from result;
