@bigdatalabfile 09
3. MapReduce
5. Hive basic queries –
   1. Write a query to count words with its frequency using Hive.
   2. Create a managed table Student with columns roll, name, address, city, state and load data into it.
   3. Create a managed table Result with columns roll, marks and load data into it.
EXPERIMENT-1
AIM :- Install Hadoop 3 for a Single Node.
• Create a text file on your local machine and write some text into it.
$ nano data.txt
$ cat data.txt
In this example, we find the frequency of each word that exists in this text file.
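Before running the job, the input file also has to be copied into HDFS, since the job reads it from /test/data.txt. This step is implied but not shown here; the directory name is taken from the run command further below:
$ hdfs dfs -mkdir -p /test
$ hdfs dfs -put data.txt /test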
2. Mapper Code (WCMapper.java)
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class WCMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output,
                    Reporter rep) throws IOException {
        String line = value.toString();
        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
3. Reducer Code (WCReducer.java)
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
public class WCReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function
    public void reduce(Text key, Iterator<IntWritable> value,
                       OutputCollector<Text, IntWritable> output,
                       Reporter rep) throws IOException {
        int count = 0;
        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }
        output.collect(key, new IntWritable(count));
    }
}
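The run command below refers to a driver class, WCDriver, which is not listed in this file. A minimal driver sketch using the same old mapred API might look like the following; the exact job configuration is an assumption, not the original listing.
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Assumed driver sketch: configures and submits the word-count job
public class WCDriver extends Configured implements Tool {

    public int run(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: WCDriver <input path> <output path>");
            return -1;
        }
        JobConf conf = new JobConf(WCDriver.class);
        conf.setJobName("word count");
        // Input and output paths are taken from the command line
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        // Wire up the mapper and reducer defined above
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.exit(exitCode);
    }
}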
Create the jar file of this program and name it countworddemo.jar.
Run the jar file:
hadoop jar /home/codegyani/countworddemo.jar WCDriver /test/data.txt /r_output
The output is stored in /r_output/part-00000.
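To inspect the result, the output file can be printed straight from HDFS (part-00000 is the default name produced by a single reducer):
$ hdfs dfs -cat /r_output/part-00000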
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/<Your Database>?createDatabaseIfNotExist=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL
flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://localhost:9000/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value><Hive Password></value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.schema.autoCreateSchema</name>
<value>true</value>
</property>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>datanucleus.schema.validateTables</name>
<value>true</value>
<description>validates existing schema against code. turn this on if you want to verify
existing schema</description>
</property>
Replace <Hive Password> with the password of the hive user that we created during MySQL user creation, and <Your Database> with the database that we used for the metastore in MySQL.
5. Starting Hive
5.1 Starting Hadoop
Now open a new Command Prompt (remember to run it as administrator to avoid permission issues) and execute the command below:
start-all.cmd
Fig. 19:- start-all.cmd
All four daemons (NameNode, DataNode, ResourceManager and NodeManager) should be up and running.
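One quick way to confirm this (assuming the JDK's bin directory is on the PATH) is to list the running Java processes:
jps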
5.2 Starting Hive Metastore
Open a cmd window and run the command below to start the Hive metastore.
hive --service metastore
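With the metastore running, the queries in the following steps can be entered from the Hive shell, which is opened from a separate Command Prompt (this step is implied but not shown here):
hive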
II) Create a managed table Student with columns roll, name, address, city, state
and load data into it.
Creating a Database in Hive
SHOW DATABASES;
USE student_detail;
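The CREATE DATABASE, CREATE TABLE and LOAD DATA statements for this step are not listed above. A minimal HiveQL sketch, assuming a comma-delimited input file and a placeholder file path (the column types are also assumptions), could be:
CREATE DATABASE IF NOT EXISTS student_detail;
USE student_detail;

-- Managed (internal) table with the columns named in the aim
CREATE TABLE student (
  roll    INT,
  name    STRING,
  address STRING,
  city    STRING,
  state   STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load records from a local comma-separated file (path is a placeholder)
LOAD DATA LOCAL INPATH '/home/hadoop/student.csv' INTO TABLE student;

-- Verify the loaded rows
SELECT * FROM student;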
III) Create a managed table Result with columns roll, marks and load data into
it.
SHOW DATABASES;
USE student;
Creating Table in Hive
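The statements for the Result table are likewise not shown in this section. A minimal HiveQL sketch under the same assumptions (comma delimiter, placeholder path, assumed column types) could be:
CREATE DATABASE IF NOT EXISTS student;
USE student;

-- Managed table with the columns named in the aim
CREATE TABLE result (
  roll  INT,
  marks INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load records from a local comma-separated file (path is a placeholder)
LOAD DATA LOCAL INPATH '/home/hadoop/result.csv' INTO TABLE result;

-- Verify the loaded rows
SELECT * FROM result;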