Ex3-Query Processing Using Hive and Beeswax
No:3
BIG DATA PROCESSING USING HIVE AND BEESWAX
Date :
Aim:
To process the data of a doctor database management system, as a big data
workload, using Hive and Beeswax.
About Hive:
Hive is a data warehouse system used to analyze structured data. It is built on
top of Hadoop and was originally developed by Facebook. Hive provides the
functionality of reading, writing, and managing large datasets residing in
distributed storage. It runs SQL-like queries written in HQL (Hive Query
Language), which are internally converted to MapReduce jobs.
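To see how a query is turned into execution stages, Hive's EXPLAIN statement can be placed in front of the query. The sketch below assumes the doctor table used later in this exercise already exists in the current database; the exact plan text depends on the Hive version and the execution engine configured.

```sql
-- Show the execution plan without running the query.
-- Assumes a table named doctor exists in the current database.
EXPLAIN
SELECT count(*) FROM doctor;
```

On an installation that uses MapReduce as the engine, the plan normally lists one or more map-reduce stages for an aggregate query like this one.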
With Hive, we can avoid the traditional approach of writing complex MapReduce
programs by hand. Hive supports Data Definition Language (DDL), Data
Manipulation Language (DML), and User Defined Functions (UDF).
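As a minimal sketch of these kinds of statements, the example below uses a hypothetical table named demo_patient with made-up rows (neither is part of this exercise) and assumes Hive 0.14 or later for INSERT ... VALUES. The function shown, upper(), is built in; a user-defined function would be called the same way after being registered.

```sql
-- DDL: define a table (demo_patient is a hypothetical name).
CREATE TABLE IF NOT EXISTS demo_patient (
  id   INT,
  name STRING,
  city STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- DML: add rows (INSERT ... VALUES needs Hive 0.14 or later).
INSERT INTO TABLE demo_patient VALUES
  (1, 'asha', 'Chennai'),
  (2, 'ravi', 'Madurai');

-- Function call: upper() is a built-in function; user-defined
-- functions are invoked the same way once registered.
SELECT id, upper(name) AS name_upper, city
FROM demo_patient;

-- Clean up the demo table.
DROP TABLE IF EXISTS demo_patient;
```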
Beeswax is a web-based query editor developed by Cloudera for interacting
with Apache Hive. It provides an interface for running HiveQL queries, managing
metadata, and exploring datasets. In short, Hive is the engine that queries large
datasets in Hadoop, whereas Beeswax is the web front end that makes working
with Hive easier, mainly within the Cloudera ecosystem.
Procedure:
Step 1 : Open a terminal in Cloudera.
Step 2 : Type hive in the terminal.
Step 3 : The Hive shell opens with the prompt shown below. Type the following
Hive SQL commands at the prompt.
hive>
1. show databases:
It shows all the databases available in Hive.
hive> show databases;
OK
college
default
firstdb
hospital
Time taken: 0.035 seconds, Fetched: 4 row(s)
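Once the databases are listed, a common next step is to select one and look at its tables. The sketch below uses the hospital and default databases from the output above; the pattern 'h*' is only an illustration.

```sql
-- List only the databases whose names start with "h".
SHOW DATABASES LIKE 'h*';

-- Switch to one of the listed databases and show its tables.
USE hospital;
SHOW TABLES;

-- Return to the default database, where the doctor table
-- used later in this exercise is stored.
USE default;
```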
Example:
hive> ALTER TABLE student CHANGE spec salary String;
ALTER TABLE doctor CHANGE adres address string;
OK
Time taken: 0.324 seconds
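The rename can be checked by listing the table's columns. This is a small sketch that assumes the doctor table from the command above; note that CHANGE only updates the table metadata and does not rewrite the stored data.

```sql
-- General form:
--   ALTER TABLE table_name CHANGE old_column new_column column_type;
-- After the rename, list the columns to confirm that
-- adres now appears as address.
DESCRIBE doctor;
```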
5. LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
[PARTITION (partcol1=val1, partcol2=val2 ...)]
hive> load data local inpath '/home/cloudera/details.txt' overwrite into table doctor;
load data local inpath '/home/cloudera/doctordet.txt' overwrite into table doctor;
Loading data to table default.doctor
Table default.doctor stats: [numFiles=1, numRows=0, totalSize=199, rawDataSize=0]
OK
Time taken: 0.617 seconds
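After loading, it helps to confirm that the file was parsed as intended. The numRows=0 in the stats above does not by itself mean the table is empty: LOAD DATA places the file in the table's directory without scanning its contents, so row statistics are not computed at load time. The sketch below assumes the doctor table's field delimiter matches the loaded file.

```sql
-- Preview a few rows to confirm the file was parsed as expected.
-- Columns full of NULL usually point to a delimiter mismatch
-- between the file and the table definition.
SELECT * FROM doctor LIMIT 5;

-- Count the rows actually readable from the loaded file.
SELECT count(*) FROM doctor;

-- Optionally gather table statistics so numRows is populated.
ANALYZE TABLE doctor COMPUTE STATISTICS;
```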