
Ex. No: 3                BIG DATA PROCESSING USING HIVE AND BEESWAX
Date :

Aim:
To process big data for a doctor database management system using Hive and Beeswax.
About Hive:
Hive is a data warehouse system used to analyze structured data. It is built on top of Hadoop and was developed by Facebook. Hive provides the functionality of reading, writing, and managing large datasets residing in distributed storage. It runs SQL-like queries, called HQL (Hive Query Language), which are internally converted into MapReduce jobs.
Using Hive, we can skip the traditional approach of writing complex MapReduce programs. Hive supports Data Definition Language (DDL), Data Manipulation Language (DML), and User Defined Functions (UDF).
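For example, an HQL statement reads much like ordinary SQL. The query below is a minimal sketch (it assumes the doctor table and its spec column created later in this exercise) and would be compiled by Hive into a MapReduce job when typed at the hive> prompt:

-- count how many doctors exist for each specialization
select spec, count(*) from doctor group by spec;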
Beeswax is a web-based query editor developed by Cloudera for interacting with Apache Hive. It provides an interface for running HiveQL queries, managing metadata, and exploring datasets. While Hive itself is a powerful tool for querying large datasets in Hadoop, Beeswax makes interacting with Hive easier, mainly within the Cloudera ecosystem.

Procedure:
Step 1 : Open a terminal in Cloudera.
Step 2 : Type hive in the terminal.
Step 3 : The Hive shell opens with the prompt shown below; type the following Hive SQL commands at this prompt.
hive>

List of Commands with description:

1. show databases;
Lists all the databases present in Hive.

hive> show databases;
show databases
OK
college
default
firstdb
Time taken: 0.937 seconds, Fetched: 3 row(s)

2. create schema <database name>;
Creates a new database (schema) in Hive.

hive> create schema hospital;
create schema hospital
OK
Time taken: 4.555 seconds

hive> show databases;
show databases
OK
college
default
firstdb
hospital
Time taken: 0.035 seconds, Fetched: 4 row(s)
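To create tables inside the newly created database instead of the default one, the database can be selected first. This is a minimal sketch; the remaining steps in this exercise keep using the default database:

-- switch the current database so that new tables are created inside hospital
use hospital;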

3. drop schema <database name>;
Drops the database from Hive.

hive> drop schema firstdb;
drop schema firstdb
OK
Time taken: 0.241 seconds

hive> show databases;
show databases
OK
college
default
hospital
Time taken: 0.01 seconds, Fetched: 3 row(s)
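By default, Hive refuses to drop a database that still contains tables; adding the CASCADE keyword drops its tables as well. The statement below is an illustrative sketch with a hypothetical database name and was not executed in this exercise:

-- drop a non-empty database together with all of its tables (sampledb is hypothetical)
drop schema if exists sampledb cascade;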

4. CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
   [(col_name data_type [COMMENT col_comment], ...)]
   [COMMENT table_comment]
   [ROW FORMAT row_format]
   [STORED AS file_format]


hive> create table doctor(did int, dname string, spec string, dtiming string)
      comment 'Doctor details'
      row format delimited fields terminated by '\t'
      lines terminated by '\n'
      stored as textfile;
create table doctor(did int, dname string, spec string, dtiming string) comment 'Doctor details' row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile
OK
Time taken: 0.776 seconds
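After creating the table, its columns and types can be checked with DESCRIBE. This is a brief sketch; its output was not captured in this exercise:

-- list the column names and data types of the doctor table
describe doctor;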

5. ALTER TABLE name RENAME TO new_name
   ALTER TABLE name ADD COLUMNS (col_spec[, col_spec ...])
   ALTER TABLE name DROP [COLUMN] column_name
   ALTER TABLE name CHANGE column_name new_name new_type
   ALTER TABLE name REPLACE COLUMNS (col_spec[, col_spec ...])

Example:
hive> ALTER TABLE doctor CHANGE dtiming timing string;
ALTER TABLE doctor CHANGE dtiming timing string
OK
Time taken: 0.324 seconds
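The other ALTER forms listed above follow the same pattern; for instance, a new column could be appended as shown below (an illustrative sketch with a hypothetical column name, not part of the recorded session):

-- add a hypothetical contact-number column to the doctor table
alter table doctor add columns (phone string comment 'Contact number');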
6. LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
   [PARTITION (partcol1=val1, partcol2=val2 ...)]

hive> load data local inpath '/home/cloudera/details.txt' overwrite into table doctor;
load data local inpath '/home/cloudera/details.txt' overwrite into table doctor
Loading data to table default.doctor
Table default.doctor stats: [numFiles=1, numRows=0, totalSize=199, rawDataSize=0]
OK
Time taken: 0.617 seconds
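Because the doctor table was declared with fields terminated by '\t' and lines terminated by '\n', details.txt is expected to hold one record per line with tab-separated columns. A sketch of its first two lines, matching the records returned by the query in the next step:

1001    Dr.K.S.Priya    Cardio          9:00am-9:00pm
1002    Dr.A.Saranya    Nutritionist    8:00pm-9:30pm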

7. SELECT [ALL | DISTINCT] select_expr, select_expr, ...
   FROM table_reference
   [WHERE where_condition]
   [GROUP BY col_list] [HAVING having_condition]
   [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]]
   [LIMIT number];

hive> select * from doctor;
OK
1001    Dr.K.S.Priya    Cardio          9:00am-9:00pm
1002    Dr.A.Saranya    Nutritionist    8:00pm-9:30pm
1003    Dr.P.Shan       Neuro           9:00am-1:00pm
1004    Dr.M.Devi       Otolaryngo      8:00am-12:00pm
1005    Dr.G.Anu        Dermatolo       4:00pm-9:00pm
Time taken: 0.083 seconds, Fetched: 5 row(s)
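The optional clauses in the SELECT syntax can be used to restrict the result set; for example, the query below (an illustrative sketch, not part of the recorded output) returns only the neurologist's details:

-- fetch the name and specialization of doctors in the Neuro specialization
select dname, spec from doctor where spec = 'Neuro';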
RESULT:
Thus, big data for the doctor database management system was processed using Hive and Beeswax.
