Hive Commands Syn

Uploaded by

Abhishek Nazare

Hive Commands

Syntax & Example

Create Database Statement
A database in Hive is a namespace or a collection of tables.

hive> CREATE SCHEMA userdb;

hive> SHOW DATABASES;

Drop database

hive> DROP DATABASE IF EXISTS userdb;
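
The statements above can be combined into one short session. This is a minimal sketch, assuming a database named userdb as in the example; the comment text is illustrative and not from the original slides. CASCADE is only needed when the database still contains tables.

```sql
-- Create the database only if it does not already exist.
CREATE DATABASE IF NOT EXISTS userdb
COMMENT 'Example database (illustrative comment)';

-- Make userdb the current database for the statements that follow.
USE userdb;

-- Show the comment and storage location of the database.
DESCRIBE DATABASE userdb;

-- Switch away before dropping; CASCADE also drops any tables inside userdb.
USE default;
DROP DATABASE IF EXISTS userdb CASCADE;
```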

Creating Hive Tables

Create a table called students with two columns, the first being an integer and
the other a string.

hive> CREATE TABLE students(Usn INT, Name STRING);

Create a table called HIVE_TABLE with two columns and a partition column called ds. The
partition column is a virtual column: it is not part of the data itself but is derived from the
partition that a particular dataset is loaded into. By default, tables are assumed to be in text
input format, with ^A (Ctrl-A) as the field delimiter.

hive> CREATE TABLE HIVE_TABLE(Usn INT, Name STRING)
PARTITIONED BY (ds STRING);
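
To show how the partition column gets its value, here is a hedged sketch of loading a file into one partition and then filtering on it. The table name students_by_day, the file path, and the date value are all hypothetical placeholders, and the file is assumed to use the default ^A delimiter.

```sql
-- Hypothetical table name, chosen so it does not clash with the tables above.
CREATE TABLE IF NOT EXISTS students_by_day (Usn INT, Name STRING)
PARTITIONED BY (ds STRING);

-- Load a local file into one partition; the path and date are placeholders.
LOAD DATA LOCAL INPATH '/tmp/students_2024-01-01.txt'
INTO TABLE students_by_day PARTITION (ds = '2024-01-01');

-- List the partitions that now exist.
SHOW PARTITIONS students_by_day;

-- ds can be selected and filtered like a column; the filter restricts
-- the scan to the matching partition.
SELECT Usn, Name, ds FROM students_by_day WHERE ds = '2024-01-01';
```

Every row loaded this way reports the same ds value, because the value comes from the PARTITION clause rather than from the file contents.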

Browse the table

hive> Show tables;

Altering and Dropping Tables

hive> ALTER TABLE students RENAME TO Stud;

hive> ALTER TABLE Stud ADD COLUMNS (col INT);
hive> ALTER TABLE HIVE_TABLE ADD COLUMNS (col1 INT COMMENT 'a comment');
hive> ALTER TABLE HIVE_TABLE REPLACE COLUMNS (col2 INT, weight STRING,
age INT COMMENT 'age replaces new_col1');
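
The heading mentions dropping tables, so a short sketch of checking the altered schema and then dropping the table may help. It assumes the HIVE_TABLE used in the ALTER statements above. For a managed (internal) table, DROP TABLE removes the data as well as the metadata.

```sql
-- Inspect the column list after the ALTER statements.
DESCRIBE HIVE_TABLE;

-- Drop the table; IF EXISTS avoids an error when it is already gone.
DROP TABLE IF EXISTS HIVE_TABLE;
```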
Hive DML Commands
To understand the Hive DML commands, let's see the employee and
employee_department table first.

LOAD DATA

hive> LOAD DATA LOCAL INPATH './usr/Desktop/kv1.txt' OVERWRITE INTO TABLE Employee;
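
For comparison, here is a hedged sketch of the same command without LOCAL and without OVERWRITE. The HDFS path is a hypothetical placeholder. Without LOCAL the path is read from HDFS and the file is moved into the table's directory; without OVERWRITE the rows are added to whatever the table already holds.

```sql
-- Append a file that already sits in HDFS (placeholder path) to the table.
LOAD DATA INPATH '/user/hive/staging/kv2.txt' INTO TABLE Employee;
```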
SELECTS and FILTERS

hive> SELECT E.EMP_ID FROM Employee E WHERE E.Address='US';

GROUP BY

hive> SELECT E.Address, COUNT(*) FROM Employee E GROUP BY E.Address;

Hive Sort By vs Order By

Hive sort by and order by commands are used to fetch data in sorted order. The
main differences between sort by and order by commands are given below.
Sort by

hive> SELECT E.EMP_ID FROM Employee E SORT BY E.EMP_ID;

May use multiple reducers for final output.
Only guarantees ordering of rows within a reducer.
May give partially ordered result.

Order by

hive> SELECT E.EMP_ID FROM Employee E ORDER BY E.EMP_ID;

Uses single reducer to guarantee total order in output.
LIMIT can be used to minimize sort time.
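
The difference can be tried out with a sketch like the following, assuming the Employee table with EMP_ID and Address columns used earlier and the MapReduce execution engine. The reducer count and the LIMIT value are illustrative. DISTRIBUTE BY is not covered in the slides; it is added here only to show how rows are routed to reducers before SORT BY runs.

```sql
-- Ask for two reducers so the difference is visible (value is illustrative).
SET mapreduce.job.reduces=2;

-- SORT BY: each reducer sorts only the rows it receives.
SELECT E.EMP_ID FROM Employee E SORT BY E.EMP_ID;

-- DISTRIBUTE BY sends all rows with the same Address to the same reducer,
-- so each address group comes out sorted by EMP_ID.
SELECT E.EMP_ID, E.Address FROM Employee E
DISTRIBUTE BY E.Address
SORT BY E.Address, E.EMP_ID;

-- ORDER BY: one total order; LIMIT keeps the final sort small.
SELECT E.EMP_ID FROM Employee E ORDER BY E.EMP_ID LIMIT 10;
```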
Hive Join
Let's see two tables Employee and EmployeeDepartment that are going to
be joined.

Inner joins

Select * from employee join employeedepartment ON (employee.empid=employeedepartment.empId);

Left outer joins

Select e.empId, empName, department from employee e Left outer join employeedepartment ed
on(e.empId=ed.empId);

Right outer joins

Select e.empId, empName, department from employee e Right outer join employeedepartment ed
on(e.empId=ed.empId);

Full outer joins

Select e.empId, empName, department from employee e FULL outer join employeedepartment ed
on(e.empId=ed.empId);
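
An outer join fills the columns of the unmatched side with NULL. The sketch below shows two common ways to work with that. It assumes, as the queries above imply, that empName lives in employee and department in employeedepartment; the 'UNASSIGNED' label is an illustrative placeholder.

```sql
-- Keep every employee; show a placeholder when no department row matches.
SELECT e.empId, e.empName, COALESCE(ed.department, 'UNASSIGNED') AS department
FROM employee e
LEFT OUTER JOIN employeedepartment ed ON (e.empId = ed.empId);

-- Return only the employees that have no department row at all.
SELECT e.empId, e.empName
FROM employee e
LEFT OUTER JOIN employeedepartment ed ON (e.empId = ed.empId)
WHERE ed.empId IS NULL;
```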

HiveQL - Operators

HiveQL operators perform various arithmetic and relational operations. Here, we are
going to execute such operations on the records of the employee table created below.
Example of Operators in Hive
Let's create a table and load the data into it by using the following steps: -

Select the database in which we want to create a table.

hive> use hql;

Create a hive table using the following command: -

hive> create table employee (Id int, Name string , Salary float)
row format delimited
fields terminated by ',' ;

Now, load the data into the table.

hive> load data local inpath '/home/HQL/hive/emp_data' into table employee;

Let's fetch the loaded data by using the following command: -

hive> select * from employee;

Arithmetic Operators in Hive
In Hive, the arithmetic operator accepts any numeric type. The commonly used
arithmetic operators are: -
Operators   Description
A+B         This is used to add A and B.
A-B         This is used to subtract B from A.
A*B         This is used to multiply A and B.
A/B         This is used to divide A and B and returns the quotient of the operands.
A%B         This returns the remainder of A / B.
A|B         This is used to determine the bitwise OR of A and B.
A&B         This is used to determine the bitwise AND of A and B.
A^B         This is used to determine the bitwise XOR of A and B.
~A          This is used to determine the bitwise NOT of A.

Examples of Arithmetic Operator in Hive
Let's see an example to increase the salary of each employee by 50.

hive> select id, name, salary + 50 from employee;

Let's see an example to decrease the salary of each employee by 50.

hive> select id, name, salary - 50 from employee;

Let's see an example to find 10% of each employee's salary.

hive> select id, name, (salary * 10) /100 from employee;
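
Several operators can be combined in one query, with AS giving each computed column a readable name. This is a sketch against the employee table created above; the aliases and the constants are illustrative. The bitwise operator is applied to the INT id column, since bitwise operators work on integer values.

```sql
SELECT id,
       name,
       salary * 1.10 AS salary_plus_10_percent,  -- multiply: a 10% raise
       salary % 1000 AS remainder_of_1000,       -- remainder after dividing by 1000
       id & 1        AS lowest_bit               -- bitwise AND on the INT id
FROM employee;
```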

Relational Operators in Hive
In Hive, the relational operators are generally used in clauses like WHERE, JOIN, and HAVING
to compare the existing records. The commonly used relational operators are: -

Operator        Description
A=B             It returns true if A equals B, otherwise false.
A <> B, A !=B   It returns null if A or B is null; true if A is not equal to B, otherwise false.
A<B             It returns null if A or B is null; true if A is less than B, otherwise false.
A>B             It returns null if A or B is null; true if A is greater than B, otherwise false.
A<=B            It returns null if A or B is null; true if A is less than or equal to B,
                otherwise false.
A>=B            It returns null if A or B is null; true if A is greater than or equal to B,
                otherwise false.
A IS NULL       It returns true if A evaluates to null, otherwise false.
A IS NOT NULL   It returns false if A evaluates to null, otherwise true.
Examples of Relational Operator in Hive

Let's see an example to fetch the details of the employee having salary>=25000.

hive> select * from employee where salary >= 25000;
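
The null rules in the table matter in practice: a WHERE clause keeps a row only when its condition is true, so a comparison that evaluates to null drops the row. The sketch below uses the employee table created above; the value 25000 is only an example.

```sql
-- Rows whose salary is NULL are returned by neither of these filters,
-- because the comparison evaluates to NULL rather than TRUE.
SELECT * FROM employee WHERE salary = 25000;
SELECT * FROM employee WHERE salary <> 25000;

-- To find the rows with a missing salary, test for NULL explicitly.
SELECT * FROM employee WHERE salary IS NULL;
```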

Mathematical Functions in Hive

The commonly used mathematical functions in the hive are: -

Return type   Functions                 Description
DOUBLE        round(num)                It returns num rounded to the nearest integer value.
BIGINT        floor(num)                It returns the largest BIGINT that is less than or equal to num.
BIGINT        ceil(num), ceiling(num)   It returns the smallest BIGINT that is greater than or equal to num.
DOUBLE        exp(num)                  It returns the exponential of num.
DOUBLE        ln(num)                   It returns the natural logarithm of num.
DOUBLE        log10(num)                It returns the base-10 logarithm of num.
DOUBLE        sqrt(num)                 It returns the square root of num.
DOUBLE        abs(num)                  It returns the absolute value of num.
DOUBLE        sin(d)                    It returns the sine of d, where d is in radians.
DOUBLE        asin(d)                   It returns the arcsine of d, in radians.
DOUBLE        cos(d)                    It returns the cosine of d, where d is in radians.
DOUBLE        acos(d)                   It returns the arccosine of d, in radians.
DOUBLE        tan(d)                    It returns the tangent of d, where d is in radians.
DOUBLE        atan(d)                   It returns the arctangent of d, in radians.
Example of Mathematical Functions in Hive

Let's see an example to fetch the square root of each employee's salary.

hive> select Id, Name, sqrt(Salary) from employee;
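
The rounding functions are easiest to compare side by side. The sketch below assumes the employee table created earlier and divides the salary by 12 purely as an illustration; the aliases are hypothetical.

```sql
SELECT id,
       salary / 12        AS monthly,          -- unrounded quotient
       round(salary / 12) AS monthly_rounded,  -- nearest whole value
       floor(salary / 12) AS monthly_floor,    -- rounded down
       ceil(salary / 12)  AS monthly_ceil      -- rounded up
FROM employee;
```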

Aggregate Functions in Hive

In Hive, the aggregate function returns a single value resulting from computation over many
rows. Let's see some commonly used aggregate functions: -

Return Type   Operator            Description
BIGINT        count(*)            It returns the number of rows.
DOUBLE        sum(col)            It returns the sum of values.
DOUBLE        sum(DISTINCT col)   It returns the sum of distinct values.
DOUBLE        avg(col)            It returns the average of values.
DOUBLE        avg(DISTINCT col)   It returns the average of distinct values.
DOUBLE        min(col)            It compares the values and returns the minimum one from them.
DOUBLE        max(col)            It compares the values and returns the maximum one from them.
Examples of Aggregate Functions in Hive

Let's see an example to fetch the maximum salary of an employee.

hive> select max(Salary) from employee;

Let's see an example to fetch the minimum salary of an employee.

hive> select min(Salary) from employee;
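
Several aggregates can be computed in a single pass over the table. This is a sketch assuming the employee table created earlier; the column aliases are illustrative.

```sql
SELECT count(*)               AS employees,
       sum(salary)            AS total_salary,
       avg(salary)            AS average_salary,
       min(salary)            AS lowest_salary,
       max(salary)            AS highest_salary,
       count(DISTINCT salary) AS distinct_salaries
FROM employee;
```

Without a GROUP BY clause the query returns exactly one row, because the whole table is treated as a single group.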

HiveQL - GROUP BY and HAVING Clause

The Hive Query Language provides GROUP BY and HAVING clauses that work much as they do
in SQL. Here, we are going to execute these clauses on the records of the emp table
created below.
GROUP BY Clause

The HQL GROUP BY clause is used to group the data from multiple records based on
one or more columns. It is generally used in conjunction with the aggregate functions
(like SUM, COUNT, MIN, MAX and AVG) to perform an aggregation over each group.

Example of GROUP BY Clause in Hive

Let's see an example to sum the salary of employees based on department.

Select the database in which we want to create a table.

hive> use hiveql;

Now, create a table by using the following command:

hive> create table emp (Id int, Name string , Salary float, Department string)
row format delimited
fields terminated by ',' ;

Now, fetch the sum of employee salaries department wise by using the
following command:

hive> select department, sum(salary) from emp group by department;

HAVING CLAUSE

The HQL HAVING clause is used with GROUP BY clause. Its purpose is to apply
constraints on the group of data produced by GROUP BY clause. Thus, it always
returns the data where the condition is TRUE.

Example of Having Clause in Hive

In this example, we fetch the sum of employee's salary based on department and
apply the required constraints on that sum by using HAVING clause.

Let's fetch the sum of employee's salary based on department having sum >= 35000
by using the following command:

hive> select department, sum(salary) from emp group by department
having sum(salary)>=35000;
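
WHERE and HAVING filter at different stages: WHERE removes rows before they are grouped, and HAVING removes groups after the aggregates are computed. The sketch below uses the emp table created above; the 20000 threshold in the WHERE clause is an illustrative assumption.

```sql
SELECT department,
       count(*)    AS employees,
       sum(salary) AS total_salary
FROM emp
WHERE salary >= 20000          -- row filter, applied before grouping
GROUP BY department
HAVING sum(salary) >= 35000;   -- group filter, applied after aggregation
```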
Apache Hive View and Hive Index

Objective – Hive View

This section covers Hive views and Hive indexes: how to create them, how to drop
them, and an example of each. A view saves the result set of a query under a name,
whereas an Apache Hive index is a pointer to a particular column of a table.

What is Hive view?

Basically, an Apache Hive view is similar to a Hive table and is generated on the basis
of requirements.
As a Hive view, we can save any result set data.
Its usage is the same as the use of views in SQL.
However, Hive views are read-only: we can query a view, but we cannot use it as the
target of LOAD or INSERT operations.

In other words, an Apache Hive view is a searchable object in a database which we
define by a query. We cannot store data in the view itself, which is why some refer to
views as "virtual tables". Hence, we can query a view like we can a table. Moreover, by
using joins it is possible to combine data from one or more tables. Also, a view can
contain a subset of the information.
i. Apache Hive View Syntax
Create VIEW < VIEWNAME> AS SELECT

ii. Creating a Hive View

A view is created from a SELECT statement. The full syntax to create a Hive view is:

CREATE VIEW [IF NOT EXISTS] view_name [(column_name [COMMENT column_comment], …) ]
[COMMENT table_comment]
AS SELECT …

iii. Apache Hive View Example

Let’s suppose an employee table with the fields Id, Name, Salary, Designation, and
Dept. Here we write a query to retrieve the details of the employees who earn a salary
of more than Rs 35000, and we store the result in a view named emp_35000.

Table 1- Apache Hive View

ID     Name       Salary   Designation         Dept
1201   Michel     45000    Technical manager   TP
1202   Chandler   45000    Proofreader         PR
1203   Ross       40000    Technical writer    TP
1204   Joey       40000    Hr Admin            HR
1205   Monika     35000    Op Admin            Admin

Using the above scenario, the following query creates a view that retrieves the employee details:
hive> CREATE VIEW emp_35000 AS
SELECT * FROM employee
WHERE salary>35000;
iv. Dropping a Hive View
To drop a Hive view, use the following syntax:
DROP VIEW view_name;
The following query drops a view named emp_35000:
hive> DROP VIEW emp_35000;
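
The optional parts of the CREATE VIEW syntax can be seen in a sketch like the one below. The view name emp_high_paid, its column names, and its comment are hypothetical; the employee table and its Id, Name, and Salary fields are the ones described above.

```sql
-- Create the view with its own column names and a comment.
CREATE VIEW IF NOT EXISTS emp_high_paid (emp_id, emp_name, emp_salary)
COMMENT 'Employees earning more than 35000 (illustrative view)'
AS SELECT id, name, salary FROM employee WHERE salary > 35000;

-- Query the view like a table, using the view's column names.
SELECT emp_name, emp_salary FROM emp_high_paid WHERE emp_salary >= 40000;

-- Show the stored definition and metadata of the view.
DESCRIBE FORMATTED emp_high_paid;

-- Remove the view; the underlying employee table is not affected.
DROP VIEW IF EXISTS emp_high_paid;
```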
b. What is Apache Hive Index?

An index in Hive is a pointer to a particular column of a table.

The user has to define the Hive index manually.
Wherever we create a Hive index, we are creating a pointer to a particular column
of the table.
Any changes made to the column present in the table are stored using the index
value created on that column.
i. Apache Hive index Syntax
Create INDEX < INDEX_NAME> ON TABLE < TABLE_NAME(column names)>

ii. Create an Indexing in Hive
Creating an Apache Hive index means creating a pointer on a particular column of a
table. The syntax to create an index in Hive is:
CREATE INDEX index_name
ON TABLE base_table_name (col_name, ...)
AS 'index.handler.class.name'
[WITH DEFERRED REBUILD]
[IDXPROPERTIES (property_name=property_value, ...)]
[IN TABLE index_table_name]
[PARTITIONED BY (col_name, ...)]
[
[ ROW FORMAT ...] STORED AS ...
| STORED BY ...
]
[LOCATION hdfs_path]
[TBLPROPERTIES (...)]
iii. Apache Hive Index Example

Let’s take the same employee table which we used earlier, with the fields Id, Name,
Salary, Designation, and Dept. Here we create an index named index_salary on the
salary column of the employee table.

Hence, we use the following query to create a Hive index:

hive> CREATE INDEX index_salary ON TABLE employee(salary)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler';

The index is a pointer to the salary column. If the column is modified, the changes are
stored using an index value.

iv. Dropping an Index

To drop an index in Hive, we use the following syntax:
DROP INDEX <index_name> ON <table_name>
The following query drops a Hive index named index_salary:
hive> DROP INDEX index_salary ON employee;
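
The WITH DEFERRED REBUILD option from the syntax above separates defining an index from building it. The sketch below walks through that lifecycle for the index_salary example. It assumes a Hive release earlier than 3.0, since index support was removed in Hive 3.0, and it assumes the employee table described above.

```sql
-- Create the index definition without building it yet.
CREATE INDEX index_salary ON TABLE employee (salary)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;

-- Build (or later refresh) the index data.
ALTER INDEX index_salary ON employee REBUILD;

-- List the indexes defined on the table.
SHOW FORMATTED INDEX ON employee;

-- Remove the index when it is no longer needed.
DROP INDEX IF EXISTS index_salary ON employee;
```

An index created with DEFERRED REBUILD stays empty until ALTER INDEX ... REBUILD runs, and the rebuild has to be repeated after the table data changes.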
