3 SQL Hadoop Analyzing Big Data Hive m3 Hiveql Slides

This document provides an overview of the Hive query language. It describes Hive data types including primitive, complex, and collection types. It covers loading and organizing data in Hive through managed and external partitioned tables, as well as dynamic partition inserts. The document also discusses retrieving data through single scan multiple inserts, Hive functions and aggregation, grouping sets, cube, and rollup operations.

Hive Query Language

Ahmad Alkilani
www.pluralsight.com
Outline
- Data Types
- Load and Organize Data
  - Managed/External Partitioned Tables
  - Dynamic Partition Inserts
- Single Scan-Multiple Inserts
- Hive Functions, Aggregates, Group By, Cube, Rollup, Having
- Sorting and Clustering Results
- Using the CLI in the real world
  - Batch mode
  - Variable Substitution
Primitive Data Types

Numeric
• TINYINT, SMALLINT, INT, BIGINT
• FLOAT
• DOUBLE
• DECIMAL – starting Hive 0.11

Date/Time
• TIMESTAMP – starting Hive 0.8
  • Strings must be in the format "YYYY-MM-DD HH:MM:SS.fffffffff"
  • Integer types are read as a UNIX timestamp in seconds from the UNIX epoch
  • Floating-point types are the same as integer, with decimal precision
• DATE – starting Hive 0.12

Misc.
• BOOLEAN
• STRING
• BINARY
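As a quick illustration of these types (the table and column names are hypothetical, not from the course):

```sql
-- Hypothetical table exercising the primitive types above
CREATE TABLE user_events (
  event_id   BIGINT,
  rating     TINYINT,
  score      DOUBLE,
  is_active  BOOLEAN,
  username   STRING,
  created_at TIMESTAMP   -- string values must look like "YYYY-MM-DD HH:MM:SS.fffffffff"
);

-- from_unixtime turns an integer UNIX timestamp (seconds) into a date string
SELECT from_unixtime(1376092800) FROM user_events LIMIT 1;
```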
Complex/Collection Types

Type        Syntax
Arrays      ARRAY<data_type>
Maps        MAP<primitive_type, data_type>
Struct      STRUCT<col_name : data_type [COMMENT col_comment], …>
Union Type  UNIONTYPE<data_type, data_type, …>

CREATE TABLE movies (
  movie_name string,
  participants ARRAY<string>,
  release_dates MAP<string, timestamp>,
  studio_addr STRUCT<state:string, city:string, zip:string, streetnbr:int, streetname:string, unit:string>,
  complex_participants MAP<string, STRUCT<address:string, attributes:MAP<string, string>>>,
  misc UNIONTYPE<int, string, ARRAY<double>>
);
Complex/Collection Types

SELECT movie_name,
       participants[0],
       release_dates['USA'],
       studio_addr.zip,
       complex_participants['Leonardo DiCaprio'].attributes['fav_color'],
       misc
FROM movies;

Sample rows (as shown on the slide):
"Inception"  2010-07-16 00:00:00  91505  "Dark Green"  {0:800}
"Planes"     2013-08-09 00:00:00  91505  "Green"       {3:[1.0, 2.3, 5.6]}
Type Conversions

Implicit Conversions
• Narrower types widen automatically: TINYINT → SMALLINT → INT → BIGINT → FLOAT → DOUBLE
• STRING converts implicitly to DOUBLE in numeric contexts
• No implicit conversion to BOOLEAN or TIMESTAMP
• Integer literal suffixes: 16L (BIGINT), 16S (SMALLINT), 16Y (TINYINT)

Explicit Conversions
• CAST('13' AS INT)
• CAST('This results in NULL' AS INT)
• CAST('2.0' AS FLOAT)
• CAST(CAST(binary_data AS STRING) AS DOUBLE)
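A quick sketch of the implicit rules (t1 is a hypothetical table, not from the slides):

```sql
-- 16Y (TINYINT) + 16S (SMALLINT) widens to SMALLINT;
-- adding a decimal literal widens the whole expression to floating point
SELECT 16Y + 16S + 1.5 FROM t1;

-- STRING converts implicitly to DOUBLE in a numeric context: '2' + 3 yields 5.0
SELECT '2' + 3 FROM t1;
```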
Loading and organizing data in Hive

Hive Query Language

Table Partitions
- Managed Partitioned Tables

CREATE TABLE page_views ( eventTime STRING, userid STRING, page STRING)
PARTITIONED BY(dt STRING, applicationtype STRING)
STORED AS TEXTFILE;

/apps/hive/warehouse/page_views
/apps/hive/warehouse/page_views/dt=2013-08-10/applicationtype=android

LOAD DATA INPATH '/mydata/android/Aug_10_2013/pageviews/'
INTO TABLE page_views
PARTITION (dt = '2013-08-10', applicationtype = 'android');

LOAD DATA INPATH '/sample/android/Aug_10_2013/pageviews/'
OVERWRITE INTO TABLE page_views
PARTITION (dt = '2013-08-10', applicationtype = 'android');
Table Partitions
- Virtual Partition Columns

CREATE TABLE page_views ( eventTime STRING, userid STRING, page STRING)
PARTITIONED BY(dt STRING, applicationtype STRING)
STORED AS TEXTFILE;

eventTime STRING
userid STRING
page STRING
dt STRING
applicationtype STRING

SELECT dt as eventDate, page, count(*) as pviewCount
FROM page_views
WHERE applicationtype = 'iPhone'
GROUP BY dt, page;
Table Partitions
- External Partitioned Tables

CREATE EXTERNAL TABLE page_views ( eventTime STRING, userid STRING, page STRING)
PARTITIONED BY(dt STRING, applicationtype STRING)
STORED AS TEXTFILE;

eventTime STRING
userid STRING
page STRING
dt STRING
applicationtype STRING

ALTER TABLE page_views ADD PARTITION (dt='2013-09-09', applicationtype='Windows Phone 8')
LOCATION '/somewhere/on/hdfs/data/2013-09-09/wp8';

ALTER TABLE page_views ADD PARTITION (dt='2013-09-09', applicationtype='iPhone')
LOCATION 'hdfs://NameNode/somewhere/on/hdfs/data/iphone/current';

ALTER TABLE page_views ADD IF NOT EXISTS
PARTITION (dt='2013-09-09', applicationtype='iPhone') LOCATION '/somewhere/on/hdfs/data/iphone/current'
PARTITION (dt='2013-09-08', applicationtype='iPhone') LOCATION '/somewhere/on/hdfs/data/prev1/iphone'
PARTITION (dt='2013-09-07', applicationtype='iPhone') LOCATION '/somewhere/on/hdfs/data/iphone/prev2';
Demo
Multiple Inserts
- Interchangeability of blocks
FROM movies
SELECT *;

- Syntax
FROM from_statement
INSERT OVERWRITE TABLE table1 [PARTITION (partcol1=val1, partcol2=val2)] select_statement1
INSERT INTO TABLE table2 [PARTITION (partcol1=val1, partcol2=val2) [IF NOT EXISTS]] select_statement2
INSERT OVERWRITE DIRECTORY 'path' select_statement3;

- Extract action and horror movies into tables for further processing
FROM movies
INSERT OVERWRITE TABLE horror_movies SELECT * WHERE horror = 1 AND release_date = '8/23/2013'
INSERT INTO TABLE action_movies SELECT * WHERE action = 1 AND release_date = '8/23/2013';

FROM (SELECT * FROM movies WHERE release_date = '8/23/2013') src
INSERT OVERWRITE TABLE horror_movies SELECT * WHERE horror = 1
INSERT INTO TABLE action_movies SELECT * WHERE action = 1;
Dynamic Partition Inserts
CREATE TABLE views_stg (eventTime STRING, userid STRING)
PARTITIONED BY(dt STRING, applicationtype STRING, page STRING);

FROM page_views src
INSERT OVERWRITE TABLE views_stg PARTITION (dt='2013-09-13', applicationtype='Web', page='Home')
SELECT src.eventTime, src.userid WHERE dt='2013-09-13' AND applicationtype='Web' AND page='Home'
INSERT OVERWRITE TABLE views_stg PARTITION (dt='2013-09-14', applicationtype='Web', page='Cart')
SELECT src.eventTime, src.userid WHERE dt='2013-09-14' AND applicationtype='Web' AND page='Cart'
INSERT OVERWRITE TABLE views_stg PARTITION (dt='2013-09-15', applicationtype='Web', page='Checkout')
SELECT src.eventTime, src.userid WHERE dt='2013-09-15' AND applicationtype='Web' AND page='Checkout';

FROM page_views src
INSERT OVERWRITE TABLE views_stg PARTITION (applicationtype='Web', dt, page)
SELECT src.eventTime, src.userid, src.dt, src.page WHERE applicationtype='Web';

- Dynamically determine partitions to create and populate
- Use input data to determine partitions
Dynamic Partition Inserts
- Default maximum dynamic partitions = 1000
  - hive.exec.max.dynamic.partitions
  - hive.exec.max.dynamic.partitions.pernode
- Enable/Disable dynamic partition inserts
  - hive.exec.dynamic.partition=true
- Use strict mode when in doubt
  - hive.exec.dynamic.partition.mode=strict
- Increase the max number of files a data node can service (hdfs-site.xml)
  - dfs.datanode.max.xcievers=4096
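Putting the settings and the insert together — a minimal sketch, assuming the page_views and views_stg tables from the earlier slides; the cap values are illustrative:

```sql
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;      -- needed when every partition column is dynamic
SET hive.exec.max.dynamic.partitions = 2000;           -- raise the global cap if needed
SET hive.exec.max.dynamic.partitions.pernode = 500;    -- and the per-task cap

-- All three partition columns resolved dynamically from the query output;
-- the dynamic partition values must be the trailing expressions in the SELECT list
FROM page_views src
INSERT OVERWRITE TABLE views_stg PARTITION (dt, applicationtype, page)
SELECT src.eventTime, src.userid, src.dt, src.applicationtype, src.page;
```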
Table Partitions
- Partitions for managed tables are created by loading data into the table
- LOCATION for EXTERNAL partitioned tables is optional
- Advantages to using the same directory structure as managed tables
  - Apache Hive
    - MSCK REPAIR TABLE table_name;
  - Amazon's Elastic MapReduce
    - ALTER TABLE table_name RECOVER PARTITIONS;
- Virtual columns and column name collision
- ALTER TABLE ADD PARTITION isn't restricted to managed tables
  - ALTER TABLE table_name [PARTITION spec] SET LOCATION "new location"
- Not everything results in partition pruning
  - Data must be in the lowest-level (leaf) directory
  - When a filter doesn't show up in the explain plan, partition pruning was used to service the predicate
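A short sketch of the maintenance commands above, assuming the external page_views table from the earlier slides (the 'relocated' path is hypothetical):

```sql
-- Pick up partition directories that were added to HDFS outside of Hive
MSCK REPAIR TABLE page_views;

-- Re-point an existing partition at a new directory
ALTER TABLE page_views PARTITION (dt='2013-09-09', applicationtype='iPhone')
SET LOCATION 'hdfs://NameNode/somewhere/on/hdfs/data/iphone/relocated';
```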
Data Retrieval

Hive Query Language

Group By

SELECT a, b, SUM(c)
FROM t1
GROUP BY a, b;

t1:            Result:
a  b  c        a  b  _c0
1  H  10       1  B  10
2  A  10       1  H  30
1  H  20       1  S  10
1  B  10       2  A  10
1  S  10

SELECT a, SUM(c)
FROM t1
GROUP BY a;

Result:
a  _c0
1  50
2  10
Grouping Sets, Cube, Rollup

SELECT a, b, SUM(c) FROM t1 GROUP BY a, b GROUPING SETS ((a,b), a)

-- equivalent to:
SELECT a, b, SUM(c) FROM t1 GROUP BY a, b
UNION ALL
SELECT a, NULL, SUM(c) FROM t1 GROUP BY a

SELECT a, b, SUM(c) FROM t1 GROUP BY a, b GROUPING SETS (a, b, ())

-- equivalent to:
SELECT a, NULL, SUM(c) FROM t1 GROUP BY a
UNION ALL
SELECT NULL, b, SUM(c) FROM t1 GROUP BY b
UNION ALL
SELECT NULL, NULL, SUM(c) FROM t1
Grouping Sets, Cube, Rollup

Cube
SELECT a, b, c, SUM(d) FROM t1 GROUP BY a, b, c WITH CUBE

-- equivalent to:
SELECT a, b, c, SUM(d) FROM t1 GROUP BY a, b, c GROUPING SETS
((a,b,c), (a,b), (b,c), (a,c), a, b, c, ())

Rollup
SELECT a, b, c, SUM(d) FROM t1 GROUP BY a, b, c WITH ROLLUP

-- equivalent to:
SELECT a, b, c, SUM(d) FROM t1 GROUP BY a, b, c GROUPING SETS
((a,b,c), (a,b), a, ())
Functions in Hive

- Built-in Functions
  - Mathematical
  - Collection
  - Type conversion
  - Date
  - Conditional
  - String
  - Misc.
  - xPath

- UDAFs

- UDTFs
Built-in Functions

- Mathematical
SELECT rand(), a FROM t1;    SELECT rand(3), rand(a) FROM t1;
SELECT pow(a, b) FROM t2;    SELECT tan(a) FROM t3;

abs(double a)
round(double a, int d)
floor(double a)

- Collection
size(Map<K.V>)
map_keys(Map<K.V>)
map_values(Map<K.V>)

SELECT array_contains(a, 'test') FROM t1;
Built-in Functions

- Date
unix_timestamp()
year(string d), month(string d), day(string d), hour, second
datediff(string enddate, string startdate)
date_add(string startdate, int days)
date_sub(string startdate, int days)
to_date(string timestamp)

- Conditional
SELECT IF(a = b, 'true result', 'false result') FROM t1;
SELECT COALESCE(a, b, c) FROM t1;
SELECT CASE a WHEN 123 THEN 'first' WHEN 456 THEN 'second'
  ELSE 'none' END FROM t1;
SELECT CASE WHEN a = 13 THEN c ELSE d END FROM t1;
Built-in Functions

- String
SELECT concat(a, b) FROM t1;    SELECT concat_ws(sep, a, b) FROM t1;
SELECT regexp_replace("Hive Rocks", "ive", "adoop") FROM dummy;

substr(string|binary A, int start)
substring(string|binary A, int start, int length)

sentences(string str, string lang, string locale)

SELECT sentences("Loving this course! Hive is awesome.") FROM dummy;
-- (("Loving", "this", "course"), ("Hive", "is", "awesome"))
Built-in Aggregate Functions (UDAFs)
COUNT(*), COUNT(expr), COUNT(DISTINCT expr)
SUM(col), SUM(DISTINCT col)

AVG, MIN, MAX, VARIANCE, STDDEV_POP

HISTOGRAM_NUMERIC(col, b)
returns array<struct {'x', 'y'}>
- each struct is one histogram bin: array[i].x is the bin center, array[i].y is the bin height
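A sketch of consuming the histogram output (the products table and price column are hypothetical); LATERAL VIEW explode turns the array of bins into one row per bin:

```sql
SELECT hbin.x AS bin_center, hbin.y AS bin_height
FROM (SELECT histogram_numeric(price, 3) AS hist FROM products) tmp
LATERAL VIEW explode(hist) exploded AS hbin;
```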


HAVING & GROUP BY

- Having Syntax
SELECT
  a, b, SUM(c)
FROM
  t1
GROUP BY
  a, b
HAVING
  SUM(c) > 2

- Group By on Function
SELECT
  CONCAT(a,b) as r
  , SUM(c)
FROM
  t1
GROUP BY
  CONCAT(a,b)
HAVING
  SUM(c) > 2
Sorting in Hive
ORDER BY
SELECT x, y, z FROM t1 ORDER BY x ASC

All mapper output (e.g. one mapper emitting A, B, D and another emitting C, A) is sent to a single reducer, which writes one globally sorted file: part-00000 containing A, A, B, C, D.
Sorting in Hive
SORT BY
SELECT x, y, z FROM t1 SORT BY x

The same rows are spread across several reducers, and each reducer sorts only its own output (e.g. part-00000: A, A; part-00001: C, D; part-00002: B). Each part file is sorted internally, but there is no total order across files.
Controlling Data Flow
DISTRIBUTE BY
SELECT x, y, z FROM t1 DISTRIBUTE BY y

Rows are routed to reducers by the value of y: every row sharing the same y value lands on the same reducer. DISTRIBUTE BY alone guarantees co-location only; it does not sort the rows within a reducer.
Controlling Data Flow
DISTRIBUTE BY with SORT BY
SELECT x, y, z FROM t1 DISTRIBUTE BY y SORT BY z

Rows are routed to reducers by y, and each reducer then sorts its rows by z.

CLUSTER BY
SELECT x, y, z FROM t1 CLUSTER BY y
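CLUSTER BY is shorthand for distributing and sorting on the same column, so the two queries below organize their output identically (t1 as in the slides):

```sql
-- These two statements are equivalent:
SELECT x, y, z FROM t1 DISTRIBUTE BY y SORT BY y;
SELECT x, y, z FROM t1 CLUSTER BY y;
```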
Command line options and variable substitution

Hive CLI
The CLI
- hive
- hive -e 'select a, b from t1 where c = 15'
- hive -S -e 'select a, b from t1' > results.txt
- hive -f /my/local/file/system/get-data.sql
-e and -f run hive in batch mode

- Variable Substitution
4 namespaces:
- hivevar
  - -d, --define, --hivevar
  - set hivevar:name=value
- hiveconf
  - --hiveconf
  - set hiveconf:property=value
- system
  - set system:property=value
- env
  - set env:property=value

$ hive -d srctable=movies
hive> set hivevar:cond=123;
hive> select a,b,c from pluralsight.${hivevar:srctable} where a = ${hivevar:cond};

$ hive -v -d src=movies -d db=pluralsight -e 'select * from ${hivevar:db}.${hivevar:src} LIMIT 100;'
Summary
- Data Types
  - Primitive and Complex
- Table Partitioning
  - Managed tables by loading data
  - Alter Table for External tables
  - Dynamic partition inserts
- Multi Inserts
- Functions
- Order By, Sort By, Distribute By, Cluster By
- The Hive CLI