Hive
05/31/2024 7
Hive architecture (from the paper)
Data model
Hive structures data into well-understood
database concepts such as tables, rows, columns,
and partitions
It supports primitive types: integers, floats,
doubles, and strings
Hive also supports:
◦ Associative arrays: map<key-type, value-type>
◦ Lists: list<element-type>
◦ Structs: struct<field-name: field-type, …>
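A minimal HiveQL sketch of how these complex types could appear in a table definition and be read back. The table user_events and its columns are hypothetical, chosen only for illustration. Note that HiveQL spells the list type ARRAY.

```sql
-- Hypothetical table; names are illustrative, not from the paper.
CREATE TABLE user_events (
  user_id INT,
  props   MAP<STRING, STRING>,              -- associative array
  tags    ARRAY<STRING>,                    -- list
  geo     STRUCT<city:STRING, zip:STRING>   -- struct
);

SELECT user_id,
       props['browser'],   -- map lookup by key
       tags[0],            -- first list element (0-based index)
       geo.city            -- struct field access
FROM user_events;
```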
SerDe: a serialize/deserialize API used to
move data in and out of tables
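One place a SerDe becomes visible is the table definition. The sketch below assumes a hypothetical table raw_sales_csv backed by comma-separated text files, and uses the OpenCSVSerde class that ships with Hive 0.14 and later. That SerDe reads every column as a string, so all columns are declared STRING here.

```sql
-- Hypothetical table over CSV text files; column names are illustrative.
CREATE TABLE raw_sales_csv (
  id   STRING,
  name STRING,
  ds   STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = ",")
STORED AS TEXTFILE;
```

Swapping the SerDe class changes how the same files are parsed without changing the queries written against the table.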
Query Language (HiveQL)
Subset of SQL
Meta-data queries
No inserts into existing tables or partitions (inserts overwrite the existing data)
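A few statements illustrating these points, as a sketch. It assumes a table named sales partitioned by ds already exists, and uses a hypothetical staging table staging_sales. The date value is made up.

```sql
-- Metadata queries
SHOW TABLES;
DESCRIBE sales;
SHOW PARTITIONS sales;

-- Familiar SQL subset: projection and aggregation
SELECT ds, COUNT(*) AS n
FROM sales
GROUP BY ds;

-- Bulk write: replaces the partition's contents rather than appending rows
INSERT OVERWRITE TABLE sales PARTITION (ds = '2024-05-31')
SELECT id, items
FROM staging_sales;
```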
Data Model
Tables
Basic type columns (int, float, boolean)
Complex types: List / Map (associative array)
Partitions
Buckets
CREATE TABLE sales (
  id INT,
  items ARRAY<STRUCT<id:INT, name:STRING>>
)
PARTITIONED BY (ds STRING)
CLUSTERED BY (id) INTO 32 BUCKETS;
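A sketch of how the partitions, buckets, and nested column of the sales table might be used in queries. The partition value is made up, and the bucket sample assumes the data was actually written bucketed on id.

```sql
-- Partition pruning: only the ds = '2024-05-31' partition is read
SELECT id
FROM sales
WHERE ds = '2024-05-31';

-- Sample one of the 32 buckets instead of scanning the whole partition
SELECT s.id
FROM sales TABLESAMPLE(BUCKET 1 OUT OF 32 ON id) s
WHERE s.ds = '2024-05-31';

-- Flatten the array of structs into one row per item
SELECT s.id, item.id AS item_id, item.name
FROM sales s
LATERAL VIEW explode(s.items) t AS item
WHERE s.ds = '2024-05-31';
```

Partitions map to directories and buckets to files within them, which is why filtering on ds or sampling on id can skip most of the data.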