2 Partitioning+QC+Done
VALUES IN PRODUCT COLUMN: Bananas, Nutella, Peanut Butter
/USER/HIVE/WAREHOUSE
    /SALES-TABLE
        /PRODUCT=PEANUT BUTTER
            /FILE-01
        /PRODUCT=NUTELLA
            /FILE-01
        /PRODUCT=MILK
            /FILE-01
THERE IS JUST ONE FILE INSIDE EACH DIRECTORY
WE PARTITION A TABLE BY ADDING
partitioned by (column_name column_data_type)
TO THE CREATE TABLE STATEMENT
/DATE=2015-01-21
    /FILE-01
/DATE=2015-01-22
    /FILE-01
/DATE=2015-01-23
    /FILE-01
/DATE=2015-01-24
    /FILE-01
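A table laid out like the date directories above could be declared roughly as follows. This is a minimal sketch: the table name Sales_Data_Date_Partition and the choice of regular columns are assumptions made for illustration, not taken from the slides.

```sql
-- Hypothetical table name; regular columns are assumed for illustration.
CREATE TABLE Sales_Data_Date_Partition
(
    StoreLocation VARCHAR(30),
    Product       VARCHAR(30),
    Revenue       DECIMAL(10,2)
)
-- The partition column is declared only here, not in the column list above.
-- Hive creates one directory per distinct OrderDate value.
PARTITIONED BY (OrderDate DATE);
```

Note that the partition column is not repeated in the regular column list; Hive rejects a column that appears in both places.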
WE CAN PARTITION TABLES BY TWO COLUMNS
partitioned by
(
column_name_1 data_type_1,
column_name_2 data_type_2
)
SEPARATED BY A COMMA
/USER/HIVE/WAREHOUSE
    /SALES_DATA_DATE_PRODUCT_PARTITION
        /DATE=2015-01-17
            /PRODUCT=BANANAS
                /FILE-01
            /PRODUCT=PEANUT BUTTER
        /DATE=2015-01-18
            /PRODUCT=BANANAS
            /PRODUCT=PEANUT BUTTER
DATA IS STORED IN THESE FILES
HOW DO WE GET DATA FROM
TABLES WITH PARTITIONS?
CREATE TABLE Sales_Data_Date_Product_Partition
(
StoreLocation VARCHAR(30),
Revenue DECIMAL(10,2)
)
partitioned by
(
OrderDate DATE,
product VarChar(30)
);
WHEN YOU QUERY THIS TABLE JUST TREAT IT AS IF IT HAS 4 COLUMNS
YOU CAN WRITE SELECT STATEMENTS WHICH TREAT THE PARTITION COLUMNS (OrderDate, product) AS YOU WOULD ANY REGULAR COLUMN
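As a small illustration, the query below selects the two regular columns and the two partition columns side by side. It assumes the Sales_Data_Date_Product_Partition table defined earlier; the LIMIT value is arbitrary.

```sql
-- OrderDate and product are partition columns, but they are selected
-- exactly like the regular columns StoreLocation and Revenue.
SELECT StoreLocation, Revenue, OrderDate, product
FROM Sales_Data_Date_Product_Partition
LIMIT 10;
```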
QUERIES WITH CONDITIONS ON THE PARTITION COLUMNS WILL RUN FASTER, BECAUSE HIVE READS ONLY THE DIRECTORIES OF THE MATCHING PARTITIONS
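A sketch of such a query, assuming the Sales_Data_Date_Product_Partition table from earlier. The filter values are illustrative and assume a partition with that date and product exists.

```sql
-- Both conditions are on partition columns, so Hive only needs to scan
-- the single directory for that OrderDate and product
-- instead of every file in the table.
SELECT StoreLocation, SUM(Revenue) AS TotalRevenue
FROM Sales_Data_Date_Product_Partition
WHERE OrderDate = '2015-01-18'
  AND product = 'Nutella'
GROUP BY StoreLocation;
```

A condition on a regular column alone, such as StoreLocation, gives no such shortcut: every partition still has to be read.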
HOW DO WE PUT STUFF INTO
TABLES WITH PARTITIONS?
WE CREATED THE PARTITIONED TABLE Sales_Data_Date_Product_Partition USING THE CREATE TABLE COMMAND ABOVE
THERE ARE TWO WAYS TO LOAD DATA INTO A PARTITIONED TABLE: DYNAMIC PARTITION AND NON-DYNAMIC PARTITION
DYNAMIC PARTITIONING
LET US IMPORT DATA INTO A PARTITIONED TABLE FROM ANOTHER TABLE
SOURCE TABLE = SALES_DATA_WITHOUT_PARTITION
StoreLocation   Product         Date               Revenue
Bellandur       Bananas         January 18,2016    8,236.33
Bellandur       Nutella         January 18,2016    7,455.67
Bellandur       Peanut Butter   January 18,2016    5,316.89
HOW DO WE PUT STUFF INTO TABLES WITH PARTITIONS USING DYNAMIC PARTITIONING?
THE TARGET TABLE IS PARTITIONED ON DATE AND PRODUCT
WHEN WE LOAD INTO PARTITION TABLES, WE
SPECIFY ONLY NAMES OF PARTITION COLUMNS, IN THE ORDER THE TABLE DECLARES THEM
Insert into Sales_Data_Date_Product_Partition
partition(OrderDate,Product)
select StoreLocation,Revenue,OrderDate,Product
from Sales_Data_Without_Partition;
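On many Hive installations a fully dynamic insert like this is refused until the session allows it. The sketch below shows the two session settings commonly needed, followed by the load. Whether your cluster already enables them is an assumption to check, since defaults differ between Hive versions.

```sql
-- Allow dynamic partitioning, and allow ALL partition columns to be dynamic.
-- (In strict mode, Hive requires at least one static partition value.)
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Partition columns are named in the order the table declares them
-- (OrderDate, then product) and come last in the SELECT list, in that order.
INSERT INTO TABLE Sales_Data_Date_Product_Partition
PARTITION (OrderDate, product)
SELECT StoreLocation, Revenue, OrderDate, Product
FROM Sales_Data_Without_Partition;
```

Hive matches the trailing SELECT columns to the partition columns by position, not by name, so a swapped order would file rows under the wrong partition values or fail a type check.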
    /PRODUCT=BANANAS
        /FILE-01
    /PRODUCT=PEANUT BUTTER
/DATE=2015-01-18
    /PRODUCT=NUTELLA
    /PRODUCT=MILK
/DATE=2015-01-19
    /PRODUCT=MILK
/DATE=2015-01-20
    /PRODUCT=BANANAS
    /PRODUCT=PEANUT BUTTER
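For contrast, a non-dynamic (static) load names the partition values explicitly, so each statement fills exactly one partition. This is a sketch under assumptions: the date and product values are illustrative, and the source column OrderDate is assumed to hold a DATE or an ISO-formatted string.

```sql
-- Static load: the partition values are fixed in the PARTITION clause,
-- so the SELECT list supplies only the non-partition columns.
INSERT INTO TABLE Sales_Data_Date_Product_Partition
PARTITION (OrderDate = '2016-01-18', product = 'Bananas')
SELECT StoreLocation, Revenue
FROM Sales_Data_Without_Partition
WHERE OrderDate = '2016-01-18'
  AND Product = 'Bananas';
```

The WHERE clause matters here: without it, every source row would be written into the Bananas partition regardless of its actual product.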
HOW TO CHECK WHAT PARTITIONS EXIST IN A TABLE?