ABP W11-W12 Big Data Analytics Lab-HIVE
(A7902) (VCE-R21)
Open VirtualBox, start the Cloudera QuickStart VM, open a Terminal, and type "hive" to
launch the Hive shell.
11.a) DDL Commands for Databases
1) CREATE DATABASE statement is used to create a database in Hive. A database in Hive is a
namespace, or catalog, of tables.
Syn: CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path];
Simple creation
OK
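The command for the simple creation is not shown above. A minimal sketch of both forms follows; the database name testdb is hypothetical, while the employee name, comment, and location are taken from the database description that appears later in this lab.

```sql
-- Simple creation: only the database name is given (testdb is a hypothetical name).
CREATE DATABASE IF NOT EXISTS testdb;

-- Creation with the optional COMMENT and LOCATION clauses.
-- The comment and HDFS path mirror the employee database described later in this lab.
CREATE DATABASE IF NOT EXISTS employee
COMMENT 'this is employee database'
LOCATION '/user/hive/warehouse/hivedir/';
```

IF NOT EXISTS keeps the statement from failing when the database is already present, which is convenient when a lab is repeated.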
2) SHOW DATABASES statement lists all the databases present in the metastore.
Syn: SHOW (DATABASES|SCHEMAS) [LIKE 'wildcards'];
Wildcards in the pattern can only be '*' for any character(s) or '|' for a
choice. Examples are 'employees', 'emp*', and 'emp*|*ees', all of which will match the
database named 'employees'.
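The wildcard patterns described above can be tried as follows. This is a small sketch that assumes a database whose name starts with emp (such as employee) already exists.

```sql
-- List every database in the metastore.
SHOW DATABASES;

-- List only databases whose names start with "emp".
SHOW DATABASES LIKE 'emp*';

-- SCHEMAS is a synonym for DATABASES; '|' separates alternative patterns.
SHOW SCHEMAS LIKE 'emp*|*ees';
```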
6) ALTER database statement in Hive is used to change the metadata associated with the
database in Hive.
Syntax for changing Database Properties:
ALTER (DATABASE|SCHEMA) db_name SET DBPROPERTIES
(property_name=property_value, ...);
hive> ALTER DATABASE employee SET DBPROPERTIES ('creator'='Bhanu Prasad',
'date'='07-12-2020');
employee    this is employee database
hdfs://quickstart.cloudera:8020/user/hive/warehouse/hivedir/
cloudera    USER    {date=07-12-2020, creator=Bhanu Prasad}
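The listing above has the shape of extended database metadata (name, comment, location, owner, owner type, properties). The command that produced it is not shown in this handout; a likely candidate, offered as an assumption, is:

```sql
-- Show the name, comment, HDFS location, owner, and DBPROPERTIES of the database.
-- Without EXTENDED, the DBPROPERTIES map is not printed.
DESCRIBE DATABASE EXTENDED employee;
```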
2) SHOW tables statement in Hive lists all the base tables and views in the current
database.
Syn: SHOW TABLES [IN database_name];
hive> SHOW TABLES IN employee;
OK
emptable
3) DESCRIBE table statement in Hive shows the list of columns for the specified table.
Syn: DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name[.col_name ([.field_name])];
hive> DESCRIBE employee.emptable;
emp_id string This is Employee ID
emp_name string This is Employee Name
emp_sal float This is Employee Salary
hive> DESCRIBE EXTENDED employee.emptable;
hive> DESCRIBE FORMATTED employee.emptable;
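The CREATE TABLE statement for emptable does not appear in this handout. A table definition that would yield the three columns and comments listed by DESCRIBE above could look like the sketch below; the comma delimiter and TEXTFILE storage are assumptions, not taken from the original lab.

```sql
-- Hypothetical definition matching the DESCRIBE output above.
CREATE TABLE IF NOT EXISTS employee.emptable (
  emp_id   STRING COMMENT 'This is Employee ID',
  emp_name STRING COMMENT 'This is Employee Name',
  emp_sal  FLOAT  COMMENT 'This is Employee Salary'
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','   -- assumed delimiter for comma-separated text files
STORED AS TEXTFILE;        -- assumed storage format
```

DESCRIBE FORMATTED on such a table additionally reports the storage format, location, and table type in a readable layout.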
4) ALTER table statement in Hive enables you to change the structure of an existing table,
rename the table, add columns to the table, change the table properties, etc.
Syntax for renaming a table:
ALTER TABLE table_name RENAME TO new_table_name;
hive> ALTER TABLE employee.emptable RENAME TO employee.facultytable;
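Besides renaming, the description above mentions adding columns and changing table properties. A brief sketch of those two forms follows; the column emp_dept and the property values are hypothetical and only illustrate the syntax.

```sql
-- Add a new column at the end of the existing column list (emp_dept is hypothetical).
ALTER TABLE employee.facultytable
ADD COLUMNS (emp_dept STRING COMMENT 'hypothetical department column');

-- Attach or change key/value metadata on the table.
ALTER TABLE employee.facultytable
SET TBLPROPERTIES ('creator' = 'Bhanu Prasad');
```

ADD COLUMNS changes only the table metadata; existing data files are not rewritten, so the new column reads as NULL for rows that are already stored.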
5) DROP table statement in Hive deletes the data of a managed table and removes all
metadata associated with it from the Hive metastore.
If PURGE is not specified, the data is moved to the .Trash/current directory
(when Trash is configured), from where it can still be recovered.
If PURGE is specified, the data skips the trash and is lost completely.
Syn: DROP TABLE [IF EXISTS] table_name [PURGE];
hive> DROP TABLE IF EXISTS employee.emptable PURGE;
OK
6) TRUNCATE table statement in Hive removes all the rows from the table or partition.
Syn: TRUNCATE TABLE table_name;
hive> TRUNCATE TABLE employee.emptable;
OK
12.a) DML Commands for Tables
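1) LOAD statement in Hive is used to copy/move data files into the locations
corresponding to Hive tables.
Syn: LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
[PARTITION (partcol1=val1, partcol2=val2 ...)];
If OVERWRITE is specified, the contents of the target table (or partition) are deleted and
replaced by the files; otherwise the files are added to the existing contents of the table.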
OK
emptextdata contents
1,bob,25000.00,asstprof,35,male
2,mary,35000.00,assocprof,38,female
3,mike,50000.00,prof,45,male
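The LOAD command itself is not shown above. A sketch of how the emptextdata file could be loaded follows; the local path /home/cloudera/emptextdata is an assumption, and the target table is assumed to be a comma-delimited text table with matching columns.

```sql
-- Append the local file to the table (LOCAL reads from the VM's file system, not HDFS).
LOAD DATA LOCAL INPATH '/home/cloudera/emptextdata'
INTO TABLE employee.facultytable;

-- Replace the table's current contents with the file instead of appending.
LOAD DATA LOCAL INPATH '/home/cloudera/emptextdata'
OVERWRITE INTO TABLE employee.facultytable;
```

Without LOCAL, the path refers to HDFS and the file is moved, not copied, into the table's directory.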
2) SELECT statement in Hive is similar to the SELECT statement in SQL used for retrieving
data from the database.
Syn: SELECT * FROM tablename; //displays all records
hive> SELECT * FROM employee.facultytable;
1 bob 25000.00 asstprof 35 male
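SELECT also accepts a column list, a WHERE filter, and a LIMIT, as in standard SQL. The sketch below assumes the columns emp_name, emp_sal, and emp_age that are used elsewhere in this lab; the threshold values are arbitrary.

```sql
-- Project two columns and keep only rows above a salary threshold.
SELECT emp_name, emp_sal
FROM employee.facultytable
WHERE emp_sal > 30000;

-- Return at most two rows, ordered by age.
SELECT *
FROM employee.facultytable
ORDER BY emp_age
LIMIT 2;
```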
3) a) INSERT INTO statement appends data to the existing data in the table or partition.
Syn: INSERT INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2
...)] VALUES (col1value,col2value,…)
hive> INSERT INTO TABLE employee.facultytable VALUES (4, 'jessy', 45000.00,
'assocprof', 40, 'female');
hive> SELECT * FROM employee.facultytable;
4 jessy 45000.00 assocprof 40 female
1 bob 25000.00 asstprof 35 male
2 mary 35000.00 assocprof 38 female
3 mike 50000.00 prof 45 male
b) INSERT OVERWRITE table overwrites the existing data in the table or partition.
Syn: INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, ..) [IF
NOT EXISTS]] select_statement FROM from_statement;
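No example is given for INSERT OVERWRITE. A minimal sketch follows; the target table employee.seniorfaculty is hypothetical and is created with the same schema as facultytable so that the column lists match.

```sql
-- Create an empty table with the same columns as facultytable (hypothetical target).
CREATE TABLE IF NOT EXISTS employee.seniorfaculty LIKE employee.facultytable;

-- Replace whatever seniorfaculty currently holds with the result of the query.
INSERT OVERWRITE TABLE employee.seniorfaculty
SELECT * FROM employee.facultytable
WHERE emp_age >= 40;
```

Running the INSERT OVERWRITE a second time leaves a single copy of the selected rows, whereas INSERT INTO would append them again.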
4) DELETE statement in Hive deletes rows from a table. If the WHERE clause is specified,
only the rows that satisfy the condition are deleted; otherwise all rows are deleted.
Syn: DELETE FROM tablename [WHERE expression];
hive> DELETE FROM employee.facultytable WHERE emp_age=38;
hive> SELECT * FROM employee.facultytable;
4 jessy 45000.00 assocprof 40 female
1 bob 25000.00 asstprof 35 male
3 mike 50000.00 prof 45 male
5) UPDATE statement in Hive updates the table data. If the WHERE clause is specified,
only the rows that satisfy the condition are updated; otherwise all rows are updated.
Partitioning and bucketing columns cannot be updated.
Syn: UPDATE tablename SET column = value [, column = value ...] [WHERE
expression];
hive> UPDATE employee.facultytable SET emp_name = 'mike tyson' WHERE
emp_age=45;
hive> SELECT * FROM employee.facultytable;
4 jessy 45000.00 assocprof 40 female
1 bob 25000.00 asstprof 35 male
3 mike tyson 50000.00 prof 45 male
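In Hive, DELETE and UPDATE work only on transactional (ACID) tables, so the statements above fail on a plain text table. The sketch below shows one way such a table could be set up; the table name facultytable_acid, the column names emp_desg and emp_gender, and the bucket count are hypothetical, and whether the Hive build in the QuickStart VM supports transactions should be checked before relying on it.

```sql
-- Session settings commonly required for ACID operations.
SET hive.support.concurrency = true;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- On Hive 1.x, bucketing enforcement is usually needed as well:
-- SET hive.enforce.bucketing = true;

-- Transactional tables must be stored as ORC; older Hive versions also require bucketing.
CREATE TABLE IF NOT EXISTS employee.facultytable_acid (
  emp_id     STRING,
  emp_name   STRING,
  emp_sal    FLOAT,
  emp_desg   STRING,   -- hypothetical column name
  emp_age    INT,
  emp_gender STRING    -- hypothetical column name
)
CLUSTERED BY (emp_id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Row-level changes are now permitted on this table.
INSERT INTO TABLE employee.facultytable_acid
VALUES ('3', 'mike', 50000.00, 'prof', 45, 'male');

UPDATE employee.facultytable_acid SET emp_name = 'mike tyson' WHERE emp_age = 45;
DELETE FROM employee.facultytable_acid WHERE emp_age = 38;
```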
6) EXPORT statement exports the table or partition data along with the metadata to the
specified output location in HDFS. Metadata is exported in a _metadata file, and data is
exported in a subdirectory named 'data'.
Syn: EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])] TO
'export_target_path' [ FOR replication('eventid') ];
hive> EXPORT TABLE employee.drivertable TO '/user/hive/warehouse';
7) IMPORT command imports the data from a specified location to a new table or already
existing table.
Syn: IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION
(part_column="value"[, ...])]] FROM 'source_path' [LOCATION 'import_target_path'];
hive> IMPORT TABLE employee.importedtable FROM '/user/hive/warehouse';
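EXPORT and IMPORT are usually paired through a dedicated directory, since EXPORT expects an empty or non-existent target path. A sketch of a round trip follows; the HDFS path /user/cloudera/export/facultytable and the table name facultytable_copy are hypothetical.

```sql
-- Work inside the employee database so table names can stay unqualified.
USE employee;

-- Write the _metadata file and the data subdirectory to a fresh HDFS directory.
EXPORT TABLE facultytable TO '/user/cloudera/export/facultytable';

-- Recreate the table under a new name from the exported metadata and data.
IMPORT TABLE facultytable_copy FROM '/user/cloudera/export/facultytable';

-- Confirm that the imported table holds the same rows.
SELECT * FROM facultytable_copy;
```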