
UNIT 5

Hive, data types and file formats, HiveQL data definition, HiveQL data manipulation, HiveQL
queries. Case study on analysing different phases of data analytics.


HIVE
The term 'Big Data' refers to collections of large datasets characterized
by huge volume, high velocity, and a wide variety of data, all of which
grow day by day. It is difficult to process Big Data using traditional
data management systems, so the Apache Software Foundation introduced a
framework called Hadoop to solve Big Data management and processing
challenges.
Hadoop

Hadoop is an open-source framework to store and process Big Data in a
distributed environment. It contains two modules: MapReduce and the Hadoop
Distributed File System (HDFS).
 MapReduce: A parallel programming model for processing large amounts
of structured, semi-structured, and unstructured data on large clusters
of commodity hardware.
 HDFS: The Hadoop Distributed File System is part of the Hadoop
framework, used to store the datasets. It provides a fault-tolerant file
system that runs on commodity hardware.
The Hadoop ecosystem contains different sub-projects (tools) such as Sqoop,
Pig, and Hive that support the core Hadoop modules.
 Sqoop: Used to import and export data between HDFS and relational
databases (RDBMS).
 Pig: A procedural-language platform used to develop scripts for
MapReduce operations.
 Hive: A platform used to develop SQL-type scripts for MapReduce
operations.

Note: There are various ways to execute MapReduce operations:

 The traditional approach, using a Java MapReduce program for structured,
semi-structured, and unstructured data.
 The scripting approach, using Pig to process structured and
semi-structured data.
 The Hive Query Language (HiveQL or HQL), using Hive to process
structured data.

What is Hive

Hive is a data warehouse infrastructure tool for processing structured data
in Hadoop. It resides on top of Hadoop to summarize Big Data, and it makes
querying and analysis easy.
Hive was initially developed by Facebook; later, the Apache Software
Foundation took it up and developed it further as open source under the
name Apache Hive.
Features of Hive

 It stores the schema in a database and the processed data in HDFS.
 It is designed for OLAP (Online Analytical Processing).
 It provides an SQL-type query language called HiveQL or HQL.
 It is familiar, fast, scalable, and extensible.

Architecture of Hive

The following component diagram depicts the architecture of Hive:

This component diagram contains different units. The following table describes
each unit:

Unit Name           Operation

User Interface      Hive is data warehouse infrastructure software that
                    mediates interaction between the user and HDFS. The
                    user interfaces that Hive supports are the Hive Web UI,
                    the Hive command line, and Hive HDInsight (on Windows
                    Server).

Metastore           Hive uses a database server to store the schema or
                    metadata of tables, databases, columns in a table,
                    their data types, and the HDFS mapping. The metadata is
                    stored in a traditional RDBMS format.

HiveQL Process      HiveQL is similar to SQL and queries the schema
Engine              information held in the Metastore. It is one
                    replacement for the traditional MapReduce programming
                    approach: instead of writing a MapReduce program in
                    Java, we write a query for the MapReduce job and let
                    Hive process it.

Execution Engine    The conjunction of the HiveQL Process Engine and
                    MapReduce is the Hive Execution Engine. The execution
                    engine processes the query and generates the same
                    results as MapReduce, using the MapReduce framework
                    underneath.

HDFS or HBase       HDFS (the Hadoop Distributed File System) or HBase is
                    the data storage technique used to store the data in
                    the file system.

Working of Hive

The following diagram depicts the workflow between Hive and Hadoop.

The following table defines how Hive interacts with Hadoop framework:

Step No.   Operation

1 Execute Query
The Hive interface, such as the Command Line or Web UI, sends the query
to the Driver (any database driver such as JDBC, ODBC, etc.) to execute.

2 Get Plan
The driver takes the help of the query compiler, which parses the query
to check the syntax and build the query plan.

3 Get Metadata
The compiler sends metadata request to Metastore (any database).

4 Send Metadata
Metastore sends metadata as a response to the compiler.

5 Send Plan
The compiler checks the requirement and resends the plan to the
driver. Up to here, the parsing and compiling of a query is complete.

6 Execute Plan
The driver sends the execute plan to the execution engine.

7 Execute Job
Internally, the execution process is a MapReduce job. The execution
engine sends the job to the JobTracker, which resides in the Name node,
and the JobTracker assigns it to a TaskTracker, which resides in a Data
node. Here, the query executes as a MapReduce job.

7.1 Metadata Ops
Meanwhile, during execution, the execution engine can execute metadata
operations with the Metastore.

8 Fetch Result
The execution engine receives the results from Data nodes.

9 Send Results
The execution engine sends those resultant values to the driver.

10 Send Results
The driver sends the results to Hive Interfaces.
DATA TYPES AND FILE FORMATS
Hive provides different data types, which are used in table creation. All
the data types in Hive are classified into four categories, given as
follows:
 Column Types
 Literals
 Null Values
 Complex Types

Column Types

Column types are used as the column data types of Hive tables. They are as follows:

Integral Types
Integer type data can be specified using the integral data types, chiefly
INT. When the data range exceeds the range of INT, you need to use BIGINT,
and if the data range is smaller than that of INT, you use SMALLINT.
TINYINT is smaller than SMALLINT.
The following table depicts various INT data types:

Type Postfix Example

TINYINT Y 10Y

SMALLINT S 10S

INT - 10

BIGINT L 10L

String Types
String type data can be specified using single quotes (' ') or double
quotes (" "). Hive contains two such data types: VARCHAR and CHAR. Hive
follows C-style escape characters.

The following table depicts the CHAR-family data types:

Data Type    Length

VARCHAR      1 to 65535

CHAR         255

Timestamp
It supports the traditional UNIX timestamp with optional nanosecond
precision, in the java.sql.Timestamp format "yyyy-mm-dd hh:mm:ss.fffffffff".

Dates
DATE values are described in year/month/day format, in the form
YYYY-MM-DD.

Decimals
The DECIMAL type in Hive is the same as Java's BigDecimal format. It is
used for representing immutable arbitrary-precision numbers. The syntax and
an example are as follows:
DECIMAL(precision, scale)
decimal(10,0)
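
As an illustrative sketch (the table and column names here are assumptions,
not from the source), several of the column types above can appear together
in one table definition:

CREATE TABLE accounts (
acct_id BIGINT,
owner VARCHAR(50),
branch CHAR(3),
opened DATE,
balance DECIMAL(10,2));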

Union Types
Union is a collection of heterogeneous data types. You can create an
instance using the create_union UDF. The syntax and an example are as
follows:
UNIONTYPE<int, double, array<string>, struct<a:int,b:string>>
{0:1}
{1:2.0}
{2:["three","four"]}
{3:{"a":5,"b":"five"}}
{2:["six","seven"]}
{3:{"a":8,"b":"eight"}}
{0:9}
{1:10.0}

Literals

The following literals are used in Hive:


Floating Point Types
Floating point types are numbers with decimal points. Generally, this type
of data is represented by the DOUBLE data type.
Decimal Type
Decimal type data is a floating point value with a higher range than the
DOUBLE data type. The range of the decimal type is approximately -10^308
to 10^308.
Null Value

Missing values are represented by the special value NULL.


Complex Types

The Hive complex data types are as follows:


Arrays
Arrays in Hive are used the same way they are used in Java.
Syntax: ARRAY<data_type>
Maps
Maps in Hive are similar to Java Maps.
Syntax: MAP<primitive_type, data_type>
Structs
Structs in Hive are similar to C structs: a group of named fields, each of which can carry a comment.
Syntax: STRUCT<col_name : data_type [COMMENT col_comment], ...>
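
A short illustrative sketch (table and column names are hypothetical)
showing how the complex types appear in a table definition and how their
elements are accessed:

CREATE TABLE emp_contacts (
name STRING,
skills ARRAY<STRING>,
phone MAP<STRING, BIGINT>,
address STRUCT<street:STRING, city:STRING, zip:INT>);

-- array index, map key, and struct field access
SELECT name, skills[0], phone['home'], address.city FROM emp_contacts;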

HIVE FILE FORMATS:


Following are the different Apache Hive file formats:
 Text File
 Sequence File
 RC File
 AVRO File
 ORC File
 Parquet File

Hive Text File Format:

The text file format is Hive's default storage format. You can use the
text format to interchange data with other client applications. The text
file format is very common in most applications. Data is stored in lines,
with each line being a record. Each line is terminated by a newline
character (\n).
The text format is a simple plain-text file format. You can apply
compression (e.g., BZIP2) to a text file to reduce storage space.
Create a TEXT file table by adding the storage option 'STORED AS
TEXTFILE' at the end of a Hive CREATE TABLE command.
Syntax:
Create table textfile_table(column_specs)
stored as textfile;

Hive Sequence File Format:

Sequence files are Hadoop flat files that store values as binary key-value
pairs. Sequence files are in binary format, and they are splittable. The
main advantage of using sequence files is the ability to merge two or more
files into one file.
Create a sequence file table by adding the storage option 'STORED AS
SEQUENCEFILE' at the end of a Hive CREATE TABLE command.

Syntax
Create table sequencefile_table (column_specs)
stored as sequencefile;

Hive RC File Format

RCFile (Record Columnar File) is a row-columnar file format. It is another
Hive file format, offering high row-level compression rates. If you have a
requirement to process multiple rows at a time, you can use the RCFile
format. RCFiles are very similar to the sequence file format; this file
format also stores the data as key-value pairs.
Create an RCFile by specifying the 'STORED AS RCFILE' option at the end
of a CREATE TABLE command:
Syntax:
Create table RCfile_table(column_specs)
stored as rcfile;

Hive AVRO File Format


Avro stores the data definition (schema) in JSON format, making it easy to
read and interpret by any program. The data itself is stored in binary
format, making it compact and efficient.
Syntax:
Create table avro_table(column_specs) stored as avro;
Hive ORC File Format

ORC stands for Optimized Row Columnar. The ORC file format provides a
highly efficient way to store data in Hive tables. This file format was
designed to overcome limitations of the other Hive file formats. Using ORC
files improves performance when Hive is reading, writing, and processing
data from large tables.
Syntax:
Create table orc_table(column_specs) stored as orc;

Hive Parquet File Format

Parquet is a column-oriented binary file format. Parquet is highly
efficient for large-scale queries, and it is especially good for queries
that scan particular columns within a table. Parquet tables support Snappy
and gzip compression; Snappy is currently the default.
Create a Parquet table by specifying the 'STORED AS PARQUET' option at
the end of a CREATE TABLE command.
Syntax:
Create table parquet_table(column_specs) stored as parquet;
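
As a hedged illustration (the table names sales_text and sales_parquet are
assumptions), a common pattern is to define a Parquet table and populate it
from an existing text-format table with INSERT ... SELECT:

Create table sales_parquet(item STRING, qty INT, price DECIMAL(8,2))
stored as parquet;

INSERT INTO TABLE sales_parquet
SELECT item, qty, price FROM sales_text;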

HIVEQL DATA DEFINITION


Hive DDL commands are the statements used for defining and changing the
structure of a table or database in Hive. They are used to build or modify
tables and other objects in a database.

The several types of Hive DDL commands are:


1. CREATE
2. SHOW
3. DESCRIBE
4. USE
5. DROP
6. ALTER
7. TRUNCATE

Table-1 Hive DDL commands

DDL Command    Use With

CREATE         Database, Table
SHOW           Databases, Tables, Table Properties, Partitions, Functions, Index
DESCRIBE       Database, Table, View
USE            Database
DROP           Database, Table
ALTER          Database, Table
TRUNCATE       Table

Note that the Hive commands are case-insensitive.

1.Create Database:
The CREATE DATABASE statement is used to create a database in Hive. The
keywords DATABASE and SCHEMA are interchangeable; we can use either one.
Syntax:
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name;
Example :
CREATE DATABASE IF NOT EXISTS bda;
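
CREATE DATABASE also accepts an optional comment and HDFS location; a
minimal sketch (the comment text and path are assumptions):

CREATE DATABASE IF NOT EXISTS bda
COMMENT 'big data analytics unit 5'
LOCATION '/user/hive/warehouse/bda.db';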

2.Show Database:
The SHOW DATABASES statement lists all the databases present in Hive.
Syntax:
SHOW (DATABASES|SCHEMAS);
Example:
SHOW DATABASES;

3.Describe Database:
The DESCRIBE DATABASE statement in Hive shows the name of the database and
its location on the file system.
The EXTENDED keyword can be used to also show the database properties.
Syntax:
DESCRIBE (DATABASE|SCHEMA) [EXTENDED] db_name;
Example
DESCRIBE DATABASE bda;
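
For instance, to also see any DBPROPERTIES set on the database:

DESCRIBE DATABASE EXTENDED bda;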

4.Use Database
The USE statement in Hive is used to select the specific database for a
session on which all subsequent HiveQL statements would be executed.
Syntax:
USE database_name;
Example
USE bda;

5.Drop Database
The DROP DATABASE statement in Hive is used to drop (delete) a database.
The default behavior is RESTRICT, which means that the database is
dropped only when it is empty. To drop a database that contains tables, we
can use CASCADE.
Syntax:
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name
[RESTRICT|CASCADE];

Example:
DROP DATABASE IF EXISTS bda CASCADE;

After dropping, the database can be recreated if required:

CREATE DATABASE bda;
USE bda;
CREATE TABLE mca (name STRING, dob DATE);
6.Alter Database:
The ALTER DATABASE statement in Hive is used to change the metadata
associated with a database.
Syntax:
ALTER (DATABASE|SCHEMA) database_name SET
DBPROPERTIES (property_name=property_value, ...);
Example:
ALTER DATABASE bda SET DBPROPERTIES ('createdfor'='mca');

In this example, we are setting the database properties of the 'bda'
database after its creation by using the ALTER command.

Syntax for changing the database owner:

ALTER (DATABASE|SCHEMA) database_name SET OWNER
[USER|ROLE] user_or_role;
Example:
ALTER DATABASE bda SET OWNER ROLE admin;

In this example, we are changing the owner role of the 'bda' database
using the ALTER statement.

1. CREATE TABLE

The CREATE TABLE statement in Hive is used to create a table with the
given name. If a table or view with the same name already exists, an error
is thrown. We can use IF NOT EXISTS to skip the error.

Syntax:
CREATE TABLE [IF NOT EXISTS] [db_name.]table_name [(col_name
data_type [COMMENT col_comment], ...)]
[COMMENT table_comment] [ROW FORMAT row_format] [STORED AS
file_format] [LOCATION hdfs_path];

Example
CREATE TABLE IF NOT EXISTS Employee(
Emp_ID STRING COMMENT 'this is the employee id',
Emp_designation STRING COMMENT 'this is the employee post');
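
A slightly fuller sketch (the extra column, delimiter, and table name are
assumptions) that also sets a row format and storage format:

CREATE TABLE IF NOT EXISTS Employee_csv(
Emp_ID STRING,
Emp_designation STRING,
Emp_salary FLOAT)
COMMENT 'employee details loaded from a CSV file'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;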

2. SHOW TABLES in Hive

The SHOW TABLES statement in Hive lists all the base tables
and views in the current database.
Syntax:
SHOW TABLES [IN database_name];
Example
SHOW TABLES;
3. DESCRIBE TABLE in Hive

The DESCRIBE statement in Hive shows the list of columns for the
specified table.
Syntax:
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name
[.col_name [.field_name]];
Example:
DESCRIBE Employee;
It describes each column's name, data type, and comment.

4. DROP TABLE in Hive

The DROP TABLE statement in Hive deletes the data for a particular table
and removes all metadata associated with it from the Hive metastore.
If PURGE is not specified, the data is moved to the .Trash/Current
directory. If PURGE is specified, the data is lost completely.
Syntax:
DROP TABLE [IF EXISTS] table_name [PURGE];
Example:
DROP TABLE IF EXISTS Employee PURGE;

5. ALTER TABLE in Hive

The ALTER TABLE statement in Hive enables you to change the


structure of an existing table. Using the ALTER TABLE statement we can
rename the table, add columns to the table, change the table properties, etc.

Syntax to Rename a table:


ALTER TABLE table_name RENAME TO new_table_name;
Example
ALTER TABLE Employee RENAME TO Comp_Emp;

In this example, we are renaming the 'Employee' table to 'Comp_Emp'
using the ALTER statement.
Syntax to add columns to a table:
ALTER TABLE table_name ADD COLUMNS (col_name data_type, ...);
Example
ALTER TABLE Comp_Emp ADD COLUMNS (emp_dob STRING,
emp_contact STRING);
In this example, we are adding the two columns 'emp_dob' and
'emp_contact' to the 'Comp_Emp' table using the ALTER command.
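
ALTER TABLE can also rename or retype a single column with the CHANGE
clause; a short sketch (the new column name emp_birthdate is an
assumption):

ALTER TABLE Comp_Emp CHANGE emp_dob emp_birthdate STRING;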

6. TRUNCATE TABLE

TRUNCATE TABLE statement in Hive removes all the rows from the
table or partition.
Syntax:
TRUNCATE TABLE table_name;
Example
TRUNCATE TABLE Comp_Emp;
It removes all the rows in Comp_Emp Table.

HIVEQL DATA MANIPULATION


Hive DML (Data Manipulation Language) commands are used to insert, update,
retrieve, and delete data from a Hive table once the table and database
schema have been defined using Hive DDL commands.
The various Hive DML commands are:
1. LOAD
2. SELECT
3. INSERT
4. DELETE
5. UPDATE
6. EXPORT
7. IMPORT

1.Load Command:
The LOAD statement in Hive is used to move data files into the locations
corresponding to Hive tables.
 If a LOCAL keyword is specified, then the LOAD command will look
for the file path in the local filesystem.
 If the LOCAL keyword is not specified, then Hive will need the
absolute URI of the file.
 In case the keyword OVERWRITE is specified, then the contents of the
target table/partition will be deleted and replaced by the files referred by
filepath.
 If the OVERWRITE keyword is not specified, then the files referred by
filepath will be appended to the table.

Syntax:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO
TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Example:
LOAD DATA LOCAL INPATH '/home/user/dab' INTO TABLE emp_data;
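
For comparison, a load from HDFS with OVERWRITE (the HDFS path here is
hypothetical) replaces the table's existing contents instead of appending:

LOAD DATA INPATH '/user/hive/input/emp_data' OVERWRITE INTO TABLE emp_data;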

2. SELECT COMMAND
The SELECT statement in Hive is similar to the SELECT statement in
SQL used for retrieving data from the database.
Syntax:
SELECT col1,col2 FROM tablename;

Example:

SELECT * FROM emp_data;

It will display all the rows in emp_data

3. INSERT Command
The INSERT command in Hive loads data into a Hive table. We can insert
into either a Hive table or a partition.
a. INSERT INTO

The INSERT INTO statement appends data to the existing data in the table
or partition. The INSERT INTO statement is available from Hive version 0.8.
Syntax:
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1,
partcol2=val2 ...)] select_statement1 FROM from_statement;
Example
CREATE TABLE IF NOT EXISTS example(id STRING, name STRING,
dep STRING, state STRING, salary STRING, year STRING);

Use an INSERT statement to load data into table "example".


Example:
INSERT INTO TABLE example SELECT emp.emp_id, emp.emp_name,
emp.emp_dep, emp.state, emp.salary, emp.year_of_joining FROM emp_data
emp;

SELECT * FROM example;


b. INSERT OVERWRITE

The INSERT OVERWRITE statement overwrites the existing data in the table
or partition.

Example:
Here we are overwriting the existing data of the table ‘example’ with the
data of table ‘dummy’ using INSERT OVERWRITE statement.

INSERT OVERWRITE TABLE example SELECT dmy.enroll, dmy.name,


dmy.department, dmy.salary, dmy.year from dummy dmy;

By using the SELECT statement we can verify whether the existing data of the
table ‘example’ is overwritten by the data of table ‘dummy’ or not.
SELECT * FROM EXAMPLE;
c. INSERT .. VALUES

The INSERT ... VALUES statement in Hive inserts data into a table directly
as literal rows specified in SQL.
Example:
Inserting data into the 'student' table using an INSERT ... VALUES statement.
INSERT INTO TABLE student VALUES (101,'Callen','IT','7.8'),
(103,'joseph','CS','8.2'),
(105,'Alex','IT','7.9');
SELECT * FROM student;

4. DELETE command
The DELETE statement in Hive deletes table data. If a WHERE clause is
specified, it deletes only the rows that satisfy the condition in the
WHERE clause. (Note: DELETE and UPDATE work only on tables that support
ACID transactions.)
Syntax:
DELETE FROM tablename [WHERE expression];
Example:
In the below example, we are deleting the data of the student from table
‘student’ whose roll_no is 105.
DELETE FROM student WHERE roll_no=105;
SELECT * FROM student;
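
Since DELETE and UPDATE require an ACID (transactional) table, a minimal
sketch of such a table definition is shown below; the bucketing column and
bucket count are assumptions for illustration:

CREATE TABLE student (roll_no INT, name STRING, branch STRING, cgpa STRING)
CLUSTERED BY (roll_no) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');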
5. UPDATE Command
The UPDATE statement in Hive updates table data. If a WHERE clause is
specified, it updates the columns of the rows that satisfy the condition
in the WHERE clause.
Syntax:
UPDATE tablename SET column = value [, column = value ...] [WHERE
expression];
Example:
In this example, we are updating the branch of the student whose roll_no is 103
in the ‘student’ table using an UPDATE statement.
UPDATE student SET branch='IT' WHERE roll_no=103;
SELECT * FROM student;

6. EXPORT Command
The Hive EXPORT statement exports the table or partition data along
with the metadata to the specified output location in the HDFS.
Example:
Here in this example, we are exporting the student table to the HDFS
directory “export_from_hive”.
EXPORT TABLE student TO 'export_from_hive';

The table is exported successfully. You can check for the _metadata file
and data sub-directory using the ls command.
7. IMPORT Command
The Hive IMPORT command imports the data from a specified location
to a new table or already existing table.
Example:
Here in this example, we are importing the data exported in the above
example into a new Hive table 'imported_table':
IMPORT TABLE imported_table FROM 'export_from_hive';

Verify whether the data was imported using a Hive SELECT statement:
SELECT * FROM imported_table;

HIVEQL QUERIES.
Like SQL, HiveQL contains:
 HiveQL operators,
 HiveQL functions,
 HiveQL GROUP BY & HAVING,
 HiveQL ORDER BY & SORT BY,
 HiveQL joins.

HiveQL - Operators
The HiveQL operators facilitate performing various arithmetic and
relational operations. Here, we are going to execute such operations on
the records:

Let's create a table and load the data into it by using the following steps:

Select the database in which we want to create a table.

hive> use hql;

Create a Hive table using the following command:

hive> create table employee (Id int, Name string, Salary float);
Now, load the data into the table.
hive> load data local inpath '/home/codegyani/hive/emp_data' into table employee;

Let's fetch the loaded data by using the following command:

hive> select * from employee;

Arithmetic Operators in Hive


In Hive, the arithmetic operators accept any numeric type. The commonly
used arithmetic operators are:

Operators   Description

A+B This is used to add A and B.

A-B This is used to subtract B from A.

A*B This is used to multiply A and B.

A/B This is used to divide A and B and returns the quotient of the
operands.

A%B This returns the remainder of A / B.

A|B This is used to determine the bitwise OR of A and B.

A&B This is used to determine the bitwise AND of A and B.

A^B This is used to determine the bitwise XOR of A and B.


~A This is used to determine the bitwise NOT of A.

Let's see an example to increase the salary of each employee by 50.

hive> select id, name, salary + 50 from employee;

Let's see an example to decrease the salary of each employee by 50.

hive> select id, name, salary - 50 from employee;

Let's see an example to find out the 10% salary of each employee.

hive> select id, name, (salary * 10) /100 from employee;

Relational Operators in Hive

Relational operators (such as =, !=, <, <=, >, >=) compare two operands.
For example, to filter employees by salary:

hive> select * from employee where salary >= 25000;

hive> select * from employee where salary < 25000;

HIVEQL Functions:

Hive provides many mathematical functions, such as
round(num), floor(num), sqrt(num), and abs(num).
hive> select Id, Name, sqrt(Salary) from employee_data;
Aggregate Functions in Hive
An aggregate function returns a single value resulting from a computation
over many rows. Let's see some commonly used aggregate functions:

count(*) - It returns the count of the number of rows present in the file.

sum(col) - It returns the sum of values.

sum(DISTINCT col) - It returns the sum of distinct values.

avg(col) - It returns the average of values.

min(col) - It compares the values and returns the minimum one.
max(col) - It compares the values and returns the maximum one.
Example:
hive> select max(Salary) from employee_data;

hive> select min(Salary) from employee_data;
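
Aggregate functions are often combined in a single query; a small sketch
against the same employee_data table:

hive> select count(*), avg(Salary), sum(Salary) from employee_data;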

HiveQL - GROUP BY and HAVING Clause

The GROUP BY clause groups the records in a result set by one or more
columns and is typically used with aggregate functions. Assume the
employee table contains the following data:

+------+--------------+-------------+-------------------+--------+
| ID   | Name         | Salary      | Designation       | Dept   |
+------+--------------+-------------+-------------------+--------+
|1201  | Gopal        | 45000       | Technical manager | TP     |
|1202  | Manisha      | 45000       | Proofreader       | PR     |
|1203  | Masthanvali  | 40000       | Technical writer  | TP     |
|1204  | Krian        | 40000       | Hr Admin          | HR     |
|1205  | Kranthi      | 30000       | Op Admin          | Admin  |
+------+--------------+-------------+-------------------+--------+
First, ordering the rows by department with ORDER BY:
hive> SELECT * FROM employee ORDER BY Dept;
+------+--------------+-------------+-------------------+--------+
| ID | Name | Salary | Designation | Dept |
+------+--------------+-------------+-------------------+--------+
|1205 | Kranthi | 30000 | Op Admin | Admin |
|1204 | Krian | 40000 | Hr Admin | HR |
|1202 | Manisha | 45000 | Proofreader | PR |
|1201 | Gopal | 45000 | Technical manager | TP |
|1203 | Masthanvali | 40000 | Technical writer | TP |
+------+--------------+-------------+-------------------+--------+
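
ORDER BY produces one totally ordered result set; SORT BY, in contrast,
sorts rows only within each reducer, so the overall output may not be
fully ordered. A minimal sketch:

hive> SELECT * FROM employee SORT BY Dept;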
Now, counting employees per department with GROUP BY:
hive> SELECT Dept, count(*) FROM employee GROUP BY Dept;
+------+--------------+
| Dept | Count(*)     |
+------+--------------+
|Admin | 1            |
|HR    | 1            |
|PR    | 1            |
|TP    | 2            |
+------+--------------+

HAVING CLAUSE
The HQL HAVING clause is used with GROUP BY clause. Its purpose is to apply
constraints on the group of data produced by GROUP BY clause. Thus, it always returns
the data where the condition is TRUE.

hive> select Dept, sum(Salary) from employee group by Dept
having sum(Salary) >= 45000;

Output:
Dept      sum(Salary)
TP        85000
PR        45000

(HR with 40000 and Admin with 30000 are filtered out by the HAVING
condition.)
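
The topic list at the start of this section also mentions joins. A minimal
sketch of a HiveQL inner join, where the department table and its columns
are hypothetical:

hive> SELECT e.Id, e.Name, d.dept_name
FROM employee e JOIN department d ON (e.Dept = d.dept_id);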

Differences between Hive and Pig

Hive                                      Pig

Hive is commonly used by data             Pig is commonly used by
analysts.                                 programmers.

It follows SQL-like queries.              It follows a data-flow language.

It can handle structured data.            It can handle semi-structured data.

It works on the server side of an         It works on the client side of an
HDFS cluster.                             HDFS cluster.

Hive is slower than Pig.                  Pig is comparatively faster than Hive.

Limitations of Hive
o Hive is not capable of handling real-time data.
o It is not designed for online transaction processing.
o Hive queries have high latency.
