
Hive Tutorial For Beginners: Learn With Examples in 3 Days

This document provides an overview and syllabus for a 3 day Hive tutorial for beginners. It introduces Apache Hive, which helps query and manage large datasets using SQL-like queries. The syllabus covers basic Hive concepts like installation, configuration, data types, and advanced topics like partitions, buckets, indexes, queries and joins. It defines what Hive is, how it provides a SQL interface to analyze data stored in Hadoop using MapReduce, and compares Hive to using MapReduce directly.

Hive Tutorial for Beginners: Learn with Examples in 3 Days
By David Taylor, Updated April 16, 2022

Hive Tutorial Summary

Apache Hive helps with querying and managing large datasets quickly. It is an ETL
tool for the Hadoop ecosystem. In this Apache Hive tutorial for beginners, you will
learn Hive basics and important topics such as HQL queries, data extraction,
partitions, and buckets. This Hive tutorial series will help you learn Hive
concepts and basics.

What should I know?

To follow this Hive query tutorial, you need basic knowledge of SQL and Hadoop.
Knowledge of other databases is an additional help.

Hive Course Syllabus


Introduction
👉 Lesson 1 What is Hive? — Architecture & Modes

👉 Lesson 2 Download & Install HIVE — How to Download & Install HIVE on Ubuntu

👉 Lesson 3 HIVE Metastore Configuration — Why to Use MySQL?

👉 Lesson 4 Hive Data Types — Create & Drop Database in Hive

Advanced Stuff
👉 Lesson 1 Hive Create Table — Types and its Usage

👉 Lesson 2 Hive Partitions & Buckets — Learn with Example

👉 Lesson 3 Hive Indexes and View — Learn with Example

👉 Lesson 4 Hive Queries — Learn with Example


👉 Lesson 5 Hive Join & SubQuery Tutorial — Learn with Example

👉 Lesson 6 Hive Query Language Tutorial — Built-in Operators

👉 Lesson 7 Hive Function — Built-in & User Defined Functions

👉 Lesson 8 Hive ETL — Loading JSON, XML, Text Data Examples

Introduction to Hive
Hive evolved as a data warehousing solution built on top of the Hadoop MapReduce
framework.

The size of the data sets being collected and analyzed in the industry for business
intelligence is growing, which makes traditional data warehousing solutions more
expensive. Hadoop, with its MapReduce framework, is being used as an alternative
for analyzing very large data sets. Although Hadoop has proved useful for working
on huge data sets, its MapReduce framework is very low level and requires
programmers to write custom programs that are hard to maintain and reuse. Hive
comes to the rescue of programmers here.

The Hive engine compiles queries written in its SQL-like query language into
MapReduce jobs to be executed on Hadoop. In addition, custom MapReduce scripts can
be plugged into queries. Hive operates on data stored in tables, which consist of
primitive data types and collection data types such as arrays and maps.
Hive comes with a command-line shell interface that can be used to create tables
and execute queries.
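
To make the idea of compiling a query into MapReduce jobs more concrete, the following is a minimal conceptual sketch in Python. It is not how Hive actually plans or runs a query. It only imitates the map, shuffle, and reduce steps that a grouped aggregation could be broken into. The `employees` rows, the column names, and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical rows of an "employees" table: (name, dept, salary).
EMPLOYEES = [
    ("asha", "sales", 52000.0),
    ("ben", "sales", 48000.0),
    ("chen", "engineering", 91000.0),
    ("dara", "engineering", 87000.0),
    ("eli", "support", 39000.0),
]

# Conceptual HiveQL being modelled (not executed here):
#   SELECT dept, COUNT(*), SUM(salary) FROM employees GROUP BY dept;


def map_phase(rows):
    """Emit one (key, value) pair per row: key = dept, value = (1, salary)."""
    for _name, dept, salary in rows:
        yield dept, (1, salary)


def shuffle_phase(pairs):
    """Group all values that share the same key, between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Fold each key's values into (row count, salary total)."""
    result = {}
    for key, values in grouped.items():
        count = sum(c for c, _ in values)
        total = sum(s for _, s in values)
        result[key] = (count, total)
    return result


def run_query(rows):
    """Chain the three phases, as a single simulated MapReduce job."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    for dept, (count, total) in sorted(run_query(EMPLOYEES).items()):
        print(dept, count, total)
```

The one-line query in the comment and the three hand-written phases compute the same result, which illustrates why a SQL-like layer reduces the amount of custom code a programmer has to write and maintain.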

The Hive query language is similar to SQL and supports subqueries. With it, you can
perform MapReduce joins across Hive tables. It supports simple SQL-like functions
such as CONCAT, SUBSTR, and ROUND, and aggregation functions such as SUM, COUNT,
and MAX. It also supports GROUP BY and SORT BY clauses. User-defined functions can
also be used in Hive queries.
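
The functions and clauses listed above can be tried out without a Hadoop cluster. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for Hive, so it runs SQLite SQL, not HiveQL. The `orders` table and its columns are hypothetical. Two substitutions are assumptions made for portability: string concatenation uses the `||` operator where a Hive query would typically call CONCAT, and ORDER BY is used because SQLite has no SORT BY clause.

```python
import sqlite3


def run_demo():
    """Run a small aggregate query on an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this illustration.
    conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("asha", "Pune", 120.4),
            ("asha", "Pune", 80.0),
            ("ben", "Oslo", 45.7),
            ("chen", "Lima", 300.0),
            ("chen", "Lima", 10.25),
        ],
    )
    # In HiveQL the label would typically be written as
    #   CONCAT(customer, '@', SUBSTR(city, 1, 3))
    # and SORT BY would order rows within each reducer, whereas
    # ORDER BY (used here) produces one total ordering.
    query = """
        SELECT customer || '@' || SUBSTR(city, 1, 3) AS label,
               COUNT(*)                              AS n_orders,
               ROUND(SUM(amount))                    AS total,
               MAX(amount)                           AS largest
        FROM orders
        GROUP BY customer, city
        ORDER BY total DESC
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

Each output row holds a label built from string functions, the number of orders, the rounded total, and the largest single order for one customer.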

What is Hive?
Apache Hive is a data warehouse framework for querying and analyzing data stored
in HDFS. It is built on top of Hadoop. Hive is open-source software for analyzing
large data sets on Hadoop. It provides a SQL-like declarative language, called
HiveQL, to express queries. Using HiveQL, users familiar with SQL can perform data
analysis very easily.

Hive Vs Map Reduce


Before choosing one of these two options, we must look at some of their features.

When choosing between Hive and MapReduce, the following factors are taken into
consideration:

• Type of data
• Amount of data
• Complexity of code

| Feature | Hive | Map Reduce |
|---|---|---|
| Language | It supports a SQL-like query language for interaction and for data modeling | It compiles to two main tasks: one is a map task and the other is a reduce task. We can define these tasks using Java or P[...] |
| Level of abstraction | Higher level of abstraction on top of HDFS | Lower level of abstraction |
| Efficiency in code | Comparatively lesser than Map Reduce | Provides high efficiency |
| Extent of code | Fewer lines of code required for execution | More lines of code to be defined |
| Type of development work required | Less development work required | More development work needed |
