HBase Tutorial
Audience
This tutorial will help professionals aspiring to make a career in Big Data analytics using
the Hadoop framework. Software professionals, analytics professionals, and ETL developers
are the key beneficiaries of this course.
Prerequisites
Before proceeding with this tutorial, we assume that you are already familiar with
Hadoop's architecture and APIs, have experience writing basic applications in Java,
and have a working knowledge of any database.
All the content and graphics published in this e-book are the property of Tutorials Point
(I) Pvt. Ltd. The user of this e-book is prohibited from reusing, retaining, copying,
distributing, or republishing any contents or part of the contents of this e-book in any
manner without written consent of the publisher.
We strive to update the contents of our website and tutorials as timely and as precisely
as possible; however, the contents may contain inaccuracies or errors. Tutorials Point (I)
Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness, or completeness of
our website or its contents, including this tutorial. If you discover any errors on our
website or in this tutorial, please notify us at [email protected]
Table of Contents
About the Tutorial
Audience
Prerequisites
Table of Contents
1. HBASE ─ OVERVIEW
What is HBase?
Features of HBase
Applications of HBase
HBase History
2. HBASE ─ ARCHITECTURE
HBase Architecture
3. HBASE ─ INSTALLATION
Pre-Installation Setup
Installing HBase
4. HBASE ─ SHELL
HBase Shell
General Commands
status
version
table_help
whoami
Class HBaseAdmin
Creating Table
list
Disable a Table
Verification
is_disabled
disable_all
Enable a Table
Verification
is_enabled
describe
alter
exists
drop
drop_all
exit
Stopping HBase
Class HBaseConfiguration
Class HTable
Class Put
Class Get
Class Delete
Class Result
Creating Data
Updating Data
Reading Data
scan
count
truncate
grant
revoke
user_permission
1. HBASE ─ OVERVIEW
Since the 1970s, the RDBMS has been the standard solution for data storage and maintenance
problems. After the advent of big data, companies realized the benefit of processing it and
started opting for solutions like Hadoop.
Hadoop uses a distributed file system to store big data and MapReduce to process it.
Hadoop excels at storing and processing huge volumes of data in various formats, such as
arbitrary, semi-structured, or even unstructured data.
Limitations of Hadoop
Hadoop can perform only batch processing, and data is accessed only in a sequential
manner. That means one has to search the entire dataset even for the simplest of jobs.
Processing a huge dataset often produces another huge dataset, which must also be
processed sequentially. At this point, a new solution is needed to access any point of data
in a single unit of time (random access).
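To make the contrast between sequential and random access concrete, here is a minimal Java sketch. It is not Hadoop or HBase code: the class name AccessPatternSketch, its methods, and the in-memory list and map are assumptions introduced purely for illustration. The sequential lookup has to walk the records until it finds the key, while the indexed lookup goes directly to the value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Hadoop or HBase code.
class AccessPatternSketch {

    // Sequential access: inspect records one by one until the key matches.
    // Returns the number of records inspected, or -1 if the key is absent.
    static int sequentialLookupCost(List<String[]> records, String key) {
        int inspected = 0;
        for (String[] record : records) {
            inspected++;
            if (record[0].equals(key)) {
                return inspected;
            }
        }
        return -1;
    }

    // Random access: an index maps each key directly to its value.
    static String indexedLookup(Map<String, String> index, String key) {
        return index.get(key);
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        Map<String, String> index = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            String key = "row" + i;
            String value = "value" + i;
            records.add(new String[] {key, value});
            index.put(key, value);
        }
        // The last record is the worst case for the sequential scan.
        System.out.println("records inspected by scan: "
                + sequentialLookupCost(records, "row999"));
        System.out.println("value via index: " + indexedLookup(index, "row999"));
    }
}
```

The scan cost grows with the size of the dataset, whereas the indexed lookup does not depend on where the record sits. That is the property the text refers to as random access.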
What is HBase?
HBase is a distributed column-oriented database built on top of the Hadoop file system. It is
an open-source project and is horizontally scalable.
HBase's data model is similar to Google's Bigtable and is designed to provide quick random
access to huge amounts of structured data. It leverages the fault tolerance provided by the
Hadoop File System (HDFS).
It is a part of the Hadoop ecosystem that provides random real-time read/write access to
data in the Hadoop File System.
One can store data in HDFS either directly or through HBase. A data consumer reads or
accesses the data in HDFS randomly using HBase. HBase sits on top of the Hadoop
File System and provides read and write access.
HDFS: HDFS does not support fast individual record lookups.
HBase: HBase provides fast lookups for larger tables.

Row-oriented database: Such databases are designed for a small number of rows and columns.
Column-oriented database: Column-oriented databases are designed for huge tables.

HBase: It is built for wide tables. HBase is horizontally scalable.
RDBMS: It is thin and built for small tables. Hard to scale.
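The difference between row-oriented and column-oriented storage can be sketched in a few lines of Java. This is a simplified illustration, not how HBase lays out data on disk (HBase groups columns into column families). The class StorageLayoutSketch, its methods, and the sample table are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of two storage layouts for the same table.
class StorageLayoutSketch {

    // Row-oriented layout: all cells of one row are stored next to each other.
    static List<String> rowLayout(String[] columns, String[][] rows) {
        List<String> cells = new ArrayList<>();
        for (String[] row : rows) {
            for (int c = 0; c < columns.length; c++) {
                cells.add(columns[c] + "=" + row[c]);
            }
        }
        return cells;
    }

    // Column-oriented layout: all values of one column are stored together,
    // so reading a single column does not touch the other columns.
    static Map<String, List<String>> columnLayout(String[] columns, String[][] rows) {
        Map<String, List<String>> byColumn = new LinkedHashMap<>();
        for (int c = 0; c < columns.length; c++) {
            List<String> values = new ArrayList<>();
            for (String[] row : rows) {
                values.add(row[c]);
            }
            byColumn.put(columns[c], values);
        }
        return byColumn;
    }

    public static void main(String[] args) {
        String[] columns = {"name", "city"};
        String[][] rows = {{"Asha", "Pune"}, {"Ravi", "Delhi"}};
        System.out.println("row layout:    " + rowLayout(columns, rows));
        System.out.println("column layout: " + columnLayout(columns, rows));
    }
}
```

In the column layout, a query that needs only the city values reads one list. In the row layout it has to step over every name as well, which becomes costly on wide, huge tables.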
Features of HBase
HBase is linearly scalable.
Applications of HBase
It is used whenever there is a need for write-heavy applications.
HBase is used whenever we need to provide fast random access to available data.
Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase internally.
HBase History
Year Event
Oct 2007 The first usable HBase along with Hadoop 0.15.0 was released.
2. HBASE ─ ARCHITECTURE
HBase Architecture
In HBase, tables are split into regions and are served by the region servers. Regions are
vertically divided by column families into “Stores”. Stores are saved as files in HDFS. Shown
below is the architecture of HBase.
Note: The term ‘store’ is used for regions to explain the storage structure.
HBase has three major components: the client library, a master server, and region servers.
Region servers can be added or removed as per requirement.
Master Server
The master server:
Assigns regions to the region servers, with the help of Apache ZooKeeper.
Handles load balancing of the regions across region servers. It unloads the busy
servers and shifts the regions to less occupied servers.
Is responsible for schema changes and other metadata operations, such as the creation of
tables and column families.
Regions
Regions are nothing but tables that are split up and spread across the region servers.
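As a rough illustration of how a table split into regions can be searched, the sketch below assumes that each region covers a contiguous range of row keys and is identified by its start key. The class RegionLookupSketch, the server names, and the use of String keys are assumptions for this example (HBase itself compares row keys as byte arrays). This is not the HBase client's actual lookup code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: locating the region (and its server) for a row key.
class RegionLookupSketch {

    // Maps each region's start row key to the name of the server hosting it.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String serverName) {
        regionsByStartKey.put(startKey, serverName);
    }

    // The owning region is the one with the greatest start key <= rowKey.
    String serverFor(String rowKey) {
        Map.Entry<String, String> region = regionsByStartKey.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row key " + rowKey);
        }
        return region.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "regionserver-1");   // first region starts at the empty key
        table.addRegion("m", "regionserver-2");
        table.addRegion("t", "regionserver-3");
        for (String rowKey : new String[] {"apple", "mango", "zebra"}) {
            System.out.println(rowKey + " -> " + table.serverFor(rowKey));
        }
    }
}
```

Because the start keys are kept sorted, finding the right region is a single ordered-map lookup rather than a scan over all regions.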
Region server
Each region server hosts a set of regions, and it:
Handles read and write requests for all the regions under it.
Decides the size of each region by following the region size thresholds.
When we take a deeper look into the region server, it contains regions and stores as shown
below:
The store contains a memory store (MemStore) and HFiles. The MemStore works like a cache.
Anything that is written to HBase is stored here initially. Later, the MemStore is flushed:
its data is transferred to HFiles and saved there as blocks.
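The write path described above can be sketched as follows. This is a simplified model under stated assumptions, not HBase's implementation: the class MemStoreSketch is hypothetical, the flush is triggered by an entry count rather than by memory size, the "HFiles" are just in-memory sorted maps, and details such as the write-ahead log and compaction are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of one store: an in-memory buffer plus flushed files.
class MemStoreSketch {

    private final int flushThreshold;   // assumption: flush after this many entries
    private TreeMap<String, String> memStore = new TreeMap<>();
    private final List<TreeMap<String, String>> hFiles = new ArrayList<>();

    MemStoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Every write lands in the memstore first; a full memstore is flushed.
    void put(String rowKey, String value) {
        memStore.put(rowKey, value);
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    // The sorted memstore contents become a new file, and the memstore is emptied.
    private void flush() {
        hFiles.add(memStore);
        memStore = new TreeMap<>();
    }

    // Reads check the memstore first, then the files from newest to oldest,
    // so the most recent value for a row key wins.
    String get(String rowKey) {
        String value = memStore.get(rowKey);
        if (value != null) {
            return value;
        }
        for (int i = hFiles.size() - 1; i >= 0; i--) {
            value = hFiles.get(i).get(rowKey);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int hFileCount() {
        return hFiles.size();
    }

    int memStoreSize() {
        return memStore.size();
    }

    public static void main(String[] args) {
        MemStoreSketch store = new MemStoreSketch(2);
        store.put("row1", "a");
        store.put("row2", "b");   // second entry reaches the threshold and triggers a flush
        store.put("row1", "c");   // newer value stays in the memstore
        System.out.println("files: " + store.hFileCount()
                + ", memstore entries: " + store.memStoreSize()
                + ", row1 = " + store.get("row1"));
    }
}
```

Buffering writes in memory and flushing them in sorted batches lets the store turn many small random writes into a few large sequential file writes.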
ZooKeeper
ZooKeeper is an open-source project that provides services such as maintaining
configuration information, naming, and distributed synchronization.
In addition to tracking availability, the ZooKeeper nodes are also used to track server
failures and network partitions.
In pseudo-distributed and standalone modes, HBase itself takes care of ZooKeeper.
3. HBASE ─ INSTALLATION
This chapter explains how HBase is installed and initially configured. Java and Hadoop are
required to proceed with HBase, so you have to download and install Java and Hadoop on
your system.
Pre-Installation Setup
Before installing Hadoop in a Linux environment, we need to set up Linux using ssh (Secure
Shell). Follow the steps given below to set up the Linux environment.
Creating a User
First of all, it is recommended to create a separate user for Hadoop to isolate the Hadoop file
system from the Unix file system. Follow the steps given below to create a user.
1. Open the root account using the command “su”.
2. Create a user from the root account using the command “useradd username”.
3. Now you can open an existing user account using the command “su username”.
Open the Linux terminal and type the following commands to create a user.
$ su
password:
# useradd hadoop
# passwd hadoop
New passwd:
Retype
The following commands are used to generate a key pair using SSH. Copy the public
key from id_rsa.pub to authorized_keys, and give the owner read and write permissions on
the authorized_keys file.
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Verify ssh
$ ssh localhost
Installing Java
Java is the main prerequisite for Hadoop and HBase. First of all, you should verify the
existence of Java on your system using “java -version”. The syntax of the java version
command is given below.
$ java -version
Step 2
Generally you will find the downloaded Java file in the Downloads folder. Verify it and
extract the jdk-7u71-linux-x64.gz file using the following commands.
$ cd Downloads/
$ ls
jdk-7u71-linux-x64.gz
$ tar zxf jdk-7u71-linux-x64.gz
$ ls
jdk1.7.0_71 jdk-7u71-linux-x64.gz
Step 3
To make Java available to all the users, you have to move it to the location “/usr/local/”.
Open root and type the following commands.
$ su
password:
# mv jdk1.7.0_71 /usr/local/
# exit
Step 4
To set up the PATH and JAVA_HOME variables, add the following commands to the ~/.bashrc
file.
export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
Now apply all the changes to the currently running system.
$ source ~/.bashrc
Step 5
Use the following commands to configure Java alternatives:
# alternatives --install /usr/bin/java java /usr/local/jdk1.7.0_71/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/jdk1.7.0_71/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/jdk1.7.0_71/bin/jar 2
Now verify the installation using the command java -version from the terminal as explained
above.
Downloading Hadoop
After installing Java, you have to install Hadoop. First of all, verify the existence of Hadoop
using the “hadoop version” command as shown below.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
Now apply all the changes to the currently running system.