Acquire and Access Data Using NoSQL Database
Course Road Map
Modules:
• Module 1: Big Data Management System
• Module 2: Data Acquisition and Storage
• Module 3: Data Access and Processing
• Module 4: Data Unification and Analysis
• Module 5: Using and Managing Oracle Big Data Appliance
Lessons:
• Lesson 5: Introduction to the Hadoop Distributed File System (HDFS)
• Lesson 6: Acquire Data Using CLI, Fuse-DFS, and Flume
• Lesson 7: Acquire and Access Data Using NoSQL Database
• Lesson 8: Primary Administrative Tasks for NoSQL Database
Objectives
After completing this lesson, you should be able to:
• Describe NoSQL Database characteristics
• Differentiate NoSQL from RDBMS and HDFS
• Describe NoSQL Database benefits
• Load and remove data in a NoSQL DB
• Retrieve data from a NoSQL DB
What is a NoSQL Database?
• Is a key-value database
• Is accessible by using Java APIs
• Stores unstructured or semi-structured data as byte arrays
Figure: KVStore
• NoSQL Database is a nonrelational database
Benefits
• Easy to install and configure
• Highly reliable
• General-purpose database system
• Scalable throughput and predictable latency
• Configurable consistency and durability
RDBMS Compared to NoSQL
RDBMS | NoSQL
High-value, high-density, complex data | Low-value, low-density, simple data
Complex data relationships | Very simple relationships
Joins | Avoids joins
Schema-centric, structured data | Unstructured or semi-structured data
Designed to scale up (not out) | Distributed storage and processing
Well-defined standards | Standards not yet evolved
Database-centric | Application- and developer-centric
High security | Minimal or no security
HDFS Compared to NoSQL
HDFS | NoSQL
File system | Database
No inherent structure | Simple data structure
Batch-oriented | Real-time
Processes data to use | Delivers a service
Bulk storage | Fast access to specific records
Write once, read many | Read, write, delete, update
Points to Consider Before Choosing NoSQL
When deciding on an application's database technology,
you should analyze:
• The data to be stored
– High volume with low value?
If the answer is "yes," NoSQL is a good choice.
• The application schema
– Dynamic?
If the answer is "yes," NoSQL is a good choice.
NoSQL Key-Value Data Model
Each record consists of a key-value pair.
Key component: major keys and minor keys
Value component: byte array or Avro schema
Example keys:
Major key | Minor key | Value
user/userid | name | fname, lname
user/userid | subscriptions | expiration date
user/userid | picture | .jpg
Example Avro schema:
{"name": "User",
 "namespace": "com.company.avro",
 "type": "record",
 "fields": [{"name": "userId", "type": "int", "default": 0}]
}
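As a rough illustration of how major and minor key components combine into a single record key, the following Python sketch builds a path string for each record. The `key_path` helper, the `/-/` separator between the major and minor parts, the user ID `u42`, and the sample values are all assumptions made for illustration; this is not the Oracle NoSQL API.

```python
def key_path(major, minor=()):
    """Join major and minor key components into one path string.

    Hypothetical helper: the "/-/" separator between the major and
    minor parts is an illustrative convention.
    """
    if not major:
        raise ValueError("a key needs at least one major component")
    path = "/" + "/".join(major)
    if minor:
        path += "/-/" + "/".join(minor)
    return path


# Records that share a major key path differ only in their minor keys.
# Values are stored as byte arrays (sample data, invented for this sketch).
records = {
    key_path(["user", "u42"], ["name"]): b"Ada Lovelace",
    key_path(["user", "u42"], ["subscriptions"]): b"2030-01-01",
    key_path(["user", "u42"], ["picture"]): b"\xff\xd8",
}
```

All three records share the major key path `/user/u42`, so a lookup by major key alone can address them as a group.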
Acquiring and Accessing Data in a NoSQL DB
Primary Tasks
• Creating Tables
– Create a table (parent or child)
– Add the table to the KVStore
• Adding or Removing Data
– Use the Table API
– Insert Rows
– Delete Rows
• Reading Data
– Retrieve a single record
– Retrieve multiple records
Primary (Parent) Table Data Model
General form:
Table Name | Primary Key 1 | Field 1 | Field 2 | Field 3
Example:
Table name: User
Primary key: userId
Value: firstName, lastName
Table Data Model: Child Tables
Parent table name: User
Primary key: userId
Value: firstName, lastName
Child table name: Folder
Columns (primary key columns first, then value columns): UserID, Folder Name, Arrival Date, From, To, Sender, CC, Subject, Msg Body
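The relationship between a parent table and a child table can be sketched in Python as follows: a child table's primary key begins with the parent's primary key, followed by the child's own key fields. `TableDef` is a hypothetical class, and treating `folderName` as the Folder table's own key field is an assumption, since the slide does not show where the key columns end.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TableDef:
    """Hypothetical table description, not an Oracle NoSQL class."""
    name: str
    key_fields: Tuple[str, ...]            # key fields declared on this table
    parent: Optional["TableDef"] = None    # None for a top-level table

    def full_name(self) -> str:
        # A child table is addressed through its parent, e.g. User.Folder.
        if self.parent is None:
            return self.name
        return f"{self.parent.full_name()}.{self.name}"

    def primary_key(self) -> Tuple[str, ...]:
        # A child's primary key starts with the parent's primary key.
        inherited = () if self.parent is None else self.parent.primary_key()
        return inherited + self.key_fields


user = TableDef("User", ("userId",))
# Assumption: folderName is the Folder table's own key field.
folder = TableDef("Folder", ("folderName",), parent=user)
```

Because the parent key is a prefix of the child key, all Folder rows for one user can be located from that user's `userId` alone.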
Creating Tables
Creating Tables: Two Options
1. CLI
2. APIs
Accessing the CLI
Access the CLI by invoking the runadmin utility or the kvcli.jar file.
1. Using the runadmin utility:
   java -jar $KVHOME/lib/kvstore.jar runadmin -port <port number> -host <host name>
2. Using the kvcli.jar file:
   java -jar $KVHOME/lib/kvcli.jar -port <port number> -host <host name> -store kvstore
Executing a DDL Command
Viewing Table Descriptions
All tables in the KVStore:
   execute "show tables"
   execute "show AS JSON tables"
Specific table:
   execute "describe AS JSON table <tablename> [fields]"
Recommendation: Using Scripts
Use scripts for all database operations. This is the recommended approach because it:
• Prevents accidental errors and typos
• Keeps the store environment consistent through all cycles of development, testing, and deployment
1. From the command line:
   java -jar $KVHOME/lib/kvstore.jar runadmin -port <port number> -host <host name> load -file <path to script>
2. From the CLI prompt:
   kv> load -file <path to script>
Loading Data Into Tables
Write Operations: put() Methods
Method | Behavior | Type of write
put | Writes to the table irrespective of whether the record is present or not | Definite write
putIfAbsent | Writes to the table only if the record is not already present | Creation
putIfPresent | Writes to the table only if the record is already present | Updates
putIfVersion | Writes to the table only if the record present is the same as a specified version | Conditional update
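The differences between the four write methods are easier to see in a small model. The sketch below is a toy in-memory table written in Python; `InMemoryTable` and its method names are hypothetical stand-ins that only mimic the conditions listed above, and the integer version counter is an assumption. It is not the Oracle NoSQL TableAPI.

```python
class InMemoryTable:
    """Toy stand-in for a table; not the Oracle NoSQL TableAPI."""

    def __init__(self):
        self._rows = {}      # primary key -> (version, row)
        self._counter = 0    # source of new version numbers (assumption)

    def _write(self, key, row):
        self._counter += 1
        self._rows[key] = (self._counter, row)
        return self._counter  # version of the row just written

    def version_of(self, key):
        entry = self._rows.get(key)
        return None if entry is None else entry[0]

    def put(self, key, row):
        # Definite write: succeeds whether or not the record exists.
        return self._write(key, row)

    def put_if_absent(self, key, row):
        # Creation: write only when no record exists for the key.
        return None if key in self._rows else self._write(key, row)

    def put_if_present(self, key, row):
        # Update: write only when a record already exists.
        return self._write(key, row) if key in self._rows else None

    def put_if_version(self, key, row, version):
        # Conditional update: write only when the stored version matches.
        current = self.version_of(key)
        if current is None or current != version:
            return None
        return self._write(key, row)


table = InMemoryTable()
v1 = table.put_if_absent(1, {"firstName": "Ada"})        # record created
skipped = table.put_if_absent(1, {"firstName": "Bob"})   # already present
v2 = table.put_if_present(1, {"firstName": "Grace"})     # record updated
stale = table.put_if_version(1, {"firstName": "Eve"}, v1)  # v1 is outdated
v3 = table.put_if_version(1, {"firstName": "Eve"}, v2)     # version matches
```

Each method returns the new version on success and `None` when its condition is not met, so a caller can tell a skipped write from a completed one.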
Writing Rows to Tables: Steps
To write a row into a table, perform the following steps:
1. Obtain a TableAPI handle.
2. Construct a handle to the table.
3. Create a Row object.
4. Add field values to the row.
5. Write the row to the table.
Reading Data from Tables
Read Operations: get() Methods
Method | Behavior | Atomicity | Threading
get | Retrieves a single record from the table | - | -
multiGet | Retrieves a set of records from the table | Atomic | -
multiGetIterator | Retrieves a set of records from the table in batches | Non-atomic | Single thread
tableIterator | Retrieves a set of records or indexes from the table | Non-atomic | Multiple threads
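To make the read patterns concrete, here is a minimal Python sketch over an in-memory dictionary keyed by (userId, folderName) tuples. The function names, the sample rows, and the `batch_size` parameter are hypothetical; the sketch shows only the shape of each call (one record, a full list, or a lazy batched stream) and does not model atomicity or threading.

```python
from itertools import islice

# Sample rows keyed by (userId, folderName); invented for this sketch.
ROWS = {
    (1, "inbox"): "hello",
    (1, "sent"): "re: hello",
    (2, "inbox"): "welcome",
}


def get(rows, key):
    """Return the single row stored under a complete key, or None."""
    return rows.get(key)


def multi_get(rows, partial_key):
    """Return every row whose key starts with partial_key, all at once."""
    n = len(partial_key)
    return [row for key, row in sorted(rows.items()) if key[:n] == partial_key]


def table_iterator(rows, partial_key, batch_size=2):
    """Yield matching rows lazily, pulling them in batches of batch_size."""
    n = len(partial_key)
    matches = (row for key, row in sorted(rows.items()) if key[:n] == partial_key)
    while True:
        batch = list(islice(matches, batch_size))
        if not batch:
            return
        yield from batch


one = get(ROWS, (1, "inbox"))
user_one = multi_get(ROWS, (1,))
everything = list(table_iterator(ROWS, ()))
```

Returning a list suits small result sets that must be read together, while the iterator form avoids holding a large result set in memory at once.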
Removing Data From Tables
Delete Operations: Three TableAPI Methods
Method | Behavior
delete | Deletes a single record from a table
multiDelete | Deletes multiple records from tables
deleteIfVersion | Deletes a single record from a table if the version of the record is the same as a specified version
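The three delete behaviors can be modeled with a few Python functions over a dictionary that maps each key to a (version, row) pair. The function names, version numbers, and sample rows are assumptions made for this sketch; it mirrors the conditions described above rather than the Oracle NoSQL TableAPI itself.

```python
def delete(rows, key):
    """Remove one record; report whether anything was deleted."""
    return rows.pop(key, None) is not None


def multi_delete(rows, partial_key):
    """Remove every record whose key starts with partial_key; return the count."""
    n = len(partial_key)
    doomed = [key for key in rows if key[:n] == partial_key]
    for key in doomed:
        del rows[key]
    return len(doomed)


def delete_if_version(rows, key, version):
    """Remove one record only when its stored version matches."""
    entry = rows.get(key)
    if entry is None or entry[0] != version:
        return False
    del rows[key]
    return True


# key -> (version, row); sample data invented for this sketch
rows = {
    (1, "inbox"): (7, "hello"),
    (1, "sent"): (8, "re: hello"),
    (2, "inbox"): (9, "welcome"),
}
stale = delete_if_version(rows, (2, "inbox"), 3)   # version mismatch, kept
fresh = delete_if_version(rows, (2, "inbox"), 9)   # version matches, removed
removed = multi_delete(rows, (1,))                 # both rows for user 1
```

The version check lets a caller avoid deleting a record that another writer has changed since it was last read.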
Summary
In this lesson, you should have learned how to:
• Describe NoSQL Database characteristics
• Differentiate NoSQL from RDBMS and HDFS
• Describe NoSQL Database benefits
• Load and remove data in a NoSQL DB
• Retrieve data from a NoSQL DB