NoSQL

This document provides an overview and comparison of NoSQL and SQL databases. It discusses how NoSQL databases were developed to address scalability and flexibility limitations of traditional RDBMS. The main types of NoSQL databases are described as key-value, document, columnar, and graph databases. Common use cases for each type are also outlined. Microsoft Azure storage and database options relevant to NoSQL are then reviewed, including Azure Cosmos DB, Azure Storage, and the different replication options for Azure Storage accounts.

Eshant Garg

Azure Data Engineer, Architect, Advisor


Why NoSQL DB?
What were traditional databases lacking?
RDBMS were lacking:

• Scalability
• Flexibility
What is NoSQL
• Vertical scaling
• Add more CPU, RAM, HDD to the same system
• Horizontal scaling
• Add more commodity machines to the system
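
One way to picture horizontal scaling is hash partitioning: each key is assigned to one of several machines, so adding machines spreads the data over more of them. The sketch below is only an illustration. The node names and the `node_for_key` and `distribute` helpers are hypothetical, and real systems often use consistent hashing instead, to limit how much data moves when a node is added.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node for a key using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that would store them."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Illustrative data: 1000 keys placed on two machines, then on four.
keys = [f"user:{i}" for i in range(1000)]
two_nodes = distribute(keys, ["node-a", "node-b"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for name, placement in (("2 nodes", two_nodes), ("4 nodes", four_nodes)):
    print(name, {node: len(items) for node, items in placement.items()})
```

With more nodes, each machine holds a smaller share of the keys, which is the point of adding commodity machines rather than upgrading a single one.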
NoSQL Use Cases

Big data and real-time web applications.

Relationships between data are not important

Data changes frequently


NoSQL Limitations

Schema-less data means inconsistent data

Denormalized data means redundant data

Redundant data means inaccuracies and conflicts

Does not support many useful features of relational databases

• Stored procedures, functions, views, row-level security, locks, etc.
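
The link between denormalized data and conflicts can be shown with a small sketch. The order records, the email addresses and both helper functions are made up for illustration. The customer's email is copied into every order, so updating only one copy leaves two different values for the same customer.

```python
# Denormalized orders: the customer's email is copied into every order record.
orders = [
    {"order_id": 1, "customer_id": "c1", "customer_email": "old@example.com"},
    {"order_id": 2, "customer_id": "c1", "customer_email": "old@example.com"},
]


def update_email_in_order(order_id: int, new_email: str) -> None:
    """Update the email copy stored in a single order (hypothetical helper)."""
    for order in orders:
        if order["order_id"] == order_id:
            order["customer_email"] = new_email


def emails_for_customer(customer_id: str) -> set[str]:
    """Collect every distinct email recorded for one customer."""
    return {o["customer_email"] for o in orders if o["customer_id"] == customer_id}


# Only one of the two copies is updated, so the copies now disagree.
update_email_in_order(1, "new@example.com")
conflicting = emails_for_customer("c1")
print(conflicting)
```

A relational design would store the email once in a customer table, so there would be only one value to update.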
SQL vs NoSQL

SQL                                        NoSQL

• Relational database                      • Non-relational or distributed
• Fixed schema                             • Dynamic schema
• Designed for complex queries             • Not for complex queries
• SQL Server, MySQL, Oracle, PostgreSQL    • MongoDB, Redis, HBase
• Vertical scaling                         • Horizontal scaling
• Row oriented                             • Multi-model oriented
• Tables                                   • Collections
• Limited for big data                     • Great for big data
4 Types of NoSQL Databases
Key-value store

• Uses a simple key/value to store data


• Quick to query due to its simplicity
• Value can be JSON, BLOB, String etc.
• Use Cases:
• User profiles and session info on a website, blog comments, telecom directories, IP forwarding tables, shopping cart contents on e-commerce sites, and more.
• Examples
• Cosmos DB Table API, Redis, Table Storage, Oracle NoSQL Database, Voldemort, Aerospike, Oracle Berkeley DB
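
The key-value model above can be sketched as a thin wrapper around a dictionary. This is a toy in-memory stand-in, not the API of any product listed. The `KeyValueStore` class and the sample keys are hypothetical.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque to the store."""

    def __init__(self) -> None:
        self._data: dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self._data[key] = value

    def get(self, key: str, default: object = None) -> object:
        return self._data.get(key, default)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


store = KeyValueStore()
# The value can be a JSON string, raw bytes (a BLOB) or a plain string.
store.put("session:42", json.dumps({"user": "alice", "cart": ["sku-1", "sku-2"]}))
store.put("avatar:alice", b"\x89PNG...")
store.put("greeting", "hello")

# The store only looks data up by key; interpreting the value is up to the caller.
cart = json.loads(store.get("session:42"))["cart"]
print(cart)
```

Lookups go straight to a single key, which is why this model is quick to query but offers no way to search inside the values.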
Document store

• Document-oriented model to store data


• Similar to a key/value store; the difference is that the value in a document store database consists of semi-structured data.
• Each record and its associated data are stored within a single document.
• Documents are usually encoded as XML, JSON, BSON, YAML, etc.
• Use Cases:
• Content management systems, blogging platforms, and other web
applications, blog comments, chat sessions, tweets, ratings, etc.
• Examples
• Cosmos DB, MongoDB, DocumentDB, CouchDB, MarkLogic, OrientDB
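
As a rough illustration of the document model, the sketch below keeps each record and its nested data in one dictionary and queries by field values. The sample documents and the `find` and `find_by_tag` helpers are hypothetical. Real document databases provide their own query languages and indexes.

```python
# Each document carries its own structure; the second post has no comments field.
documents = [
    {
        "_id": 1,
        "type": "post",
        "title": "Intro to NoSQL",
        "tags": ["nosql", "azure"],
        "comments": [{"author": "bob", "text": "Nice overview"}],
    },
    {"_id": 2, "type": "post", "title": "Blob tiers", "tags": ["azure"]},
]


def find(docs: list[dict], **criteria) -> list[dict]:
    """Return documents whose top-level fields equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def find_by_tag(docs: list[dict], tag: str) -> list[dict]:
    """Return documents whose 'tags' list contains the tag."""
    return [d for d in docs if tag in d.get("tags", [])]


print([d["title"] for d in find(documents, type="post")])
print([d["title"] for d in find_by_tag(documents, "nosql")])
```

Unlike a key-value store, the database can look inside the value, because the value has a known semi-structured shape.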
Column store

• Stores data using a column oriented model


• Columns in each row are contained within that row
• Each row can have different columns from the other rows.
• Extremely quick to load and query
• Use Cases:
• Sensor logs [Internet of Things (IoT)], user preferences, geographic information, reporting systems, time series data, logging and other write-heavy applications
• Examples
• Cosmos DB, Bigtable, Cassandra, HBase, Vertica, Druid, Accumulo, Hypertable
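
The idea that each row holds its own set of columns can be sketched with nested dictionaries. This is a minimal in-memory model that assumes a wide-column layout. The `WideColumnTable` class and the sensor data are hypothetical, and the sketch says nothing about how real engines lay data out on disk.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide-column table: row key -> {column name -> value}."""

    def __init__(self) -> None:
        self._rows: defaultdict[str, dict[str, object]] = defaultdict(dict)

    def put(self, row_key: str, column: str, value: object) -> None:
        self._rows[row_key][column] = value

    def get_row(self, row_key: str) -> dict[str, object]:
        return dict(self._rows.get(row_key, {}))

    def get_column(self, column: str) -> dict[str, object]:
        """Read one column across all rows that actually have it."""
        return {rk: cols[column] for rk, cols in self._rows.items() if column in cols}


table = WideColumnTable()
table.put("sensor-1", "temperature", 21.5)
table.put("sensor-1", "humidity", 40)
table.put("sensor-2", "temperature", 19.0)
table.put("sensor-2", "battery", 0.87)  # a column that sensor-1 does not have

print(table.get_row("sensor-1"))
print(table.get_column("temperature"))
```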
Graph store

• Focuses on how data relates to other data points.


• A node is a specific entity or piece of information
• An edge specifies the relationship between two nodes.
• Use Cases:
• Social networks, real-time product recommendations, network diagrams, fraud detection, access management, and more.
• Examples
• Cosmos DB Gremlin API, Neo4j, Blazegraph, and OrientDB.
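
Nodes and edges can be sketched with an adjacency map. The example below walks two hops to suggest "friends of friends", a simple version of the social-network use case. The people, the edges and the `friends_of_friends` helper are made up for illustration, and this is not Gremlin or any graph database API.

```python
from collections import defaultdict

# Each edge is an undirected "knows" relationship between two nodes.
edges = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "dave"),
    ("dave", "carol"),
    ("carol", "erin"),
]

graph: defaultdict[str, set[str]] = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def friends_of_friends(person: str) -> set[str]:
    """People two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    candidates: set[str] = set()
    for friend in direct:
        candidates |= graph.get(friend, set())
    return candidates - direct - {person}


print(friends_of_friends("alice"))
print(friends_of_friends("erin"))
```

The query follows relationships directly instead of joining tables, which is what graph stores are optimized for.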
Multi-model
• Includes features/characteristics of more than one data model.

• Examples:
• OrientDB: Combines a graph model with a document model.
• ArangoDB: Uses key/value, document, and graph models.
• Virtuoso: Combines relational, graph, and document models.
NoSQL Offerings by Microsoft Azure

Azure: IaaS, PaaS

• Azure Storage: Blob, Table, File, Queue
• Data Lake
• Cosmos DB

Microsoft Azure Storage
Programmatic Access to Storage Accounts

• REST APIs
• SDKs
• PowerShell
• Azure CLI
• AzCopy
• Azure Storage Explorer
Azure Storage Account Type: General Purpose V2

• Supported services: Blob, File, Disk, Table and Queue Storage
• Supports blob access tiers
• Supports block blobs, append blobs and page blobs
• Hierarchical namespace support (Data Lake Gen2)
• Premium tier available for page blobs only


Azure Storage Account Type: General Purpose V1

• Supported services: Blob, File, Disk, Table and Queue Storage
• Does NOT support blob access tiers
• Supports Classic and Resource Manager deployment
• A V1 account can be converted to V2
Azure Storage Account Type: Blob Storage

• Supports: Blob Storage
• Does NOT support: File, Disk, Table and Queue Storage
• Supports: block blobs, append blobs
• Does NOT support: page blobs
• Supports blob access tiers


Azure Storage Account Type: Block Blob Storage

• Better performance: high transaction rates, low storage latency
• Backed by solid-state drives
• Block and append blobs only
• Does NOT support blob access tiers
• Example workloads: analytics, data transformations, eCommerce and mapping applications
Azure Storage Account Type: File Storage

• High performance
• Low latency
• Backed by solid-state drives
• IOPS bursting feature
• Billed based on provisioned storage


Three categories of replication options
Locally Redundant Storage (LRS)
[Diagram: Region A (primary) and Region B (secondary)]

• Region A: Storage clusters. Each cluster is physically separate in what's called an availability zone, with its own separate utilities and networking.
• Region B: Hundreds of miles away from the primary region to prevent data loss in the event of a natural disaster.
Zone Redundant Storage (ZRS)
Geo Redundant Storage (GRS)
Geo Zone Redundant Storage (GZRS)
Read-Access Geo-Redundant Storage (RA-GRS)
[Diagram: Region A and Region B (Read)]
Read-Access Geo-Zone-Redundant Storage (RA-GZRS)
[Diagram: Region A and Region B (Read)]
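
The six replication options differ along three axes: whether copies in the primary region are spread across availability zones, whether data is also copied to a secondary region, and whether that secondary copy can be read. The lookup below is a sketch of that comparison. The flag values reflect my understanding of Azure's documented behaviour, and the `Redundancy` class and `matching_options` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    zone_redundant: bool      # copies spread across availability zones in the primary region
    geo_replicated: bool      # data also copied to a secondary region
    readable_secondary: bool  # secondary copy can be read directly


REPLICATION_OPTIONS = {
    "LRS": Redundancy(False, False, False),
    "ZRS": Redundancy(True, False, False),
    "GRS": Redundancy(False, True, False),
    "GZRS": Redundancy(True, True, False),
    "RA-GRS": Redundancy(False, True, True),
    "RA-GZRS": Redundancy(True, True, True),
}


def matching_options(need_zone: bool = False, need_geo: bool = False,
                     need_read: bool = False) -> list[str]:
    """List the options that satisfy every requested property."""
    return [
        name
        for name, r in REPLICATION_OPTIONS.items()
        if (r.zone_redundant or not need_zone)
        and (r.geo_replicated or not need_geo)
        and (r.readable_secondary or not need_read)
    ]


print(matching_options(need_geo=True))
print(matching_options(need_zone=True, need_read=True))
```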
Blob Access Tiers

• Hot tier: highest storage cost, lowest data access cost
• Cool tier: lower storage cost, higher data access cost
• Archive tier: lowest storage cost, highest data retrieval cost; data is offline

Azure Blob Storage Lifecycle Management
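
A lifecycle rule moves blobs to cheaper tiers as they age, trading storage cost against access cost. The function below sketches that decision in plain Python. It is not the Azure lifecycle policy format, the `target_tier` helper is hypothetical, and the 30-day and 90-day thresholds are assumptions chosen for the example.

```python
from datetime import date


def target_tier(last_modified: date, today: date,
                cool_after_days: int = 30, archive_after_days: int = 90) -> str:
    """Pick a tier from the blob's age (thresholds are illustrative assumptions)."""
    age_days = (today - last_modified).days
    if age_days >= archive_after_days:
        return "Archive"
    if age_days >= cool_after_days:
        return "Cool"
    return "Hot"


created = date(2024, 1, 1)
for check_date in (date(2024, 1, 10), date(2024, 2, 15), date(2024, 6, 1)):
    print(check_date, target_tier(created, check_date))
```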


Azure Blob Storage
• Designed for images and unstructured data
• Store documents and access them in a browser
• Database backups
• Store audio and video files and stream them
• Store data for analysis
• Log files
• Scalability
• Cheapest way to store data in Azure
• Simple design and easy to use
• HDFS and Blob Storage REST APIs
Blob Types

• Block Blob
• Composed of blocks
• Append Blob
• Can only append blocks
• Ideal for logs
• Page Blob
• VM disks and databases
• Frequent random read/write applications
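
The difference between block blobs and append blobs can be sketched with two toy classes. They loosely mirror the stage-then-commit flow of block blobs and the append-only behaviour of append blobs. This is not the Azure SDK: the class and method names are hypothetical, and the real service has more rules than this model captures.

```python
class BlockBlob:
    """Toy block blob: blocks are staged in any order, then committed as an ordered list."""

    def __init__(self) -> None:
        self._staged: dict[str, bytes] = {}
        self._committed: list[bytes] = []

    def stage_block(self, block_id: str, data: bytes) -> None:
        self._staged[block_id] = data

    def commit_block_list(self, block_ids: list[str]) -> None:
        # The commit order, not the upload order, defines the blob's content.
        self._committed = [self._staged[block_id] for block_id in block_ids]
        self._staged.clear()

    def read(self) -> bytes:
        return b"".join(self._committed)


class AppendBlob:
    """Toy append blob: new blocks can only be added at the end."""

    def __init__(self) -> None:
        self._blocks: list[bytes] = []

    def append_block(self, data: bytes) -> None:
        self._blocks.append(data)

    def read(self) -> bytes:
        return b"".join(self._blocks)


blob = BlockBlob()
blob.stage_block("b2", b"world")  # staged out of order
blob.stage_block("b1", b"hello ")
blob.commit_block_list(["b1", "b2"])

log = AppendBlob()
log.append_block(b"request received\n")
log.append_block(b"request served\n")

print(blob.read())
print(log.read())
```

Append-only writes suit logs, while staging blocks independently suits large uploads that are assembled at the end.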
Use cases

• Only basic storage is needed


• Data is unstructured
• Data that is older or not used as much
• Money is an issue
Advantages of Blob storage

• Extremely cheap
• Simple to set up
• No configuration
• Doesn’t require powerful computing to manage
Limitations of Blob storage

• No Indexes
• No Search Tools
• Not optimized for performance
• You are responsible for replication and synchronization
• Requires external compute to process
