0% found this document useful (0 votes)
14 views30 pages

Lecture NoSQL

The document compares SQL and NoSQL databases, highlighting the key characteristics and differences between them. SQL databases are defined by a predefined schema, strict consistency, and the use of ACID transactions, while NoSQL databases are schema-free, prioritize quick responses, and often utilize BASE transactions. The presentation also discusses various types of NoSQL databases and their applications in handling large volumes of data.

Uploaded by

syyeung7
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views30 pages

Lecture NoSQL

The document compares SQL and NoSQL databases, highlighting the key characteristics and differences between them. SQL databases are defined by a predefined schema, strict consistency, and the use of ACID transactions, while NoSQL databases are schema-free, prioritize quick responses, and often utilize BASE transactions. The presentation also discusses various types of NoSQL databases and their applications in handling large volumes of data.

Uploaded by

syyeung7
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

+

A Comparison of SQL and NoSQL Databases


Slides from: Keith W. Hare
Metadata Open Forum

More reading: http://martinfowler.com/articles/nosqlKeyPoints.html


Metadata Open Forum
+
Abstract

NoSQL databases (either no-SQL or Not Only SQL)


are currently a hot topic in some parts of
computing. In fact, one website lists over a
hundred different NoSQL databases.

This presentation reviews the features common to


the NoSQL databases and compares those
features to the features and capabilities of SQL
databases.

2
BIG DATA! 14 April 2019
3 14 April 2019
+
SQL Characteristics

n Data stored in rows, columns and tables


n Relationships represented by data
n Data Manipulation Language
n Data Definition Language
n Transactions

n Abstraction from physical layer

4 14 April 2019
+
SQL Physical Layer Abstraction

n Applications specify what, not how


n Query optimization engine
n Physical layer can change without modifying
applications
n Createindexes to support queries
n In Memory databases

5 14 April 2019
+
Data Manipulation Language (DML)

n Datamanipulated with Select, Insert, Update, &


Delete statements
n Select T1.Column1, T2.Column2 …
From Table1, Table2 …
Where T1.Column1 = T2.Column1 …

n Data Aggregation

n Compound statements

n Functions and Procedures


n Explicit transaction control
6 14 April 2019
+
Data Definition Language

n Schema defined at the start


n Create Table (Column1 Datatype1, Column2 Datatype 2, …)
n Constraints to define and enforce relationships
n Primary Key
n Foreign Key
n Etc.

n Triggers to respond to Insert, Update , & Delete


n Stored Modules
n Alter …
n Drop …
n Security and Access Control
7 14 April 2019
+
Transactions – ACID Properties

n Atomic – All of the work in a transaction completes


(commit) or none of it completes
n Consistent – A transaction transforms the database
from one consistent state to another consistent state.
Consistency is defined in terms of constraints.
n Isolated – The results of any changes made during a
transaction are not visible until the transaction has
committed.
n Durable – The results of a committed transaction
survive failures

8 14 April 2019
+
NewSQL: real-time analytics

n ) SQL as the primary mechanism for application interaction

n 2) ACID support for transactions

n 3) A non-locking concurrency control mechanism so real-time


reads will not conflict with writes, and thereby cause them to
stall.

n 4) An architecture providing much higher per-node performance


than available from the traditional "elephants”

n 5) A scale-out, shared-nothing architecture, capable of running


on a large number of nodes without bottlenecking
+
NoSQL Definition

From www.nosql-database.org:
Next Generation Databases mostly addressing
some of the points: being non-relational,
distributed, open-source and horizontal
scalable. The original intention has been
modern web-scale databases. The movement
began early 2009 and is growing rapidly. Often
more characteristics apply as: schema-free,
easy replication support, simple API, eventually
consistent / BASE (not ACID), a huge data
amount, and more.
10 14 April 2019
+
NoSQL Products/Projects
http://www.nosql-database.org/ lists 122
NoSQL Databases

nCassandra

nCouchDB

nHadoop & Hbase


nMongoDB

nStupidDB

nEtc.

11 14 April 2019
+
NoSQL Products/Projects

http://www.nosql-
database.org/ lists 122 NoSQL
Databases

nCassandra

nCouchDB

nHadoop & Hbase

nMongoDB

nStupidDB

12 nEtc. 14 April 2019


+
NoSQL Distinguishing Characteristics
n Large data volumes n Asynchronous Inserts &
n Google s big data Updates
n Scalable replication and n Schema-less
distribution
n Potentially thousands of n ACIDtransaction
machines properties are not
n Potentially distributed around needed – BASE
the world
n CAP Theorem
n Queries need to return
answers quickly n Open source
development
n Mostly query, few updates

13 14 April 2019
+
BASE Transactions

n Acronym contrived to be the opposite of


ACID
n Basically Available,
n Soft state,
n Eventually Consistent
n Characteristics
n Weak consistency – stale data OK
n Availability first
n Best effort
n Approximate answers OK
n Aggressive (optimistic)
n Simpler and faster

14 14 April 2019
+
Brewer s CAP Theorem

A distributed system can support only two of the


following characteristics:

n Consistency

n Availability

n Partition tolerance

15 14 April 2019
+
+
NoSQL Database Types

Discussing NoSQL databases is


complicated because there are a variety of
types:
nColumn Store – Each storage block
contains data from only one column
nDocument Store – stores documents
made up of tagged elements
nKey-Value Store – Hash table of keys
17 14 April 2019
+
NoSQL Example: Column Store

n Eachstorage block contains data from only


one column
n Example: Hadoop/Hbase
n http://hadoop.apache.org/
n Yahoo, Facebook

n Example: Ingres VectorWise


n Column Store integrated with an SQL database
n http://www.ingres.com/products/vectorwise

Metadata Open Forum 18


+
Column Store Comments

nMore efficient than row (or document)


store if:
n Multiple row/record/documents are inserted
at the same time so updates of column
blocks can be aggregated
n Retrievals access only some of the
columns in a row/record/document

Metadata Open Forum 19


+
Other Non-SQL Databases

nXML Databases
nGraph Databases
nCodasyl Databases
nObject Oriented Databases
nEtc…

nWill not address these today


20 14 April 2019
+
Storing and Modifying Data

nSyntax varies
n HTML
n Java Script
n Etc.

nAsynchronous – Inserts and updates


do not wait for confirmation
nVersioned

nOptimistic Concurrency
23 14 April 2019
+
Retrieving Data
nSyntax Varies
n No set-based query language
n Procedural program languages such as
Java, C, etc.

nApplication specifies retrieval path


nNo query optimizer
nQuick answer is important
nMay not be a single right answer
24 14 April 2019
+
Open Source

nSmall upfront software costs


nSuitable
for large scale distribution on
commodity hardware

25 14 April 2019
+
NoSQL Summary

nNoSQL databases reject:


n Overhead of ACID transactions
n Complexity of SQL
n Burden of up-front schema design
n Declarative query expression
n Yesterday s technology

nProgrammer responsible for


n Step-by-step procedural language
n Navigating access path
26 14 April 2019
+
Summary
n SQL Databases
n Predefined Schema
n Standard definition and interface language
n Tight consistency
n Well defined semantics

n NoSQL Database
n No predefined Schema
n Per-product definition and interface language
n Getting an answer quickly is more important
than getting a correct answer

27 14 April 2019
+
Web References

n NoSQL -- Your Ultimate Guide to the Non - Relational Universe!


http://nosql-database.org/links.html
n NoSQL (RDBMS)
http://en.wikipedia.org/wiki/NoSQL
n PODC Keynote, July 19, 2000. Towards Robust. Distributed
Systems. Dr. Eric A. Brewer. Professor, UC Berkeley. Co-Founder
& Chief Scientist, Inktomi .
www.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
n Brewer's CAP Theorem posted by Julian Browne, January 11,
2009. http://www.julianbrowne.com/article/viewer/brewers-cap-
theorem
n How to write a CV Geek & Poke Cartoon
http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html

28 14 April 2019
+
Web References

n Exploring CouchDB: A document-oriented database for


Web applications , Joe Lennon, Software developer,
Core International.
http://www.ibm.com/developerworks/opensource/library/o
s-couchdb/index.html
n Graph Databases, NOSQL and Neo4j Posted by Peter
Neubauer on May 12, 2010 at:
http://www.infoq.com/articles/graph-nosql-neo4j
n Cassandra vs MongoDB vs CouchDB vs Redis vs Riak
vs HBase comparison , Kristóf Kovács.
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-
redis
n Distinguishing Two Major Types of Column-Stores
Posted by Daniel Abadi onMarch 29, 2010
http://dbmsmusings.blogspot.com/2010/03/distinguishing-
two-major-types-of_29.html

29 14 April 2019
+
Web References

n MapReduce: Simplified Data Processing on Large


Clusters , Jeffrey Dean and Sanjay Ghemawat,
December 2004.
http://labs.google.com/papers/mapreduce.html
n Scalable SQL , ACM Queue, Michael Rys, April 19,
2011
http://queue.acm.org/detail.cfm?id=1971597
n a practical guide to noSQL , Posted by Denise Miura on
March 17, 2011 at
http://blogs.marklogic.com/2011/03/17/a-practical-guide-
to-nosql/

30 14 April 2019

You might also like