NoSQL D

NoSQL databases are a diverse category of data storage solutions that do not adhere to traditional RDBMS principles, designed to handle large volumes of unstructured and semi-structured data. They prioritize scalability, flexibility, and performance over strict ACID compliance, often utilizing the CAP theorem to balance consistency, availability, and partition tolerance. Various types of NoSQL databases, such as document stores, key-value stores, and graph databases, cater to different application needs and data management challenges.

Uploaded by

Praket Mehta

NoSQL databases are currently a hot topic in some parts of computing, with over a hundred different NoSQL databases.

▪ Data stored in columns and tables
▪ Relationships represented by data
▪ Data Manipulation Language
▪ Data Definition Language
▪ Transactions
▪ Abstraction from physical layer
▪ Applications specify what, not how
▪ Physical layer can change without modifying applications
▪ Create indexes to support queries
▪ In Memory databases
▪ Atomic – All of the work in a transaction completes (commit) or none of it
completes
▪ Consistent – A transaction transforms the database from one consistent state
to another consistent state. Consistency is defined in terms of constraints.
▪ Isolated – The results of any changes made during a transaction are not
visible until the transaction has committed.
▪ Durable – The results of a committed transaction survive failures
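The atomicity and consistency properties above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the account names, and the helper functions are hypothetical and exist only for this illustration. The `CHECK` constraint plays the role of a consistency rule, and a failed transfer leaves no partial work behind.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency constraint
    )
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit is undone too.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = make_bank()
print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer asks for more than the source account holds. The credit to the destination has already been applied when the debit fails, so the rollback is what keeps the database in a consistent state.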
▪ NoSQL stands for:
▪ No Relational
▪ No RDBMS
▪ Not Only SQL

▪ NoSQL is an umbrella term for all databases and data stores that don’t follow the
RDBMS principles
▪ A class of products
▪ A collection of several (related) concepts about data storage and manipulation
▪ Often related to large data sets
▪ Non-relational DBMSs are not new
▪ But NoSQL represents a new incarnation
▪ Due to massively scalable Internet applications
▪ Based on distributed and parallel computing

▪ Development
▪ Starts with Google
▪ First research paper published in 2003
▪ Continues thanks to Lucene's developers/Apache (Hadoop) and Amazon (Dynamo)
▪ Then many products and much interest came from Facebook, Netflix, Yahoo, eBay, Hulu, IBM,
and many more
▪ Three major papers were the seeds of the NoSQL movement
▪ BigTable (Google)
▪ Dynamo (Amazon)
▪ Distributed key-value data store
▪ Eventual consistency
▪ CAP Theorem
▪ NoSQL comes from the Internet, so it is often related to the “big data” concept
▪ How big is “big data”?
▪ Over a few terabytes: enough to start spanning multiple storage units

▪ Challenges
▪ Efficiently storing and accessing large amounts of data is difficult, even more so once
fault tolerance and backups are considered
▪ Manipulating large data sets involves running massively parallel processes
▪ Managing continuously evolving schema and metadata for semi-structured and
unstructured data is difficult
▪ Explosion of social media sites (Facebook, Twitter) with large
data needs
▪ Rise of cloud-based solutions such as Amazon S3 (Simple Storage
Service)
▪ Just as developers moved to dynamically-typed languages (Python, Ruby,
Groovy), data is shifting to dynamically-typed structures with frequent schema
changes
▪ Open-source community
▪ The context is the Internet
▪ RDBMSs assume that data are
▪ Dense
▪ Largely uniform (structured data)
▪ Data coming from the Internet are
▪ Massive and sparse
▪ Semi-structured or unstructured
▪ With massive sparse data sets, the typical storage mechanisms and access methods
get stretched
▪ Large data volumes
▪ Google’s “big data”
▪ Scalable replication and distribution
▪ Potentially thousands of machines
▪ Potentially distributed around the world
▪ Queries need to return answers quickly
▪ Mostly query, few updates
▪ Asynchronous inserts & updates
▪ Schema-less
▪ ACID transaction properties are not needed – BASE
▪ CAP Theorem
▪ Open source development
Discussing NoSQL databases is complicated because there are a variety of types:

▪ Sorted ordered column stores:
▪ are optimized for queries over large datasets, and store
columns of data together, instead of rows.
▪ Document databases:
▪ pair each key with a complex data structure known as a document.
▪ Key-value stores:
▪ are the simplest NoSQL databases. Every single item in the database is stored
as an attribute name (or 'key'), together with its value.
▪ Graph databases:
▪ are used to store information about networks of data, such as social connections.
▪ Documents
▪ Loosely structured sets of key/value pairs in documents, e.g., XML, JSON
▪ Encapsulate and encode data in some standard formats or encodings
▪ Are addressed in the database via a unique key
▪ Documents are treated as a whole, avoiding splitting a document into its constituent
name/value pairs
▪ Allow retrieving documents by key or by content
▪ Notable for:
▪ MongoDB (used by Foursquare, GitHub, and more)
▪ CouchDB (used by Apple, BBC, Canonical, CERN, and more)
▪ The central concept is the notion of a “document”, which corresponds to a
row in an RDBMS.
▪ A document comes in a standard format such as JSON (or BSON, the binary encoding MongoDB uses).
▪ Documents are addressed in the database via a unique key that represents
that document.
▪ The database offers an API or query language that retrieves documents
based on their contents.
▪ Documents are schema free, i.e., different documents can have structures
and schema that differ from one another. (An RDBMS requires that each row
contain the same columns.)

{
    _id: ObjectId("51156a1e056d6f966f268f81"),
    type: "Article",
    author: "Derick Rethans",
    title: "Introduction to Document Databases with MongoDB",
    date: ISODate("2013-04-24T16:26:31.911Z"),
    body: "This arti…"
},
{
    _id: ObjectId("51156a1e056d6f966f268f82"),
    type: "Book",
    author: "Derick Rethans",
    title: "php|architect's Guide to Date and Time Programming with PHP",
    isbn: "978-0-9738621-5-7"
}
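To make the document model concrete, here is a minimal in-memory sketch in Python. It is not MongoDB's or CouchDB's API: the `DocumentStore` class and its `insert`, `get`, and `find` methods are hypothetical names chosen for this illustration. It shows the three properties described above: documents are stored whole, addressed by a unique key, and retrievable by content even when their fields differ.

```python
import uuid

class DocumentStore:
    """Toy document store: whole documents addressed by a unique key."""

    def __init__(self):
        self._docs = {}  # maps _id -> document (a plain dict)

    def insert(self, doc):
        """Store a copy of the document and return its unique key."""
        doc_id = doc.get("_id") or uuid.uuid4().hex
        self._docs[doc_id] = {**doc, "_id": doc_id}
        return doc_id

    def get(self, doc_id):
        """Retrieve a document by key, or None if it does not exist."""
        return self._docs.get(doc_id)

    def find(self, **criteria):
        """Retrieve documents by content: every given field must match."""
        return [
            doc for doc in self._docs.values()
            if all(k in doc and doc[k] == v for k, v in criteria.items())
        ]

store = DocumentStore()
article_id = store.insert({
    "type": "Article",
    "author": "Derick Rethans",
    "title": "Introduction to Document Databases with MongoDB",
})
store.insert({
    "type": "Book",
    "author": "Derick Rethans",
    "title": "php|architect's Guide to Date and Time Programming with PHP",
    "isbn": "978-0-9738621-5-7",  # a field the article does not have
})

print(len(store.find(author="Derick Rethans")))
print(store.find(type="Book")[0]["isbn"])
```

The two stored documents share no fixed schema: only the book has an `isbn` field, yet both live in the same store and both match a query on `author`.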
▪ Key-value stores
▪ Store data in a schema-less way
▪ Store data as maps
▪ HashMaps or associative arrays
▪ Provide very efficient average-case running
time for accessing data
▪ Notable for:
▪ Couchbase (Zynga, Vimeo, NAVTEQ, ...)
▪ Redis (Craigslist, Instagram, Stack Overflow,
Flickr, ...)
▪ Amazon Dynamo (Amazon, Elsevier,
IMDb, ...)
▪ Apache Cassandra (Facebook, Digg,
Reddit, Twitter, ...)
▪ Voldemort (LinkedIn, eBay, …)
▪ Riak (GitHub, Comcast, Mochi, ...)
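The map-based storage model above can be sketched in a few lines of Python. This is an illustration only, not the interface of Redis, Riak, or any other product listed; `KeyValueStore` and its methods are hypothetical names. Values are serialized to opaque bytes to emphasize that such a store looks items up by key and does not inspect their content.

```python
import json

class KeyValueStore:
    """Toy key-value store backed by a hash map (a Python dict)."""

    def __init__(self):
        self._data = {}  # average O(1) put, get, and delete

    def put(self, key, value):
        # The value is stored as opaque bytes; the store never looks inside it.
        self._data[key] = json.dumps(value).encode("utf-8")

    def get(self, key, default=None):
        raw = self._data.get(key)
        return default if raw is None else json.loads(raw)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None

kv = KeyValueStore()
kv.put("session:42", {"user": "alice", "cart": [1, 2]})
print(kv.get("session:42"))
```

Unlike the document model, there is no way here to ask for "all sessions belonging to alice" without knowing the keys; that restriction is what keeps access simple and fast.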
▪ Data are stored in a column-oriented way
▪ Data efficiently stored
▪ Avoids consuming space for storing nulls
▪ Columns are grouped in column-families
▪ Data isn’t stored as a single table but is stored by column families
▪ Unit of data is a set of key/value pairs
▪ Identified by “row-key”
▪ Ordered and sorted based on row-key

▪ Notable for:
▪ Google's Bigtable (used across many of Google's services)
▪ HBase (Facebook, StumbleUpon, Hulu, Yahoo!, ...)
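The column-family layout above can be approximated with nested dictionaries plus a sorted list of row keys. The sketch below is a simplification under stated assumptions, not the Bigtable or HBase data model in full (it has no timestamps or versions), and `ColumnFamilyStore`, the family names, and the row keys are all hypothetical.

```python
from bisect import bisect_left

class ColumnFamilyStore:
    """Toy column-family store: data grouped by family, rows sorted by row key."""

    def __init__(self, families):
        # family -> row_key -> {column: value}; absent columns occupy no space
        self._families = {name: {} for name in families}
        self._row_keys = []  # kept sorted so range scans are cheap

    def put(self, row_key, family, column, value):
        i = bisect_left(self._row_keys, row_key)
        if i == len(self._row_keys) or self._row_keys[i] != row_key:
            self._row_keys.insert(i, row_key)
        self._families[family].setdefault(row_key, {})[column] = value

    def get(self, row_key, family):
        """Return the key/value pairs one row holds in one column family."""
        return dict(self._families[family].get(row_key, {}))

    def scan(self, start, stop, family):
        """Return rows with start <= row_key < stop, in row-key order."""
        lo = bisect_left(self._row_keys, start)
        hi = bisect_left(self._row_keys, stop)
        rows = self._families[family]
        return [(k, dict(rows[k])) for k in self._row_keys[lo:hi] if k in rows]

cf = ColumnFamilyStore(["profile", "activity"])
cf.put("user#002", "profile", "name", "Bob")
cf.put("user#001", "profile", "name", "Alice")
cf.put("user#001", "profile", "email", "alice@example.com")
cf.put("user#010", "activity", "last_login", "2013-04-24")

print(cf.scan("user#001", "user#010", "profile"))
```

Bob's row has no `email` column and stores nothing for it, which is the "no space for nulls" property. Reading the `profile` family never touches `activity` data, which is the point of grouping columns into families.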
▪ Graph-oriented
▪ Everything is stored as an edge, a node or an attribute.
▪ Each node and edge can have any number of attributes.
▪ Both the nodes and edges can be labelled.
▪ Labels can be used to narrow searches.
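A minimal sketch of this graph model in Python follows. It does not reproduce the query language or API of any particular graph database; `PropertyGraph`, the labels, and the sample data are hypothetical. Nodes and edges each carry a label and an arbitrary set of attributes, and the labels narrow a neighbor search.

```python
class PropertyGraph:
    """Toy graph store: labelled nodes and edges, each with free-form attributes."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"label": str, "attrs": dict}
        self.edges = []  # list of {"src", "label", "dst", "attrs"}

    def add_node(self, node_id, label, **attrs):
        self.nodes[node_id] = {"label": label, "attrs": attrs}

    def add_edge(self, src, label, dst, **attrs):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append({"src": src, "label": label, "dst": dst, "attrs": attrs})

    def neighbors(self, node_id, edge_label=None, node_label=None):
        """Follow outgoing edges, optionally narrowing by edge or node label."""
        result = []
        for edge in self.edges:
            if edge["src"] != node_id:
                continue
            if edge_label is not None and edge["label"] != edge_label:
                continue
            if node_label is not None and self.nodes[edge["dst"]]["label"] != node_label:
                continue
            result.append(edge["dst"])
        return result

g = PropertyGraph()
g.add_node("alice", "Person", age=30)
g.add_node("bob", "Person")
g.add_node("acme", "Company", city="Verona")
g.add_edge("alice", "FOLLOWS", "bob", since=2012)
g.add_edge("alice", "WORKS_AT", "acme")

print(g.neighbors("alice", edge_label="FOLLOWS"))
print(g.neighbors("alice", node_label="Company"))
```

The linear scan over all edges keeps the sketch short; a real graph database would index edges by node so that traversals do not depend on the total size of the graph.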

▪ Issues with scaling up when the dataset is just too big
▪ RDBMSs were not designed to be distributed
▪ Traditional DBMSs are designed to run best on a “single” machine
▪ Larger volumes of data/operations require upgrading the server with faster
CPUs or more memory, known as ‘scaling up’ or ‘vertical scaling’
▪ NoSQL solutions are designed to run on clusters or multi-node database
solutions
▪ Larger volumes of data/operations require adding more machines to the
cluster, known as ‘scaling out’ or ‘horizontal scaling’
▪ Different approaches include:
▪ Master-slave
▪ Sharding (partitioning)
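Sharding can be sketched as a function that maps each key to one of several independent stores. The example below assumes simple hash-modulo partitioning, which is only one possible scheme; `shard_for` and `ShardedStore` are hypothetical names, and each shard is a plain dictionary standing in for a separate machine.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a string key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

class ShardedStore:
    """Toy horizontally partitioned store: one dict per simulated machine."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)

cluster = ShardedStore(num_shards=4)
for user_id in range(1000):
    cluster.put(f"user:{user_id}", {"id": user_id})

print([len(shard) for shard in cluster.shards])  # how the keys spread out
print(cluster.get("user:123"))
```

A stable hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, so two machines would disagree on where a key lives. One known limitation of the modulo scheme is that changing the number of shards relocates most keys; consistent hashing is a common way to reduce that movement.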
▪ RDBMSs are based on ACID (Atomicity, Consistency, Isolation, and Durability)
properties
▪ NoSQL
▪ Places less importance on ACID properties
▪ In some cases ignores them completely
▪ In distributed parallel systems it is difficult or impossible to ensure ACID properties
▪ Long-running transactions don't work because keeping resources locked for a
long time is not practical
▪ BASE: an acronym contrived to be the opposite of ACID
▪ Basically Available,
▪ Soft state,
▪ Eventually Consistent
▪ Characteristics
▪ Weak consistency – stale data OK
▪ Availability first
▪ Best effort
▪ Approximate answers OK
▪ Aggressive (optimistic)
▪ Simpler and faster
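The "stale data OK" and "eventually consistent" characteristics above can be simulated with a primary copy that accepts writes immediately and replicas that catch up later. This is a deliberately simplified sketch: the `EventuallyConsistentStore` class, the single version counter, and the explicit `sync` step are assumptions made for illustration, whereas real systems propagate updates in the background over a network.

```python
class EventuallyConsistentStore:
    """Toy replicated store: writes hit replica 0 first, others catch up on sync()."""

    def __init__(self, num_replicas):
        self.replicas = [{} for _ in range(num_replicas)]  # key -> (version, value)
        self.pending = []  # updates not yet delivered: (replica, key, version, value)
        self.version = 0

    def write(self, key, value):
        """Acknowledge the write at once; replication happens later."""
        self.version += 1
        self.replicas[0][key] = (self.version, value)
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, self.version, value))

    def read(self, key, replica=0):
        """Read from one replica; the answer may be stale."""
        entry = self.replicas[replica].get(key)
        return None if entry is None else entry[1]

    def sync(self):
        """Deliver all pending updates, keeping the newest version of each key."""
        for i, key, version, value in self.pending:
            current = self.replicas[i].get(key)
            if current is None or current[0] < version:
                self.replicas[i][key] = (version, value)
        self.pending.clear()

db = EventuallyConsistentStore(num_replicas=3)
db.write("status", "online")
print(db.read("status", replica=0), db.read("status", replica=2))  # fresh vs. stale
db.sync()
print(db.read("status", replica=0), db.read("status", replica=2))  # replicas agree
```

Between the write and the sync, a client reading from replica 2 gets an older answer than a client reading from replica 0. The write was never blocked waiting for the other replicas, which is the availability-first trade-off.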
The CAP theorem provides a coherent and logical way of assessing the problems
involved in assuring ACID-like guarantees in distributed systems.
At most two of the following three can be maximized at one time:
▪ Consistency
▪ Each client has the same view of the data
▪ Availability
▪ Each client can always read and write
▪ Partition tolerance
▪ System works well across physically distributed networks, even when
links between nodes fail
▪ CAP theorem – at most two of the three properties can be
addressed
▪ The choices could be as follows:
1. Availability is compromised but consistency and partition
tolerance are preferred over it
2. The system has little or no partition tolerance. Consistency
and availability are preferred
3. Consistency is compromised but the system is always available
and can work when parts of it are partitioned
• Consistency and availability are not a
“binary” decision
• AP systems relax consistency in
favor of availability, but are not
inconsistent
• CP systems sacrifice availability for
consistency, but are not unavailable
• This suggests both AP and CP
systems can offer a degree of
consistency and availability, as
well as partition tolerance
[Figure: CAP triangle with vertices C, A, and P]
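The difference between CP and AP behavior during a network partition can be shown with a two-node simulation. This is a rough sketch under strong simplifying assumptions: the `TwoNodeCluster` class is hypothetical, a single shared counter stands in for timestamps, and conflicts are resolved by keeping the most recent write, which is only one of several possible reconciliation policies.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""

class TwoNodeCluster:
    """Toy two-replica cluster that behaves as either 'CP' or 'AP'."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [{}, {}]  # key -> (timestamp, value)
        self.partitioned = False
        self.clock = 0  # simplification: one logical clock shared by both nodes

    def write(self, node, key, value):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than let the replicas diverge.
            raise Unavailable("cannot reach the other replica")
        self.clock += 1
        self.nodes[node][key] = (self.clock, value)
        if not self.partitioned:
            self.nodes[1 - node][key] = (self.clock, value)

    def read(self, node, key):
        entry = self.nodes[node].get(key)
        return None if entry is None else entry[1]

    def heal(self):
        """End the partition and reconcile: the most recent write wins."""
        self.partitioned = False
        for key in set(self.nodes[0]) | set(self.nodes[1]):
            newest = max(
                (n[key] for n in self.nodes if key in n),
                key=lambda entry: entry[0],
            )
            for n in self.nodes:
                n[key] = newest

ap = TwoNodeCluster("AP")
ap.partitioned = True
ap.write(0, "cart", ["book"])
ap.write(1, "cart", ["pen"])
print(ap.read(0, "cart"), ap.read(1, "cart"))  # the replicas disagree for now
ap.heal()
print(ap.read(0, "cart"), ap.read(1, "cart"))  # both hold the latest write
```

In CP mode the same partitioned write raises `Unavailable` instead, so readers never see diverging values but some requests fail. In both modes the cluster behaves identically while the network is healthy, which matches the point above that the choice is not binary.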
▪ There is no perfect NoSQL database
▪ Every database has its advantages and disadvantages
▪ Depending on the type of tasks (and preferences) to accomplish

▪ NoSQL is a set of concepts, ideas, technologies, and software dealing with
▪ Big data
▪ Sparse un/semi-structured data
▪ High horizontal scalability
▪ Massive parallel processing
▪ Different applications, goals, targets, approaches need different NoSQL solutions
▪ Where would I use a NoSQL database?
▪ Do you have a large set of uncontrolled, unstructured data
somewhere that you are trying to fit into an RDBMS?
▪ Log analysis
▪ Social networking feeds (many firms hooked in through
Facebook or Twitter)
▪ External feeds from partners
▪ Data that is not easily analyzed in an RDBMS, such as time-
based data
▪ Large data feeds that need to be massaged before entry into
an RDBMS
