DECS 43A - Big Data Analysis

This document discusses NoSQL databases and provides an introduction to MongoDB and Apache Cassandra. It begins with an overview of NoSQL and how it differs from SQL databases in being non-relational, distributed, schema-less, and designed for large datasets. Examples of uses for NoSQL include log analysis, social networking feeds, and time series data. The document then outlines some key features of NoSQL before describing the two main types: key-value stores and schema-less databases like column-based Cassandra. It concludes with details on MongoDB and Cassandra in upcoming sections.

Big Data Analytics

Government Arts and Science College


Tittagudi-606106
Department of Computer Science

DECS 43A – Big Data Analysis


II Year IV Semester

Unit - 3
Big Data Technologies and Databases
Dr. S. P. Ponnusamy
Assistant Professor and Head


Unit - 3
Big Data Technologies and Databases

Introduction to NoSQL, Uses, Features and Types, Need, Advantages,
Disadvantages and Application of NoSQL, Overview of NewSQL,
Comparing SQL, NoSQL and NewSQL, Introduction to MongoDB and its
needs, Characteristics of MongoDB, Introduction of Apache Cassandra
and its needs, Characteristics of Cassandra


Agenda
• Introduction to NoSQL
• Uses, Features and Types
• Need, Advantages, Disadvantages
• Application of NoSQL
• Overview of NewSQL
• Comparing SQL, NoSQL and NewSQL
• Introduction to MongoDB and its needs
• Characteristics of MongoDB
• Introduction of Apache Cassandra and its needs
• Characteristics of Cassandra

Introduction to NoSQL
RDBMS (SQL) ….. NoSQL
• Value of RDBMS
• Getting Persistent Data
• Concurrency
• Shared DB integration
• A (mostly) standard model


Introduction to NoSQL
RDBMS Characteristics
• Data stored in columns and tables
• Relationships represented by data
• Data Manipulation Language
• Data Definition Language
• Transactions
• Abstraction from physical layer
• Applications specify what, not how
• Physical layer can change without modifying applications
– Create indexes to support queries
– In Memory databases
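
The RDBMS characteristics above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table, column and index names are hypothetical, and SQLite is used simply because it ships with Python. It shows a DDL statement, DML inside a transaction, and a query that stays unchanged when an index is added to the physical layer.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")

# Data Definition Language: the schema is declared up front.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Data Manipulation Language inside a transaction: the `with` block
# commits on success and rolls back if an exception is raised.
with conn:
    conn.executemany(
        "INSERT INTO student (id, name, dept) VALUES (?, ?, ?)",
        [(1, "Anu", "CS"), (2, "Bala", "Maths"), (3, "Chitra", "CS")],
    )


def cs_students(connection):
    # The query states *what* is wanted, not *how* to find it.
    rows = connection.execute(
        "SELECT name FROM student WHERE dept = ? ORDER BY name", ("CS",)
    )
    return [name for (name,) in rows]


before = cs_students(conn)

# Changing the physical layer (adding an index) needs no change to the query.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")
after = cs_students(conn)
```

The same function returns the same rows before and after the index is created, which is the point of the abstraction from the physical layer.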

Introduction to NoSQL
RDBMS (SQL) ….. NoSQL


Introduction to NoSQL
What is NoSQL ?


• NoSQL stands for:
– Non-relational
– No RDBMS
– Not Only SQL
• NoSQL is an umbrella term for all databases and data stores that don’t follow the RDBMS principles
– A class of products
– A collection of several (related) concepts about data storage and manipulation
– Often related to large data sets


Introduction to NoSQL
Where does NoSQL come from?
• Non-relational DBMSs are not new
• But NoSQL represents a new incarnation
– Due to massively scalable Internet applications
– Based on distributed and parallel computing
• Development
– Starts with Google
– First research paper published in 2003
– Continues thanks to Lucene's developers/Apache (Hadoop) and Amazon (Dynamo)
– Then many products and much interest came from Facebook, Netflix, Yahoo, eBay, Hulu, IBM, and many more

Uses of NoSQL

• Log Analysis
• Social Networking Feeds
• Time Based Data (not easily analyzed in RDBMS)
• Dealing with a rich variety of data (structured, semi-structured and unstructured)


Features of NoSQL

• Open Source
• Non-Relational
• Distributed
• Schema-less
• Cluster friendly
• Born out of 21st century Web Applications


Types of NoSQL
• Broad Classification
1. Key-Value or the big hash table
– Amazon S3 (Dynamo), Scalaris
2. Schema-less
– Column Based (Cassandra, HBase)
– Document Based (CouchDB)
– Graph Based (Neo4j)


Types of NoSQL
Key-value Data Store
• Store data in a schema-less way
• Store data as maps
– HashMaps or associative arrays
– Provide very efficient average-case lookup time when accessing data by key
• Notable for:
– Couchbase (Zynga, Vimeo, NAVTEQ, ...)
– Redis (Craigslist, Instagram, Stack Overflow, Flickr, ...)
– Amazon Dynamo (Amazon, Elsevier, IMDb, ...)
– Apache Cassandra (Facebook, Digg, Reddit, Twitter,...)
– Voldemort (LinkedIn, eBay, …)
– Riak (Github, Comcast, Mochi, ...)
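
The "store data as maps" idea above can be sketched as a toy in-memory store built on a Python dict, which is a hash map. The class name, method names and sample keys are hypothetical and do not correspond to the API of any product listed above; real key-value stores add persistence, replication and partitioning on top of this idea.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash map)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The value is opaque to the store: any object, no schema required.
        self._data[key] = value

    def get(self, key, default=None):
        # Hash lookup: constant time on average.
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1", {"name": "Anu", "city": "Tittagudi"})
store.put("session:42", "token-abc")  # a different value shape under another key

user = store.get("user:1")
missing = store.get("user:999")
deleted = store.delete("session:42")
```

Access is only by key: there is no query over the values, which is what keeps lookups simple and fast.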


Types of NoSQL
Document based - JSON

{
    _id: ObjectId("51156a1e056d6f966f268f81"),
    type: "Article",
    author: "Derick Rethans",
    title: "Introduction to Document Databases with MongoDB",
    date: ISODate("2013-04-24T16:26:31.911Z"),
    body: "This arti…"
},
{
    _id: ObjectId("51156a1e056d6f966f268f82"),
    type: "Book",
    author: "Derick Rethans",
    title: "php|architect's Guide to Date and Time Programming with PHP",
    isbn: "978-0-9738621-5-7"
}
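
The two documents above have different fields, which is what schema-less means in practice. A minimal sketch in plain Python, assuming the documents are held as dicts in a list: the `find` helper is hypothetical and is not the MongoDB API, the `_id` values are kept as plain strings, and the `date` and truncated `body` fields are left out.

```python
# Documents in one collection need not share the same fields.
documents = [
    {
        "_id": "51156a1e056d6f966f268f81",
        "type": "Article",
        "author": "Derick Rethans",
        "title": "Introduction to Document Databases with MongoDB",
    },
    {
        "_id": "51156a1e056d6f966f268f82",
        "type": "Book",
        "author": "Derick Rethans",
        "title": "php|architect's Guide to Date and Time Programming with PHP",
        "isbn": "978-0-9738621-5-7",
    },
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


books = find(documents, type="Book")
by_author = find(documents, author="Derick Rethans")
with_isbn = [doc["_id"] for doc in documents if "isbn" in doc]
```

Unlike a key-value store, a document store can query on fields inside the stored value, and a field such as `isbn` may exist on some documents only.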

Types of NoSQL
Document based - MongoDB


Types of NoSQL
Column based

• Data are stored in a column-oriented way
– Data efficiently stored
– Avoids consuming space for storing nulls
– Columns are grouped in column-families
– Data isn’t stored as a single table but is stored by column families
– Unit of data is a set of key/value pairs
• Identified by “row-key”
• Ordered and sorted based on row-key
• Notable for:
– Google's Bigtable (used in all Google's services)
– HBase (Facebook, StumbleUpon, Hulu, Yahoo!, ...)
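
The column-family layout described above can be sketched as nested maps: row-key, then column family, then column. This is a simplified illustration with hypothetical class, method and key names, not the data model or API of Bigtable or HBase. It shows that absent columns take no space and that rows are scanned in row-key order.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy column-family store: row-key -> column family -> {column: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        # Only columns that are actually written are stored; there are no nulls.
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        # Return a copy of one column family of one row (empty if absent).
        row = self._rows.get(row_key)
        if row is None:
            return {}
        return dict(row.get(family, {}))

    def scan(self, start=None, stop=None):
        # Yield row-keys in sorted order, optionally limited to [start, stop).
        for row_key in sorted(self._rows):
            if start is not None and row_key < start:
                continue
            if stop is not None and row_key >= stop:
                continue
            yield row_key


store = ColumnFamilyStore()
store.put("user#002", "profile", "name", "Bala")
store.put("user#001", "profile", "name", "Anu")
store.put("user#001", "contact", "email", "anu@example.com")

ordered_keys = list(store.scan())
profile = store.get_family("user#001", "profile")
no_contact = store.get_family("user#002", "contact")
```

Here the row `user#002` has no `contact` family at all, so nothing is stored for it, and the scan returns the rows sorted by row-key regardless of insertion order.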


Types of NoSQL
Graph based

• Graph-oriented
• Everything is stored as an edge, a node or an attribute.
• Each node and edge can have any number of attributes.
• Both the nodes and edges can be labelled.
• Labels can be used to narrow searches.
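
The points above can be sketched as a small property graph in Python. The class, node ids, labels and attributes are hypothetical and this is not the Neo4j API; it only illustrates nodes and edges carrying attributes, and labels being used to narrow a search.

```python
class PropertyGraph:
    """Toy property graph: labelled nodes and edges, each with attributes."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"label": str, "attrs": dict}
        self.edges = []  # list of (source_id, edge_label, target_id, attrs)

    def add_node(self, node_id, label, **attrs):
        self.nodes[node_id] = {"label": label, "attrs": attrs}

    def add_edge(self, source, edge_label, target, **attrs):
        self.edges.append((source, edge_label, target, attrs))

    def nodes_with_label(self, label):
        # Labels narrow the search to one kind of node.
        return sorted(n for n, data in self.nodes.items() if data["label"] == label)

    def neighbours(self, source, edge_label):
        # Follow only the outgoing edges that carry the given label.
        return [t for s, lbl, t, _ in self.edges if s == source and lbl == edge_label]


g = PropertyGraph()
g.add_node("anu", "Person", age=20)
g.add_node("bala", "Person")
g.add_node("nosql", "Course", unit=3)
g.add_edge("anu", "FRIEND_OF", "bala", since=2022)
g.add_edge("anu", "ENROLLED_IN", "nosql")

people = g.nodes_with_label("Person")
courses_of_anu = g.neighbours("anu", "ENROLLED_IN")
friends_of_anu = g.neighbours("anu", "FRIEND_OF")
```

Relationships are stored directly as edges, so following them does not require a join.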


Why NoSQL?


Advantages of NoSQL


Disadvantages of NoSQL


CAP Theorem


Use of NoSQL in Industry



NoSQL Vendors


SQL vs NoSQL


NewSQL


Characteristics of NewSQL


Comparison


End
