Unit 2 - Session 1 and 2

The document outlines the syllabus for the Big Data Analytics course (CCS334) at KGiSL Institute of Technology, focusing on NoSQL data management. It covers various NoSQL database types, their advantages, challenges, and comparisons with SQL databases. Key topics include key-value pair databases, column-based databases, document-based databases, and graph-based databases.

Uploaded by

selvendran.akash
Copyright
© All Rights Reserved

KGiSL Institute of Technology

(Approved by AICTE, New Delhi; Affiliated to Anna University, Chennai)


Recognized by UGC, Accredited by NBA (IT)
365, KGiSL Campus, Thudiyalur Road, Saravanampatti, Coimbatore – 641035.

DEPARTMENT OF COMPUTER SCIENCE & BUSINESS SYSTEMS


Name of the Faculty : BALAKRISHNAN D

Subject Name & Code : CCS334 / BIG DATA ANALYTICS

Branch & Department : B.Tech & CSBS

Year & Semester : III/ VI

Academic Year : 2024-25


SYLLABUS

UNIT II NOSQL DATA MANAGEMENT (7 Sessions)

• Introduction to NoSQL – aggregate data models – key-value and document data models – relationships – graph databases – schemaless databases – materialized views – distribution models – master-slave replication – consistency – Cassandra – Cassandra data model – Cassandra examples – Cassandra clients
NOSQL

Not Only SQL


New approach to database design
Horizontal Scalability
For data not stored in the traditional way (row / column or tabular)
Optimized for applications with
• Flexible data models
• Large data volumes
• Low latency
• Relaxed data consistency restrictions in exchange for availability
For storing and retrieving semi-structured and unstructured data
Distributed Database
Why NOSQL? Advantages

• Scalability
• Uses sharding for horizontal scaling
• Handles structured, semi-structured and unstructured data
• Developer friendly – Easy to use and deploy
• Availability
• Replication feature
• Supports massive number of concurrent users
• Quickly adapts to changing requirements
• Extremely responsive
• Performance
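
The sharding point above can be illustrated with a small sketch. It assumes hash-based sharding, which is only one possible strategy, and it uses in-memory dictionaries in place of real database nodes. The names `shard_for` and `ShardedStore` are hypothetical and do not come from any NoSQL product.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (assumed strategy)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy store: each shard is a dict standing in for a separate node."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # The key alone decides which shard receives the write.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # Reads go to the same shard the write went to.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
for user_id in ("user:1", "user:2", "user:3", "user:4"):
    store.put(user_id, {"id": user_id})

print(store.get("user:2"))
print([len(shard) for shard in store.shards])
```

Because each key maps to exactly one shard, capacity can grow by adding shards. In this naive modulo scheme, changing the shard count remaps most keys, which is why real systems often use more elaborate schemes such as consistent hashing.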
Challenges in NOSQL

• Open Source
• Non GUI mode
• Backup
• Large Document Size
• Narrow Focus
• ACID properties are often not fully supported
• Atomicity
• Consistency
• Isolation
• Durability
NOSQL Vs. SQL
NOSQL Database - Types
NOSQL Database – Types with examples
NOSQL Database – Size Vs. Performance
Key Value Pair based database

• A key-value database stores data as a collection of key-value pairs
• Each key must be unique
• A value can be a String, Number, Date, Object (JSON, BLOB, etc.) or even another data structure
• Can grow horizontally – any number of key-value pairs can be added to a database object
• Easy to store and retrieve
• Think of the key as a question and the value as the answer
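
The points above can be sketched with a Python dictionary standing in for the database. This is a minimal illustration, not the API of any particular key-value product; the class `KeyValueStore` and the sample keys are hypothetical.

```python
from datetime import date


class KeyValueStore:
    """Toy key-value store backed by a dict (illustration only)."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value) -> None:
        # Keys are unique: writing to an existing key replaces its value.
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


kv = KeyValueStore()
kv.put("session:42", "active")                  # string value
kv.put("visits:42", 17)                         # number value
kv.put("last_login:42", date(2025, 1, 15))      # date value
kv.put("profile:42", {"name": "Asha", "tags": ["admin"]})  # nested object
kv.put("session:42", "expired")                 # same key, value overwritten

print(kv.get("session:42"))
print(kv.get("profile:42")["name"])
```

The store never inspects the value, which is what keeps put and get simple: the key is the question, and whatever was stored under it is the answer.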
Column based databases

• Data is stored in columns rather than rows
• Scalable (horizontal) and flexible
• Organized in column families (multiple columns form a family)
• Identified by a unique row key
• Variable number of columns and multiple data types
• Large amount of data in a single column
• Ideal for high performance on aggregation queries
• Used in data warehouses, CRM applications, etc
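
The logical layout described above (row key, then column family, then columns) can be sketched with nested dictionaries. This models only the structure; real column stores also arrange data on disk by column, which the sketch does not attempt. All names and sample values here are hypothetical.

```python
# row key -> column family -> column -> value
rows = {
    "cust:001": {
        "profile": {"name": "Asha", "city": "Coimbatore"},
        "orders": {"2025-01": 1200, "2025-02": 800},
    },
    "cust:002": {
        # Rows may carry a different set of columns.
        "profile": {"name": "Ravi"},
        "orders": {"2025-01": 500},
    },
}


def column_values(table: dict, family: str, column: str) -> list:
    """Collect one column across all rows, skipping rows that lack it."""
    return [
        families[family][column]
        for families in table.values()
        if column in families.get(family, {})
    ]


# An aggregation query touches a single column, not whole rows.
jan_total = sum(column_values(rows, "orders", "2025-01"))
print(jan_total)
```

Reading just the `2025-01` column is the kind of access pattern that makes column-oriented storage attractive for aggregation queries.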
Document based databases

• Data is stored in documents
• Each document holds nested keys and values
• Stored in JSON, XML, BSON etc
• Value can be
• Atomic data type
• Complex elements such as arrays, lists, objects, collections etc
• Documents can be retrieved fully or partially using a unique key
• Can be used in blogs, e-commerce applications, Real time analytics, etc
• Examples: Amazon SimpleDB, CouchDB, MongoDB, Riak, Lotus Notes, etc.
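
A document with atomic and complex values, and full versus partial retrieval by key, might look like the sketch below. It uses plain Python dictionaries and the standard `json` module rather than a real document database; `get_document` and the sample blog post are hypothetical.

```python
import json

# document id -> document (nested keys and values)
documents = {
    "post:1": {
        "title": "Intro to NoSQL",                     # atomic value
        "author": {"name": "Asha", "email": "asha@example.com"},  # object
        "tags": ["nosql", "databases"],                # array
        "comments": [{"user": "ravi", "text": "Nice overview"}],
    }
}


def get_document(doc_id: str, fields=None) -> dict:
    """Return the whole document, or only the requested top-level fields."""
    doc = documents[doc_id]
    if fields is None:
        return doc
    return {name: doc[name] for name in fields if name in doc}


full = get_document("post:1")
partial = get_document("post:1", fields=["title", "tags"])

print(json.dumps(partial))
as_json = json.dumps(full)  # documents serialize naturally to JSON
```

Selecting a subset of fields is a simplified stand-in for what document databases usually call a projection.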
Graph based databases

• Used to store and query highly connected data
• Data is modelled as entities and the relationships between them
• Entities are also referred to as nodes or vertices
• Relationships are also referred to as edges
• Every node and edge has a unique identifier
• Netflix uses a graph database for its digital asset management
• Assets – what users have watched
• Access management – what users are allowed to watch
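
The node-and-edge model can be sketched in a few lines. The example borrows the "watched" and "allowed to watch" idea from the bullets above, but the schema, identifiers and titles are invented for illustration and are not Netflix's actual data model.

```python
# Every node and edge carries a unique identifier.
nodes = {
    "u1": {"label": "User", "name": "Asha"},
    "m1": {"label": "Movie", "title": "Movie A"},
    "m2": {"label": "Movie", "title": "Movie B"},
}

edges = [
    {"id": "e1", "src": "u1", "type": "WATCHED", "dst": "m1"},
    {"id": "e2", "src": "u1", "type": "ALLOWED_TO_WATCH", "dst": "m1"},
    {"id": "e3", "src": "u1", "type": "ALLOWED_TO_WATCH", "dst": "m2"},
]


def neighbours(node_id: str, edge_type: str) -> list:
    """Follow outgoing edges of one type from a node."""
    return [
        edge["dst"]
        for edge in edges
        if edge["src"] == node_id and edge["type"] == edge_type
    ]


watched = neighbours("u1", "WATCHED")
allowed = neighbours("u1", "ALLOWED_TO_WATCH")
# Relationship query: allowed but not yet watched.
unwatched = [movie for movie in allowed if movie not in watched]

print([nodes[movie]["title"] for movie in unwatched])
```

Queries here are expressed as edge traversals rather than table joins, which is the main appeal of the graph model for highly connected data.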
Interactive Session