NoSQL Seminar
Agenda
Introduction to NoSQL
Objective
Examples of NoSQL databases
NoSQL vs SQL
Conclusion
Basic Concepts
Not linear!
Scaling (contd.)
NoSQL Scaling
Need more storage?
Add more servers!
Need higher performance?
Add more servers!
Need better reliability?
Add more servers!
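The "add more servers" idea can be sketched as hash partitioning: each key is assigned to one server, so the share of data each server holds shrinks as servers are added. This is a minimal illustration, not the scheme of any particular database; the helper names `server_for` and `distribute` and the `user:<n>` keys are hypothetical.

```python
import hashlib


def server_for(key: str, num_servers: int) -> int:
    """Pick a server index for a key using a stable hash."""
    # md5 is used only to get a hash that is stable across runs;
    # Python's built-in hash() is salted per process.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


def distribute(keys, num_servers):
    """Return how many keys land on each server."""
    counts = [0] * num_servers
    for key in keys:
        counts[server_for(key, num_servers)] += 1
    return counts


# Hypothetical workload: 10,000 user records.
keys = [f"user:{i}" for i in range(10_000)]
for n in (2, 4, 8):
    print(n, "servers ->", distribute(keys, n))
```

Plain modulo hashing moves most keys when the server count changes, which is why production systems typically use consistent hashing or range partitioning instead.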
Scaling Summary
Key-Value Stores
Map Reduce Framework
Document Databases
Graph Databases
Key Value Stores
Reduce
collect results from machines you gave the tasks
combine results and return it to requester
Slower than sequential data processing, but massively parallel
Sort a petabyte of data in a few hours
Input, Map, Shuffle, Reduce, Output
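The five stages above can be sketched in a single process with a word count, the usual introductory MapReduce example. This is only an illustration of the data flow, assuming everything fits in memory; a real framework runs the map and reduce steps on many machines. The function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return (key, sum(values))


def map_reduce(documents):
    """Run Input -> Map -> Shuffle -> Reduce -> Output in one process."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Input: three tiny "documents".
docs = ["NoSQL scales out", "SQL scales up", "NoSQL and SQL"]
result = map_reduce(docs)
print(result)
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is what lets a framework spread them across machines.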
Popular NoSQL
Cassandra
Facebook (original developer, used it till late 2010)
Twitter
Digg
Reddit
Rackspace
Cisco
BigTable
Google (HBase is an open-source implementation modeled on it)
MongoDB
Foursquare
Craigslist
Bit.ly
SourceForge
GitHub
MONGODB
Document store
Basic support for dynamic (ad hoc) queries
Query by example (nice!)
Conditional Operators
$lt, $lte, $gt, $gte (<, <=, >, >=)
$all, $exists, $mod, $ne, $in, $nin, $nor, $or, $and, $size, $type
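Query by example means the query is itself a document: a field either equals a given value or satisfies an operator such as `$gt` or `$in`. The sketch below evaluates such queries over plain Python dictionaries to show the idea. It is not MongoDB's query engine or a driver API, it covers only a few of the operators listed above, and the `find` and `matches` helpers and the sample data are hypothetical.

```python
import operator

# Comparison operators, named after MongoDB's $-prefixed style.
COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def check(doc, field, op, operand):
    """Evaluate one operator against one field of a document."""
    present = field in doc
    value = doc.get(field)
    if op in COMPARISONS:
        return present and COMPARISONS[op](value, operand)
    if op == "$ne":
        return value != operand
    if op == "$in":
        return value in operand
    if op == "$nin":
        return value not in operand
    if op == "$exists":
        return present == bool(operand)
    raise ValueError(f"unsupported operator in this sketch: {op}")


def matches(doc, query):
    """True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if isinstance(condition, dict):
            if not all(check(doc, field, op, arg) for op, arg in condition.items()):
                return False
        elif doc.get(field) != condition:
            return False
    return True


def find(collection, query):
    """Return all documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


people = [
    {"name": "Ann", "age": 31, "city": "Pune"},
    {"name": "Raj", "age": 24, "city": "Delhi"},
    {"name": "Lee", "age": 45},
]
print(find(people, {"city": "Pune"}))
print(find(people, {"age": {"$gte": 30}}))
print(find(people, {"city": {"$exists": False}}))
```

The documents do not share a fixed schema ("Lee" has no `city`), which is why operators such as `$exists` are useful in a document store.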
COUCHDB
Written in: Erlang
Main point: DB consistency, ease of use
Bi-directional (!) replication, continuous or ad-hoc, with conflict detection; thus, master-master replication (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling
CouchApps (standalone JS apps)
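The MVCC and conflict-detection points above can be sketched as a store that never overwrites a document in place. Every write appends a new revision, a writer must name the revision it read, and a stale revision is rejected as a conflict. This is a toy in-memory model, not CouchDB's storage engine or HTTP API. The class `TinyMVCCStore` is hypothetical, and integer revisions are an assumption (CouchDB's real revision identifiers are strings).

```python
class ConflictError(Exception):
    """Raised when a write is based on an outdated revision."""


class TinyMVCCStore:
    def __init__(self):
        # doc_id -> list of revisions, oldest first; nothing is overwritten.
        self._history = {}

    def get(self, doc_id, rev=None):
        """Read the latest revision, or an older one if rev is given."""
        versions = self._history[doc_id]
        chosen = versions[-1] if rev is None else versions[rev - 1]
        return dict(chosen)  # return a copy so readers never see later writes

    def put(self, doc_id, body, rev=None):
        """Append a new revision; rev must be the current revision number."""
        versions = self._history.get(doc_id, [])
        current_rev = len(versions)  # 0 means the document does not exist yet
        if (rev or 0) != current_rev:
            raise ConflictError(
                f"{doc_id}: write based on rev {rev}, current is {current_rev}"
            )
        new_rev = current_rev + 1
        self._history[doc_id] = versions + [dict(body, _rev=new_rev)]
        return new_rev


store = TinyMVCCStore()
r1 = store.put("post", {"title": "draft"})
r2 = store.put("post", {"title": "final"}, rev=r1)

try:
    store.put("post", {"title": "late edit"}, rev=r1)  # stale revision
except ConflictError as err:
    print("conflict:", err)

print(store.get("post"))         # latest revision
print(store.get("post", rev=1))  # the previous version is still available
```

Because old revisions accumulate, a store built this way needs periodic compaction to reclaim space.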
HADOOP
Apache project
A framework that allows for the distributed processing of large
data sets across clusters of computers
Designed to scale up from single servers to thousands of machines
Designed to detect and handle failures at the application layer,
instead of relying on hardware for it
Created by Doug Cutting, who named it after his son's toy elephant
Hadoop-related Apache projects
Cassandra
HBase
Pig
Hive was a Hadoop subproject, but is now a top-level Apache project
HADOOP (contd.)
THANK YOU!