MongoDB
NoSQL DATABASES
A NoSQL database is a non-relational data management system that does not require a fixed schema. It avoids
joins and is easy to scale.
NoSQL databases are mainly used as distributed data stores for very large data volumes.
NoSQL is used for Big data and real-time web apps. For example, companies like Twitter, Facebook and Google
collect terabytes of user data every single day.
A traditional RDBMS uses SQL to store and retrieve data for further insights.
NoSQL, by contrast, covers a wide range of database technologies that can store structured,
semi-structured, unstructured and polymorphic data.
Brief History of NoSQL Databases
1998- Carlo Strozzi used the term NoSQL for his lightweight, open-source relational database
2000- Graph database Neo4j is launched
2004- Google BigTable is launched
2005- CouchDB is launched
2007- The research paper on Amazon Dynamo is released
2008- Facebook open-sources the Cassandra project
2009- The term NoSQL was reintroduced
Multi-Model, Easily Scalable, Big Data Applications, Redundancy and Zero Downtime, Distributed
Non-Relational
• NoSQL databases never follow the relational model
• Never provide tables with flat fixed-column records
• Work with self-contained aggregates or BLOBs
• Don't require object-relational mapping and data normalization
• No complex features like query languages, query planners, referential integrity, joins, ACID
Schema-free
• NoSQL databases are either schema-free or have relaxed schemas
• Do not require any definition of the schema of the data
• Allow heterogeneous structures of data in the same domain
Types of NoSQL
Distributed System Consistency | Eventual and Immediate Consistency | Immediate Consistency | Eventual and Immediate Consistency | Eventual and Immediate Consistency
Owner and Developer | MongoDB, Inc. | Apache Software Foundation | Apache Software Foundation | neo4j.com
Editions | Community (Free) and Enterprise | Community | Community with Option of Third-Party Support | Free source
Data Mining
Social Media Networking Sites
Software Development
• The CAP theorem, also known as Brewer’s theorem, is named for Consistency, Availability and Partition Tolerance
• In normal operations, your data store provides all three. But the CAP theorem maintains that when a
distributed database experiences a network failure, you can provide either consistency or availability, not both.
• It’s a tradeoff. All other times, all three can be provided. But, in the event of a network failure, a choice must be
made.
• In the theorem, partition tolerance is a must. The assumption is that the system operates on a distributed data store
so the system, by nature, operates with network partitions.
• Network failures will happen, so to offer any kind of reliable service, partition tolerance is necessary—the P of CAP.
• That leaves a decision between the other two, C and A. When a network failure happens, one can choose to
guarantee consistency or availability:
High consistency comes at the cost of lower availability.
High availability comes at the cost of lower consistency.
Installation
Creating database
Creating collection
Insert documents
Finding documents
Update documents
• Mongo shell
https://www.mongodb.com/try/download/community
© Kalasalingam academy of research and education
Step-2
JSON: “JavaScript Object Notation” (http://json.org/)
BSON: “Binary JSON” (http://bsonspec.org/)
Built on
Goals
• Lightweight
MONGODB TERMINOLOGIES FOR RDBMS CONCEPTS
RDBMS | MongoDB
Database | Database
Table, View | Collection
Row | Document (JSON, BSON)
Column | Field
Index | Index
Join | Embedded Document
Foreign Key | Reference
Partition | Shard
Data types: Integer, Boolean, String, Date, Object ID, Null, Arrays
• String: The most commonly used datatype for storing data. Strings in MongoDB must be valid UTF-8.
• Integer: Stores a numerical value. Integers can be 32-bit or 64-bit, depending on your server.
• Boolean: Stores a boolean (true/false) value.
• Double: Stores floating-point values.
• Min/Max keys: Used to compare a value against the lowest and highest BSON elements.
• Arrays: Stores arrays, lists, or multiple values under one key.
• Timestamp: Stores a timestamp. This can be handy for recording when a document has been modified or
added.
• Object: Used for embedded documents.
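As a rough illustration of the datatypes listed above, the following sketch builds one document as a plain JavaScript object. The field names and values are invented for the example, and it runs in Node.js without a MongoDB server. JavaScript has a single number type, so the Integer and Double fields both report as "number" here; how they are stored as BSON depends on the shell or driver.

```javascript
// Illustrative document; all field names and values are made up.
const student = {
  name: "Asha",                      // String
  age: 21,                           // Integer
  gpa: 8.7,                          // Double
  enrolled: true,                    // Boolean
  subjects: ["nosql", "maths"],      // Array
  address: { city: "Madurai" },      // Object (embedded document)
  joinedOn: new Date("2023-07-01"),  // Date
  middleName: null                   // Null
};

// Report the JavaScript-side type of a value.
const typeOf = (v) =>
  v === null ? "null" :
  Array.isArray(v) ? "array" :
  v instanceof Date ? "date" :
  typeof v;

for (const [field, value] of Object.entries(student)) {
  console.log(field, "->", typeOf(value));
}
```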
db.createCollection(name)
• Creates a collection; the name is passed as a string
Ex: db.createCollection("Stud")
Insert
Find
Update
Delete
• Read operations retrieve documents from a collection, i.e. they query a collection for documents.
Syntax: db.collection.find()
Ex: db.mca.find({})
db.mca.find({ name: "nosql" }).limit(5)
db.mca.find({ name: { $in: ["nosql", "maths"] } })
Although you can express this query using the $or operator, use the $in operator rather than the $or operator when
performing equality checks on the same field.
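To make the $in versus $or point concrete, here is a minimal in-memory sketch in plain JavaScript. The helpers matchIn and matchOr are hypothetical stand-ins written for this example, not MongoDB functions, and the sample data is invented. It only shows that the two filter shapes select the same documents when every clause is an equality check on the same field.

```javascript
// Made-up sample data standing in for the mca collection.
const mca = [{ name: "nosql" }, { name: "maths" }, { name: "java" }];

// Hypothetical in-memory stand-ins for the two operators (illustration only).
const matchIn = (doc, field, values) => values.includes(doc[field]);
const matchOr = (doc, clauses) =>
  clauses.some((clause) =>
    Object.entries(clause).every(([key, value]) => doc[key] === value));

// Equivalent of { name: { $in: ["nosql", "maths"] } }
const viaIn = mca.filter((d) => matchIn(d, "name", ["nosql", "maths"]));

// Equivalent of { $or: [ { name: "nosql" }, { name: "maths" } ] }
const viaOr = mca.filter((d) =>
  matchOr(d, [{ name: "nosql" }, { name: "maths" }]));

console.log(viaIn);
console.log(viaOr);
```

The $in form lists the field once, which keeps the filter shorter as the list of values grows.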
db.stud.find({name: /^n/})
Finds students whose name starts with n
db.stud.find({name: /n/})
Finds students whose name contains the letter n
db.collection.explain().find()
Returns information on the query plan for the find() operation
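The two regular expressions used in these queries can be tried on an ordinary array in plain JavaScript. This is only a sketch with invented sample names; it shows how the patterns behave, not how MongoDB evaluates them internally.

```javascript
// Made-up sample data standing in for the stud collection.
const stud = [{ name: "nisha" }, { name: "anil" }, { name: "ravi" }];

// /^n/ is anchored at the start: the name must begin with "n".
const startsWithN = stud.filter((d) => /^n/.test(d.name));

// /n/ is unanchored: "n" may appear anywhere in the name.
const containsN = stud.filter((d) => /n/.test(d.name));

console.log(startsWithN.map((d) => d.name));
console.log(containsN.map((d) => d.name));
```

Both patterns are case-sensitive as written, so a name beginning with an uppercase "N" would not match unless the i flag is added (for example /^n/i).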
To update a document, MongoDB provides update operators, such as $set, to modify field values.
db.collection.updateOne(<filter>, <update>, <options>)
db.collection.updateMany(<filter>, <update>, <options>)
db.collection.replaceOne(<filter>, <replacement>, <options>)
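The difference between a $set update and a replacement can be sketched in plain JavaScript. The functions applySet and applyReplace are hypothetical helpers written for this illustration and cover only top-level fields; the sample document is invented.

```javascript
// Made-up sample document.
const original = { _id: 1, name: "nosql", credits: 3 };

// $set-style update: change or add the listed fields and keep the rest.
const applySet = (doc, fields) => ({ ...doc, ...fields });

// replaceOne-style: the new document takes over; only _id is carried across.
const applyReplace = (doc, replacement) => ({ _id: doc._id, ...replacement });

const updated = applySet(original, { credits: 4 });
const replaced = applyReplace(original, { name: "maths" });

console.log(updated);  // name is kept, credits is changed
console.log(replaced); // credits is gone, because it was not in the replacement
```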
Delete operations remove documents from a collection. MongoDB provides the following methods to delete
documents of a collection:
• db.collection.deleteOne()
To delete at most a single document that matches a specified filter (even though multiple documents may match the
specified filter) use the db.collection.deleteOne() method.
• db.collection.deleteMany()
To delete all documents from a collection, pass an empty filter document {} to the db.collection.deleteMany()
method. To delete only the documents that match given criteria, pass a filter parameter to the deleteMany() method.
In MongoDB, delete operations target a single collection. All write operations in MongoDB are atomic on the level
of a single document.
You can specify criteria, or filters, that identify the documents to remove. These filters use the same syntax as read
operations.
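The contrast between the two delete methods can be sketched on an in-memory array. The deleteOne and deleteMany functions here are hypothetical plain-JavaScript helpers named after the MongoDB methods for readability; they take a predicate function instead of a filter document, and the data is invented.

```javascript
// Made-up sample data.
const docs = [{ sec: "A" }, { sec: "B" }, { sec: "A" }];

// Remove at most one document: the first one that matches.
const deleteOne = (list, matches) => {
  const i = list.findIndex(matches);
  return i === -1 ? list : [...list.slice(0, i), ...list.slice(i + 1)];
};

// Remove every document that matches.
const deleteMany = (list, matches) => list.filter((d) => !matches(d));

const inSecA = (d) => d.sec === "A";
const afterOne = deleteOne(docs, inSecA);
const afterMany = deleteMany(docs, inSecA);
const afterAll = deleteMany(docs, () => true); // like an empty filter {}

console.log(afterOne, afterMany, afterAll);
```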
• MongoDB uses indexing in order to make the query processing more efficient.
• Without an index, MongoDB must scan every document in the collection and return only those
documents that match the query.
• Indexes are special data structures that store a small part of each document's data in a form that makes it easy for
MongoDB to locate the matching documents.
• Index entries are ordered by the value of the field specified in the index.
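The idea behind the bullets above can be sketched in plain JavaScript. A sorted array of [key, position] pairs is used here as a simplified stand-in for a real index structure, and the data and function names are invented for the example. Because the entries are ordered by the indexed field, a lookup can use binary search instead of examining every document.

```javascript
// Made-up sample data; array positions stand in for document locations.
const people = [
  { name: "a", age: 30 },
  { name: "b", age: 22 },
  { name: "c", age: 41 },
  { name: "d", age: 22 }
];

// Without an index: examine every document.
function collectionScan(docs, age) {
  return docs.filter((d) => d.age === age);
}

// Simplified "index" on age: [key, position] pairs ordered by key.
const ageIndex = people
  .map((d, pos) => [d.age, pos])
  .sort((x, y) => x[0] - y[0]);

// Binary search for the first entry with key >= age, then walk the equal keys.
function indexLookup(index, docs, age) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (index[mid][0] < age) lo = mid + 1;
    else hi = mid;
  }
  const hits = [];
  for (let i = lo; i < index.length && index[i][0] === age; i++) {
    hits.push(docs[index[i][1]]);
  }
  return hits;
}

console.log(collectionScan(people, 22).map((d) => d.name));
console.log(indexLookup(ageIndex, people, 22).map((d) => d.name));
```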
Creating Index
MongoDB provides a method called createIndex() that lets you create an index.
Syntax - db.COLLECTION_NAME.createIndex({KEY:1})
KEY is the field on which the index is created, and 1 (or -1) sets the order in
which the index entries are arranged (ascending or descending).
db.mycol.createIndex({"age":1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
The createIndex() method also has a number of optional parameters. They are:
• background (Boolean)
• unique (Boolean)
• name (string)
• sparse (Boolean)
• expireAfterSeconds (integer)
• hidden (Boolean)
• storageEngine (Document)
db.COLLECTION_NAME.dropIndex({KEY:1})
Here, KEY is the name of the field whose index you want to remove. Instead of the index specification
document (above syntax), you can also specify the name of the index directly as:
dropIndex("name_of_the_index")
Example
db.mycol.dropIndex({"title":1})
{
"ok" : 0,
"errmsg" : "can't find index with key: { title: 1.0 }",
"code" : 27,
"codeName" : "IndexNotFound"
}
getIndexes() method
Returns the indexes on a collection
Syntax
• db.COLLECTION_NAME.getIndexes()
Example
• db.mycol.createIndex({"title":1,"description":-1})
Capped collections are fixed-size circular collections that follow the insertion order to support high performance for
create, read, and delete operations.
Circular means that once the fixed size allocated to the collection is exhausted, MongoDB starts removing the oldest
documents in the collection to make room for new ones, without any explicit command.
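The circular behaviour can be sketched with a small class in plain JavaScript. CappedList is a hypothetical name invented for this example, and the sketch limits the collection by document count to keep it short, whereas the text above describes a limit on allocated size.

```javascript
// Hypothetical in-memory stand-in for a capped collection (illustration only).
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs; // assumption: cap by number of documents
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, the oldest document is dropped automatically.
    if (this.docs.length > this.maxDocs) this.docs.shift();
  }

  // Documents come back in insertion order.
  find() {
    return [...this.docs];
  }
}

const log = new CappedList(3);
for (let seq = 1; seq <= 5; seq++) log.insert({ seq });

console.log(log.find()); // only the three most recent documents remain
```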
• Map-reduce is a data processing programming model that helps to perform operations on large data sets and produce
aggregated results.
• MongoDB provides the mapReduce() function to perform the map-reduce operations.
• It takes two main functions as arguments: a map function and a reduce function.
• The map function groups the data by key, emitting key-value pairs, and the reduce function performs
operations on the values mapped to each key.
• So, the data is mapped and reduced independently and then combined, and the result is saved to the
specified new collection.
• mapReduce() is generally used on large data sets.
• Using map-reduce you can perform aggregation operations such as max and avg on the data, grouped by some key. It is
similar to GROUP BY in SQL.
DATASET
{"id":1, "sec":"A", "marks":80}
{"id":2, "sec":"A", "marks":90}
{"id":1, "sec":"B", "marks":99}
{"id":1, "sec":"B", "marks":95}
{"id":1, "sec":"C", "marks":90}
A dataset of 5 documents is added to the database.
MAP
var map = function() {emit(this.sec, this.marks);};
Inside the map function, we call emit(this.sec, this.marks), which emits the sec and marks of each
record (document).
REDUCE
var reduce = function(sec, marks) {return Math.max.apply(null, marks);};
The reduce function is where the actual aggregation of data takes place.
• We have reduced the records; now we output them into a new collection using {out: "collectionName"}.
db.collectionName.mapReduce(map, reduce, {out: "collectionName"});
• In the above query, map and reduce are the functions already defined.
• To check the result, query the newly created collection with db.collectionName.find(), which returns:
{"_id":"A", "value":90}
{"_id":"B", "value":99}
{"_id":"C", "value":90}
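The same computation can be reproduced without a database as a rough sketch of what the map and reduce steps do. The mapReduce function here is a hypothetical plain-JavaScript helper, not MongoDB's implementation. As an assumption of the sketch, emit is passed to the map function as an argument, whereas in the MongoDB example it is called directly. The key field of each result is named _id, as in MongoDB's map-reduce output.

```javascript
// The five documents from the dataset.
const dataset = [
  { id: 1, sec: "A", marks: 80 },
  { id: 2, sec: "A", marks: 90 },
  { id: 1, sec: "B", marks: 99 },
  { id: 1, sec: "B", marks: 95 },
  { id: 1, sec: "C", marks: 90 }
];

// Hypothetical in-memory map-reduce: group emitted values by key, then reduce.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Call map with `this` bound to each document, as in the original example.
  for (const doc of docs) map.call(doc, emit);
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: reduce(key, values)
  }));
}

const map = function (emit) { emit(this.sec, this.marks); };
const reduce = (sec, marks) => Math.max(...marks);

const result = mapReduce(dataset, map, reduce);
console.log(result); // highest marks per section
```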