Mongo Lesson2

The document discusses MongoDB, including what it is, how it stores data as JSON and BSON documents, its key features like secondary indexing, replication, auto-sharding, and aggregation, and how it can easily scale out by adding additional servers.

Class 2 – Introduction to MongoDB and its key features

AGENDA
• What is MongoDB?
• Overview of MongoDB.
• MongoDB’s key features
• MongoDB’s core server and tools.
• Installing MongoDB
• Getting and Starting MongoDB
• Running the Shell
• Use cases and production deployment
• Data Types, Schema Design and Data Modelling
MongoDB
What is MongoDB?
• MongoDB is a document-oriented database.
• MongoDB replaces the concept of a “row” with a more flexible model, the “document.”
• MongoDB stores data as BSON (a binary form of JSON).
• The document-oriented approach makes it possible to represent complex hierarchical relationships with a single record.
• This approach fits naturally with the way developers in modern object-oriented languages think about their data.
• MongoDB is also schema-free: a document’s keys are not predefined or fixed in any way.
• This gives developers a lot of flexibility in how they work with evolving data models.
What is JSON?
• JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write.
• JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others.
• JSON supports all the basic data types you’d expect: numbers, strings, and boolean values, as well as arrays and hashes (objects).
• Document databases such as MongoDB use JSON documents to store records, just as tables and rows store records in a relational database.
• A JSON database returns query results that can be easily parsed, with little or no transformation, directly by JavaScript and most popular programming languages, reducing the amount of logic you need to build into your application layer.
What is JSON?
Here is an example of a JSON document:
{
  "_id" : 1,
  "name" : "TeckEureka",
  "customers" : [ "Genpact", "Accenture", "Wipro", "WNS" ],
  "courses" : [
    { "name" : "Data Science with SAS",
      "domain" : "Statistics and Analytics"
    },
    { "name" : "Big Data Specialization",
      "domain" : "Hadoop eco-system analytics"
    }
  ]
}
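As a minimal sketch of the "easily parsed, with little or no transformation" point, the snippet below parses the example document with JavaScript's built-in JSON.parse. It runs in plain Node.js and does not talk to a MongoDB server; the variable names are illustrative only.

```javascript
// The example document from the slide, held as a JSON string.
const text = `{
  "_id": 1,
  "name": "TeckEureka",
  "customers": ["Genpact", "Accenture", "Wipro", "WNS"],
  "courses": [
    { "name": "Data Science with SAS", "domain": "Statistics and Analytics" },
    { "name": "Big Data Specialization", "domain": "Hadoop eco-system analytics" }
  ]
}`;

// JSON.parse turns the text into a native object with no extra mapping layer.
const doc = JSON.parse(text);

// Nested arrays and sub-documents are reached with ordinary property access.
const courseNames = doc.courses.map((course) => course.name);

console.log(doc.customers.length); // 4
console.log(courseNames.join(", ")); // Data Science with SAS, Big Data Specialization
```

The same object could be handed to application code directly, which is the reduction in application-layer logic the slide refers to.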
Why does MongoDB store data as BSON?
• MongoDB represents JSON documents in a binary-encoded format called BSON.
• BSON is a binary-encoded serialization of JSON-like documents.
• BSON supports the embedding of documents and arrays within other documents and arrays.
• BSON also contains extensions that allow representation of data types that are not part of the JSON spec, such as the BinData and Date types.
• BSON is designed to be lightweight, traversable, and efficient.
• MongoDB can even "reach inside" BSON objects to build indexes and match objects against query expressions on both top-level and nested BSON keys.
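To illustrate what matching on a nested key means, here is a rough plain-JavaScript sketch. The helpers getPath and matches are hypothetical names invented for this example, the sample users are made up, and the logic handles only nested objects with equality tests. MongoDB's real matching rules (arrays, operators, indexes) are richer and are not modeled here.

```javascript
// Hypothetical helper: resolve a dot-separated path such as "roles.trainer"
// against a plain nested object. Arrays get no special treatment.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && typeof value === "object" ? value[key] : undefined),
    doc
  );
}

// Hypothetical helper: true when every path in `query` equals the stored value.
function matches(doc, query) {
  return Object.entries(query).every(([path, expected]) => getPath(doc, path) === expected);
}

// Made-up sample documents for the sketch.
const users = [
  { first: "Sarita", roles: { trainer: "Business analytics" } },
  { first: "Jim", roles: { trainer: "Big Data" } },
];

const found = users.filter((u) => matches(u, { "roles.trainer": "Business analytics" }));
console.log(found.map((u) => u.first)); // [ 'Sarita' ]
```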
MongoDB Structure
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
Document Store
> db.user.findOne({"first": "Sarita"})
{
  "_id" : ObjectId("6633e0bd72…"),
  "first" : "Sarita",
  "last" : "Digumarti",
  "Specialization" : [
    "FMCG",
    "retail",
    "healthcare"
  ],
  "roles" : {
    "trainer" : "Business analytics",
    "management" : "co-founder"
  }
}
MongoDB is Easy to Use
{
  title: 'MongoDB',
  contributors: [
    { name: 'Eliot Horowitz',
      email: '[email protected]' },
    { name: 'Dwight Merriman',
      email: '[email protected]' }
  ],
  model: {
    relational: false,
    awesome: true
  }
}
Transaction management in MongoDB

MySQL:
START TRANSACTION;
INSERT INTO contacts VALUES
  (NULL, 'joeblow');
INSERT INTO contact_emails VALUES
  ( NULL, '[email protected]', LAST_INSERT_ID() ),
  ( NULL, '[email protected]', LAST_INSERT_ID() );
COMMIT;

MongoDB:
db.contacts.save( {
  userName: "joeblow",
  emailAddresses: [
    "[email protected]",
    "[email protected]" ] } );
Schema Free

• MongoDB does not need any pre-defined data schema
• Every document could have different data!

{name: "Martin", eyes: "blue", birthplace: "California", aliases: ["Marty", "Mark"], loc: [31.7, 53.4], boss: "bill"}
{name: "Venus", hat: "yes"}
{name: "Jim", eyes: "blue", loc: [43.2, 73.4], boss: "bill"}
{name: "Michael", pizza: "DiGiorno", height: 56, loc: [44.6, 71.3]}
{name: "Mike", aliases: ["el diablo"]}
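One practical consequence of schema-free documents is that application code has to tolerate missing fields. The sketch below is plain JavaScript over an in-memory array standing in for a collection (an assumption for illustration); it uses three of the slide's documents and shows a default for an absent field.

```javascript
// Documents with different "shapes", as they might sit in one collection.
const people = [
  { name: "Jim", eyes: "blue", loc: [43.2, 73.4], boss: "bill" },
  { name: "Venus", hat: "yes" },
  { name: "Mike", aliases: ["el diablo"] },
];

// No schema guarantees a field exists, so readers supply their own defaults.
const aliasCounts = people.map((p) => ({
  name: p.name,
  aliases: (p.aliases || []).length,
}));

// Collect every key that appears in at least one document.
const allKeys = [...new Set(people.flatMap((p) => Object.keys(p)))].sort();

console.log(aliasCounts[2]); // { name: 'Mike', aliases: 1 }
console.log(allKeys.join(",")); // aliases,boss,eyes,hat,loc,name
```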
Easy Scaling
• Data set sizes for applications are growing at an incredible pace.
• This growth is driven by advances in sensor technology, increases in available bandwidth, and the popularity of handheld devices.
• Even small-scale applications now need to store more data than many databases were meant to handle.
• A terabyte of data, once an unheard-of amount of information, is now commonplace.
• Developers face a difficult decision: how should they scale their databases?
Scale Up or Scale Out?
Scaling Up
• Scaling a database comes down to the choice between scaling up (getting a bigger machine) or scaling out (partitioning data across more machines).
• Scaling up is often the path of least resistance, but it has drawbacks: large machines are often very expensive, and eventually a physical limit is reached where a more powerful machine cannot be purchased at any cost.
• For a large web application, it is either impossible or not cost-effective to run on one machine.
• Scaling out is both extensible and economical: to add storage space or increase performance, you add another server to the cluster.
• MongoDB was designed from the beginning to scale out.
• Its document-oriented data model allows it to automatically split up data across multiple servers.
Scaling Out
• MongoDB was designed from the beginning to scale out.
• Its document-oriented data model allows it to automatically split up data across multiple servers.
• It can balance data and load across a cluster, redistributing documents automatically.
• This allows developers to focus on programming the application, not scaling it.
• When they need more capacity, they can just add new machines to the cluster and let the database figure out how to organize everything.
MongoDB Features
• Ad hoc queries
• Secondary indexes
• Replication
• Auto-sharding
• Querying
• Fast in-place updates
• Aggregation
• Server-side JavaScript
• Capped collections
Secondary Indexes
• Secondary indexes in MongoDB are implemented as B-trees.
• B-tree indexes are optimized for a variety of queries, including range scans and queries with sort clauses.
• By permitting multiple secondary indexes, MongoDB allows users to optimize for a wide variety of queries.
• With MongoDB, you can create up to 64 indexes per collection.
• Ascending, descending, unique, compound-key, and even geospatial indexes are supported.
Replication
• MongoDB provides database replication via a topology known as a replica set.
• Replica sets distribute data across machines for redundancy and automate failover in the event of server and network outages.
• Additionally, replication is used to scale database reads. If you have a read-intensive application, as is commonly the case on the web, it’s possible to spread database reads across machines in the replica set cluster.
• Replica sets consist of exactly one primary node and one or more secondary nodes.
• A replica set’s primary node can accept both reads and writes, but the secondary nodes are read-only.
Replication
• “A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource.”
• mmap()
Replica Sets
• Redundancy and failover
• Zero downtime for upgrades and maintenance
• Master-slave replication
• Strong consistency
• Delayed consistency
• Geospatial features
[Diagram: Host1:10000, Host2:10001, Host3:10002, Client]
Sharding
• Partition your data
• Scale write throughput
• Increase capacity
• Auto-balancing
[Diagram: Host1:10000, Host2:10010, Host3:20000, Host4:30000, Client]
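The idea of partitioning data by a key can be sketched in a few lines of plain JavaScript. This is a toy model under stated assumptions: toyHash and pickShard are hypothetical helpers, the shard names and user names are made up, and the hash is a simple character-code sum. MongoDB's actual chunking and auto-balancing are more involved and are not modeled here.

```javascript
// Toy hash: sum of character codes. It only needs to be deterministic
// for this sketch and is not how MongoDB hashes shard keys.
function toyHash(key) {
  let sum = 0;
  for (const ch of String(key)) sum += ch.charCodeAt(0);
  return sum;
}

// Hypothetical router: pick a shard from the hash of the shard key.
function pickShard(shardKey, shards) {
  return shards[toyHash(shardKey) % shards.length];
}

// Made-up shard names and documents.
const shards = ["shardA", "shardB", "shardC"];
const docs = [
  { userName: "joeblow" },
  { userName: "sarita" },
  { userName: "jim" },
  { userName: "venus" },
];

// Group documents by the shard that would own them.
const placement = {};
for (const doc of docs) {
  const shard = pickShard(doc.userName, shards);
  (placement[shard] = placement[shard] || []).push(doc.userName);
}

console.log(placement);
```

Because the shard is derived from the key alone, the same key always routes to the same shard, so a client-facing router can locate a document without scanning every machine.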
MapReduce
Collection and Database
• A collection is a group of documents.
• Collections are schema-free. This means that the documents within a single collection can have any number of different “shapes.”
• Keeping different kinds of documents in the same collection can be a nightmare for developers and admins.
• Grouping documents of the same kind together in the same collection allows for data locality.
• MongoDB groups collections into databases. A single instance of MongoDB can host several databases, each of which can be thought of as completely independent.
• A good rule of thumb is to store all data for a single application in the same database.
Schema Design
MongoDB’s collections do not enforce document structure

This flexibility facilitates the mapping of documents to an entity or an object.


Each document can have different keys and types

In practice, however, the documents in a collection share a similar structure.

The key challenge in data modeling is balancing the needs of the application, the
performance characteristics of the database engine, and the data retrieval patterns.

When designing data models, always consider the application usage of the data (i.e.
queries, updates, and processing of the data) as well as the inherent structure of the
data itself.
Reference Data Model
• The key decision in designing data models for MongoDB applications revolves around the structure of documents and how the application represents relationships between data.
• There are two tools that allow applications to represent these relationships: references and embedded documents.
• References: References store the relationships between data by including links or references from one document to another.
• Applications can resolve these references to access the related data. Broadly, these are normalized data models.
Reference Data Model
MobilePhones
{
  _id: <ObjectId1>,
  Phone_ids: ["12", "13"]
}

SMART
{
  _id: "12",
  Name: "Apple",
  "Camera type": "13MP"
}

Normal
{
  _id: "13",
  Name: "SamSung",
  "Bluetooth": "Yes"
}
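Resolving references is left to the application, which can be sketched roughly as follows in plain JavaScript. The Map stands in for a collection of phone documents, resolveReferences is a hypothetical helper, and the "catalog-1" _id is a placeholder for the slide's <ObjectId1>; none of this is a MongoDB API.

```javascript
// Sample data modeled on the slide; the Map stands in for a phones
// collection keyed by _id (an assumption made for this sketch).
const phones = new Map([
  ["12", { _id: "12", Name: "Apple", "Camera type": "13MP" }],
  ["13", { _id: "13", Name: "SamSung", Bluetooth: "Yes" }],
]);

// The referencing document stores only the _id values of the related phones.
// "catalog-1" is a placeholder _id, not a real ObjectId.
const mobilePhones = { _id: "catalog-1", Phone_ids: ["12", "13"] };

// Hypothetical helper: follow each stored _id to the document it points at.
function resolveReferences(parent, collection) {
  return parent.Phone_ids.map((id) => collection.get(id));
}

const resolved = resolveReferences(mobilePhones, phones);
console.log(resolved.map((p) => p.Name)); // [ 'Apple', 'SamSung' ]
```

Each lookup here corresponds to an extra query in a real deployment, which is the cost a normalized model trades for avoiding duplicated data.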
Embedded Data Model
• Embedded documents capture relationships between data by storing related data in a single document structure.
• MongoDB documents make it possible to embed document structures in a field or array within a document.
• Grouping documents of the same kind together in the same collection allows for data locality.
• In general, use embedded data models when there are one-to-one or one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.
• In general, embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation.
• Embedded data models make it possible to update related data in a single atomic write operation.
Embedded Data Model
{
  productdetail: [
    {
      _id: 12,
      Name: "Apple",
      Price: "45,000"
    },
    {
      _id: 13,
      Name: "SamSung",
      Price: "40,000"
    }
  ]
}
Data Types
• Null: Null can be used to represent both a null value and a nonexistent field:
{"x" : null}
• Undefined: Undefined can be used in documents as well (JavaScript has distinct types for null and undefined):
{"x" : undefined}
• Boolean: There is a boolean type, which is used for the values true and false:
{"x" : true}
• 32-bit integer: A plain number typed in the shell cannot represent this type; the shell's NumberInt() wrapper is needed to store one.
• 64-bit integer: Again, a plain shell number cannot represent these; the shell's NumberLong() wrapper is needed to store one.
• 64-bit floating point number: All plain numbers in the shell are of this type.
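The reason a JavaScript-based shell cannot hold every 64-bit integer in a plain number can be seen in a few lines of ordinary JavaScript. This sketch uses only standard language features; BigInt appears purely for comparison and is not a BSON type.

```javascript
// JavaScript numbers are IEEE 754 doubles, so integers are exact only up to
// Number.MAX_SAFE_INTEGER (2^53 - 1).
const largestExact = Number.MAX_SAFE_INTEGER;

// 2^53 + 1 cannot be stored exactly as a double; it rounds to 2^53.
const rounded = 2 ** 53 + 1;

// BigInt (plain JavaScript, not a BSON type) keeps the full integer value.
const exact = 2n ** 53n + 1n;

console.log(largestExact); // 9007199254740991
console.log(rounded === 2 ** 53); // true
console.log(exact.toString()); // 9007199254740993
```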
Data Types
• Maximum value: BSON contains a special type representing the largest possible value. The shell does not have a type for this.
• Minimum value: BSON contains a special type representing the smallest possible value. The shell does not have a type for this.
• ObjectId: ObjectIds are small, likely unique, fast to generate, and ordered. These values consist of 12 bytes, where the first four bytes are a timestamp that reflects the ObjectId’s creation.
• String: BSON strings are UTF-8.
• Symbol: This type is not supported by the shell.
• Timestamp: BSON has a special timestamp type for internal MongoDB use; it is not associated with the regular Date type.
• Date: A BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970).
• Regular expression: Documents can contain regular expressions, using JavaScript’s regular expression syntax:
{"x" : /learn/i}
Data Types
• Code: Documents can also contain JavaScript code:
{"x" : function() { /* ... */ }}
• Binary data: Binary data is a string of arbitrary bytes. It cannot be manipulated from the shell.
• Array: Sets or lists of values can be represented as arrays:
{"courses" : ["Big Data Specialist", "Data Science with R", "HR Analytics"]}
• Embedded document: Documents can contain entire documents, embedded as values in a parent document:
{"course_duration" : {"Big Data Specialist" : "72 Hrs"}}
Core Servers
• Core server: The core database server runs via an executable called mongod.
• Config server: mongod can also run as a config server in a sharded architecture to hold the cluster metadata.
• Routing server: mongos, for “MongoDB Shard,” is a routing service for MongoDB shard configurations.
MongoDB’s Tools
• The JavaScript shell: The MongoDB command shell is a JavaScript-based tool for administering the database and manipulating data.
• Database drivers: C, C++, C#, Erlang, Haskell, Java, Perl, PHP, Python, Scala, and Ruby
• Command-line tools:
mongodump and mongorestore
mongoexport and mongoimport
mongosniff
mongostat
Installing MongoDB on Linux 64 bit
Step 1: Download the binary files for the desired release of MongoDB. Download the binaries from
https://www.mongodb.org/downloads.
For example, to download release 3.0.2 through the shell, issue the following:
$ curl -O http://downloads.mongodb.org/linux/mongodb-linux-x86_64-3.0.2.tgz
Step 2: Extract the files from the downloaded archive.
$ tar -zxvf mongodb-linux-x86_64-3.0.2.tgz
Step 3: Copy the extracted archive to the target directory.
$ mkdir -p mongodb
$ cp -R -n mongodb-linux-x86_64-3.0.2/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB binaries are in the bin/ directory of the archive. To ensure that the binaries are in your PATH, you can modify your PATH.
Installing MongoDB on Windows
On Windows, MongoDB can be installed using the MongoDB .msi file. Double-click the file and a set of screens will guide you through the installation process.
OR
You may install MongoDB on Windows from the command line using msiexec.exe:
msiexec.exe /q /i mongodb-<version>-signed.msi INSTALLLOCATION="<installation directory>"
Starting MongoDB on Linux
Create the data directory
$ mkdir -p /data/db
Set permissions for the data directory
$ chmod -R 777 /data/db
To start the MongoDB server, run the mongod executable
$ ./mongod
To start the MongoDB shell:
$ ./mongo
Starting MongoDB on Windows
Create the data directory:
> md \data\db
To start the MongoDB server, run the mongod executable
> .\mongod.exe
To start the MongoDB shell:
> .\mongo.exe
Use cases
• Personalization
• Mobile
• Internet of Things
• Real-time analytics
• Web applications
• Content management
• Catalog
• Single view
Customers
RECAP

MongoDB key features
