Mongo Lesson2
MongoDB replaces the concept of a “row” with a more flexible model, the
“document.”
This approach fits naturally with the way developers in modern object-oriented
languages think about their data.
MongoDB is also schema-free: a document’s keys are not predefined or fixed in any
way.
This gives developers a lot of flexibility in how they work with evolving data models.
MongoDB
What is JSON?
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It
is easy for humans to read and write.
JSON is a text format that is completely language independent but uses conventions
that are familiar to programmers of the C-family of languages, including C, C++, C#,
Java, JavaScript, Perl, Python, and many others.
JSON supports all the basic data types you’d expect: numbers, strings, and boolean
values, as well as arrays and hashes (called objects in JSON).
A JSON database returns query results that can be parsed easily, with little or no
transformation, directly by JavaScript and most popular programming languages,
reducing the amount of logic you need to build into your application layer.
Here is an example of a JSON document:
{
  "_id" : 1,
  "name" : "TeckEureka",
  "customers" : [ "Genpact", "Accenture", "Wipro", "WNS" ],
  "courses" : [
    { "name" : "Data Science with SAS",
      "domain" : "Statistics and Analytics"
    },
    { "name" : "Big Data Specialization",
      "domain" : "Hadoop eco-system analytics"
    }
  ]
}
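As a minimal sketch of how directly such a document maps onto a program's own data structures, the JavaScript below parses a cleaned-up copy of the example with the standard JSON.parse function. The variable names are illustrative only.

```javascript
// A cleaned-up copy of the example document, held as JSON text.
const text = `{
  "_id": 1,
  "name": "TeckEureka",
  "customers": ["Genpact", "Accenture", "Wipro", "WNS"],
  "courses": [
    { "name": "Data Science with SAS", "domain": "Statistics and Analytics" },
    { "name": "Big Data Specialization", "domain": "Hadoop eco-system analytics" }
  ]
}`;

// JSON.parse turns the text into plain JavaScript objects and arrays,
// with no extra mapping layer in between.
const doc = JSON.parse(text);

// Nested values are reached with ordinary property and array access.
const customerCount = doc.customers.length;
const courseNames = doc.courses.map((c) => c.name);

console.log(customerCount);
console.log(courseNames.join(", "));
```

Going the other way, JSON.stringify(doc) produces the text form again.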
Why does MongoDB store data as BSON?
MongoDB represents JSON documents in a binary-encoded format called BSON.
BSON supports the embedding of documents and arrays within other documents
and arrays.
BSON also contains extensions that allow representation of data types that are not
part of the JSON spec, such as the BinData and Date data types.
MongoDB can even "reach inside" BSON objects to build indexes and match objects
against query expressions on both top-level and nested BSON keys.
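The idea of "reaching inside" a document can be sketched in plain JavaScript. The helpers getByPath and findByPath below are hypothetical, written only for illustration; they are not MongoDB functions, and MongoDB's real matching rules (for example, how arrays are matched) are richer than this. The sample data is made up.

```javascript
// Hypothetical helper (not part of MongoDB): follow a dotted path such as
// "roles.trainer" or "a.b.1" through nested objects and arrays.
function getByPath(doc, path) {
  let current = doc;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") {
      return undefined; // the path runs past a scalar or a missing field
    }
    current = current[key];
  }
  return current;
}

// Hypothetical helper: keep the documents whose value at `path` equals `value`.
function findByPath(docs, path, value) {
  return docs.filter((d) => getByPath(d, path) === value);
}

// Made-up sample documents with a nested "roles" sub-document.
const people = [
  { first: "Sarita", roles: { trainer: "Business analytics" } },
  { first: "Asha", roles: { trainer: "Hadoop" } },
];

const matches = findByPath(people, "roles.trainer", "Hadoop");
console.log(matches.map((d) => d.first));
```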
MongoDB Structure
RDBMS           MongoDB
Database        Database
Table, View     Collection
Row             Document (JSON, BSON)
Column          Field
Index           Index
Join            Embedded Document
Foreign Key     Reference
Partition       Shard
Document Store
> db.user.findOne({"first": "Sarita"})
{
  "_id" : ObjectId("6633e0bd72…"),
  "first" : "Sarita",
  "last" : "Digumarti",
  "Specialization" : [
    "FMCG",
    "retail",
    "healthcare"
  ],
  "roles" : {
    "trainer" : "Business analytics",
    "management" : "co-founder"
  }
}
MongoDB is Easy to Use
{
  title: 'MongoDB',
  contributors: [
    { name: 'Eliot Horowitz',
      email: '[email protected]' },
    { name: 'Dwight Merriman',
      email: '[email protected]' }
  ],
  model: {
    relational: false,
    awesome: true
  }
}
Transaction management in MongoDB
MySQL:
START TRANSACTION;
INSERT INTO contacts VALUES
  (NULL, 'joeblow');
INSERT INTO contact_emails VALUES
  (NULL, '[email protected]', LAST_INSERT_ID()),
  (NULL, '[email protected]', LAST_INSERT_ID());
COMMIT;

MongoDB:
db.contacts.save( {
  userName: "joeblow",
  emailAddresses: [
    "[email protected]",
    "[email protected]" ] } );
Schema Free
Even small-scale applications now need to store more data than many databases
were meant to handle.
Scaling up is often the path of least resistance, but it has drawbacks: large machines
are often very expensive, and eventually a physical limit is reached where a more
powerful machine cannot be purchased at any cost.
For large web applications, it is either impossible or not cost-effective to run on
one machine.
Scaling out is both extensible and economical: to add storage space or increase
performance, you add machines to the cluster.
MongoDB can balance data and load across a cluster, redistributing documents automatically.
This allows developers to focus on programming the application, not scaling it.
When they need more capacity, they can just add new machines to the cluster and
let the database figure out how to organize everything.
MongoDB Features
Ad hoc queries
Secondary Indexes
Replication
Auto-Sharding
Querying
Aggregation
Capped Collection
Secondary Indexes
Secondary indexes in MongoDB are implemented as B-trees.
B-tree indexes are optimized for a variety of queries, including range scans and
queries with sort clauses.
By permitting multiple secondary indexes, MongoDB allows users to optimize for a
wide variety of queries.
With MongoDB, you can create up to 64 indexes per collection.
Ascending, descending, unique, compound-key, and even geospatial indexes are
supported.
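To see why an ordered index helps range scans and sorted results, here is a minimal sketch that uses a sorted array with binary search as a simplified stand-in for a B-tree. It is not how MongoDB stores its indexes; the function names and the sample data are assumptions made for illustration.

```javascript
// Build a toy "index": [key, position] pairs sorted by key (ascending).
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => a[0] - b[0]);
}

// Binary search: position of the first index entry whose key is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range scan: start at `min` and walk the sorted entries while keys stay <= `max`.
// Results come back already ordered by the indexed field, so no extra sort is needed.
function rangeScan(docs, index, min, max) {
  const out = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] <= max; i++) {
    out.push(docs[index[i][1]]);
  }
  return out;
}

// Made-up sample collection, deliberately stored out of price order.
const items = [
  { name: "C", price: 300 },
  { name: "A", price: 100 },
  { name: "D", price: 400 },
  { name: "B", price: 200 },
];
const priceIndex = buildIndex(items, "price");
const midRange = rangeScan(items, priceIndex, 150, 350).map((p) => p.name);
console.log(midRange);
```

Without the index, the same query would have to inspect every document and then sort the matches.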
Replication
MongoDB provides database replication via a topology known as a replica set.
Replica sets distribute data across machines for redundancy and automate failover
in the event of server and network outages.
Additionally, replication is used to scale database reads. If you have a read-intensive
application, as is commonly the case on the web, it’s possible to spread database
reads across machines in the replica set cluster.
Replica sets consist of exactly one primary node and one or more secondary nodes.
A replica set’s primary node can accept both reads and writes, but the secondary
nodes are read-only.
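The roles described above can be pictured with a toy in-memory model. The class ToyReplicaSet is purely hypothetical and greatly simplified: real replication is asynchronous and failover involves an election among the members, neither of which is modelled here.

```javascript
// Toy in-memory model (an assumption for illustration, not MongoDB's protocol).
class ToyReplicaSet {
  constructor(secondaryCount) {
    this.primary = []; // the one node that accepts writes
    this.secondaries = Array.from({ length: secondaryCount }, () => []);
    this.nextReader = 0;
  }

  // Writes go to the primary only, then are copied to every secondary.
  // (Copied immediately here for simplicity.)
  write(doc) {
    this.primary.push(doc);
    for (const s of this.secondaries) s.push({ ...doc });
  }

  // Reads are spread round-robin across the read-only secondaries.
  readAll() {
    const node = this.secondaries[this.nextReader];
    this.nextReader = (this.nextReader + 1) % this.secondaries.length;
    return node.slice();
  }

  // Failover: promote the first secondary when the primary is lost.
  failover() {
    this.primary = this.secondaries.shift();
    if (this.nextReader >= this.secondaries.length) this.nextReader = 0;
  }
}

const rs = new ToyReplicaSet(2);
rs.write({ _id: 1, name: "first" });
rs.write({ _id: 2, name: "second" });
const beforeFailover = rs.readAll().length;
rs.failover();
const afterFailover = rs.primary.length;
console.log(beforeFailover, afterFailover);
```

Because every secondary holds a full copy, the data survives the loss of the primary and the set keeps serving reads.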
A memory-mapped file is “a segment of virtual memory which has been assigned
a direct byte-for-byte correlation with some portion of a file or file-like resource.”
On POSIX systems, such a mapping is created with mmap().
Replica Sets
[Figure: replica set diagram showing a Client and the members Host1:10000, Host2:10010, Host3:20000, and Host4:30000]
MapReduce
Collection and Database
A collection is a group of documents.
Collections are schema-free. This means that the documents within a single
collection can have any number of different “shapes.”
Keeping different kinds of documents in the same collection can be a nightmare for
developers and admins.
Grouping documents of the same kind together in the same collection allows for
data locality.
MongoDB groups collections into databases. A single instance of MongoDB can host
several databases, each of which can be thought of as completely independent.
A good rule of thumb is to store all data for a single application in the same
database.
Schema Design
MongoDB’s collections do not enforce document structure.
The key challenge in data modeling is balancing the needs of the application, the
performance characteristics of the database engine, and the data retrieval patterns.
When designing data models, always consider how the application uses the data (i.e.,
queries, updates, and processing of the data) as well as the inherent structure of the
data itself.
Reference Data Model
The key decision in designing data models for MongoDB applications
revolves around the structure of documents and how the application
represents relationships between data.
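[Figure: reference data model. Recoverable fragments: a document with _id: "13" and Name: "SamSung"; a label reading "Normal"; and a "MobilePhones" entry with "Bluetooth": "Yes".]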
Embedded Data Model
Embedded documents capture relationships between data by storing
related data in a single document structure.
MongoDB documents make it possible to embed document structures in a
field or array within a document.
Grouping documents of the same kind together in the same collection
allows for data locality.
In general, use embedded data models when you have one-to-one or one-to-many
relationships between entities. In these relationships the “many” or child
documents always appear with or are viewed in the context of the “one”
or parent documents.
In general, embedding provides better performance for read operations,
as well as the ability to request and retrieve related data in a single
database operation.
Embedded data models make it possible to update related data in a single
atomic write operation.
{
  productdetail: [
    {
      _id: 12,
      Name: "Apple",
      Price: "45,000"
    },
    {
      _id: 13,
      Name: "SamSung",
      Price: "40,000"
    }
  ]
}
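The trade-off between the reference and embedded models can be sketched in plain JavaScript, with a Map standing in for a separate collection. The "order" documents and all variable names here are hypothetical, chosen only to give the product data a parent; they do not come from the slides.

```javascript
// Reference model: products live in their own "collection" (a Map here)
// and the parent document stores only their _id values.
const productCollection = new Map([
  [12, { _id: 12, Name: "Apple", Price: "45,000" }],
  [13, { _id: 13, Name: "SamSung", Price: "40,000" }],
]);
const orderWithRefs = { _id: 1, productIds: [12, 13] };

// Resolving references costs one extra lookup per referenced document.
let lookups = 0;
const resolved = orderWithRefs.productIds.map((id) => {
  lookups += 1;
  return productCollection.get(id);
});

// Embedded model: the related data is stored inside the parent document,
// so reading the parent returns everything in one step.
const orderEmbedded = {
  _id: 1,
  productdetail: [
    { _id: 12, Name: "Apple", Price: "45,000" },
    { _id: 13, Name: "SamSung", Price: "40,000" },
  ],
};

const namesViaRefs = resolved.map((p) => p.Name);
const namesEmbedded = orderEmbedded.productdetail.map((p) => p.Name);
console.log(lookups, namesViaRefs, namesEmbedded);
```

Both models yield the same names; the difference is the number of reads and whether the related data can be changed together in a single write.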
Data Types
Null: Null can be used to represent both a null value and a nonexistent
field:
{"x" : null}
Boolean: There is a boolean type, which is used for the values 'true'
and 'false':
{"x" : true}
64-bit floating point number: By default, all numbers in the shell are of this type.
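Since the shell is JavaScript, its numbers behave like any JavaScript number. The short sketch below shows a few consequences of storing everything as a 64-bit float; the variable names are illustrative.

```javascript
// JavaScript numbers are IEEE 754 64-bit floating point values.
const wholeLooksInteger = Number.isInteger(3); // true: 3 is stored as 3.0
const sumIsExact = 0.1 + 0.2 === 0.3;          // false: binary rounding error

// Integers are exact only up to 2^53; beyond that, neighbours collapse.
const big = 2 ** 53;
const precisionLost = big + 1 === big;         // true
const maxSafe = Number.MAX_SAFE_INTEGER;       // 2^53 - 1

console.log(wholeLooksInteger, sumIsExact, precisionLost, maxSafe);
```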
Maximum value: BSON contains a special type representing the largest possible
value. The shell does not have a type for this.
Minimum value: BSON contains a special type representing the smallest possible
value. The shell does not have a type for this.
ObjectId: ObjectIds are small, likely unique, fast to generate, and ordered. Each
value consists of 12 bytes, where the first four bytes are a timestamp that reflects
the ObjectId’s creation time.
String: BSON strings are UTF-8.
Symbol: This type is not supported by the shell.
Timestamp: BSON has a special timestamp type for internal MongoDB use; it
is not associated with the regular Date type.
Date: A BSON Date is a 64-bit integer that represents the number of milliseconds
since the Unix epoch (Jan 1, 1970).
Regular expression: Documents can contain regular expressions, using JavaScript’s
regular expression syntax:
{"x" : /learn/i}
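Because the first four bytes of an ObjectId are a timestamp, the creation time can be read straight from its 24-character hex form. The function objectIdTimestamp below is a hypothetical helper written for this sketch, and the sample id is chosen for illustration only.

```javascript
// Read the creation time out of a 24-character hex ObjectId string.
// The first 4 bytes (8 hex digits) hold seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  // A Date counts milliseconds since the epoch, so scale seconds by 1000.
  return new Date(seconds * 1000);
}

// Sample value chosen for illustration only.
const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString());
console.log(created.getTime());
```

The leading timestamp is also why ObjectIds sort roughly in creation order.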
Code: Documents can also contain JavaScript code.
Config server: mongod can also run as a config server in a sharded architecture,
where it holds the cluster metadata.
Routing server: mongos, short for “MongoDB Shard,” is a routing service for
sharded MongoDB configurations.
MongoDB’s Tools
The JavaScript shell: The MongoDB command shell is a JavaScript-based tool for
administering the database and manipulating data.
Database drivers: C, C++, C#, Erlang, Haskell, Java, Perl, PHP, Python, Scala, and
Ruby.
Command-line tools:
  mongodump and mongorestore
  mongoexport and mongoimport
  mongosniff
  mongostat
Installing MongoDB on Linux 64 bit
Step 1: Download the binary files for the desired release of MongoDB. Download the
binaries from
https://www.mongodb.org/downloads.
For example, to download the 3.0.2 release through the shell, issue the following:
$ curl -O http://downloads.mongodb.org/linux/mongodb-linux-x86_64-3.0.2.tgz
Step 2: Extract the files from the downloaded archive.
$ tar -zxvf mongodb-linux-x86_64-3.0.2.tgz
Step 3: Copy the extracted archive to the target directory.
$ mkdir -p mongodb
$ cp -R -n mongodb-linux-x86_64-3.0.2/ mongodb
Step 4: Ensure the location of the binaries is in the PATH variable. The MongoDB
binaries are in the bin/ directory of the archive. To make them available, add that
directory to your PATH, for example with a line such as
export PATH=<mongodb-install-directory>/bin:$PATH in your shell's startup file.
Installing MongoDB on Windows
On Windows, MongoDB can be installed in either of two ways.
Double-click the MongoDB .msi file; a set of screens will appear to guide you through
the installation process.
OR
Install MongoDB from the command line using msiexec.exe:
msiexec.exe /q /i mongodb-<version>-signed.msi INSTALLLOCATION="<installation directory>"
Starting MongoDB on Linux
Create the data directory:
$ mkdir -p /data/db
Set permissions for the data directory:
$ chmod -R 777 /data/db
To start the MongoDB server, run the mongod executable:
$ ./mongod
Personalization
Mobile
Internet of things
Web Applications
Content Management
Catalog
Single View
Customers
RECAP