DBS UNIT V Notes
Assistant Professor
Department of Computer Applications
M S Ramaiah Institute of Technology
[email protected]
Ph:9900087291
Contents
1 MongoDB
1.1 Introduction
1.2 Key Features
1.3 Concepts of MongoDB
1.4 Databases
1.5 Collection
1.5.1 Naming
1.6 Documents
1.7 DataTypes
1.7.1 Basic DataTypes
1.8 CRUD Operations
1.8.1 Create Operation
1.8.2 Query Documents
1.8.3 Update Operation
1.8.4 Delete Operation
1.8.5 Bulk Write Operations
1.9 Aggregation
1.10 Map-Reduce
1.10.1 Example
Database Systems - Unit V Nithya BN
UNIT V
MongoDB
Introduction: Features, Database, Collection, Documents, Data Types
CRUD Operations: Create, Read, Update, Delete, Bulk Write
Aggregation: Aggregation Pipeline, Map-Reduce, Single Purpose Aggregation
Operations
1 MongoDB
1.1 Introduction
MongoDB is a document database designed for ease of development and scaling.
The MongoDB Manual introduces key concepts in MongoDB, presents the query
language, and provides operational and administrative considerations and
procedures as well as a comprehensive reference section. MongoDB offers both a
Community and an Enterprise version of the database.
MongoDB Community is the source-available, free-to-use edition of MongoDB.
MongoDB Enterprise is available as part of the MongoDB Enterprise Advanced
subscription and includes comprehensive support for your MongoDB deployment.
MongoDB Enterprise also adds enterprise-focused features such as LDAP and
Kerberos support, on-disk encryption, and auditing.
Document Database
A record in MongoDB is a document, which is a data structure composed of
field and value pairs. MongoDB documents are similar to JSON objects. The
values of fields may include other documents, arrays, and arrays of documents.
1.3 Concepts of MongoDB
• Every document has a special key, "_id", that is unique within a collection.
• MongoDB is distributed with a simple but powerful tool called the mongo
shell. The mongo shell provides built-in support for administering MongoDB
instances and manipulating data using the MongoDB query language. It is also
a fully functional JavaScript interpreter that enables users to create and
load their own scripts for a variety of purposes.
1.4 Databases
MongoDB groups collections into databases. A single instance of MongoDB can
host several databases, each grouping together zero or more collections. A good
rule of thumb is to store all data for a single application in the same database.
Separate databases are useful when storing data for several applications or users
on the same MongoDB server.
Databases are identified by name. Database names can be any UTF-8 string,
with the following restrictions:
• The empty string ("") is not a valid database name.
• A database name cannot contain any of these characters: /, \, ., ", *, <, >,
:, |, ?, $, (a single space), or \0 (the null character). Basically, stick with
alphanumeric ASCII.
• Database names are case-insensitive.
• Database names are limited to a maximum of 64 bytes.
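The restrictions above can be expressed as a small check. The following is only a sketch in plain JavaScript: isValidDbName and FORBIDDEN_CHARS are hypothetical names introduced for illustration, not part of MongoDB or its drivers, and the 64-byte limit is taken as stated in these notes.

```javascript
// Characters that the notes list as not allowed in a database name.
const FORBIDDEN_CHARS = ["/", "\\", ".", '"', "*", "<", ">", ":", "|", "?", "$", " ", "\0"];

// Hypothetical helper: returns true when a name passes the listed rules.
function isValidDbName(name) {
  // The empty string is not a valid database name.
  if (typeof name !== "string" || name.length === 0) return false;
  // Reject any name containing a forbidden character.
  if (FORBIDDEN_CHARS.some((ch) => name.includes(ch))) return false;
  // The limit is given in bytes, so measure the UTF-8 encoding.
  return new TextEncoder().encode(name).length <= 64;
}

console.log(isValidDbName("inventory"));   // true
console.log(isValidDbName("my.database")); // false
console.log(isValidDbName(""));            // false
```

Case-insensitivity is not checked here, because it concerns two names compared against each other rather than a single name.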
There are also some reserved database names, which you can access but which
have special semantics. These are as follows:
local: This database stores data specific to a single server. In replica sets,
local stores data used in the replication process. The local database itself is
never replicated.
config: Sharded MongoDB clusters use the config database to store information
about each shard.
1.5 Collection
A collection is a group of documents. If a document is the MongoDB analog of
a row in a relational database, then a collection can be thought of as the analog
of a table.
Dynamic Schemas
Collections have dynamic schemas. This means that the documents within a
single collection can have any number of different "shapes". For example, both
of the following documents could be stored in a single collection:
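As an illustration of dynamic schemas, the sketch below uses a plain JavaScript array to stand in for a collection. The two documents and their field names are assumptions made for this example.

```javascript
// Two documents with different "shapes" held side by side.
// The array is a stand-in for a collection; no schema is declared anywhere.
const collection = [
  { greeting: "Hello, world!", views: 3 },
  { signoff: "Good night, and good luck" },
];

// Summarise each document's shape as its list of keys.
const shapes = collection.map((doc) => Object.keys(doc).join(","));
console.log(shapes); // [ 'greeting,views', 'signoff' ]
```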
1.5.1 Naming
A collection is identified by its name. Collection names have a few restrictions
listed below.
• The empty string ("") is not a valid collection name.
• Collection names may not contain the character \0 (the null character),
because this delineates the end of a collection name.
• You should not create any collections with names that start with system.,
a prefix reserved for internal collections. For example, the system.users
collection contains the database's users, and the system.namespaces
collection contains information about all of the database's collections.
1.6 Documents
At the heart of MongoDB is the document: an ordered set of keys with associated
values. The representation of a document varies by programming language,
but most languages have a data structure that is a natural fit, such as a map,
hash, or dictionary. In JavaScript, for example, documents are represented as
objects:
{"greeting" : "Hello, world!"}
This simple document contains a single key, "greeting", with a value of "Hello,
world!". Most documents will be more complex than this simple one and often
will contain multiple key/value pairs:
{"greeting" : "Hello, world!", "views" : 3}
As you can see, values in documents are not just "blobs". They can be one
of several different datatypes (or even an entire embedded document). In this
example the value for "greeting" is a string, whereas the value for "views" is
an integer.
The keys in a document are strings.
• Keys must not contain the character \0 (the null character). This character
is used to signify the end of a key.
• The . and $ characters have some special properties and should be used
only in certain circumstances. In general, they should be considered reserved,
and drivers will complain if they are used inappropriately.
• MongoDB is type-sensitive and case-sensitive.
• Documents in MongoDB cannot contain duplicate keys.
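The last two rules can be pictured with plain JavaScript objects. This is only a sketch: the documents and the same helper are hypothetical, and comparing serialized objects is just a convenient way to show that the documents differ.

```javascript
// Hypothetical documents used only to illustrate the rules above.
const a = { count: 5 };
const b = { count: "5" }; // same key, different value type: a distinct document
const c = { Count: 5 };   // key differs only by case: also a distinct document

// Compare two objects by their serialized form.
const same = (x, y) => JSON.stringify(x) === JSON.stringify(y);
console.log(same(a, b)); // false
console.log(same(a, c)); // false

// A JavaScript object cannot hold two entries for one key either:
// when a literal repeats a key, the later value replaces the earlier one.
const d = { count: 1, count: 2 };
console.log(Object.keys(d).length, d.count); // 1 2
```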
1.7 DataTypes
1.7.1 Basic DataTypes
• Null
The null type can be used to represent both a null value and a nonexistent
field:
{"x" : null}
• Boolean
There is a boolean type, which can be used for the values true and false:
{"x" : true}
• Number
The shell defaults to using 64-bit floating point numbers. Thus, these
numbers both look "normal" in the shell:
{"x" : 3.14}
{"x" : 3}
For integers, use the NumberInt or NumberLong classes, which represent
4-byte or 8-byte signed integers, respectively:
{"x" : NumberInt("3")}
{"x" : NumberLong("3")}
• String
Any string of UTF-8 characters can be represented using the string type:
{"x" : "foobar"}
• Date
MongoDB stores dates as 64-bit integers representing milliseconds since
the Unix epoch (January 1, 1970). The time zone is not stored:
{"x" : new Date()}
• Object ID
An Object ID is a 12-byte ID for documents:
{"x" : ObjectId()}
• Code
MongoDB also makes it possible to store arbitrary JavaScript in queries
and documents:
{"x" : function() { /* ... */ }}
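The Number and Date entries in the list above can be checked in any JavaScript interpreter. The sketch below uses only standard JavaScript, with no shell-specific classes such as NumberInt or NumberLong; the sample date is an assumption chosen for the example.

```javascript
// Every JavaScript number literal is a 64-bit float, so 3 and 3.0 are the same value.
console.log(3 === 3.0);               // true
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991

// Above 2^53 a 64-bit float can no longer tell neighbouring integers apart,
// which is why large integers need a dedicated integer type.
console.log(2 ** 53 === 2 ** 53 + 1); // true

// A Date wraps a count of milliseconds since the Unix epoch.
const epoch = new Date(0);
console.log(epoch.toISOString());     // 1970-01-01T00:00:00.000Z

const marchFirst = new Date("2020-03-01T00:00:00Z");
console.log(marchFirst.getTime());    // 1583020800000
```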
1.8 CRUD Operations
1.8.1 Create Operation
MongoDB provides the following methods to insert documents into a collection:
• db.collection.insertOne()
• db.collection.insertMany()
In MongoDB, insert operations target a single collection. All write operations
in MongoDB are atomic on the level of a single document.
Create and Insert examples The following example inserts a new document
into the inventory collection. If the document does not specify an _id field,
MongoDB adds the _id field with an ObjectId value to the new document.
To retrieve the document that you just inserted, query the collection:
Insert Behavior
Collection Creation
If the collection does not currently exist, insert operations will create the col-
lection.
_id Field
In MongoDB, each document stored in a collection requires a unique _id field
that acts as a primary key. If an inserted document omits the _id field, the
MongoDB driver automatically generates an ObjectId for the _id field.
This also applies to documents inserted through update operations with upsert:
true.
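The _id behaviour can be sketched with an in-memory stand-in for a collection. TinyCollection and fakeObjectId are hypothetical names, not the MongoDB driver API, and the random hex string only imitates the length of a real ObjectId.

```javascript
// Hypothetical in-memory stand-in for a collection; not the MongoDB driver API.
class TinyCollection {
  constructor() {
    this.docs = new Map(); // maps _id -> stored document
  }

  // Assumption: 24 random hex digits stand in for a real 12-byte ObjectId,
  // which is built from a timestamp, a random value and a counter.
  static fakeObjectId() {
    let hex = "";
    for (let i = 0; i < 24; i++) {
      hex += Math.floor(Math.random() * 16).toString(16);
    }
    return hex;
  }

  insertOne(doc) {
    // Keep a supplied _id; generate one only when the document omits it.
    const _id = "_id" in doc ? doc._id : TinyCollection.fakeObjectId();
    // _id acts as a primary key, so a repeated value is rejected.
    if (this.docs.has(_id)) throw new Error("duplicate _id: " + _id);
    this.docs.set(_id, { ...doc, _id });
    return { acknowledged: true, insertedId: _id };
  }
}

const inventory = new TinyCollection();
const first = inventory.insertOne({ item: "canvas", qty: 100 });
console.log(first.insertedId.length); // 24
const second = inventory.insertOne({ _id: 7, item: "journal" });
console.log(second.insertedId);       // 7
```

Inserting a second document with _id 7 into this sketch throws an error, mirroring the uniqueness requirement.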
Atomicity
All write operations in MongoDB are atomic on the level of a single document.
Insert Methods MongoDB provides the following methods for inserting
documents into a collection:
db.collection.insertOne(): inserts a single document into a collection.
db.collection.insertMany(): inserts multiple documents into a collection.
db.collection.insert(): inserts a single document or multiple documents into a
collection.
1.8.2 Query Documents
The following example retrieves all documents from the inventory collection
where status equals either "A" or "D":
db.inventory.find( { status: { $in: [ "A", "D" ] } } )
Specify OR Conditions Using the $or operator, you can specify a compound
query that joins each clause with a logical OR conjunction so that the query
selects the documents in the collection that match at least one condition.
The following example retrieves all documents in the collection where the status
equals "A" or qty is less than ($lt) 30:
db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
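To make the filter semantics concrete, the sketch below evaluates filter documents against a plain JavaScript array. The matches function is a hypothetical helper that covers only equality, $in, $lt and $or, and the sample documents are assumptions; it is not how the server evaluates queries.

```javascript
// Hypothetical in-memory matcher covering only equality, $in, $lt and $or.
function matches(doc, filter) {
  // Every top-level entry of the filter must hold (implicit AND).
  return Object.entries(filter).every(([key, cond]) => {
    // $or succeeds when at least one of its clauses matches.
    if (key === "$or") return cond.some((clause) => matches(doc, clause));
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // An operator document such as { $lt: 30 } or { $in: [...] }.
      return Object.entries(cond).every(([op, arg]) => {
        if (op === "$in") return arg.includes(value);
        if (op === "$lt") return value < arg;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === cond; // plain equality condition
  });
}

// Sample documents are assumptions made for this example.
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "postcard", qty: 45, status: "P" },
  { item: "pencil", qty: 10, status: "P" },
];

const inAorD = inventory
  .filter((d) => matches(d, { status: { $in: ["A", "D"] } }))
  .map((d) => d.item);
console.log(inAorD); // [ 'journal', 'paper' ]

const aOrLowQty = inventory
  .filter((d) => matches(d, { $or: [{ status: "A" }, { qty: { $lt: 30 } }] }))
  .map((d) => d.item);
console.log(aOrLowQty); // [ 'journal', 'pencil' ]
```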
Specify AND Condition The following query selects all documents where
the nested field h is less than 15, the nested field uom equals "in", and the status
field equals "D":
db.inventory.find( { "size.h": { $lt: 15 }, "size.uom": "in", status: "D" } )
Query an Array
Query for an Element by the Array Index Position Using dot notation,
you can specify query conditions for an element at a particular index or
position of the array. The array uses zero-based indexing.
The following example queries for all documents where the second element in
the array dim_cm is greater than 25:
db.inventory.find( { "dim_cm.1": { $gt: 25 } } )
Query an Array by Array Length Use the $size operator to query for
arrays by number of elements. For example, the following selects documents
where the array tags has 3 elements:
db.inventory.find( { "tags": { $size: 3 } } )
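The two array conditions above correspond to simple checks on a JavaScript array. The sketch below applies them to an in-memory list; the sample documents are assumptions made for the example.

```javascript
// Sample documents are assumptions made for this example.
const inventory = [
  { item: "journal", tags: ["blank", "red"], dim_cm: [14, 21] },
  { item: "paper", tags: ["red", "blank", "plain"], dim_cm: [14, 30] },
  { item: "planner", tags: ["blank", "red"], dim_cm: [22.85, 30] },
];

// Equivalent of { "dim_cm.1": { $gt: 25 } }: index 1 is the second element.
const secondOver25 = inventory
  .filter((d) => d.dim_cm[1] > 25)
  .map((d) => d.item);
console.log(secondOver25); // [ 'paper', 'planner' ]

// Equivalent of { tags: { $size: 3 } }: the array holds exactly 3 elements.
const threeTags = inventory
  .filter((d) => d.tags.length === 3)
  .map((d) => d.item);
console.log(threeTags); // [ 'paper' ]
```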
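1.8.3 Update Operation
To use the update operators, pass to the update methods an update document of
the form:
{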
<update operator>: { <field1>: <value1>, ... },
<update operator>: { <field2>: <value2>, ... },
...
}
Some update operators, such as $set, will create the field if the field does not
exist.
db.inventory.updateOne(
{ item: "paper" },
{
$set: { "size.uom": "cm", status: "P" },
$currentDate: { lastModified: true }
}
)
The update operation:
• uses the $set operator to update the value of the size.uom field to "cm"
and the value of the status field to "P",
• uses the $currentDate operator to update the value of the lastModified
field to the current date. If the lastModified field does not exist,
$currentDate will create the field.
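The effect of these two operators can be sketched on a plain object. applyUpdate is a hypothetical helper that handles only $set and $currentDate with a true flag, and the starting document is an assumption; the real operators support more options.

```javascript
// Hypothetical helper: applies only $set and $currentDate to a plain object.
function applyUpdate(doc, update) {
  // Write a value at a dotted path such as "size.uom", creating missing levels.
  const setPath = (target, path, value) => {
    const parts = path.split(".");
    const last = parts.pop();
    let node = target;
    for (const part of parts) {
      if (typeof node[part] !== "object" || node[part] === null) node[part] = {};
      node = node[part];
    }
    node[last] = value; // creates the field if it does not exist
  };

  for (const [path, value] of Object.entries(update.$set || {})) {
    setPath(doc, path, value);
  }
  for (const [path, flag] of Object.entries(update.$currentDate || {})) {
    if (flag === true) setPath(doc, path, new Date());
  }
  return doc;
}

// Assumed starting document for the example.
const paper = { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" };
applyUpdate(paper, {
  $set: { "size.uom": "cm", status: "P" },
  $currentDate: { lastModified: true },
});
console.log(paper.size.uom, paper.status, paper.lastModified instanceof Date);
// cm P true
```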
db.inventory.updateMany(
{ "qty": { $lt: 50 } },
{
$set: { "size.uom": "in", status: "P" },
$currentDate: { lastModified: true }
}
)
db.inventory.replaceOne(
{ item: "paper" },
{ item: "paper", instock: [ { warehouse: "A", qty: 60 }, { warehouse: "B", qty: 40 } ] }
)
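1.8.4 Delete Operation
To delete all documents from a collection, pass an empty filter document {} to
the deleteMany() method:
db.inventory.deleteMany({})
The method returns a document with the status of the operation.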
Delete All Documents that Match a Condition You can specify criteria,
or filters, that identify the documents to delete. The filters use the same syntax
as read operations.
To specify equality conditions, use <field>: <value> expressions in the query
filter document:
{ <field1>: <value1>, ... }
A query filter document can use the query operators to specify conditions in
the following form:
{ <field1>: { <operator1>: <value1> }, ... }
To delete all documents that match a deletion criteria, pass a filter parameter
to the deleteMany() method.
The following example removes all documents from the inventory collection
where the status field equals "A":
db.inventory.deleteMany({ status : "A" })
The method returns a document with the status of the operation. For more
information and examples, see deleteMany().
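1.8.5 Bulk Write Operations
The db.collection.bulkWrite() method supports the following write operations:
• insertOne
• updateOne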
• updateMany
• replaceOne
• deleteOne
• deleteMany
Each write operation is passed to bulkWrite() as a document in an array.
For example, the following performs multiple write operations on the
characters collection:
try {
db.characters.bulkWrite(
[
{ insertOne :
{
"document" :
{
"_id" : 4, "char" : "Dithras", "class" : "barbarian", "lvl" : 4
}
}
},
{ insertOne :
{
"document" :
{
"_id" : 5, "char" : "Taeln", "class" : "fighter", "lvl" : 3
}
}
},
{ updateOne :
{
"filter" : { "char" : "Eldon" },
"update" : { $set : { "status" : "Critical Injury" } }
}
},
{ deleteOne :
1.9 Aggregation
Aggregation operations process data records and return computed results. Ag-
gregation operations group values from multiple documents together, and can
perform a variety of operations on the grouped data to return a single result.
MongoDB provides three ways to perform aggregation: the aggregation pipeline,
the map-reduce function, and single purpose aggregation methods.
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])
First Stage:
The $match stage filters the documents by the status field and passes to the
next stage those documents that have status equal to "A".
Second Stage:
The $group stage groups the documents by the cust_id field to calculate the sum
of the amount for each unique cust_id.
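The two stages can be mimicked in plain JavaScript to see what each one contributes. This is only a sketch over an in-memory array: the sample orders are assumptions, and the server executes pipelines differently.

```javascript
// Sample orders are assumptions made for this example.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// First stage ($match): keep only the documents with status "A".
const matched = orders.filter((o) => o.status === "A");

// Second stage ($group): sum the amount for each distinct cust_id.
const totals = new Map();
for (const o of matched) {
  totals.set(o.cust_id, (totals.get(o.cust_id) || 0) + o.amount);
}
const result = [...totals].map(([_id, total]) => ({ _id, total }));
console.log(result);
// [ { _id: 'A123', total: 750 }, { _id: 'B212', total: 200 } ]
```

The order with status "D" never reaches the grouping step, which is why the total for A123 is 750 rather than 1050.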
The most basic pipeline stages provide filters that operate like queries and doc-
ument transformations that modify the form of the output document.
Other pipeline operations provide tools for grouping and sorting documents by
specific field or fields as well as tools for aggregating the contents of arrays,
including arrays of documents. In addition, pipeline stages can use operators
for tasks such as calculating the average or concatenating a string.
The pipeline provides efficient data aggregation using native operations within
MongoDB, and is the preferred method for data aggregation in MongoDB.
The aggregation pipeline can operate on a sharded collection.
The aggregation pipeline can use indexes to improve its performance during
some of its stages. In addition, the aggregation pipeline has an internal opti-
mization phase.
1.10 Map-Reduce
An aggregation pipeline provides better performance and usability than a map-
reduce operation.
Map-reduce operations can be rewritten using aggregation pipeline operators,
such as $group, $merge, and others.
For map-reduce operations that require custom functionality, MongoDB pro-
vides the $accumulator and $function aggregation operators starting in version
4.4.
1.10.1 Example
Create a sample collection orders with these documents:
db.orders.insertMany([
{ _id: 1, cust_id: "Ant O. Knee", ord_date:
new Date("2020-03-01"), price: 25,
items: [ { sku: "oranges", qty: 5, price: 2.5 },
{ sku: "apples", qty: 5, price: 2.5 } ], status: "A" },
{ _id: 2, cust_id: "Ant O. Knee", ord_date:
new Date("2020-03-08"), price: 70,
Return the Total Price Per Customer Perform the map-reduce operation
on the orders collection to group by the cust_id, and calculate the sum of the
price for each cust_id:
1. Define the map function to process each input document:
• In the function, this refers to the document that the map-reduce
operation is processing.
• The function maps the price to the cust_id for each document and
emits the cust_id and price.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
This operation outputs the results to a collection named map_reduce_example.
If the map_reduce_example collection already exists, the operation will
replace the contents with the results of this map-reduce operation.
4. Query the map_reduce_example collection to verify the results:
db.map_reduce_example.find().sort( { _id: 1 } )
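The flow of the steps above can be traced with an in-memory sketch. The mapReduce function below is a hypothetical stand-in, not db.collection.mapReduce(): it passes emit to the map function as an argument, whereas the shell provides emit as a global. The first two orders follow the sample data above, and the third is an assumption added so that there is more than one customer.

```javascript
// Hypothetical in-memory version of map-reduce over an array of documents.
function mapReduce(docs, mapFn, reduceFn) {
  const groups = new Map(); // key -> list of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Call the map function with `this` bound to the current document.
  for (const doc of docs) mapFn.call(doc, emit);
  // Reduce each key's values to a single result, then sort by _id.
  return [...groups]
    .map(([key, values]) => ({ _id: key, value: reduceFn(key, values) }))
    .sort((x, y) => x._id.localeCompare(y._id));
}

const orders = [
  { _id: 1, cust_id: "Ant O. Knee", price: 25 },
  { _id: 2, cust_id: "Ant O. Knee", price: 70 },
  { _id: 3, cust_id: "Busby Bee", price: 50 }, // assumed document
];

// Map: emit the price under the customer's id.
const mapFunction1 = function (emit) {
  emit(this.cust_id, this.price);
};
// Reduce: add up all prices emitted for one customer.
const reduceFunction1 = (keyCustId, valuesPrices) =>
  valuesPrices.reduce((sum, price) => sum + price, 0);

const totals = mapReduce(orders, mapFunction1, reduceFunction1);
console.log(totals);
// [ { _id: 'Ant O. Knee', value: 95 }, { _id: 'Busby Bee', value: 50 } ]
```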