Lecture 02,03
http://www.ksi.mff.cuni.cz/~svoboda/courses/191-NDBI040/
Lecture 9
3. 12. 2019
NDBI040: Modern Database Concepts | Lecture 9: Document Databases: MongoDB | 3. 12. 2019 2
Document Stores
Data model
• Documents
Self-describing
Hierarchical tree structures (JSON, XML, …)
– Scalar values, maps, lists, sets, nested documents, …
Identified by a unique identifier (key, …)
• Documents are organized into collections
Query patterns
• Create, update or remove a document
• Retrieve documents according to complex query conditions
Observation
• Extended key-value stores where the value part is examinable
MongoDB Document Database
MongoDB
JSON document database
• https://www.mongodb.com/
• Features
Open source, high availability, eventual consistency, automatic
sharding, master-slave replication, automatic failover,
secondary indices, …
• Developed by MongoDB
• Implemented in C++, C, and JavaScript
• Operating systems: Windows, Linux, Mac OS X, …
• Initial release in 2009
Query Example
Collection of movies
{
  _id: ObjectId("1"),
  title: "Vratné lahve",
  year: 2006
}
{
  _id: ObjectId("2"),
  title: "Samotáři",
  year: 2000
}
{
  _id: ObjectId("3"),
  title: "Medvídek",
  year: 2007
}
Query statement
Titles of movies filmed after 2005,
sorted by these titles in descending order
db.movies.find(
  { year: { $gt: 2005 } },
  { _id: false, title: true }
).sort({ title: -1 })
Query result
{ title: "Vratné lahve" }
{ title: "Medvídek" }
Data Model
Database system structure
Instance → databases → collections → documents
• Database
• Collection
Collection of documents, usually of a similar structure
• Document
MongoDB document = one JSON object
– I.e. even a complex JSON object with other recursively nested
objects, arrays or values
Each document has a unique identifier (primary key)
– Technically realized using a top-level _id field
Data Model
MongoDB document
• Internally stored in BSON format (Binary JSON) and displayed as
JSON
Maximum allowed size is 16 MB
GridFS can be used to split larger files into smaller chunks
Restrictions on fields
• Top-level _id is reserved for a primary key
• Field names cannot start with $ and cannot contain .
$ is reserved for query operators
. is used when accessing nested fields
• The order of fields is preserved
Except for _id fields that are always moved to the beginning
• Names of fields must be unique
Primary Keys
Features of identifiers
• Unique within a collection
• Immutable (cannot be changed once assigned)
• Can be of any type other than a JSON array (list)
Key management
• Natural identifier
• Auto-incrementing number – not recommended
• UUID (Universally Unique Identifier)
• ObjectId – special 12-byte BSON type (the default option)
Small, likely unique, fast to generate, ordered, based on a
timestamp, machine id, process id, and a process-local counter
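The ObjectId structure listed above can be made concrete with a small sketch. It assumes the 4 + 3 + 2 + 3 byte split that matches the four components named on this slide (newer server versions fill the five middle bytes with a random value instead). The function name parseObjectId and the sample hex string are illustrative only, not part of any MongoDB API.

```javascript
// Split a 24-character hex ObjectId string into the four parts named above.
// Assumption: timestamp (4 bytes), machine id (3), process id (2), counter (3).
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("an ObjectId is 12 bytes, i.e. 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return {
    seconds,
    timestamp: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),
    processId: parseInt(hex.slice(14, 18), 16),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Hypothetical sample value, used only to exercise the function.
const exampleId = parseObjectId("507f1f77bcf86cd799439011");
console.log(exampleId.timestamp.toISOString()); // 2012-10-17T21:13:27.000Z
```

Because the timestamp occupies the leading bytes, sorting such identifiers roughly orders documents by creation time, which is what "ordered" refers to above.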
Design Questions
Data modeling (in terms of collections and documents)
• No explicit schema is provided, nor expected or enforced
However…
– documents within a collection are similar in practice
– implicit schema is required nevertheless
• Challenge
Balancing application requirements, performance aspects,
data structure, mutual relationships, query patterns, …
Two main concepts
• References
• Embedded documents
Denormalized Data Models
Embedded documents
• Related data in a single document
with embedded JSON objects, so-called subdocuments
• Pros: data manipulation (fewer queries need to be issued)
• Cons: possible data redundancies
• Suitable for one-to-one or one-to-many relationships
{
_id: ObjectId("2"), title: "Samotáři", year: 2000,
actors: [
{ firstname: "Jitka", lastname: "Schneiderová" },
{ firstname: "Ivan", lastname: "Trojan" },
{ firstname: "Jiří", lastname: "Macháček" }
]
}
Normalized Data Models
References
• Related data in separate documents
These are interconnected via directed links (references)
Technically expressed using ordinary values with identifiers
of target documents (i.e. no special construct is provided)
• Features: higher flexibility, follow-up queries might be needed
• Suitable for many-to-many relationships
{
  _id: ObjectId("2"),
  title: "Samotáři",
  year: 2000,
  actors: [ ObjectId("6"), ObjectId("4"), ObjectId("5") ]
}
{
  _id: ObjectId("6"),
  firstname: "Jitka",
  lastname: "Schneiderová"
}
…
Sample Data
Collection of movies
{
  _id: ObjectId("1"),
  title: "Vratné lahve", year: 2006,
  actors: [ ObjectId("7"), ObjectId("5") ]
}
{
  _id: ObjectId("2"),
  title: "Samotáři", year: 2000,
  actors: [ ObjectId("6"), ObjectId("4"), ObjectId("5") ]
}
{
  _id: ObjectId("3"),
  title: "Medvídek", year: 2007,
  actors: [ ObjectId("5"), ObjectId("4") ]
}
Collection of actors
{ _id: ObjectId("4"), firstname: "Ivan", lastname: "Trojan" }
{ _id: ObjectId("5"), firstname: "Jiří", lastname: "Macháček" }
{ _id: ObjectId("6"), firstname: "Jitka", lastname: "Schneiderová" }
{ _id: ObjectId("7"), firstname: "Zdeněk", lastname: "Svěrák" }
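With a normalized model, a reference is just a stored identifier, so reading a movie together with its actors takes a second lookup. The following in-memory sketch mimics that follow-up step on the sample data. Plain strings stand in for ObjectId values, and resolveActors is a hypothetical helper, not a MongoDB function.

```javascript
// In-memory stand-ins for the two sample collections (ids shortened to strings).
const sampleActors = [
  { _id: "4", firstname: "Ivan", lastname: "Trojan" },
  { _id: "5", firstname: "Jiří", lastname: "Macháček" },
  { _id: "6", firstname: "Jitka", lastname: "Schneiderová" },
  { _id: "7", firstname: "Zdeněk", lastname: "Svěrák" },
];
const sampleMovies = [
  { _id: "1", title: "Vratné lahve", year: 2006, actors: ["7", "5"] },
  { _id: "2", title: "Samotáři", year: 2000, actors: ["6", "4", "5"] },
  { _id: "3", title: "Medvídek", year: 2007, actors: ["5", "4"] },
];

// Hypothetical helper: the follow-up lookup that turns stored ids into documents.
function resolveActors(movie, actorCollection) {
  const byId = new Map(actorCollection.map((actor) => [actor._id, actor]));
  return movie.actors.map((id) => byId.get(id));
}

const cast = resolveActors(sampleMovies[1], sampleActors).map((a) => a.lastname);
console.log(cast); // [ 'Schneiderová', 'Trojan', 'Macháček' ]
```

Against a real server the same step would be a second query on the actors collection; the embedded model avoids it at the price of duplicating actor data in every movie.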
Application Interfaces
mongo shell
• Interactive interface to MongoDB
• ./bin/mongo --username user --password pass
--host host --port 28015
Drivers
• Java, C, C++, C#, Perl, PHP, Python, Ruby, Scala, ...
Query Language
MongoDB query language is based on JavaScript
• Single command / entire script
• Read queries return a cursor
Allows us to iterate over all the selected documents
• Each command is always evaluated over a single collection
Query patterns
• Basic CRUD operations
Accessing documents via identifiers or conditions on fields
• Aggregations: MapReduce, pipelines, grouping
CRUD Operations
Overview
• db.collection.insert|insertOne|insertMany()
Inserts a new document / documents
• db.collection.replaceOne()
Replaces an existing document
• db.collection.update|updateOne|updateMany()
Modifies an existing document / documents
• db.collection.remove|deleteOne|deleteMany()
Removes an existing document / documents
• db.collection.find()
Finds documents based on filtering conditions
Projection and / or sorting may be applied too
NIE‐PDB: Advanced Database Systems | Lecture 8: Document Databases: MongoDB | 21. 11. 2023 16
Insert Operation
Inserts a new document / documents into a given collection
db.collection.insert( document, options )
db.collection.insert( [ document, … ], options )
• Parameters
Document: one or more documents to be inserted
– Provided document identifiers (_id fields) must be unique
– When an _id field is missing, it is generated automatically (ObjectId)
Options (may be omitted)
• Collections are created automatically when they do not yet exist
Insert Operation: Examples
Insert a new actor document
db.actors.insert(
  {
    firstname: "Anna",
    lastname: "Geislerová"
  }
)
Inserted document
{
  _id: ObjectId("8"),
  firstname: "Anna",
  lastname: "Geislerová"
}
Update Operation
Modifies / replaces an existing document / documents
db.collection.update( query, update, options )
• Parameters
Query: description of documents to be updated
– The same behavior as in find operations
Update: modification actions to be applied
Options (may be omitted)
• At most one document is updated by default
Unless { multi: true } option is specified
Update Operation: Examples
Replace the whole document of at most one specified actor
db.actors.update(
  { _id: ObjectId("8") },
  { firstname: "Aňa", lastname: "Geislerová" }
)
Resulting document
{
  _id: ObjectId("8"),
  firstname: "Aňa",
  lastname: "Geislerová"
}
Update Operation
Update / replace modes
The update parameter is either a document or { update operator, … }
• Replace
when the update parameter contains no update operators
The whole document is replaced (_id is preserved)
• Update
when the update parameter contains only update operators
Current document is updated using these operators
– $set, $unset, $inc, $mul, …
– Each operator can be used at most once
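The distinction between the two modes depends only on the top-level keys of the update parameter. The sketch below shows one way to express that rule in plain JavaScript. It is an in-memory illustration, not MongoDB's implementation: updateMode and applyUpdate are hypothetical names, only $set and $inc are modelled, and only top-level fields are handled.

```javascript
// Hypothetical helper: decide which mode an update parameter selects.
function updateMode(update) {
  const keys = Object.keys(update);
  const operators = keys.filter((key) => key.startsWith("$"));
  if (operators.length === 0) return "replace";
  if (operators.length === keys.length) return "update";
  throw new Error("update operators cannot be mixed with plain fields");
}

// Apply the update parameter to one document and return the new version.
function applyUpdate(doc, update) {
  if (updateMode(update) === "replace") {
    return { _id: doc._id, ...update }; // everything but _id is replaced
  }
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const [field, step] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + step; // a missing field starts at 0
  }
  return result;
}

const storedActor = { _id: "8", firstname: "Anna", lastname: "Geislerová", roles: 1 };
const replaced = applyUpdate(storedActor, { firstname: "Aňa", lastname: "Geislerová" });
const updated = applyUpdate(storedActor, { $set: { firstname: "Aňa" }, $inc: { roles: 2 } });
console.log(replaced); // roles is gone, _id is kept
console.log(updated);  // roles is 3, lastname is untouched
```

The replace call drops the roles field because it is absent from the new document, whereas the operator-based call touches only the fields it names.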
Update Operators
Field operators
• $set – sets the value of a given field / fields
$set : { field : value / array / document }
Update Operators
Field operators
• $inc – increments the value of a given field / fields
$inc : { field : increment }
• $currentDate – sets a given field to the current date or timestamp
$currentDate : { field : { $type : "timestamp" } }
Update Operators
Array operators
• $push – adds one item / all items to the end of an array
$push : { array field : value / array / document }
$push : { array field : { $each : array } }
Update Operators
Array operators
• $pop – removes the first / last item of an array
$pop : { array field : -1 } removes the first item
$pop : { array field : 1 } removes the last item
Upsert Mode
Upsert behavior of update operation
• When { upsert: true } option is specified,
and, at the same time, no document was updated
⇒ new document is inserted
What will this document contain?
• In case of the replace mode…
All the fields (i.e. value fields) from the update parameter
• In case of the update mode…
All the value fields from the query parameter,
and the outcome of all the update operators
from the update parameter
• _id field is preserved, or newly generated if necessary
Upsert Mode: Example
Unsuccessful update of a movie resulting in an insertion
db.movies.update(
{ title: "Tmavomodrý svět", year: { $gt: 2000 } },
{
$set: {
director: { firstname: "Jan", lastname: "Svěrák" },
year: 2001
},
$inc: { rating: 2 }
},
{ upsert: true }
)
Inserted document
{ _id: ObjectId("11"),
title: "Tmavomodrý svět",
director: { firstname: "Jan", lastname: "Svěrák" },
year: 2001,
rating: 2 }
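The rules for the content of an upserted document can be traced step by step on the example above. The following sketch is a simplified in-memory model under stated assumptions: upsertDocument is a hypothetical helper, only $set and $inc are modelled, fields are top-level only, and the newId argument stands in for a generated ObjectId.

```javascript
// Hypothetical helper: build the document that an upsert inserts when the
// query matched nothing.
function upsertDocument(query, update, newId) {
  const isOperator = (key) => key.startsWith("$");
  const isOperatorObject = (value) =>
    value !== null && typeof value === "object" && !Array.isArray(value) &&
    Object.keys(value).length > 0 && Object.keys(value).every(isOperator);

  // An _id given by value in the query is preserved, otherwise a new one is used.
  const hasFixedId = "_id" in query && !isOperatorObject(query._id);
  const id = hasFixedId ? query._id : newId;

  // Replace mode: the fields of the update parameter form the document.
  if (!Object.keys(update).some(isOperator)) {
    return { _id: id, ...update };
  }

  // Update mode: start from the value fields of the query ...
  const doc = { _id: id };
  for (const [field, value] of Object.entries(query)) {
    // ... skipping operator conditions such as { $gt: 2000 }.
    if (!isOperator(field) && !isOperatorObject(value)) doc[field] = value;
  }
  // ... and then apply the update operators.
  Object.assign(doc, update.$set || {});
  for (const [field, step] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + step;
  }
  return doc;
}

const insertedMovie = upsertDocument(
  { title: "Tmavomodrý svět", year: { $gt: 2000 } },
  {
    $set: { director: { firstname: "Jan", lastname: "Svěrák" }, year: 2001 },
    $inc: { rating: 2 },
  },
  "11"
);
console.log(insertedMovie);
```

Note that year ends up as 2001 only because $set provides it: the query condition { $gt: 2000 } is not a value and therefore contributes nothing to the new document.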
Remove Operation
Removes a document / documents from a given collection
db.collection.remove( query, options )
• Parameters
Query: description of documents to be removed
– The same behavior as in find operations
Options (may be omitted)
• All the matching documents are removed
unless { justOne: true } option is provided
Find Operation
Selects documents from a given collection
db.collection.find( query, projection )
• Parameters (both may be omitted)
Query: description of documents to be selected
Projection: fields to be included / excluded in the result
• Matching documents are returned via an iterable cursor
This allows us to chain further sort, skip or limit operations
Find Operation: Examples
Select all movies from our collection
db.movies.find()
db.movies.find( { } )
Selection
Query parameter describes the documents we are interested in
• Conditions on fields: { field : value } or { field : { query operator } }
• Combined conditions: { and / or logical operator }
Selection: Field Conditions
Value equality
• The actual field value must be identical to the specified value
• I.e. identical…
including the number, order and names of recursively identical
values of all nested object fields
including the number and order of recursively identical array
items
Query operators
• The actual field value must satisfy all the provided operators
Each operator can be used at most once
Value Equality: Examples
Select movies having a specific director
db.movies.find(
{ director: { firstname: "Jan", lastname: "Svěrák" } }
)
db.movies.find(
{ director: { lastname: "Svěrák", firstname: "Jan" } }
)
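The two queries above differ only in the order of the nested fields, and under value equality that difference matters. The sketch below models the rule with an order-sensitive deep comparison. The function identical is a hypothetical helper written for illustration, and it relies on JavaScript preserving the insertion order of ordinary (non-numeric) object keys.

```javascript
// Hypothetical helper: deep equality that also compares the order of fields
// and array items, as described for value-equality conditions.
function identical(a, b) {
  if (Array.isArray(a) || Array.isArray(b)) {
    return Array.isArray(a) && Array.isArray(b) && a.length === b.length &&
      a.every((item, i) => identical(item, b[i]));
  }
  if (a !== null && b !== null && typeof a === "object" && typeof b === "object") {
    const keysA = Object.keys(a);
    const keysB = Object.keys(b);
    return keysA.length === keysB.length &&
      keysA.every((key, i) => key === keysB[i] && identical(a[key], b[key]));
  }
  return a === b;
}

const storedDirector = { firstname: "Jan", lastname: "Svěrák" };
console.log(identical(storedDirector, { firstname: "Jan", lastname: "Svěrák" })); // true
console.log(identical(storedDirector, { lastname: "Svěrák", firstname: "Jan" })); // false
```

For a movie stored with director fields in the order firstname, lastname, only the first query would therefore select it.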
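Dot Notation
The dot notation for field names
field.field addresses a field of a nested document
field.position addresses an array item at a given (zero-based) position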
Value Equality
Example (revisited)
Select movies having a specific director
db.movies.find(
{ director: { firstname: "Jan", lastname: "Svěrák" } }
)
db.movies.find(
{ "director.firstname": "Jan", "director.lastname": "Svěrák" }
)
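Unlike the first query, the dot-notation variant tests each nested field separately, so neither the order of the fields nor the presence of further fields in director plays a role. A minimal sketch of how a dotted path can be followed through a document is given here. getPath is a hypothetical helper, and it leaves out the way MongoDB also descends into every item of an array of subdocuments.

```javascript
// Hypothetical helper: follow a dotted path through nested objects and arrays.
function getPath(doc, path) {
  let current = doc;
  for (const step of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = current[step]; // works for field names and array positions alike
  }
  return current;
}

const nestedMovie = {
  title: "Tmavomodrý svět",
  director: { lastname: "Svěrák", firstname: "Jan", born: 1965 },
  actors: ["5", "7"],
};
console.log(getPath(nestedMovie, "director.firstname")); // Jan
console.log(getPath(nestedMovie, "actors.0"));           // 5
console.log(getPath(nestedMovie, "director.awards"));    // undefined
```

The born value is invented for the illustration; it shows that an extra nested field does not prevent the path conditions from matching.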
Query Operators
Comparison operators
$eq / $ne : value / array / document
$lt / $lte / $gte / $gt : value
$in / $nin : [ value / array / document , … ]
Query Operators
Comparison operators
• $eq, $ne
Tests the actual field value for equality / inequality
– The same behavior as in case of value equality conditions
• $lt, $lte, $gte, $gt
Tests whether the actual field value is less than / less than or
equal / greater than or equal / greater than the provided value
• $in
Tests whether the actual field value is equal to at least one
of the provided values
• $nin
Negation of $in
Query Operators
Element operators
• $exists – tests whether a given field exists / does not exist
$exists : true / false
Evaluation operators
• $regex – tests whether a given field value matches
a specified regular expression (PCRE)
• $text – performs text search (a text index must exist)
Query Operators
Array operators
• $all – tests whether a given array contains all the specified
items (in any order)
$all : [ value / array / document ]
Example (revisited)
Select movies having specific actors
db.movies.find(
{ actors: [ ObjectId("5"), ObjectId("7") ] }
)
db.movies.find(
{ actors: { $all: [ ObjectId("5"), ObjectId("7") ] } }
)
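The difference between the two queries shows on the sample movie "Vratné lahve", whose actors array is stored as [ ObjectId("7"), ObjectId("5") ]. The sketch below contrasts whole-array equality with the $all test. Both function names are hypothetical, and plain strings again stand in for ObjectId values.

```javascript
const lahve = { title: "Vratné lahve", actors: ["7", "5"] };

// Value equality on the whole array: the same items in the same order.
function equalsArray(actual, expected) {
  return actual.length === expected.length &&
    actual.every((item, i) => item === expected[i]);
}

// $all: every requested item occurs somewhere in the array.
function containsAll(actual, required) {
  return required.every((item) => actual.includes(item));
}

console.log(equalsArray(lahve.actors, ["5", "7"])); // false
console.log(containsAll(lahve.actors, ["5", "7"])); // true
```

The first query therefore misses this movie because of the item order, while the $all variant selects it.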
Query Operators
Array operators
• $size – tests the size of a given array against a fixed number
(and not, e.g., a range, unfortunately)
$size : size
Query Operators
Logical operators
• $and, $or
$and / $or : [ query , query , … ]
Querying Arrays
Condition based on value equality is satisfied when...
• the given field as a whole is identical to the provided value,
or
• at least one item of the array is identical to the provided value
Querying Arrays
Condition based on query operators is satisfied when...
• the given field as a whole satisfies all the involved operators,
or
• each of the involved operators is satisfied by at least one item
of the given array
note, however, that this item need not be the same for all the
individual operators
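The last remark is easy to overlook, so here is a small sketch of it. Only $gt and $lt are modelled, the ratings array is an invented example, and both functions are hypothetical. The second function shows the stricter behavior in which a single item has to satisfy everything, which is what MongoDB's $elemMatch query operator (not covered above) asks for.

```javascript
// Only two comparison operators are modelled in this sketch.
const comparisons = {
  $gt: (actual, bound) => actual > bound,
  $lt: (actual, bound) => actual < bound,
};

// Each operator must be satisfied by at least one item, not necessarily the same one.
function matchesArray(items, condition) {
  return Object.entries(condition).every(([name, bound]) =>
    items.some((item) => comparisons[name](item, bound)));
}

// Stricter variant: one single item must satisfy all the operators.
function matchesSameItem(items, condition) {
  return items.some((item) =>
    Object.entries(condition).every(([name, bound]) => comparisons[name](item, bound)));
}

const ratings = [2, 9];
const between = { $gt: 4, $lt: 6 };
console.log(matchesArray(ratings, between));    // true: 9 > 4 and 2 < 6
console.log(matchesSameItem(ratings, between)); // false: no item lies between 4 and 6
```

A range condition on an array field can thus select documents in which no individual item falls inside the range.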
Projection
Projection allows us to determine the fields returned in the result
{ field : 1 / true } includes the field
{ field : 0 / false } excludes the field
Projection Operators
Array operators
• $elemMatch – selects the first matching item of an array
This item must satisfy all the operators included in query
When there is no such item, the field is not returned at all
$elemMatch : query
• $slice – selects a subarray given by a number of items or by [ skip , count ]
$slice : count
$slice : [ skip , count ]
Projection: Examples
Find a particular movie, select its identifier, title and actors
db.movies.find(
  { _id: ObjectId("2") },
  { title: true, actors: true }
)
Query result
{
  _id: ObjectId("2"),
  title: "Samotáři",
  actors: [ ObjectId("6"), ObjectId("4"), ObjectId("5") ]
}
Find movies from 2000, select their titles and the last two actors
db.movies.find(
  { year: 2000 },
  {
    title: 1, _id: 0,
    actors: { $slice: -2 }
  }
)
Query result
{
  title: "Samotáři",
  actors: [ ObjectId("4"), ObjectId("5") ]
}
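The effect of $slice in the second example can be reproduced on a plain array. The sketch below is an in-memory approximation with a hypothetical helper name; it covers a positive count (first items), a negative count (last items) and the [ skip , count ] form, and it is not meant to capture every edge case of the real operator.

```javascript
// Hypothetical helper mirroring the two forms of the $slice projection.
function sliceProjection(items, spec) {
  if (Array.isArray(spec)) {
    const [skip, count] = spec;
    // A negative skip counts from the end of the array.
    const start = skip < 0 ? Math.max(items.length + skip, 0) : skip;
    return items.slice(start, start + count);
  }
  // A positive count takes the first items, a negative count the last ones.
  return spec >= 0 ? items.slice(0, spec) : items.slice(spec);
}

const samotariActors = ["6", "4", "5"];
console.log(sliceProjection(samotariActors, -2));     // [ '4', '5' ]
console.log(sliceProjection(samotariActors, 1));      // [ '6' ]
console.log(sliceProjection(samotariActors, [1, 1])); // [ '4' ]
```

The first call corresponds to { $slice: -2 } and yields the last two actors, as in the query result.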
Modifiers
Modifiers change the order and number of returned documents
• sort – orders the documents in the result
• limit – returns at most a certain number of documents
limit ( count )
All the modifiers are optional, can be chained in any order (without
any implications), but must all be specified before any documents
are retrieved via a given cursor
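The chaining order has no effect because the modifiers are only recorded when called and are evaluated later in a fixed sequence: sort first, then skip, then limit. The class below is a minimal in-memory model of that behavior. SketchCursor and toArray are illustrative names for this sketch and do not reproduce the real cursor implementation.

```javascript
// Hypothetical in-memory cursor: modifiers are stored when called and applied
// in a fixed order (sort, then skip, then limit) once documents are retrieved.
class SketchCursor {
  constructor(docs) {
    this.docs = docs;
    this.sortSpec = null;
    this.skipCount = 0;
    this.limitCount = Infinity;
  }
  sort(spec) { this.sortSpec = spec; return this; }
  skip(count) { this.skipCount = count; return this; }
  limit(count) { this.limitCount = count; return this; }
  toArray() {
    const result = [...this.docs];
    if (this.sortSpec) {
      const fields = Object.entries(this.sortSpec);
      result.sort((a, b) => {
        for (const [field, direction] of fields) {
          if (a[field] < b[field]) return -direction;
          if (a[field] > b[field]) return direction;
        }
        return 0;
      });
    }
    return result.slice(this.skipCount, this.skipCount + this.limitCount);
  }
}

const titleDocs = [
  { title: "Vratné lahve" }, { title: "Samotáři" }, { title: "Medvídek" },
];
const sortThenLimit = new SketchCursor(titleDocs).sort({ title: -1 }).limit(2).toArray();
const limitThenSort = new SketchCursor(titleDocs).limit(2).sort({ title: -1 }).toArray();
console.log(sortThenLimit.map((d) => d.title)); // [ 'Vratné lahve', 'Samotáři' ]
console.log(limitThenSort.map((d) => d.title)); // the same two titles
```

In particular, limit followed by sort does not mean "sort the first two documents": the limit is always applied to the sorted result.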
Modifiers
Sort modifier orders the documents in the result
sort( { field : 1 / -1 } )
1 for ascending, -1 for descending order
Lecture Conclusion
MongoDB
• Document database for JSON documents
• Sharding with master-slave replication architecture
Query functionality
• CRUD operations
Insert, find, update, remove
Complex filtering conditions
• MapReduce
• Index structures