Lab 08 MongoDB MFF
http://www.ksi.mff.cuni.cz/~svoboda/courses/171-NDBI040/
5. 12. 2017
NDBI040: Big Data Management and NoSQL Databases | Practical Class 8: MongoDB | 5. 12. 2017
CRUD Operations
Overview
• db.collection.insert()
Inserts a new document into a collection
• db.collection.update()
Modifies an existing document / documents or
inserts a new one
• db.collection.remove()
Deletes an existing document / documents
• db.collection.find()
Finds documents based on filtering conditions
Projection and / or sorting may be applied too
Mongo Shell
Connect to our NoSQL server
• SSH / PuTTY and SFTP / WinSCP
• nosql.ms.mff.cuni.cz:42222
Start mongo shell
• mongo
Try several basic commands
• help
Displays a brief description of database commands
• exit
quit()
Closes the current client connection
Databases
Switch to your database
• use login
db = db.getSiblingDB('login')
Use your login name as a name for your database
List all the existing databases
• show databases
show dbs
db.adminCommand('listDatabases')
Your database will be created later on implicitly
Collections
Create a new collection for actors
• db.createCollection("actors")
Suitable when creating collections with specific options
since collections can also be created implicitly
List all collections in your database
• show collections
db.getCollectionNames()
Insert Operation
Inserts a new document / documents into a given collection
db.collection.insert(document | [ document, ... ] [, options])
• Parameters
Document: one or more documents to be inserted
Options
Insert Operation
Insert a few new documents into the collection of actors
db.actors.insert({ _id: "trojan", name: "Ivan Trojan" })
Update Operation
Modifies / replaces an existing document / documents
db.collection.update(query, update [, options])
• Parameters
Query: description of documents to be updated
Update: modification actions to be applied
Options
Update operators
• $set, $unset, $rename, $inc, $mul, $currentDate,
$push, $addToSet, $pop, $pull, …
Update Operation
Update the document of actor Ivan Trojan
db.actors.update(
  { _id: "trojan" },
  { name: "Ivan Trojan", year: 1964 }
)
db.actors.update(
  { name: "Ivan Trojan", year: { $lt: 2000 } },
  { name: "Ivan Trojan", year: 1964 }
)
Update Operation
Use the update method to insert a new actor
• When the upsert option is enabled and no document matches the query,
a new document is inserted instead
db.actors.update(
  { _id: "geislerova" },
  { name: "Anna Geislerova" },
  { upsert: true }
)
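The upsert decision can be illustrated without a server. The following is a minimal plain-JavaScript sketch, assuming a collection is modeled as a Map keyed by _id; the helper name upsertById is hypothetical and this is not MongoDB's implementation. It only mirrors the behavior that the identifier given in the query is reused for the newly inserted document.

```javascript
// Hypothetical in-memory stand-in for a collection, keyed by _id.
// Emulates a replacement-style update with { upsert: true }.
function upsertById(collection, id, replacement) {
  const existed = collection.has(id);
  // The _id from the query is kept; all other fields come from the replacement.
  collection.set(id, Object.assign({ _id: id }, replacement));
  return existed ? "updated" : "inserted";
}

const actors = new Map([["trojan", { _id: "trojan", name: "Ivan Trojan" }]]);

// No document with this _id exists yet, so a new one is inserted.
const first = upsertById(actors, "geislerova", { name: "Anna Geislerova" });
// The same call now finds the document and replaces it.
const second = upsertById(actors, "geislerova", { name: "Anna Geislerova", year: 1976 });

console.log(first, second, actors.get("geislerova"));
```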
Update Operation
Try to modify the document identifier of an existing document
• Your request will be rejected since
document identifiers are immutable
db.actors.update(
  { _id: "trojan" },
  { _id: 1, name: "Ivan Trojan", year: 1964 }
)
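Two behaviors of replacement-style updates shown above can be sketched in plain JavaScript: a replacement overwrites every field except _id, and an attempt to change _id is rejected. The helper replaceDocument and the rating field are hypothetical, chosen only for illustration; the real check happens inside the server.

```javascript
// Hypothetical helper emulating a replacement-style update on a plain object.
function replaceDocument(existing, replacement) {
  // A replacement may repeat the same _id, but it must not change it.
  if ("_id" in replacement && replacement._id !== existing._id) {
    throw new Error("the _id field is immutable");
  }
  // Only _id survives; every other field comes from the replacement.
  return Object.assign({ _id: existing._id }, replacement);
}

const before = { _id: "trojan", name: "Ivan Trojan", rating: 3 };
const after = replaceDocument(before, { name: "Ivan Trojan", year: 1964 });
console.log(after); // the rating field is gone because it was not in the replacement

let rejected = false;
try {
  replaceDocument(before, { _id: 1, name: "Ivan Trojan", year: 1964 });
} catch (e) {
  rejected = true; // changing _id is refused
}
console.log(rejected);
```

Keeping fields such as rating requires update operators like $set rather than a replacement document.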
Update Operation
Update the document of actor Ivan Trojan
db.actors.update(
  { _id: "trojan" },
  {
    $set: { year: 1964, age: 52 },
    $inc: { rating: 1 },
    $push: { movies: { $each: [ "samotari", "medvidek" ] } }
  }
)
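To see what the three operators in this update do to a document, here is a minimal plain-JavaScript sketch. The helper applyUpdate is hypothetical and supports only $set, $inc and $push (with or without $each); it is an emulation for study purposes, not the server's update engine.

```javascript
// Hypothetical helper emulating $set, $inc and $push on a plain object.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy of plain JSON data
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // creates or overwrites the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount; // a missing field starts from 0
  }
  for (const [field, spec] of Object.entries(update.$push || {})) {
    const hasEach = spec !== null && typeof spec === "object" && Array.isArray(spec.$each);
    const items = hasEach ? spec.$each : [spec]; // $each appends several items at once
    result[field] = (result[field] || []).concat(items);
  }
  return result;
}

const updated = applyUpdate(
  { _id: "trojan", name: "Ivan Trojan" },
  {
    $set: { year: 1964, age: 52 },
    $inc: { rating: 1 },
    $push: { movies: { $each: ["samotari", "medvidek"] } }
  }
);
console.log(updated);
```

Unlike a replacement document, the operators leave untouched fields such as name in place.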
Save Operation
Replaces an existing / inserts a new document
db.collection.save(document [, options])
• Parameters
Document: document to be modified / inserted
Options
Save Operation
Use the save method to insert new actors
• The document identifier must either be omitted from the document
or not yet exist in the collection
db.actors.save({ name: "Tatiana Vilhelmova" })
Remove Operation
Removes a document / documents from a given collection
db.collection.remove(query [, options])
• Parameters
Query: description of documents to be removed
Options
Remove Operation
Remove selected documents from the collection of actors
db.actors.remove({ _id: "geislerova" })
db.actors.remove(
  { year: { $lt: 2000 } },
  { justOne: true }
)
Sample Data
Insert the following actors into your emptied collection
{ _id: "trojan",
name: "Ivan Trojan", year: 1964,
movies: [ "samotari", "medvidek" ] }
{ _id: "machacek",
name: "Jiri Machacek", year: 1966,
movies: [ "medvidek", "vratnelahve", "samotari" ] }
{ _id: "schneiderova",
name: "Jitka Schneiderova", year: 1973,
movies: [ "samotari" ] }
{ _id: "sverak",
name: "Zdenek Sverak", year: 1936,
movies: [ "vratnelahve" ] }
{ _id: "geislerova",
name: "Anna Geislerova", year: 1976 }
Find Operation
Selects documents from a given collection
db.collection.find([query [, projection]])
• Parameters
Query: description of documents to be selected
Projection: fields to be included / excluded in the result
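The effect of an inclusion projection can be sketched in plain JavaScript. The helper project below is hypothetical and handles only inclusion of top-level fields; it reflects the rule that _id is returned by default unless it is excluded explicitly with _id: 0.

```javascript
// Hypothetical helper emulating an inclusion projection such as { name: 1 }.
function project(doc, projection) {
  const out = {};
  // _id is part of the result unless the projection says _id: 0.
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const actor = { _id: "trojan", name: "Ivan Trojan", year: 1964, movies: ["samotari", "medvidek"] };

const withId = project(actor, { name: 1 });            // keeps _id and name
const withoutId = project(actor, { name: 1, _id: 0 }); // keeps name only
console.log(withId, withoutId);
```

In the shell, the corresponding calls would pass the same projection documents as the second argument of find.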
Querying
Execute and explain the meaning of the following queries
db.actors.find()
db.actors.find({ })
Querying
Execute and explain the meaning of the following queries
db.actors.find({ $or: [ { year: 1964 }, { rating: { $gte: 3 } } ] })
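When explaining these queries, it may help to see the matching rules spelled out. The following plain-JavaScript sketch uses a hypothetical matches function that supports only equality on scalar fields, $lt, $gte and $or; the rating values are assumptions added for illustration, since the sample data defines none.

```javascript
// Hypothetical matcher covering equality, $lt, $gte and $or only.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    const value = doc[key];
    if (cond !== null && typeof cond === "object") {
      // An operator document such as { $gte: 3 }; all operators must hold.
      return Object.entries(cond).every(([op, operand]) => {
        if (op === "$lt") return value < operand;   // false for a missing field
        if (op === "$gte") return value >= operand; // false for a missing field
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === cond; // plain equality on a scalar field
  });
}

// Sample actors; the rating values are illustrative assumptions.
const actors = [
  { _id: "trojan", year: 1964 },
  { _id: "machacek", year: 1966, rating: 4 },
  { _id: "sverak", year: 1936, rating: 2 }
];

const query = { $or: [{ year: 1964 }, { rating: { $gte: 3 } }] };
const ids = actors.filter((a) => matches(a, query)).map((a) => a._id);
console.log(ids);

// An empty query places no conditions, so it selects every document.
const all = actors.filter((a) => matches(a, {}));
console.log(all.length);
```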
Index Structures
Motivation
• Full collection scan must be conducted when searching
for documents unless an appropriate index exists
Primary index
• Unique index on values of the _id field
• Created automatically
Secondary indexes
• Created manually for values of a given key field / fields
• Always within just a single collection
Index Structures
Secondary index creation
{ field : 1 | -1 | text | hashed | 2d | 2dsphere }
Index Structures
Index types
• 1, -1 – standard ascending / descending value indexes
Both scalar values and embedded documents can be indexed
• hashed – hash values of a single field are indexed
• text – basic full-text index
• 2d – points in planar geometry
• 2dsphere – points in spherical geometry
Index Structures
Index forms
• One key / multiple keys (composed index)
• Ordinary fields / array fields (multi-key index)
Index properties
• Unique – duplicate values are rejected (cannot be inserted)
• Partial – only certain documents are indexed
• Sparse – documents without a given field are ignored
• TTL – documents are removed when a timeout elapses
Index Structures
Execute the following query and study its execution plan
db.actors.find({ movies: "medvidek" })
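The idea behind a multi-key index on the movies array can be sketched in plain JavaScript. This is a simplified stand-in that uses a Map from each array value to the identifiers of the documents containing it; the helper buildMultikeyIndex is hypothetical and the real index structure in MongoDB is different. The point is only that a lookup no longer has to scan every document.

```javascript
// Sample actors from this class (fields other than _id and movies omitted).
const actors = [
  { _id: "trojan", movies: ["samotari", "medvidek"] },
  { _id: "machacek", movies: ["medvidek", "vratnelahve", "samotari"] },
  { _id: "schneiderova", movies: ["samotari"] },
  { _id: "sverak", movies: ["vratnelahve"] },
  { _id: "geislerova" } // no movies field, so nothing is indexed for it
];

// Hypothetical helper: one index entry per array element, as in a multi-key index.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const moviesIndex = buildMultikeyIndex(actors, "movies");

// Answering { movies: "medvidek" } becomes a single lookup instead of a full scan.
const hits = moviesIndex.get("medvidek") || [];
console.log(hits);
```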
MapReduce
Executes a MapReduce job on a selected collection
db.collection.mapReduce(map, reduce, options)
• Parameters
Map: JavaScript implementation of the Map function
Reduce: JavaScript implementation of the Reduce function
Options
MapReduce
Map function
• Current document is accessible via this
• emit(key, value) is used for emissions
Reduce function
• Intermediate key and values are provided as arguments
• Reduced value is published via return
Options
• query: only matching documents are considered
• sort: they are processed in a specific order
• limit: at most a given number of them is processed
• out: output is stored into a given collection
MapReduce: Example
Count the number of movies filmed in each year, starting in 2005
db.movies.mapReduce(
  function() {
    emit(this.year, 1);
  },
  function(key, values) {
    return Array.sum(values);
  },
  {
    query: { year: { $gte: 2005 } },
    sort: { year: 1 },
    out: "statistics"
  }
)
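The flow of this job (filter, sort, map with emit, group by key, reduce) can be traced in plain JavaScript without a server. The sketch below is an emulation under stated assumptions, not MongoDB's implementation: runMapReduce and the sample movies are hypothetical, query and sort are passed as a predicate and a comparator instead of documents, and Array.sum (a shell helper) is replaced by a standard reduce call.

```javascript
let groups = new Map(); // intermediate key -> list of emitted values

// Stand-in for the emit() function that is available inside the Map function.
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Hypothetical driver emulating the stages of a MapReduce job.
function runMapReduce(docs, map, reduce, options = {}) {
  groups = new Map();
  const selected = docs.filter(options.query || (() => true));
  if (options.sort) selected.sort(options.sort);
  selected.forEach((doc) => map.call(doc)); // the current document is `this`
  const output = [];
  for (const [key, values] of groups) {
    // Reduce is not called for a key that has only a single value.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    output.push({ _id: key, value: value });
  }
  return output;
}

// Hypothetical sample data; only the year field matters for this job.
const movies = [
  { _id: "m1", year: 2007 },
  { _id: "m2", year: 2000 },
  { _id: "m3", year: 2005 },
  { _id: "m4", year: 2007 }
];

const statistics = runMapReduce(
  movies,
  function () { emit(this.year, 1); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); },
  { query: (doc) => doc.year >= 2005, sort: (a, b) => a.year - b.year }
);
console.log(statistics);
```

Each output entry has the same shape as the documents stored by the out option: the key in _id and the reduced result in value.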
MapReduce
Implement and execute the following MapReduce jobs
• Find a list of actors (their names sorted alphabetically)
for each year (they were born)
values.sort()
Use out: { inline: 1 } option
• Calculate the overall number of actors for each movie
this.movies.forEach(function(m) { … })
Array.sum(values)
Use out: { inline: 1 } option once again
References
Documentation
• https://docs.mongodb.com/v3.2/