
Lab-MongoDB Aggregation and MapReduce

This document provides instructions and examples for using MongoDB aggregation and MapReduce. It includes examples of $match, $group, $sort, $lookup, $unwind, and MapReduce stages. The document demonstrates how to group and filter data, perform joins between collections using $lookup, and unwind embedded arrays. References for further reading on MongoDB aggregation and MapReduce are also provided.

Uploaded by

Damanpreet kaur

Lab: MongoDB Aggregation and MapReduce

Version Oct 26, 2022


This lab accompanies the slides MongoDB-Aggregation and MapReduce and is for practice use.

1. Switch to the test database using the following command:


use test

2. Use insertMany() to create the orders collection and populate it:


db.orders.insertMany( [
{ cust_id: "A123", amount: 500, status: "A" },
{ cust_id: "A123", amount: 250, status: "A" },
{ cust_id: "B212", amount: 200, status: "A" },
{ cust_id: "A123", amount: 300, status: "D" }
])

3. $match stage
db.orders.aggregate( [
{ $match : { status : "A" } }
])

db.orders.aggregate( [
{ $match : { cust_id: "A123" , status : "A" } }
])

db.orders.aggregate ( [
{ $match : { $or: [ {amount: {$gte: 300}} , {status : "D"} ] } }
])

What are the equivalent commands without using $match?
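One likely answer to the question above is db.orders.find() with the same filter document, since $match accepts the same query syntax as find(). To make the three filters concrete, here is a minimal sketch in plain JavaScript (Node.js, not mongosh) that applies them to an in-memory copy of the orders documents. The variable names are assumptions made for illustration only.

```javascript
// In-memory copy of the documents inserted in step 2.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// { status: "A" }
const statusA = orders.filter((o) => o.status === "A");

// { cust_id: "A123", status: "A" }: both conditions must hold (implicit AND)
const custA123StatusA = orders.filter(
  (o) => o.cust_id === "A123" && o.status === "A"
);

// { $or: [ { amount: { $gte: 300 } }, { status: "D" } ] }: either condition may hold
const largeOrStatusD = orders.filter(
  (o) => o.amount >= 300 || o.status === "D"
);

console.log(statusA.length, custA123StatusA.length, largeOrStatusD.length);
```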

4. $group stage
db.orders.aggregate ( [
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])
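As a rough illustration of what this $group stage computes, the following plain JavaScript sketch (Node.js, not mongosh) groups an in-memory copy of the orders documents by cust_id and sums amount. The helper groupSum is hypothetical and exists only for this example. Note that MongoDB does not guarantee the order of $group output, whereas this sketch returns groups in first-seen order.

```javascript
// In-memory copy of the documents inserted in step 2.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Hypothetical helper: mimics
// { $group: { _id: "$<key>", total: { $sum: "$<field>" } } }
function groupSum(docs, key, field) {
  const totals = new Map();
  for (const doc of docs) {
    // Start each new group at 0, then add this document's value.
    totals.set(doc[key], (totals.get(doc[key]) ?? 0) + doc[field]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
}

const totalsByCustomer = groupSum(orders, "cust_id", "amount");
console.log(totalsByCustomer);
```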

5. $group stage (multiple fields)


db.orders.aggregate ( [
{ $group: {
_id: { cust_id: "$cust_id", status: "$status" },
total: { $sum: "$amount" } }
}
])
6. A two-stage aggregation pipeline
db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
])

7. $count aggregation accumulator


db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: { _id: null, order_count: { $sum: 1 } } }
])

Starting in MongoDB 5.0, the $count accumulator can be used in the $group stage to return the
number of documents in each group directly. The above command can be rewritten as follows:
db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: { _id: null, order_count: { $count: { } } } }
])

Another example, counting the orders for each status:


db.orders.aggregate ( [
{ $group: { _id: "$status", order_count: { $count: { } } } }
])

8. $sort
db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
{ $sort : { _id : 1 } }
] );

9. $min, $max, $avg


db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: { _id: null, avg_amount: { $avg: "$amount" } } }
] );

db.orders.aggregate ( [
{ $match : { status : "A" } },
{ $group: {_id: "$cust_id", avg_amount: { $avg: "$amount" } } }
])
What is the difference between the above two commands?
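One way to explore the question is to reproduce both groupings by hand. The sketch below is plain JavaScript (Node.js, not mongosh) over an in-memory copy of the orders documents: grouping with _id: null puts every matched document into a single group, while _id: "$cust_id" produces one group per customer. The helper average and the variable names are assumptions for illustration.

```javascript
// In-memory copy of the documents inserted in step 2.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Hypothetical helper: arithmetic mean of a non-empty array of numbers.
function average(numbers) {
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

// $match: { status: "A" }
const activeOrders = orders.filter((o) => o.status === "A");

// _id: null -> one group containing all matched documents.
const overallAvg = average(activeOrders.map((o) => o.amount));

// _id: "$cust_id" -> one group (and one average) per customer.
const amountsByCustomer = {};
for (const o of activeOrders) {
  (amountsByCustomer[o.cust_id] ??= []).push(o.amount);
}
const avgByCustomer = {};
for (const [custId, amounts] of Object.entries(amountsByCustomer)) {
  avgByCustomer[custId] = average(amounts);
}

console.log(overallAvg, avgByCustomer);
```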

An example of $max:


db.orders.aggregate ( [
{ $group: { _id: "$cust_id", max_amount: { $max: "$amount" } } }
])

10. $unwind Operator


a. Delete all documents from inventory
db.inventory.deleteMany( {} )

b. Insert a document into inventory


db.inventory.insertOne( { "_id" : 1, "item" : "ABC1", sizes: [ "S", "M", "L"] } )

The following aggregation uses the $unwind stage to output a document for each element
in the sizes array:
db.inventory.aggregate( [
{ $unwind: "$sizes" }
])

What is the output?

c. Create a sample collection named inventory2 with the following documents:

db.inventory2.insertMany( [
{ "_id" : 1, "item" : "ABC", price: NumberDecimal("80"), "sizes" : [ "S", "M", "L" ] },
{ "_id" : 2, "item" : "EFG", price: NumberDecimal("120"), "sizes" : [ ] },
{ "_id" : 3, "item" : "IJK", price: NumberDecimal("160"), "sizes" : "M" },
{ "_id" : 4, "item" : "LMN" , price: NumberDecimal("10") },
{ "_id" : 5, "item" : "XYZ", price: NumberDecimal("5.75"), "sizes" : null }
])

What is the output of the following command?


db.inventory2.aggregate( [
{ $unwind: "$sizes" }
])

What is the output of the following command?


db.inventory2.aggregate( [
{ $unwind: { path: "$sizes", preserveNullAndEmptyArrays: true } }
])
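To reason about the two outputs, it may help to emulate $unwind in plain JavaScript (Node.js, not mongosh). The sketch below is an approximation under stated assumptions: the helper unwind is hypothetical, prices are plain numbers rather than NumberDecimal values, and only the path and preserveNullAndEmptyArrays options are modelled.

```javascript
// In-memory copy of inventory2 (prices simplified to plain numbers).
const inventory2 = [
  { _id: 1, item: "ABC", price: 80, sizes: ["S", "M", "L"] },
  { _id: 2, item: "EFG", price: 120, sizes: [] },
  { _id: 3, item: "IJK", price: 160, sizes: "M" },
  { _id: 4, item: "LMN", price: 10 },
  { _id: 5, item: "XYZ", price: 5.75, sizes: null },
];

// Hypothetical helper approximating { $unwind: { path, preserveNullAndEmptyArrays } }.
function unwind(docs, field, preserveNullAndEmptyArrays = false) {
  return docs.flatMap((doc) => {
    const value = doc[field];
    if (Array.isArray(value) && value.length > 0) {
      // One output document per array element.
      return value.map((element) => ({ ...doc, [field]: element }));
    }
    if (value === undefined || value === null || Array.isArray(value)) {
      // Missing field, null, or empty array: dropped unless preserved.
      if (!preserveNullAndEmptyArrays) return [];
      if (Array.isArray(value)) {
        // A preserved empty array is output without the field.
        const { [field]: _omit, ...rest } = doc;
        return [rest];
      }
      return [doc];
    }
    // A non-array value is treated as a single-element array.
    return [doc];
  });
}

const dropped = unwind(inventory2, "sizes");
const preserved = unwind(inventory2, "sizes", true);
console.log(dropped.length, preserved.length);
```

Comparing the two results shows which documents disappear by default (those whose sizes field is missing, null, or an empty array) and reappear when preserveNullAndEmptyArrays is true.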
The following $unwind operation uses the includeArrayIndex option to include the array
index in the output

db.inventory2.aggregate( [
{
$unwind:
{
path: "$sizes",
includeArrayIndex: "arrayIndex"
}
}
])

You may also try to add preserveNullAndEmptyArrays: true to see what will happen.

db.inventory2.aggregate( [
{
$unwind:
{
path: "$sizes",
preserveNullAndEmptyArrays: true,
includeArrayIndex: "arrayIndex"
}
}
])

What is the difference?

For a detailed explanation of the $unwind stage, see
https://www.mongodb.com/docs/manual/reference/operator/aggregation/unwind/

d. Group by Unwound Values

Check the following command

db.inventory2.aggregate( [
// First Stage
{
$unwind: { path: "$sizes", preserveNullAndEmptyArrays: true }
},
// Second Stage
{
$group:
{
_id: "$sizes",
averagePrice: { $avg: "$price" }
}
},
// Third Stage
{
$sort: { "averagePrice": -1 }
}
])

What is the output?

How about the following? What’s the difference?


db.inventory2.aggregate( [
// First Stage
{
$unwind: "$sizes",
},
// Second Stage
{
$group:
{
_id: "$sizes",
averagePrice: { $avg: "$price" }
}
},
// Third Stage
{
$sort: { "averagePrice": -1 }
}
])
e. Unwind Embedded Arrays

Create a sample collection named sales with the following documents:

db.sales.insertMany([
{
_id: "1",
"items" : [
{
"name" : "pens",
"tags" : [ "writing", "office", "school", "stationary" ],
"price" : NumberDecimal("12.00"),
"quantity" : NumberInt("5")
},
{
"name" : "envelopes",
"tags" : [ "stationary", "office" ],
"price" : NumberDecimal("1.95"),
"quantity" : NumberInt("8")
}
]
},
{
_id: "2",
"items" : [
{
"name" : "laptop",
"tags" : [ "office", "electronics" ],
"price" : NumberDecimal("800.00"),
"quantity" : NumberInt("1")
},
{
"name" : "notepad",
"tags" : [ "stationary", "school" ],
"price" : NumberDecimal("14.95"),
"quantity" : NumberInt("3")
}
]
}
])

What are the embedded arrays?


The following operation groups the items sold by their tags and calculates the total sales
amount for each tag.

db.sales.aggregate( [
// First Stage
{ $unwind: "$items" },

// Second Stage
{ $unwind: "$items.tags" },

// Third Stage
{
$group:
{
_id: "$items.tags",
totalSalesAmount:
{
$sum: { $multiply: [ "$items.price", "$items.quantity" ] }
}
}
}
])
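The pipeline above can be traced by hand with nested loops: the first $unwind yields one row per item, the second yields one row per (item, tag) pair, and $group then sums price times quantity per tag. The following plain JavaScript sketch (Node.js, not mongosh) does this over an in-memory copy of the sales documents. As an assumption made to avoid floating-point rounding, prices are stored as integer cents in a hypothetical priceCents field instead of NumberDecimal.

```javascript
// In-memory copy of the sales documents, with prices in integer cents.
const sales = [
  {
    _id: "1",
    items: [
      { name: "pens", tags: ["writing", "office", "school", "stationary"], priceCents: 1200, quantity: 5 },
      { name: "envelopes", tags: ["stationary", "office"], priceCents: 195, quantity: 8 },
    ],
  },
  {
    _id: "2",
    items: [
      { name: "laptop", tags: ["office", "electronics"], priceCents: 80000, quantity: 1 },
      { name: "notepad", tags: ["stationary", "school"], priceCents: 1495, quantity: 3 },
    ],
  },
];

const totalCentsByTag = {};
for (const sale of sales) {
  for (const item of sale.items) {   // first $unwind: one row per item
    for (const tag of item.tags) {   // second $unwind: one row per (item, tag)
      // $group by tag, $sum of price * quantity
      totalCentsByTag[tag] =
        (totalCentsByTag[tag] ?? 0) + item.priceCents * item.quantity;
    }
  }
}

console.log(totalCentsByTag);
```

Because each item is repeated once per tag, its sales amount counts toward every tag it carries, so the per-tag totals add up to more than the overall sales total.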

11. A MapReduce example

db.orders.mapReduce(
function() { emit(this.cust_id , this.amount); },
function(key, values) { return Array.sum(values) },
{
query: { status: "A" },
out: "order_totals"
}
)

Check the results


db.order_totals.find()

Or you can try


db.orders.mapReduce(
function() { emit(this.cust_id , this.amount); },
function(key,values) { return Array.sum(values) },
{
query: { status: "A" },
out: { inline: 1 }
}
)
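To see what the map and reduce functions do, here is a small in-memory emulation in plain JavaScript (Node.js, not mongosh). It is a sketch under assumptions: the helper mapReduce is hypothetical, emit is passed to the map function as an argument rather than being a global as in MongoDB, and the shell helper Array.sum is replaced by a standard reduce call. Note also that map-reduce has been deprecated since MongoDB 5.0, and the two-stage pipeline in step 6 produces the same per-customer totals.

```javascript
// In-memory copy of the documents inserted in step 2.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Hypothetical emulation of db.collection.mapReduce(map, reduce, { query }).
function mapReduce(docs, mapFn, reduceFn, options) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: run mapFn with `this` bound to each document that passes the query.
  for (const doc of docs.filter(options.query)) {
    mapFn.call(doc, emit);
  }
  // Reduce phase: as in MongoDB, reduce is skipped for keys with a single value.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

const orderTotals = mapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.amount); },
  (key, values) => values.reduce((sum, v) => sum + v, 0),
  { query: (doc) => doc.status === "A" }
);

console.log(orderTotals);
```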

12. $lookup for Join


Create a collection orders with the following documents (if orders already contains documents,
delete them first with db.orders.deleteMany( {} )):
db.orders.insertMany( [
{ _id : 1, item : "almonds", price : 12, quantity : 2 },
{ _id : 2, item : "pecans", price : 20, quantity : 1 },
{ _id : 3 }
])

Create another collection inventory with the following documents (if inventory already contains
documents, delete them first with db.inventory.deleteMany( {} )):
db.inventory.insertMany( [
{ _id : 1, sku : "almonds", description: "product 1", instock : 120 },
{ _id : 2, sku : "bread", description: "product 2", instock : 80 },
{ _id : 3, sku : "cashews", description: "product 3", instock : 60 },
{ _id : 4, sku : "pecans", description: "product 4", instock : 70 },
{ _id : 5, sku : null, description: "Incomplete" },
{ _id : 6 }
])

How do we join orders and inventory on item = sku?

db.orders.aggregate( [
{
$lookup:
{
from: "inventory",
localField: "item",
foreignField: "sku",
as: "inventory_docs"
}
}
])
The operation would correspond to the following pseudo-SQL statement:
SELECT *, inventory_docs
FROM orders
WHERE inventory_docs IN (SELECT *
FROM inventory
WHERE sku = orders.item);
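The join can also be pictured as a left outer join computed by hand. The sketch below is plain JavaScript (Node.js, not mongosh) over in-memory copies of the two collections. The helper lookup is hypothetical. It assumes the equality-match behaviour of $lookup in which a missing localField is treated like null, so the order { _id : 3 } matches the inventory documents whose sku is null or missing.

```javascript
// In-memory copies of the two collections from this step.
const orders = [
  { _id: 1, item: "almonds", price: 12, quantity: 2 },
  { _id: 2, item: "pecans", price: 20, quantity: 1 },
  { _id: 3 },
];
const inventory = [
  { _id: 1, sku: "almonds", description: "product 1", instock: 120 },
  { _id: 2, sku: "bread", description: "product 2", instock: 80 },
  { _id: 3, sku: "cashews", description: "product 3", instock: 60 },
  { _id: 4, sku: "pecans", description: "product 4", instock: 70 },
  { _id: 5, sku: null, description: "Incomplete" },
  { _id: 6 },
];

// Hypothetical helper approximating
// { $lookup: { from, localField, foreignField, as } }.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    // Every local document is kept; matches are collected into an array.
    // Missing fields are compared as null.
    [as]: foreignDocs.filter(
      (f) => (f[foreignField] ?? null) === (doc[localField] ?? null)
    ),
  }));
}

const joined = lookup(orders, inventory, "item", "sku", "inventory_docs");
console.log(JSON.stringify(joined, null, 2));
```

Every order appears exactly once in the result, and inventory_docs is always an array, which may hold zero, one, or several matching inventory documents.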

References
https://www.tutorialspoint.com/mongodb/mongodb_aggregation.htm
https://docs.mongodb.com/manual/
https://appdividend.com/2018/10/25/mongodb-aggregate-example-tutorial/
https://appdividend.com/2018/10/26/mongodb-mapreduce-example-tutorial/
https://docs.mongodb.com/manual/reference/method/db.collection.distinct/
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/

The course materials are only for the use of students enrolled in the course CSIS 3300 at Douglas
College. Sharing this material to a third-party website can lead to a violation of Copyright law.
