4 - MongoDB Aggregation Framework
KTPM-Phạm Quảng Tri
MongoDB
Aggregation Framework
What is the MongoDB aggregation framework?
db.listingsAndReviews.find(
{ amenities : 'Wifi' },
{ price : 1, address : 1, _id : 0 }
)
db.listingsAndReviews.aggregate(
[ { $match : { amenities : 'Wifi' } },
{ $project : { price : 1, address : 1, _id : 0 } } ]
)
Example:
db.solarSystem.aggregate( [
{ $match : { atmosphericComposition : { $in : [/O2/] }, meanTemperature : { $gte : -40, $lte : 40 } } },
{ $project : { _id : 0, name : 1, hasMoons: { $gt : [ '$numberOfMoons', 0 ] } } }
],
{ allowDiskUse : true }
)
• A pipeline is always an array of one or more stages.
• Each stage is composed of one or more aggregation operators or expressions.
• Expressions take either a single argument or an array of arguments.
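The idea of a pipeline as an ordered array of stages can be sketched in plain JavaScript. This is only a toy model under stated assumptions: match, project and runPipeline are hypothetical helpers, not MongoDB APIs, and the sample documents are made up.

```javascript
// Toy model only: each "stage" is a function from an array of documents to a new array.
// These helpers are hypothetical stand-ins, not part of MongoDB.
const match = (predicate) => (docs) => docs.filter(predicate);
const project = (shape) => (docs) => docs.map(shape);

// A pipeline is an ordered array of stages; the output of one stage feeds the next.
const runPipeline = (docs, stages) =>
  stages.reduce((stream, stage) => stage(stream), docs);

// Made-up sample documents.
const planets = [
  { name: 'Earth', numberOfMoons: 1 },
  { name: 'Venus', numberOfMoons: 0 },
  { name: 'Mars', numberOfMoons: 2 },
];

const result = runPipeline(planets, [
  match((d) => d.numberOfMoons > 0),
  project((d) => ({ name: d.name, hasMoons: d.numberOfMoons > 0 })),
]);

console.log(result);
```

Swapping the order of the two stages would change what each stage receives, which is why stage order matters in a real pipeline as well.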
• $match filters the document stream so that only matching documents pass, unmodified, into the next
pipeline stage.
• Place $match as early in the aggregation pipeline as possible, so that later stages process fewer documents and the stage can use indexes.
• $match can be used multiple times in a pipeline.
• $match uses standard MongoDB query operators.
• You cannot use $where with $match.
db.sinhvien.aggregate( [
{ $match : { ten: { $eq: 'Nở' } } },
{ $count: 'TongSoSV' }
])
db.sinhvien.aggregate( [
{ $match : {
$and : [
{ 'lienLac.email' : '[email protected]' },
{ _id : { $eq : '57' } }
]
}
}])
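The explicit $and and $eq in the example above can usually be written in the shorter implicit form, since conditions on different fields in one query document are ANDed and a bare value is shorthand for $eq. A minimal sketch follows; the e-mail address is a hypothetical placeholder, not a value from the collection.

```javascript
// Hypothetical placeholder address, used only for illustration.
const email = 'someone@example.com';

// Explicit form, as in the example above.
const explicitMatch = {
  $match: { $and: [{ 'lienLac.email': email }, { _id: { $eq: '57' } }] },
};

// Implicit form: both conditions sit in one query document and are ANDed.
const implicitMatch = {
  $match: { 'lienLac.email': email, _id: '57' },
};

// In mongosh, either stage could be passed to aggregate, for example:
// db.sinhvien.aggregate([ implicitMatch ])
console.log(JSON.stringify(implicitMatch));
```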
• With the $project stage we can selectively remove and retain fields, reassign existing field values, and
derive entirely new fields.
Example:
db.solarSystem.aggregate( [ { $project : { _id : 0, name : 1, gravity: 1} } ] )
db.solarSystem.aggregate( [ { $project : { _id : 0, name : 1, 'gravity.value' : 1} } ] )
db.solarSystem.aggregate( [ { $project : { _id : 0, name : 1, surfacegravity: '$gravity.value' } } ] )
db.solarSystem.aggregate( [ { $project : { _id : 0, name : 1, new_surfacegravity: {
$multiply: [
{ $divide: [ '$gravity.value', 10 ] }, 100
]
}}}])
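How the last projection is evaluated can be traced by hand in plain JavaScript: the field path '$gravity.value' is resolved first, then the inner $divide, then the outer $multiply. This is a sketch only; resolvePath is a hypothetical helper and the document values are made up.

```javascript
// Hypothetical helper: resolve a field path such as '$gravity.value' against a document.
const resolvePath = (doc, path) =>
  path
    .replace(/^\$/, '')
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);

// Made-up document; the real solarSystem values may differ.
const planet = { name: 'Example', gravity: { value: 25, unit: 'm/s^2' } };

const gravityValue = resolvePath(planet, '$gravity.value');

// The inner $divide is evaluated first, then the outer $multiply.
const newSurfaceGravity = (gravityValue / 10) * 100;

console.log({ name: planet.name, new_surfacegravity: newSurfaceGravity });
```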
Accumulator Expression with $project stage
• Accumulator expressions within $project operate over an array within the
given document.
• Some accumulator expressions: $avg, $min, $max, $sum, …
• The example below uses the 'icecream_data' collection in the dbtest database.
Example:
db.icecream_data.aggregate( [ { $project : { max_high: { $max: '$trends.avg_high_tmp' } } } ] )
[ { _id : ObjectId('59bff494f70ff89cacc36f90'), max_high: 87 } ]
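What $max does inside $project can be mimicked on a single in-memory document. The sketch below assumes that trends is an array of subdocuments, each with an avg_high_tmp field; the month field and all values are made up for illustration.

```javascript
// Made-up document shaped like the icecream_data example; values are illustrative only.
const doc = {
  _id: 1,
  trends: [
    { month: 'January', avg_high_tmp: 42 },
    { month: 'July', avg_high_tmp: 87 },
    { month: 'October', avg_high_tmp: 65 },
  ],
};

// '$trends.avg_high_tmp' on an array of subdocuments yields an array of that field's values.
const highs = doc.trends.map((t) => t.avg_high_tmp);

// $max then reduces that array to a single value for this one document.
const projected = { _id: doc._id, max_high: Math.max(...highs) };

console.log(projected);
```

The accumulator works within each document separately; it does not combine values across documents as it would in $group.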
• $group groups input documents by the specified _id expression and outputs one document for each distinct
group.
• The _id field of each output document contains the unique group-by value.
• The output documents can also contain computed fields that hold the values of accumulator
expressions.
db.movies.aggregate( [
{ $group : {
_id : '$year' ,
'numFilmsThisYear': { $sum: 1 }
} },
{ $sort: { _id : 1} }
])
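The grouping above can be pictured as a loop that keeps one counter per distinct year. A rough in-memory sketch with made-up documents (not how the server implements it):

```javascript
// Made-up sample documents.
const movies = [
  { title: 'A', year: 1999 },
  { title: 'B', year: 2001 },
  { title: 'C', year: 1999 },
];

// One accumulator slot per distinct key; { $sum: 1 } adds 1 for every document in the group.
const counts = new Map();
for (const movie of movies) {
  counts.set(movie.year, (counts.get(movie.year) || 0) + 1);
}

// Emit one output document per group, then sort ascending by _id ({ $sort: { _id: 1 } }).
const grouped = [...counts]
  .map(([year, n]) => ({ _id: year, numFilmsThisYear: n }))
  .sort((a, b) => a._id - b._id);

console.log(grouped);
```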
• Grouping on the number of directors a film has. This shows that we must validate types to protect some
expressions: $size fails on a value that is not an array, so $isArray guards it.
db.movies.aggregate( [
{ $group : {
_id : { numDirectors : { $cond : [ { $isArray : '$directors' }, { $size : '$directors' }, 0 ] } },
numFilms : { $sum : 1 }, averageMetacritic: { $avg : '$metacritic' } } },
{ $sort : { '_id.numDirectors' : -1 } }
])
• Grouping all documents together. By convention, we use null or an empty string as the _id.
db.movies.aggregate( [ { $group : { _id : null, count : { $sum: 1 } } } ] )
Example:
• Finding the top rated genres per year from 1990 to 2015...
db.movies.aggregate( [
{ $match : { 'imdb.rating' : { $gt : 0 }, year: { $gte : 1990, $lte : 2015 }, runtime : { $gte : 90 } } },
{ $unwind : '$genres' },
{ $group : { _id : { year : '$year' , genre : '$genres' }, average_rating : { $avg: '$imdb.rating' } } },
{ $sort : { '_id.year' : -1, average_rating : -1 } }
])
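The $unwind stage in the pipeline above emits one copy of each document per element of its genres array, which is what lets $group treat each genre separately. A rough plain-JavaScript equivalent with made-up documents, assuming genres is always an array when present:

```javascript
// Made-up sample documents.
const movies = [
  { title: 'A', year: 1999, genres: ['Drama', 'Comedy'] },
  { title: 'B', year: 1999, genres: ['Drama'] },
  { title: 'C', year: 2000, genres: [] },
];

// Rough equivalent of { $unwind: '$genres' }: one output document per array element.
// A document with an empty or missing array produces no output, matching the default behaviour.
const unwound = movies.flatMap((m) =>
  (m.genres || []).map((g) => ({ ...m, genres: g }))
);

console.log(unwound);
```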
• $lookup performs a left outer join to an unsharded collection in the same database to filter in documents from the
'joined' collection for processing.
• To each input document, the $lookup stage adds a new array field whose elements are the matching
documents from the 'joined' collection.
db.air_alliances.aggregate( [
{ $lookup : {
from : 'air_airlines',
localField : 'airlines',
foreignField : 'name',
as : 'airlines' }
}
])
$lookup stage
Example:
db.air_alliances.aggregate( [
{ $match : { name : 'SkyTeam' } },
{ $lookup : {
from : 'air_airlines',
localField : 'airlines',
foreignField : 'name',
as : 'airlines' }
}
])
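The join semantics can be sketched in memory. The collections below are made-up stand-ins for air_alliances and air_airlines; the sketch assumes airlines holds an array of airline names, so each element is compared with the foreign name field.

```javascript
// Made-up stand-ins for the two collections.
const airAlliances = [
  { name: 'SkyTeam', airlines: ['Air France', 'KLM', 'Unknown Air'] },
  { name: 'Empty Alliance', airlines: ['Nobody Airways'] },
];
const airAirlines = [
  { name: 'Air France', country: 'France' },
  { name: 'KLM', country: 'Netherlands' },
];

// Left outer join: every input document is kept, even when nothing matches.
const joined = airAlliances.map((alliance) => ({
  ...alliance,
  // as: 'airlines' reuses the local field name, so the array of names
  // is overwritten by the array of matching documents.
  airlines: airAirlines.filter((airline) =>
    alliance.airlines.includes(airline.name)
  ),
}));

console.log(JSON.stringify(joined, null, 2));
```

An alliance with no matching airlines still appears in the output, with an empty array in the as field.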
• let: Optional. Specifies variables to use in the pipeline stages. Use variable expressions to access the
fields of the documents that are input to the $lookup stage.
• pipeline: Determines the resulting documents from the joined collection. To return all documents, specify an
empty pipeline []. The pipeline cannot directly access the input document fields. Instead, define variables for
those fields with the let option and reference them in the pipeline stages as $$<variable>.
Collection: Restaurants
Collection: Orders
Example: list of warehouses with product quantity greater than or equal to the ordered product quantity.
Collection: item_orders
Collection: warehouses
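A sketch of how the warehouse example might be written with let and pipeline. The field names item and ordered (in item_orders) and stock_item and instock (in warehouses), as well as the output name stockdata, are assumptions, since the collection contents are not reproduced here.

```javascript
// Assumed field names: item, ordered (item_orders); stock_item, instock (warehouses).
const pipeline = [
  {
    $lookup: {
      from: 'warehouses',
      // Expose fields of the input (order) document as variables.
      let: { order_item: '$item', order_qty: '$ordered' },
      pipeline: [
        {
          // $expr is needed so that $match can compare fields with the $$ variables.
          $match: {
            $expr: {
              $and: [
                { $eq: ['$stock_item', '$$order_item'] },
                { $gte: ['$instock', '$$order_qty'] },
              ],
            },
          },
        },
        { $project: { stock_item: 0, _id: 0 } },
      ],
      as: 'stockdata', // hypothetical output field name
    },
  },
];

// In mongosh this would be run as:
// db.item_orders.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Inside the sub-pipeline, a single $ refers to fields of the warehouses documents, while $$ refers to the variables defined in let.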
db.movies.aggregate( [
{ $group : { _id : '$year', count : { $sum : 1 }, title : { '$first' : '$title' } } },
{ $sort : { count : -1 } },
{ $merge : {
into : { db : 'reporting', coll : 'movies2' },
on : '_id',
whenMatched : 'merge',
whenNotMatched : 'insert' } }
])