0% found this document useful (0 votes)
9 views13 pages

6 Explanation

The document outlines the stages of an aggregation pipeline in MongoDB, detailing operators such as $match, $group, $project, $sort, $limit, $skip, and $unwind. It provides an example of how to filter, unwind, group, sort, project, and skip documents to calculate average ratings and total reviews for restaurants in a specific location. The necessity of the $unwind stage is emphasized, as it allows for accurate aggregation by treating each review as a separate document.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views13 pages

6 Explanation

The document outlines the stages of an aggregation pipeline in MongoDB, detailing operators such as $match, $group, $project, $sort, $limit, $skip, and $unwind. It provides an example of how to filter, unwind, group, sort, project, and skip documents to calculate average ratings and total reviews for restaurants in a specific location. The necessity of the $unwind stage is emphasized, as it allows for accurate aggregation by treating each review as a separate document.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 13

The operators and aggregation stages:

Aggregation Stages:
1. $match: Filters documents.
{ $match: { <query> } }

2. $group: Groups documents by some specified expression and outputs a document for each
distinct grouping.

{
$group: {
_id: <expression>,
<field1>: { <accumulator1>: <expression1> },
...
}
}

3. $project: Reshapes each document by including, excluding, or adding new fields.


{
$project: {
<field1>: <expression1>,
...
}
}

4. $sort: Sorts the documents.

{ $sort: { <field>: <order> } }


- `<order>`: `1` for ascending, `-1` for descending.

5. $limit: Limits the number of documents.


{ $limit: <number> }

6. $skip: Skips a specified number of documents.

{ $skip: <number> }

7. $unwind: Deconstructs an array field from the input documents to output a document for
each element of the array.
{ $unwind: "$<field>" }

Document
{
"name": "Restaurant A",
"location": "Jayanagar",
"reviews": [
{ "rating": 5, "comment": "Excellent!" },
{ "rating": 4, "comment": "Good" }
]
},
{
"name": "Restaurant B",
"location": "Jayanagar",
"reviews": [
{ "rating": 3, "comment": "Average" },
{ "rating": 2, "comment": "Not great" }
]
},
{
"name": "Restaurant C",
"location": "MG Road",
"reviews": [
{ "rating": 5, "comment": "Fantastic!" }
]
}

Aggregation pipeline with example:

1. `$match`

{
$match: {
location: "Jayanagar"
}
}

- Filters the documents to include only those where the `location` field is "Jayanagar".
- Example:
- Input: The three documents above.
Output:
{
"name": "Restaurant A",
"location": "Jayanagar",
"reviews": [
{ "rating": 5, "comment": "Excellent!" },
{ "rating": 4, "comment": "Good" }
]
},
{
"name": "Restaurant B",
"location": "Jayanagar",
"reviews": [
{ "rating": 3, "comment": "Average" },
{ "rating": 2, "comment": "Not great" }
]

2. `$unwind`
{
$unwind: "$reviews"
}

- Deconstructs the `reviews` array field from the input documents to output a document for
each element in the `reviews` array.
- Example:
- Input: The filtered documents from `$match`.
Output:

{
"name": "Restaurant A",
"location": "Jayanagar",
"reviews": { "rating": 5, "comment": "Excellent!" }
},
{
"name": "Restaurant A",
"location": "Jayanagar",
"reviews": { "rating": 4, "comment": "Good" }
},
{
"name": "Restaurant B",
"location": "Jayanagar",
"reviews": { "rating": 3, "comment": "Average" }
},
{
"name": "Restaurant B",
"location": "Jayanagar",
"reviews": { "rating": 2, "comment": "Not great" }
}

3. `$group`
{
$group: {
_id: "$name",
averageRating: { $avg: "$reviews.rating" },
totalReviews: { $sum: 1 }
}
}

- Groups input documents by the restaurant name (`_id` is set to `$name`). Calculates the
average rating for each group and counts the total number of reviews.
- Example:
- Input: The unwound documents.
Output:
{
"_id": "Restaurant A",
"averageRating": 4.5,
"totalReviews": 2
},
{
"_id": "Restaurant B",
"averageRating": 2.5,
"totalReviews": 2
}

4. `$sort`
{
$sort: {
averageRating: -1
}
}

- Sorts the grouped documents by `averageRating` in descending order.


- Example:
- Input: The grouped documents.
- Output:

{
"_id": "Restaurant A",
"averageRating": 4.5,
"totalReviews": 2
},
{
"_id": "Restaurant B",
"averageRating": 2.5,
"totalReviews": 2
}

5. `$project`
{
$project: {
_id: 0,
restaurant: "$_id",
averageRating: 1,
totalReviews: 1
}
}

- Reshapes each document to include only the fields `restaurant`, `averageRating`, and
`totalReviews`. Renames `_id` to `restaurant`.
- Example:
- Input: The sorted documents.
Output:

{
"restaurant": "Restaurant A",
"averageRating": 4.5,
"totalReviews": 2
},
{
"restaurant": "Restaurant B",
"averageRating": 2.5,
"totalReviews": 2
}

6. `$skip`
{
$skip: 1
}

- Skips the first document in the sorted list.


- Example:
- Input: The projected documents.
Output:

{
"restaurant": "Restaurant B",
"averageRating": 2.5,
"totalReviews": 2
}

Final Result:
The final result of this aggregation pipeline, formatted using `. pretty ()`, would be:

{
"restaurant": "Restaurant B",
"averageRating": 2.5,
"totalReviews": 2
}

This means that the aggregation pipeline returns the restaurant with the second highest
average rating in "Jayanagar", along with its average rating and total number of reviews.

The `$unwind` stage is used to deconstruct an array field from the input documents to output
a document for each element of the array. In this example, `$unwind` is used to deconstruct
the `reviews` array so that each review becomes a separate document.

why `$unwind` is necessary in this context:

Purpose of `$unwind`

1. Isolate Individual Reviews:


- Each restaurant has multiple reviews stored in an array. To calculate the average rating
and total number of reviews, each review must be considered separately.
- By using `$unwind`, each review becomes its own document, which allows subsequent
stages to process each review individually.

2. Accurate Aggregation:
- To calculate the average rating correctly, each rating must be extracted from the `reviews`
array.
- Without `$unwind`, MongoDB would not be able to access each individual rating within
the array for aggregation.

Without `$unwind`
If we don't use `$unwind`, the `reviews` array would remain as it is, and the aggregation
framework wouldn't be able to directly compute the average rating or count the total reviews
across all documents. The `$group` stage wouldn't work as intended because it would be
grouping entire arrays rather than individual review entries.

what happens if we try to aggregate without `$unwind`:

Example Without `$unwind`:


Suppose we omit the `$unwind` stage and try to compute the average rating directly:
db.restaurants.aggregate([
{
$match: {
location: "Jayanagar"
}
},
{
$group: {
_id: "$name",
averageRating: { $avg: "$reviews.rating" },
totalReviews: { $sum: { $size: "$reviews" } }
}
},
{
$sort: {
averageRating: -1
}
},
{
$project: {
_id: 0,
restaurant: "$_id",
averageRating: 1,
totalReviews: 1
}
},
{
$skip: 1
}
]).pretty()

Issues With This Approach

1. Invalid Aggregation Expressions:


- `$reviews.rating` cannot be directly used in the `$avg` expression because `$reviews` is
an array.
- MongoDB does not support directly averaging an array field like this.

2. Incorrect Review Count:


- Using `{ $sum: { $size: "$reviews" } }` would work for counting the reviews but doesn't
solve the issue of averaging ratings.

Therefore, `$unwind` is necessary to flatten the array and process each review individually to
compute accurate average ratings and total review counts.

Correct Usage With `$unwind`


The correct approach is to use `$unwind` to handle each review separately, enabling accurate
calculations:
db.restaurants.aggregate([
{
$match: {
location: "Jayanagar"
}
},
{
$unwind: "$reviews"
},
{
$group: {
_id: "$name",
averageRating: { $avg: "$reviews.rating" },
totalReviews: { $sum: 1 }
}
},
{
$sort: {
averageRating: -1
}
},
{
$project: {
_id: 0,
restaurant: "$_id",
averageRating: 1,
totalReviews: 1
}
},
{
$skip: 1
}
]).pretty()

By using `$unwind`, each review is treated as a separate document, allowing for accurate
calculation of the average rating and total review count.

You might also like