Writeups - Assignment 11
Title: Implement a map-reduce operation with a suitable example using MongoDB.
Theory:
Map-reduce is a data processing paradigm for condensing large volumes of data into useful
aggregated results. MongoDB provides the mapReduce command for map-reduce operations, and
it is generally used for processing large data sets.
MapReduce Command
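>db.collection.mapReduce(
   function() { emit(key, value); },                   // map function
   function(key, values) { return reduceFunction; },   // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)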
The map-reduce function first queries the collection, then maps the resulting documents to emit
key-value pairs, which are then reduced for every key that has multiple values.
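map is a JavaScript function that maps a value to a key and emits a key-value pair.
reduce is a JavaScript function that reduces (combines) all the values emitted for the same
key into a single result.
out specifies where the map-reduce result is stored.
query specifies optional selection criteria for the input documents.
sort specifies optional sort criteria for the input documents.
limit specifies the optional maximum number of input documents.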
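The flow described above can be sketched in plain JavaScript, without a database. This is only an in-memory illustration of the idea and not MongoDB's implementation: the helper name simulateMapReduce and the sample documents are assumptions made for this sketch, the query supports equality matches only, and emit is passed to map as an argument (in MongoDB, emit is available inside map without being passed).

```javascript
// Hypothetical in-memory sketch of the map-reduce flow (not MongoDB's code).
function simulateMapReduce(docs, map, reduce, options = {}) {
  // 1. Query: keep only documents matching every field in options.query.
  const query = options.query || {};
  const matched = docs.filter((doc) =>
    Object.keys(query).every((field) => doc[field] === query[field])
  );

  // 2. Map: call map with `this` bound to each document and
  //    collect the emitted values under their key.
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of matched) {
    map.call(doc, emit);
  }

  // 3. Reduce: only keys with more than one value need reducing.
  const output = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    output.push({ _id: key, value: value });
  }
  return output;
}

// Assumed sample data, for illustration only.
const sampleDocs = [
  { type: "fruit", qty: 3, inStock: true },
  { type: "fruit", qty: 5, inStock: true },
  { type: "grain", qty: 7, inStock: true },
  { type: "grain", qty: 1, inStock: false },
];

const totals = simulateMapReduce(
  sampleDocs,
  function (emit) { emit(this.type, this.qty); },                       // map
  function (key, values) { return values.reduce((a, b) => a + b, 0); }, // reduce
  { query: { inStock: true } }
);
console.log(totals);
```

Here the query drops the out-of-stock document, map emits one (type, qty) pair per remaining document, and reduce sums the quantities for "fruit". The key "grain" has a single value, so reduce is not called for it.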
Using MapReduce
Consider the following document structure storing user posts. Each document stores the
user_name of the user and the status of the post.
{
   "user_name": "mark",
   "status": "active"
}
Now, we will use a mapReduce function on our posts collection to select all the active posts,
group them by user_name, and then count the number of posts by each user using the
following code:
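>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },               // map: one count per post
   function(key, values) { return Array.sum(values); },   // reduce: total per user
   {
      query: { status: "active" },
      out: "post_total"
   }
)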
The above mapReduce query outputs the following result:
{
   "result" : "post_total",
   "timeMillis" : 9,
   "counts" : {
      "input" : 4,
      "emit" : 4,
      "reduce" : 2,
      "output" : 2
   },
   "ok" : 1
}
The result shows that a total of 4 documents matched the query (status: "active"), the map
function emitted 4 key-value pairs, and finally the reduce function grouped the emitted
values having the same key into 2 output documents.
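To make the counters concrete, here is a small plain-JavaScript sketch that tallies the same four numbers for an in-memory array. The sample posts are assumed for illustration (two active posts each for tom and mark, plus one post with a different status), and countMapReduceStats is a hypothetical helper, not part of MongoDB. It is simplified: reduce is counted once per key that has several values, whereas on large inputs MongoDB may call reduce more than once for the same key.

```javascript
// Hypothetical helper: tallies counters analogous to "counts" in the mapReduce result.
function countMapReduceStats(posts) {
  const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
  const groups = new Map();

  for (const post of posts) {
    if (post.status !== "active") continue; // query: { status: "active" }
    counts.input += 1;                      // one document passed to map
    const key = post.user_name;             // map emits (user_name, 1)
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(1);
    counts.emit += 1;                       // one pair emitted per document
  }

  for (const values of groups.values()) {
    if (values.length > 1) counts.reduce += 1; // reduce runs only for keys with several values
    counts.output += 1;                        // one output document per distinct key
  }
  return counts;
}

// Assumed sample data, for illustration only.
const samplePosts = [
  { user_name: "tom", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" },
];

console.log(countMapReduceStats(samplePosts));
```

With this assumed data the sketch reports input 4, emit 4, reduce 2 and output 2: the post with a different status is filtered out by the query, and each of the two users ends up as one output key.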
To see the result of this mapReduce query, call find() on the value it returns:
>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()
The above query gives the following result, which indicates that users tom and mark each have
two posts in the active state: