Project Documentation Final

The lab project report details the implementation of a scalable and distributed e-commerce database system using MongoDB Compass, featuring CRUD operations, indexing, and advanced functionalities like sharding and replication. The project includes five collections of approximately 38,279 documents each, showcasing various data management techniques and configurations for high availability and performance. The successful execution demonstrates a working understanding of MongoDB's capabilities in a distributed environment.

ADDIS ABABA UNIVERSITY

COLLEGE OF NATURAL AND COMPUTATIONAL SCIENCES

DEPARTMENT OF COMPUTER SCIENCE

Selected Topics in Data Management Systems (CoSc6031)

Lab Project Report

Project Title: Implementation of a Scalable and Distributed E-commerce Database System using MongoDB Compass

1. Solomon Zinabu GSE/2578/17
2. Sasebih Nega GSE/7037/17
3. Dawit Kebede GSE/4878/17
4. Minale Ejigu GSE/5402/17

Date: 12/04/2025

1. Introduction

This project demonstrates the practical implementation of key MongoDB features, using Compass as the primary tool. The e-commerce dataset comprises five collections: customers, orders, order_items, payments, and products, each containing approximately 38,279 documents. The objective is to apply and showcase the following functionalities:

• CRUD operations (Create, Read, Update, Delete)
• Projections
• Indexing
• Sorting and Aggregations
• Replica Set (1 Primary + 2 Secondaries)
• Sharding (Scalable distribution)
• Advanced MongoDB features (Lookups, Array functions, Grouping, Conditional logic)

2. Data Collections Overview

Customers Collection

{
"customer_id": "I74lXDOfoqsp",
"customer_zip_code_prefix": 6020,
"customer_city": "goiania",
"customer_state": "GO"
}

Orders Collection

{
"order_id": "u6rPMRAYIGig",
"customer_id": "I74lXDOfoqsp",
"order_purchase_timestamp": "2017-11-18 12:29:57",
"order_approved_at": "2017-11-18 12:46:08"
}

Order Items Collection

{
"order_id": "u6rPMRAYIGig",
"product_id": "1slxdgbgWFax",
"seller_id": "3jwvL6ihC45G",
"price": 24.1,
"shipping_charges": 20.9
}

Payments Collection

{
"order_id": "u6rPMRAYIGig",
"payment_type": "credit_card",
"payment_installments": 2,
"payment_value": 155.77
}

Products Collection

{
"product_id": "1slxdgbgWFax",
"product_category_name": "toys",
"product_weight_g": 50,
"product_length_cm": 16,
"product_height_cm": 5,
"product_width_cm": 11
}

3. Functional Implementation Using MongoDB Compass

3.1 CRUD Operations

Each collection supports Create, Read, Update, and Delete operations performed directly
in Compass.

• Insert – Sample records added to each collection.
• Read – Filtered and projected queries to retrieve documents.
• Update – Updates on customer city/state.
• Delete – Removal based on customer_id, order_id, or product_id.
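The four operations above can be sketched in mongosh (or the shell embedded in Compass) roughly as follows. This is a minimal sketch, not the exact steps taken in the project: the document id `DEMO00000001` and the new city/state values are hypothetical, and the current database is assumed to be `ecommerceDB`.

```javascript
// Hypothetical sample document, modeled on the Customers example in Section 2.
const newCustomer = {
  customer_id: "DEMO00000001", // made-up id, for illustration only
  customer_zip_code_prefix: 6020,
  customer_city: "goiania",
  customer_state: "GO",
};

// Filter and update documents reused by the read, update, and delete calls.
const byId = { customer_id: newCustomer.customer_id };
const cityStateUpdate = {
  $set: { customer_city: "sao paulo", customer_state: "SP" }, // hypothetical values
};

// `db` only exists inside mongosh; the guard lets the definitions above
// load in plain Node.js without a server.
if (typeof db !== "undefined") {
  db.customers.insertOne(newCustomer);           // Create
  printjson(db.customers.findOne(byId));         // Read
  db.customers.updateOne(byId, cityStateUpdate); // Update city/state
  db.customers.deleteOne(byId);                  // Delete by customer_id
}
```

In Compass itself, the same filter and update documents can be pasted into the query bar and the update dialog.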

3.2 Projection

Projections limit the fields returned by a query, which reduces the amount of data sent back to the client:

{
"customer_id": 1,
"customer_city": 1,
"_id": 0
}
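To make the effect of this projection concrete, the sketch below pairs it with a filter and emulates the result shape in plain JavaScript. The filter on `customer_state` and the helper `project` are assumptions added for illustration; only the projection document comes from the report.

```javascript
// Projection from the report; the filter is a hypothetical example.
const filter = { customer_state: "GO" };
const projection = { customer_id: 1, customer_city: 1, _id: 0 };

// Plain-JavaScript emulation of an inclusion projection (illustrative helper,
// not a MongoDB API): keeps the fields flagged with 1.
function project(doc, spec) {
  const out = {};
  // MongoDB keeps _id in an inclusion projection unless it is set to 0.
  if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// Inside mongosh the same projection is passed as the second argument of find().
if (typeof db !== "undefined") {
  printjson(db.customers.find(filter, projection).limit(5).toArray());
}
```

Applied to the sample customer from Section 2, the projection keeps only `customer_id` and `customer_city`.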
3.3 Indexing

Created indexes on:

• customer_id
• order_id
• product_id
• payment_type
• product_category_name

These indexes speed up queries that filter or sort on the indexed fields, including the $match and $lookup stages of aggregations that use them.
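A possible mongosh equivalent of creating these indexes is sketched below. The report does not say which collection each index was created on, so the collection assignments here are assumptions, as is the ascending sort order.

```javascript
// Assumed mapping of indexed field to collection (not stated in the report).
const indexSpecs = [
  { collection: "customers", key: { customer_id: 1 } },
  { collection: "orders", key: { order_id: 1 } },
  { collection: "products", key: { product_id: 1 } },
  { collection: "payments", key: { payment_type: 1 } },
  { collection: "products", key: { product_category_name: 1 } },
];

// `db` only exists inside mongosh; outside it the block just defines the specs.
if (typeof db !== "undefined") {
  for (const { collection, key } of indexSpecs) {
    db.getCollection(collection).createIndex(key);
  }
  // The winning plan would be expected to contain an index scan (IXSCAN)
  // rather than a collection scan once the index exists.
  const plan = db.customers
    .find({ customer_id: "I74lXDOfoqsp" })
    .explain("executionStats");
  printjson(plan.queryPlanner.winningPlan);
}
```

Comparing the explain output before and after index creation is one way to confirm that a query actually uses the index.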

3.4 Aggregations

Customers: Cities with most customers

[ { "$group": { _id: "$customer_city", total: { "$sum": 1 } } }, { "$sort": { total: -1 } } ]

Orders: Monthly order count

[ { "$project": { month: { "$substr": ["$order_purchase_timestamp", 0, 7] } } }, { "$group": { _id: "$month", count: { "$sum": 1 } } } ]

Payments: Average by type

[ { "$group": { _id: "$payment_type", avg_payment: { "$avg": "$payment_value" } } } ]
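The monthly order count relies on the timestamps being strings of the form "YYYY-MM-DD hh:mm:ss", so the first seven characters give the month. The sketch below restates that pipeline with `$substrCP` (in current MongoDB versions `$substr` is a deprecated alias of `$substrBytes`) and adds a `$sort` stage, which is an addition not present in the report. The helper `countByMonth` is a hypothetical plain-JavaScript emulation of what the two stages compute.

```javascript
// Variant of the monthly order count pipeline; the $sort stage is an addition.
const monthlyOrders = [
  { $project: { month: { $substrCP: ["$order_purchase_timestamp", 0, 7] } } },
  { $group: { _id: "$month", count: { $sum: 1 } } },
  { $sort: { _id: 1 } },
];

// Emulation of the $project + $group stages on an array of timestamp strings.
function countByMonth(timestamps) {
  const counts = {};
  for (const ts of timestamps) {
    const month = ts.substring(0, 7); // same slice as $substrCP [ts, 0, 7]
    counts[month] = (counts[month] || 0) + 1;
  }
  return counts;
}

// Inside mongosh the pipeline runs against the orders collection.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(monthlyOrders).toArray());
}
```

For the sample order in Section 2, the extracted month is "2017-11".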


3.5 Advanced Aggregation & Lookup

Join Orders with Customers:

[ { "$lookup": { from: "customers", localField: "customer_id", foreignField: "customer_id", as: "customer_info" } } ]

Group products by category with average size:

[ { "$group": {
_id: "$product_category_name",
avg_weight: { "$avg": "$product_weight_g" },
avg_length: { "$avg": "$product_length_cm" }
}} ]

Find high-value orders (run on the payments collection, payment_value above 1000):

[ { "$match": { payment_value: { "$gt": 1000 } } } ]
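The pipelines above can be combined with array handling and conditional logic. The following is a sketch under stated assumptions rather than the pipelines used in the project: the output field names `city`, `value_tier`, `orders`, and `total`, and the tier labels "high" and "standard", are hypothetical. Only the collection names, the joined fields, and the 1000 threshold come from the report.

```javascript
// $lookup produces an array field; $unwind flattens it to one document per match.
const ordersWithCustomer = [
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "customer_id",
      as: "customer_info",
    },
  },
  { $unwind: "$customer_info" },
  { $project: { _id: 0, order_id: 1, city: "$customer_info.customer_city" } },
];

// $cond labels each payment, then $group summarizes per label.
const paymentsByTier = [
  {
    $addFields: {
      value_tier: { $cond: [{ $gt: ["$payment_value", 1000] }, "high", "standard"] },
    },
  },
  { $group: { _id: "$value_tier", orders: { $sum: 1 }, total: { $sum: "$payment_value" } } },
];

// Plain-JavaScript equivalent of the $cond expression, for illustration.
const tierOf = (paymentValue) => (paymentValue > 1000 ? "high" : "standard");

if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(ordersWithCustomer).limit(3).toArray());
  printjson(db.payments.aggregate(paymentsByTier).toArray());
}
```

The sample payment of 155.77 from Section 2 would fall into the "standard" tier under this labeling.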


4. Replication & Sharding Setup

4.1 Replication

Replication in MongoDB provides high availability by creating copies of the same data
on multiple servers. For this project, we configured a Replica Set consisting of one
primary and two secondary nodes.

Objective: Ensure data redundancy and availability in case of hardware failure or server
crash.

Replica Set Configuration Steps:

1. Start MongoDB instances (in separate terminal windows or services):

mongod --replSet rs0 --port 27017 --dbpath /data/rs0 --bind_ip localhost
mongod --replSet rs0 --port 27018 --dbpath /data/rs1 --bind_ip localhost
mongod --replSet rs0 --port 27019 --dbpath /data/rs2 --bind_ip localhost

2. Initiate Replica Set from any node (done via mongosh):

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" }
  ]
})

3. Check status using:

rs.status()

4. Read/Write Setup in Compass:

o Use connection string:
mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=rs0
o Read Preference: Primary or Secondary
o Write always routes to Primary

Benefits Observed:

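• Reads can be routed to secondaries to offload the primary (such reads may return slightly stale data).
• Automatic failover if primary is unavailable.
• Writes acknowledged with write concern "majority" are not lost after a failover or network partition.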
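One way to check these properties from mongosh is sketched below: it condenses `rs.status()` into one line per member and issues a write that waits for a majority of members. The helper `summarizeMembers`, the demo document id, and the 5000 ms timeout are assumptions for illustration.

```javascript
// Reduce rs.status() output to "host: STATE" lines (illustrative helper).
function summarizeMembers(status) {
  return status.members.map((m) => `${m.name}: ${m.stateStr}`);
}

// Write options: wait until a majority of members acknowledge the write.
const majority = { writeConcern: { w: "majority", wtimeout: 5000 } };

// `rs` and `db` only exist inside mongosh connected to the replica set.
if (typeof rs !== "undefined" && typeof db !== "undefined") {
  print(summarizeMembers(rs.status()).join("\n"));
  // Hypothetical demo document, inserted and removed again.
  db.customers.insertOne({ customer_id: "DEMO-MAJORITY" }, majority);
  db.customers.deleteOne({ customer_id: "DEMO-MAJORITY" }, majority);
}
```

Stopping the primary and re-running the summary should show one of the former secondaries reported as PRIMARY after the election completes.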
4.2 Sharding

Sharding enables horizontal scaling by distributing data across multiple machines. MongoDB uses mongos as a query router and distributes the documents of each sharded collection according to a shard key.
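The idea behind a hashed shard key can be illustrated with a toy example. This is not MongoDB's actual hash function or placement logic (MongoDB assigns ranges of hashed values to chunks rather than taking a modulus); the FNV-1a hash and the helpers `shardFor` and `distribute` are stand-ins chosen only to show that hashing spreads similar-looking keys across shards.

```javascript
// 32-bit FNV-1a hash of a string (toy stand-in for MongoDB's hashed index).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return h;
}

// Map a shard-key value to one of `shardCount` shards.
function shardFor(key, shardCount) {
  return fnv1a(key) % shardCount;
}

// Count how many keys land on each shard.
function distribute(keys, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) counts[shardFor(key, shardCount)] += 1;
  return counts;
}

// Example: 1000 hypothetical sequential customer ids over 3 shards.
const demoKeys = Array.from({ length: 1000 }, (_, i) => `customer-${i}`);
console.log(distribute(demoKeys, 3));
```

The same key always maps to the same shard, which is what lets a router send an equality query on the shard key to a single shard.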

Sharding Configuration Steps:

1. Start Config Server Replica Set:

mongod --configsvr --replSet configReplSet --port 26017 --dbpath /data/config --bind_ip localhost

Initiate this replica set once with rs.initiate() from a mongosh session connected to port 26017; mongos cannot use the config server until it has been initiated.

2. Start Shard Replica Set (rs0), using the same mongod commands as in Section 4.1 with the --shardsvr option added to each.

3. Start mongos instance:

mongos --configdb configReplSet/localhost:26017 --port 27020 --bind_ip localhost

4. Connect to mongos using Compass:

o Connection String:
mongodb://localhost:27020

5. Add Shard (from mongosh, connected to mongos):

sh.addShard("rs0/localhost:27017,localhost:27018,localhost:27019")

6. Enable Sharding for Database:

sh.enableSharding("ecommerceDB")

7. Shard Specific Collections:

sh.shardCollection("ecommerceDB.customers", { "customer_id": "hashed" })
sh.shardCollection("ecommerceDB.orders", { "order_id": "hashed" })

8. Verify Shard Status:

sh.status()
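Two parts of this setup are easy to miss: the config server is itself a replica set that has to be initiated, and the data distribution of a sharded collection can be inspected per collection. The sketch below wraps both in functions meant to be called by hand from the appropriate mongosh session; the function names are hypothetical, and the single-member config replica set mirrors the one `mongod --configsvr` process started above.

```javascript
// Replica set configuration for the single config server on port 26017.
const configReplSetConfig = {
  _id: "configReplSet",
  configsvr: true,
  members: [{ _id: 0, host: "localhost:26017" }],
};

// Call from mongosh connected to the config server (port 26017).
function initConfigServer() {
  return rs.initiate(configReplSetConfig);
}

// Call from mongosh connected to mongos (port 27020); prints per-shard
// document and chunk statistics for the two sharded collections.
function reportDistribution() {
  const ecommerce = db.getSiblingDB("ecommerceDB");
  ecommerce.customers.getShardDistribution();
  ecommerce.orders.getShardDistribution();
}
```

With only the shard rs0 added, the report would be expected to list all data under that one shard.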

Benefits Observed:

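• Hashed shard keys on customer_id and order_id distribute documents evenly across shards (with the single shard rs0 configured here, all chunks reside on that shard until more shards are added)
• High scalability with ability to add new shards
• Improved performance on large datasets

5. Read/Write Distribution Testing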

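Using Compass, read and write operations were tested against both the primary and the secondary nodes.

• Read Preference: Secondary
• Writes: always routed to the primary (MongoDB has no configurable write preference)
• Observations: Compass allows the read preference to be set through connection options; write routing is not configurable.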
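The read preference can be carried in the connection string itself through the standard `readPreference` URI option. The helper below is a small sketch (the function name `withReadPreference` is hypothetical) that appends the option to the replica set connection string from Section 4.1.

```javascript
// Read preference modes accepted by MongoDB drivers and Compass.
const READ_PREFERENCE_MODES = [
  "primary",
  "primaryPreferred",
  "secondary",
  "secondaryPreferred",
  "nearest",
];

// Append a readPreference option to an existing MongoDB connection string.
function withReadPreference(baseUri, mode) {
  if (!READ_PREFERENCE_MODES.includes(mode)) {
    throw new RangeError(`unknown read preference: ${mode}`);
  }
  const separator = baseUri.includes("?") ? "&" : "?";
  return `${baseUri}${separator}readPreference=${mode}`;
}

const replicaSetUri =
  "mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=rs0";
console.log(withReadPreference(replicaSetUri, "secondary"));

// Inside mongosh, the preference can also be switched on a live connection.
if (typeof db !== "undefined") {
  db.getMongo().setReadPref("secondary");
  print(db.customers.countDocuments({}));
}
```

Pasting the resulting string into the Compass connection dialog routes reads to a secondary, while writes still go to the primary.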

6. Conclusion

This project successfully implements a highly available, scalable e-commerce system using MongoDB Compass. From core CRUD operations to advanced aggregations, replication, and sharding, all requirements were fulfilled, with data operations carried out through the Compass UI and cluster setup through mongosh. The result demonstrates a working command of MongoDB in a distributed environment.

Appendix

Tools Used:

• MongoDB (v8.0.5)
• MongoDB Compass (GUI)
• mongosh (for cluster setup only)

Environment:

• Windows 10
• Ports used: 27017 (primary), 27018, 27019 (secondaries), 27020 (mongos), 26017 (config server)

Sample Data Volume:

 ~38,279 documents per collection

You might also like