Week5 BDA
The objective of this lab experiment is to perform data analysis using MongoDB, a NoSQL
database, focusing on data querying, aggregation, and visualization techniques.
Prerequisites
1. MongoDB Installation: Ensure you have MongoDB installed on your machine. You
can download it from MongoDB Download Center.
2. MongoDB Compass: For GUI-based data visualization, install MongoDB Compass
from MongoDB Compass Download.
3. Sample Data: You need a sample dataset for analysis. The steps below use iris.csv;
the Sample Data Sets provided by MongoDB also work for practice.
Steps
1. Download Sample Data: Download a sample dataset in CSV format (e.g., iris.csv).
2. Import iris.csv into a collection using MongoDB Compass.
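If you prefer to load the file from mongosh instead of the Compass import dialog, the idea can be sketched as follows. This is a minimal sketch, not the lab's own command: csvToDocs is a hypothetical helper, the collection name iris_demo is an assumption, and the parser handles only simple CSV without quoted fields. The column names are the ones that appear in the outputs later in this lab.

```javascript
// Hypothetical helper: converts simple CSV text (no quoted fields) into documents.
function csvToDocs(csvText) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const headers = headerLine.split(",").map((h) => h.trim());
  return rows.map((row) => {
    const cells = row.split(",");
    const doc = {};
    headers.forEach((header, i) => {
      const cell = cells[i].trim();
      // Store numeric-looking cells as numbers so $gt, $max, and $avg work on them.
      doc[header] = cell !== "" && !Number.isNaN(Number(cell)) ? Number(cell) : cell;
    });
    return doc;
  });
}

// Two rows in the layout this lab uses (values taken from the lab's own output).
const sample = `sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa`;

const docs = csvToDocs(sample);
console.log(docs);

// `db` exists only inside mongosh; the insert is skipped elsewhere.
if (typeof db !== "undefined") {
  db.getCollection("iris_demo").insertMany(docs); // "iris_demo" is an assumed name
}
```

Converting the measurements to numbers matters: if they were stored as strings, the comparisons and averages in the aggregation step would not behave numerically.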
You can use the query bar in MongoDB Compass to run basic queries, such as filtering documents by species or by a numeric threshold.
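The queries themselves are not reproduced in this handout. As an illustration only, filters of the following kind can be typed into the Compass query bar or passed to find() in mongosh. The specific filters and the collection name iris are assumptions chosen for this sketch.

```javascript
// Filters of the kind typed into the Compass query bar (illustrative choices).
const bySpecies = { species: "setosa" };
const wideSepals = { sepal_width: { $gt: 3.0 } };
const combined = { species: "setosa", petal_length: { $lt: 1.5 } };

// `db` exists only inside mongosh; the calls are skipped elsewhere.
if (typeof db !== "undefined") {
  const iris = db.getCollection("iris"); // assumed collection name
  // First five setosa documents.
  printjson(iris.find(bySpecies).limit(5).toArray());
  // Number of flowers with sepal width above 3.0.
  print(iris.countDocuments(wideSepals));
  // Combined filter with a projection that keeps two fields.
  printjson(iris.find(combined, { _id: 0, species: 1, petal_length: 1 }).toArray());
}
```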
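You can use the Aggregations tab in Compass to build pipelines such as the ones in Step 4.
In the MongoDB shell (mongosh), the following commands locate the imported data:
1. Show Databases: show dbs
2. Select a Database: use <database_name>
3. Show Collections: show collections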
Step 4: Aggregation
1. Counts the total number of records.
[ { $count: "total_records" } ]
o/p:-
total_records: 150
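To run the count from mongosh rather than from Compass, something along these lines should work. It is a sketch under the assumption that the collection is named iris; countDocuments is included only as a cross-check on the pipeline result.

```javascript
// Pipeline with a single $count stage; the output field name matches the lab's.
const countPipeline = [{ $count: "total_records" }];

// `db` exists only inside mongosh; the calls are skipped elsewhere.
if (typeof db !== "undefined") {
  const iris = db.getCollection("iris"); // assumed collection name
  printjson(iris.aggregate(countPipeline).toArray());
  // Cross-check: an empty filter counts every document in the collection.
  print(iris.countDocuments({}));
}
```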
2. Finds the maximum and minimum petal length for each species.
[
  {
    $group: {
      _id: "$species",
      max_petal_length: { $max: "$petal_length" },
      min_petal_length: { $min: "$petal_length" }
    }
  }
]
o/p:-
_id           max_petal_length  min_petal_length
"setosa"      1.9               (not shown)
"versicolor"  5.1               3
"virginica"   6.9               4.5
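To see what grouping with $max and $min computes, the same logic can be sketched in plain JavaScript on a handful of documents. groupMinMax is a hypothetical helper written for this illustration, and the four documents are a small subset taken from the output of item 4 in this lab, so the result covers only those rows and not the full dataset.

```javascript
// Four documents from this lab's item 4 output; all are setosa, so the
// result has a single group and reflects only this subset.
const docs = [
  { species: "setosa", petal_length: 1.4 },
  { species: "setosa", petal_length: 1.3 },
  { species: "setosa", petal_length: 1.5 },
  { species: "setosa", petal_length: 1.7 },
];

// Hypothetical helper: mimics $group with $max and $min on petal_length.
function groupMinMax(documents) {
  const groups = new Map();
  for (const { species, petal_length } of documents) {
    const group = groups.get(species) ?? {
      _id: species,
      max_petal_length: -Infinity,
      min_petal_length: Infinity,
    };
    group.max_petal_length = Math.max(group.max_petal_length, petal_length);
    group.min_petal_length = Math.min(group.min_petal_length, petal_length);
    groups.set(species, group);
  }
  return [...groups.values()];
}

const result = groupMinMax(docs);
console.log(result);
```

Each distinct species value becomes one output document, which is why the full dataset yields exactly three rows.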
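3. Finds the average petal width for each species, sorted from highest to lowest.
{ $group: { _id: "$species", average_petal_width: { $avg: "$petal_width" } } },
{ $sort: { average_petal_width: -1 } }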
o/p:-
_id           average_petal_width
"virginica"   2.026
"versicolor"  1.3259999999999998
"setosa"      0.24400000000000002
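Long decimals such as 1.3259999999999998 are ordinary binary floating-point artifacts of averaging, not errors in the data. One way to tidy them, sketched here under the assumption of MongoDB 4.2 or later and a collection named iris, is to add a $set stage with $round. The same trimming is also shown client-side on the three values from the output above.

```javascript
// $set with $round (MongoDB 4.2+) trims the averages to three decimals.
const roundedAverages = [
  { $group: { _id: "$species", average_petal_width: { $avg: "$petal_width" } } },
  { $set: { average_petal_width: { $round: ["$average_petal_width", 3] } } },
  { $sort: { average_petal_width: -1 } },
];

// The same trimming applied client-side to the values from the lab output.
const raw = [2.026, 1.3259999999999998, 0.24400000000000002];
const trimmed = raw.map((value) => Number(value.toFixed(3)));
console.log(trimmed); // [ 2.026, 1.326, 0.244 ]

// `db` exists only inside mongosh; the call is skipped elsewhere.
if (typeof db !== "undefined") {
  const iris = db.getCollection("iris"); // assumed collection name
  printjson(iris.aggregate(roundedAverages).toArray());
}
```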
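4. Finds the count of flowers with sepal width > 3.0, grouped by species. The output below shows ten of the documents that pass the $match stage; the $group stage then reduces them to one count per species.
[
  { $match: { sepal_width: { $gt: 3.0 } } },
  { $group: { _id: "$species", count: { $sum: 1 } } }
]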
o/p:-
_id                       sepal_length  sepal_width  petal_length  petal_width  species
67bffceafc9790da8d3b9499  5.1           3.5          1.4           0.2          "setosa"
67bffceafc9790da8d3b949b  4.7           3.2          1.3           0.2          "setosa"
67bffceafc9790da8d3b949c  4.6           3.1          1.5           0.2          "setosa"
67bffceafc9790da8d3b949d  (not shown)   3.6          1.4           0.2          "setosa"
67bffceafc9790da8d3b949e  5.4           3.9          1.7           0.4          "setosa"
67bffceafc9790da8d3b949f  4.6           3.4          1.4           0.3          "setosa"
67bffceafc9790da8d3b94a0  5             3.4          1.5           0.2          "setosa"
67bffceafc9790da8d3b94a2  4.9           3.1          1.5           0.1          "setosa"
67bffceafc9790da8d3b94a3  5.4           3.7          1.5           0.2          "setosa"
67bffceafc9790da8d3b94a4  4.8           3.4          1.6           0.2          "setosa"
5. Groups the documents by species and counts how many belong to each.
[
  { $group: { _id: "$species", count: { $sum: 1 } } }
]
o/p:-
_id           count
"setosa"      50
"versicolor"  50
"virginica"   50
Step 5: Data Visualization
Step 6: Clean Up
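1. Drop Collection: When the data is no longer needed, drop the collection from Compass or run db.<collection_name>.drop() in mongosh.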
If needed, export your processed data as CSV or JSON from MongoDB Compass for
visualization in tools like Excel or Tableau.
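Compass has its own export dialog, so the following is only a sketch of what a CSV export of aggregation results amounts to. toCsv is a hypothetical helper that assumes flat documents which all share the keys of the first one. The sample rows are the per-species counts from item 5.

```javascript
// Hypothetical helper: turns an array of flat documents into CSV text.
// Assumes every document has the same keys as the first one.
function toCsv(docs) {
  if (docs.length === 0) return "";
  const headers = Object.keys(docs[0]);
  const escape = (value) => {
    const text = String(value);
    // Quote fields containing commas, quotes, or newlines; double inner quotes.
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [headers.join(",")];
  for (const doc of docs) {
    lines.push(headers.map((header) => escape(doc[header])).join(","));
  }
  return lines.join("\n");
}

// Per-species counts from item 5 of this lab.
const counts = [
  { _id: "setosa", count: 50 },
  { _id: "versicolor", count: 50 },
  { _id: "virginica", count: 50 },
];

const csv = toCsv(counts);
console.log(csv);
```

The resulting text can be saved with a .csv extension and opened in Excel or Tableau.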
Resources
• MongoDB Documentation
• MongoDB Aggregation Framework
• MongoDB Data Modeling
Conclusion
In this lab experiment, you learned how to set up MongoDB, import data, perform basic queries
and aggregations, and visualize the results. This foundational knowledge is crucial for data analysis
using MongoDB.