Assignment 11
I used Debian instead of Linux 2023, so I followed different steps to install the
software. The steps I followed are:
Commands Used:
Opened a third terminal connection to the DATAPROC master node called CLI-Term:
Commands used:
1) use assignment;
2) load('./load.js');
3) db.unicorns.find();
Exercise 1
Exercise 2
Command Used:
db.unicorns.insertOne({
  name: "Malini",
  gender: "F",
  vampires: 23,
  horns: 1
});
Exercise 4
Command Used:
db.unicorns.updateOne(
{ name: "Malini" },
);
Command used to verify the above command: db.unicorns.find({ name: "Malini" });
Exercise 5
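The delete command for this exercise is not reproduced in this write-up. A minimal sketch, assuming it used deleteMany with the same filter as the verification query that follows, might look like this. The helper name deleteHeavyUnicorns is hypothetical; deleteMany and the $gt operator are standard MongoDB.

```javascript
// Assumption: the delete used the same filter as the verification query,
// i.e. every unicorn whose weight is greater than 600.
function deleteHeavyUnicorns(db) {
  // deleteMany removes every document matching the filter and returns
  // a result object that includes deletedCount.
  return db.unicorns.deleteMany({ weight: { $gt: 600 } });
}
```

In the shell this would be called as deleteHeavyUnicorns(db), after which the find with the same filter should return no documents.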
Command used to verify the above command: db.unicorns.find({ weight: { $gt: 600 } });
In the output shown below we can see that all unicorns with a weight of more than 600
pounds have been deleted:
Exercise 6: Summary of the article “Modeling temporal aspects of sensor data for
MongoDB NoSQL database”
The study focuses on addressing the challenges of managing real-time temporal data generated by IoT devices,
particularly from ANT+ sensors used in healthcare. The research question explores how NoSQL databases,
especially MongoDB, can provide a scalable and flexible schema for storing and processing such data. This is
particularly important because traditional relational databases (RDBMS) struggle to handle the demands of modern
applications, including the need for horizontal scaling, schema flexibility, and support for high-velocity data
streams.
The authors hypothesize that a document-oriented database like MongoDB can overcome these limitations by
supporting schema evolution and hierarchical data structures. They also propose that such a model is well-suited for
handling temporal aspects of real-time sensor data, which are critical for applications requiring time-series analysis,
such as remote healthcare monitoring.
To test their hypotheses, the researchers developed a middleware solution and designed a schema tailored for
MongoDB to efficiently store and query temporal data. The middleware was responsible for integrating data from
ANT+ sensors, which transmit timestamped measurements, into MongoDB's JSON-based hierarchical document
model. This design allowed the schema to adapt dynamically to new data formats and structures without requiring
predefined schemas, a common limitation in RDBMS. The study also incorporated an algorithm to handle schema
evolution, ensuring that new data could be seamlessly integrated while preserving the hierarchical organization of
existing data. The researchers analyzed the system's performance in terms of scalability, storage efficiency, and the
ability to maintain temporal order in real-time data streams. Key aspects of the evaluation included the system's
ability to handle large-scale timestamped datasets, the efficiency of queries on hierarchical data, and the robustness
of the schema in dynamic environments.
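To make the idea of a hierarchical, schema-flexible document more concrete, here is a small sketch in plain JavaScript. It is not the schema or the algorithm from the article: the field names (deviceId, sensorType, readings, ts, bpm, cadence) and the helper functions are hypothetical and chosen only for illustration. It shows timestamped measurements embedded in one device document, kept in temporal order, with a new attribute accepted without any migration step.

```javascript
// Hypothetical shape of one sensor document: device metadata at the top,
// timestamped measurements embedded as an array of sub-documents.
function newSensorDocument(deviceId, sensorType) {
  return { deviceId, sensorType, readings: [] };
}

// Append a timestamped reading. `values` may carry any fields, so a new
// attribute (for example "cadence") needs no schema change.
function addReading(doc, timestamp, values) {
  doc.readings.push({ ts: new Date(timestamp), ...values });
  // Keep readings in temporal order even if they arrive out of order.
  doc.readings.sort((a, b) => a.ts - b.ts);
  return doc;
}

// Collect the attribute names seen so far, to show how the effective
// schema grows as new fields arrive.
function observedFields(doc) {
  const fields = new Set();
  for (const reading of doc.readings) {
    for (const key of Object.keys(reading)) fields.add(key);
  }
  return [...fields].sort();
}

const doc = newSensorDocument("ant-001", "heart_rate");
addReading(doc, "2024-01-01T10:00:02Z", { bpm: 74 });
addReading(doc, "2024-01-01T10:00:01Z", { bpm: 72 });
addReading(doc, "2024-01-01T10:00:03Z", { bpm: 75, cadence: 80 });
console.log(observedFields(doc));
```

A document built this way could be stored with insertOne as is, because MongoDB does not require the readings to share an identical set of fields.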
The results demonstrated MongoDB’s suitability for real-time IoT data management. The hierarchical schema
effectively reduced redundancy by embedding related data, minimizing the need for expensive join operations
typical in relational databases. The authors attribute the improved query performance to MongoDB’s ability to index both
primary and secondary attributes, even within sub-documents. The system also seamlessly supported schema
evolution, allowing it to handle new data formats dynamically without the need for complex migrations or
redefinitions. For example, as new data attributes were introduced, they were integrated into the existing schema
with minimal disruption, showcasing MongoDB's flexibility. Additionally, the system handled large volumes of
timestamped data efficiently, preserving temporal order while enabling fast query execution. This performance was
particularly beneficial in healthcare scenarios, where timely access to sensor data is critical for decision-making.
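The point about indexing attributes inside sub-documents can be illustrated with dot notation. The following is a rough sketch, not the article's actual schema: the collection handle readings and the fields sensor.id and timestamp are hypothetical names. createIndex, find, and sort are standard MongoDB collection and cursor methods.

```javascript
// Hypothetical layout: each reading embeds its sensor as a sub-document,
// e.g. { sensor: { id: "ant-001", type: "heart_rate" }, timestamp: ..., bpm: 72 }

// Create a compound index that reaches into the embedded "sensor"
// sub-document via dot notation, followed by the timestamp.
function createReadingIndexes(readings) {
  return readings.createIndex({ "sensor.id": 1, timestamp: 1 });
}

// Fetch one sensor's readings inside a half-open time window [from, to),
// returned in temporal order.
function findReadingsInWindow(readings, sensorId, from, to) {
  return readings
    .find({ "sensor.id": sensorId, timestamp: { $gte: from, $lt: to } })
    .sort({ timestamp: 1 });
}
```

In the shell these would be called with a collection, for example createReadingIndexes(db.readings). Because the query filters on the same fields the index covers, it does not need a join or a full collection scan.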
The implications of this research are significant, especially in fields where real-time data processing is vital, such as
healthcare and IoT applications. MongoDB’s ability to handle schema evolution and support dynamic queries makes
it a strong candidate for applications requiring both scalability and flexibility. While the study’s findings validate the
potential of MongoDB and other NoSQL databases for temporal data, the authors suggest further research is
necessary. Future studies should focus on incorporating advanced analytics and improving cross-document query
capabilities to enhance the system’s applicability across various domains.
In conclusion, this research highlights MongoDB's ability to address the unique requirements of real-time temporal
data. For IoT applications it offers flexibility, scalability, and efficiency that traditional relational databases
struggle to match, which supports the case for NoSQL systems in big data environments. Additional optimizations and
advanced processing techniques remain open directions for future work.