6 4360704 Nosql Lab Manual
Laboratory Manual
Introduction to NoSQL
(4360704)
Semester-6 of Diploma in Computer Engineering
Enrolment No
Name
Branch
Academic Term
Institute
Laboratory Manual
Prepared By,
Branch Coordinator
Shri B. H. Kantevala
HOD, Diploma in Computer Engineering, Government Polytechnic, Ahmedabad
Committee Chairman
Shri R. D. Raghani
(HOD-EC)
Principal (I/C), Government Polytechnic, Gandhinagar
CTE’s Vision:
To facilitate quality technical and professional education having relevance for both
industry and society,
with moral and ethical values,
giving equal opportunity and access,
aiming to prepare globally competent technocrats.
CTE’s Mission:
Institute’s Mission:
Department’s Vision:
Department’s Mission:
Certificate
Place: …………………….….
Date: …………………………
Each practical session has been thoughtfully designed to serve as a tool for developing and
enhancing industry-relevant competencies in every student. Recognizing the limitations of traditional
teaching methods, especially in developing psychomotor skills, this lab manual shifts the focus from
the outdated practice of conducting practicals solely for concept and theory verification. Instead, it
prioritizes industry-defined outcomes, providing students with an opportunity to read and understand
procedures in advance. This allows them to grasp the magnitude of the experiment before its actual
execution, fostering predetermined outcomes.
In the context of NoSQL databases, this lab manual provides an introduction to the diverse
types of NoSQL databases, with a specific focus on MongoDB, Cassandra, Neo4j Graph Databases,
and Redis. The content covers installation procedures, basic CRUD operations, data modeling, and
simple queries for each database. Students are guided to apply their knowledge in practical scenarios,
enabling them to develop solutions for various problems using the distinctive features of each
NoSQL database. As we strive to present a comprehensive lab manual, we welcome and appreciate
any suggestions for improvement.
Programme Outcomes (POs):
Following programme outcomes are expected to be achieved through the practical of the course:
1. Basic and Discipline specific knowledge: Apply knowledge of basic mathematics, science
and engineering fundamentals and engineering specialization to solve the engineering problems.
2. Problem analysis: Identify and analyse well-defined engineering problems using codified
standard methods.
3. Design/development of solutions: Design solutions for well-defined technical
problems and assist with the design of systems components or processes to meet specified needs.
4. Engineering Tools, Experimentation and Testing: Apply modern engineering tools and
appropriate technique to conduct standard tests and measurements.
5. Engineering practices for society, sustainability and environment: Apply appropriate
technology in context of society, sustainability, environment and ethical practices.
6. Project Management: Use engineering management principles individually, as a team
member or a leader to manage projects and effectively communicate about well-defined
engineering activities.
7. Life-long learning: Ability to analyse individual needs and engage in updating in the context
of technological changes in field of engineering.
Practical Outcome - Course Outcome matrix
Course Outcomes (COs):
CO1. Analyze the impact of the CAP theorem on various NoSQL databases, highlighting the
trade-offs between consistency, availability, and partition tolerance in database systems.
CO2. Apply MongoDB's features and basic CRUD operations to design and manipulate data
structures effectively, demonstrating proficiency in utilizing a document-oriented
database.
CO3. Demonstrate Cassandra's data model and query language (CQL), showcasing the ability to
create and manage distributed data tables efficiently.
CO4. Identify the significance of graph databases, illustrating their practical applications in
solving complex relationship-oriented problems.
CO5. Utilize Redis data structures and functionalities to implement efficient caching
strategies, showcasing the role of Redis in enhancing data retrieval performance.
Sr. No. | Experiment/Practical Outcome | CO1 | CO2 | CO3 | CO4 | CO5
Sr. No. | Experiment/Practical Outcome | Page | Date | Sign | Marks (10)
A. Objective
- Identify and analyze specific engineering challenges related to NoSQL databases using
established methodologies and principles.
- Develop solutions for well-defined technical problems in the context of NoSQL databases,
contributing to the design of effective data management systems.
- Apply modern engineering tools and testing techniques to assess and validate the performance of
NoSQL databases, ensuring adherence to standard practices.
- Demonstrate the ability to analyze personal learning needs and engage in lifelong learning,
particularly in the evolving landscape of NoSQL databases and related technologies.
1. NoSQL Fundamentals:
- Understand the implications of the CAP theorem on distributed systems, emphasizing the
trade-offs between consistency, availability, and partition tolerance.
Analyze the impact of the CAP theorem on various NoSQL databases, highlighting the trade-offs
between consistency, availability, and partition tolerance in database systems.
- Foster increased interest and engagement among students, promoting a positive attitude
towards the study of NoSQL databases.
G. Prerequisite Theory
Introduction to NoSQL Databases
NoSQL ("Not Only SQL") databases are a diverse category of database management
systems designed to address the limitations of traditional relational databases.
They are non-tabular: instead of relational tables, they store data in other structures such as
documents, key-value pairs, column families, or graphs. Unlike SQL databases, NoSQL databases are
typically schema-less and provide a more flexible data model, allowing for the storage and retrieval
of large volumes of unstructured or semi-structured data. They are particularly well suited to
dynamic, rapidly evolving data in distributed and scalable environments.
NoSQL databases are gaining traction for real-time cloud, web, and big data applications due to their
key features:
- NoSQL supports various data models, allowing flexibility in handling unstructured,
semi-structured, and structured data.
- Ideal for Agile development, as it accommodates different data models without the need for
separate databases.
- NoSQL offers simplified scalability with a serverless, peer-to-peer architecture, allowing for
horizontal scaling and improved performance.
- Sharding enables efficient handling of massive data volumes, and auto-replication ensures high
availability in case of failures.
- Advanced NoSQL databases facilitate global data distribution by operating across multiple cloud
regions and data centers.
4. Minimal Downtime:
- Serverless architecture and data replication across nodes ensure business continuity, with
alternative nodes providing access in case of a malfunction.
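The sharding idea mentioned above can be sketched in a few lines. This is an illustration only, assuming a simple hash-and-modulo placement; the function name `shard_for` and the three-shard setup are hypothetical, and real databases use more elaborate schemes (key ranges, consistent hashing, rebalancing).

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Return the index of the shard that should store `key`.

    A stable hash (MD5 here, used only for distribution, not security)
    guarantees that the same key always maps to the same shard.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Distribute a few hypothetical user ids across three shards.
shards = {index: [] for index in range(3)}
for user_id in ["user-1", "user-2", "user-3", "user-4", "user-5"]:
    shards[shard_for(user_id, 3)].append(user_id)

for index, keys in shards.items():
    print(f"shard {index}: {keys}")
```

Because the placement depends only on the key, any node can work out where a record lives without consulting a central directory. The cost of this simple scheme is that changing the number of shards moves most keys.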
NoSQL cannot entirely replace relational databases. The choice between the two depends on the
specific needs of an organization. NoSQL is preferred for handling diverse data types in cloud and
web applications with a broad and rapidly growing user base. Its flexibility, multi-modality,
scalability, availability, and global distribution make it ideal for
certain use cases. However, both database types can coexist and complement each other, allowing
organizations to leverage the strengths of each based on their requirements.
There are four main types of NoSQL databases, each catering to specific use cases:
1. Document-oriented Databases:
2. Key-Value Stores:
3. Column-family Stores:
4. Graph Databases:
CAP Theorem
The CAP theorem, originally known as the CAP principle, illuminates the inherent trade-offs in
designing distributed systems with replication. It highlights the challenge of simultaneously
achieving consistency, availability, and partition tolerance in such systems.
1. Consistency (C):
- All nodes in the system see the same data at the same time.
2. Availability (A):
- Every request receives a response, without guarantee that it contains the most recent version
of the information.
The CAP theorem asserts that distributed databases can prioritize at most two of these three
properties. Database systems fall into categories based on their priorities:
It's crucial to note that database systems may exhibit different behaviors based on configurations and
settings, influencing their consistency, availability, and partition tolerance. Even in the case of
specialized databases like Neo4j, which prioritizes consistency and partition tolerance over
availability, the CAP theorem remains relevant. In situations of network partition or failure, Neo4j
sacrifices availability to maintain consistency.
System architects must make trade-offs based on the application's requirements and the
characteristics of the underlying NoSQL database.
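One way configuration shifts this trade-off can be seen in systems with quorum-based replication (Cassandra's tunable consistency levels are a well-known example). The sketch below assumes N replicas, a write acknowledged by W of them, and a read that contacts R of them; the function names are hypothetical. When R + W > N, every read set overlaps every write set, so a read reaches at least one replica holding the latest write.

```python
def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read quorum overlaps every write quorum (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


def tolerated_failures(n: int, quorum: int) -> int:
    """Number of replicas that may be unreachable while `quorum` can still answer."""
    return n - quorum


# Compare three hypothetical settings for a 3-replica cluster.
for w, r in [(1, 1), (2, 2), (3, 1)]:
    print(
        f"W={w} R={r}: overlap={reads_see_latest_write(3, w, r)}, "
        f"writes survive {tolerated_failures(3, w)} failed replica(s), "
        f"reads survive {tolerated_failures(3, r)}"
    )
```

With W=1 and R=1 the cluster keeps answering even when two replicas are down, but a read may return stale data. With W=3 a single unreachable replica blocks all writes. This is the consistency-versus-availability choice the CAP theorem describes.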
Feature | MongoDB | Cassandra | Neo4j | Redis
Indexing | Multiple types, including compound and geo-spatial indexes | Secondary indexes | Indexing based on relationships | Limited indexing options
Query Performance | Good | Excellent | Excellent for graph queries | Very fast (in-memory storage)
ACID Transactions | Supports ACID transactions | Limited support for transactions | Supports ACID transactions | Supports transactions through multi commands
Community Support | Large and active community | Active community | Active community | Active community
Use Cases | Content management systems, e-commerce, real-time analytics | Time-series data, logging, IoT | Social networks, fraud detection | Caching, message broker
- Scalability Requirements: Consider the need for horizontal scalability and how well the database
can distribute data across multiple nodes.
- Consistency and Availability Needs: Assess the trade-offs between consistency and availability
based on the application's requirements.
- Use Case Characteristics: Tailor the choice of database to the specific demands of the application,
such as read and write patterns, data complexity, and relationships.
In conclusion, understanding the nuances of NoSQL databases, their types, trade-offs, and specific
use cases is crucial for making informed decisions in selecting the right database solution for diverse
applications.
H. Resources/Equipment Required
1. Internet Connectivity:
- Reliable internet connectivity is essential for accessing online resources, databases, and relevant
documentation.
- Provide comprehensive documentation, tutorials, and learning materials to guide students through
the practical exercises.
a) Performance
b) Partition Tolerance
c) Persistence
d) Provisioning
4. Which NoSQL database is best suited for managing and querying interconnected
data, such as social networks?
a) MongoDB
b) Cassandra
c) Neo4j
d) Redis
5. What type of data model does a Key-Value Store NoSQL database follow?
a) Tabular
b) Document-oriented
c) Graph
d) Key-Value
a) MongoDB
b) Cassandra
c) Neo4j
d) Redis
7. What is the purpose of the CAP theorem in the context of distributed NoSQL databases?
availability, and partition tolerance
c) To enforce a specific data model for NoSQL databases
d) To determine the optimal database for all use cases
8. Which factor is crucial in deciding the appropriate NoSQL database for an application?
J. References Links
1. https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-nosql/
2. https://www.geeksforgeeks.org/the-cap-theorem-in-dbms/
3. https://www.mongodb.com/nosql-explained
4. https://www.couchbase.com/resources/why-nosql/
5. https://www.guru99.com/nosql-tutorial.html
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
Understanding of NoSQL Concepts | Demonstrates a profound understanding of NoSQL databases, accurately defining key concepts and types. | Shows a strong grasp of NoSQL fundamentals. | Displays a basic understanding of NoSQL concepts. | Demonstrates limited comprehension of NoSQL databases. |
Identification of NoSQL Database Types | Accurately identifies and explains various types of NoSQL databases, highlighting their characteristics. | Identifies and explains multiple types of NoSQL databases. | Identifies basic types of NoSQL databases. | Struggles to identify NoSQL database types. |
Knowledge Application in Examples | Applies NoSQL concepts effectively to provide relevant examples of each database type discussed. | Applies NoSQL concepts to provide examples. | Attempts to apply NoSQL concepts in examples. | Limited application of NoSQL concepts in examples. |
Average Marks |
Practical 2. Introduction and Installation of MongoDB
A. Objective
Familiarize students with MongoDB through an introduction to its features and advantages, followed
by hands-on experience installing and connecting to the database.
2. Problem Analysis:
- Identify and analyze engineering challenges related to MongoDB installation and connectivity
using established methods.
3. Design/Development of Solutions:
- Design solutions for well-defined technical problems encountered during the installation and
connection processes in MongoDB.
- Utilize appropriate engineering tools to conduct experiments and tests during the MongoDB
installation, ensuring proficiency in modern database technologies.
6. Project Management:
- Demonstrate project management skills while handling the installation of MongoDB, either
individually or as part of a team, and effectively communicate progress.
7. Life-Long Learning:
- Exhibit the ability to adapt and engage in continuous learning, updating skills to match
technological changes in the field of database management, particularly MongoDB.
1. Technical Proficiency:
2. Problem-Solving:
- Apply analytical skills to troubleshoot and resolve issues related to MongoDB installation and
connectivity.
- Effectively document the installation process and communicate findings, demonstrating clarity
and precision in conveying technical information.
4. Adaptability:
Apply MongoDB's features and basic CRUD operations to design and manipulate data structures
effectively, demonstrating proficiency in utilizing a document-oriented database.
Develop a positive attitude towards MongoDB installation, fostering confidence and competence.
G. Prerequisite Theory
Overview of MongoDB
MongoDB is a widely used, open-source NoSQL database built on a document-oriented data
model. Unlike traditional relational databases, MongoDB stores data in flexible, JSON-like
documents, allowing for dynamic schemas and easy scalability. It offers high performance and
horizontal scalability, and is particularly well suited to applications with rapidly evolving schemas
and large amounts of unstructured or semi-structured data.
Relational (SQL) databases | NoSQL databases
Data arrives from one or a few locations. | Data arrives from many locations.
Supports complex transactions. | Supports simple transactions.
Has a single point of failure. | No single point of failure.
Handles data in lower volumes. | Handles data in high volumes.
Transactions are written in one location. | Transactions are written in many locations.
Supports ACID properties. | Generally relaxes ACID guarantees (support varies by database).
It is difficult to change the database once it is defined. | Enables easy and frequent changes to the database.
A schema is mandatory to store data. | Schema design is not required.
Deployed in a vertical fashion. | Deployed in a horizontal fashion.
BSON is a binary-encoded form of JavaScript Object Notation (JSON), the textual object notation
widely used to transmit and store data across web-based applications. JSON is easier to read because
it is plain text, but it supports fewer data types than BSON.
This format optimizes data storage and retrieval, ensuring efficiency in MongoDB's operations. Key
aspects of BSON include:
1. Binary Encoding:
BSON employs binary encoding to represent JSON-like documents in a form that is efficient to
scan and parse. Length prefixes let a reader skip over fields without parsing them, which speeds up
data processing; the encoded document is not always smaller than the equivalent JSON text.
2. Data Types:
BSON supports a rich set of data types, including strings, integers, floating-point numbers, arrays,
and nested documents. This versatility allows MongoDB to handle diverse and complex data
structures, accommodating the dynamic nature of modern applications.
3. Efficient Storage:
Because BSON is binary, values such as numbers and dates are stored in typed form and need no
textual parsing. This reduces processing overhead and improves overall performance, especially in
scenarios with large datasets.
BSON introduces additional data types not found in standard JSON, such as Date and Binary.
These extensions enhance MongoDB's ability to represent diverse information, including date and
time values and binary data like images or multimedia content.
BSON's structure aligns with MongoDB's query language and indexing capabilities, enabling
efficient searching and retrieval of data. MongoDB can leverage BSON's binary format to accelerate
query execution and optimize the performance of read and write operations.
Understanding BSON is essential for MongoDB developers, as it forms the foundation for storing
and processing data within the database. Its binary representation contributes to MongoDB's speed,
flexibility, and scalability, making it a key component in the success of MongoDB as a NoSQL
database solution.
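The difference in data types between JSON and BSON can be observed without a database. The sketch below uses only Python's standard `json` module; the sample document and the helper `json_unsupported_fields` are hypothetical. It reports which fields plain JSON cannot encode directly, which are exactly the kinds of values (dates and binary data) that BSON adds.

```python
import json
from datetime import datetime, timezone

# A sample document mixing JSON-friendly values with a date and raw bytes.
document = {
    "name": "sensor-42",
    "reading": 21.5,
    "tags": ["lab", "demo"],
    "recorded_at": datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    "thumbnail": b"\x89PNG",
}


def json_unsupported_fields(doc: dict) -> list:
    """Return the names of fields that json.dumps cannot encode as-is."""
    unsupported = []
    for field, value in doc.items():
        try:
            json.dumps(value)
        except TypeError:
            unsupported.append(field)
    return unsupported


print(json_unsupported_fields(document))  # ['recorded_at', 'thumbnail']
```

With plain JSON, such values have to be converted by hand, for example to an ISO 8601 string or Base64 text. A BSON-aware driver can instead store them as the Date and Binary types described above.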
MongoDB is well-suited for CMS applications where content structures can vary widely. Its
flexible document-oriented model allows developers to adapt to changing content requirements
without compromising performance. MongoDB's ability to handle large volumes of unstructured data
makes it an ideal choice for storing and retrieving diverse content types efficiently.
2. Real-time Analytics:
In scenarios requiring real-time data analysis, MongoDB excels by providing fast read and write
operations. Its document-oriented data model allows for the storage of complex and varied data,
facilitating quick querying and analysis. MongoDB's support for horizontal scaling ensures
responsiveness even with high volumes of real-time data.
MongoDB is a preferred database for IoT projects where massive amounts of sensor data are
generated. Its ability to handle diverse data types and scale horizontally makes it suitable for storing
and retrieving sensor readings, device information, and other IoT-related data. MongoDB's flexibility
accommodates the dynamic nature of IoT environments.
4. E-commerce Platforms:
MongoDB excels in log and event tracking applications where fast insertion and retrieval of log
data are crucial. Its support for indexing and efficient querying enables the analysis of logs and
events in real-time. MongoDB's horizontal scalability ensures that log and event tracking systems
can handle increasing data volumes seamlessly.
MongoDB serves as an effective backend database for mobile applications, providing a JSON-like
data model that aligns well with the data structures commonly used in mobile development. Its
automatic sharding and replication features ensure high availability and reliability for mobile app
backend services.
7. Geospatial Applications:
MongoDB's geospatial indexing capabilities make it a suitable choice for applications that involve
location-based data, such as mapping and geolocation services. It can efficiently store and retrieve
geospatial information, supporting queries based on proximity, distance, and other location-based
criteria.
Understanding these use cases showcases MongoDB's versatility and applicability across a wide
range of scenarios, making it a robust choice for developers and organizations seeking a flexible and
scalable NoSQL database solution.
MongoDB employs a flexible, document-oriented data model where data is stored in BSON
(Binary JSON) documents. This schema-less approach allows developers to work with dynamic and
evolving data structures, accommodating changes without the need for a predefined schema.
2. Dynamic Schema:
MongoDB's dynamic schema enables developers to add fields to documents on the fly. This
flexibility is particularly beneficial in scenarios where data structures evolve over time, facilitating
agile development and reducing the complexity associated with rigid, predefined schemas.
3. High Performance:
MongoDB is designed for high performance, offering fast read and write operations. Its indexing
and query optimization features, combined with the ability to horizontally scale by sharding, make it
suitable for handling large volumes of data and supporting applications with demanding performance
requirements.
4. Horizontal Scalability:
MongoDB excels in scalable architectures through horizontal scaling. By distributing data across
multiple servers via sharding, MongoDB can handle increased data loads and provide seamless
scalability without sacrificing performance.
MongoDB supports various indexing techniques, enhancing query performance. Developers can
create indexes on fields to speed up data retrieval, and the query optimizer optimizes execution plans
for efficient searches, ensuring optimal performance even with extensive datasets.
6. Aggregation Framework:
MongoDB's powerful aggregation framework allows for complex data transformations and
analysis within the database. It supports pipeline-style aggregations, enabling developers to perform
tasks such as filtering, grouping, and projecting data directly within the database.
8. Geospatial Indexing:
MongoDB includes GridFS, a specification for storing and retrieving large files such as images,
videos, and audio files. This feature allows developers to seamlessly integrate large file storage
within the database, eliminating the need for a separate file storage system.
MongoDB benefits from a vibrant and active community, providing a rich ecosystem of tools,
drivers, and extensions. This community support enhances the development experience and ensures
access to a wealth of resources for troubleshooting and optimization.
Understanding and leveraging these features make MongoDB a robust choice for modern
applications, offering developers the flexibility, scalability, and performance required to address
diverse and evolving data challenges.
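The pipeline idea behind the aggregation framework (filter, then group, then project) can be imitated in plain Python to see how the stages compose. This is a sketch under stated assumptions, not MongoDB code: the `orders` data and the helper functions `match`, `group_sum`, and `project` are hypothetical stand-ins for the `$match`, `$group`, and `$project` stages.

```python
orders = [
    {"customer": "asha", "status": "paid", "amount": 120},
    {"customer": "ravi", "status": "paid", "amount": 80},
    {"customer": "asha", "status": "cancelled", "amount": 50},
    {"customer": "asha", "status": "paid", "amount": 30},
]


def match(docs, **criteria):
    """Filter stage: keep documents whose fields equal the given values."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum(docs, key, field):
    """Group stage: total `field` and count documents per distinct `key`."""
    groups = {}
    for d in docs:
        g = groups.setdefault(d[key], {"_id": d[key], "total": 0, "count": 0})
        g["total"] += d[field]
        g["count"] += 1
    return list(groups.values())


def project(docs, *fields):
    """Project stage: keep only the named fields of each document."""
    return [{f: d[f] for f in fields} for d in docs]


# Each stage consumes the output of the previous one, like a pipeline.
paid = match(orders, status="paid")
per_customer = group_sum(paid, "customer", "amount")
result = project(per_customer, "_id", "total")
print(result)  # [{'_id': 'asha', 'total': 150}, {'_id': 'ravi', 'total': 80}]
```

In mongosh, a comparable pipeline on a hypothetical `orders` collection would be written as `db.orders.aggregate([{ $match: { status: "paid" } }, { $group: { _id: "$customer", total: { $sum: "$amount" } } }])`, with the work done inside the database rather than in application code.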
Installing MongoDB
MongoDB is a cross-platform NoSQL database, compatible with various operating systems. Before
installation, ensure your system meets the following requirements:
- Disk Space: Plan for data storage, indexes, and system files.
- File System: Choose a supported file system like ext4 (Linux) or NTFS (Windows).
2. Select the appropriate version for your operating system (Windows, Linux, or macOS).
3. Choose the desired edition (Community) and click the download button.
4. Follow the prompts to save the installer file to your local machine.
- Windows:
- Follow the installation wizard, accepting the license agreement.
- MongoDB will be installed as a Windows service.
- Linux:
- Optionally, create a symbolic link to the `bin` directory for easy access.
- macOS:
- Install MongoDB using Homebrew: `brew tap mongodb/brew && brew install mongodb/brew/mongodb-community`.
- Alternatively, download the .tgz file and follow similar steps as Linux.
- Basic Configuration:
- Choose the installation directory.
- Network Configuration:
- Set the data directory where MongoDB will store its databases.
- Windows:
- MongoDB as a service starts automatically. Use `net start MongoDB` and `net stop MongoDB` to
manage the service.
- Use the following commands to check server status and view logs:
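- `mongosh admin --eval "db.runCommand({whatsmyuri: 1})"`: Confirm that the server responds and
show the client address it sees for this connection. (Releases before MongoDB 6.0 also ship the
legacy `mongo` shell, which accepts the same arguments.)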
Following these steps ensures a smooth MongoDB installation, configuration, and verification process,
allowing you to start working with MongoDB databases on your chosen platform.
Connecting to MongoDB
MongoDB Connection String:
A MongoDB connection string is a URI-like string that specifies how to connect to a MongoDB
instance. It includes information such as the host, port, authentication details, and other parameters.
The general format is `mongodb://username:password@host:port/database`.
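When the username or password contains reserved characters such as `@`, `:`, or `/`, they must be percent-encoded before being placed in the connection string. The sketch below shows one way to assemble and inspect such a URI with Python's standard library. The helper `build_mongodb_uri`, the credentials, and the database name `college` are hypothetical; 27017 is MongoDB's default port.

```python
from urllib.parse import quote_plus, urlsplit


def build_mongodb_uri(username, password, host, port, database):
    """Assemble a mongodb:// URI, percent-encoding the credentials."""
    return "mongodb://{}:{}@{}:{}/{}".format(
        quote_plus(username), quote_plus(password), host, port, database
    )


uri = build_mongodb_uri("labuser", "p@ss:word/1", "localhost", 27017, "college")
print(uri)  # mongodb://labuser:p%40ss%3Aword%2F1@localhost:27017/college

# The standard URL parser can split the URI back into its components.
parts = urlsplit(uri)
print(parts.hostname, parts.port, parts.path.lstrip("/"))  # localhost 27017 college
```

The resulting string can be passed unchanged to a driver or to a tool that accepts a connection URI.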
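- Python (PyMongo):
from pymongo import MongoClient
client = MongoClient("mongodb://username:password@host:port/database")
- Java (MongoDB Java driver):
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
MongoClient client = MongoClients.create("mongodb://username:password@host:port/database");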
db.createUser({
user: "username",
pwd: "password",
roles: ["readWrite", "dbAdmin"]
});
- Authenticate in the MongoDB shell:
db.auth("username", "password");
Overview of MongoDB Compass (GUI tool for MongoDB):
- Features:
The MongoDB Shell, mongosh, is a JavaScript and Node.js REPL environment for interacting with
MongoDB deployments in Atlas, locally, or on another remote host. Use the MongoDB Shell to test
queries and interact with the data in your MongoDB database.
mongosh is bundled with some MongoDB Community Edition packages, alongside the MongoDB
server and tools; with others (for example, the Windows MSI installer in recent releases) it is a
separate download. The installation steps vary depending on your operating system.
For Windows:
1. Download the MongoDB Community Edition installer from the official MongoDB website:
MongoDB Community Download.
3. Once installed, you can open the mongosh shell from the command prompt or PowerShell.
After installation, you can launch mongosh by typing mongosh in your terminal or command
prompt.
Remember to start the MongoDB server (mongod) before using mongosh. The installation process
may differ slightly based on updates or changes, so refer to the MongoDB documentation for the
most current information: MongoDB Installation Guides.
Helpers
Show Databases
show dbs
db // prints the current database
Switch Database
use <database_name>
Show Collections
show collections
Run JavaScript File
load("myScript.js")
- Ensure the server allows remote connections (bind to all IP addresses or specific IP).
Securing the MongoDB Deployment:
- Encryption:
- Authentication:
- Firewall Rules:
- Audit Logging:
- Keep MongoDB and system software up to date with the latest security patches.
By following these guidelines, you can establish a secure MongoDB deployment, ensuring that data
is protected and access is restricted to authorized users and applications.
Additional Topics
- For restoration, employ `mongorestore` to rebuild the database from the dump.
3. Continuous Backup Services:
2. Multi-Cloud Availability:
- MongoDB Atlas supports multiple cloud providers, including AWS, Azure, and Google
Cloud.
- Enables flexibility in choosing the cloud infrastructure that best suits your needs.
4. Automated Scaling:
- Allows for automatic horizontal scaling by adjusting the number of nodes in a cluster based on
demand.
1. Serverless Backend:
- MongoDB Stitch (since renamed MongoDB Realm and later Atlas App Services) is a serverless
platform that eliminates the need for traditional server management.
- Provides a unified platform for backend development and database management.
5. Real-Time Updates:
By adhering to these strategies, practices, and leveraging MongoDB's cloud services, developers can
ensure the robustness, scalability, and efficiency of MongoDB deployments, whether in traditional
environments or cloud-based solutions.
H. Resources/Equipment Required
I. Practical related Quiz
- A. Relational
- B. Document-oriented
- C. Tabular
- D. Hierarchical
2. In MongoDB, what does BSON stand for?
data Systems (CMS)
- C. Applications with static schemas
- D. Single-location data sources
10. What MongoDB feature allows for automatic horizontal scaling by adjusting the
number of nodes in a cluster based on demand?
J. References Links
1. https://www.geeksforgeeks.org/what-is-mongodb-working-and-features/
2. https://www.w3schools.com/mongodb/
3. https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-windows/
4. https://www.simplilearn.com/tutorials/mongodb-tutorial/install-mongodb-on-windows
5. https://www.geeksforgeeks.org/how-to-install-mongodb-on-windows/
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
MongoDB Installation Process | Successfully installs MongoDB with accurate and clear steps, showcasing deep understanding. | Installs MongoDB correctly with clear steps. | Installs MongoDB with some errors or confusion. | Faces challenges during installation process. |
Configuration and Setup Understanding | Demonstrates an in-depth understanding of MongoDB configuration and setup procedures. | Understands MongoDB configuration and setup. | Demonstrates basic knowledge of MongoDB configuration. | Struggles to comprehend MongoDB configuration. |
Utilization of MongoDB Features | Effectively utilizes MongoDB features during the installation process, demonstrating proficiency. | Utilizes MongoDB features appropriately. | Attempts to use MongoDB features. | Makes limited use of MongoDB features. |
Troubleshooting Skills | Demonstrates excellent troubleshooting skills, quickly resolving any issues that may arise. | Exhibits good troubleshooting skills. | Demonstrates basic troubleshooting skills. | Struggles to troubleshoot issues effectively. |
Average Marks |
Practical 3. Basic CRUD Operations with MongoDB
A. Objective
The primary objective of this practical is to familiarize students with the fundamental CRUD (Create,
Read, Update, Delete) operations in MongoDB, a NoSQL database. Students will gain hands-on
experience in performing these operations using the MongoDB shell and understand the underlying
principles of data manipulation in a MongoDB environment.
Apply fundamental mathematical and engineering concepts in implementing CRUD operations with
MongoDB.
2. Problem Analysis:
Utilize codified standard methods to identify and analyze well-defined engineering problems related
to MongoDB data manipulation.
3. Design/Development of Solutions:
Design effective solutions for well-defined technical problems in MongoDB, emphasizing CRUD
operations and database structure.
Apply modern engineering tools to interact with MongoDB databases and conduct tests and
measurements, ensuring data integrity.
6. Project Management:
Use engineering management principles for effective communication and collaborative project
management during MongoDB activities.
7. Life-Long Learning:
Demonstrate a commitment to life-long learning by analyzing individual needs and updating skills in
response to technological changes in MongoDB and NoSQL databases.
- Demonstrate proficiency in creating, reading, updating, and deleting data in MongoDB,
showcasing a solid understanding of basic CRUD operations.
- Develop practical skills in using the MongoDB shell to navigate databases, execute CRUD
operations, and verify the success of data manipulations.
- Exhibit competency in identifying and resolving common errors during CRUD operations,
showcasing problem-solving skills in the MongoDB environment.
Apply MongoDB's features and basic CRUD operations to design and manipulate data structures
effectively, demonstrating proficiency in utilizing a document-oriented database.
Students will acquire practical proficiency in MongoDB's CRUD operations, mastering document
structure design, shell navigation, and error handling. This ensures a foundational understanding for
real-world data manipulation scenarios in MongoDB.
Students will cultivate a strong appreciation for proficient database management, showcasing
enhanced confidence in executing MongoDB CRUD operations. They will adopt a positive attitude
towards addressing real-world data challenges, expressing a sense of accomplishment and
motivation. Recognizing the importance of their MongoDB skills, students will be motivated to
apply these acquired competencies in future engineering applications, fostering a positive affective
response towards database manipulation and an eagerness to implement MongoDB concepts in their
professional endeavors.
G. Prerequisite Theory
CRUD (Create, Read, Update, Delete) names the four basic operations that let users create, view,
search, and modify the contents of a database.
To modify MongoDB documents, one connects to a server, queries the relevant documents, adjusts
the desired properties, and sends the data back to update the database. CRUD is data-oriented, and
its four operations map naturally onto the HTTP verbs POST, GET, PUT/PATCH, and DELETE.
Breaking down individual CRUD operations:
use your_database_name
Replace `your_database_name` with the desired name. If the database exists, MongoDB switches to
it; otherwise, it creates the database on demand, the first time data is inserted.
db
This displays the current database name.
Dropping a Database
db.dropDatabase()
Ensure you are in the intended database before executing this command.
- Lazy Creation: MongoDB creates a database only when data is first inserted, so an empty database
consumes no storage.
- Switching Context: The `use` command switches the shell between databases.
- Database Naming Rules: Names cannot be empty, must be shorter than 64 bytes, and cannot contain
spaces, the null character, or any of the characters / \ . " $ (Windows forbids a few more). Names
are case-sensitive, but two databases cannot have names that differ only in case.
Example:
use mydatabase
Dropping a database:
db.dropDatabase()
To create a collection in MongoDB, you can use the `createCollection` method:
db.createCollection("your_collection_name")
Replace `your_collection_name` with the desired name for your collection.
To drop a collection, call its `drop` method:
db.your_collection_name.drop()
Replace `your_collection_name` with the name of the collection you want to drop.
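- Collection Naming Rules: Names are case-sensitive and should begin with a letter or an underscore.
They cannot be empty, contain the null character or the dollar sign ($), or begin with the
`system.` prefix. Periods (.) are allowed.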
Example:
Creating a collection:
db.createCollection("products")
Dropping a collection:
db.products.drop()
Inserting Documents:
db.collection_name.insertMany([
{ key1: value1, key2: value2, ... },
{ key1: value1, key2: value2, ... },
...
])
Reading Documents:
db.collection_name.find()
You can add criteria to filter the results:
db.collection_name.find({ filter_criteria })
Updating Documents:
db.collection_name.updateOne({ filter_criteria }, { $set: { key: new_value } })
For multiple documents:
db.collection_name.updateMany({ filter_criteria }, { $set: { key: new_value } })
Deleting Documents:
db.collection_name.deleteOne({ filter_criteria })
For multiple documents:
db.collection_name.deleteMany({ filter_criteria })
- Ensure documents match the desired structure for the collection.
Examples:
Reading documents:
db.products.find()
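The four operations can be tried together on a small `products` collection. The following is a minimal sketch, not a prescribed solution: the sample names, prices, and categories are assumptions made for illustration, and the `typeof db` guard is only there so that the same file loads both in mongosh and in plain Node.js.

```javascript
// Sample documents; the names, prices, and categories are illustrative assumptions.
const sampleProducts = [
  { name: "Product A", price: 20, category: "Stationery" },
  { name: "Product B", price: 45, category: "Stationery" },
  { name: "Product C", price: 150, category: "Electronics" },
];

// Filter and update documents are plain objects, so they can be built and inspected anywhere.
const cheapFilter = { price: { $lt: 50 } };
const priceUpdate = { $set: { price: 25 } };

// `db` exists only inside mongosh; the guard lets this file also load in plain Node.js.
if (typeof db !== "undefined") {
  db.products.insertMany(sampleProducts);                      // Create
  printjson(db.products.find(cheapFilter).toArray());          // Read with a filter
  db.products.updateOne({ name: "Product A" }, priceUpdate);   // Update one document
  db.products.deleteMany({ category: "Electronics" });         // Delete every match
  printjson(db.products.countDocuments({}));                   // Check what remains
}
```

Run inside mongosh after `use mydatabase`, each call returns an acknowledgement object that can be inspected to verify the operation.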
In MongoDB, the way data is organized is very flexible. Unlike in traditional databases, you don't
have to decide and declare how the data should look beforehand. Each piece of data (document) can
be unique, but usually, they follow a similar pattern. The challenge is finding a good balance
between what the application needs, how fast the database works, and how you want to get the data
back. When designing how data should look, think about how the data will be used in the application
and how the data naturally fits together.
Designing MongoDB data models involves crucial decisions on document structure and representing
data relationships. Two approaches are commonly used: references and embedded documents.
References: Relationships are established through links or references between documents, allowing
applications to access related data by resolving these references. This approach follows a normalized
data model.
Embedded Data: Relationships are captured by storing related data within a single document
structure. MongoDB supports embedding document structures within fields or arrays, enabling
retrieval and manipulation of related data in a single database operation. This approach follows
denormalized data models.
MongoDB supports two kinds of data model: the embedded data model and the normalized data model.
Based on the requirement, you can use either model while preparing your documents.
Embedded Data Model
In this model, you embed all the related data in a single document; it is also known as the
denormalized data model.
For example, suppose the details of an employee arrive as three separate documents, namely
Personal_details, Contact, and Address. You can embed all three in a single document as
shown below:
{
_id: <ObjectId101>,
Emp_ID: "10025AE336",
Personal_details: {
First_Name: "Radhika", Last_Name: "Sharma", Date_Of_Birth: "1995-09-26"
},
Contact: {
"e-mail": "[email protected]", phone: "9848022338"
},
Address: {
city: "Hyderabad", Area: "Madapur", State: "Telangana"
}
}
Normalized Data Model
In this model, the original document refers to its sub-documents through references. For
example, the document above can be rewritten in the normalized model as:
Employee:
{
_id: <ObjectId101>,
Emp_ID: "10025AE336"
}
Personal_details:
{
_id: <ObjectId102>,
empDocID: "ObjectId101",
First_Name: "Radhika",
Last_Name: "Sharma",
Date_Of_Birth: "1995-09-26"
}
Contact:
{
_id: <ObjectId103>,
empDocID: "ObjectId101",
"e-mail": "[email protected]",
phone: "9848022338"
}
Address:
{
_id: <ObjectId104>,
empDocID: "ObjectId101",
city: "Hyderabad",
Area: "Madapur",
State: "Telangana"
}
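The practical difference between the two models is where the related data gets joined. The sketch below uses plain JavaScript objects to stand in for documents; the integer `_id` values replace real ObjectIds and the helper `resolveEmployee` is hypothetical, both chosen only to keep the example short. With embedding, one read returns everything; with references, the application (or a `$lookup` stage on the server) has to resolve `empDocID` itself.

```javascript
// Embedded (denormalized) model: a single document holds all related data.
const embeddedEmployee = {
  _id: 101,
  Emp_ID: "10025AE336",
  Personal_details: { First_Name: "Radhika", Last_Name: "Sharma" },
  Address: { city: "Hyderabad", State: "Telangana" },
};

// Normalized model: separate "collections" linked through empDocID.
const employees = [{ _id: 101, Emp_ID: "10025AE336" }];
const personalDetails = [
  { _id: 102, empDocID: 101, First_Name: "Radhika", Last_Name: "Sharma" },
];
const addresses = [
  { _id: 104, empDocID: 101, city: "Hyderabad", State: "Telangana" },
];

// Hypothetical helper: resolves the references in application code.
function resolveEmployee(empId) {
  const employee = employees.find((e) => e.Emp_ID === empId);
  if (!employee) return null;
  // Drop the bookkeeping fields so the result has the embedded shape.
  const strip = ({ _id, empDocID, ...rest }) => rest;
  const personal = personalDetails.find((p) => p.empDocID === employee._id);
  const address = addresses.find((a) => a.empDocID === employee._id);
  return {
    ...employee,
    Personal_details: personal ? strip(personal) : null,
    Address: address ? strip(address) : null,
  };
}

console.log(resolveEmployee("10025AE336"));
```

The normalized form needs three lookups to rebuild what the embedded form returns in one, which is the trade-off to weigh against the duplication that embedding can introduce.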
Indexing in MongoDB
Key Concepts:
1. Index Types:
- MongoDB supports various index types, including single-field, compound, multikey, and text
indexes.
- Each index type caters to specific query patterns and enhances data access.
2. Index Creation:
- MongoDB automatically creates an index on the `_id` field for each collection.
- Additional indexes can be strategically created based on the application's query requirements.
Managing Indexes:
1. Listing Existing Indexes:
- Syntax: `db.collection_name.getIndexes()`.
2. Dropping an Index:
- Syntax: `db.collection_name.dropIndex({ field_name: 1 })`.
Best Practices:
1. Selective Indexing:
- Index only the fields that queries filter or sort on, because every index adds storage and
write overhead.
2. Regular Monitoring:
- Utilize tools like MongoDB Compass or the Database Profiler for comprehensive monitoring.
- Choose index types based on specific query patterns and application requirements.
Practical Implementation:
Creating an index on the "name" field in the "products" collection
db.products.createIndex({ name: 1 })
Listing existing indexes in the "products" collection
db.products.getIndexes()
Dropping an index on the "name" field in the "products" collection
db.products.dropIndex({ name: 1 })
Key Strategies:
1. Covered Queries:
- Aim for covered queries where the index alone satisfies the query without loading
documents from the collection.
- Utilize the `explain` method to analyze query execution plans and identify areas for
optimization.
3. Index Hinting:
- Use the `hint` method to explicitly specify which index MongoDB should use for a query.
Best Practices:
1. Query Specificity:
- Craft queries that are specific and target only the necessary data.
- Ensure queries utilize indexes to avoid collection scans, which can be resource-intensive.
Practical Implementation:
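1. Explaining a Query:
- Call the `explain` method on a query to view its execution plan, for example
`db.products.find({ name: "Product A" }).explain("executionStats")`.
2. Index Hinting:
- Explicitly specify the index to be used for a query using the `hint` method.
Example:
db.products.find({ name: "Product A" }).hint({ name: 1 })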
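To connect covered queries with `explain`, the sketch below builds a compound index specification, a filter, and a projection, and checks with a deliberately simplified rule whether the query could be covered. The helper `canBeCovered` is hypothetical and ignores multikey and nested fields; the field choice is an assumption. The mongosh part only runs when `db` is defined.

```javascript
// Compound index key pattern; the field choice is an assumption for illustration.
const indexKeys = { category: 1, price: 1 };
const filter = { category: "Electronics" };
// _id must be excluded explicitly, because it is returned by default and is not in the index.
const projection = { _id: 0, category: 1, price: 1 };

// Simplified rule: every filtered and every returned field must be part of the index.
function canBeCovered(keys, query, proj) {
  const indexed = new Set(Object.keys(keys));
  const queryFieldsIndexed = Object.keys(query).every((f) => indexed.has(f));
  const returned = Object.keys(proj).filter((f) => proj[f] === 1);
  const returnedFieldsIndexed =
    returned.length > 0 && returned.every((f) => indexed.has(f));
  const idHandled = proj._id === 0 || indexed.has("_id");
  return queryFieldsIndexed && returnedFieldsIndexed && idHandled;
}

console.log(canBeCovered(indexKeys, filter, projection));

if (typeof db !== "undefined") {
  db.products.createIndex(indexKeys);
  const plan = db.products.find(filter, projection).explain("executionStats");
  // When matching documents exist, zero documents examined suggests a covered query.
  printjson(plan.executionStats.totalDocsExamined);
}
```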
Aggregation Framework
Aggregation in MongoDB is a powerful framework that allows for the transformation and processing
of documents within a collection. It enables users to perform advanced data manipulations,
combining and transforming data in various ways to derive meaningful insights. Aggregation is
especially useful for complex analytics and reporting tasks.
Key Concepts:
1. Pipeline Stages:
- Aggregation operations are structured as a pipeline, where each stage performs a specific
transformation on the data.
2. Expression Operators:
- Aggregation expressions allow the manipulation of data within the pipeline stages.
- The `$group` stage is fundamental for grouping documents based on specified criteria.
- Aggregation functions like `$sum`, `$avg`, `$min`, and `$max` facilitate summarization within
groups.
Practical Implementation:
1. Basic Aggregation:
db.products.aggregate([
{ $group: { _id: null, avg_price: { $avg: "$price" } } }
])
2. Grouping by a Field:
db.products.aggregate([
{ $group: { _id: "$category", avg_price: { $avg: "$price" } } }
])
3. Text Search and Matching:
db.products.aggregate([
{ $match: { $text: { $search: "keyword" } } }
])
Best Practices:
1. Optimizing Pipeline:
- Familiarize yourself with the data structure to craft effective aggregation pipelines.
Example:
Aggregation pipeline to find the total sales for each product category
db.sales.aggregate([
{ $group: { _id: "$product.category", total_sales: { $sum: "$quantity" } } },
{ $sort: { total_sales: -1 } }
])
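To see what a `$match`, `$group`, `$sort` pipeline computes, it can help to write the same logic in plain JavaScript and compare results by hand. The sketch below does that for an average-price-per-category pipeline; the sample documents and the helper `averagePriceByCategory` are assumptions for illustration, not part of MongoDB.

```javascript
// Hypothetical sample data; names and prices are assumptions for illustration.
const products = [
  { name: "Pen", category: "Stationery", price: 10 },
  { name: "Notebook", category: "Stationery", price: 30 },
  { name: "Mouse", category: "Electronics", price: 500 },
];

// The pipeline as it would be passed to db.products.aggregate(...).
const avgByCategory = [
  { $match: { price: { $gt: 0 } } },
  { $group: { _id: "$category", avg_price: { $avg: "$price" } } },
  { $sort: { avg_price: -1 } },
];

// Plain-JavaScript equivalent of the three stages above.
function averagePriceByCategory(docs) {
  const totals = new Map();
  for (const doc of docs) {
    if (!(doc.price > 0)) continue;                       // $match
    const t = totals.get(doc.category) || { sum: 0, count: 0 };
    t.sum += doc.price;                                   // $group accumulates per key
    t.count += 1;
    totals.set(doc.category, t);
  }
  return [...totals]
    .map(([category, t]) => ({ _id: category, avg_price: t.sum / t.count }))
    .sort((a, b) => b.avg_price - a.avg_price);           // $sort, descending
}

console.log(averagePriceByCategory(products));

// Inside mongosh, the real pipeline can be run against an existing collection.
if (typeof db !== "undefined") {
  printjson(db.products.aggregate(avgByCategory).toArray());
}
```

Placing `$match` first keeps later stages from processing documents that would be discarded anyway.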
Replica Sets
A replica set in MongoDB is a distributed system that provides high availability and fault tolerance
by maintaining multiple copies of data across multiple servers. This ensures data redundancy and
allows for automatic failover in case of server failures. Replica sets are a fundamental component for
building resilient MongoDB deployments.
Key Concepts:
- A replica set consists of multiple servers, with one designated as the primary and the others
as secondaries.
- The primary handles all write operations, while secondaries replicate data from the
primary.
2. Automatic Failover:
- In the event of a primary node failure, replica sets automatically elect a new primary from the
available secondaries.
3. Data Redundancy:
- Each member of the replica set contains a full copy of the data, providing redundancy and data
availability.
- Initialize a replica set by running the `rs.initiate()` command once, on a single member; the
members then elect a primary.
2. Read Preference:
- Clients can specify their read preferences to direct read operations to the primary or
secondary nodes.
- MongoDB provides tools like `rs.status()` for monitoring the health of a replica set.
- Regularly check the status and perform maintenance tasks such as adding or removing nodes.
Practical Considerations:
2. Arbiter Nodes:
3. Data Safety:
- Regularly back up data and monitor the replica set to ensure data safety.
Example:
rs.initiate(
{
_id: "myReplicaSet",
members: [
{ _id: 0, host: "mongo1:27017" },
{ _id: 1, host: "mongo2:27017" },
{ _id: 2, host: "mongo3:27017" }
]
}
)
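Automatic failover depends on a majority: a new primary can be elected only while more than half of the voting members can reach each other. The short sketch below works out that arithmetic; the function names are hypothetical helpers, not MongoDB commands.

```javascript
// Votes needed to elect a primary: strictly more than half of the voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can fail while the set can still elect a primary.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(
    `${n} members: majority ${majorityNeeded(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

A fourth member adds no extra tolerance over three, which is why replica sets are usually given an odd number of voting members.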
Sharding in MongoDB
Sharding is a horizontal scaling technique in MongoDB designed to distribute data across multiple
servers, or shards. This enables MongoDB to handle large datasets and high write and read traffic,
ensuring scalability and improved performance. Sharding is a key feature for organizations with
growing data demands.
Key Concepts:
1. Shard:
- A shard is an individual server or a replica set that stores a subset of the data.
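2. Shard Key:
- The shard key is the field (or set of fields) whose values determine how a collection's
documents are distributed across shards.
3. Balancer:
- The balancer is a background process that migrates data between shards to keep the
distribution even.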
1. Setting Up Sharding:
- Consider query patterns, data distribution, and growth when selecting a shard key.
3. Scaling Shards:
- MongoDB dynamically redistributes data to maintain an even load.
Practical Considerations:
1. Query Isolation:
- Ensure that queries align with the chosen shard key to take advantage of sharding benefits.
- Regularly monitor the health and performance of shards using tools like `sh.status()` and
`mongostat`.
Example:
Enabling sharding for a database
sh.enableSharding("myDatabase")
Sharding a collection using a specific shard key
sh.shardCollection("myDatabase.myCollection", { "shardKeyField": 1 })
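Why queries should align with the shard key can be illustrated with a toy model of range-based sharding. In the sketch below, the chunk boundaries, shard names, and helper functions are all assumptions; a real cluster keeps this mapping in its config servers. A query on a single key value touches one shard, while a range may touch several.

```javascript
// Chunk ranges [min, max) mapped to shards; boundaries and names are hypothetical.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// Targeted lookup: exactly one chunk contains a given shard key value.
function shardFor(keyValue) {
  const chunk = chunks.find((c) => keyValue >= c.min && keyValue < c.max);
  return chunk.shard;
}

// Range query low..high (inclusive): every chunk overlapping the range is involved.
function shardsForRange(low, high) {
  return chunks.filter((c) => c.max > low && c.min <= high).map((c) => c.shard);
}

console.log(shardFor(150));
console.log(shardsForRange(50, 120));
```

A query that does not include the shard key at all cannot be narrowed this way and has to be sent to every shard.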
H. Resources/Equipment Required
1. Computers/Laptops:
- Ensure students have access to devices with pre-installed MongoDB for hands-on practice.
2. Internet Connection:
- Provide a stable internet connection for MongoDB installation and accessing online
documentation.
3. Learning Materials:
- Distribute guides and tutorials to aid students during the practical session.
4. Sample Datasets:
- Supply datasets for students to apply CRUD operations, enhancing their practical
understanding.
5. Technical Support:
- Offer assistance for installation or operational issues, ensuring a smooth learning experience.
{
"_id": ObjectId("5f8a4c261c9d4400001a4af4"),
"name": "Laptop",
"price": 1200.99,
"category": "Electronics",
"stock": 50
}
- Create a database named `OnlineStore` and a collection named `Products`
with at least 4 fields.
- Insert at least 10 records into the `Products` collection with various product
details.
- Retrieve and display all the documents inserted into the `Products` collection.
- Retrieve and display the documents in the `Products` collection, sorted by price
in ascending order.
{
"_id": ObjectId("5f8a4c261c9d4400001a4af7"),
"title": "The Great Gatsby",
"author": "F. Scott Fitzgerald",
"genre": "Classic",
"year": 1925
}
- Create a database named `Library` and a collection named `Books` with at
least 4 fields.
- Insert at least 10 records into the `Books` collection with details about various books.
- Retrieve and display all the documents inserted into the `Books` collection.
- Update the publication year of a book in the `Books` collection based on a given
condition.
3. Consider the following collections and perform the following queries on them. [4 hr]
- Collection: `users`
{
"_id": ObjectId("5f8a49e11c9d4400001a4af1"),
"name": "John Doe",
"age": 25,
"email": "[email protected]"
}
- Collection: `blog_posts`
{
"_id": ObjectId("5f8a4a491c9d4400001a4af2"),
"title": "MongoDB Blog",
"content": "Introduction to MongoDB...",
"author": "Jane Smith",
"comments": [
{
"content": "Great post!",
"author": "Alice",
"timestamp": ISODate("2022-01-01T12:00:00Z")
},
{
"content": "Thanks for sharing.",
"author": "Bob",
"timestamp": ISODate("2022-01-02T09:30:00Z")
}
]
}
- Collection: `recipes`
{
"_id": ObjectId("5f8a4b431c9d4400001a4af3"),
"name": "Spaghetti Bolognese",
"ingredients": ["spaghetti", "ground beef", "tomato sauce", "onions"],
"steps": ["Boil spaghetti", "Cook ground beef", "Mix with tomato sauce and onions"]
}
- Create a document for a "user" with fields for name, age, and email.
- Design a document structure for a "blog post" that includes nested comments with
fields for content, author, and timestamp.
- Create a document structure for a "recipe" with an array field for ingredients and
another for steps.
- Retrieve all documents from a "products" collection where the price is greater than
$50.
- Use the aggregation framework to calculate the average rating for a "movie"
collection based on user reviews.
- Create an index on the "timestamp" field of a "logs" collection and explain how it
improves query performance.
- Implement a text search query to find all documents in a "books" collection that
contain the word "MongoDB" in the title or description.
- Use the aggregation pipeline to group "sales" data by month and calculate the
total revenue for each month.
- Set up a replica set with three nodes and demonstrate a failover scenario.
J. References Links
1. https://www.mongodb.com/developer/products/mongodb/cheat-sheet/
2. https://gist.github.com/bradtraversy/f407d642bdc3b31681bc7e56d95485b6
3. https://www.mongodb.com/basics/crud
4. https://www.scaler.com/topics/crud-operations-in-mongodb/
5. https://www.mongodb.com/developer/expertise-levels/advanced/
6. https://data-flair.training/blogs/mongodb-features/
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
Understanding of CRUD Operations | Demonstrates a thorough understanding of CRUD | Shows a good understanding of CRUD | Displays a basic understanding of CRUD | Lacks understanding of CRUD operations |
Data Modeling Proficiency | Shows exceptional skills in data modeling in MongoDB | Demonstrates good proficiency in data modeling | Has a basic understanding of data modeling | Struggles in grasping data modeling concepts |
Indexing and Query Optimization | Successfully implements indexing for optimal queries | Implements indexing with minor issues | Attempts to use indexing but with significant issues | Unable to implement indexing and optimization |
Knowledge of Sharding in MongoDB | Demonstrates proficiency in implementing and managing sharding | Shows good understanding of sharding | Attempts to implement sharding but with significant issues | Unable to implement sharding in MongoDB |
Average Marks | | | | |
Practical 4. Introduction and Setup of Cassandra
A. Objective
2. Problem Analysis Skills: Identify and analyze engineering problems in Cassandra cluster setup,
employing standard methods for troubleshooting and resolution.
3. Design Solutions Capability: Design effective solutions for technical challenges in Apache
Cassandra configuration, ensuring optimal distributed database performance.
4. Engineering Tools Proficiency: Apply modern engineering tools and techniques to conduct tests
and measurements, validating the successful setup of the Cassandra cluster.
6. Project Management Skills: Utilize engineering management principles for collaborative and
effective communication, leading or contributing to the successful Cassandra cluster setup.
7. Life-long Learning Mindset: Analyze individual learning needs and adapt to technological
changes, fostering continuous development in the field of distributed database management with
Apache Cassandra.
2. Problem Solving Skills: Through troubleshooting common issues, students enhance their
problem-solving abilities in distributed database management with Apache Cassandra.
4. Tool Proficiency and Validation: Application of modern tools for tests and measurements
validates the successful setup of the Cassandra cluster, honing practical skills in tool utilization.
D. Relevant Course Outcomes (COs)
Demonstrate Cassandra's data model and query language (CQL), showcasing the ability to create and
manage distributed data tables efficiently.
Students will showcase their competence by successfully configuring an Apache Cassandra cluster,
translating theoretical knowledge into hands-on skills for distributed database management.
Additionally, a detailed troubleshooting report will highlight their proficiency in identifying and
resolving common issues encountered during the Cassandra cluster setup, demonstrating practical
problem-solving abilities.
G. Prerequisite Theory
Cassandra is a highly scalable, distributed NoSQL database designed for handling large amounts of
data across multiple commodity servers without any single point of failure. Originally developed at
Facebook and now maintained by the Apache Software Foundation, it offers high availability, fault
tolerance, and seamless horizontal scaling, making it a popular choice for applications with
demanding data requirements.
History of Cassandra
Apache Cassandra is the open-source distribution, managed by the Apache Software Foundation.
DataStax Enterprise is a commercial distribution built on Cassandra and aimed at organizations
that store huge amounts of data and want vendor support.
Key Features:
2. No Single Point of Failure: With a decentralized design, Cassandra eliminates the risk of a single
point of failure, enhancing reliability and system resilience.
3. High Availability: Data is replicated across multiple nodes, allowing for uninterrupted access
even in the event of node failures or network issues.
4. Scalability: Cassandra's linear scalability allows for the addition of nodes to accommodate
growing data needs, making it suitable for applications experiencing rapid expansion.
5. Flexible Data Model: Cassandra's wide-column model lets rows in the same table populate
different columns, and columns can be added to a table without downtime, accommodating changing
application requirements.
6. Query Language (CQL): Cassandra Query Language (CQL) is similar to SQL, making it easier
for developers familiar with relational databases to transition to Cassandra.
7. Tunable Consistency: Users can configure the level of data consistency based on their
application's requirements, striking a balance between performance and data integrity.
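Tunable consistency comes down to simple arithmetic on the replication factor. As a minimal sketch (the function names are hypothetical helpers): a quorum is more than half of the replicas, and a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor, because the two sets must then overlap.

```javascript
// Replicas that form a quorum for a given replication factor.
function quorum(replicationFactor) {
  return Math.floor(replicationFactor / 2) + 1;
}

// True when every read set must overlap every write set (R + W > RF).
function readsSeeLatestWrite(readReplicas, writeReplicas, replicationFactor) {
  return readReplicas + writeReplicas > replicationFactor;
}

const rf = 3;
console.log(quorum(rf));                                      // replicas in a quorum
console.log(readsSeeLatestWrite(quorum(rf), quorum(rf), rf)); // QUORUM reads and writes
console.log(readsSeeLatestWrite(1, 1, rf));                   // ONE reads and writes
```

Reading and writing at consistency level ONE is faster but allows stale reads, which is the performance versus integrity trade-off described above.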
Use Cases:
1. Big Data Analytics: Cassandra excels in handling vast amounts of data, making it a preferred
choice for big data analytics applications.
2. IoT (Internet of Things): Its ability to scale horizontally makes Cassandra well-suited for
managing large volumes of data generated by IoT devices.
4. Time-Series Data: The database's efficient handling of time-series data makes it valuable for
applications like monitoring systems and logging.
Challenges:
1. Learning Curve: The decentralized nature of Cassandra may pose a learning curve for those
accustomed to traditional relational databases.
2. Consistency Trade-offs: Achieving high availability often involves trade-offs in terms of data
consistency, requiring careful configuration based on application needs.
Cassandra offers a robust solution for organizations seeking a highly scalable, fault-tolerant, and
flexible NoSQL database, particularly well-suited for applications with demanding data
requirements.
Cassandra's architecture is tailored for handling extensive big data workloads across a cluster of
interconnected nodes, eliminating a single point of failure. Operating on a peer-to-peer distributed
system, each independent node can seamlessly handle both read and write requests, irrespective of
the data's physical location in the cluster. In the event of a node failure, requests can be redirected to
other nodes, ensuring continuous service availability.
Data replication in Cassandra involves nodes acting as replicas, safeguarding data against failures. In
case of discrepancies, Cassandra prioritizes returning the most recent value to the client and initiates
background read repairs to update outdated values.
Key components include nodes (data storage units), data centers (collections of related nodes), and
clusters (containing one or more data centers). The commit log serves as a crash-recovery
mechanism, capturing every write operation. Data is also written to the mem-table, a memory-resident
structure, which is flushed to SSTables (immutable disk files) upon reaching a threshold. Bloom
filters, compact probabilistic structures for set-membership testing, are consulted on reads so that
Cassandra can skip SSTables that cannot contain the requested data.
This architecture ensures robustness, fault tolerance, and scalability in managing distributed data
across a Cassandra cluster.
Users can access Cassandra through its nodes using Cassandra Query Language (CQL). CQL treats
the database (Keyspace) as a container of tables. Programmers work with CQL either through cqlsh,
an interactive command-line shell, or through application language drivers.
Clients can approach any of the nodes for their read-write operations. That node (the coordinator)
acts as a proxy between the client and the nodes holding the data.
Write Operations
Every write is first appended to the commit log on the node. The data is then stored in the
mem-table. Whenever the mem-table is full, its contents are flushed to an SSTable data file on disk.
All writes are automatically partitioned and replicated throughout the cluster. Cassandra
periodically consolidates the SSTables in a process called compaction, discarding unnecessary data.
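The write path described above can be mimicked with a small in-memory model. This is a toy sketch under stated assumptions, not Cassandra's implementation: the class `ToyNode`, its size-based flush threshold, and the use of flush order instead of write timestamps are all simplifications made for illustration.

```javascript
// Toy model of one node's write path: commit log -> mem-table -> SSTables.
class ToyNode {
  constructor(memtableLimit) {
    this.memtableLimit = memtableLimit; // flush threshold (entries), an assumption
    this.commitLog = [];
    this.memtable = new Map();
    this.sstables = [];
  }

  write(key, value) {
    this.commitLog.push({ key, value }); // 1. append to the commit log for crash recovery
    this.memtable.set(key, value);       // 2. update the in-memory table
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  flush() {
    // 3. write the mem-table out as a sorted table and start a fresh one
    const sorted = [...this.memtable.entries()].sort(([a], [b]) =>
      a < b ? -1 : a > b ? 1 : 0
    );
    this.sstables.push(new Map(sorted));
    this.memtable.clear();
    this.commitLog = []; // flushed data no longer needs log replay
  }

  read(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (let i = this.sstables.length - 1; i >= 0; i--) { // newest table first
      if (this.sstables[i].has(key)) return this.sstables[i].get(key);
    }
    return undefined;
  }

  compact() {
    // Merge all tables; later tables overwrite older values for the same key.
    const merged = new Map();
    for (const table of this.sstables) {
      for (const [k, v] of table) merged.set(k, v);
    }
    this.sstables = [merged];
  }
}

const demoNode = new ToyNode(2);
demoNode.write("sensor1", 20);
demoNode.write("sensor2", 21); // second entry reaches the limit and triggers a flush
console.log(demoNode.sstables.length, demoNode.memtable.size);
```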
Read Operations
During read operations, Cassandra gets values from the mem-table and checks the bloom filter to
find the appropriate SSTable that holds the required data.
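A Bloom filter answers "definitely not here" or "possibly here", which is what lets a read skip SSTables cheaply. The sketch below is a minimal illustration: the filter size, the number of hash functions, and the seeded FNV-1a style hash are assumptions, and Cassandra's real implementation differs.

```javascript
// Minimal Bloom filter: no false negatives, occasional false positives.
class BloomFilter {
  constructor(size = 64, hashCount = 3) {
    this.bits = new Uint8Array(size);
    this.hashCount = hashCount;
  }

  // Seeded string hash in the style of FNV-1a (an assumption for this sketch).
  hash(key, seed) {
    let h = 2166136261 ^ seed;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 16777619);
    }
    return (h >>> 0) % this.bits.length;
  }

  add(key) {
    for (let s = 0; s < this.hashCount; s++) this.bits[this.hash(key, s)] = 1;
  }

  mightContain(key) {
    for (let s = 0; s < this.hashCount; s++) {
      if (!this.bits[this.hash(key, s)]) return false; // definitely absent
    }
    return true; // possibly present
  }
}

// Build one filter per table and consult it before touching the table's data.
function makeTable(entries) {
  const filter = new BloomFilter();
  for (const key of Object.keys(entries)) filter.add(key);
  return { filter, data: new Map(Object.entries(entries)) };
}

function readKey(tables, key) {
  for (let i = tables.length - 1; i >= 0; i--) {     // newest table first
    if (!tables[i].filter.mightContain(key)) continue; // skip without a disk read
    if (tables[i].data.has(key)) return tables[i].data.get(key);
  }
  return undefined;
}

const tables = [makeTable({ user1: "Asha" }), makeTable({ user2: "Ravi" })];
console.log(readKey(tables, "user1"));
```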
The data model in Cassandra is quite different from the one normally seen in an RDBMS. Let's see
how Cassandra stores its data.
Cluster
A Cassandra database is distributed over several machines that operate together. The outermost
container is known as the Cluster which contains different nodes. Every node contains a replica, and
in case of a failure, the replica takes charge. Cassandra arranges the nodes in a cluster, in a ring
format, and assigns data to them.
Keyspace
Keyspace is the outermost container for data in Cassandra. Following are the basic attributes of
Keyspace in Cassandra:
Replication factor: It specifies the number of machines in the cluster that will receive copies of
the same data.
Replica placement strategy: It specifies how replicas are placed in the ring. The available
strategies are SimpleStrategy, NetworkTopologyStrategy, and the legacy OldNetworkTopologyStrategy.
Column families: column families are placed under keyspace. A keyspace is a container for a list of
one or more column families while a column family is a container of a collection of rows. Each row
contains ordered columns. Column families represent the structure of your data. Each keyspace has
at least one and often many column families.
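To make the ring, the replication factor, and SimpleStrategy placement concrete, the sketch below assigns a token to a node and then walks clockwise to pick the remaining replicas. The four-node ring, the token values in the range 0 to 99, and the function name are assumptions for illustration; a real cluster uses a much larger token range produced by its partitioner.

```javascript
// A toy ring, sorted by token; node names and token values are hypothetical.
const ring = [
  { node: "node1", token: 0 },
  { node: "node2", token: 25 },
  { node: "node3", token: 50 },
  { node: "node4", token: 75 },
];

// SimpleStrategy-style placement: the first node whose token is >= the key's token
// owns the data, and further replicas go to the next nodes clockwise.
function replicasFor(keyToken, replicationFactor) {
  let start = ring.findIndex((n) => n.token >= keyToken);
  if (start === -1) start = 0; // past the last token: wrap around the ring
  const count = Math.min(replicationFactor, ring.length);
  const replicas = [];
  for (let i = 0; i < count; i++) {
    replicas.push(ring[(start + i) % ring.length].node);
  }
  return replicas;
}

console.log(replicasFor(30, 3));
console.log(replicasFor(80, 2));
```

NetworkTopologyStrategy follows the same clockwise walk but also takes data centers and racks into account when choosing replicas.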
In Cassandra, a good data model is very important because a bad one can degrade performance,
especially when you try to apply RDBMS concepts to Cassandra.
> Cassandra does not support JOINs or the OR operator in WHERE clauses, and its GROUP BY and
aggregation support is limited. So you have to store data in the form in which it will be queried.
> Cassandra is optimized for high write performance, so you should trade extra writes for better
read performance and data availability. Optimize read performance by writing the data in every
form in which it will be read.
> Maximize data duplication, because Cassandra is a distributed database and data duplication
provides instant availability without a single point of failure.
Visit the official Apache Cassandra website and download the latest stable release.
https://cassandra.apache.org/_/download.html
Follow the installation instructions provided for your operating system (Windows, Linux, or
macOS).
- Ensure that Java Development Kit (JDK) is installed on your system, as Cassandra relies on Java.
- Set the `JAVA_HOME` environment variable to the installation path of your JDK.
- Add the `bin` directory of the JDK to your system's `PATH` variable.
- Verify your Java installation by running `java -version` in the command line.
Install the latest release of Java 8, either Oracle Java Standard Edition 8 or OpenJDK 8. To verify
that you have the correct version of Java installed, type java -version.
Users interact with the Cassandra database through cqlsh, a command-line CQL shell written in
Python. The cqlsh bundled with Cassandra 3.11 requires Python 2.7, which is no longer supported
upstream. To verify which version of Python is installed, type python --version
Download the Apache Cassandra installation file, extract it to the folder named
'apache-cassandra-3.11.16', and relocate this folder to a directory of your preference, for example,
'C:\apache-cassandra-3.11.16'.
To run Cassandra with its full feature set on Windows, the PowerShell execution policy must allow
its startup scripts to run.
- In the Windows search bar, type powershell, then select 'Windows PowerShell' and run it as
administrator. The prompt opens in 'C:\WINDOWS\system32'.
- Type the command: Set-ExecutionPolicy Unrestricted
- Go to the Environment Variables section, select 'Path' under System variables, and click 'Edit.'
cassandra
While the initial command prompt is still running, open a new command-line prompt from the same
bin folder. Enter the following command to start the Cassandra cqlsh shell:
cqlsh
You now have access to the Cassandra shell and can proceed to issue basic database commands to
your Cassandra server.
Create a Keyspace
In the CQL shell, create a keyspace using the CREATE KEYSPACE statement. Replace
'your_keyspace' with the desired keyspace name and set the replication strategy and factor based on
your requirements.
CREATE KEYSPACE your_keyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
Use the Keyspace
USE your_keyspace;
Stop Cassandra
bin/nodetool stopdaemon
- Ensure that Cassandra is gracefully shut down to prevent data corruption.
By following these steps, you can successfully install, configure, and start working with Apache
Cassandra, setting the foundation for building robust and scalable distributed database systems.
H. Resources/Equipment Required
1. Computing Devices: Students will need individual computing devices such as laptops or desktops
with sufficient processing power and memory to support the installation and configuration of
Apache Cassandra.
2. Internet Connectivity: A stable internet connection is essential for downloading the necessary
software packages, updates, and documentation related to Apache Cassandra.
3. Apache Cassandra Software: Students must have access to the latest version of the Apache
Cassandra software. This includes the installation files and documentation available for download
from the official Apache Cassandra website.
4. Operating System: Ensure that students' devices are running a compatible operating system.
Apache Cassandra is compatible with various operating systems, including Linux, Windows, and
macOS.
5. Java Development Kit (JDK): Apache Cassandra requires Java to run. Students should have a
compatible version of the JDK installed on their devices before setting up the Cassandra cluster.
I. Practical related Questions
4. What is the key feature of Cassandra that ensures system resilience and reliability?
- a) Linear Scalability - b) High Availability
- c) Learning Curve - d) Tunable Consistency
10. What is the primary role of the commit log in Cassandra's architecture?
True/False Statements:
3. Cassandra's write operations involve capturing data in the commit log, followed by
storing it in SSTables.
J. References Links
1. https://cassandra.apache.org/_/cassandra-basics.html
2. https://www.freecodecamp.org/news/the-apache-cassandra-beginner-tutorial/
3. https://www.datastax.com/learn/cassandra-fundamentals
4. https://phoenixnap.com/kb/install-cassandra-on-windows
5. https://www.heatware.net/cassandra/cassandra-windows-10-setup-install/
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
Understanding of Cassandra | Demonstrates a comprehensive understanding of Cassandra | Shows a good grasp of Cassandra concepts | Displays a basic understanding of Cassandra | Lacks understanding of Cassandra |
Proficiency in Data Modeling | Exhibits exceptional skills in data modeling in Cassandra | Demonstrates good proficiency in data modeling | Has a basic understanding of data modeling | Struggles to grasp data modeling concepts |
Installation and Configuration | Successfully installs and configures Cassandra | Installs and configures Cassandra with minor issues | Attempts to install and configure Cassandra but with significant issues | Unable to install and configure Cassandra |
Average Marks | | | | |
Practical 5. Data Modeling and Simple Queries with Cassandra
A. Objective
Students will explore data modeling principles in Apache Cassandra, emphasizing the creation of
efficient schemas. The goal is to strengthen understanding of Cassandra's query language (CQL)
through hands-on experience with simple queries. Additionally, students will perform monitoring,
troubleshooting, and learn performance tuning and optimization techniques. Furthermore, they will
implement compaction strategies, ensuring a comprehensive skill set for leveraging Cassandra in
practical scenarios.
- Apply basic mathematics, science, and engineering principles to formulate efficient data models
and perform fundamental operations in Cassandra.
2. Problem Analysis:
- Identify and analyze well-defined engineering challenges related to data modeling, simple
queries, and basic operations using standardized methods in Cassandra.
3. Solution Design/Development:
- Design effective solutions for engineering problems associated with data modeling, simple
queries, and fundamental operations in Cassandra, contributing to the creation of optimal schemas.
- Utilize modern engineering tools and techniques for conducting standard tests and
measurements, ensuring the reliability of Cassandra data models and operations.
6. Project Management:
7. Lifelong Learning:
- Exhibit the ability to analyze individual learning needs and engage in continuous updating within
the evolving technological landscape of data modeling, simple queries, and basic operations in the
field of engineering.
- Develop competency in crafting effective data models in Cassandra, showcasing the ability to
design schemas that align with engineering requirements.
- Acquire practical skills in executing simple queries using Cassandra Query Language (CQL),
demonstrating proficiency in retrieving and manipulating data within the Cassandra database.
- Gain hands-on experience in identifying and resolving issues related to data modeling and query
performance, while implementing optimization strategies for enhanced efficiency.
Demonstrate Cassandra's data model and query language (CQL), showcasing the ability to create and
manage distributed data tables efficiently.
Students will have cultivated expertise in crafting efficient data models within Apache Cassandra
and executing simple queries using Cassandra Query Language (CQL). The practical will empower
them with hands-on skills to design optimal schemas, retrieve and manipulate data effectively,
troubleshoot potential issues, and implement optimization techniques. This comprehensive practical
outcome ensures students are well-prepared to apply their knowledge in real-world scenarios
involving data modeling and basic querying in Cassandra.
- Students will develop increased confidence in navigating and applying data modeling concepts
in Cassandra, fostering a heightened sense of engagement and enthusiasm for working with
distributed databases.
- The practical aims to instill a problem-solving mindset in students, encouraging them to approach
challenges associated with data modeling and simple queries in Cassandra with
resilience and innovative thinking. This outcome contributes to a positive shift in attitudes towards
overcoming real-world engineering complexities.
G. Prerequisite Theory
In Cassandra, the management of keyspaces involves creating, altering, and dropping. Here are
examples of how you can perform these actions using the Cassandra Query Language (CQL):
Create Keyspace:
To create a keyspace, use the `CREATE KEYSPACE` statement. Replace 'your_keyspace' with the
desired keyspace name and set the replication strategy and factor based on your requirements.
To alter an existing keyspace, use the `ALTER KEYSPACE` statement. This can include
modifying the replication strategy, replication factor, or other configuration options.
To drop (delete) a keyspace and all its data, use the `DROP KEYSPACE` statement. Be cautious,
as this operation is irreversible.
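The three keyspace statements described above can be sketched as plain CQL strings assembled in Python. This is a minimal illustration, not a definitive setup: the keyspace name `your_keyspace`, the `SimpleStrategy` class, and the replication factors 1 and 3 are assumptions chosen for a single-datacenter lab cluster.

```python
def replication_map(replication_factor):
    """Build the CQL replication map for SimpleStrategy (assumed strategy)."""
    return (
        "{'class': 'SimpleStrategy', 'replication_factor': "
        + str(replication_factor)
        + "}"
    )


keyspace = "your_keyspace"  # placeholder name, replace with your own

# Create the keyspace with one replica (enough for a single-node lab setup).
create_stmt = (
    f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
    f"WITH replication = {replication_map(1)};"
)

# Raise the replication factor later, for example after adding nodes.
alter_stmt = f"ALTER KEYSPACE {keyspace} WITH replication = {replication_map(3)};"

# Remove the keyspace and all of its tables and data.
drop_stmt = f"DROP KEYSPACE IF EXISTS {keyspace};"

for statement in (create_stmt, alter_stmt, drop_stmt):
    print(statement)
```

Each printed line can be pasted into `cqlsh`. For clusters spanning several datacenters, `NetworkTopologyStrategy` is the usual choice instead of `SimpleStrategy`.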
In Cassandra, you can manage tables using the Cassandra Query Language (CQL). Below are
examples of creating, altering, and dropping tables:
Create Table:
To create a table, use the `CREATE TABLE` statement. Replace 'your_table' with the desired table
name and specify the columns along with their data types.
);
Alter Table:
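To alter an existing table, you can use the `ALTER TABLE` statement. This allows you to add or drop
columns, rename primary key columns, or modify other table properties. Note that recent Cassandra
versions (3.0.11 and 3.10 onward) no longer allow changing the data type of an existing column.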
To drop (delete) a table, use the `DROP TABLE` statement. Be cautious, as this operation
permanently removes the table and its data.
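As a rough sketch of the table statements described above, the following Python snippet assembles the CQL text for creating, altering, and dropping a table. The table name `your_table` and the columns `id`, `name`, `age`, and `email` follow the placeholders used elsewhere in this practical; the helper `create_table_cql` is hypothetical and exists only for this illustration.

```python
def create_table_cql(table, columns, primary_key):
    """Return a CREATE TABLE statement for (name, type) column pairs."""
    column_lines = [f"    {name} {cql_type}," for name, cql_type in columns]
    body = "\n".join(column_lines + [f"    PRIMARY KEY ({primary_key})"])
    return f"CREATE TABLE IF NOT EXISTS {table} (\n{body}\n);"


columns = [("id", "UUID"), ("name", "TEXT"), ("age", "INT")]

# Every Cassandra table needs a primary key; here it is the id column.
create_stmt = create_table_cql("your_table", columns, "id")

# Add a new column to the existing table.
alter_stmt = "ALTER TABLE your_table ADD email TEXT;"

# Permanently remove the table and its data.
drop_stmt = "DROP TABLE IF EXISTS your_table;"

print(create_stmt)
print(alter_stmt)
print(drop_stmt)
```

The printed statements are meant to be run in `cqlsh` after selecting a keyspace with `USE`.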
In Cassandra, the `TRUNCATE` statement removes all data from a table while keeping the table
structure intact. The effect is similar to deleting every row, but it is more efficient because
Cassandra discards the table's data files instead of writing a tombstone for each row.
Truncate Table:
TRUNCATE your_table;
Replace 'your_table' with the name of the table you want to truncate. This statement will
remove all rows from the specified table, but the table structure, including column definitions and
indexes, will remain unchanged.
It's important to note that truncating a table is a non-reversible operation, and it permanently removes
all data from the table. Ensure you have a backup or are certain about this action before executing it,
especially in a production environment.
In Cassandra, you can create and drop indexes using the Cassandra Query Language (CQL). Below
are examples of creating and dropping an index on a table:
Create Index:
To create an index on a column, use the `CREATE INDEX` statement. A secondary index lets you filter
on a column that is not part of the primary key, which Cassandra would otherwise reject unless
`ALLOW FILTERING` is used.
Drop Index:
To drop (delete) an index, use the `DROP INDEX` statement. This removes the specified index
from the table.
Ensure that you carefully consider the implications of adding and removing indexes, as it can impact
the performance and storage requirements of your Cassandra database.
In Cassandra, the `BATCH` statement groups multiple CQL write statements (inserts, updates, or
deletes) into a single operation. A logged batch is atomic: either all of its statements are
eventually applied or none of them are. Batches are useful when several related writes must stay
consistent with each other; they are not intended as a way to speed up bulk loading.
BEGIN BATCH
// CQL statements to be executed
APPLY BATCH;
Here is a more detailed example:
BEGIN BATCH
INSERT INTO your_table (id, name, age) VALUES (uuid(), 'John', 25);
UPDATE your_table SET email = 'john@example.com' WHERE id = <some_id>;
DELETE FROM another_table WHERE id = <another_id>;
APPLY BATCH;
- `BEGIN BATCH`: Indicates the beginning of the batch.
CRUD Operations
In Cassandra, you can perform basic data manipulation operations such as insert, select, update, and
delete using the Cassandra Query Language (CQL). Here are examples of each operation:
Insert:
INSERT INTO your_table (id, name, age, email) VALUES (uuid(), 'John Doe', 25,
'john.doe@example.com');
Replace 'your_table' with the name of your table and adjust the values accordingly.
Select:
SELECT * FROM your_table WHERE id = <some_id>;
Replace 'your_table' with the table name and `<some_id>` with the specific identifier.
Update:
Delete:
Remember to replace placeholders such as 'your_table' and '<some_id>' with your actual table name
and identifier. Additionally, ensure that your data model and application requirements guide the
structure of these queries for optimal performance and scalability.
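To make the four CRUD operations concrete, here is a small Python sketch that fills the placeholders with one fixed identifier and prints the resulting CQL. The UUID value, the new age `26`, and the helper `crud_statements` are assumptions made for illustration. In an application, a driver's prepared statements are preferable to string formatting.

```python
import uuid


def crud_statements(table, row_id):
    """Return one example statement per CRUD operation for a single row."""
    return {
        "insert": (
            f"INSERT INTO {table} (id, name, age) "
            f"VALUES ({row_id}, 'John Doe', 25);"
        ),
        "select": f"SELECT * FROM {table} WHERE id = {row_id};",
        "update": f"UPDATE {table} SET age = 26 WHERE id = {row_id};",
        "delete": f"DELETE FROM {table} WHERE id = {row_id};",
    }


# A fixed example identifier; in CQL, uuid() can generate one on insert instead.
row_id = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")
statements = crud_statements("your_table", row_id)

for operation, cql in statements.items():
    print(f"{operation}: {cql}")
```

Note that `UPDATE` and `DELETE` both identify the row through its primary key in the `WHERE` clause, just as `SELECT` does.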
Cassandra Collections
Cassandra collections let you store multiple elements in a single column.
In Cassandra, `SET`, `LIST`, and `MAP` are collection types that allow you to store multiple values
within a single column. These collections can be useful when you need to handle scenarios where a
column contains multiple items. Here's a brief overview of each:
SET:
A `SET` is an unordered collection of unique elements. Each element in the set must be of the same
data type. Duplicate values are not allowed.
Example:
LIST:
A `LIST` is an ordered collection of elements where duplicates are allowed. All elements in the list
must be of the same data type.
Example:
MAP:
A `MAP` is a collection of key-value pairs where each key is associated with a specific value. All keys
share one data type and all values share one data type, although the key type may differ from the
value type.
Example:
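The behaviour of the three collection types can be approximated with Python's built-in `set`, `list`, and `dict`. This is only an analogy to illustrate uniqueness, ordering, and key replacement. The variable names and sample values are invented for the sketch, and Python does not enforce the single element type that Cassandra requires.

```python
# SET: unordered, duplicates are silently ignored (like set<text>).
emails = set()
emails.add("a@example.com")
emails.add("a@example.com")  # second add has no effect

# LIST: ordered, duplicates are kept (like list<text>).
phone_numbers = []
phone_numbers.append("555-0100")
phone_numbers.append("555-0199")
phone_numbers.append("555-0100")  # duplicate is stored again

# MAP: key-value pairs, writing an existing key replaces its value
# (like map<text, text>).
preferences = {}
preferences["theme"] = "dark"
preferences["theme"] = "light"  # overwrites the earlier value
preferences["language"] = "en"

print(len(emails), phone_numbers, preferences)
```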
Monitoring a Cassandra cluster involves using tools and techniques to assess its health, performance,
and other relevant metrics.
`nodetool` is a command-line utility in Apache Cassandra that provides various operations and
management tasks for interacting with and monitoring Cassandra nodes. It is a powerful tool for
system administrators and developers to perform actions on a Cassandra cluster. Here are some
commonly used `nodetool` commands:
- Node Status:
nodetool status
Displays the status of each node in the cluster, including its state (UN = Up and Normal, DN = Down
and Normal), load, and tokens.
- Node Information:
nodetool info
Provides information about the node it is run against, such as uptime, load, heap memory usage, data
center, and rack.
- Compaction Stats:
nodetool compactionstats
Displays information about ongoing compactions.
- Thread Pool Statistics:
nodetool tpstats
Shows statistics for thread pools, helping identify performance bottlenecks.
- Latency Information:
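nodetool cfstats
Displays column family (table) statistics, including read and write latencies. In recent Cassandra
versions this command is named `nodetool tablestats`; `cfstats` remains as a deprecated alias.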
3. Data Management:
- Table Snapshot:
- Compact Tables:
- Flush Tables:
4. Ring Management:
- Token Ring:
nodetool ring
Displays the token ring information, showing the distribution of tokens across the cluster.
- Move Node:
- Decommission Node:
nodetool decommission
Decommissions a node from the Cassandra cluster.
- Repair Node:
nodetool repair
Initiates a repair operation to ensure data consistency.
- Cleanup Node:
nodetool cleanup
Removes data that the node no longer owns, for example after new nodes have joined the cluster and
taken over part of its token range.
These are just a few examples of the numerous `nodetool` commands available. Running
`nodetool` without any arguments provides a list of available commands and their descriptions.
Always refer to the official documentation for the specific version of Cassandra you are using for the
most accurate and up-to-date information.
Performance tuning and optimization in Apache Cassandra involve configuring and adjusting various
settings to ensure the cluster operates efficiently and meets the desired performance goals.
1. Memory Settings:
- "-Xms4G"
- "-Xmx4G"
- "-Xmn800M"
# Method for Cassandra to access data on disk
# Options: mmap, standard, or mmap_index_only
disk_access_mode: mmap
# Additional Java Virtual Machine (JVM) options and garbage collection tuning
# Adjust based on you
# Monitor GC logs to optimize settings
jvm_options:
- "-XX:+UseG1GC"
- "-XX:MaxGCPauseMillis=500"
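Always refer to the official Cassandra documentation for version 3.11 for detailed information
on these settings. In Cassandra 3.11, JVM flags such as the heap and garbage collection options are
normally set in `conf/jvm.options` (or `cassandra-env.sh`), not in `cassandra.yaml`.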
Compaction Strategy
Compaction is the process of merging multiple SSTables (Sorted String Tables) into a smaller
number of SSTables, reducing storage space and improving read performance. Cassandra provides
different compaction strategies, each with its own advantages and use cases. Here are some common
compaction strategies in Cassandra:
1. SizeTieredCompactionStrategy (STCS):
- Description: Segments SSTables based on size and compacts smaller SSTables into larger
ones.
- Use Case: Suitable for write-intensive workloads with uniform data distribution.
compaction:
enabled: true
default_compaction_strategy: SizeTieredCompactionStrategy
2. LeveledCompactionStrategy (LCS):
- Description: Divides SSTables into levels, each with a fixed size. Compacts SSTables within
the same level, then promotes them to the next level.
- Use Case: Suitable for read-heavy workloads, provides more predictable and tunable
compaction.
compaction:
enabled: true
default_compaction_strategy: LeveledCompactionStrategy
3. TimeWindowCompactionStrategy (TWCS):
- Description: Groups SSTables based on time intervals, compacts data within each time window.
- Use Case: Suitable for time-series data where older data can be compacted separately from
newer data.
compaction:
enabled: true
default_compaction_strategy: TimeWindowCompactionStrategy
4. DateTieredCompactionStrategy (DTCS):
- Use Case: Suitable for time-series data with varying write rates. DTCS is deprecated in Cassandra
3.8 and later in favor of TWCS.
compaction:
enabled: true
default_compaction_strategy: DateTieredCompactionStrategy
5. SizeTieredCompactionStrategy with STCSIngestTTL:
compaction:
enabled: true
default_compaction_strategy: SizeTieredCompactionStrategy
compaction_strategy_options:
STCSIngestTTL: true
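Choose the compaction strategy based on your specific use case, workload characteristics,
and performance requirements. In Cassandra, the strategy is set per table through the `compaction`
option of `CREATE TABLE` or `ALTER TABLE`, so treat the YAML-style fragments above as illustrative
rather than literal `cassandra.yaml` settings, and verify option names such as `STCSIngestTTL`
against the documentation for your version. Always monitor and test different strategies in a
controlled environment to determine the most effective one for your Cassandra deployment.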
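The idea behind SizeTieredCompactionStrategy can be sketched in a few lines of Python: SSTables of similar size are grouped into buckets, and a bucket becomes a compaction candidate once it holds enough SSTables. This is a simplified model, not Cassandra's actual implementation. The function names and the sample sizes are invented, while the values 0.5, 1.5, and 4 mirror the STCS options `bucket_low`, `bucket_high`, and `min_threshold`.

```python
def size_tiered_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes so each bucket holds SSTables of similar size."""
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            # Join a bucket when the size is close to that bucket's average.
            if average * bucket_low <= size <= average * bucket_high:
                bucket.append(size)
                break
        else:
            # No bucket was similar enough, so start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return buckets that contain at least min_threshold SSTables."""
    return [b for b in size_tiered_buckets(sizes_mb) if len(b) >= min_threshold]


sstable_sizes = [10, 11, 12, 9, 100, 105, 400]  # sizes in MB, sample data
buckets = size_tiered_buckets(sstable_sizes)
candidates = compaction_candidates(sstable_sizes)

print(buckets)      # [[9, 10, 11, 12], [100, 105], [400]]
print(candidates)   # [[9, 10, 11, 12]]
```

Only the four small SSTables are merged here; the larger ones wait until enough SSTables of similar size accumulate, which is why STCS suits write-intensive workloads.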
H. Resources/Equipment Required
1. Computing Devices:
- Students should have access to individual computing devices, preferably laptops, with the
necessary hardware specifications to run Apache Cassandra and related tools smoothly.
- Installation of Apache Cassandra: Ensure students have the latest version of Apache Cassandra
installed on their devices, configured and ready for use in a controlled environment.
- Integrated Development Environment (IDE): Provide guidance on using an appropriate IDE that
supports Cassandra Query Language (CQL) for developing and executing simple queries.
- Comprehensive documentation and tutorials covering data modeling principles, basic operations,
and simple queries in Cassandra to facilitate learning and hands-on practice.
4. Internet Connectivity:
I. Practical related Questions
4. Alter the table "products" to add a new column "manufacturer" of type TEXT.
6. Insert a new product into the "products" table with the following values:
(uuid(), 'Laptop', 1200.0, 50, 'Dell').
9. Delete the product with product_id 'some_id' from the "products" table.
10. Use `nodetool status` to check the status of the Cassandra nodes in the cluster.
1. Design a data model for a music streaming service. Consider entities such as
users, songs, playlists, and play history. Define tables and relationships to efficiently
support queries for user-specific playlists, recently played songs, and popular songs.
2. Write a CQL query to find the top 5 most played songs in the last month.
Consider using appropriate aggregation functions and time-based filtering.
3. Create a secondary index on a non-primary key column of the songs table. Write a
query to retrieve all songs released in a specific year using this secondary index.
5. Design a table to store real-time analytics data for user interactions (likes, shares,
skips) with songs. Write a query to retrieve songs with the highest engagement in
the last 24 hours.
7. Simulate a scenario with a large data set of songs and users. Write a query to
retrieve songs that have not been played by a specific user, considering efficient
handling of large data.
True/False Statements:
2. Truncating a table removes the table structure along with its data.
4. The `nodetool status` command provides information about the cluster's data
consistency.
J. References Links
1. https://data-flair.training/blogs/cassandra-crud-operation/
2. https://www.scylladb.com/glossary/cassandra-data-model/
3. https://cassandra.apache.org/doc/stable/cassandra/data_modeling/index.html
4. https://intellipaat.com/blog/tutorial/cassandra-tutorial/tuning-cassandra-performance/
5. https://thelastpickle.com/blog/2018/08/08/compression_performance.html
K. Assessment Rubrics
| Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks |
|---|---|---|---|---|---|
| Proficiency in Basic Operations | Demonstrates mastery of basic operations and maintenance in Cassandra | Shows proficiency in basic operations and maintenance | Has a basic understanding of basic operations and maintenance | Struggles with basic Cassandra operations | |
| Monitoring and Troubleshooting | Excels in monitoring and effectively troubleshoots Cassandra issues | Competently monitors and troubleshoots Cassandra | Demonstrates basic skills in monitoring and troubleshooting | Unable to effectively monitor and troubleshoot | |
| Understanding of Cassandra Architecture | Exhibits an in-depth understanding of Cassandra architecture | Demonstrates a good grasp of Cassandra architecture | Shows a basic understanding of Cassandra architecture | Lacks understanding of Cassandra architecture | |
| Performance Tuning and Optimization | Effectively tunes and optimizes Cassandra for optimal performance | Adequately tunes and optimizes Cassandra with minor issues | Attempts to tune and optimize Cassandra but with significant issues | Struggles with basic performance tuning | |
Average Marks
Practical 6. Introduction to Neo4j Graph Databases
A. Objective
To introduce students to Neo4j Graph Databases by covering the basics, emphasizing the property
graph model and graph theory fundamentals. It explores use cases in areas like social networks and
fraud detection while guiding students through the installation process of Neo4j for hands-on
experience.
2. Problem Analysis: Identify and analyze specific engineering problems, employing standardized
methods for comprehensive problem understanding within the context of Neo4j Graph Databases.
4. Engineering Tools, Experimentation, and Testing: Apply contemporary engineering tools and
techniques to conduct standard tests and measurements, particularly in the realm of Neo4j Graph
Databases.
7. Life-Long Learning: Demonstrate the ability to analyze individual learning needs and actively
engage in continuous learning, adapting to technological changes specifically within the field of
Neo4j Graph Databases.
1. Graph Database Modeling: Develop proficiency in constructing and understanding graph database
models, with a focus on the property graph model used in Neo4j.
2. Installation and Setup: Demonstrate competence in installing and configuring Neo4j, ensuring a
clear understanding of system requirements and successful setup, including access to the Neo4j
Browser.
3. Problem-Solving with Graph Theory: Apply graph theory fundamentals to solve practical
engineering problems, showcasing the practical application of graph databases in scenarios like
social networks, fraud detection, and recommendation engines.
Identify the significance of graph databases, illustrating their practical applications in solving
complex relationship-oriented problems.
Students will have a comprehensive understanding of Neo4j Graph Databases, focusing on key
aspects such as the basics of graph databases, graph theory fundamentals, use cases for graph
databases, and the installation process of Neo4j. This practical outcome ensures that students are
well-versed in fundamental graph concepts, acquainted with real-world applications, and proficient
in setting up Neo4j for practical implementation.
1. Appreciation for Graph Database Concepts: Develop a heightened appreciation for the conceptual
foundations of graph databases, recognizing their significance in representing and querying complex
relationships.
3. Application Awareness: Gain awareness of practical applications of graph databases, fostering the
ability to identify scenarios where graph structures excel, such as in social networks, fraud detection,
and recommendation engines.
G. Prerequisite Theory
Graph databases are a type of NoSQL database that uses graph structures with nodes, edges, and
properties to represent and store data. They are particularly well-suited for scenarios where
relationships between entities are important and need to be efficiently queried.
- Property: Key-value pairs associated with nodes and edges, providing additional information.
- Graph databases often use the Cypher query language for data manipulation and retrieval.
Cypher is designed to be expressive and intuitive for querying graph data.
- Graph databases excel at traversing relationships between nodes. You can easily follow edges to
navigate through the graph and discover connections between nodes.
- Graph databases are designed to efficiently handle queries involving relationships, making them
well-suited for applications that involve complex and interconnected data.
- Microsoft Azure Cosmos DB: Supports graph data models along with other NoSQL models.
Use Cases:
- Graph databases are suitable for scenarios where relationships are as important as the data itself,
such as social networks, fraud detection, recommendation engines, network analysis, and knowledge
graphs.
1. Social Networks:
- Modeling users as nodes and relationships (friendship, following) as edges allows for efficient
representation and traversal of social networks, enabling features like friend recommendations.
2. Recommendation Engines:
- Nodes representing users and items (products, content) with edges capturing interactions help
build personalized recommendation systems by analyzing the graph structure.
3. Fraud Detection:
4. Knowledge Graphs:
- Modeling information as nodes and relationships facilitates the creation of knowledge graphs,
aiding in organizing and navigating interconnected data for insights and discovery.
- Graph databases are employed to represent and analyze relationships between biological entities
(genes, proteins) in genomics research, drug discovery, and systems biology.
- Representing IT components and their dependencies as nodes and edges helps visualize and
manage complex IT infrastructures, optimizing operations and troubleshooting.
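The social-network use case above can be sketched without a database: users are nodes, friendships are edges, and a friend recommendation is a two-step traversal to friends of friends. The names and the friendship data below are invented for illustration. A graph database such as Neo4j would express the same traversal as a query over stored nodes and relationships.

```python
# Adjacency sets: each user (node) maps to the users they are friends with.
friends = {
    "Asha": {"Bhavin", "Chirag"},
    "Bhavin": {"Asha", "Divya"},
    "Chirag": {"Asha", "Divya", "Esha"},
    "Divya": {"Bhavin", "Chirag"},
    "Esha": {"Chirag"},
}


def recommend_friends(user):
    """Suggest friends of friends, ranked by number of mutual friends."""
    direct = friends[user]
    mutual_counts = {}
    for friend in direct:
        for candidate in friends[friend]:
            # Skip the user and anyone who is already a direct friend.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1
    # Most mutual friends first; names break ties alphabetically.
    return sorted(mutual_counts, key=lambda name: (-mutual_counts[name], name))


recommendations = recommend_friends("Asha")
print(recommendations)  # ['Divya', 'Esha']
```

Divya ranks first because she shares two friends with Asha (Bhavin and Chirag), while Esha shares only one.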
1. Install Java:
- If OpenJDK 11 or Oracle Java 11 is not already installed, obtain and install it on your system.
- Set the `JAVA_HOME` environment variable to the installation path of your JDK.
2. Download Neo4j:
- Visit the Neo4j Download Center and download the latest release.
- https://neo4j.com/deployment-center/
- Verify the integrity by checking the SHA hash. Click on SHA-256 below your downloaded file
on the Download Center and compare it with the calculated SHA-256 hash using platform-specific
commands.
- Choose a permanent location for the extracted files, for example, D:\neo4j\ (referred to as
NEO4J_HOME).
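As an alternative to platform-specific commands, the SHA-256 check in the download step can be done with Python's standard `hashlib` module. This is a minimal sketch: the function names are made up for this example, and the file path and published hash must be supplied by you from your own download and from the Download Center.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hash):
    """Compare the computed digest with the hash shown on the download page."""
    return sha256_of_file(path) == published_hash.strip().lower()


# Usage (path and hash are placeholders to fill in yourself):
# ok = matches_published_hash(r"<path to downloaded zip>", "<SHA-256 from site>")
```

If the function returns `False`, download the file again before extracting it.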
3. Run Neo4j:
- Start Neo4j with the command `neo4j start`.
- Connect using the username 'neo4j' with the default password 'neo4j'. You will be prompted to set a
new password on first login.
H. Resources/Equipment Required
1. Computers or Laptops: Students need access to individual computers or laptops capable of
running Neo4j software.
2. Internet Connection: A stable internet connection is required for software downloads and
accessing online resources.
3. Neo4j Software: Ensure the latest version of Neo4j is available for download and installation on
student machines.
4. Documentation and Guides: Provide instructional materials covering graph database basics,
graph theory, use cases, and Neo4j installation steps.
5. Technical Support: Have support available to assist students with any software or installation
issues during the practical session.
2. Which query language is commonly used for data manipulation and retrieval in graph
databases?
a) A document c) A table
b) An entity in the graph d) A row
a) Columns c) Nodes
b) Tables d) Documents
a) MongoDB c) Amazon Neptune
a) N4J_HOME c) JAVA_HOME
b) NEO4J_DIR d) NEO4J_HOME
11. To access Neo4j Browser, you should visit:
a) http://localhost:8080 c) http://127.0.0.1:8000
b) http://localhost:7474 d) http://neo4j-browser.com
True/False:
1. Graph databases are particularly well-suited for scenarios where relationships between entities
are not important.
2. The Cypher query language is specifically designed for querying relational databases.
J. References Links
1. https://aws.amazon.com/nosql/graph/
2. https://neo4j.com/developer/graph-database/
3. https://rubygarage.org/blog/neo4j-database-guide-with-use-cases
4. https://neo4j.com/docs/operations-manual/current/installation/windows/
5. https://www.oracle.com/in/autonomous-database/what-is-graph-database/
K. Assessment Rubrics
| Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks |
|---|---|---|---|---|---|
| Understanding of Basics of Graph Databases | Demonstrates a profound understanding of the basics of graph databases | Shows a strong understanding of the basics of graph databases | Has a satisfactory understanding of the basics of graph databases | Struggles with basic concepts of graph databases | |
| Comprehension of Graph Theory Fundamentals | Exhibits a deep comprehension of graph theory fundamentals | Demonstrates a good comprehension of graph theory fundamentals | Shows a basic comprehension of graph theory fundamentals | Lacks understanding of graph theory fundamentals | |
| Application of Graph Databases in Use Cases | Effectively applies graph databases in various use cases | Adequately applies graph databases in some use cases | Attempts to apply graph databases but with significant issues | Struggles to apply graph databases in use cases | |
| Successful Installation of Neo4j | Installs Neo4j successfully with no issues | Installs Neo4j with minor issues | Installs Neo4j with major issues | Unable to install Neo4j or encounters critical installation issues | |
Average Marks
Practical 7. Basic Graph Queries and Implementations with Neo4j
A. Objective
Students will acquire expertise in the Cypher Query Language, enabling them to execute
fundamental graph operations, explore graph algorithms and their applications. The practical will
additionally cover Neo4j optimization techniques and real-world graph database scenarios,
enhancing students' skills in graph querying and implementations.
1. Basic and Discipline specific knowledge: Apply foundational knowledge of mathematics, science,
and engineering principles, along with specialized engineering expertise, to address engineering
challenges encountered in the practical implementation of basic graph queries and solutions with
Neo4j.
2. Problem analysis: Demonstrate the ability to identify and analyze specific engineering problems
related to graph databases, utilizing established methods to formulate solutions and address
challenges in graph query implementations.
4. Engineering Tools, Experimentation and Testing: Apply contemporary engineering tools and
methodologies to conduct standardized tests and measurements, specifically in the context of basic
graph queries and implementations using Neo4j.
5. Engineering practices for society, sustainability, and environment: Apply engineering practices
that align with societal needs, sustainability principles, and environmental considerations while
implementing basic graph queries in the Neo4j graph database.
7. Life-long learning: Demonstrate the ability to analyze individual learning needs and engage in
ongoing updates and learning activities, particularly in response to technological advancements in
the field of graph databases and Neo4j implementations.
1. Cypher Query Proficiency: Execute fundamental graph queries using the Cypher Query Language
in Neo4j, showcasing adeptness in retrieving and manipulating graph data.
2. Graph Operations Mastery: Apply essential graph operations, including node and relationship
creation, updating, and deletion, to address practical scenarios within the Neo4j environment.
4. Real-world Scenario Resolution: Apply acquired skills to solve real-world graph database
scenarios, engaging in hands-on experiences related to query optimization, troubleshooting, and
effective decision-making in graph-related problem-solving.
Identify the significance of graph databases, illustrating their practical applications in solving
complex relationship-oriented problems.
Students will achieve a comprehensive understanding of basic graph queries and implementations
using Neo4j. The practical outcome includes hands-on proficiency in the Cypher Query Language,
mastery of essential graph operations, application of graph algorithms, utilization of Neo4j
optimization techniques, and successful resolution of real-world graph database scenarios.
1. Increased Appreciation for Graph Database Concepts: Develop a heightened appreciation for the
relevance and significance of graph database concepts, recognizing their role in solving real-world
engineering challenges.
3. Confidence in Cypher Query Language: Build confidence and proficiency in using the Cypher
Query Language, instilling a sense of accomplishment and mastery in performing basic graph
queries and implementations.
G. Prerequisite Theory
Cypher is Neo4j’s graph query language for retrieving and modifying data in the graph. It is like SQL
for graphs and was inspired by SQL, so it is declarative: you describe what data you want from the
graph, not how to fetch it. Its similarity to other query languages and its visual pattern syntax make
it comparatively easy to learn.
Cypher is distinctive because it provides a visual way of matching patterns and relationships. Its
syntax was inspired by ASCII art: in (nodes)-[:ARE_CONNECTED_TO]->(otherNodes), rounded brackets
stand for circular (nodes) and -[:ARROWS]-> stands for relationships.
When you write a query, you draw a graph pattern through your data.
Neo4j users use Cypher to construct expressive and efficient queries to do any kind of create, read,
update, or delete (CRUD) on their graph, and Cypher is the primary interface for Neo4j.
Once you start neo4j, you can use the :play cypher command inside of Neo4j Browser to get started.
Neo4j’s developer pages cover the basics of the language, which you can explore by topic area
below, starting with basic material, and building up towards more complex material.
Cypher provides first class support for a number of data types. These fall into several
categories which will be described in detail in the following subsections:
Property types: Integer, Float, String, Boolean, Point, Date, Time, LocalTime, DateTime,
LocalDateTime, and Duration.
In Cypher, comments are added by starting a line with // and writing text after the slashes. Using two
forward slashes designates the entire line as a comment, explaining syntax or query functionality.
In Cypher, representing nodes involves enclosing them in parentheses, mirroring the visual
representation of circles used for nodes in the graph model. Nodes, which signify data entities, are
identified by finding nouns or objects in the data model. For instance, in the example (Sally), (John),
(Graphs), and (Neo4j) are nodes.
1. Node Variables:
- Nodes in Cypher can be assigned variables, such as (p) for person or (t) for thing. These variables
function similarly to programming language variables, allowing you to reference the nodes by the
assigned name later in the query.
2. Node Labels:
- Node labels in Cypher, similar to tags in the property graph data model, help group similar nodes
together. Labels, like Person, Technology, and Company, act as identifiers, aiding in specifying
certain types of entities to look for or create. Using node labels in queries helps Cypher optimize
execution and distinguish between different entities.
() //anonymous node (no label or variable), can refer to any node in the database
(p:Person) //using variable p and label Person
(:Technology) //no variable, label Technology
(work:Company) //using variable work and label Company
In Cypher, relationships are denoted by arrows (--> or <--) between nodes, resembling the
visual representation of connecting lines. Relationship types and properties can be specified in square
brackets within the arrow. Directed relationships use arrows, while undirected relationships use
double dashes (--), allowing flexible traversal in either direction without specifying the physical
orientation in queries.
1. Relationship Types:
- Relationship types in Cypher categorize connections between nodes, providing meaning to the
relationships similar to how labels group nodes. Good naming conventions using verbs or actions are
recommended for clarity and readability in Cypher queries.
2. Relationship Variables:
- Like nodes, relationships in Cypher can be assigned variables such as [r] or [rel]. These variables,
whether short or expressive like [likes] or [knows], allow referencing the relationship later in a
query. Anonymous relationships can be specified with two dashes (--, -->, <--) if they are not
needed for reference.
In Cypher, node and relationship properties are represented using curly braces within the parentheses
for nodes and brackets for relationships. For example, a node property is expressed as
`(p:Person {name: 'Sally'})`, and a relationship property is denoted as
`-[rel:IS_FRIENDS_WITH {since: 2018}]->`.
In Cypher, patterns are composed of nodes and relationships, expressing the fundamental structure of
graph data. Patterns can range from simple to intricate, and in Cypher, they are articulated by
combining node and relationship syntax, such as
`(p:Person {name: "Sally"})-[rel:LIKES]->(g:Technology {type: "Graphs"})`.
- In Cypher, the `CREATE` clause is used to add data by specifying patterns representing graph
structures, labels, and properties. For example, `CREATE (:Movie {title: 'The Matrix', released:
1997})` creates a movie node with specified properties.
- To return created data, the `RETURN` clause is added, referencing variables assigned to pattern
elements. For instance, `CREATE (p:Person {name: 'Keanu Reeves'}) RETURN p` creates a person
node and returns it in the result.
- The `MATCH` clause is used for finding patterns in the graph. It enables specifying patterns
similar to `CREATE` but focuses on identifying existing data. For example, `MATCH (m:Movie)
RETURN m` finds all movie nodes.
- The `MERGE` clause combines elements of `MATCH` and `CREATE`, ensuring uniqueness by
checking for existing data before creating. It's useful for creating or matching nodes and
relationships. For instance, `MERGE (m:Movie {title: 'Cloud Atlas'}) ON CREATE SET m.released
= 2012 RETURN m` merges or creates a movie node and returns it.
- Return values can be aliased for better readability using the `AS` keyword. For example,
RETURN tom.name AS name, tom.born AS `Year Born` provides cleaner and more informative
result labels. An alias that contains a space must be wrapped in backticks rather than quotes.
Filtering results
The WHERE clause refines results by filtering on boolean expressions, predicates, and
comparisons, which can be combined with the logical operators AND, OR, XOR, and NOT.
MATCH (m:Movie)
WHERE m.title = 'The Matrix'
RETURN m
1. Inserting Data:
- Patterns can be created blindly, but matching existing nodes with `MATCH` before `CREATE`
avoids creating duplicates.
- Remove a property with `REMOVE`, or set it to `null` with `SET`, which also removes it.
5. Avoiding Duplicate Data with MERGE:
- `MERGE` checks for the entire pattern's existence and creates it if not found.
- Utilize `ON CREATE SET` and `ON MATCH SET` to specify actions during node or
relationship creation or matching.
- This helps initialize properties when creating and update properties when matching.
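The match-or-create behaviour described above can be modelled in a few lines of Python. This is only an illustrative in-memory sketch, not Neo4j code: the `nodes` list and the `merge_node` helper are hypothetical names, and a real `MERGE` works on whole graph patterns rather than on a single dictionary.

```python
def merge_node(nodes, label, key_props, on_create=None, on_match=None):
    """Return (node, created): reuse a node matching label and key_props, else create it."""
    for node in nodes:
        same_label = node["label"] == label
        same_keys = all(node["props"].get(k) == v for k, v in key_props.items())
        if same_label and same_keys:
            node["props"].update(on_match or {})  # plays the role of ON MATCH SET
            return node, False
    # Nothing matched, so create the node; on_create plays the role of ON CREATE SET.
    node = {"label": label, "props": {**key_props, **(on_create or {})}}
    nodes.append(node)
    return node, True


nodes = []  # hypothetical stand-in for the graph
first, created_first = merge_node(
    nodes, "Movie", {"title": "Cloud Atlas"}, on_create={"released": 2012}
)
second, created_second = merge_node(
    nodes, "Movie", {"title": "Cloud Atlas"},
    on_create={"released": 2012}, on_match={"seen": True},
)
print(len(nodes), created_first, created_second)
```

Running the same merge twice leaves a single node: the first call initializes `released`, and the second call only applies the `on_match` update.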
Graph algorithms provide one of the most potent approaches to analyzing connected data because
their mathematical calculations are specifically built to operate on relationships. They describe steps
to be taken to process a graph to discover its general qualities or specific quantities.
Path Finding - these algorithms help find the shortest path or evaluate the availability and
quality of routes
Centrality - these algorithms determine the importance of distinct nodes in a network
Community Detection - these algorithms evaluate how a group is clustered or
partitioned, as well as its tendency to strengthen or break apart
Similarity - these algorithms help calculate the similarity of nodes
Topological link prediction - these algorithms determine the closeness of pairs of nodes
Node Embeddings - these algorithms compute vector representations of nodes in a graph.
Node Classification - this algorithm uses machine learning to predict the
classification of nodes.
Link prediction - these algorithms use machine learning to predict new links between pairs
of nodes
OS Memory Sizing:
- Reserve around 1GB for non-Neo4j server activities.
- Avoid exceeding available RAM to prevent OS swapping, which impacts performance.
Page Cache Sizing:
- Estimate page cache size by summing the sizes of relevant database files and adding a growth
factor.
- Configure the page cache size in `neo4j.conf` (default is 50% of available RAM).
Heap Sizing:
- Configure a sufficiently large heap space for concurrent operations (8G to 16G is often
adequate).
*(Refer to the Neo4j Operations Manual for detailed discussions on heap memory configuration,
distribution, and garbage collection tuning.)*
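The page cache estimate described above is simple arithmetic, sketched below in Python. The file sizes and the 20% growth factor are assumptions chosen for illustration, and `estimate_page_cache_bytes` is a hypothetical helper, not part of any Neo4j tool.

```python
GIB = 1024 ** 3


def estimate_page_cache_bytes(store_file_sizes, growth_factor=1.2):
    """Sum the database file sizes and scale by a growth factor (1.2 = 20% headroom)."""
    return int(sum(store_file_sizes) * growth_factor)


# Hypothetical sizes of the relevant database files, in bytes.
store_file_sizes = [4 * GIB, 2 * GIB, 1 * GIB]
estimate = estimate_page_cache_bytes(store_file_sizes)
print(f"Suggested page cache: {estimate / GIB:.1f} GiB")
```

The result would then be compared against the RAM left over after the heap and the OS reserve, so that the total does not exceed the available memory.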
Logical Logs:
- Logical transaction logs are crucial for recovery after an unclean shutdown and incremental
backups.
- Log files are rotated after reaching a specified size (e.g., 25 MB).
Open File Limit:
- The default open file limit of 1024 may be insufficient, especially with multiple indexes or high
connection volumes.
- Increase the limit to a practical value (e.g., 40000) based on usage patterns.
- Adjust the system-wide open file limit following platform-specific instructions (the ulimit
command changes it for the current session only).
Suppose we need to learn preferences of our customers to create a promotional offer for a specific
product category, such as notebooks. First, Neo4j allows us to quickly obtain a list of
notebooks that customers have viewed or added to their wish lists. We can use this code to select all
such notebooks:
MATCH (:Customer)-[:ADDED_TO_WISH_LIST|VIEWED]->(notebook:Product)-[:IS_IN]->(:Category {title: 'Notebooks'})
RETURN notebook;
Now that we have a list of notebooks, we can easily include them in a promotional offer. Let’s
make a few modifications to the code above:
CREATE (offer:PromotionalOffer {type: 'discount_offer'})
WITH offer
MATCH (:Customer)-[:ADDED_TO_WISH_LIST|VIEWED]->(notebook:Product)-[:IS_IN]->(:Category {title: 'Notebooks'})
MERGE (offer)-[:USED_TO_PROMOTE]->(notebook);
We can track the changes in the graph with the following query:
MATCH (offer:PromotionalOffer)-[:USED_TO_PROMOTE]->(product:Product)
RETURN offer, product;
Linking a promotional offer with specific customers makes no sense, as the structure of graphs
allows you to access any node easily. We can collect emails for a newsletter by analyzing the
products in our promotional offer.
When creating a promotional offer, it’s important to know what products customers have viewed or
added to their wish lists. We can find out with this query:
MATCH (offer:PromotionalOffer {type: 'discount_offer'})-[:USED_TO_PROMOTE]->(product:Product)<-[:ADDED_TO_WISH_LIST|VIEWED]-(customer:Customer)
RETURN offer, product, customer;
This example is simple, and we could have implemented the same functionality in a relational
database. But our goal is to show the intuitiveness of Cypher and to demonstrate how simple it is to
write queries in Neo4j.
Now let’s imagine that we need to develop a more efficient promotional campaign. To increase
conversion rates, we should offer alternative products to our customers. For example, if a customer
shows interest in a certain product but doesn’t buy it, we can create a promotional offer that contains
alternative products.
To show how this works, let’s create a promotional offer for a specific customer:
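MATCH (alex:Customer {name: 'Alex McGyver'})
MATCH (free_product:Product)
WHERE NOT ((alex)-->(free_product))
MATCH (product:Product)
WHERE ((alex)-->(product))
MATCH (free_product)-[:IS_IN]->()<-[:IS_IN]-(product)
WHERE (product.price - product.price * 0.20) <= free_product.price <= (product.price + product.price * 0.20)
RETURN free_product;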
This query searches for products that don’t have either ADDED_TO_WISH_LIST, VIEWED, or
BOUGHT relationships with a client named Alex McGyver. Next, we perform an opposite query that
finds all products that Alex McGyver has viewed, added to his wish list, or bought. Also, it’s crucial
to narrow down recommendations, so we should make sure that these two queries select products in
the same categories. Finally, we specify that only products that cost 20 percent more or less than a
specific item should be recommended to the customer.
Xiaomi Mi Mix 2 (price: $420.87). Price range for recommendations: from $336.70 to $505.04.
Sony Xperia XA1 Dual G3112 (price: $229.50). Price range for recommendations: from $183.60 to $275.40.
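The two price ranges above follow directly from the 20 percent rule. The short Python sketch below reproduces the arithmetic; `price_range` and `in_range` are hypothetical helper names, and the candidate price of 400.00 is an invented value used only to show the check.

```python
def price_range(price, tolerance=0.20):
    """Return the (low, high) bounds that lie within `tolerance` of `price`."""
    low = round(price * (1 - tolerance), 2)
    high = round(price * (1 + tolerance), 2)
    return low, high


def in_range(candidate_price, reference_price, tolerance=0.20):
    """True when candidate_price is within the tolerance band around reference_price."""
    low, high = price_range(reference_price, tolerance)
    return low <= candidate_price <= high


xiaomi_range = price_range(420.87)  # Xiaomi Mi Mix 2
sony_range = price_range(229.50)    # Sony Xperia XA1 Dual G3112
print(xiaomi_range, sony_range)
print(in_range(400.00, 420.87))     # 400.00 is a made-up candidate price
```

A candidate product qualifies only when its price is at or above the lower bound and at or below the upper bound, which is the condition the price filter in the query has to express.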
Note that both product and free_product variables contain items that belong to the same category,
which means that the [:IS_IN]->()<-[:IS_IN] constraint has worked.
As you can see, none of the products except for the Huawei P8 Lite fits in the price range for
recommendations, so only the P8 Lite will be shown on the recommendations list after the query is
executed.
Now we can create our promotional offer. It’s going to be different from the previous one
(personal_replacement_offer instead of discount_offer), and this time we’re going to store a
customer’s email as a property of the USED_TO_PROMOTE relationship as the products contained
in the free_product variable aren’t connected to specific customers. Here’s the full code for the
promotional offer:
MATCH (alex:Customer {name: 'Alex McGyver'})
MATCH (free_product:Product)
WHERE NOT ((alex)-->(free_product))
MATCH (product:Product)
WHERE ((alex)-->(product))
MATCH (free_product)-[:IS_IN]->()<-[:IS_IN]-(product)
WHERE (product.price - product.price * 0.20) <= free_product.price <= (product.price + product.price * 0.20)
CREATE (offer:PromotionalOffer {type: 'personal_replacement_offer', content: 'Personal replacement offer for ' + alex.name})
WITH offer, free_product, alex
MERGE (offer)-[rel:USED_TO_PROMOTE {email: alex.email}]->(free_product)
RETURN offer, free_product;
Imagine we want to recommend products to Alex McGyver according to his interests. Neo4j allows
us to easily track the products Alex is interested in and find other customers who also have expressed
interest in these products. Afterward, we can check out these customers’ preferences and suggest
new products to Alex.
First, let’s take a look at all customers and the products they’ve viewed, added to their wish lists, and
bought:
MATCH (customer:Customer)-->(product:Product)
RETURN customer, product;
As you can see, Alex has two touch points with other customers: the Sony Xperia XA1 Dual G3112
(purchased by Allison York) and the Nikon D7500 Kit 18–105mm VR (viewed by Joe Baxton).
Therefore, in this particular case, our product recommendation system should offer to Alex those
products that Allison and Joe are interested in (but not the products Alex is also interested in). We
can implement this simple recommendation system with the help of the following query:
MATCH (customer)-->(customer_product:Product)
We can further improve this recommendation system by adding new conditions, but the
takeaway is that Neo4j helps you build such systems quickly and easily.
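The recommendation logic described above (find customers who share a product with Alex, then suggest their other products) can be sketched with plain Python sets. This models the idea only and is not the Cypher query itself. The customer names come from the example, product names are abbreviated, and "Product A", "Product B", "Product C", and "Other Customer" are hypothetical placeholders.

```python
# Products each customer has viewed, wish-listed, or bought.
interests = {
    "Alex McGyver": {"Sony Xperia XA1", "Nikon D7500 Kit"},
    "Allison York": {"Sony Xperia XA1", "Product A"},  # Product A is hypothetical
    "Joe Baxton": {"Nikon D7500 Kit", "Product B"},    # Product B is hypothetical
    "Other Customer": {"Product C"},                   # shares nothing with Alex
}


def recommend(interests, target):
    """Suggest products liked by customers who share at least one product with target."""
    own = interests[target]
    recommended = set()
    for customer, products in interests.items():
        if customer != target and own & products:  # a shared touch point exists
            recommended |= products - own          # skip what target already knows
    return recommended


suggestions = recommend(interests, "Alex McGyver")
print(sorted(suggestions))
```

Customers with no product in common with Alex contribute nothing, and products Alex has already interacted with are excluded from the result.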
H. Resources/Equipment Required
2. Computing Device: Students need personal computers or laptops meeting the system
requirements for Neo4j installation and execution.
4. Cypher Query Language Guide: Students should have access to a guide or documentation
for the Cypher Query Language to assist in formulating and executing queries.
I. Practical related Questions
3. Which Cypher clause is used to filter query results based on specified conditions?
- A) CREATE
- B) FILTER
- C) MATCH
- D) WHERE
4. What is the purpose of the RETURN clause in Cypher?
- A) Create new nodes.
- B) Filter query results.
- C) Return specified data from the query.
- D) Delete nodes.
5.
- A) SET
- B) ADD
- C) MODIFY
- D) UPDATE
6. What does the MERGE clause do in Cypher?
True/False Questions:
1. In Cypher, the WHERE clause is used to specify conditions for filtering query results.
(True/False)
2. The CREATE clause in Cypher is used to find patterns in the graph. (True/False)
3. Cypher's DELETE clause is used exclusively for removing nodes from the graph.
(True/False)
1. Add a new node representing a movie with the title "Inception" and release
year 2010. Write a Cypher query to achieve this.
2. Retrieve the names of all actors who have acted in movies released after the
year 2000. Write a Cypher query for this.
3. Change the release year of the movie "The Matrix" to 1999. Write a Cypher query
to update this information.
5. Retrieve the names of directors who directed movies with a rating higher than 8.
Write a Cypher query with the WHERE clause.
6. Create a new node representing an actor named "Tom Hanks" and assign the
label "Actor" to the node. Write a Cypher query for this.
7. Find all pairs of actors who have acted together in the same movie. Write a
Cypher query to retrieve this information.
8. Retrieve the names of users who have rated a movie with a rating greater than
4. Write a Cypher query with the MATCH and RETURN clauses.
10. Retrieve the titles of movies released between 2010 and 2020. Write a Cypher query
with conditional operators.
Write Cypher queries for the following:
1. Retrieve the names of actors who have appeared in more than five movies, and
for each actor, display the count of movies they have acted in. Write a Cypher query
with advanced filtering.
2. Find the shortest path between two users who are connected through a series of
friendship relationships. Display the names of users along this path. Write a
Cypher query for pathfinding.
3. For movies released before the year 2000, increment their ratings by 1. Write
a Cypher query that conditionally updates movie ratings.
4. Find all pairs of actors who have acted together in more than one movie. Display
the names of these actors and the count of movies they have acted in together. Write
a Cypher query for complex pattern matching.
5. Retrieve the top three directors who have directed the most movies. Display their
names and the count of movies directed. Write a Cypher query involving
aggregation and sorting.
6. Retrieve the names of users who have rated movies with an average rating above
4. Display both the user names and the average rating. Write a Cypher query involving
multiple nodes.
7. Retrieve the names of users who have rated the same movies as a user
named "John." Write a Cypher query with nested queries.
8. For each movie, create a relationship to a new genre node if the movie's rating
is above 8. Write a Cypher query for dynamic relationship creation.
9. Find paths between two users such that the movies they have rated have a
unique genre. Display the paths and the unique genres. Write a Cypher query for
finding uncommon paths.
J. References Links
1. https://neo4j.com/developer/cypher/
2. https://www.skillsoft.com/course/cypher-query-language-basic-reads-writes-with-cypher-dfa6d42c-81a1-4151-8c2d-bf1f50d71378
3. https://memgraph.com/blog/graph-algorithms-applications
4. https://neo4j.com/developer/guide-performance-tuning/
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
Mastery of Cypher Query Language | Demonstrates exceptional proficiency in Cypher queries | Shows a high level of proficiency in Cypher queries | Demonstrates basic proficiency in Cypher queries | Struggles with Cypher queries and lacks proficiency |
Execution of Basic Graph Operations | Executes basic graph operations with precision and accuracy | Performs basic graph operations with competence | Carries out basic graph operations but with some errors | Struggles to execute basic graph operations or makes significant errors |
Utilization of Neo4j Optimization Techniques | Effectively utilizes Neo4j optimization techniques | Uses Neo4j optimization techniques with some success | Attempts to use Neo4j optimization techniques but with limitations | Struggles to use Neo4j optimization techniques or lacks understanding |
Practical 8. Redis Basics: Introduction and Key-Value Operations
A. Objective
To provide a comprehensive understanding of Redis, covering its overview, key data structures, real-
world applications, essential commands, and advanced features, integrated with diverse technologies.
This approach ensures students acquire hands-on proficiency in employing Redis for optimal data
operations and exploring its broader utility in contemporary tech ecosystems.
2. Problem Analysis:
- Utilize codified standard methods to identify and analyze well-defined engineering problems
within Redis, fostering a systematic approach to problem-solving.
- Design effective solutions for well-defined technical issues in Redis, contributing to the
development of systems components or processes aligned with specified requirements.
- Apply contemporary engineering tools and techniques to conduct standard tests and
measurements, enhancing proficiency in utilizing engineering tools within the context of Redis.
6. Project Management:
7. Life-long Learning:
- Exhibit the ability to assess individual learning needs and engage in continuous learning,
adapting to technological changes in the field of engineering, particularly in the context of Redis
Basics.
- Demonstrate practical skills in manipulating various Redis data structures for effective
information storage and retrieval.
- Apply competencies in utilizing advanced Redis features, including transactions, Pub/Sub, and
geospatial indexes, for diverse engineering applications.
Utilize Redis data structures and functionalities to implement efficient caching strategies,
showcasing the role of Redis in enhancing data retrieval performance.
Students will gain hands-on proficiency in executing key-value operations, manipulating diverse data
structures, and solving engineering problems using Redis. The practical outcome emphasizes
competency in executing essential Redis commands, showcasing the integration of Redis with other
technologies, and applying this knowledge to real-world scenarios. This comprehensive approach
ensures students acquire practical skills essential for effective data management and engineering
applications in various contexts.
1. Enhanced Confidence:
- Students are expected to develop increased confidence in working with Redis, gaining assurance
in executing key-value operations and utilizing various data structures for practical problem-solving.
- The practical is designed to foster an appreciation for the real-world applications of Redis,
allowing students to understand how key-value operations and data structures align with solving
tangible engineering challenges.
- Through hands-on exercises and application of Redis in engineering scenarios, students are
expected to enhance their problem-solving skills, cultivating a practical mindset in addressing
challenges using Redis functionalities.
G. Prerequisite Theory
Overview of Redis
Redis is a NoSQL database that follows the key-value store model. A key-value store provides the
ability to store some data, called a value, under a key. You can retrieve this data later only if
you know the exact key used to store it.
Redis, an open-source (BSD licensed) in-memory data structure store, serves as a versatile solution
functioning as a database, cache, and message broker. Classified as a NoSQL database, Redis offers
the flexibility to store substantial volumes of data without the constraints typically associated with
relational databases.
Distinguishing itself through support for an array of data structures, Redis accommodates strings,
hashes, lists, sets, sorted sets, bitmaps, hyperloglogs, and geospatial indexes featuring radius queries.
This diverse set of capabilities positions Redis as a powerful tool for managing varied data types and
enables its application across a wide spectrum of use cases.
Redis Architecture:
Redis follows a client-server architecture and is known for its simplicity and efficiency. The key
components in Redis architecture are the Redis client and the Redis server.
1. Redis Client:
- The Redis client is any application or program that interacts with the Redis server to perform
read and write operations.
- Clients can be written in various programming languages, and Redis provides official client
libraries for several languages, including Python, Java, Ruby, and more.
- Clients communicate with the Redis server using RESP (the Redis serialization protocol), a
lightweight, binary-safe protocol that is simple to parse.
2. Redis Server:
- The Redis server is the core component responsible for storing and managing data in memory.
- It operates as a daemon process, running independently on a server, and listens for client
connections on a specified port (default is 6379).
- The server manages various data structures, processes commands, and responds to client
requests.
- Redis can be configured to run in different modes, such as a standalone instance, a master
in a replication setup, or a node in a clustered environment.
- Redis stores all data in RAM, providing fast read and write operations. This makes it ideal for use
cases where low-latency access to data is crucial.
- Redis executes commands on a single thread, meaning it processes one command at a time. While
this may seem like a limitation, it simplifies the design and allows for better predictability and
consistency.
- While Redis is an in-memory database, it provides options for persistence. Data can be
periodically saved to disk or written to an append-only file, ensuring data durability.
- Redis supports master-slave replication, allowing data to be replicated across multiple nodes.
This enhances data availability and provides fault tolerance.
- Redis processes commands atomically, ensuring that operations either succeed completely or
fail, maintaining data consistency.
Redis supports a variety of data structures, each designed to serve specific use cases. Here's an
overview of the main data structures supported by Redis:
1. Strings:
- Strings are the simplest and most basic data type in Redis. They can contain any kind of data,
such as text, binary data, or serialized objects.
- Redis provides various operations on strings, including set, get, append, increment,
decrement, and more.
2. Hashes:
- Hashes are maps between string field names and string values, so they are useful for
representing objects with multiple attributes.
- Hashes in Redis are particularly suitable for storing and retrieving objects with a small
number of fields.
3. Lists:
- Lists are ordered collections of string elements, kept in the order in which they were inserted.
- Redis provides powerful operations on lists, such as push, pop, index-based access, range
retrieval, and blocking pop operations.
4. Sets:
- Set operations include adding, removing, and checking for the existence of members.
Redis also supports set operations like union, intersection, and difference between sets.
5. Sorted Sets:
- Sorted sets are similar to sets, but each element in a sorted set is associated with a score.
- Elements in a sorted set are kept sorted based on their scores, allowing for efficient range queries
and ranking of elements.
Redis features efficient Bitmaps for binary flag representation, HyperLogLogs for estimating unique
element cardinality with minimal memory usage, and Geospatial Indexes for location-based storage
and retrieval. These data structures, coupled with Redis's in-memory storage and
atomic operations, make it a powerful tool for applications like caching, real-time analytics, and
messaging systems. Understanding the distinct characteristics of each data structure is vital for
maximizing Redis's effectiveness in various scenarios.
1. Caching:
- Redis is widely used for caching frequently accessed data, reducing the load on backend
databases and improving overall application performance.
- Example: Large-scale e-commerce websites often use Redis to cache product details, reducing
the load on the primary database and providing users with faster access to frequently viewed items.
2. Real-Time Analytics:
- Redis facilitates real-time analytics by providing low-latency access to data, making it suitable
for monitoring and analyzing dynamic information.
- Example: Online gaming platforms utilize Redis for real-time analytics to monitor player
activities, track in-game events, and provide instantaneous feedback on player performance.
3. Message Queues:
- Redis's publish/subscribe mechanism makes it a popular choice for building scalable message
queues and real-time communication systems.
- Example: Popular messaging platforms leverage Redis as a message queue to ensure seamless
and real-time communication between users, allowing for instant message delivery.
4. Session Storage:
- In web applications, Redis serves as an efficient session store, managing user session data with
quick read and write operations.
- Example: Social media platforms use Redis as a session store to manage user sessions, ensuring a
smooth and responsive experience as users navigate through various features and pages.
5. Leaderboards:
- Redis's sorted sets are utilized for implementing leaderboards and efficiently managing scores
associated with various entities.
- Example: Gaming applications implement Redis sorted sets to create dynamic leaderboards that
showcase top scores in real-time, promoting competition and engagement among players.
6. Geospatial Applications:
- Redis supports geospatial data types, making it valuable for location-based services, such as
storing and querying items based on geographic coordinates.
- Example: Ride-sharing apps use Redis for geospatial indexing to efficiently locate and match
drivers with passengers based on their real-time geographic coordinates, optimizing service delivery.
These examples illustrate how Redis is applied in practical scenarios across different industries,
emphasizing its widespread adoption for improving performance, scalability, and real-time
capabilities.
Note: Redis is not officially supported on Windows. To install Redis on Windows, you'll first need
to enable WSL2 (Windows Subsystem for Linux). WSL2 lets you run Linux binaries natively on
Windows. For this method to work, you'll need to be running Windows 10 version 2004 and higher
or Windows 11.
To install Redis on Ubuntu, go to the terminal and type the following commands:
$sudo apt-get update
$sudo apt-get install redis-server
Start Redis
$redis-server
Check If Redis is Working
$redis-cli
This will open a redis prompt.
redis 127.0.0.1:6379>
In the above prompt, 127.0.0.1 is the loopback (localhost) address of your machine and 6379 is the
port on which the Redis server is running. Now type the following PING command.
redis 127.0.0.1:6379> ping
PONG
This shows that Redis is successfully installed on your machine.
Redis - Configuration
Redis reads its settings from a configuration file (redis.conf), available at the root directory of
the Redis distribution. You can also read and change settings at runtime with the Redis CONFIG command.
Syntax
redis 127.0.0.1:6379> CONFIG GET CONFIG_SETTING_NAME
Example
redis 127.0.0.1:6379> CONFIG GET loglevel
To update a configuration setting, you can edit the redis.conf file directly or use the
CONFIG SET command.
To run commands on Redis server, you need a Redis client. Redis client is available in Redis package,
which we have installed earlier.
Redis provides a rich set of commands and operations to interact with its various data
structures. Here are some basic Redis commands and operations:
1. Key-Value Operations:
Key-value operations are fundamental to working with Redis. The basic commands are:
1. SET key value:
- Sets the value associated with a given key. If the key already exists, it overwrites
the existing value; otherwise, it creates a new key-value pair.
SET mykey "Hello, Redis!"
2. GET key:
- Retrieves the value associated with a specific key. It is the fundamental read operation in
Redis.
GET mykey
This would return `"Hello, Redis!"` if the previous SET command was executed.
3. DEL key:
- Deletes a key and its associated value from the Redis database.
DEL mykey
After executing this command, the key "mykey" and its value will be removed from the
database.
These key-value operations form the basis for many Redis use cases, allowing efficient storage and
retrieval of data. They are commonly used for caching, session management, and various other
scenarios where quick and direct access to data is essential.
2. String Operations:
1. APPEND key value:
- Appends the specified value to the existing value of a key. If the key does not
exist, a new key is created with the provided value.
SET mystring "Hello"
APPEND mystring ", Redis!"
After executing these commands, the value of the key "mystring" would be "Hello, Redis!".
2. STRLEN key:
- Retrieves the length of the string value stored at the specified key.
STRLEN mystring
This would return the length of the string in the key "mystring".
3. INCR key / DECR key:
- INCR increments the integer value stored at the specified key by 1, while DECR decrements
it by 1. If the key does not exist, it is treated as 0 before the operation, so INCR yields 1
and DECR yields -1.
SET mycounter 10
INCR mycounter
After these commands, the value of "mycounter" would be 11.
DECR mycounter
Subsequently, the value of "mycounter" would be 10 again.
These string operations are useful for manipulating and analyzing text-based data as well as
managing counters or numerical values in Redis.
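The counter behaviour described above, including what happens when the key does not exist yet, can be modelled with a small Python class. This is a sketch of the semantics only and does not talk to a Redis server. `CounterStore` is a hypothetical name, and the key names other than "mycounter" are invented for the demonstration.

```python
class CounterStore:
    """Toy model of Redis INCR/DECR: values are stored as strings, as in Redis."""

    def __init__(self):
        self.data = {}

    def _add(self, key, amount):
        # A missing key is treated as "0" before the operation is applied.
        current = int(self.data.get(key, "0"))  # raises ValueError for non-integers
        self.data[key] = str(current + amount)
        return current + amount

    def incr(self, key):
        return self._add(key, 1)

    def decr(self, key):
        return self._add(key, -1)


store = CounterStore()
store.data["mycounter"] = "10"              # SET mycounter 10
after_incr = store.incr("mycounter")        # INCR mycounter
after_decr = store.decr("mycounter")        # DECR mycounter
new_incr = store.incr("visits")             # key did not exist
new_decr = store.decr("stock")              # key did not exist
print(after_incr, after_decr, new_incr, new_decr)
```

Starting from a missing key is what makes INCR convenient for counters: no separate initialization step is needed.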
3. Hash Operations:
1. HSET key field value:
- Sets the value of a field within a hash. If the hash does not exist, a new hash is created with the
specified key.
3. HGETALL key:
HGETALL user:1001
This command returns a list of field-value pairs for the hash associated with the key
"user:1001".
These hash operations are particularly useful for representing and managing structured data
within Redis. Hashes can be employed to store and retrieve information related to entities,
making them a valuable choice for scenarios where data needs to be organized into named fields.
Field values are plain strings, so hashes cannot be nested.
4. List Operations:
LRANGE key start stop:
- Retrieves a range of elements from a list. The range is specified by zero-based start and stop
indices, both inclusive. Negative indices count from the end of the list, so -1 is the last element.
LRANGE mylist 0 -1
This command retrieves all elements from the list associated with the key "mylist."
These list operations in Redis are beneficial for scenarios where data needs to be stored in an ordered
sequence. Lists are commonly used for implementing queues, managing job queues, and storing logs
where the order of events is crucial. The ability to insert and retrieve elements from both ends of the
list makes Redis lists versatile for various use cases.
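The index rules of LRANGE differ from Python slicing because the stop index is inclusive. The sketch below models those rules on an ordinary Python list. It is an approximation written for illustration, not Redis source code, and the `lrange` function and sample data are hypothetical.

```python
def lrange(items, start, stop):
    """Model LRANGE: zero-based, stop inclusive, negative indices count from the end."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)   # clamp to the head of the list
    if stop < 0:
        stop = n + stop
    if start > stop or start >= n:
        return []                   # out-of-range requests give an empty result
    stop = min(stop, n - 1)         # clamp to the tail of the list
    return items[start:stop + 1]    # +1 because Python slices exclude the end


mylist = ["a", "b", "c"]            # hypothetical list contents
everything = lrange(mylist, 0, -1)  # like LRANGE mylist 0 -1
first_two = lrange(mylist, 0, 1)
last_two = lrange(mylist, -2, -1)
print(everything, first_two, last_two)
```

Out-of-range indices do not raise an error in this model. They are clamped or produce an empty list, which matches how the command is documented to behave.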
5. Set Operations:
1. SADD key member [member ...]:
- Adds one or more members to a set. If the set does not exist, a new set is created.
2. SMEMBERS key:
SMEMBERS myset
This command returns all members of the set associated with the key "myset."
Redis sets are useful for scenarios where you need to represent a collection of unique elements. Set
operations like union, intersection, and difference provide powerful tools for analyzing and
manipulating sets in various applications.
6. Sorted Set Operations:
1. ZADD key score member:
- Adds members with associated scores to a sorted set. If a member already exists, its score is
updated.
Sorted sets in Redis are particularly useful when you need to maintain a ranking or leaderboard
based on scores. These operations allow you to efficiently retrieve and update elements based on
their scores, making sorted sets suitable for scenarios such as gaming leaderboards or any application
involving ordered data.
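A leaderboard built on a sorted set can be modelled in Python with a dictionary of scores that is sorted on demand. This sketch only illustrates the add-or-update and ranking behaviour described above. It is not how Redis stores sorted sets internally, and `Leaderboard` and the player names are hypothetical.

```python
class Leaderboard:
    """Toy model of a Redis sorted set: unique members, each with a score."""

    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        """Add or update a member; return 1 if it was newly added, else 0."""
        is_new = member not in self.scores
        self.scores[member] = float(score)
        return 1 if is_new else 0

    def top(self, count):
        """Return the highest-scoring members first, as (member, score) pairs."""
        ordered = sorted(self.scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        return ordered[:count]


board = Leaderboard()
added_alice = board.zadd("alice", 100)
added_bob = board.zadd("bob", 250)
updated_alice = board.zadd("alice", 300)   # existing member: score is replaced
print(board.top(2))
```

Because each member appears only once, re-adding a player updates the score in place instead of creating a second entry.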
7. Bitmap Operations:
1. SETBIT key offset value:
- Sets or clears the bit at a specified offset in the string value stored at a key. The value can be 0 or
1.
SETBIT mybitmap 5 1
This command sets the bit at offset 5 in the bitmap associated with the key "mybitmap"
to 1.
2. GETBIT key offset:
- Retrieves the bit value at a specified offset in the string value stored at a key.
GETBIT mybitmap 5
This command returns the bit value at offset 5 in the bitmap associated with the key
"mybitmap."
Bitwise operations in Redis are often used for scenarios where you need to efficiently represent and
manipulate sets of binary flags or indicators. Bitmaps can be employed for tasks such as tracking
user preferences, monitoring system states, or implementing compact data structures that rely on
binary representation.
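How a bit offset maps onto the underlying string can be shown with a Python bytearray. The sketch below is an in-memory model, not Redis client code; the function names are chosen for this example. It assumes the Redis convention that offset 0 is the most significant bit of the first byte, so setting offset 5 produces the byte 0x04.

```python
def setbit(bitmap, offset, value):
    """Set the bit at `offset` to 0 or 1 and return the previous bit value."""
    byte_index, bit_in_byte = divmod(offset, 8)
    # Grow the buffer with zero bytes if the offset lies past the current end.
    if byte_index >= len(bitmap):
        bitmap.extend(b"\x00" * (byte_index + 1 - len(bitmap)))
    mask = 1 << (7 - bit_in_byte)   # offset 0 is the most significant bit
    previous = 1 if bitmap[byte_index] & mask else 0
    if value:
        bitmap[byte_index] |= mask
    else:
        bitmap[byte_index] &= ~mask & 0xFF
    return previous


def getbit(bitmap, offset):
    """Return the bit at `offset`; bits past the end of the buffer read as 0."""
    byte_index, bit_in_byte = divmod(offset, 8)
    if byte_index >= len(bitmap):
        return 0
    return (bitmap[byte_index] >> (7 - bit_in_byte)) & 1


mybitmap = bytearray()
old_bit = setbit(mybitmap, 5, 1)
raw = bytes(mybitmap)
bit = getbit(mybitmap, 5)
```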
8. Pub/Sub (Publish/Subscribe):
1. PUBLISH channel message:
- Publishes a message to a specified channel. Any clients subscribed to that channel will receive
the message.
2. SUBSCRIBE channel:
- Subscribes the client to the specified channel. The client will receive messages published to that
channel.
SUBSCRIBE news_channel
This command subscribes the client to the "news_channel," allowing it to receive
messages published to that channel.
Redis Pub/Sub is commonly used for building real-time messaging systems, chat applications, and
event notification systems. Publishers broadcast messages to specific channels, and subscribers
receive messages from channels they are interested in, facilitating efficient communication between
different parts of an application.
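The delivery model of Pub/Sub can be sketched in a few lines of Python. This is an in-process illustration, not Redis client code; the Broker class and its method names are assumptions made for this example. It models two properties of Redis Pub/Sub: PUBLISH reports how many subscribers received the message, and a message published while nobody is subscribed is not stored for later delivery.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a Pub/Sub server."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers and return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


broker = Broker()
inbox = []
missed = broker.publish("news_channel", "early edition")   # nobody is listening yet
broker.subscribe("news_channel", inbox.append)
delivered = broker.publish("news_channel", "late edition")
```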
9. Transactions:
1. MULTI:
- Marks the start of a transaction block. Subsequent commands are queued up for execution as part
of the transaction.
MULTI
SET key1 "value1"
SET key2 "value2"
MULTI opens a transaction block. The SET commands that follow are not run immediately: each
one is queued (the server replies QUEUED) and executes only when EXEC is issued.
2. EXEC:
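- Executes all previously queued commands in a transaction, in order, and returns one reply per
command. Redis does not roll back a transaction: if a queued command fails at run time, the
remaining commands still execute. If a command could not be queued at all (for example, because of
a syntax error), EXEC aborts the transaction and runs none of the commands.
EXEC
This command executes the queued commands within the transaction block and returns
their replies.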
3. DISCARD:
- Discards all commands in the current transaction, effectively canceling the transaction.
DISCARD
This command discards all commands queued in the current transaction block. Because queued
commands have not run yet, no data has been changed and there is nothing to undo.
These transaction commands in Redis provide a way to group multiple commands into a single
isolated unit: the queued commands run one after another, and no command from another client is
served in between. Redis does not roll back a transaction when one of its commands fails at run time,
so the application must handle such errors itself. Transactions are particularly useful in scenarios
where a series of operations must not be interleaved with commands from other clients.
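The queueing behaviour of MULTI, EXEC, and DISCARD can be modelled in plain Python. The sketch below is an in-memory illustration, not Redis client code; the MiniRedis class is hypothetical and supports only SET and INCR. It models run-time errors only, and shows that a failing command inside a transaction does not undo the commands around it.

```python
class MiniRedis:
    """Tiny in-memory model of MULTI/EXEC/DISCARD queueing."""

    def __init__(self):
        self.data = {}
        self._queue = None   # None means "not inside MULTI"

    def multi(self):
        self._queue = []

    def discard(self):
        self._queue = None   # queued commands never ran, so nothing to undo

    def _run(self, name, *args):
        if name == "SET":
            key, value = args
            self.data[key] = value
            return "OK"
        if name == "INCR":
            (key,) = args
            new_value = int(self.data.get(key, "0")) + 1   # ValueError if not an integer
            self.data[key] = str(new_value)
            return new_value
        raise ValueError(f"unknown command {name}")

    def call(self, name, *args):
        if self._queue is not None:
            self._queue.append((name, args))
            return "QUEUED"
        return self._run(name, *args)

    def exec(self):
        if self._queue is None:
            raise RuntimeError("EXEC without MULTI")
        queued, self._queue = self._queue, None
        results = []
        for name, args in queued:
            try:
                results.append(self._run(name, *args))
            except ValueError as err:
                results.append(err)   # error is reported; later commands still run
        return results


r = MiniRedis()
r.call("SET", "name", "asha")
r.multi()
queued_reply = r.call("SET", "key1", "value1")
r.call("INCR", "name")            # fails at run time: "asha" is not an integer
r.call("SET", "key2", "value2")
results = r.exec()
```

After EXEC, both key1 and key2 are set even though the INCR in between failed.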
10. Key Expiration:
1. EXPIRE key seconds:
- Sets the time to live (TTL) of a key, indicating how long the key should be retained before it is
automatically deleted.
EXPIRE mykey 60
This command sets the key "mykey" to expire in 60 seconds. After this time elapses, the
key will be automatically removed from the database.
2. TTL key:
TTL mykey
This command returns the remaining time to live of the key "mykey." If the key is
persistent (does not have a set expiration), TTL returns -1. If the key does not exist or has expired,
TTL returns -2.
These commands are valuable for managing the lifecycle of keys in Redis. Setting an expiration time
is useful for scenarios such as caching, where you want to automatically refresh data after a certain
period. The TTL command provides a convenient way to check the remaining time before a key
expires.
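The return values of EXPIRE and TTL described above can be modelled in plain Python. This is an in-memory sketch, not Redis client code; the ExpiringStore class is hypothetical, and the clock is passed in as a function so that the passage of time can be simulated. Treating a key as gone at the instant its deadline is reached is an assumption of this model.

```python
import math


class ExpiringStore:
    """In-memory model of keys with an optional time to live."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._values = {}
        self._deadlines = {}

    def _purge(self, key):
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def set(self, key, value):
        self._values[key] = value
        self._deadlines.pop(key, None)   # a plain SET clears any existing TTL

    def expire(self, key, seconds):
        """Return 1 if the timeout was set, 0 if the key does not exist."""
        self._purge(key)
        if key not in self._values:
            return 0
        self._deadlines[key] = self._clock() + seconds
        return 1

    def ttl(self, key):
        """Return remaining seconds, -1 for no expiry, -2 for a missing key."""
        self._purge(key)
        if key not in self._values:
            return -2
        if key not in self._deadlines:
            return -1
        return math.ceil(self._deadlines[key] - self._clock())


now = [0]                                   # simulated clock, in seconds
store = ExpiringStore(lambda: now[0])
store.set("mykey", "hello")
ttl_persistent = store.ttl("mykey")
store.expire("mykey", 60)
now[0] = 10
ttl_after_10s = store.ttl("mykey")
now[0] = 61
ttl_after_expiry = store.ttl("mykey")
```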
These are just a few examples, and Redis provides a comprehensive set of commands to perform a
wide range of operations on different data structures. Understanding these basic commands is
essential for effectively working with Redis in various applications.
Redis offers several advanced features that contribute to its versatility and wide range of use cases.
Here are some notable advanced features of Redis:
1. Persistence:
- Redis supports different mechanisms for persistence, allowing data to be saved to disk. This
ensures that data is not lost when Redis is restarted. Options include RDB snapshots and AOF
(Append-Only File) logs.
2. Replication:
- Redis supports master-slave replication, allowing data to be asynchronously replicated from one
Redis server (master) to one or more Redis servers (slaves). This provides data redundancy, high
availability, and scalability.
3. Partitioning:
- Redis allows horizontal scaling through partitioning, where data is distributed across multiple
Redis instances. This helps handle large datasets and improves performance by leveraging the
capabilities of multiple servers.
4. Lua Scripting:
- Redis supports Lua scripting, allowing users to write custom scripts that can be executed on the
server. This feature enables complex operations and transactions to be executed atomically on the
server side.
5. Transactions:
- Redis supports transactions using the MULTI, EXEC, and DISCARD commands. Multiple
commands can be grouped together in a transaction, ensuring they are executed atomically. This
helps maintain data consistency.
6. Keyspace Notifications:
- Redis provides the ability to subscribe to notifications for specific keyspace events. Clients can be
notified when certain events, such as key expirations or modifications, occur in the Redis dataset.
7. Bitmap Operations:
- Redis supports advanced bitmap operations, allowing efficient manipulation of sets of bits. This
is useful for scenarios such as tracking user behavior, handling flags, and implementing efficient data
structures.
8. HyperLogLogs:
- HyperLogLogs provide approximate cardinality estimation for sets of unique elements with
minimal memory usage. This feature is useful for counting distinct items in large datasets with
reduced memory requirements.
9. Geospatial Indexing:
- Redis supports geospatial data types and provides commands for storing, querying, and
manipulating data based on geographic location. This is valuable for location-based services and
applications.
10. Redis Cluster:
- Redis Cluster is a distributed implementation of Redis that provides high availability and
horizontal scaling. It divides the dataset into multiple partitions across nodes, ensuring data is
distributed and replicated for fault tolerance.
11. Security Features:
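- Redis supports password authentication with the AUTH command and, from Redis 6, access control
lists (ACLs) that restrict which commands and keys each user may use. Network exposure can be
limited by binding the server to specific network interfaces. This helps enhance the security of Redis
deployments.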
These advanced features contribute to Redis's appeal in various use cases, including real-time
analytics, caching, messaging systems, and more. Understanding and leveraging these features
allows developers and system administrators to optimize Redis for specific requirements and achieve
better performance and reliability.
Example-1: HTTP API Cache
HTTP API cache refers to the caching of HTTP responses from an API (Application Programming
Interface). It involves storing the response of an API request in a cache, which can be subsequently
used to serve future requests without the need to re-fetch the data from the API server.
One of the common use cases for API caching is storing user profiles. In an authentication system,
usually only the user ID or email (something unique) is stored in the authentication token. However,
the frontend often requires the user's name and profile picture on every page. Retrieving the user
profile from the database on every request is redundant, so API caching with Redis can greatly
enhance scalability.
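function GetUserProfile(user_id) {
    var cache_response = redis.Do(`GET user:{user_id}`)
    if(cache_response) {
        // cache hit: decode the stored JSON string
        return json.Unmarshal(cache_response)
    }
    // cache miss: load the profile from the database, store it in Redis, then return it
}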
HTTP REST APIs typically return JSON responses. To cache such responses in Redis, we can store the
JSON string as the cached value. For subsequent requests, we simply unmarshal the string and return
the cached JSON object.
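The cache lookup described above, including the cache-miss path, can be sketched in Python. This is an illustration under stated assumptions, not the manual's reference implementation: a plain dictionary stands in for Redis, and load_profile is a hypothetical function representing the slow lookup in the primary database. With a real Redis server the stored value would normally also be given an expiry time.

```python
import json


def get_user_profile(user_id, cache, load_profile):
    """Return a user profile, consulting the cache before the slow loader."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit: decode the stored JSON string
    profile = load_profile(user_id)        # cache miss: fetch from the primary store
    cache[key] = json.dumps(profile)       # store as a JSON string for next time
    return profile


loader_calls = []


def load_profile(user_id):
    """Hypothetical database lookup; records each call so caching can be observed."""
    loader_calls.append(user_id)
    return {"name": "Asha", "picture": "asha.png"}


cache = {}
first = get_user_profile(1001, cache, load_profile)
second = get_user_profile(1001, cache, load_profile)   # served from the cache
```

The second call never reaches the loader, which is the saving that caching is meant to provide.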
Example-2: Session Storage
Session-based authentication is a widely used approach for user authentication in web applications. It
involves generating a session token after login, which is then used to keep track of the authenticated
user.
Similar to the issue with user profiles, the session token needs to be checked every time a user
performs an action that requires authentication. Consequently, querying the database can easily
become a bottleneck in high-traffic use cases. Therefore, using Redis for session storage is
considered one of the best solutions.
function Logout(session_token){
    // deleting the key invalidates the session immediately
    redis.Do(`DEL user:session:{session_token}`)
}
function VerifyToken(session_token) {
    var email = redis.Do(`GET user:session:{session_token}`)
    if(!email) {
        return null, errors(`token is expired or does not exist`)
    }
    var user_detail = getUserDetail(email)
    return user_detail, null
}
Example-3: Log In Throttling
Authentication has many interesting use cases, one of which is preventing attackers from breaking
into other users' accounts through brute force attacks. One of the easiest ways to achieve this is by
implementing throttling. Throttling refers to the practice of intentionally limiting the rate or speed
of a process or service. You might have seen messages such as:
“There have been too many login failures. Try again in x seconds”
This is an implementation of throttling for login attempts to prevent brute force attacks. In its
simplest form, throttling involves storing the number of login attempts made by a user. However,
storing this data in databases like MySQL or PostgreSQL may be over-engineering. Additionally,
databases like MySQL and PostgreSQL do not have built-in time-based expiration, unlike Redis.
Using Redis to implement throttling is a straightforward solution. Please refer to the system design
below:
        redis.Do(`INCR login_attempt:{guest_session}`)
        redis.Do(`EXPIRE login_attempt:{guest_session} 300`)
        return null, errors("user not found / i…")
    }
    redis.Do(`DEL login_attempt:{guest_session}`)
    return auth_token, null
}
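The counting logic behind login throttling can be sketched in Python. This is an in-memory model, not Redis client code: the LoginThrottle class is hypothetical, a dictionary stands in for the INCR and EXPIRE commands, and the limit of 5 attempts is an assumption made for this example. The 300-second window matches the EXPIRE value used above, and, as there, every failed attempt restarts the window.

```python
MAX_ATTEMPTS = 5         # assumed limit, not taken from the manual
WINDOW_SECONDS = 300     # same value as the EXPIRE command above


class LoginThrottle:
    """In-memory model of a failed-login counter that expires."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._attempts = {}      # key -> (count, deadline)

    def _count(self, key):
        count, deadline = self._attempts.get(key, (0, 0))
        if count and self._clock() >= deadline:
            del self._attempts[key]          # counter expired, like a Redis key
            return 0
        return count

    def is_blocked(self, key):
        return self._count(key) >= MAX_ATTEMPTS

    def record_failure(self, key):
        # Equivalent of INCR followed by EXPIRE: bump the count, restart the window.
        self._attempts[key] = (self._count(key) + 1, self._clock() + WINDOW_SECONDS)

    def reset(self, key):
        self._attempts.pop(key, None)        # equivalent of DEL after a successful login


now = [0]
throttle = LoginThrottle(lambda: now[0])
for _ in range(MAX_ATTEMPTS):
    throttle.record_failure("guest-1")
blocked_now = throttle.is_blocked("guest-1")
now[0] = WINDOW_SECONDS
blocked_later = throttle.is_blocked("guest-1")
```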
Example-4: Dynamic Pricing in Hotel Industry (Redis Distributed Locking)
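When working with horizontal scaling or microservices, it often becomes necessary to implement
distributed locking at the application level to address issues such as race conditions or thundering
herd problems. Redis can be leveraged for distributed locking because setting a key only if it does
not already exist is a single, fast, atomic operation. Since Redis replication is asynchronous, a lock
held on one instance can be lost during a failover, so this approach is better suited to avoiding
duplicate work than to guaranteeing strict mutual exclusion.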
Consider a scenario where you are developing a hotel booking platform that fetches dynamic prices
from a third-party API. In a situation where multiple users are requesting hotel information for the
same date, you may want to avoid overwhelming the third-party API. Here, you can utilize Redis
distributed locking to resolve this issue effectively.
if(!acquire_lock){
    sleep(250) // wait 250 ms, then retry
    continue
}
return price_detail
}
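The acquire and release steps of such a lock can be sketched in Python. This is an in-memory model, not Redis client code: the LockTable class is hypothetical, and a dictionary stands in for a key set with the NX and EX options. Each holder receives a random token so that only the owner can release the lock; with a real Redis server that check-and-delete step has to be performed atomically, typically in a Lua script.

```python
import secrets


class LockTable:
    """In-memory model of a lock implemented as a key with NX and an expiry."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._locks = {}         # name -> (token, deadline)

    def acquire(self, name, ttl_seconds):
        """Return an owner token, or None if another caller holds the lock."""
        held = self._locks.get(name)
        if held is not None and self._clock() < held[1]:
            return None
        token = secrets.token_hex(16)
        self._locks[name] = (token, self._clock() + ttl_seconds)
        return token

    def release(self, name, token):
        """Delete the lock only if this caller still owns it."""
        held = self._locks.get(name)
        if held is None or held[0] != token or self._clock() >= held[1]:
            return False
        del self._locks[name]
        return True


now = [0]
locks = LockTable(lambda: now[0])
first_token = locks.acquire("price:hotel-1", 10)
second_try = locks.acquire("price:hotel-1", 10)    # lock is held, so this fails
now[0] = 11                                        # the first lock has expired
third_token = locks.acquire("price:hotel-1", 10)
stale_release = locks.release("price:hotel-1", first_token)
```

The expiry keeps a crashed holder from blocking everyone forever, and the token check stops that late holder from releasing a lock that now belongs to someone else.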
Example-5: OTP (One Time Password) using Redis
"OTP" stands for "One-Time Password." It is a security measure used to authenticate users and
verify their identity during online transactions or account logins. A one-time password is a unique
and temporary code that is typically valid for a short period of time, usually for a single login
session or transaction. Since Redis has time-based expiration using the EXPIRE command,
it is well-suited for building an OTP (One-Time Password) system.
Generate OTP
Verify OTP
Redis can be leveraged to build an OTP system. Below is pseudocode that demonstrates how
Redis can be used to develop an OTP system:
function GenerateOTP(phone_number) {
    var otp_code = generate_otp_code(6)
    // NX: set only if no OTP is pending; EX 600: expire after 600 seconds
    var is_success = redis.Do(`SET otp:{phone_number} {otp_code} NX EX 600`)
    if(!is_success){
        var ttl = redis.Do(`TTL otp:{phone_number}`) // remaining lifetime in seconds
        return errors("otp has been sent, try again in {ttl} seconds")
    }
    sendOtpViaSMS(phone_number, otp_code)
    return null
}
redis.Do(`DEL otp:{phone_number}`)
return null
}
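The generate and verify steps can be sketched together in Python. This is an in-memory model, not Redis client code: the OtpStore class is hypothetical, and a dictionary stands in for a key set with NX and a 600-second expiry. Using the secrets module for the code and a constant-time comparison for the check are choices made for this example.

```python
import hmac
import secrets

OTP_TTL_SECONDS = 600    # same lifetime as EX 600 above


class OtpStore:
    """In-memory model of one-time passwords that expire."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._codes = {}         # phone -> (code, deadline)

    def _live(self, phone):
        entry = self._codes.get(phone)
        if entry is not None and self._clock() >= entry[1]:
            del self._codes[phone]           # expired, like a Redis key
            return None
        return entry

    def generate(self, phone):
        """Return a new 6-digit code, or None while an earlier code is still valid."""
        if self._live(phone) is not None:
            return None
        code = f"{secrets.randbelow(10**6):06d}"
        self._codes[phone] = (code, self._clock() + OTP_TTL_SECONDS)
        return code

    def verify(self, phone, code):
        """Accept a matching code once, then delete it so it cannot be reused."""
        entry = self._live(phone)
        if entry is None or not hmac.compare_digest(entry[0], code):
            return False
        del self._codes[phone]
        return True


now = [0]
otps = OtpStore(lambda: now[0])
code = otps.generate("+910000000000")
resend = otps.generate("+910000000000")      # refused while the first code is valid
wrong = otps.verify("+910000000000", code + "0")
right = otps.verify("+910000000000", code)
reuse = otps.verify("+910000000000", code)
```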
Example-6: Waiting List in Help Desk System (Redis for Job Queue)
A help desk is a system designed to manage and streamline customer support and issue resolution
processes within an organization. A problem arises when the number of support requests is much
larger than the number of customer support officers.
Imagine you have only 3 customer support officers and 100 requests for chat support arrive at the
same time. How do you solve it?
It is common for a help desk system to have a waiting list mechanism. Technically, a waiting list is
similar to a job queue. Redis lists provide powerful commands for solving job queue
problems. In a help desk system, there are at least two functions:
function CreateRequest(payload){
    var request_id = createSupportRequest(payload)
    redis.Do(`LPUSH helpdesk {request_id}`) // new requests enter at the head
}
function GetQueuePos(request_id){
    // LPOS returns a zero-based index counted from the head of the list
    var index_from_head = redis.Do(`LPOS helpdesk {request_id}`)
    var queue_length = redis.Do(`LLEN helpdesk`)
    // requests are served from the tail, so the last element has position 1
    return queue_length - index_from_head
}
function HandleSupportRequest(cs_id){
    while (true) {
        // atomically move the oldest request to this officer's in-progress list
        var request_id = redis.Do(`RPOPLPUSH helpdesk helpdesk:{cs_id}`)
        if(request_id == null) {
            sleep(60)
            continue
        }
        var request_detail = getSupportRequest(request_id)
        connect(cs_id, request_detail.user_id)
        redis.Do(`LREM helpdesk:{cs_id} 1 {request_id}`)
    }
}
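The queue mechanics can be checked with a small Python model. This is an in-memory sketch, not Redis client code: the HelpdeskQueue class is hypothetical, and a deque stands in for the Redis list, with index 0 as the head. It illustrates why the waiting position has to be measured from the serving end: LPOS reports an index counted from the head, while requests pushed with LPUSH are served from the tail, so the position is the list length minus that index.

```python
from collections import deque


class HelpdeskQueue:
    """In-memory model of a waiting list served in arrival order."""

    def __init__(self):
        self._waiting = deque()      # index 0 is the head, as in a Redis list
        self._in_progress = {}       # cs_id -> deque of requests being handled

    def create_request(self, request_id):
        self._waiting.appendleft(request_id)     # like LPUSH helpdesk request_id

    def queue_position(self, request_id):
        """Return 1 for the request served next, or None if it is not waiting."""
        try:
            index_from_head = self._waiting.index(request_id)   # what LPOS reports
        except ValueError:
            return None
        return len(self._waiting) - index_from_head             # LLEN minus LPOS

    def take_next(self, cs_id):
        """Move the oldest request to the officer's list, like RPOPLPUSH."""
        if not self._waiting:
            return None
        request_id = self._waiting.pop()
        self._in_progress.setdefault(cs_id, deque()).appendleft(request_id)
        return request_id


queue = HelpdeskQueue()
for request_id in ("r1", "r2", "r3"):
    queue.create_request(request_id)
oldest_position = queue.queue_position("r1")
newest_position = queue.queue_position("r3")
served = queue.take_next("cs-1")
```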
H. Resources/Equipment Required
1. Technical Setup:
- Ensure that students have access to individual computers or virtual machines with Redis
installed, creating a conducive environment for hands-on practice.
- Redis is not officially supported on Windows. To install Redis on Windows, you'll first need to
enable WSL2 (Windows Subsystem for Linux). WSL2 lets you run Linux binaries natively on
Windows. For this method to work, you'll need to be running Windows 10 version 2004 and higher
or Windows 11.
2. Educational Resources:
3. Training Environment:
- Set up a simulated environment with Redis configured to mimic real-world scenarios, allowing
students to apply key-value operations and commands in a controlled setting.
4. Support Structure:
1. Which data structure in Redis is suitable for storing and retrieving objects
with multiple attributes?
a. Strings c. Lists
b. Hashes d. Sets
2. What is the primary role of the Redis client in the client-server architecture?
a. HSET c. SADD
b. LPUSH d. ZADD
4. What does the `PUBLISH` command do in Redis?
a. Adds elements to a list c. Sets the time to live for a key
b. Publishes a message to a channel d. Retrieves all members of a set
7. Which data structure is suitable for ordered data retrieval based on scores in Redis?
True/False Statements:
3. The `RPUSH` command is used to insert values at the beginning of a list in Redis.
Write Redis queries for the following:
2. Write a query to retrieve the length of the string stored in the key "message".
3. Write a query to add the values "apple," "orange," and "banana" to a set with the
key "fruits".
4. Write a query to retrieve all fields and values from a hash with the key "user:1001".
1. Using only Redis commands, demonstrate how you can create a new key
named "counter" and set its value to 100, but only if the key doesn't already exist. If
the key exists, increment its value by 10.
2. Write a Redis query to append the string ", Redis is powerful!" to the existing value
of the key "description" only if the length of the current value is less than 50 characters.
3. Suppose you have a hash named "user:101" representing a user profile. Add a
new field "age" with a value of 25, but only if the field doesn't exist. If the field exists,
increment its value by 1.
4. Create a list named "logs" and insert the values "log_entry_1," "log_entry_2," and
"log_entry_3" at the beginning of the list. After that, trim the list to keep only the first
two elements.
5. Implement a scenario where you have two sets, "setA" and "setB." Write a query
to add the elements present in both sets to a new set named "common_elements."
6. Assume you have a sorted set named "scores" with members "playerA," "playerB,"
and "playerC" having scores 100, 150, and 80, respectively. Write a query to update
the score of "playerB" to 120.
7. Create a bitmap named "user_flags" with a length of 8 bits. Set the bits at positions
2 and 5 to 1, and then retrieve the value of the bitmap.
Consider a scenario where you are developing a real-time analytics dashboard for an
e-commerce platform using Redis. Design a use case example that best fits Redis's
strengths and capabilities for this scenario. Include the specific Redis commands
and data structures you would use to implement this example.
J. References Links
1. https://redis.io/docs/install/install-redis/install-redis-on-windows/
2. https://www.koderhq.com/tutorial/redis/basic-commands/
3. https://redis.com/glossary/redis-data-structures/
4. https://redis.io/docs/interact/search-and-query/advanced-concepts/
5. https://redis.com/blog/5-industry-use-cases-for-redis-developers/
6. https://www.thescalable.net/p/redis-use-cases-examples-in-the-real
K. Assessment Rubrics
Criteria | Excellent (10) | Good (7) | Satisfactory (5) | Needs Improvement (3) | Marks
Understanding of Redis Overview | Demonstrates a profound understanding of Redis overview | Shows a strong understanding of Redis overview | Demonstrates a basic understanding of Redis overview | Struggles to grasp the fundamental concepts of Redis overview |
Knowledge of Redis Data Structures | Displays an excellent grasp of Redis data structures | Shows a solid understanding of Redis data structures | Demonstrates a basic understanding of Redis data structures | Struggles to comprehend Redis data structures or makes significant errors |
Proficiency in Basic Redis Commands and Operations | Executes basic commands and operations with precision | Performs basic commands and operations with competence | Carries out basic commands and operations but with some errors | Struggles to execute basic commands and operations or makes significant errors |
Understanding of Advanced Redis Features | Demonstrates a profound understanding of advanced Redis features | Shows a strong understanding of advanced Redis features | Demonstrates a basic understanding of advanced Redis features | Struggles to grasp the fundamental concepts of advanced Redis features |
Average Marks | | | | |