
Fundamental of Database Group Work

A graph database (GDB) uses graph structures, consisting of nodes, edges, and properties, to store data, making it efficient for handling complex relationships. It excels in scenarios involving many-to-many relationships, real-time query performance, and flexible schema design, making it suitable for applications like social networks and fraud detection. In contrast, MongoDB is a document-oriented NoSQL database that allows for flexible schema design and efficient data management, while PostgreSQL is an object-relational database management system that combines traditional relational features with advanced capabilities like custom data types and JSON support.

Uploaded by

Nolawi Getye

What is a Graph Database?

A graph database (GDB) is a database that uses graph structures to store data.
Instead of tables or documents, it uses nodes, edges, and properties to represent and
store data. Examples include Neo4j, Amazon Neptune, and ArangoDB.
The edges represent relationships between the nodes. This makes related data easier
to retrieve, in many cases with a single operation. Graph databases are commonly
classified as NoSQL databases.

Graph databases are similar to 1970s network model databases in that both represent
general graphs, but network-model databases operate at a lower level of abstraction
and lack easy traversal over a chain of edges.
The underlying storage mechanism of graph databases can vary. Relationships are
first-class citizens in a graph database and can be labelled, directed, and given
properties.

Graph databases portray the data as it is viewed conceptually. This is accomplished by
transferring the data into nodes and its relationships into edges.
A graph database is a database that is based on graph theory. It consists of a set of
objects, each of which is either a node or an edge.

• Nodes represent entities or instances such as people, businesses, accounts, or any
other item to be tracked. They are roughly the equivalent of a record, relation, or
row in a relational database, or a document in a document-store database.
• Edges, also termed relationships, are the lines that connect nodes to other nodes,
representing the relationship between them. Meaningful patterns emerge when
examining the connections and interconnections of nodes, properties and edges.
Edges can be either directed or undirected. In an undirected graph, an edge
connecting two nodes has a single meaning. In a directed graph, the edges
connecting two different nodes have different meanings, depending on their
direction. Edges are the key concept in graph databases, representing an
abstraction that is not directly implemented in a relational model or a
document-store model.
• Properties are information associated with nodes. As noted above, relationships
can be given properties as well.
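The three building blocks above can be sketched in a few lines of Python. This is only a toy in-memory model written for this report, not how Neo4j or any other product stores data; the class names `Node`, `Edge`, and `PropertyGraph` and the `FOLLOWS` label are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, e.g. a person or an account."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labelled relationship that can carry its own properties."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, label, **properties):
        self.edges.append(Edge(source, target, label, properties))

    def outgoing(self, node_id, label):
        """Follow edges of one label in their stored direction."""
        return [e.target for e in self.edges
                if e.source == node_id and e.label == label]


graph = PropertyGraph()
graph.add_node("p1", "Person", name="Saba")
graph.add_node("p2", "Person", name="Henok")
# Direction matters: Saba follows Henok, not the other way round.
graph.add_edge("p1", "p2", "FOLLOWS", since=2024)

saba_follows = graph.outgoing("p1", "FOLLOWS")
henok_follows = graph.outgoing("p2", "FOLLOWS")
print(saba_follows, henok_follows)
```

Because the edge is directed, following `FOLLOWS` from Saba reaches Henok, while following it from Henok reaches nobody.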

When do we need Graph Database?

1. It solves many-to-many relationship problems. Friends-of-friends lookups, for
example, are many-to-many relationships, and the equivalent query in a relational
database becomes very complex.

2. When the relationships between data elements matter more than the elements
themselves. For example, a profile holds some specific information, but the major
selling point is the relationship between the different profiles, that is, how you
get connected within a network.
In the same way, a graph database may store many user data elements, but the
relationships between them are the deciding factor for how that data is used.

3. Low latency with large-scale data. When you add lots of relationships to a
relational database, the data sets become huge, queries over them grow more complex,
and they take longer than usual to run. A graph database is designed specifically
for this purpose, so relationships can be queried with ease.
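The friends-of-friends case in point 1 can be sketched as a traversal. The snippet below is a minimal illustration in plain Python, assuming the friendships are already held as an adjacency dictionary; the function name `within_hops` and the sample names are invented for the example, and a real graph database would do this traversal internally.

```python
from collections import deque


def within_hops(friends, start, max_hops):
    """Breadth-first search returning everyone reachable within max_hops."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = set()
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                reached.add(friend)
                queue.append((friend, hops + 1))
    return reached


# Hypothetical friendship data, stored in both directions.
friends = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Charlie"],
    "Charlie": ["Bob", "Dana"],
    "Dana": ["Charlie"],
}

friends_of_friends = within_hops(friends, "Alice", 2)
print(sorted(friends_of_friends))
```

Each extra hop is one more step of the same loop, whereas in a relational schema each extra hop typically means one more self-join on the friendship table.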

Why Do Graph Databases Matter?


Graph databases matter because they provide a natural and efficient way to model,
store, and query complex relationships between data entities. In today's data-driven
world, many applications—from social media platforms to fraud detection systems—
require not just storing data, but understanding how that data is connected.

Key Reasons They Matter:

• Relationship-Centric Storage: Unlike relational databases, which use JOINs to
infer relationships, graph databases store relationships as first-class citizens,
making them faster and more intuitive for traversing networks.

• Real-Time Query Performance: Graph databases excel at querying highly
interconnected data in real time, which is essential for recommendations, path
finding (e.g., in maps), and influence analysis.

• Flexible Schema: They allow a schema-less structure, meaning you can evolve
your data model without costly migrations, making them ideal for dynamic
applications.

• Applications in Critical Domains: Used in fraud detection, recommendation
engines, network/IT operations, knowledge graphs, and social network analysis,
graph databases power features that require deep link analysis.

Limitations of Graph Databases:

• Graph databases are not always a better choice than other NoSQL variants.
• If the application needs to scale horizontally, performance may suffer.
• They are not very efficient when all nodes with a given parameter must be updated.

Scenario: University Course Enrollment System


Imagine a university system where you need to model relationships between students,
courses, and professors. In a traditional relational database, you'd need multiple
JOINs to connect these entities. But in a graph database, you can model this naturally
using nodes and relationships.

Entities and Relationships:


Nodes:
Student: name, student_id
Course: title, course_code
Professor: name, department

Relationships:
(:Student)-[:ENROLLED_IN]->(:Course)
(:Professor)-[:TEACHES]->(:Course)

Using Neo4j (Cypher Query Language)

Create Nodes:
CREATE (alice:Student {name: 'Alice', student_id: 'S001'})
CREATE (bob:Student {name: 'Bob', student_id: 'S002'})
CREATE (math:Course {title: 'Mathematics', course_code: 'MATH101'})
CREATE (prof:Professor {name: 'Dr. Smith', department: 'Mathematics'})

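Create Relationships:
// If this runs as a separate statement, look the nodes up again first;
// otherwise CREATE would attach the relationships to new, empty nodes.
MATCH (alice:Student {student_id: 'S001'}), (bob:Student {student_id: 'S002'}),
      (math:Course {course_code: 'MATH101'}), (prof:Professor {name: 'Dr. Smith'})
CREATE (alice)-[:ENROLLED_IN]->(math)
CREATE (bob)-[:ENROLLED_IN]->(math)
CREATE (prof)-[:TEACHES]->(math)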

Sample Queries
1. Find all students enrolled in Mathematics:
MATCH (s:Student)-[:ENROLLED_IN]->(c:Course {title: 'Mathematics'})
RETURN s.name

2. Find which professor teaches Mathematics:


MATCH (p:Professor)-[:TEACHES]->(c:Course {title: 'Mathematics'})
RETURN p.name

3. List all courses a student is enrolled in:


MATCH (s:Student {name: 'Alice'})-[:ENROLLED_IN]->(c:Course)
RETURN c.title

4. Find all students taught by a specific professor:

MATCH (p:Professor {name: 'Dr. Smith'})-[:TEACHES]->(c:Course)<-[:ENROLLED_IN]-(s:Student)
RETURN s.name
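For comparison, the same question as query 4 can be asked of a relational schema. The sketch below uses Python's built-in sqlite3 module so it runs without a server; the table layout (including the `enrollments` junction table and the `professor_id` column) is an assumption made for this example and is not part of the scenario above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students   (student_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE professors (professor_id INTEGER PRIMARY KEY, name TEXT,
                             department TEXT);
    CREATE TABLE courses    (course_code TEXT PRIMARY KEY, title TEXT,
                             professor_id INTEGER REFERENCES professors(professor_id));
    -- Junction table: the many-to-many ENROLLED_IN relationship.
    CREATE TABLE enrollments (student_id TEXT REFERENCES students(student_id),
                              course_code TEXT REFERENCES courses(course_code));

    INSERT INTO students VALUES ('S001', 'Alice'), ('S002', 'Bob');
    INSERT INTO professors VALUES (1, 'Dr. Smith', 'Mathematics');
    INSERT INTO courses VALUES ('MATH101', 'Mathematics', 1);
    INSERT INTO enrollments VALUES ('S001', 'MATH101'), ('S002', 'MATH101');
""")

# Three JOINs are needed to walk professor -> course -> enrollment -> student.
rows = conn.execute("""
    SELECT s.name
    FROM professors p
    JOIN courses c     ON c.professor_id = p.professor_id
    JOIN enrollments e ON e.course_code = c.course_code
    JOIN students s    ON s.student_id = e.student_id
    WHERE p.name = ?
    ORDER BY s.name
""", ("Dr. Smith",)).fetchall()

students_of_smith = [name for (name,) in rows]
print(students_of_smith)
```

The Cypher pattern expresses the same path in a single MATCH line, which is the point the scenario makes about JOINs.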

What is MongoDB?
MongoDB is a document-oriented NoSQL database designed for storing and
managing large volumes of unstructured or semi-structured data. Unlike traditional
relational databases, MongoDB uses a flexible schema design, allowing developers to
store data in a way that aligns seamlessly with modern application requirements.
Key Features

Document-Oriented: MongoDB stores data in JSON-like documents, which are more
flexible and easier to work with than traditional rows and columns. Each document
can have a different structure, making it ideal for hierarchical data storage.

Scalability: MongoDB supports horizontal scaling through sharding, which distributes
data across multiple servers. This ensures high availability and performance even with
large datasets.

Indexing: Fields in a MongoDB document can be indexed, making data retrieval
faster and more efficient.

Replication: MongoDB provides high availability through replica sets, which
maintain multiple copies of data across different servers. This ensures data
redundancy and reliability.

Aggregation: MongoDB offers powerful aggregation capabilities, allowing for
complex data processing and analysis. It supports aggregation pipelines, map-reduce
functions (deprecated in recent MongoDB versions in favor of pipelines), and
single-purpose aggregation methods.
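To make the idea of an aggregation pipeline concrete, the sketch below imitates two stages in plain Python. It does not talk to MongoDB at all; the helper names `match_stage` and `group_sum_stage` and the sample documents are invented for illustration, and the shell command they mirror is shown in a comment.

```python
from collections import defaultdict

# Hypothetical order documents, as they might look in an orders collection.
orders = [
    {"customer_name": "Robel", "status": "Shipped", "total": 1050},
    {"customer_name": "Robel", "status": "Shipped", "total": 200},
    {"customer_name": "Saba", "status": "Pending", "total": 75},
]


def match_stage(docs, field, value):
    """Keep only documents whose field equals value (the role of $match)."""
    return [d for d in docs if d.get(field) == value]


def group_sum_stage(docs, key_field, sum_field):
    """Sum sum_field per distinct key_field (the role of $group with $sum)."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key_field]] += d[sum_field]
    return dict(totals)


# Mirrors, in the mongo shell:
#   db.orders.aggregate([
#     { $match: { status: "Shipped" } },
#     { $group: { _id: "$customer_name", spent: { $sum: "$total" } } }
#   ])
shipped = match_stage(orders, "status", "Shipped")
spent_per_customer = group_sum_stage(shipped, "customer_name", "total")
print(spent_per_customer)
```

Each stage takes the output of the previous one, which is what "pipeline" refers to.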

Architecture
• Programming language accessibility
MongoDB has official drivers for major programming languages and development
environments. There are also a large number of unofficial or community-supported
drivers for other programming languages and frameworks.

• Serverless access
• Management and graphical front-ends
The primary interface to the database has been the mongo shell. Since MongoDB 3.2,
MongoDB Compass has been available as the native GUI. There are also products and
third-party projects that offer user interfaces for administration and data viewing.
[Figure: record insertion in MongoDB with Robomongo 0.8.5]

Scenario: E-Commerce Orders System

Sample order document:

{
  "_id": ObjectId("..."),
  "order_id": "ORD123",
  "customer_name": "Robel",
  "items": [
    { "product": "Laptop", "qty": 1, "price": 1000 },
    { "product": "Mouse", "qty": 2, "price": 25 }
  ],
  "total": 1050,
  "status": "Shipped",
  "order_date": ISODate("2025-05-09T00:00:00Z")
}
Queries for This Scenario

1. Insert an order

db.orders.insertOne({
  order_id: "ORD123",
  customer_name: "Robel",
  items: [
    { product: "Laptop", qty: 1, price: 1000 },
    { product: "Mouse", qty: 2, price: 25 }
  ],
  total: 1050,
  status: "Shipped",
  order_date: new Date()
})

2. Find all orders by a customer

db.orders.find({ customer_name: "Robel" })

3. Find orders with total greater than 1000

db.orders.find({ total: { $gt: 1000 } })

4. Update order status

db.orders.updateOne(
  { order_id: "ORD123" },
  { $set: { status: "Delivered" } }
)

5. Find orders with product "Laptop"

db.orders.find({ "items.product": "Laptop" })

Operator     Description
$gt          Greater than
$lt          Less than
$set         Updates the value of a field
$in          Matches any of the values listed in an array
$push        Adds an item to an array
$and / $or   Combines conditions
What is PostgreSQL?

PostgreSQL, commonly known as Postgres, is a powerful open-source
object-relational database management system (ORDBMS). It stores data in
structured tables made up of rows and columns and allows relationships
between tables using keys, a core concept in relational databases.
Unlike traditional relational databases, PostgreSQL also includes object-oriented
features, offering greater flexibility for complex data modeling.

Key Object-Oriented Features:

• Custom data types: Create your own data structures tailored to your application.
• Table inheritance: Let tables inherit columns and constraints from other tables to
simplify hierarchical data.
• Functions and procedures: Write reusable logic using multiple languages (e.g.,
PL/pgSQL, PL/Python).
• JSON support: Natively handle semi-structured data with full JSON
capabilities.
• Full-text search: Efficiently search text fields within large datasets.

What is PostgreSQL Used For?

PostgreSQL is widely used across domains, including AI and data science, due
to its reliability and versatility.

Common Uses for Data Scientists:

• Large-scale data storage for machine learning, analytics, and data pipelines.
• Advanced querying with joins, subqueries, and window functions for powerful
data exploration.
• Data transformation using complex SQL operations for data cleaning and preparation.
• In-database analytics using functions, triggers, and procedures to compute and
automate logic directly in the database.

Examples
E-commerce Inventory Management: A retail company uses PostgreSQL to
track products, stock levels, and sales. The database helps in real-time inventory
updates and generating sales reports.

Healthcare Records: A hospital system employs PostgreSQL to store patient
records, appointments, and medical histories, ensuring secure and reliable data access
for doctors.

Financial Transactions: A bank uses PostgreSQL to manage customer
accounts, transactions, and fraud detection, leveraging its robust transaction support.

Sample Queries
Creating a Table:

CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100),
    join_date DATE
);

Inserting Data:

INSERT INTO customers (name, email, join_date) VALUES
    ('Abebe Bekele', '[email protected]', '2024-01-15'),
    ('Almaz Tadesse', '[email protected]', '2024-02-10');

Retrieving Data with Conditions:

SELECT name, email FROM customers WHERE join_date > '2024-01-01';

Updating Data:

UPDATE customers SET email = '[email protected]' WHERE customer_id = 1;

Deleting Data:

DELETE FROM customers WHERE customer_id = 2;

Inner Join
The INNER JOIN clause in SQL is used to combine records from two or more tables.
The result contains only the rows that have matching values in both tables based on a
specific condition. This makes INNER JOIN a valuable tool when we need to work
with related data across multiple tables in a database.

The key feature of an INNER JOIN is that it filters out rows from the result where
there is no matching data in both tables. Essentially, it returns a “subset” of the data
where the condition is satisfied.

Key Points About SQL INNER JOIN

1. Combines Data from Multiple Tables: INNER JOIN allows us to combine
data from multiple tables based on common columns, making it possible to
work with related data stored in different tables.
2. Excludes Non-Matching Records: INNER JOIN only returns records where
there is a match in both tables based on the join condition. If there is no match,
the record will be excluded from the result set.
3. Simplifies Complex Queries: INNER JOIN simplifies complex queries by
allowing you to work with multiple tables at once. It reduces the need for
multiple subqueries and makes database management more efficient.
4. Widely Used in Relational Databases: INNER JOIN is widely used for tasks
such as managing customer orders, product inventories, and many other
relational datasets. It is essential for performing operations on normalized
data.
Example

Product and Supplier Inventory:

Scenario: A retail business wants to check the availability of products by linking
product details with supplier information, focusing only on products with active
suppliers.

Use Case: Inventory managers reorder stock for products with confirmed supplier
records.
Example: Join the products and suppliers tables on supplier_id to get product and
supplier details.

Sample Queries

Sample Tables:
employees (employee_id, name, department_id)
departments (department_id, department_name)

Inserting Sample Data:

INSERT INTO employees (name, department_id) VALUES
    ('Alice', 1), ('Bob', 2), ('Charlie', 3);
INSERT INTO departments (department_id, department_name) VALUES
    (1, 'HR'), (2, 'IT'), (4, 'Finance');

Basic Inner Join Query:

SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;
Result:
Alice - HR
Bob - IT
(Charlie is excluded because department 3 has no match; Finance is excluded because
no employee belongs to department 4.)

Inner Join with Conditions:

SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'IT';
Result:
Bob - IT

Multiple Tables Inner Join:

SELECT e.name, d.department_name, l.location
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
INNER JOIN locations l ON d.location_id = l.location_id;
(Assumes a locations table with location_id and location columns, and a location_id
column in departments.)
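The first two inner join queries can be checked end to end with a short script. This is a sketch using Python's built-in sqlite3 module instead of PostgreSQL, so the column types are assumptions; `INTEGER PRIMARY KEY` lets SQLite assign employee_id automatically, which plays the role a SERIAL column would play in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT);
    CREATE TABLE employees   (employee_id INTEGER PRIMARY KEY,
                              name TEXT, department_id INTEGER);

    INSERT INTO departments (department_id, department_name) VALUES
        (1, 'HR'), (2, 'IT'), (4, 'Finance');
    INSERT INTO employees (name, department_id) VALUES
        ('Alice', 1), ('Bob', 2), ('Charlie', 3);
""")

# Basic inner join: only rows with a matching department survive.
inner_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.name
""").fetchall()

# Inner join with an extra condition on the joined table.
it_rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    WHERE d.department_name = ?
""", ("IT",)).fetchall()

print(inner_rows)
print(it_rows)
```

Charlie (department 3) and Finance (department 4) both drop out, because neither has a partner row on the other side of the join.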

FULL OUTER JOIN


The SQL FULL OUTER JOIN statement joins two tables based on a common
column. It selects records with matching values in these columns and the remaining
rows from both tables.
FULL OUTER JOIN returns all rows of both tables involved in the JOIN operation in
the result set. If there is no match for a particular row based on the specified
condition, the result includes NULL values for the columns of the table that has
no match. This makes FULL OUTER JOIN useful when unmatched rows from both tables
must be included. PostgreSQL supports FULL OUTER JOIN directly; MySQL does not, so
there it has to be emulated.
Key features of FULL OUTER JOIN
1. Combines left and right joins:
The same result can be achieved with a combination of LEFT JOIN, RIGHT JOIN,
and the UNION operator. The LEFT JOIN returns all records from the left table
and matched records from the right table (filling NULLs where no match
exists), the RIGHT JOIN does the opposite, and the UNION of the two gives the
same rows as the FULL OUTER JOIN, provided the result contains no duplicate
rows that UNION would collapse.

2. Includes unmatched rows:
Rows with no match in the other table are included with NULL values filling
in the missing columns.

3. Maintains data integrity:
It preserves all rows from both tables, so no row is dropped from the result.

4. Useful for data reconciliation:
FULL OUTER JOINs are often used to identify differences between datasets,
showing what exists in one table but not the other.

5. Results can be further filtered for more insight:
After the FULL OUTER JOIN, we can filter the results to find only unmatched
rows or analyze mismatched data. For example, consider two tables, one
listing employees and the other departments. Filtering the joined result
reveals employees not assigned to any department and departments without
employees.
Syntax:
SELECT column_list
FROM table1
FULL OUTER JOIN table2
ON table1.common_column = table2.common_column;
The column_list refers to the columns that we want to retrieve in the result table. We
separate the columns with commas; to retrieve all columns from both tables, we use
'*' after SELECT. table1 and table2 are the tables that we want to join. As mentioned
earlier, FULL OUTER JOIN works on a common column: the ON clause tells the database
which columns of the two tables to match.
Example:
Scenario: A company stores customer information separately from order history and
wants to identify customers without purchases as well as purchases without
registered customers. Our task is to join all records in the two tables and then
filter the rows that have a NULL value in one of their columns.

Sample Tables:
CREATE TABLE customers (
    customer_id INT,
    first_name VARCHAR(50),
    last_name VARCHAR(50)
);
CREATE TABLE orders (
    order_id INT,
    amount INT,
    customer INT
);
Customers table:
customer_id   first_name   last_name
1             Natnael      Melaku
2             Biruk        Dagim
3             Minase       Tesfaye
4             Yosef        Mulatu

Orders table:
order_id   amount   customer
1          50       10
2          100      4
3          86       6
4          49       2

Query:
SELECT customer_id, first_name, amount
FROM customers
FULL OUTER JOIN orders
ON customers.customer_id = orders.customer;
Output:

customer_id   first_name   amount
1             Natnael      NULL
2             Biruk        49
3             Minase       NULL
4             Yosef        100
NULL          NULL         50
NULL          NULL         86

Customers 2 and 4 have matching orders, so the amount is filled in accordingly.
Customers 1 and 3 have no match in the customer column of orders, so their amount
is NULL. Similarly, the orders placed by customers 10 and 6 have no match in the
customers table, so customer_id and first_name are NULL for those rows.
From this result, we can filter and identify customers without purchases and purchases
without registered customers.
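The example can be reproduced, including the final filtering step, with the script below. It is a sketch that runs on Python's built-in sqlite3 module rather than PostgreSQL. Because older SQLite versions lack FULL OUTER JOIN and RIGHT JOIN, it uses a variant of the emulation described under key feature 1: a LEFT JOIN, plus the orders that have no customer, combined with UNION ALL. That substitution is our assumption, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INT, first_name VARCHAR(50),
                            last_name VARCHAR(50));
    CREATE TABLE orders (order_id INT, amount INT, customer INT);

    INSERT INTO customers VALUES
        (1, 'Natnael', 'Melaku'), (2, 'Biruk', 'Dagim'),
        (3, 'Minase', 'Tesfaye'), (4, 'Yosef', 'Mulatu');
    INSERT INTO orders VALUES (1, 50, 10), (2, 100, 4), (3, 86, 6), (4, 49, 2);
""")

# Part 1: every customer, with the order amount where one exists.
# Part 2: only the orders whose customer is missing from customers.
EMULATED_FULL_JOIN = """
    SELECT c.customer_id, c.first_name, o.amount
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer
    UNION ALL
    SELECT c.customer_id, c.first_name, o.amount
    FROM orders o
    LEFT JOIN customers c ON c.customer_id = o.customer
    WHERE c.customer_id IS NULL
"""
full_rows = conn.execute(EMULATED_FULL_JOIN).fetchall()

# Filter the joined result for the two kinds of unmatched rows.
customers_without_orders = sorted(
    name for cid, name, amount in full_rows
    if cid is not None and amount is None)
orphan_order_amounts = sorted(
    amount for cid, name, amount in full_rows if cid is None)

print(customers_without_orders)
print(orphan_order_amounts)
```

SQL NULL values arrive in Python as `None`, which is why the filters test with `is None`.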
Left Join
Definition: A LEFT JOIN returns all rows from the left table and the matched rows
from the right table. If there is no match, the result will contain NULL values from the
right table. It is mostly used when you want to get everything from the main table,
whether or not there's a related record in the joined table.

Key Features of LEFT JOIN:

• Retrieves all data from the left table.
• Returns NULL for unmatched rows from the right table.
• Maintains full visibility of the main dataset (left table).
• Very useful in reporting where missing values must be shown.

When Do We Need LEFT JOIN?

• To find students who haven’t submitted assignments.
• To generate a customer list with or without purchases.
• In attendance systems, to list all students whether they attended or not.
• When checking for missing or incomplete data in linked tables.

Example:

Let’s say we have two tables:

I. Students: Contains the list of all students in a university in Addis Ababa.
II. Course_Enrollments: Contains only the students who have enrolled in any
course.

We want to list all students, whether they enrolled in a course or not.

Students table:
student_id   name
1            Saba
2            Henok
3            Melat

Course_Enrollments table:
student_id   course
1            Math
2            Physics

SQL Query:
SELECT Students.name, Course_Enrollments.course
FROM Students
LEFT JOIN Course_Enrollments
ON Students.student_id = Course_Enrollments.student_id;

Output:

name    course
Saba    Math
Henok   Physics
Melat   NULL

Explanation:

• Saba and Henok are enrolled in Math and Physics, respectively.
• Melat is not enrolled in any course, but she still appears in the result because we
used LEFT JOIN.
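The same example can be run as a script, together with the common follow-up of keeping only the unmatched rows (for instance, students without any enrollment). This sketch uses Python's built-in sqlite3 module instead of PostgreSQL, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course_Enrollments (student_id INTEGER, course TEXT);

    INSERT INTO Students VALUES (1, 'Saba'), (2, 'Henok'), (3, 'Melat');
    INSERT INTO Course_Enrollments VALUES (1, 'Math'), (2, 'Physics');
""")

# All students, enrolled or not.
all_students = conn.execute("""
    SELECT s.name, e.course
    FROM Students s
    LEFT JOIN Course_Enrollments e ON s.student_id = e.student_id
    ORDER BY s.student_id
""").fetchall()

# Keep only the rows where the right side found no match.
not_enrolled = [name for (name,) in conn.execute("""
    SELECT s.name
    FROM Students s
    LEFT JOIN Course_Enrollments e ON s.student_id = e.student_id
    WHERE e.student_id IS NULL
""")]

print(all_students)
print(not_enrolled)
```

Adding `WHERE e.student_id IS NULL` turns the LEFT JOIN into a "missing data" check, which is the pattern behind several of the use cases listed above.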

Right Join
Definition: A RIGHT JOIN returns all rows from the right table and the matched
rows from the left table. If there is no match, it will return NULL values from the left
table. It’s useful when we want to get everything from the secondary table, even if it
doesn’t have corresponding data in the main table.

Key Features of RIGHT JOIN:

• Retrieves all data from the right table.
• Shows NULL where the left table has no match.
• Useful when the right table is the priority (e.g., course list, transactions).
• Helps detect orphan records, that is, records that exist without matching main data.

When Do We Need RIGHT JOIN?

• To find orders with no known customers.
• To list course enrollments even if students are missing in the database.
• In inventory systems, to view items sold that are not listed in the product catalog.
• When auditing for data integrity issues.

Example:

Now let’s say we want to get a list of all courses that students enrolled in, including
those that don’t have a matching student (e.g., missing or unregistered).

Course_Enrollments table:
student_id   course
1            Math
2            Physics
4            English

Students table:
student_id   name
1            Saba
2            Henok
3            Melat
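A query for this example could look like the sketch below. The RIGHT JOIN statement in the comment is our reading of what the example intends, mirroring the LEFT JOIN query earlier. The script runs on Python's built-in sqlite3 module, and because older SQLite versions do not support RIGHT JOIN, it executes the equivalent LEFT JOIN with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course_Enrollments (student_id INTEGER, course TEXT);

    INSERT INTO Students VALUES (1, 'Saba'), (2, 'Henok'), (3, 'Melat');
    INSERT INTO Course_Enrollments VALUES
        (1, 'Math'), (2, 'Physics'), (4, 'English');
""")

# Intended form (assumed, not executed here):
#   SELECT Students.name, Course_Enrollments.course
#   FROM Students
#   RIGHT JOIN Course_Enrollments
#   ON Students.student_id = Course_Enrollments.student_id;
#
# "A RIGHT JOIN B" returns the same rows as "B LEFT JOIN A".
right_join_rows = conn.execute("""
    SELECT s.name, e.course
    FROM Course_Enrollments e
    LEFT JOIN Students s ON s.student_id = e.student_id
    ORDER BY e.student_id
""").fetchall()

print(right_join_rows)
```

Every enrollment appears, and the English enrollment of the unknown student 4 shows a NULL (`None`) name. Melat does not appear, because she has no row in the right table.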

Conclusion

Understanding Left Join and Right Join is essential when working with relational
databases because not all data is always perfectly matched between tables. These joins
help us uncover missing or unmatched information, which is especially important in
real-world systems like student records, inventory tracking, financial reporting, or
customer databases.

• Left Join is useful when we want to see all records from the main (left) table,
even if there are no matching entries in the related table.
• Right Join is the opposite: it helps us view everything from the secondary (right)
table, whether or not it has related data in the main table.
