COSC 411 Summary-V2.0

The Entity-Relationship Model (ERM) is a conceptual framework for database design that visually represents entities, their attributes, and relationships. It includes various types of attributes, connectivity, cardinality, and relationship degrees, as well as differences between centralized and distributed databases. Additionally, it discusses database management systems (DBMS), data fragmentation, replication, and the challenges faced in distributed database management.

Uploaded by

hamzahaladu002

THE ENTITY RELATIONSHIP MODEL (ERM)

An Entity-Relationship Model (ER Model) in database design is a conceptual framework
used to describe the structure of a database. It visually represents data objects (entities),
the relationships between them, and the attributes that describe these entities and
relationships. Here are the key components:

Entities: represent objects or things in the real world that have distinct existence. For
example, in a school database, entities might include Student, Teacher, Course, etc.

• Entities are usually depicted as rectangles in ER diagrams.

Attributes: describe properties or characteristics of an entity. For instance, a student
entity might have attributes like StudentID, Name, DateOfBirth, and Address.

• Attributes are typically shown as ovals connected to their respective entities.


ATTRIBUTE TYPES

• Simple/Atomic Attributes: Simple or atomic attributes are indivisible; they
represent the most basic units of data and cannot be broken down into smaller
components. Example: Age: A single value representing the number of years a
person has lived.
• Composite Attributes: Composite attributes consist of multiple components, each
of which can be treated as an independent attribute. They represent a single
attribute that can be broken down into meaningful sub-parts. Example: FullName:
Can be divided into FirstName, MiddleName, and LastName.
• Single-Valued Attributes: Single-valued attributes hold only one value for a
given entity instance. Each entity instance will have a single piece of data for this
attribute. Example: DateOfBirth: A person has one specific date of birth.
• Multi-Valued Attributes: Multi-valued attributes can hold multiple values for a
single entity instance. Each entity instance can have a set of values for this
attribute. Example: Skills: An employee may have multiple skills.
• Null Attributes: Null attributes are attributes that can have a null value, meaning
the attribute may have no applicable value or the value is unknown or not yet
assigned. A null value is different from an empty string or zero; it represents the
absence of a value. Example: MiddleName: Not all people have a middle name, so
this attribute can be null.
• Derived/Calculated Attributes: Derived or calculated attributes are those whose
values are not stored directly in the database but are computed from other attributes
when needed. These attributes depend on the values of other attributes. Example: Age:
Can be calculated from the DateOfBirth by subtracting the birth date from the
current date.
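
The attribute types above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the class names FullName and Person, the method age, and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class FullName:
    """Composite attribute: one attribute made of meaningful sub-parts."""
    first_name: str
    last_name: str
    middle_name: Optional[str] = None  # null attribute: may have no value


@dataclass
class Person:
    name: FullName                                  # composite
    date_of_birth: date                             # simple, single-valued
    skills: set[str] = field(default_factory=set)   # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one year if this year's birthday has not happened yet.
        birthday = (self.date_of_birth.month, self.date_of_birth.day)
        if (today.month, today.day) < birthday:
            years -= 1
        return years


# Hypothetical sample data.
p = Person(FullName("Ada", "Obi"), date(2000, 6, 15), {"SQL", "Python"})
print(p.age(date(2024, 6, 14)))  # 23 (birthday not yet reached)
print(p.age(date(2024, 6, 15)))  # 24
print(p.name.middle_name)        # None
```

Storing DateOfBirth and computing Age on demand keeps the value correct as time passes, which is the usual reason for treating Age as derived rather than stored.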

Relationships: represent associations between entities. For example, a student enrolls in
a course, or a teacher teaches a course. They are depicted as diamonds, with lines
connecting them to the related entities.

• Relationship classification is difficult to establish if only one side of the
relationship is known.

Connectivity describes the types of relationships between entities in a database. It is
about how many entities can be associated with another entity in a relationship.
Connectivity is usually classified into the following types:

• One-to-One (1:1): One entity in set A is related to at most one entity in set B, and
vice versa. Example: Each person has a unique passport number, and each
passport number is assigned to only one person.
• One-to-Many (1:M): One entity in set A can be related to multiple entities in set
B, but an entity in set B is related to at most one entity in set A. Example: A
single department in a company can have many employees, but each employee
belongs to only one department.
• Many-to-Many (M:M): An entity in set A can be related to multiple entities in set
B, and an entity in set B can be related to multiple entities in set A. Example:
Students and courses: a student can enroll in many courses, and a course can have
many students.
• Many-to-One (M:1): In a many-to-one relationship, many instances of one entity
can be associated with a single instance of another entity. Example: Students and
University: Many students attend one university. Here, the relationship is many
students (set A) to one university (set B).
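
As an illustration of the many-to-many case, the student/course example can be stored with a linking (junction) table. The junction-table technique is not described in these notes; it is a common relational approach shown here as an assumption, using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: each row links one student to one course.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        course_id  INTEGER REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Amina'), (2, 'Bello');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_of_amina = [row[0] for row in conn.execute(
    "SELECT c.title FROM course c "
    "JOIN enrollment e ON c.course_id = e.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN enrollment e ON s.student_id = e.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name")]

print(courses_of_amina)       # ['Databases', 'Networks']
print(students_in_databases)  # ['Amina', 'Bello']
```

A 1:M relationship would need no junction table: a single foreign-key column on the "many" side (for example, a department reference on each employee row) is enough.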

Cardinality refers to the number of instances of one entity that can or must be associated
with each instance of another entity. Cardinality specifies the minimum and maximum
number of relationships in which an entity can participate. It is expressed as a pair of
values: (minimum, maximum).

• Minimum Cardinality: The least number of entity instances that must participate
in a relationship. Example: In a mandatory one-to-many relationship between
departments and employees, the minimum cardinality from department to
employees might be one, meaning each department must have at least one
employee.
• Maximum Cardinality: The greatest number of entity instances that can
participate in a relationship. Example: In a one-to-many relationship between
departments and employees, the maximum cardinality from department to
employees might be many, meaning each department can have many employees.
KEY DIFFERENCES
Connectivity: Focuses on the type of relationship between entities (one-to-one,
one-to-many, many-to-many).
Cardinality: Focuses on the number of entity instances that can participate in a
relationship (minimum and maximum).
In summary, connectivity tells you about the possible types of relationships between
entities, while cardinality provides detailed constraints on how many instances of each
entity can or must participate in these relationships.
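
A (minimum, maximum) cardinality pair can be checked mechanically. The sketch below is a hypothetical helper, not a feature of any particular DBMS: the function name check_cardinality, its parameters, and the department data are assumptions for illustration. A maximum of None stands for "many".

```python
from collections import Counter
from typing import Optional


def check_cardinality(links, parents, minimum: int, maximum: Optional[int]):
    """Return the parents whose number of linked children violates
    the (minimum, maximum) pair. maximum=None means 'many' (no upper bound)."""
    counts = Counter(parent for parent, _child in links)
    violations = []
    for parent in parents:
        n = counts[parent]  # Counter gives 0 for parents with no links
        if n < minimum or (maximum is not None and n > maximum):
            violations.append(parent)
    return violations


# Hypothetical data: (department, employee) pairs.
departments = ["Sales", "IT", "Audit"]
works_in = [("Sales", "e1"), ("Sales", "e2"), ("IT", "e3")]

# (1, many): every department must have at least one employee.
print(check_cardinality(works_in, departments, 1, None))  # ['Audit']
# (0, 1): a department may have at most one employee.
print(check_cardinality(works_in, departments, 0, 1))     # ['Sales']
```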
DEGREES OF RELATIONSHIPS

1. Unary (or Recursive) Relationship: A relationship between instances of the
same entity type. Example: An employee supervising other employees.
o Entity: Employee
o Relationship: Supervises
o Description: An employee can supervise other employees, and each
employee can have a supervisor.
2. Binary Relationship: A relationship between instances of two different entity
types. This is the most common type of relationship in ER models.
3. Ternary Relationship: A relationship among three different entity types.
Example: A project assignment where employees work on projects using specific
equipment.
o Entities: Employee, Project, Equipment
o Relationship: Assignment
o Description: An employee works on a project using a piece of equipment.
This relationship cannot be accurately represented by binary relationships
alone because the involvement of all three entities is necessary to describe
the association.
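
The unary and ternary cases can be made concrete with a small sketch. All names and sample data here are hypothetical. The first part follows a Supervises relationship within one entity type; the second shows two different sets of (employee, project, equipment) assignments that produce identical pairwise (binary) views, which is why the pairs alone cannot stand in for the ternary relationship.

```python
# Unary (recursive) relationship: Employee supervises Employee.
supervisor_of = {"Chidi": "Ngozi", "Ngozi": "Tunde", "Tunde": None}


def chain_of_command(employee):
    """Follow the Supervises relationship upward from one employee."""
    chain = []
    boss = supervisor_of[employee]
    while boss is not None:
        chain.append(boss)
        boss = supervisor_of[boss]
    return chain


print(chain_of_command("Chidi"))  # ['Ngozi', 'Tunde']


# Ternary relationship: (employee, project, equipment) assignments.
def binary_projections(assignments):
    """Split ternary facts into the three possible sets of pairs."""
    emp_proj = {(e, p) for e, p, _q in assignments}
    proj_equip = {(p, q) for _e, p, q in assignments}
    emp_equip = {(e, q) for e, _p, q in assignments}
    return emp_proj, proj_equip, emp_equip


week1 = {("Ada", "Bridge", "Crane"), ("Ada", "Road", "Drill"),
         ("Bayo", "Bridge", "Drill"), ("Bayo", "Road", "Crane")}
week2 = {("Ada", "Bridge", "Drill"), ("Ada", "Road", "Crane"),
         ("Bayo", "Bridge", "Crane"), ("Bayo", "Road", "Drill")}

# Different assignments, yet the same pairs: the pairs lose information.
print(week1 != week2)                                          # True
print(binary_projections(week1) == binary_projections(week2))  # True
```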
OBJECT-ORIENTED MODEL

The Object-Oriented Model in database management systems (DBMS) integrates
concepts from object-oriented programming (OOP) with database systems. This approach
allows for a more natural and intuitive representation of real-world entities and their
interactions within a database.

ISA RELATIONSHIP

The ISA relationship, also known as the "IS-A" relationship, is used to model
inheritance in database design. This relationship is particularly useful in
object-oriented design, where a subclass inherits properties and behaviors from a
superclass.

DISJOINT CONSTRAINT
The disjoint constraint ensures that an entity instance can belong to only one subclass in
the inheritance hierarchy. It specifies that an entity can be a member of at most one
subclass, so there can be no overlap between the subclasses.
OVERLAP CONSTRAINT
The overlap constraint allows an entity instance to belong to multiple subclasses
simultaneously. It does not enforce exclusivity among subclasses.

Specialization: the process of taking an entity and creating several specialized classes.
It is either disjoint or overlapping.
Generalization: the process of taking several related entities and creating a general
class. It is either total or partial.

COMPLETENESS CONSTRAINT
This can be total or partial. With total completeness, every entity in the superclass must
belong to a subclass; with partial completeness, entities in the superclass do not need to
be part of any subclass.
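
The disjoint/overlap and total/partial constraints can be checked over sets of entity identifiers. The following is a minimal sketch under assumed names: check_specialization, the Person/Student/Teacher sets, and the message wording are all hypothetical.

```python
def check_specialization(superclass, subclasses, disjoint, total):
    """superclass: set of entity ids; subclasses: dict of name -> set of ids.
    Returns a list of human-readable constraint violations."""
    problems = []
    for entity in sorted(superclass):
        owners = sorted(name for name, members in subclasses.items()
                        if entity in members)
        # Disjoint: an entity may belong to at most one subclass.
        if disjoint and len(owners) > 1:
            problems.append(f"{entity} violates disjointness: {owners}")
        # Total: every superclass entity must belong to some subclass.
        if total and not owners:
            problems.append(f"{entity} violates total completeness")
    return problems


# Hypothetical data: person 2 is both student and teacher; person 4 is neither.
person = {1, 2, 3, 4}
roles = {"Student": {1, 2}, "Teacher": {2, 3}}

for line in check_specialization(person, roles, disjoint=True, total=True):
    print(line)
# 2 violates disjointness: ['Student', 'Teacher']
# 4 violates total completeness

# Overlapping and partial: the same data is acceptable.
print(check_specialization(person, roles, disjoint=False, total=False))  # []
```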

DATABASE
A database is an organized collection of structured information, or data, typically stored
electronically in a computer system. It allows for efficient retrieval, manipulation, and
management of data using database management systems (DBMS).
TYPES OF DATABASES
1. Relational Databases (RDBMS): Structure: Tables with rows and columns. Examples:
MySQL, PostgreSQL, Oracle, SQL Server
2. NoSQL Databases: Types: Document, Key-Value, Column-family, Graph.
Examples: MongoDB, Cassandra, Redis, CouchDB
3. Object-Oriented Databases: Structure: Data stored as objects. Examples:
db4o, ObjectDB
4. Hierarchical Databases: Structure: Tree-like structure with parent-child
relationships. Examples: IBM Information Management System (IMS)
5. Network Databases: Structure: Graph-like structure allowing multiple parent
records. Examples: Integrated Data Store (IDS), CA-IDMS

DATABASE MANAGEMENT SYSTEM (DBMS)


A Database Management System (DBMS) is software that provides an interface for users
and applications to interact with a database, enabling efficient data storage, retrieval, and
management. It ensures data integrity, security, and concurrent access to the database.
APPLICATIONS OF DBMS
1. Customer Relationship Management (CRM): Managing customer interactions,
sales, and service records.
2. Healthcare: Storing patient records, medical histories, and treatment plans.
3. E-commerce: Handling product catalogs, customer information, and order processing.
4. Banking Systems: Managing customer accounts, transactions, and financial records.

DATABASE SCHEMA
A database schema is the blueprint or logical structure of a database that defines how
data is organized and how the data items relate to one another. It includes the
definitions of tables, columns, data types, indexes, and the relationships between tables.
TYPES OF DB SCHEMA
1. Physical Schema: Defines the physical storage structure of the database.
2. Logical Schema: Describes the logical structure of the database.
3. External Schema: Provides customized, secure, and efficient views of the data for
different users, enhancing both usability and security.

DISTRIBUTED DATABASE (DDB)

A distributed database (DDB) is a collection of multiple, interconnected databases spread
across different physical locations, but managed as a single system. It allows data to be
stored and processed on various networked computers, enhancing performance, reliability,
and scalability.

DISTRIBUTED DATABASE MANAGEMENT SYSTEM (DDBMS)

A distributed database management system (DDBMS) is software that manages a distributed
database, ensuring that data distributed across multiple locations is stored, retrieved,
and updated efficiently and securely.
FEATURES OF DDBMS
1. Data fragmentation
2. Data distribution
3. Replication and synchronization
4. Concurrency control
5. Support for CRUD operations (create, read, update, and delete)

ADVANTAGES OF DDBMS
1. Improved Performance: Distributes data across multiple locations, allowing
for parallel processing and faster query response times.
2. Enhanced Reliability and Availability: Provides fault tolerance and disaster
recovery by replicating data across different sites, ensuring continuous access even
if some sites fail.
3. Scalability: Easily scales to handle increased data and user load by adding more
nodes or sites to the system without significant reconfiguration.
4. Data Localization: Reduces data access latency by placing data closer to where
it is most frequently used, improving overall system efficiency.
5. Resource Sharing: Allows multiple locations to share data and computing
resources, leading to better resource utilization and cost efficiency.

CENTRALIZED AND DISTRIBUTED DATABASES


DISTRIBUTED DATABASES AND CLIENT-SERVER ARCHITECTURE
A distributed database (DDB) is a collection of multiple logically related databases
distributed over a computer network.
A distributed database management system (DDBMS) is a software system that
manages a distributed database while making the distribution transparent to the user.

ADVANTAGES OF DISTRIBUTED DATABASES

1. Data Management Transparency: Users don’t need to know where data is physically
stored across different locations. They interact with the database as if all data is in one
place, simplifying access and management.
Transparency Types:
• Operational Transparency: Users don’t need to know the details of how the
network operates.
• Location Transparency: Users can access data and issue commands from any location
without needing to know where the data is stored.
• Naming Transparency: Users can access objects like files or records by name
from any location in the system.
• Replication Transparency: Data may be copied and stored at multiple sites to reduce
access time and enhance performance, without users needing to know that the
copies exist.
• Fragmentation Transparency: Data can be split into fragments (rows or
columns) and stored in different locations while appearing as a single entity to the
user.
2. Increased Reliability and Availability: A distributed database system has multiple
nodes, so if one fails, others continue to operate, ensuring continued service.
3. Improved Performance: A distributed database system fragments data to keep it
closer to where it's needed. This reduces the time required for data access and
modification, enhancing performance.
4. Easier Expansion (Scalability): New nodes can be added to the system without
disrupting the existing setup, allowing for easy scalability and expansion of the database
system.

DATA FRAGMENTATION
Data Fragmentation refers to the process of dividing a large database into smaller,
manageable pieces or fragments. This technique is used to improve the performance and
efficiency of database systems.
Types:
• Horizontal Fragmentation: Divides data by tuples/rows. Each fragment contains
a subset of rows from the original table, based on some criteria (e.g., regional
data).
• Vertical Fragmentation: Divides data by columns. Each fragment includes a
subset of columns, useful for different applications needing different data
attributes.
• Hybrid Fragmentation: Combines both horizontal and vertical fragmentation to
further refine data distribution.
CORRECTNESS OF FRAGMENTATION
The correctness rules ensure that data is divided accurately and can be correctly
reconstructed.
• Completeness: All data from the original database must be present in the
fragments. No data is lost during the fragmentation process.
• Reconstruction: The original database must be accurately rebuilt from its
fragments. You should be able to piece all the fragments back together to get the
complete, original database.
• Disjointness: Each fragment should be unique and not overlap with others. No
data should be duplicated across fragments; each piece of data appears in only one
fragment. (In vertical fragmentation the primary key is the exception: it is repeated
in every fragment so that the fragments can be joined.)
Reconstruction uses the union operation for horizontal fragments and the join operation
for vertical fragments.
These rules ensure that the fragmented database works correctly and that all data can be
accurately and efficiently managed.
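
The correctness rules can be checked on a toy table. The sketch below uses plain Python lists of dictionaries rather than a real DBMS; the employees table, its columns, and the fragment names are hypothetical. It fragments the table horizontally by region and vertically by column, then rebuilds it with a union and a join on the key.

```python
# Hypothetical relation: one dict per row, "id" is the primary key.
employees = [
    {"id": 1, "name": "Amina", "region": "North", "salary": 500},
    {"id": 2, "name": "Bello", "region": "South", "salary": 650},
    {"id": 3, "name": "Chidi", "region": "North", "salary": 700},
]


def by_id(rows):
    return sorted(rows, key=lambda r: r["id"])


# Horizontal fragmentation: split rows by a criterion (region).
north = [r for r in employees if r["region"] == "North"]
south = [r for r in employees if r["region"] != "North"]

complete = all(r in north or r in south for r in employees)  # no row lost
disjoint = not any(r in south for r in north)                # no row twice
rebuilt_from_rows = by_id(north + south)                     # union


# Vertical fragmentation: split columns, keeping the key in each fragment.
def project(rows, columns):
    return [{c: r[c] for c in columns} for r in rows]


contact = project(employees, ["id", "name", "region"])
payroll = project(employees, ["id", "salary"])

# Reconstruction by joining the fragments on the shared key.
salary_by_id = {r["id"]: r for r in payroll}
rebuilt_from_columns = by_id({**c, **salary_by_id[c["id"]]} for c in contact)

print(complete, disjoint)                 # True True
print(rebuilt_from_rows == employees)     # True
print(rebuilt_from_columns == employees)  # True
```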
Derived horizontal fragmentation: applies the partitioning of a primary relation to
secondary relations that are related to it through foreign keys.

FRAGMENTATION SCHEMA
A fragmentation schema is the structured plan for how a database is divided into
fragments. The fragments can be formed horizontally (by rows), vertically (by columns),
or by a combination of both.

ALLOCATION SCHEMA
An allocation schema describes how the fragments of a database are distributed and
stored across different locations or nodes.

DATA REPLICATION
Data replication involves creating copies of the database and storing them at multiple
locations or sites.
• Full Replication: The entire database is duplicated and stored at each site.
• Partial Replication: Only specific parts or subsets of the database are replicated
to selected sites.
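
An allocation schema can be pictured as a mapping from each fragment to the sites that hold a copy of it. The sketch below is an illustration only: the site names, fragment names, and both helper functions are hypothetical. It classifies the replication level and shows which fragments stay reachable when a site fails.

```python
# Hypothetical sites and allocation schema (fragment -> sites holding a copy).
sites = {"Kano", "Lagos", "Abuja"}
allocation = {
    "north_rows": {"Kano"},
    "south_rows": {"Lagos"},
    "payroll": {"Kano", "Lagos", "Abuja"},
}


def replication_level(allocation, sites):
    """'full' if every fragment is at every site, 'partial' if at least one
    fragment has more than one copy, otherwise 'none'."""
    if all(placed == sites for placed in allocation.values()):
        return "full"
    if any(len(placed) > 1 for placed in allocation.values()):
        return "partial"
    return "none"


def reachable_fragments(allocation, failed_sites):
    """Fragments that still have at least one copy at a working site."""
    return sorted(fragment for fragment, placed in allocation.items()
                  if placed - failed_sites)


print(replication_level(allocation, sites))       # partial
print(reachable_fragments(allocation, {"Kano"}))  # ['payroll', 'south_rows']
```

In this example only the replicated fragment survives the loss of any single site, which illustrates why replication improves availability.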

TYPES OF DISTRIBUTED DATABASE SYSTEMS


1. Homogeneous Database System: a system in which all the sites or nodes
use the same database management software and follow the same database
structure and protocols.
2. Heterogeneous Database System: a system in which the sites or nodes use
different database management systems (DBMS) or structures. There are two main
types:
• Federated Database System: Each site operates its own distinct DBMS.
• Multidatabase System: Sites use various DBMSs, and there is no unified
global schema.
ISSUES IN FEDERATED DATABASE MANAGEMENT SYSTEMS (FDBMS)
The main challenges are:
• Differences in Data Models: Various sites may use different data models, which
can complicate data integration and management.
• Differences in Constraints: Each site might have its own data access and
processing constraints, leading to inconsistencies and challenges in enforcing
uniform data handling policies.
• Differences in Query Languages: Sites may use different versions or types of
query languages, making it difficult to standardize and execute queries across the
federated system.

CONCURRENCY CONTROL AND RECOVERY


Key issues or problems encountered in distributed databases:
• Dealing with Multiple Copies of Data Items
• Failure of Individual Sites
• Communication Link Failure
• Distributed Commit
• Distributed Deadlock

CONCURRENCY CONTROL
Distributed Concurrency Control: Managing how multiple transactions access and
modify the same data without conflicting.
Primary Site Technique: Designate one site as the primary coordinator for managing
transactions. This site oversees the process and ensures consistency across the
distributed database.

TRANSACTION MANAGEMENT
• Concurrency Control and Commit: One site (a central coordinator) manages how
transactions are handled to ensure that multiple transactions don't conflict with
each other.
• Two-Phase Locking: This is a method to manage how transactions access data.

ADVANTAGES OF TRANSACTION MANAGEMENT AT A PRIMARY SITE

• Simple Implementation and Management
• Efficient Data Access
DISADVANTAGES OF TRANSACTION MANAGEMENT AT A PRIMARY SITE
• Overloading of the Primary Site
• Single Point of Failure
• Need for a Backup Site (to take over if the primary site fails)

PRIMARY COPY TECHNIQUE

In this approach, instead of a whole site, one copy of each data item is designated as its
primary copy. To lock a data item, only the primary copy of that data item is locked.
• Advantages: Since primary copies are distributed at various sites, a single site is
not overloaded with locking and unlocking requests.
• Disadvantages: Identification of a primary copy is complex. A distributed
directory must be maintained, possibly at all sites.
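
The primary copy idea can be sketched as a directory that maps each data item to the site holding its primary copy, with locks taken only there. This is a simplified, single-process illustration, not a real distributed protocol: the class PrimaryCopyLocks, its methods, and the site and item names are hypothetical, and only exclusive locks are modeled.

```python
from collections import Counter


class PrimaryCopyLocks:
    """Toy lock manager: each data item is locked at its primary copy only."""

    def __init__(self, directory):
        self.directory = dict(directory)  # item -> site of the primary copy
        self.locks = {}                   # (site, item) -> holding transaction

    def lock(self, txn, item):
        site = self.directory[item]       # directory lookup finds the primary
        holder = self.locks.get((site, item))
        if holder is not None and holder != txn:
            return False                  # another transaction holds the lock
        self.locks[(site, item)] = txn
        return True

    def unlock(self, txn, item):
        site = self.directory[item]
        if self.locks.get((site, item)) == txn:
            del self.locks[(site, item)]

    def load_per_site(self):
        """Number of locks currently held at each site."""
        return Counter(site for site, _item in self.locks)


# Hypothetical directory: primary copies are spread over two sites.
manager = PrimaryCopyLocks({"x": "Kano", "y": "Lagos"})
print(manager.lock("T1", "x"))  # True
print(manager.lock("T2", "x"))  # False (T1 holds x at Kano)
print(manager.lock("T2", "y"))  # True  (handled at Lagos, not Kano)
print(dict(manager.load_per_site()))  # {'Kano': 1, 'Lagos': 1}
manager.unlock("T1", "x")
print(manager.lock("T2", "x"))  # True
```

The per-site load shows the advantage noted above: lock requests are spread across sites instead of all arriving at one primary site. The directory lookup is the extra cost.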

CLIENT-SERVER DATABASE ARCHITECTURE

A client-server database architecture consists of clients running client software, a set
of servers that provide all database functionality, and a reliable communication
infrastructure.
Functions:
1. Server Responsibilities: Manages local data and provides database services at
each site.
2. Client Responsibilities: Handles most distribution functions and user interactions.
3. Communication Management: Facilitates reliable communication between
clients and servers.
Processing SQL Queries:
1. Query Parsing: Client breaks down a user query into sub-queries.
2. Execution: Each sub-query is sent to the appropriate server for processing.
3. Result Collection: Servers process sub-queries and return results to the client.
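
The three query-processing steps can be sketched with in-memory data standing in for the servers. This is a simplified illustration under stated assumptions: there is no real network or SQL parser, a Python predicate stands in for the query, and the names servers, server_execute, and client_query are hypothetical.

```python
# Hypothetical servers: each site manages its own local rows.
servers = {
    "Kano": [{"id": 1, "name": "Amina", "salary": 500},
             {"id": 3, "name": "Chidi", "salary": 700}],
    "Lagos": [{"id": 2, "name": "Bello", "salary": 650}],
}


def server_execute(site, predicate):
    """Server side: run one sub-query against the site's local data."""
    return [row for row in servers[site] if predicate(row)]


def client_query(predicate):
    """Client side: decompose, dispatch, then merge the partial results."""
    # 1. Query parsing: one sub-query per site that may hold matching rows.
    subqueries = [(site, predicate) for site in servers]
    # 2. Execution: each sub-query goes to the appropriate server.
    partial_results = [server_execute(site, p) for site, p in subqueries]
    # 3. Result collection: the client combines what the servers return.
    merged = [row for part in partial_results for row in part]
    return sorted(merged, key=lambda row: row["id"])


high_paid = client_query(lambda row: row["salary"] > 600)
print([row["name"] for row in high_paid])  # ['Bello', 'Chidi']
```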
