Midterm Exam FDS PDF
What is a Database?
Related data is a group or collection of data that shares similar properties or attributes. A database
is a collection of such related data.
Data is raw, unorganized facts that need to be processed. Data can be something simple and seemingly
random and useless until it is organized.
The DBMS is a general-purpose, application-independent software system that facilitates the processes
of:
1. Defining - Defining a database involves specifying the data types, structures, and constraints
of the data to be stored in the database. The database definition or descriptive information is
also stored by the DBMS in the form of a database catalog or dictionary.
2. Constructing - Constructing the database is the process of storing the data itself on some
storage medium that is controlled by the DBMS.
3. Manipulating - Manipulating a database includes such functions as querying the database to
retrieve specific data, updating the database to reflect changes in the miniworld, deleting data,
and generating reports from the data.
AMM
4. Sharing - Sharing a database allows multiple users and application programs to access the
database simultaneously. An application program accesses the database by sending queries or
requests for data to the DBMS.
• STUDENTs
• COURSEs
• SECTIONs (of COURSEs)
• DEPARTMENTs (academic)
• INSTRUCTORs
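The defining, constructing, and manipulating steps can be sketched with Python's built-in sqlite3 module, with SQLite standing in for the DBMS. This is only a minimal illustration: the STUDENT columns (student_id, name, major) and the sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structures, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        major      TEXT
    )
    """
)

# Constructing: store the data itself under DBMS control.
conn.executemany(
    "INSERT INTO student (student_id, name, major) VALUES (?, ?, ?)",
    [(1, "Ana", "CS"), (2, "Ben", "Math")],  # hypothetical sample rows
)

# Manipulating: update a row, then query the table.
conn.execute("UPDATE student SET major = 'CS' WHERE student_id = 2")
cs_students = conn.execute(
    "SELECT name FROM student WHERE major = 'CS' ORDER BY student_id"
).fetchall()
print(cs_students)
```

Sharing is not shown, since it needs several connections or users working on the same database at once.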
BASIC DEFINITION
• Data - Known facts that can be recorded and have an implicit meaning.
• Database Management System (DBMS) - A software package/ system to facilitate the creation
and maintenance of a computerized database.
• Database System: The DBMS software together with the data itself. Sometimes, the
applications are also included.
Characteristics of the Database Approach
1. Self-Describing Nature of a Database System
• A database system is referred to as self-describing because it not only contains the database
itself, but also metadata which defines and describes the data and relationships between
tables in the database.
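The self-describing property can be observed in SQLite, which keeps its catalog in a table named sqlite_master. The sketch below assumes a hypothetical COURSE table and simply reads back the metadata that the DBMS stored about it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
)

# The catalog table sqlite_master holds one row per database object.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
print(catalog, columns)
```

A program can therefore discover the structure of the database by asking the DBMS, without the structure being hard-coded in the program.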
1. System Analysts and Application Programmers (Software Engineers) - responsible for the
design, structure, and properties of the database and responsible for writing application
programs that interact with the database.
2. End User - they use the data for queries, reports, and some of them update the database
content.
They are categorized as:
• Casual End Users - They occasionally access the database but may need different
information each time. These users have great knowledge of query language. They do
not write programs, but they can interact with the system by writing queries.
• Naïve End Users - They are users who do not have any knowledge of databases. Their
task is simply to use the developed application and get the desired results.
• Sophisticated End Users - These are engineers, scientists, business analysts and others
who thoroughly familiarize themselves with the facilities of DBMS to implement their
own application to meet their complex requirements.
• Standalone Users - They are users who interact with the database directly via an online
terminal or indirectly through menu-based or graphics-based interfaces.
DATABASE USERS - WORKERS BEHIND THE SCENE
1. DBMS system designers and implementers - design and implement the DBMS modules and
interfaces as a software package.
2. Tool developers - design and implement the tools – the software packages that facilitate
database modelling and design.
3. Operators and maintenance personnel - They are responsible for the actual running and
maintenance of the hardware and software environment for the database system. Also called
the system administration personnel.
E. Extending Database Capabilities for New Applications
Chapter 2: Database System Concepts and Architecture
Data Models
• A set of concepts to describe the structure of a database, and certain constraints that the
database should obey.
• An abstract model that organizes elements of data and standardizes how they relate to one
another and to the properties of real-world entities.
• The term "data model" can refer to two distinct but closely related concepts.
• Conceptual (high-level, semantic) data models: Provide concepts that are close to the way
many users perceive data. (Also called entity-based or object-based data models.)
• Physical (low-level, internal) data models: Provide concepts that describe details of how data is
stored in the computer.
• Implementation (representational, external) data models: Provide concepts that fall between
the above two, balancing user views with some computer storage details.
• Conceptual Data Model: This Data Model defines WHAT the system contains. This model is
typically created by Business stakeholders and Data Architects. The purpose is to organize,
scope and define business concepts and rules.
• Logical Data Model: Defines HOW the system should be implemented regardless of the DBMS.
This model is typically created by Data Architects and Business Analysts. The purpose is to
develop a technical map of rules and data structures.
• Physical Data Model: This Data Model describes HOW the system will be implemented using a
specific DBMS. This model is typically created by DBAs and developers. The purpose is the
actual implementation of the database.
• The Logical Data Model is used to define the structure of data elements and to set
relationships between them. The logical data model adds further information to the conceptual
data model elements. The advantage of using a Logical data model is to provide a foundation to
form the base for the Physical model. However, the modeling structure remains generic.
Schemas
Schema Diagram
• A database instance, on the other hand, is a snapshot of a database as it existed at a particular
time. Thus, database instances can change over time, whereas a database schema is usually
static since it’s difficult to change the structure of a database once it is operational.
• Instance: The data stored in a database at a particular moment in time is called an instance of
the database.
1. Logical Schema
• It describes the database design at the logical level.
• Can be defined as the design of the database at its logical level. Programmers and the
database administrator (DBA) work at this level. Here, data is described as certain types of
data records that can be stored in the form of data structures. However, the internal details
(such as the implementation of a data structure) remain hidden at this level.
2. Physical Schema
• It describes the database design at the physical level.
• Can be defined as the design of a database at its physical level. This level expresses how
data is stored in blocks of storage.
View schema can be defined as the design of the database at the view level, which generally
describes end-user interaction with database systems.
LOGICAL SCHEMA
Three-Schema Architecture
• Internal schema at the internal level to describe physical storage structures and access paths.
Typically uses a physical data model.
• Conceptual schema at the conceptual level to describe the structure and constraints for the
whole database for a community of users. Uses a conceptual or an implementation data model.
• External schemas at the external level to describe the various user views. Usually uses the same
data model as the conceptual level.
1. Mappings among schema levels are needed to transform requests and data. Programs refer to an
external schema, and the DBMS maps them to the internal schema for execution.
• The process of transforming requests and results between the three levels is called mapping.
• This architecture makes the database abstract. It is used to hide the details of how data is
physically stored in a computer system, which makes it easier to use for a user.
• This architecture allows each user to access the same database with a different customized
view of data.
• This architecture enables a database admin to change the storage structure of the database
without affecting the user currently on the system.
Data Independence
• Logical Data Independence: The capacity to change the conceptual schema without having to
change the external schemas and their application programs.
• Physical Data Independence: The capacity to change the internal schema without having to
change the conceptual schema.
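Physical data independence can be illustrated with a small sqlite3 sketch. The EMPLOYEE table, the view name employee_directory, and the index name are assumptions for the example: a view acts as a simple external schema, and adding an index changes only the internal level, so the same query returns the same result before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana", 5000.0), (2, "Ben", 4200.0)],  # hypothetical sample rows
)

# External level: a view exposes only what one group of users needs.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee")

query = "SELECT name FROM employee_directory ORDER BY name"
before = conn.execute(query).fetchall()

# Internal level change: add an index, which is a new access path.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# The query text and its result are unchanged by the storage-level change.
after = conn.execute(query).fetchall()
print(before == after)
```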
DBMS Languages
• Data Definition Language (DDL): Used by the DBA and database designers to specify the
conceptual schema of a database.
➢ storage definition language (SDL) is used to define internal schemas.
➢ view definition language (VDL) is used to define external schemas.
• Data Manipulation Language (DML): Used to specify database retrievals and updates.
➢ DML commands (data sublanguage) can be embedded in a general-purpose
programming language.
➢ The general-purpose programming language is called the host language, such as COBOL, C,
or an assembly language.
➢ Alternatively, stand-alone DML commands can be applied directly; a DML used this way is
called a query language.
2 Types of DML
• High Level or Non-procedural Languages: e.g., SQL, are set-oriented and specify what data to
retrieve rather than how to retrieve it. Also called declarative languages.
• Low Level or Procedural Languages: record-at-a-time; they specify how to retrieve data and
include constructs such as looping.
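Procedural Languages: C, Pascal, Fortran, BASIC, COBOL, Ada, assembly language, etc.
Non-Procedural (Declarative) Languages: SQL, QBE (Query-By-Example), Prolog, Datalog, etc.
General-purpose languages such as Java, C++, Python, JavaScript, Ruby, Scala, PHP, and C# are
primarily imperative or object-oriented, and Lisp and OCaml are functional languages, so none of
them are non-procedural in the DML sense.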
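The difference between the two types of DML can be sketched by answering the same question twice: once with a set-oriented SQL statement that says what is wanted, and once with a record-at-a-time loop that spells out how to find it. The PRODUCT table and its rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("pen", 1.5), ("book", 12.0), ("bag", 30.0)],  # hypothetical sample rows
)

# Non-procedural, set-oriented: state WHAT data is wanted.
declarative = conn.execute(
    "SELECT name FROM product WHERE price > 10 ORDER BY name"
).fetchall()

# Procedural, record-at-a-time: state HOW to find it, using a loop.
procedural = []
for name, price in conn.execute("SELECT name, price FROM product"):
    if price > 10:
        procedural.append((name,))
procedural.sort()

print(declarative == procedural)
```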
DBMS Interfaces
1. Stand-alone query language interfaces.
Example: Entering SQL queries at the DBMS interactive SQL interface (e.g., SQL*Plus in ORACLE)
• Programmer interfaces for embedding DML in programming languages:
➢ Pre-compiler Approach
➢ Procedure (Subroutine) Call Approach
2. Menu-based interfaces: These interfaces present the user with lists of options called menus.
Pull-down menus are an important technique.
3. Graphical user interfaces: A GUI displays a schema to the user in diagrammatic form. Most GUIs
use a pointing device such as a mouse.
4. Natural language interfaces: These interfaces accept requests written in English or some other
language.
5. Forms-based interfaces: These interfaces display a form to each user. Users can fill out all the
entries to insert new data.
6. Interfaces for parametric users: Parametric users, such as bank tellers, often have a small set of
operations that they must perform repeatedly. For these users a small set of abbreviated
commands is included, with the goal of minimizing the number of keystrokes required for each
request.
7. Interfaces for the DBA: Most database systems contain privileged commands that can be used
only by the DBA staff.
Database System Utilities
1. Loading – A loading utility is used to load existing text/sequential files into the database.
Source format and desired target file are specified to the utility, and the utility reformats the
data to load into a table.
2. Backup – A backup utility creates a backup copy of the database, usually by dumping the
database onto tape. The copy can be used to restore the database in case of failure. An
incremental backup, which records only the changes since the last backup, can also be used.
3. File Reorganization – This utility reorganizes database files into different file organizations to
improve performance.
4. Performance Monitoring – This utility monitors database usage and provides statistics to the
DBA. The DBA uses the statistics in making decisions such as whether or not to reorganize files
or whether to add or drop indexes to improve performance.
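The loading and backup utilities can be imitated in a few lines of Python. This is a sketch, not a real DBMS utility: the CSV text, the STUDENT table, and the variable names are assumptions, and an in-memory string stands in for a sequential file.

```python
import csv
import io
import sqlite3

# Hypothetical contents of a sequential text file.
source = io.StringIO("student_id,name\n1,Ana\n2,Ben\n")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# Loading: reformat each text record into a row of the target table.
rows = [(int(r["student_id"]), r["name"]) for r in csv.DictReader(source)]
db.executemany("INSERT INTO student VALUES (?, ?)", rows)
db.commit()

# Backup: copy the whole database into a second connection.
backup_db = sqlite3.connect(":memory:")
db.backup(backup_db)

# The copy can be queried on its own, as it would be after a restore.
restored = backup_db.execute(
    "SELECT name FROM student ORDER BY student_id"
).fetchall()
print(restored)
```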
Other Tools
CHAPTER 3: DATA MODELING USING ENTITY-RELATIONSHIP (ER) MODEL
TYPE OF ATTRIBUTES
• Composite Attributes: Can be divided into smaller subparts, which represent more basic
attributes with independent meaning.
• Address: An address can be divided into sub-parts such as street name, city, state, and
zip code.
• Name: A name can be divided into sub-parts such as first name, middle name, and last
name.
• Simple/Atomic Attributes: Attributes that are not divisible, that is, they cannot be divided into
smaller sub-parts.
• Age: Age is a single value that cannot be further divided into sub-parts.
• Gender: Gender is a single value that is either male or female and cannot be divided.
• Single-valued: Also known as unary attributes. Attributes that can have only one value or
occurrence for a single entity.
• Multi-valued: Attributes that can have multiple values or occurrences for a single entity.
• Stored Attributes: Attributes that are physically stored in the database or a data storage
system.
• Derived Attributes: Attributes whose value is calculated from other attributes.
• Age: Age can be calculated by subtracting a person's date of birth from the current date.
• Total cost: Total cost can be calculated by multiplying the price per unit by the quantity
purchased.
• BMI: BMI (body mass index) can be calculated by dividing a person's weight in kilograms
by the square of their height in meters.
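The split between stored and derived attributes can be sketched with a small Python class. The class name Person and its field names are assumptions for the example. Only the birth date, weight, and height are stored, while age and BMI are computed on request using the formulas above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    # Stored attributes
    birth_date: date
    weight_kg: float
    height_m: float

    def age(self, today: date) -> int:
        """Derived: whole years between birth_date and today."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

    @property
    def bmi(self) -> float:
        """Derived: weight in kilograms divided by height in meters squared."""
        return self.weight_kg / self.height_m ** 2


p = Person(birth_date=date(2000, 6, 15), weight_kg=70.0, height_m=1.75)
print(p.age(date(2024, 6, 14)), round(p.bmi, 2))
```

Because the derived values are never stored, they cannot become inconsistent with the stored attributes they depend on.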
NULL Values: A particular entity may not have an applicable value for an attribute. NULL values
represent missing, unknown, or not applicable data in a database or data storage system.
• A person's middle name may be NULL if the person does not have a middle name.
• A customer's phone number may be NULL if the customer did not provide a phone
number.
• A product's weight may be NULL if the weight information is not available.
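In SQL, a NULL cannot be found with an ordinary equality test, because a comparison with NULL is never true. The sketch below uses a hypothetical CUSTOMER table to show that IS NULL is the test to use for a missing phone number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT NOT NULL, phone TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ana", "555-0100"), ("Ben", None)],  # Python None is stored as NULL
)

# "= NULL" matches nothing, because NULL = NULL does not evaluate to true.
with_equals = conn.execute(
    "SELECT name FROM customer WHERE phone = NULL"
).fetchall()

# IS NULL is the correct test for a missing value.
with_is_null = conn.execute(
    "SELECT name FROM customer WHERE phone IS NULL"
).fetchall()

print(with_equals, with_is_null)
```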
Complex Attributes: Formed by nesting composite attributes and multi-valued attributes in an
arbitrary way. They can be further divided into smaller sub-attributes.
• Address: An address can be divided into sub-attributes such as street name, city, state,
and zip code.
• Product: A product can be divided into sub-attributes such as name, description, price,
and manufacturer.
Entity Type: Defines a collection (or set) of entities that have the same attributes. Each entity type in
the database is described by its name and attributes. Used to represent a group of similar entities in a
database or data storage system.
Entity Set: a collection of entities of the same type that share common attributes and relationships.
Example:
CAR
Registration(RegistrationNumber, State), VehicleID, Make, Model, Year, (Color)
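One way to picture the CAR entity type is as a Python data class. This is a sketch under assumptions: it treats Registration as a composite attribute and Color as a multi-valued attribute, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Registration:
    # Composite attribute: built from two simpler attributes.
    registration_number: str
    state: str


@dataclass
class Car:
    registration: Registration
    vehicle_id: str
    make: str
    model: str
    year: int
    colors: List[str] = field(default_factory=list)  # multi-valued attribute


# One entity of the CAR entity type (hypothetical values).
car = Car(
    registration=Registration("ABC 123", "TEXAS"),
    vehicle_id="TK629",
    make="Ford",
    model="Mustang",
    year=2004,
    colors=["red", "black"],
)
print(car.registration.state, car.colors)
```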
Key Attributes
Key Attributes: Attributes whose values are distinct/unique for each individual entity in the entity
set. They uniquely identify an entity in a database or data storage system.
Weak Entity Types – An entity type whose entities cannot be uniquely identified by their own
attributes alone. It is drawn as a double-lined rectangle (entity) connected to a double-lined
diamond (identifying relationship).
• Order Item: An order item is a weak entity type that depends on the Order entity type for
context and a unique identifier.
• Prescription Line: A prescription line is a weak entity type that depends on the Prescription
entity type for context and a unique identifier.
RELATIONSHIP
• Relationship represents an association between two or more entities.
Example: EMPLOYEE works on PROJECT
• A role name signifies the role that a participating entity from the entity type plays in each
relationship instance.
Example: In a database for a company, a many-to-many relationship can exist between the
entities of employee and project. Each employee can be associated with multiple projects, and
each project can involve multiple employees. In this relationship, the role names can be "works
on" for the employee entity and "is worked on by" for the project entity. These role names
describe the relationship between the entities and provide clarity and understanding to the
users of the ERD.
• Recursive Relationship: the same entity type participates more than once in a relationship
type, in different roles.
Example: In a database for an organization, a recursive relationship can exist on the employee
entity: each employee can have a manager, who is also an employee within the same
organization. In this relationship, the role names can be "reports to" for the employee role
and "manages" for the manager role. This recursive relationship allows a hierarchy to be
established within the organization, where each employee is linked to their respective
manager.
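In a relational database, a recursive relationship is commonly implemented as a foreign key that points back to the same table. The sketch below assumes an EMPLOYEE table with a manager_id column, and a self-join reads the table once in each role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        manager_id INTEGER REFERENCES employee (emp_id)  -- NULL for the top manager
    )
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana", None), (2, "Ben", 1), (3, "Cy", 1)],  # hypothetical sample rows
)

# Self-join: the same table plays two roles, employee (e) and manager (m).
reports = conn.execute(
    """
    SELECT e.name, m.name
    FROM employee AS e
    JOIN employee AS m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
    """
).fetchall()
print(reports)
```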
An identifying relationship is a relationship between two entities in which an instance of a child entity
is identified through its association with a parent entity, which means the child entity is dependent on
the parent entity for its identity and cannot exist without it.
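An identifying relationship can be sketched in SQL by making the parent's key part of the child's primary key. The table and column names below (orders, order_item, line_no) are assumptions based on the Order Item example: line_no is unique only within one order, and deleting the order removes its items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE order_item (
        order_id INTEGER NOT NULL REFERENCES orders (order_id) ON DELETE CASCADE,
        line_no  INTEGER NOT NULL,  -- partial key: unique only within one order
        product  TEXT NOT NULL,
        PRIMARY KEY (order_id, line_no)
    )
    """
)
conn.execute("INSERT INTO orders VALUES (100)")
conn.executemany(
    "INSERT INTO order_item VALUES (?, ?, ?)",
    [(100, 1, "pen"), (100, 2, "book")],  # hypothetical sample rows
)
item_count = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]

# The child cannot exist without its parent: deleting the order removes its items.
conn.execute("DELETE FROM orders WHERE order_id = 100")
remaining = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
print(item_count, remaining)
```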
ER-DIAGRAM NOTATION
✓ A use case diagram is used to determine the main processes and subprocesses of the project.
✓ A data flow diagram is used to determine the data flowing in and out of your project.
Chapter 4: THE RELATIONAL DATA MODEL AND RELATIONAL DATABASE CONSTRAINTS
Relational model
1. Table: In the relational model, data is stored in tables, which are composed of rows and columns.
Tables are also known as relations.
2. Rows: Each row in a table represents a single record or instance of an entity and is also known as a
tuple.
3. Columns: Columns represent the attributes or properties of the entity being represented. Each
column contains data of a specific type, such as text, number, or date.
4. Primary key: A primary key is a column or set of columns that uniquely identifies each row in a
table. It is used to enforce data integrity and ensure that each row can be uniquely identified.
5. Foreign key: A foreign key is a column or set of columns in one table that refers to the primary key
of another table. It is used to create relationships between tables.
6. Relationship: A relationship is a connection between two or more tables, typically created using
foreign keys. There are different types of relationships, including one-to-one, one-to-many, and
many-to-many.
Domain vs Relation
A domain refers to the set of allowable values for a particular attribute or column in a table. A domain
has a logical definition because it represents a set of values that are valid for a specific attribute or
column.
A relation, on the other hand, is a mathematical concept that is used to represent the structure of data
in a database. In the relational model, a relation is a set of tuples (or rows) that have the same attributes
(or columns).
Example – A relation STUDENT
Definition Summary
Informal Term    Formal Term
Table            Relation
Row              Tuple
Characteristics of Relations
Rows and columns: A relation is composed of rows and columns. Each row represents a single record or
instance of an entity, while each column represents an attribute or property of the entity.
Unique rows: Each row in a relation must be unique, meaning that no two rows can have exactly the
same values for all columns. This is enforced by the use of a primary key, which ensures that each row
can be uniquely identified.
Attribute domains: Each column in a relation has a defined domain, which specifies the allowable
values and data type for the attribute. This helps ensure data consistency and integrity.
No duplicate columns: Each column in a relation must have a unique name, meaning that no two
columns can share the same name, although different columns may draw their values from the same
domain.
Order independence: The order of the rows and columns in a relation is not significant. This means
that the same data can be represented in different ways without affecting its meaning.
Operations: Relational databases support a set of operations that can be applied to relations, including
selecting, projecting, joining, and aggregating data. These operations can be used to query and
manipulate data in a relational database.
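Several of these characteristics follow directly from treating a relation as a set of tuples. The sketch below models a hypothetical STUDENT relation with a Python set: duplicate rows collapse, row order does not matter, and selection and projection become simple set comprehensions.

```python
from typing import NamedTuple


class Student(NamedTuple):
    student_id: int
    name: str


# A relation state is a set, so the repeated tuple is kept only once.
relation_a = {Student(1, "Ana"), Student(2, "Ben"), Student(1, "Ana")}

# The same tuples listed in a different order form the same relation.
relation_b = {Student(2, "Ben"), Student(1, "Ana")}

# Selection: keep the tuples that satisfy a condition.
selected = {t for t in relation_a if t.student_id > 1}

# Projection: keep only some attributes of each tuple.
projected = {(t.name,) for t in relation_a}

print(len(relation_a), relation_a == relation_b)
```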
Relational integrity constraints are rules that are applied to data in a relational database to ensure
that the data is consistent, accurate, and complete. They are used to maintain the quality and integrity
of data in a database, and to prevent data from being entered or modified in ways that violate the rules
of the database schema.
➢ Entity Integrity Constraints: These constraints ensure that each row in a table is unique and
can be identified using a primary key, so no primary key value may be NULL. For example, a
table of employees might have a primary key of employee ID, which would ensure that each
employee is uniquely identified.
➢ Referential Integrity Constraints: These constraints are used to ensure that relationships
between tables are maintained. For example, if a table of orders has a foreign key that
references a table of customers, the referential integrity constraint would ensure that every
order is associated with a valid customer.
➢ Domain Integrity Constraints: These constraints are used to ensure that data entered into a
column of a table meets certain criteria or falls within a specific domain. For example, a column
of dates might have a domain integrity constraint that only allows dates in the format "YYYY-
MM-DD".
➢ Check Constraints: These constraints are used to specify a condition that must be true for data
to be entered into a table. For example, a table of products might have a check constraint that
ensures that the price of a product is greater than zero.
➢ Assertion Constraints: These constraints are used to ensure that more complex conditions are
met. For example, an assertion constraint could ensure that the sum of all orders for a given
customer does not exceed their credit limit.
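The first constraint types can be tried out with sqlite3. The sketch below uses assumed tables (customer, product, orders) and a helper function, is_rejected, written only for this example. Assertions are left out because SQLite has no CREATE ASSERTION statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY                  -- entity integrity
    );
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        price      REAL NOT NULL CHECK (price > 0)       -- check constraint
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customer (customer_id)    -- referential integrity
    );
    INSERT INTO customer VALUES (1);
    """
)


def is_rejected(sql: str) -> bool:
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "duplicate primary key": is_rejected("INSERT INTO customer VALUES (1)"),
    "order for unknown customer": is_rejected("INSERT INTO orders VALUES (10, 99)"),
    "non-positive price": is_rejected("INSERT INTO product VALUES (1, 0)"),
    "valid order": is_rejected("INSERT INTO orders VALUES (11, 1)"),
}
print(results)
```

The three invalid statements are refused by the DBMS itself, so the rules hold no matter which application issues the statement.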
Check Constraints
Assertion Constraints
Key Constraints
A key constraint is a type of integrity constraint that ensures that data in a table is unique and
can be identified using a specific set of attributes or columns.
There are two types of key constraints: primary keys and foreign keys.
➢ Primary key: is a column or set of columns in a table that uniquely identifies each row in the
table. It must be unique and not null. Each table in a database must have one and only one
primary key. A primary key can be a single column or a combination of columns.
➢ Foreign key: is a column or set of columns in a table that refers to the primary key of another
table. It establishes a relationship between two tables in a database. Each foreign key value
must either match an existing primary key value in the referenced table, or be null if the
relationship is optional.
A relational database schema is the blueprint or plan that defines the structure of a database, including
the tables, columns, keys, and relationships between tables. It is a visual representation of the database
structure that helps developers and database administrators to understand the database and its
components.
The most common types of relational database schema
➢ Star schema: A database schema that organizes data into a central fact table, surrounded by
dimension tables. The fact table contains numeric measures or facts that are used to analyze
business data, while the dimension tables contain descriptive data that provide context for the
facts.
➢ Snowflake schema: similar to a star schema, but it contains normalized dimension tables. This
means that each dimension table is split into multiple related tables, reducing data redundancy,
and improving data consistency.
➢ Third normal form schema: A normalized database schema that eliminates data redundancy by
separating data into multiple related tables.
➢ Denormalized schema: A denormalized schema is a database schema that combines data from
multiple tables into a single table.
➢ Hybrid schema: A hybrid schema combines elements of two or more schema types, such as a
star schema and a snowflake schema.
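A star schema can be sketched with one fact table and two dimension tables. All table names, column names, and figures below are assumptions for the example. The query joins the fact table to a dimension to total the sales amount per product category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive context for the facts.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);

    -- Fact table: numeric measures plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product (product_id),
        date_id    INTEGER REFERENCES dim_date (date_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'books'), (2, 'toys');
    INSERT INTO dim_date    VALUES (1, 2023), (2, 2024);
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5);
    """
)

# Analyze the measure (amount) in the context of a dimension (category).
totals = conn.execute(
    """
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
print(totals)
```

In a snowflake schema, dim_product would itself be split further, for example by moving category into its own table.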
Populated Database State
A populated database state refers to a state of a database where it already contains data. In other
words, a populated database is a database that has already been populated with data, usually through
data entry or import from external sources.
➢ Create: Creating is the process of adding new database objects such as tables, views, indexes,
and constraints.
➢ Alter: Altering is the process of modifying existing database objects such as tables, views,
indexes, and constraints.
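Create and Alter correspond to the SQL statements CREATE and ALTER. A minimal sketch with sqlite3, assuming a hypothetical STUDENT table that later gains an email column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: add a new database object.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# Alter: modify the existing object by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

# PRAGMA table_info lists the columns; the name is the second field of each row.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
print(column_names)
```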