Information Management
(DBMS)
MODULE 1 – DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS
Definitions
Database: organized collection of logically related data
Data: stored representations of meaningful objects and events
o Structured: numbers, text, dates
o Unstructured: images, video, documents
Information: data processed to increase knowledge in the person using the data
Metadata: data that describes the properties and context of user data
Database Management System
A software system that is used to create, maintain, and provide controlled access to user databases
Advantages of the Database Approach
1. Program-data independence
2. Planned data redundancy
3. Improved data consistency
4. Improved data sharing
5. Increased application development productivity
6. Enforcement of standards
7. Improved data quality
8. Improved data accessibility and responsiveness
9. Reduced program maintenance
10. Improved decision support
Cost and Risks of the Database Approach
1. New, specialized personnel
2. Installation and management cost and complexity
3. Conversion costs
4. Need for explicit backup and recovery
5. Organizational conflict
Disadvantages of File Processing
Program-Data Dependence
o All programs maintain metadata for each file they use
Duplication of Data
o Different systems/programs have separate copies of the same data
Limited Data Sharing
o No centralized control of data
Lengthy Development Times
o Programmers must design their own file formats
Excessive Program Maintenance
o 80% of Information Systems budget
SOLUTION: The DATABASE Approach
Central repository of shared data
Data is managed by a controlling agent
Stored in a standardized, convenient form
Data models
o Graphical system capturing nature and relationship of data
o Enterprise Data Model
    High-level entities and relationships for the organization
o Project Data Model
    More detailed view, matching data structure in database or data warehouse
Entities
o Noun form describing a person, place, object, event, or concept
o Composed of attributes
Relationships
o Link between entities
o Usually one-to-many (1:M) or many-to-many (M:N)
Relational Databases
o Database technology involving tables (relations) representing entities and primary/foreign keys representing relationships
CASE Tools
o Computer-aided Software Engineering
Repository
o Centralized storehouse of metadata
Database Management System (DBMS)
o Software for managing the database
Database
o Storehouse of the data
Application Programs
o Software using the data
User Interface
o Text and graphical displays to users
Data/Database Administrators
o Personnel responsible for maintaining the database
System Developers
o Personnel responsible for designing databases and software
End Users
o People who use the applications and databases
Two Approaches to Database and IS Development
SDLC (System Development Life Cycle)
o Detailed, well-planned development process
o Time-consuming, but comprehensive
o Long development cycle
Prototyping
o Rapid Application Development (RAD)
o Cursory attempt at conceptual data modeling
o Define database during development of initial prototype
o Repeat implementation and maintenance activities with new prototype versions
Systems Development Life Cycle
1. Planning
o Purpose – Preliminary understanding
o Deliverable – Request for study
o Database Activity – Enterprise modeling and early conceptual data modeling
2. Analysis
o Purpose – Thorough requirements analysis and structuring
o Deliverable – Functional system specifications
o Database Activity – Thorough and integrated conceptual data modeling
3. Logical Design
o Purpose – Information requirements elicitation and structure
o Deliverable – Detailed design specifications
o Database Activity – Logical database design (transactions, forms, displays, views, data integrity, and security)
4. Physical Design
o Purpose – Develop technology and organizational specifications
o Deliverable – Program/data structures, technology purchases, organization redesigns
o Database Activity – Physical database design (define database to DBMS, physical data organization, database processing programs)
5. Implementation
o Purpose – Programming, testing, training, installation, documenting
o Deliverable – Operational programs, documentation, training materials
o Database Activity – Database implementation, including coded programs, documentation, installation and conversion
6. Maintenance
o Purpose – Monitor, repair, enhance
o Deliverable – Periodic audits
o Database Activity – Database maintenance, performance analysis and tuning, error corrections
Managing Projects
Project
o A planned undertaking of related
activities to reach an objective that has
a beginning and an end
o Initiated and planned in the planning
stage of SDLC
o Executed during analysis, design, and
implementation
o Closed at the end of implementation
Managing Projects: People Involved
Business analysts
Systems analysts
Database analysts and data modelers
Data/Database administrators
Project managers
Users
Programmers
Database architects
Other technical experts
Database Schema
External Schema
o User views
o Subsets of conceptual schema
o Can be determined from business-function/data entity matrices
o DBA determines schema for different users
o Different people have different views of the database; these are the external schema
Conceptual Schema
o E-R models
Internal Schema
o Logical structures
o Physical structures
o The internal schema is the underlying design and implementation
Evolution of Database
Driven by four main objectives:
o Need for program-data independence → reduced maintenance
o Desire to manage more complex data types and structures
o Ease of data access for less technical personnel
o Need for more powerful decision support platforms
Multitiered client/server database architecture
Business Rules
Naming Attributes
Defining Attributes
Simple and Composite Identifier Attributes
Multi-valued and Derived Attributes
o Multivalued – may take on more than one value for a given entity (or relationship) instance
o Derived – values can be calculated from related attribute values (not physically stored in the database)
Modeling Relationships
• Relationship Types vs. Relationship Instances
o The relationship type is modeled as lines between entity types; the instance is between specific entity instances
• Relationships can have attributes
o These describe features pertaining to the association between the entities in the relationship
• Two entities can have more than one type of relationship between them (multiple relationships)
• Associative Entity
o Combination of relationship and entity
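The multivalued and derived attributes defined earlier in this section can be illustrated with a minimal Python sketch. The Employee entity, its field names, and the sample values are hypothetical and not taken from the notes; the point is only that a multivalued attribute holds several values per instance, while a derived attribute is computed when needed instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Employee:
    # Hypothetical entity used only for illustration.
    name: str
    date_employed: date
    # Multivalued attribute: one employee may have several skills.
    skills: list[str] = field(default_factory=list)

    def years_employed(self, today: date) -> int:
        # Derived attribute: computed from date_employed, never stored.
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not happened yet.
        anniversary = (self.date_employed.month, self.date_employed.day)
        if (today.month, today.day) < anniversary:
            years -= 1
        return years


emp = Employee("Ana", date(2015, 6, 1), ["SQL", "Python"])
print(emp.skills)
print(emp.years_employed(date(2024, 3, 15)))
```

Passing the reference date as a parameter keeps the derived value reproducible; a real system would more likely compute it in a query or view.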
Degree of Relationships
• Binary Relationship
• Ternary Relationship
Cardinality Constraints
• Mandatory cardinalities
• One optional, one mandatory
• Optional cardinalities
Associative Entities
Multivalued attributes can be represented as relationships
• Enhanced ER model
o Extends original ER model with new modeling constructs
• Subtype
o A subgrouping of the entities in an entity type that has attributes distinct from those in other subgroupings
• Supertype
o A generic entity type that has a relationship with one or more subtypes
• Attribute Inheritance
o Subtype entities inherit values of all attributes of the supertype
o An instance of a subtype is also an instance of the supertype
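Attribute inheritance, as described above, behaves much like class inheritance in a programming language. The sketch below uses Python classes as an analogy only; the Employee and HourlyEmployee names and their attributes are assumptions made for illustration, and an EER model is a design notation, not code.

```python
class Employee:
    """Supertype: attributes shared by every employee (hypothetical)."""

    def __init__(self, emp_id: int, name: str) -> None:
        self.emp_id = emp_id
        self.name = name


class HourlyEmployee(Employee):
    """Subtype: adds an attribute that only hourly employees have."""

    def __init__(self, emp_id: int, name: str, hourly_rate: float) -> None:
        super().__init__(emp_id, name)  # take over the supertype's attributes
        self.hourly_rate = hourly_rate


worker = HourlyEmployee(7, "Ben", 18.5)
print(worker.name)                   # attribute inherited from the supertype
print(isinstance(worker, Employee))  # a subtype instance is also a supertype instance
```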
Example of Generalization
Example of specialization
Basic notation for supertype/subtype
Example of Total Specialization Rule
Example of Partial Specialization Rule
• Subtype Discriminator
o An attribute of the supertype whose values determine the target subtype(s)
o Disjoint
    a simple attribute with alternative values to indicate the possible subtypes
o Overlapping
    a composite attribute whose subparts pertain to different subtypes. Each subpart contains a Boolean value to indicate whether or not the instance belongs to the associated subtype
• Disjointness Constraints
o Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes
o Disjoint Rule
    An instance of the supertype can be only ONE of the subtypes
Example of Disjoint (Subtype Discriminator)
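The two discriminator styles described above can be sketched in Python. The subtype names, the discriminator codes "H" and "S", and the PartTypeFlags structure are all hypothetical, chosen only to show the difference between a single simple value (disjoint) and one Boolean per subtype (overlapping).

```python
from dataclasses import dataclass

# Disjoint: one simple discriminator value names the single subtype.
DISJOINT_SUBTYPES = {"H": "HourlyEmployee", "S": "SalariedEmployee"}


def disjoint_subtype(discriminator: str) -> str:
    """Map a simple discriminator value to exactly one subtype."""
    return DISJOINT_SUBTYPES[discriminator]


@dataclass
class PartTypeFlags:
    # Overlapping: a composite discriminator with one Boolean per subtype.
    manufactured: bool
    purchased: bool


def overlapping_subtypes(flags: PartTypeFlags) -> list[str]:
    """Collect every subtype whose Boolean subpart is set."""
    subtypes = []
    if flags.manufactured:
        subtypes.append("ManufacturedPart")
    if flags.purchased:
        subtypes.append("PurchasedPart")
    return subtypes


print(disjoint_subtype("H"))
print(overlapping_subtypes(PartTypeFlags(manufactured=True, purchased=True)))
```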
Entity Clusters
A foreign key is an attribute (possibly
composite) in a relation that serves as the
primary key of another relation.
For example, consider the relations EMPLOYEE1
and DEPARTMENT
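As a rough sketch of the foreign key idea, the following uses Python's built-in sqlite3 module. The notes name the relations EMPLOYEE1 and DEPARTMENT but do not list their columns here, so the columns (EmpID, Name, DeptName, Location) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DeptName TEXT PRIMARY KEY,
    Location TEXT
);
CREATE TABLE EMPLOYEE1 (
    EmpID    INTEGER PRIMARY KEY,
    Name     TEXT,
    DeptName TEXT REFERENCES DEPARTMENT(DeptName)  -- foreign key
);
INSERT INTO DEPARTMENT VALUES ('Marketing', 'B204');
INSERT INTO EMPLOYEE1 VALUES (100, 'Ana', 'Marketing');
""")


def insert_rejected(dept_name: str) -> bool:
    """Return True if the DBMS rejects an employee row for dept_name."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE1 VALUES (?, ?, ?)", (200, "Ben", dept_name)
        )
    except sqlite3.IntegrityError:
        return True
    return False


# No DEPARTMENT row named 'Finance' exists, so the insert should be refused.
print(insert_rejected("Finance"))
```

Here DeptName in EMPLOYEE1 is the foreign key: every non-null value must match a primary key value in DEPARTMENT.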
Integrity Constraints
1. Domain Constraints
o Allowable values for an attribute (See Table 4-1). A domain definition usually consists of the following components: domain name, meaning, data type, size (or length), and allowable values or allowable range (if applicable).
… correspond with the “parent” side row to be deleted
Set-to-Null – set the foreign key in the dependent side to null if deleting from the parent side → not allowed for weak entities
o Referential integrity constraints are implemented with foreign key to primary key references.
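The Set-to-Null delete rule can be tried out with Python's sqlite3 module. This is a minimal sketch under assumed names: the tables parent and dependent and their columns are hypothetical, and SQLite's `ON DELETE SET NULL` clause stands in for the rule as the notes describe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE dependent (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES parent(id) ON DELETE SET NULL
);
INSERT INTO parent VALUES (1);
INSERT INTO dependent VALUES (10, 1);
""")

# Deleting the parent row keeps the dependent row but nulls its foreign key.
conn.execute("DELETE FROM parent WHERE id = 1")
remaining = conn.execute("SELECT id, parent_id FROM dependent").fetchall()
print(remaining)
```

This also suggests why the rule is not allowed for weak entities: a weak entity's identifier depends on its parent, so its foreign key cannot be null.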
For example, consider the relation EMP COURSE (EmpID, CourseTitle, DateCompleted) shown in Figure 4-7. We represent the functional dependency in this relation as follows:
EmpID, CourseTitle → DateCompleted
The comma between EmpID and CourseTitle stands for the logical AND operator, because DateCompleted is functionally dependent on EmpID and CourseTitle in combination.
The functional dependency in this statement implies that the date when a course is completed is determined by the identity of the employee and the title of the course.
Typical examples of functional dependencies are the following:
o SSN → Name, Address, Birthdate
    A person’s name, address, and birth date are functionally dependent on that person’s Social Security number (in other words, there can be only one Name, one Address, and one Birthdate for each SSN).
o VIN → Make, Model, Color
    The make, model, and the original color of a vehicle are functionally dependent on the vehicle identification number (as above, there can be only one value of Make, Model, and Color associated with each VIN).
o ISBN → Title, FirstAuthorName, Publisher
    The title of a book, the name of the first author, and the publisher are functionally dependent on the book’s international standard book number (ISBN).
The attribute on the left side of the arrow in a functional dependency is called a determinant. SSN, VIN, and ISBN are determinants in the preceding three examples. In the EMP COURSE relation (Figure 4-7), the combination of EmpID and CourseTitle is a determinant.
Candidate Keys
A candidate key is an attribute, or combination of attributes, that uniquely identifies a row in a relation.
A candidate key must satisfy the following properties, which are a subset of the six properties of a relation previously listed:
o Unique identification
    For every row, the value of the key must uniquely identify that row. This property implies that each nonkey attribute is functionally dependent on that key.
o Nonredundancy
    No attribute in the key can be deleted without destroying the property of unique identification.
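Functional dependencies and the two candidate key properties can be checked against sample data with a few lines of Python. This is a sketch only: the function names are hypothetical and the rows are made up for illustration, not copied from Figure 4-7.

```python
from itertools import combinations


def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # Two rows agreeing on the determinant must agree on the dependent.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_candidate_key(rows, attrs):
    """Check unique identification and nonredundancy on these rows."""
    def unique(subset):
        keys = [tuple(row[a] for a in subset) for row in rows]
        return len(set(keys)) == len(keys)

    if not unique(attrs):
        return False
    # Nonredundancy: no proper subset of the key may already be unique.
    return not any(
        unique(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Made-up sample rows for the EMP COURSE relation.
emp_course = [
    {"EmpID": 100, "CourseTitle": "SPSS", "DateCompleted": "2023-06-19"},
    {"EmpID": 100, "CourseTitle": "Surveys", "DateCompleted": "2023-10-07"},
    {"EmpID": 140, "CourseTitle": "Tax Acc", "DateCompleted": "2023-12-08"},
    {"EmpID": 110, "CourseTitle": "SPSS", "DateCompleted": "2024-01-12"},
]

print(fd_holds(emp_course, ["EmpID", "CourseTitle"], ["DateCompleted"]))
print(fd_holds(emp_course, ["EmpID"], ["DateCompleted"]))
print(is_candidate_key(emp_course, ["EmpID", "CourseTitle"]))
```

A check like this can only refute a dependency: a functional dependency is a rule about all possible data, so sample rows that happen to satisfy it do not prove it.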
First Normal Form
No multivalued attributes
Every attribute value is atomic
Fig. 4-25 is not in 1st Normal Form (multivalued
attributes) ➔ it is not a relation.
Fig. 4-26 is in 1st Normal Form.
All relations are in 1st Normal Form.
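One way to remove a multivalued attribute is to give each value its own row. The sketch below assumes a hypothetical `to_first_normal_form` helper and made-up order records; it is not a reproduction of Figures 4-25 and 4-26.

```python
def to_first_normal_form(records, multivalued_attr):
    """Expand each value of a multivalued attribute into its own row."""
    rows = []
    for record in records:
        for value in record[multivalued_attr]:
            row = dict(record)             # copy the single-valued attributes
            row[multivalued_attr] = value  # replace the list with one atomic value
            rows.append(row)
    return rows


# Made-up records in which Products holds several values per order.
orders = [
    {"OrderID": 1, "Products": ["Desk", "Chair"]},
    {"OrderID": 2, "Products": ["Lamp"]},
]

flat = to_first_normal_form(orders, "Products")
for row in flat:
    print(row)
```

Each resulting row holds only atomic values, but the single-valued attributes are now repeated once per product, which is the kind of duplication that leads to anomalies.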
Insertion
o If new product is ordered for order
1007 of existing customer, customer
data must be re-entered, causing
duplication
Deletion
o If we delete the Dining Table from
Order 1006, we lose information
concerning this item’s finish and price
Update
o Changing the price of product ID 4
requires update in multiple records
Why do these anomalies exist?
o Because there are multiple themes
(entity types) in one relation. This
results in duplication and an
unnecessary dependency between the
entities.
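The update and deletion anomalies described above, and the effect of splitting the relation by theme, can be sketched with plain Python structures. The rows and prices below are made up for illustration and only loosely follow the order numbers mentioned in the notes.

```python
# Made-up rows: order and product facts mixed in one relation.
invoice = [
    {"OrderID": 1006, "ProductID": 7, "Description": "Dining Table", "Price": 800},
    {"OrderID": 1006, "ProductID": 4, "Description": "Entertainment Center", "Price": 650},
    {"OrderID": 1007, "ProductID": 4, "Description": "Entertainment Center", "Price": 650},
]

# Update anomaly: the price of product 4 is stored once per order line.
rows_to_update = sum(1 for row in invoice if row["ProductID"] == 4)

# Split by theme: one relation per entity type.
product = {
    row["ProductID"]: {"Description": row["Description"], "Price": row["Price"]}
    for row in invoice
}
order_line = [
    {"OrderID": row["OrderID"], "ProductID": row["ProductID"]} for row in invoice
]

# After the split, a price change touches exactly one place.
product[4]["Price"] = 700

# Deleting the Dining Table line from order 1006 no longer loses product facts.
order_line = [
    line for line in order_line
    if not (line["OrderID"] == 1006 and line["ProductID"] == 7)
]

print(rows_to_update)
print(product[7])
```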
Third Normal Form