DB Module Final
FACULTY OF TECHNOLOGY
Prepared by:
Nibretu Kebede
July 2018
DTU
Fundamentals of database system module 2010
UNIT ONE
Introduction to Database Systems
Unit description
This unit deals with the database system and the file system, the characteristics of the database approach, and application areas of databases. To address these contents brainstorming, peer & group discussion and gap lecture will be used most. Question & answer and group work are among the assessment methods to be used.
Objectives: At the end of this unit, students will be able to:
Define database system & file system (L1, K)
Compare database system & file system (L5, K)
Differentiate the characteristics of the database approach (L3, A)
Contents:
Definition of database system &File System
Characteristics of Database Approach
Actors on the Scene
Application of database
Method of Teaching: brainstorming, gap lecture, group discussion
Brainstorming: what is a database?
Database can be defined as:
A shared collection of logically related data, designed to meet the information needs of multiple users in
an organization
It usually refers to data organized and stored on a computer that can be searched and retrieved by a computer program. This computer program is called a database management system (DBMS).
A collection of information organized and presented to serve a specific purpose. (A telephone book is a
common database.) A computerized database is an updated, organized file of machine readable
information that is rapidly searched and retrieved by computer.
An organized collection of information in computerized format.
A collection of related information about a subject, organized in a useful manner that provides a base or foundation for procedures such as retrieving information, drawing conclusions, and making decisions.
A computerized representation of any organization's flow of information and storage of data.
These definitions describe the same idea from different angles. Beyond simply storing data, a modern relational DBMS provides a number of services to its users:
Concurrency Control Services: simultaneous access to and update of the database by different users must be coordinated correctly.
Recovery Services: if the database reaches an inconsistent state or gets corrupted due to an invalid action by someone, the DBMS should be able to recover it to a consistent state, ensuring that data loss during the recovery of the database remains minimal.
Authorization Services (Security): The database is intended to be used by a number of users, who will perform a number of actions on the database and the data stored in it. The DBMS is used to allow or restrict different database users' interaction with the database. It is the responsibility of the DBMS to check whether a user intending to access the database is authorized to do so, and if so, what actions he/she may perform on the data.
Integrity Services: rules about the data and the changes that take place on it, the correctness and consistency of stored data, and the quality of data based on business constraints.
Services to promote data independence between the data and the applications.
Utility services: sets of utility service facilities like Importing data, Statistical analysis support etc.
User Interfaces: The data in a database may be accessed by numerous people all with different levels of
expertise. It is important that the system provides an adequate variety of user interfaces so that it may be
used as efficiently and effectively by all those who access it. The DBMS must allow the same data to be
viewed in different ways.
Flexibility: Because programs and data are independent, programs do not have to be modified when unrelated types of data are added to or deleted from the database, or when the physical storage changes; only the data in the storage area change.
Fast response to information requests: Because data are integrated into a single database, complex requests can be handled much more rapidly than if the data were located in separate, non-integrated files, and data can be retrieved quickly from the data store. In many businesses, faster response means better customer service.
Less storage area: Theoretically, all occurrences of data items need be stored only once, thereby
eliminating the storage of redundant data. System developers and database designers often use data
normalization to minimize data redundancy.
Data duplication is reduced: Because the data are integrated rather than scattered across different locations, the chances of duplication are much reduced and the data are kept up to date.
Data is easy to understand: Data are managed according to the needs of the user and presented in an easy format, so there is no difficulty in using the data through the database management system.
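The recovery and concurrency-control services described above can be illustrated with a small sketch in Python using its built-in sqlite3 module (the account table and its values are invented for illustration): a failed update is rolled back, leaving the database in its last consistent state.

```python
import sqlite3

# Sketch of the recovery service: if an error occurs mid-update, the DBMS
# rolls the database back to its last consistent state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

try:
    with conn:  # transaction: commits on success, rolls back on error
        conn.execute("UPDATE account SET balance = balance - 70 WHERE id = 1")
        raise RuntimeError("simulated crash before the matching credit")
except RuntimeError:
    pass

# The half-finished debit was undone; the data are consistent again.
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100
```

A real DBMS provides the same guarantee automatically through its transaction log, not through application code.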
Components of Database Management system:
Data: is the unprocessed fact.
DBMS: is a collection of software (tool) that is used to manage the database and its user.
Hardware: It consists of secondary storage disks on which the database resides.
People: this component is composed of the people in the organization who are responsible for, or play a role in, designing, implementing, managing, administering and using the resources in the database.
Procedure: these are the rules and regulations on how to use and design the database.
Questions
1. Which one of the following is true about a database?
A. It is a collection of related records in folders and subfolders.
B. It is an organized collection of information in computerized format.
C. It is a disorganized collection of information in computerized format.
D. It is an organized collection of information in bookshelf format.
2. What are the approaches of database?
3. What is data?
UNIT TWO
Database System Concepts and Architecture
Unit description:
In this unit Data Models, Schemas and Instances, DBMS Architecture and Data Independence, Database Languages and Interfaces, the Database System Environment, and Classification of DBMSs are the contents to be covered. To deliver these contents brainstorming and interactive lecture, peer teaching, group discussion, presentation and class work methods are used.
Objectives: At the end of this unit students will be able to:
Describe data models, schemas and instances, and database languages and interfaces
Describe the database system environment
Describe the classification of DBMSs
Contents:
Data Models, Schema and Instances
DBMS Architecture and Data Independence
Database Language and Interface
The Database System Environment
Classification of DBMS
Method of Teaching: brainstorming, gap lecture, group discussion
Brainstorming: what are the terms data model, schema & instance?
Gap lecture:
Database Model, Database schema and Database instance:
A database model describes in an abstract way how data is represented in an information system or a database management system.
A database model can also be seen as covering the physical implementation of the data: we could say that a database model is the physical model of a conceptual data model.
There are four main types of database model:
The hierarchical database model.
The network database model.
The relational database model.
The object-oriented database model.
Assignment
Write the difference between the above four types of database models.
Database schema
Database schema: the overall description of the database, including an explanation of the database constraints that should hold on the database.
The three levels of schema, according to their level of abstraction, are:
External schema: at the external level to describe the various user views. Usually uses the same data
model as the conceptual level.
Conceptual schema: at the conceptual level to describe the structure and constraints for the whole
database for a community of users. Uses a conceptual or an implementation data model.
Internal schema: at the internal level to describe physical storage structures and access paths. Typically
uses a physical data model
Database Instances
Instance: is the collection of data in the database at a particular point of time (snap- shot).
Also called State or Snap Shot or Extension of the database
Refers to the actual data in the database at a specific point in time.
State of database is changed any time we add, delete or update an item.
Since Instance is actual data of database at some point in time, changes rapidly.
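The distinction between schema and instance can be sketched with Python's built-in sqlite3 module (the student table is an invented example): the CREATE TABLE statement fixes the schema, while each insert changes only the instance, i.e. the state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema: the description of the database, fixed when it is designed.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def instance(conn):
    """The instance (state/snapshot): the actual rows at this point in time."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

before = instance(conn)                            # empty state, same schema
conn.execute("INSERT INTO student VALUES (1, 'Abebe')")
after = instance(conn)                             # the insert changed the state
print(before, after)  # [] [(1, 'Abebe')]
```

The schema stayed the same throughout; only the snapshot of data changed.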
Group discussion:
Discuss the main differences between the database schema levels.
DBMS Architecture and Data Independence
Three important characteristics of the database approaches are
(1) Insulation of programs and data (program-data and program-operation independence)
(2) Support of multiple user views and
(3) Use of a catalog to store the database description (schema).
In this section we specify an architecture for database systems, called the three-schema architecture, which was proposed to help achieve and visualize these characteristics.
The Three-Schema Architecture
The goal of the three-schema architecture is to separate the user applications and the physical database. In this
architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the database.
The internal schema uses a physical data model and describes the complete details of data storage and access
paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a
community of users. The conceptual schema hides the details of physical storage structures and concentrates
on describing entities, data types, relationships, user operations, and constraints. A high-level data model or
an implementation data model can be used at this level.
3. The external or view level includes a number of external schemas or user views. Each external schema
describes the part of the database that a particular user group is interested in and hides the rest of the database
from that user group. A high-level data model or an implementation data model can be used at this level.
Data Independence
The three-schema architecture can be used to explain the concept of data independence, which can be defined
as the capacity to change the schema at one level of a database system without having to change the schema
at the next higher level. We can define two types of data independence:
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the database (by
adding a record type or data item), or to reduce the database (by removing a record type or data item). In the
latter case, external schemas that refer only to the remaining data should not be affected.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual (or external) schemas. Changes to the internal schema may be needed because some physical files
had to be reorganized—for example, by creating additional access structures—to improve the performance of
retrieval or update. If the same data as before remains in the database, we should not have to change the
conceptual schema.
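Logical data independence can be sketched with sqlite3 (the employee table and the view name are invented for illustration): an external view keeps working unchanged even after the conceptual schema is expanded with a new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Sara', 9000)")

# External schema: a user view that hides the salary column.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

# The conceptual schema later grows (logical data independence): a new
# column is added to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")

# The application reading the external view is unaffected by the change.
rows = conn.execute("SELECT * FROM employee_public").fetchall()
print(rows)  # [(1, 'Sara')]
```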
Database Languages
In this section, it is explained how data gets into a database system and how the information gets to the users. More precisely, the following questions will be answered:
A. How does an application interact with a database management system?
B. How does a user look at a database system?
C. How can a user query a database system and view the results in his/her application?
Reading Assignment
Read and prepare short notes about Database System Environment and Classification of Database
Management Systems.
Questions
1. What is a collection of data in the database at a particular point in time called?
a. Database model
b. Database instance
c. Database schema
d. Database architecture
2. What is database schema?
3. Who is the person responsible for identifying the appropriate structure of the database?
4. List types of DBMS Interfaces and discuss each of them.
UNIT THREE
Database Modelling
Unit description
In this unit the E/R model, design principles, the network and hierarchical models, data modeling using entity relationships, database design using high-level data models, entity types and sets, attributes and keys, database abstraction, and relationships will be discussed. To deliver these contents brainstorming and gap lecture, group discussion, and question and answer methods will be used. Assessment will take place in the form of question and answer, group work, individual assignment, lab assignment, and test.
Objectives: At the end of this unit, students will be able to:
Define database Modelling
define Entity, Attributes, Keys, Relationships(components of ERD)
differentiate the types of entities
differentiate E/R Diagram naming conventions, and Design issues
Construct ERD
Contents:
Introduction to Database Modelling
E/R Model
define Entity, Attributes, Keys, Relationships(components of ERD)
differentiate the types of entities
differentiate E/R Diagram naming conventions, and Design issues
Construct ERD
Method of Teaching: brainstorming, gap lecture, group discussion
Introduction:
Brainstorming: what is database modelling?
Mini lecture:
Data Model: a set of concepts to describe the structure of a database, and certain constraints that the database
should obey.
It is a description of the way that data is stored in a database. Data model helps to understand the relationship
between entities and to create the most effective structure to hold data.
It is a collection of tools or concepts for describing
Data
Data relationships
Data semantics
Data constraints
The main purpose of Data Model is to represent the data in an understandable way.
Categories of data models:
1. Hierarchical Model
The simplest data model
Record type is referred to as node or segment
The top node is the root node
Nodes are arranged in a hierarchical structure as sort of upside-down tree
A parent node can have more than one child node
A child node can only have one parent node
The relationship between parent and child is one-to-many
Relation is established by creating physical link between stored records (each is stored with a predefined
access path to other records)
To add new record type or relationship, the database must be redefined and then stored in a new form.
2. Network Model
Allows record types to have more than one parent, unlike the hierarchical model
A network data model sees records as set members
Each set has an owner and one or more members
Does not allow direct many-to-many relationships between entities
Like the hierarchical model, the network model is a collection of physically linked records
Allows member records to have more than one owner
Reading assignment
Database Design Using High level Data Models.
Components of Entity-Relational (ER) Model
1. Entities
2. Attributes
3. Relationships
4. Relational constraints
1. Entities
The basic object that the ER model represents is an entity, which is a "thing" in the real world with an independent existence.
Entity Types and Entity sets
An entity type defines a collection (or set) of entities that have the same attributes. A few individual entities
of each type are also illustrated, along with the values of their attributes. The collection of all entities of a
particular entity type in the database at any point in time is called an entity set; the entity set is usually referred
to using the same name as the entity type
2. Attribute
Each entity has attributes—the particular properties that describe it. For example, an employee entity may
be described by the employee’s name, age, address, salary, and job.
Types of Attributes
Several types of attributes occur in the ER model: simple versus composite; single-valued versus multi-valued; and stored versus derived. We first define these attribute types and illustrate their use via examples. We then introduce the concept of a null value for an attribute.
Composite versus Simple (Atomic) Attributes
Composite attributes can be divided into smaller subparts, which represent more basic attributes with
independent meanings. For example, the Address attribute of the employee entity can be sub-divided into
City, Region, and Zip. Attributes that are not divisible are called simple or atomic attributes. The value of a
composite attribute is the concatenation of the values of its constituent simple attributes.
Composite attributes are useful to model situations in which a user sometimes refers to the composite attribute
as a unit but at other times refers specifically to its components. If the composite attribute is referenced only
as a whole, there is no need to subdivide it into component attributes. For example, if there is no need to refer
to the individual components of an address (Zip, Street, and so on), then the whole address is designated as a
simple attribute.
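A composite attribute can be sketched in Python as follows; the Address components (City, Region, Zip) follow the example in the text, while the class names, method name, and sample values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Address:
    """Composite attribute: divisible into more basic attributes."""
    city: str
    region: str
    zip_code: str

    def as_unit(self) -> str:
        # The composite value is the concatenation of its simple components.
        return f"{self.city}, {self.region} {self.zip_code}"

@dataclass
class Employee:
    name: str          # simple (atomic) attribute
    address: Address   # composite: usable as a whole or by component

e = Employee("Abebe", Address("Debre Tabor", "Amhara", "6300"))
print(e.address.city)       # refer to one component: Debre Tabor
print(e.address.as_unit())  # refer to the attribute as a unit
```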
Single-valued Versus Multi-valued Attributes
Most attributes have a single value for a particular entity; such attributes are called single-valued. For
example, Age is a single-valued attribute of person. In some cases an attribute can have a set of values for the
same entity—for example, Colors attribute for a car, or a College Degrees attribute for a person. Cars with
one color have a single value, whereas two-tone cars have two values for Colors. Similarly, one person may
not have a college degree, another person may have one, and a third person may have two or more degrees;
so different persons can have different numbers of values for the College Degrees attribute. Such attributes
are called multi valued.
Stored Versus Derived Attributes
In some cases two (or more) attribute values are related—for example, the Age and Birth Date attributes of a
person. For a particular person entity, the value of Age can be determined from the current (today’s) date and
the value of that person’s Birth Date. The Age attribute is hence called a derived attribute and is said to be
derivable from the Birth Date attribute, which is called a stored attribute. Some attribute values can be
derived from related entities; for example, an attribute Number Of Employees of a department entity can be
derived by counting the number of employees related to (working for) that department.
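The Age/Birth Date example above can be sketched in Python: Age is never stored, it is derived on demand from the stored Birth Date attribute (the dates used are invented).

```python
from datetime import date

def age(birth_date: date, today: date) -> int:
    """Derive the Age attribute from the stored Birth Date attribute."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

print(age(date(2000, 7, 15), date(2018, 7, 1)))   # 17: birthday not yet reached
print(age(date(2000, 7, 15), date(2018, 7, 15)))  # 18
```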
Null Values
In some cases a particular entity may not have an applicable value for an attribute. For example, a College
Degrees attribute applies only to persons with college degrees. For such situations, a special value called null
is created.
KEYS: A key is an attribute or set of attributes in a relation that uniquely identifies each tuple
in the relation.
Types of keys
Super keys
Candidate Keys
Primary key
Composite primary key.
Alternate key
Foreign key
Reading assignment
Read and prepare short notes about each type of key listed above.
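The defining property of a key, that it uniquely identifies each tuple, can be sketched with sqlite3 (the course table is an invented example): the DBMS rejects a second tuple with the same primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is the primary key: it must uniquely identify each tuple.
conn.execute("CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO course VALUES (1, 'Databases')")

try:
    conn.execute("INSERT INTO course VALUES (1, 'Networks')")  # duplicate key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```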
3. Relationships
The relationships between entities that exist must be taken into account when processing information. In any business process, one object may be associated with another object due to some event. Such an association is what we call a RELATIONSHIP between entity objects.
One external event or process may affect several related entities.
Related entities require setting of LINKS from one part of the database to another.
A relationship should be named by a word or phrase which explains its function
Role names are different from the names of the entities forming the relationship: one entity may take on many roles, and the same role may be played by different entities.
For each RELATIONSHIP, one can talk about the Number of Entities and the Number of Tuples
participating in the association. These two concepts are called DEGREE and CARDINALITY of a
relationship respectively.
Degree of Relationship
An important point about a relationship is how many entities participate in it. The number of entities
participating in a relationship is called the DEGREE of the relationship.
Among the Degrees of relationship, the following are the basic:
UNARY/RECURSIVE RELATIONSHIP: Tuples /records of a Single entity are related with each other.
BINARY RELATIONSHIPS: Tuples/records of two entities are associated in a relationship
TERNARY RELATIONSHIP: Tuples/records of three different entities are associated
And a generalized one, the N-ARY RELATIONSHIP: tuples from an arbitrary number of entity sets participate in a relationship.
Cardinality of Relationship
Another important concept about relationship is the number of instances/ tuples that can be associated with a
single instance from one entity in a single relationship. The number of instances participating or associated
with a single instance from an entity in a relationship is called the CARDINALITY of the relationship. The
major cardinalities of a relationship are:
ONE-TO-ONE: one tuple is associated with only one other tuple.
ONE-TO-MANY, one tuple can be associated with many other tuples, but not the reverse.
MANY-TO-MANY: one tuple is associated with many other tuples, and from the other side (with a different role name) a single tuple is likewise associated with many tuples.
4. Relational Constraints (Integrity rules)
Relational Integrity
Domain Integrity: No value of the attribute should be beyond the allowable limits
Entity Integrity: In a base relation, no attribute of a Primary Key can assume a value of NULL.
Referential Integrity: If a Foreign Key exists in a relation, either the Foreign Key value must match a
Candidate Key value in its home relation or the Foreign Key value must be NULL.
Enterprise Integrity: Additional rules specified by the users or database administrators of a database are
incorporated.
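Referential integrity as stated above (a foreign key value must match a candidate key in its home relation or be NULL) can be sketched with sqlite3; the dept and emp tables are invented for illustration. Note that SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE emp (
    id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES dept(id))""")
conn.execute("INSERT INTO dept VALUES (10, 'ICT')")

conn.execute("INSERT INTO emp VALUES (1, 'Sara', 10)")     # matches dept 10: OK
conn.execute("INSERT INTO emp VALUES (2, 'Abebe', NULL)")  # NULL FK: also OK

try:
    conn.execute("INSERT INTO emp VALUES (3, 'Kebede', 99)")  # no dept 99
    violation_caught = False
except sqlite3.IntegrityError:
    violation_caught = True

print(violation_caught)  # True
```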
Weak entity type
Entities that do not have key attributes of their own are called weak entity types.
It is an entity that cannot exist without the entity with which it has a relationship
It is indicated by a double rectangle
In contrast, regular entity types that do have a key attribute are sometimes called strong entity types.
Assessments:
1. Discuss the role of a high-level data model in the database design process.
2. Define the following terms: entity, attribute, attribute value, relationship instance, composite attribute,
multi-valued attribute, derived attribute, complex attribute, key attribute, value set (domain).
3. What is an entity type? What is an entity set? Explain the differences among an entity, an entity type, and
an entity set.
4. E/R Diagram naming conventions, and Design issues
UNIT FOUR
Record Storage and Primary File Organization
Unit description
In this unit Operations on Files, Files of Unordered Records (Heap Files), Files of Ordered Records (Sorted Files), Hashing Techniques, Index Structures for Files, Single-level Ordered Indexes, and Multilevel Ordered Indexes on B-trees and B+-trees are the contents to be covered. To deliver these contents brainstorming and presentation, group discussion, and demonstration methods will be used. Assessment will take place in the form of question and answer, group work, individual assignment, lab assignment, and test.
Objectives: At the end of this unit, students will be able to:
Define file, record & file operations
Compare single-level ordered indexes and multilevel ordered indexes
Differentiate files of unordered records (heap files) & files of ordered records (sorted files)
Differentiate hashing techniques
Understand index structures for files
Contents:
Operations on Files
Files of Unordered Records (Heap Files)
Files of Ordered Records (Sorted Files)
Hashing Techniques
Index Structure for Files
Single level ordered index and multilevel ordered index
Dynamic Multilevel indexes using B-Trees and B+ Trees
Indexes on Multiple Keys
Method of Teaching: brainstorming, gap lecture, group discussion, group presentation
Brainstorming: what is a file?
File Organization
A file is organized logically as a sequence of records. These records are mapped onto disk blocks. Files are
provided as a basic construct in operating systems.
Organization of Records in Files
An instance of a relation is a set of records. Given a set of records, the next question is how to organize them
in a file. Several of the possible ways of organizing records in files are:
Heap files organization. Any record can be placed anywhere in the file where there is space for the
record. There is no ordering of records. Typically, there is a single file for each relation
Sequential file organization. Records are stored in sequential order, according to the value of a “search
key” of each record.
Hashing file organization. A hash function is computed on some attribute of each record. The result of
the hash function specifies in which block of the file the record should be placed. Generally, a separate
file is used to store the records of each relation.
Clustering file organization. Records of several different relations are stored in the same file; further,
related records of the different relations are stored on the same block, so that one I/O operation fetches
related records from all the relations. For example, records of the two relations can be considered to be
related if they would match in a join of the two relations
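The difference between heap and sequential organization can be sketched in Python, using plain lists of search-key values as a toy model of a file (the keys are invented):

```python
import bisect

heap_file = []    # heap: any record goes wherever there is space (no order)
sorted_file = []  # sequential: records kept ordered by the search key

for key in [42, 7, 19, 3, 88]:
    heap_file.append(key)            # O(1) insert at the end, no ordering
    bisect.insort(sorted_file, key)  # insert at the correct sorted position

print(heap_file)    # [42, 7, 19, 3, 88] - arrival order
print(sorted_file)  # [3, 7, 19, 42, 88] - search-key order
```

Insertion is cheapest in the heap file; range searches are cheapest in the sorted file, which is exactly the trade-off the two organizations make.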
There are two basic kinds of indices:
Ordered indices. Based on a sorted ordering of the values.
Hash indices. Based on a uniform distribution of values across a range of buckets. The bucket to which a
value is assigned is determined by a function, called a hash function.
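A hash index can be sketched in Python; the modulo hash function and the number of buckets are invented for illustration:

```python
NUM_BUCKETS = 4

def bucket_of(key: int) -> int:
    """The hash function decides which bucket (block) holds the record."""
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]
for key in [42, 7, 19, 3, 88]:
    buckets[bucket_of(key)].append(key)

# A lookup touches only the single bucket the hash function names.
print(19 in buckets[bucket_of(19)])  # True
```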
We shall consider several techniques for both ordered indexing and hashing. No one technique is the best.
Rather, each technique is best suited to particular database applications. Each technique must be evaluated on
the basis of these factors:
Access types: The types of access that are supported efficiently. Access types can include finding records
with a specified attribute value and finding records whose attribute values fall in a specified range.
Access time: The time it takes to find a particular data item, or set of items, using the technique in question.
Insertion time: The time it takes to insert a new data item. This value includes the time it takes to find
the correct place to insert the new data item, as well as the time it takes to update the index structure.
Deletion time: The time it takes to delete a data item. This value includes the time it takes to find the item
to be deleted, as well as the time it takes to update the index structure.
Space overhead: The additional space occupied by an index structure. Provided that the amount of
additional space is moderate, it is usually useful to sacrifice the space to achieve improved performance.
Ordered Indices
To gain fast random access to records in a file, we can use an index structure. Each index structure is associated
with a particular search key. Just like the index of a book or a library catalog, an ordered index stores the
values of the search keys in sorted order, and associates with each search key the records that contain it.
The records in the indexed file may themselves be stored in some sorted order, just as books in a library are
stored according to some attribute.
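An ordered index over a heap file can be sketched in Python with the standard bisect module; the records and keys are invented for illustration. The index stores (search key, record position) pairs in sorted key order, so a lookup is a binary search in the index followed by one pointer dereference into the file.

```python
import bisect

# Records stored in arbitrary (heap) order; the index is sorted by key.
records = [(30, "Sara"), (10, "Abebe"), (20, "Kebede")]
index = sorted((key, pos) for pos, (key, _) in enumerate(records))
keys = [k for k, _ in index]

def lookup(key):
    """Binary-search the ordered index, then follow the record pointer."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[index[i][1]]
    return None

print(lookup(20))  # (20, 'Kebede')
print(lookup(99))  # None
```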
File operations:
1. Open: Prepares the file for reading or writing. Allocates appropriate buffers (typically at least two) to hold
file blocks from disk, and retrieves the file header. Sets the file pointer to the beginning of the file.
2. Reset: Sets the file pointer of an open file to the beginning of the file.
3. Find (or Locate): Searches for the first record that satisfies a search condition. Transfers the block
containing that record into a main memory buffer (if it is not already there). The file pointer points to the
record in the buffer and it becomes the current record. Sometimes, different verbs are used to indicate
whether the located record is to be retrieved or updated.
4. Read (or Get): Copies the current record from the buffer to a program variable in the user program. This
command may also advance the current record pointer to the next record in the file, which may necessitate
reading the next file block from disk.
5. Find Next: Searches for the next record in the file that satisfies the search condition. Transfers the block
containing that record into a main memory buffer (if it is not already there). The record is located in the
buffer and becomes the current record.
6. Delete: Deletes the current record and (eventually) updates the file on disk to reflect the deletion.
7. Modify: Modifies some field values for the current record and (eventually) updates the file on disk to
reflect the modification.
8. Insert: Inserts a new record in the file by locating the block where the record is to be inserted, transferring
that block into a main memory buffer (if it is not already there), writing the record into the buffer, and
(eventually) writing the buffer to disk to reflect the insertion.
9. Close: Completes the file access by releasing the buffers and performing any other needed cleanup
operations.
Multilevel Indices
Indices with two or more levels are called multilevel indices. Searching for records with a multilevel
index requires significantly fewer I/O operations than does searching for records by binary search. Each
level of index could correspond to a unit of physical storage. Thus, we may have indices at the track,
cylinder, and disk levels.
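A two-level index of this kind can be sketched in Python (the block size and keys are invented): the outer index holds only the first key of each inner block, so a search examines the small outer index and then a single inner block instead of the whole index.

```python
import bisect

BLOCK = 3
keys = [3, 7, 12, 19, 25, 31, 42, 57, 60]            # sorted search keys
blocks = [keys[i:i + BLOCK] for i in range(0, len(keys), BLOCK)]
outer = [b[0] for b in blocks]                       # first key of each block

def find(key):
    """Search the small outer index first, then only one inner block."""
    b = bisect.bisect_right(outer, key) - 1          # block that can hold key
    return b >= 0 and key in blocks[b]

print(find(19))  # True
print(find(20))  # False
```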
Presentation:
Read more on this chapter and present it in class
Questions
1. How does heap file organization place records?
2. What are the possible ways of organizing records in files?
UNIT FIVE
Relational algebra operation and Relational calculus
Unit description
In this unit relational algebra operations and relational calculus are the contents to be covered. To deliver these contents brainstorming and gap lecture, group discussion, and demonstration methods will be used. Assessment will take place in the form of question and answer and group assignment.
Objectives: At the end of this unit students will be able to:
Explain relational algebra operations
list types of relational Algebra
understand Relational calculus
Contents:
relational algebra operations
types of relational algebra
Relational calculus
Method of Teaching: brainstorming, gap lecture, group discussion, and presentation
Brainstorming: what is a relation?
The basic set of operations for the relational model is known as the relational algebra. These operations enable
a user to specify basic retrieval requests.
The result of the retrieval is a new relation, which may have been formed from one or more relations. The
algebra operations thus produce new relations, which can be further manipulated using operations of the
same algebra.
A sequence of relational algebra operations forms a relational algebra expression, whose result will also be a relation that represents the result of a database query (or retrieval request).
Relational algebra is a theoretical language with operations that work on one or more relations to define
another relation without changing the original relation.
The output from one operation can become the input to another operation (nesting is possible)
There are different basic operations that can be applied to the relations of a database based on the requirement:
Selection
Join
Intersection
Full outer join
Assignment
Semi join
Operators - Write
INSERT - provides a list of attribute values for a new tuple in a relation. This operator is the same as SQL.
DELETE - provides a condition on the attributes of a relation to determine which tuple(s) to remove from
the relation. This operator is the same as SQL.
MODIFY - changes the values of one or more attributes in one or more tuples of a relation, as identified
by a condition operating on the attributes of the relation. This is equivalent to SQL UPDATE.
Operators - Retrieval
There are two groups of operations:
Mathematical set theory based relations: UNION, INTERSECTION, DIFFERENCE, and CARTESIAN
PRODUCT.
Special database operations: SELECT, PROJECT, and JOIN.
The SELECT Operation: The SELECT operation is used to select a subset of the tuples from a relation
that satisfy a selection condition. One can consider the SELECT operation to be a filter that keeps only
those tuples that satisfy a qualifying condition.
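As a minimal sketch of the "filter" view of SELECT, a relation can be modeled as a Python set of tuples (the relation name and data here are invented for illustration):

```python
# A relation modeled as a set of tuples: (EmpID, Name, Salary)
employee = {
    ("E1", "Abebe", 5000),
    ("E2", "Sara", 7000),
    ("E3", "Kebede", 4500),
}

def select(relation, condition):
    """Relational SELECT (sigma): keep only tuples satisfying the condition."""
    return {t for t in relation if condition(t)}

# sigma(Salary > 4800)(employee) — a filter over the tuples
high_paid = select(employee, lambda t: t[2] > 4800)
print(sorted(high_paid))
```

The result is itself a relation (a set of tuples), so it can be fed into further algebra operations, exactly as the text above describes.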
Set Operations
Consider two relations R and S.
UNION of R and S: The union of two relations is a relation that includes all the tuples that are either in
R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION of R and S: The intersection of R and S is a relation that includes all tuples that are
both in R and S.
DIFFERENCE of R and S: The difference of R and S is the relation that contains all the tuples that are
in R but that are not in S.
SET Operations - requirements
For set operations to function correctly, the relations R and S must be union compatible. Two relations are
union compatible if they have the same degree (the same number of attributes) and the domains of
corresponding attributes are compatible.
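These three set operations map directly onto Python's set operators; a small sketch with two invented union-compatible relations R and S (each tuple is (Name, Age)):

```python
# Two union-compatible relations: same degree, compatible domains
R = {("Abebe", 21), ("Sara", 22), ("Kebede", 25)}
S = {("Sara", 22), ("Mulu", 30)}

union = R | S          # tuples in R, in S, or in both (duplicates eliminated)
intersection = R & S   # tuples that are in both R and S
difference = R - S     # tuples in R that are not in S

print(len(union), len(intersection), len(difference))
```

Because relations are sets, duplicate elimination in UNION comes for free, as in the definition above.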
Relational Calculus
Reading Assignment
Read about Relational Calculus
UNIT SIX
Database Design
Unit description
In this unit, introduction to database design, Functional Dependency, and Normalization will be discussed. To
deliver these contents, brainstorming, presentation, group discussion, and demonstration methods will be used.
Assessment will take place in the form of question and answer, group work, individual assignments, lab
assignments, and tests.
Objectives: At the end of this unit students will be able to:
Describe database design
Describe Functional Dependency
Understand about Normalization
Contents:
Introduction to database design
Functional Dependency
Normalization
Forms of Normalization
Database design is the process of producing the specifications for the data to be stored in the database. It is
one of the middle phases of information systems development when the system uses a database approach.
Design is the part in which we describe how the data should be perceived at different levels and, finally, how
it is going to be stored in a computer system.
Database Development Life Cycle
As database design is one component of most information system development tasks, there are several steps
in designing a database system. Here more emphasis is given to the design phases of the system development
life cycle. Development of an information system with a database application consists of several tasks, which include:
Planning of Information systems Design
Requirements Analysis,
Design (Conceptual, Logical and Physical Design)
Implementation
Testing and deployment
Operation and Support
These major steps in database design are discussed as follows:
1. Planning: that is identifying information gap in an organization and propose a database solution to solve
the problem.
2. Analysis: that concentrates more on fact finding about the problem or the opportunity. Feasibility analysis,
requirement determination and structuring, and selection of best design method are also performed at this
phase.
3. Design: in database designing more emphasis is given to this phase. The phase is further divided into three
sub-phases.
Conceptual Design: modeling the data requirements of the organization independently of any particular
DBMS, typically using an entity-relationship (ER) model.
Logical Design: mapping the conceptual model onto the data model of the target DBMS; for a relational
DBMS, this produces a set of relation schemas.
Physical Design: specifying how the database is stored in the computer system: file organizations,
indexes, and access paths.
4. Implementation: the testing and deployment of the designed database for use.
5. Operation and Support: administering and maintaining the operation of the database system and
providing support to users.
In developing a good design, one should answer such questions as:
What are the relevant Entities for the Organization?
What are the important features of each Entity?
What are the important Relationships?
What are the important queries from the user?
What are the other requirements of the Organization and the Users?
Levels of Database Design
The database development process has several phases; among these, the prime interest of database design is
the design phase itself, which is subdivided into three sub-phases:
Conceptual design
Logical design
Physical design
[ER notation: strong entity, weak entity, attribute, derived attribute, key and foreign key (FK)]
Database Normalization
Database Normalization is a series of steps followed to obtain a database design that allows for consistent
storage and efficient access of data in a relational database. These steps reduce data redundancy and the risk
of data becoming inconsistent
Normalization is the process of identifying logical associations between data items and designing a database
that will represent such associations but without suffering the update anomalies, which are:
Insertion Anomalies
Deletion Anomalies
Modification Anomalies
Normalization may reduce system performance since data will be cross-referenced from many tables. Thus
denormalization is sometimes used to improve performance, at the cost of reduced consistency guarantees.
A normalization step is considered good if it is a lossless decomposition.
Applying the normalization rules eventually removes the update anomalies that may otherwise occur during
data manipulation after implementation. The types of problems that can occur in an insufficiently normalized
table are called update anomalies, and they include:
1. Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into all the places in the
database where information about that new entry needs to be stored.
In a properly normalized database, information about a new entry needs to be inserted into only one place in
the database; in an inadequately normalized database, information about a new entry may need to be inserted
into more than one place and, human fallibility being what it is, some of the needed additional insertions may
be missed.
2. Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database entry when it is time to
remove that entry. In a properly normalized database, information about an old, to-be-gotten-rid-of entry
needs to be deleted from only one place in the database; in an inadequately normalized database, information
about that old entry may need to be deleted from more than one place, and, human fallibility being what it is,
some of the needed additional deletions may be missed.
3. Modification anomalies
A modification of a database involves changing the value of an attribute in a table. In a properly normalized
database, a given fact is stored in only one place, so it needs to be changed in only one place; in an
inadequately normalized database, the same fact may be stored in several rows, and a modification that misses
some of those rows leaves the data inconsistent.
Functional Dependency (FD)
Before moving to the definition and application of normalization, it is important to have an understanding of
"functional dependency." An attribute Y of a relation is functionally dependent on an attribute (or set of
attributes) X, written X → Y, if each value of X is associated with exactly one value of Y; X is called the
determinant.
Three Types of Functional Dependency
1. Partial Dependency
If an attribute which is not a member of the primary key is dependent on some part of the primary key (if we
have composite primary key) then that attribute is partially functionally dependent on the primary key.
2. Full Dependency
If an attribute which is not a member of the primary key is not dependent on some part of the primary key but
the whole key (if we have composite primary key) then that attribute is fully functionally dependent on the
primary key.
3. Transitive Dependency
In mathematics and logic, a transitive relationship has the form: "if A implies B, and B implies C, then A
implies C." In a relation, a transitive dependency exists when a non-key attribute depends on another non-key
attribute, which in turn depends on the primary key (Key → B and B → C together imply Key → C).
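A functional dependency X → Y can be checked mechanically: it holds if no two tuples agree on X but differ on Y. A minimal Python sketch (the relation, attribute names, and the helper `fd_holds` are invented for illustration; note DeptName is transitively dependent on StudID via StudID → DeptID → DeptName):

```python
def fd_holds(relation, lhs, rhs):
    """Check whether the FD lhs -> rhs holds over a list of dict-rows.

    lhs and rhs are tuples of attribute names."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same determinant value, different dependent value
        seen[key] = val
    return True

students = [
    {"StudID": "S1", "Name": "Abebe", "DeptID": "D1", "DeptName": "CS"},
    {"StudID": "S2", "Name": "Sara",  "DeptID": "D1", "DeptName": "CS"},
    {"StudID": "S3", "Name": "Mulu",  "DeptID": "D2", "DeptName": "IT"},
]

print(fd_holds(students, ("StudID",), ("Name",)))      # the key determines Name
print(fd_holds(students, ("DeptID",), ("DeptName",)))  # non-key -> non-key: transitive
print(fd_holds(students, ("DeptID",), ("StudID",)))    # D1 maps to two students
```

Finding a dependency such as DeptID → DeptName inside this relation is exactly the signal that normalization uses to split the table.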
Steps (Forms) of Normalization:
We have various levels or steps in normalization, called Normal Forms. The level of complexity, the strength
of the rules, and the degree of decomposition increase as we move from a lower Normal Form to a higher one.
The most commonly applied forms are First Normal Form (1NF), Second Normal Form (2NF), Third Normal
Form (3NF), and Boyce-Codd Normal Form (BCNF).
UNIT SEVEN
STRUCTURAL QUERY LANGUAGES (SQL)
WHAT IS SQL?
SQL stands for Structured Query Language. It was developed in the 1970s at IBM as a way to provide
computer users with a standardized method for selecting data from various database formats. The intent was
to build a language that was not based on any existing programming language, but could be used within any
programming language as a way to update and query information in databases.
SQL statements are just that--statements. Each statement can perform operations on one or more database
objects (tables, columns, indexes, and so on). Most SQL statements return results in the form of a set of data
records, commonly referred to as a view. SQL is not a particularly friendly language. Many programs that use
SQL statements hide these statements behind point-and-click dialogs, query-by-example grids, and other
user-friendly interfaces. Make no mistake, however: if the data you are accessing is stored in a relational
database, you are using SQL statements, whether you know it or not.
SQL is a powerful manipulation language used by Visual Basic and the Microsoft Access Jet database engine
as the primary method for accessing the data in your databases.
Use of SQL
SQL can execute queries against a database
SQL can retrieve data from a database
SQL can insert records in a database
SQL can update records in a database
SQL can delete records from a database
SQL can create new databases
Text types:
Data type Description
CHAR(size) Holds a fixed-length string (can contain letters, numbers, and special characters). The
fixed size is specified in parentheses. Can store up to 255 characters
VARCHAR(size) Holds a variable-length string (can contain letters, numbers, and special characters).
The maximum size is specified in parentheses. Can store up to 255 characters. Note: In
MySQL, if you put a value greater than 255 it will be converted to a TEXT type
BLOB For BLOBs (Binary Large OBjects). Holds up to 65,535 bytes of data
MEDIUMBLOB For BLOBs (Binary Large OBjects). Holds up to 16,777,215 bytes of data
LONGBLOB For BLOBs (Binary Large OBjects). Holds up to 4,294,967,295 bytes of data
Number types:
Data type Description
FLOAT(size,d) A small number with a floating decimal point. The maximum number of digits may be
specified in the size parameter. The maximum number of digits to the right of the decimal
point is specified in the d parameter
DOUBLE(size,d) A large number with a floating decimal point. The maximum number of digits may be
specified in the size parameter. The maximum number of digits to the right of the decimal
point is specified in the d parameter
DECIMAL(size,d) A DOUBLE stored as a string, allowing for a fixed decimal point. The maximum number of
digits may be specified in the size parameter. The maximum number of digits to the right of
the decimal point is specified in the d parameter
Date types:
Data type Description
TIMESTAMP A timestamp. TIMESTAMP values are stored as the number of seconds since the
Unix epoch ('1970-01-01 00:00:00' UTC). Format: YYYY-MM-DD HH:MM:SS
Note: The supported range is from '1970-01-01 00:00:01' UTC to '2038-01-19
03:14:07' UTC
DROP TABLE
ETC…
Data Manipulation Language
A Data Manipulation Language (DML) is a language for the manipulation of data in a database.
DML Statement:
INSERT INTO
UPDATE
SELECT
DELETE
ETC…
Data Control Language (DCL)
A Data Control Language (DCL) is a computer language and a subset of SQL, used to control access to data
in a database
DCL Statements:
GRANT
REVOKE
SQL Statements:
Most of the actions you need to perform on a database are done with SQL statements. SQL statements are
used to perform all functions on a database system.
1. The CREATE DATABASE Statement
The CREATE DATABASE statement is used to create a database.
SQL CREATE DATABASE Syntax
CREATE DATABASE database_name
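As a hedged aside: CREATE DATABASE as shown above runs directly on server DBMSs such as MySQL. SQLite, which ships with Python's standard sqlite3 module, has no CREATE DATABASE statement; connecting to a file (or to ":memory:") creates the database. A minimal sketch:

```python
import sqlite3

# Server RDBMSs (e.g. MySQL) create a database with:
#   CREATE DATABASE database_name
# SQLite instead creates the database when you first connect to it;
# ":memory:" gives a throwaway in-memory database for experimenting.
conn = sqlite3.connect(":memory:")
version = conn.execute("SELECT sqlite_version()").fetchone()[0]
print("connected to SQLite", version)
conn.close()
```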
Each table should have a primary key, and each table can have only one primary key.
SQL DEFAULT Constraint
The DEFAULT constraint is used to insert a default value into a column.
The default value will be added to all new records, if no other value is specified.
The FOREIGN KEY constraint is used to prevent actions that would destroy the links between tables.
The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key column,
because the value has to be one of the values contained in the table it points to.
SQL CHECK Constraint
The CHECK constraint is used to limit the value range that can be placed in a column.
If you define a CHECK constraint on a single column it allows only certain values for this column.
If you define a CHECK constraint on a table it can limit the values in certain columns based on values in other
columns in the row.
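The constraints above can be tried out with Python's built-in sqlite3 module (the table and column names here are invented for illustration; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled

conn.execute("""
    CREATE TABLE Department (
        DeptID   INTEGER PRIMARY KEY,
        DeptName TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE Employee (
        EmpID  INTEGER PRIMARY KEY,                   -- one primary key per table
        Name   TEXT NOT NULL,
        City   TEXT DEFAULT 'Debre Tabor',            -- DEFAULT constraint
        Salary REAL CHECK (Salary > 0),               -- CHECK constraint
        DeptID INTEGER REFERENCES Department(DeptID)  -- FOREIGN KEY constraint
    )""")

conn.execute("INSERT INTO Department VALUES (1, 'Computer Science')")
conn.execute(
    "INSERT INTO Employee (EmpID, Name, Salary, DeptID) VALUES (1, 'Abebe', 5000, 1)")

# The DEFAULT value fills the omitted City column:
city = conn.execute("SELECT City FROM Employee WHERE EmpID = 1").fetchone()[0]
print(city)

# A negative salary violates the CHECK constraint:
try:
    conn.execute("INSERT INTO Employee (EmpID, Name, Salary) VALUES (2, 'Sara', -10)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

# A DeptID that does not exist violates the FOREIGN KEY constraint:
try:
    conn.execute(
        "INSERT INTO Employee (EmpID, Name, Salary, DeptID) VALUES (3, 'Mulu', 4000, 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

n = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
conn.close()
```

Both invalid inserts are rejected, so only the first employee remains in the table.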
SQL DROP TABLE and DROP DATABASE
3. The DROP TABLE Statement
The DROP TABLE statement is used to delete a table:
DROP TABLE table_name
5.2 To delete a column in a table, use the following syntax (note that some database systems don't allow
deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name
5.3 To change the data type of a column in a table, use the following syntax:
ALTER TABLE table_name
ALTER COLUMN column_name datatype
The first form of the INSERT INTO statement does not specify the column names, only their values:
INSERT INTO table_name
VALUES (value1, value2, value3,...)
The second form specifies both the column names and the values to be inserted:
INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3,...)
UPDATE table_name
SET column1=value, column2=value2,...
WHERE some_column=some_value
Note: Notice the WHERE clause in the UPDATE syntax. The WHERE clause specifies which record or
records should be updated. If you omit the WHERE clause, all records will be updated!
SELECT column_name(s)
FROM table_name
WHERE column_name LIKE pattern
DELETE FROM table_name
WHERE some_column=some_value
Note: Notice the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or
records should be deleted. If you omit the WHERE clause, all records will be deleted!
Delete All Rows
It is possible to delete all rows in a table without deleting the table. This means that the table structure,
attributes, and indexes will be intact:
DELETE FROM table_name
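Using the Persons table from the exercise below, the DML statements can be exercised end-to-end with Python's sqlite3 module (a sketch; SQLite's syntax is close to, but not identical with, MySQL's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT, City TEXT)")

# INSERT INTO: add new tuples
conn.executemany("INSERT INTO Persons VALUES (?, ?, ?, ?)", [
    (1, "Hansen", "Ola", "Sandnes"),
    (2, "Svendson", "Tove", "Sandnes"),
    (3, "Pettersen", "Kari", "Stavanger"),
])

# UPDATE with WHERE: change only the matching record
conn.execute("UPDATE Persons SET City = 'Oslo' WHERE P_Id = 2")

# SELECT with LIKE: pattern matching in the WHERE clause
rows = conn.execute(
    "SELECT LastName FROM Persons WHERE City LIKE 'S%' ORDER BY P_Id").fetchall()
print(rows)

# DELETE with WHERE: remove only the matching record
conn.execute("DELETE FROM Persons WHERE P_Id = 1")

# DELETE without WHERE: removes all rows but keeps the table structure
conn.execute("DELETE FROM Persons")
remaining = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(remaining)
conn.close()
```

After the UPDATE, only Hansen (Sandnes) and Pettersen (Stavanger) match the pattern 'S%'; the final DELETE empties the table while the table itself survives.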
Group discussion:
Discuss SQL statements by giving different examples.
Questions
1. What are the two groups of relational algebra operations?
2. Which one of the following operators is allowed in the WHERE clause to express "not equal"?
A. <>
B. !=
C. ==
D. !==
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
If you want to add a column named "DateOfBirth" to the above "Persons" table, here is the SQL statement
you are going to use:
ALTER TABLE Persons
ADD DateOfBirth date
Possible answer
1. Mathematical set theory based operations and special database operations
2. A