CSC 203.1 Note

This document provides an overview of database management, defining key concepts such as databases, DBMS, and data integrity. It discusses the advantages and disadvantages of using a database system, the functions of a DBMS, and the three-level architecture of databases. Additionally, it introduces the Entity-Relationship model as a method for database design, highlighting its main features and basic concepts.

Uploaded by

moseseko2006
Copyright
© All Rights Reserved

COMPUTER SCIENCE UNIT

CSC 203.1- DATABASE MANAGEMENT


INTRODUCTION TO DATABASE MANAGEMENT
Database:
A database is an organized collection of related data of an organization, stored in a
formatted way and shared by multiple users.
The main features of data in a database are:
1. It is well organized
2. It is related
3. It is accessible in a logical order without any difficulty
4. It is stored only once
For example, consider the roll no, name and address of a student stored in a student file.
This is a collection of related data with an implicit meaning. Data in the database may be
persistent, integrated and shared.
Persistent:
Data is persistent if it can be removed from the database only by an explicit request from
a user.
Integrated:
A database can be a collection of data from different files. When any redundancy
among those files is removed, the database is said to contain integrated data.
Sharing Data:
The data stored in the database can be shared by multiple users simultaneously without
affecting the correctness of data.

Why Database:
A new approach was required to overcome the limitations of a file system, and so the
database approach emerged. A database is a persistent collection of logically related
data. The initial attempts were to provide a centralized collection of data. A database has
a self-describing nature: it contains not only the data but also a description of that data,
and it supports the sharing and integration of an organization's data in a single database.
A small database can be handled manually, but a large database with multiple users is
difficult to maintain by hand. In that case a computerized database is useful.

The advantages of a database system over traditional, paper-based methods of record
keeping are:
⚫ Compactness: No need for large amounts of paper files.
⚫ Speed: The machine can retrieve and modify data faster than a human being.
⚫ Less drudgery: Much of the maintenance of files by hand is eliminated.
⚫ Accuracy: Accurate, up-to-date information is fetched as per the requirement of the
user at any time.
Database Management System (DBMS):
A database management system refers to a set of programs (software) used for the
creation, maintenance and manipulation of a database.
Functions of DBMS:
1. Defining the database schema: The DBMS must provide a facility for defining the
database structure and for specifying access rights for authorized users.
2. Manipulation of the database: The DBMS must provide functions such as insertion of
records into the database, update of data, deletion of data and retrieval of data.
3. Sharing of the database: The DBMS must share data items among multiple users while
maintaining consistency of data.
4. Protection of the database: It must protect the database against unauthorized users.
5. Database recovery: If the system fails for any reason, the DBMS must facilitate
database recovery.
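
The first two functions can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the DBMS; the student table, its columns and the sample rows are made up for illustration and are not part of these notes.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Function 1: defining the schema (hypothetical student table).
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, address TEXT)"
)

# Function 2: manipulation, i.e. insertion, update, deletion and retrieval.
conn.executemany(
    "INSERT INTO student (roll_no, name, address) VALUES (?, ?, ?)",
    [(1, "Sumit", "Lagos"), (2, "Ada", "Enugu"), (3, "Bola", "Ibadan")],
)
conn.execute("UPDATE student SET address = ? WHERE roll_no = ?", ("Abuja", 2))
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))
conn.commit()

rows = conn.execute(
    "SELECT roll_no, name, address FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```

Sharing, protection and recovery are not shown here because they depend on how a particular DBMS is deployed.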

Advantages of DBMS:
Reduction of redundancies:
Centralized control of data by the DBA avoids unnecessary duplication of data and
effectively reduces the total amount of data storage required. Avoiding duplication also
eliminates the inconsistencies that tend to be present in redundant data files.
Sharing of Data:
A database allows the sharing of data under its control by any number of application
programs or users.

Data Integrity:
Data integrity means that the data contained in the database is both accurate and
consistent. Therefore, data values being entered for storage could be checked to ensure
that they fall within a specified range and are of the correct format.
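
A range check of this kind can be sketched as a CHECK constraint. The example assumes Python's sqlite3 module; the person table, the 0 to 150 age range and the helper try_insert are hypothetical choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        name TEXT NOT NULL,
        age  INTEGER NOT NULL CHECK (age BETWEEN 0 AND 150)
    )
    """
)

def try_insert(name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Sumit", 20)   # inside the allowed range
rejected = try_insert("Ada", -5)     # violates the CHECK constraint
stored = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Because the rule lives in the schema, every application that writes to the table is subject to the same check.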

Data Security:
The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that
proper access procedures are followed, including proper authentication for access to the
database system and additional checks before permitting access to sensitive data.
Conflict Resolution:
The DBA resolves the conflicting requirements of various users and applications. The DBA
chooses the best file structure and access method to get optimal performance for the
applications.
Data Independence:
Data independence is usually considered from two points of view: physical data
independence and logical data independence.
Physical Data Independence allows changes in the physical storage devices or
organization of the files to be made without requiring changes in the conceptual view or
any of the external views, and hence in the application programs using the database.
Logical Data Independence indicates that the conceptual schema can be changed without
affecting the existing external schemas or any application program.
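
Logical data independence can be illustrated with a view that keeps working after the underlying table gains a column. This is a minimal sketch assuming Python's sqlite3 module; the employee table, the employee_names view and the added department column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empcode TEXT PRIMARY KEY, ename TEXT)")
conn.execute("INSERT INTO employee VALUES ('e101', 'sumit')")

# External view used by a hypothetical application program.
conn.execute("CREATE VIEW employee_names AS SELECT empcode, ename FROM employee")
before = conn.execute("SELECT * FROM employee_names").fetchall()

# Change the conceptual schema by adding a new attribute.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")

# The application's query against the view is unchanged and returns the same rows.
after = conn.execute("SELECT * FROM employee_names").fetchall()
```

An application that only reads employee_names does not need to be rewritten when the base table grows.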

Disadvantages of DBMS:
1. The cost of DBMS software and hardware (including networking installation) is high.
2. The DBMS adds processing overhead to implement security, integrity and
sharing of the data.
3. Centralized database control
4. Setup of the database system requires more knowledge, money, skills, and time.
5. The complexity of the database may result in poor performance.

DATABASE BASICS
Data Item:
A data item, also called a field in data processing, is the smallest unit of data that
has meaning to its users.

Eg: “e101”, “sumit”

Entities and attributes:

An entity is a thing or object in the real world that is distinguishable from all other objects.
Eg: Bank, employee, student

Attributes are properties of an entity.

Eg: Empcode, ename, rolno, name

Logical data and physical data:

Logical data is the data for the tables created by the user in primary memory.
Physical data refers to the data stored in secondary memory.
Schema and sub-schema:
A schema is a logical database description and is drawn as a chart of the types of data
that are used. It gives the names of the entities and attributes and specifies the relationships
between them.
A database schema includes such information as:
➢ Characteristics of data items such as entities and attributes.
➢ Logical structures and relationships among these data items.
➢ Format for storage representation.
➢ Integrity parameters such as physical authorization and backup policies.

A subschema is a schema derived from an existing schema as per the user requirement.
More than one subschema may be created for a single conceptual schema.

Three Level Architecture of DBMS:

[Figure: three-level architecture. External level (View, View, View) -> mapping supplied
by DBMS -> Conceptual level (Conceptual view) -> mapping supplied by DBMS/OS ->
Internal level]

A database management system that provides three levels of data abstraction is said to
follow the three-level architecture:
⚫ External level
⚫ Conceptual level
⚫ Internal level
External Level:
The external level is the highest level of database abstraction. At this level, there are
many views defined for the requirements of different users. A view describes only a subset of
the database. Any number of user views may exist for a given global schema (conceptual
schema).
For example, each student has a different view of the timetable: the view of a student of
BTech (CSE) is different from the view of a student of BTech (ECE). Thus this level of
abstraction is concerned with different categories of users.
Each external view is described by means of a schema called a subschema.
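
The timetable example can be sketched with two views over one conceptual table. The code assumes Python's sqlite3 module; the timetable table, its columns, the view names and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one table holding the whole timetable.
conn.execute("CREATE TABLE timetable (branch TEXT, day TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO timetable VALUES (?, ?, ?)",
    [("CSE", "Mon", "Databases"), ("ECE", "Mon", "Circuits"), ("CSE", "Tue", "Algorithms")],
)

# External level: each category of user sees only its own subset.
conn.execute(
    "CREATE VIEW cse_timetable AS SELECT day, course FROM timetable WHERE branch = 'CSE'"
)
conn.execute(
    "CREATE VIEW ece_timetable AS SELECT day, course FROM timetable WHERE branch = 'ECE'"
)

cse = conn.execute("SELECT day, course FROM cse_timetable ORDER BY course").fetchall()
ece = conn.execute("SELECT day, course FROM ece_timetable ORDER BY course").fetchall()
```

Each view plays the role of a subschema: it exposes part of the conceptual schema without copying the data.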

Conceptual Level:
At this level of database abstraction all the database entities and the relationships among
them are included. One conceptual view represents the entire database. This conceptual
view is defined by the conceptual schema.
The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations and constraints.
It describes all the records and relationships included in the conceptual view. There is
only one conceptual schema per database. It includes features that specify the checks
needed to retain data consistency and integrity.

Internal level:
It is the lowest level of abstraction closest to the physical storage method used. It indicates
how the data will be stored and describes the data structures and access methods to be
used by the database. The internal view is expressed by internal schema.
The following aspects are considered at this level:
1. Storage allocation, e.g. B-tree, hashing
2. Access paths, e.g. specification of primary and secondary keys, indexes etc.
3. Miscellaneous, e.g. data compression and encryption techniques, optimization of
the internal structures.
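
As a sketch of an access path, the example below adds an index and asks the DBMS which plan it would use before and after. It assumes Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table, the index name idx_student_name and the helper plan are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)", [(i, f"name{i}") for i in range(1000)]
)

query = "SELECT roll_no FROM student WHERE name = ?"

def plan(sql):
    """Return the textual steps of the plan SQLite would use for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("name42",)).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the description

plan_before = plan(query)  # no index on name yet, so the table is scanned
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_after = plan(query)   # the plan now mentions idx_student_name

# The query text itself did not change; only the internal level did.
result = conn.execute(query, ("name42",)).fetchall()
```

This is also a small demonstration of physical data independence: the same query runs unchanged on both storage organizations.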

Database Users:
Naive Users:
Users who need not be aware of the presence of the database system or any other system
supporting their usage are considered naïve users. A user of an automatic teller machine
falls into this category.
Online Users:
These are users who may communicate with the database directly via an online terminal
or indirectly via a user interface and application program. These users are aware of the
database system and also know the data manipulation language of the system.

Application Programmers:
Professional programmers who are responsible for developing the application programs or
user interfaces used by naïve and online users fall into this category.
Database Administration:
The person who has central control over the system is called the database administrator (DBA).
The functions of the DBA are:
1. Creation and modification of conceptual Schema definition
2. Implementation of storage structure and access method.
3. Schema and physical organization modifications.
4. Granting of authorization for data access.
5. Integrity constraints specification.
6. Execute immediate recovery procedure in case of failures
7. Ensure physical security to database

Database language:
1) Data definition language (DDL):
DDL is used to define database objects. The conceptual schema is specified by a set
of definitions expressed in this language. It also gives some details about how to
implement this schema on the physical devices used to store the data. This
definition includes all the entity sets, their associated attributes and their
relationships. The result of DDL statements is a set of tables that are stored in
a special file called the data dictionary.
2) Data Manipulation Language (DML):
A DML is a language that enables users to access or manipulate data stored in the
database. Data manipulation involves retrieval of data from the database, insertion
of new data into the database and deletion of data or modification of existing data.
There are basically two types of DML:
⚫ Procedural: Which requires a user to specify what data is needed and how to
get it.
⚫ Non-Procedural: which requires a user to specify what data is needed without
specifying how to get it.
3) Data Control Language (DCL):
This language enables users to grant and revoke authorization on database
objects.
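
The difference between procedural and non-procedural DML described above can be sketched as follows. The example assumes Python's sqlite3 module, with SQL standing in for the non-procedural language and a hand-written Python loop standing in for the procedural style; the book table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (bookcode TEXT PRIMARY KEY, booktitle TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [("b1", "Databases", 30.0), ("b2", "Networks", 12.5), ("b3", "Compilers", 45.0)],
)

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT booktitle FROM book WHERE price > 20 ORDER BY booktitle"
    )
]

# Procedural style: the program spells out HOW, visiting every record itself.
procedural = []
for bookcode, booktitle, price in conn.execute("SELECT bookcode, booktitle, price FROM book"):
    if price > 20:
        procedural.append(booktitle)
procedural.sort()
```

Both approaches give the same answer; the non-procedural form leaves the choice of access path to the DBMS.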

ELEMENTS OF DBMS

DML Pre-Compiler:
It converts DML statements embedded in an application program to normal procedure
calls in the host language. The pre-compiler must interact with the query processor in
order to generate the appropriate code.
DDL Compiler:
The DDL compiler converts the data definition statements into a set of tables. These tables
contain information concerning the database and are in a form that can be used by other
components of the dbms.
File Manager:
File manager manages the allocation of space on disk storage and the data structure used
to represent information stored on disk.
Database Manager:
A database manager is a program module which provides the interface between the low
level data stored in the database and the application programs and queries submitted to
the system.
The responsibilities of database manager are:
1. Interaction with File Manager: The database manager is responsible for
the actual storing, retrieving and updating of data in the database.
2. Integrity Enforcement: The data values stored in the database must satisfy
certain constraints (eg: the age of a person can't be less than zero). These
constraints are specified by the DBA. The database manager checks the constraints and
stores the data in the database only if they are satisfied.
3. Security Enforcement: The database manager enforces the security measures that protect
the database from unauthorized users.

4. Backup and Recovery: The database manager detects failures that occur due to
different causes (like disk failure, power failure, deadlock, software error) and
restores the database to its original state.
5. Concurrency Control: When several users access the same database file
simultaneously, data inconsistency is possible. It is the responsibility
of the database manager to control the problems that occur with concurrent transactions.
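
The idea of restoring the database to its earlier state when something goes wrong can be sketched with a transaction. The example assumes Python's sqlite3 module; the account table, the balances and the transfer function are hypothetical. It shows rollback after a failed operation only, not recovery from disk or power failure and not concurrency control between several users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 100)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("A-101", "A-102", 200)       # succeeds
failed = transfer("A-101", "A-102", 1000)  # would overdraw A-101, so it is undone
balances = dict(conn.execute("SELECT account_number, balance FROM account"))
```

In the failed transfer the credit to A-102 has already been applied when the debit is rejected, and the rollback removes it again.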
Query Processor: The query processor uses the data dictionary to find the details of the data
files and, using this information, creates a query plan (access plan) to execute the query.
Data Dictionary: The data dictionary is the table that contains information about database
objects. It contains information such as:
1. external, conceptual and internal database descriptions
2. descriptions of entities and attributes, as well as the meaning of data elements
3. synonyms, authorization and security codes
4. database authorization
The data stored in the data dictionary is called metadata.
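
Many systems let users query this metadata like ordinary data. As a sketch, assuming Python's sqlite3 module, SQLite keeps its catalog in a built-in table named sqlite_master and exposes column details through PRAGMA table_info; the employee table and employee_names view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empcode TEXT PRIMARY KEY, ename TEXT NOT NULL)")
conn.execute("CREATE VIEW employee_names AS SELECT ename FROM employee")

# sqlite_master is SQLite's catalog: one row per table, view, index or trigger.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE type IN ('table', 'view') ORDER BY name"
).fetchall()

# PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]
```

Other DBMS products store comparable information under different names, so the catalog queries here are specific to SQLite.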

DBMS STRUCTURE

[Figure: DBMS structure. Users: naïve user, application programmers, online user, DBA.
Interfaces: application programs, system calls, DDL compiler. DBMS components: application
program object code, DML precompiler, query processor, DDL compiler, database manager,
file manager. Storage: data file, data dictionary.]

ER-MODEL

Data Model:
The data model describes the structure of a database. It is a collection of conceptual tools
for describing data, data relationships and consistency constraints. There are various
types of data models, such as:
1. Object based logical model
2. Record based logical model
3. Physical model

Types of data model:

1. Object based logical model
a. ER-model
b. Functional model
c. Object oriented model
d. Semantic model
2. Record based logical model
a. Hierarchical database model
b. Network model
c. Relational model
3. Physical model

Entity Relationship Model (ER Model)


The entity-relationship data model perceives the real world as consisting of basic objects,
called entities and relationships among these objects. It was developed to facilitate
database design by allowing specification of an enterprise schema which represents the
overall logical structure of a data base.

Main Features of ER-MODEL:


• Entity relationship model is a high-level conceptual model
• It allows us to describe the data involved in a real-world enterprise in terms of
objects and their relationships.
• It is widely used to develop an initial design of a database
• It provides a set of useful concepts that make it convenient for a developer to move
from a basic set of information to a detailed description of information that can
be easily implemented in a database system
• It describes data as a collection of entities, relationships and attributes.

Basic Concepts:
The E-R data model employs three basic notions: entity sets, relationship sets and
attributes.
Entity Sets: An entity is a “thing” or “object” in the real world that is distinguishable from
all other objects. For example, each person in an enterprise is an entity. An entity has a set
of properties, and the values for some set of properties may uniquely identify an entity. For
example, BOOK is an entity and its properties (called attributes) are bookcode, booktitle, price etc.

An entity set is a set of entities of the same type that share the same properties, or
attributes. For example, the set of all persons who are customers at a given bank is an entity set.

Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties
possessed by each member of an entity set (entity).

For example, Customer is an entity and its attributes are customerid, customername, custaddress etc.

Relationship:
A relationship is an association among entities. Relationships are usually expressed as verbs
such as assign, associate, track etc. A relationship provides useful information that could not
be easily discerned from the entity types alone.

[Figure: customer --- borrow --- loan]

In the above diagram, borrow denotes the relationship between the two entities customer
and loan.

Cardinality Ratio:
Cardinality ratios express the number of entities to which another entity can be associated
via a relationship set. It describes the number of relationship instances in which an entity
can participate. Types of cardinality ratio:

1. One to One:
An entity in A is associated with at most one entity in B, and an entity in B is associated
with at most one entity in A.
Eg: relationship between college and principal

[Figure: college (1) --- has --- (1) principal]

2. One to Many:
An entity in A is associated with any number of entities in B. An entity in B is associated
with at most one entity in A.
Eg: Relationship between department and faculty

[Figure: department (1) --- contains --- (M) faculty]

3. Many to One:
An entity in A is associated with at most one entity in B. An entity in B is associated with
any number of entities in A.

[Figure: employee (M) --- works in --- (1) department]

4. Many to Many:
Entities in A and B are associated with any number of entities from each other.

[Figure: customer (M) --- owns --- (M) account]
More about Entities and Relationship:

Participation Constraints:
The participation constraints specify the number of instances of an entity that can
participate in a relationship set.

a) Total: When all the entities from an entity set participate in a relationship type,
it is called total participation. For example, the participation of the entity set student in the
relationship set ‘opts’ is said to be total because every enrolled student must opt for a course.

b) Partial: When it is not necessary for all the entities from an entity set to participate
in a relationship type, it is called partial participation. For example, the participation of
the entity set student in ‘represents’ is partial, since not every student in a class is a class
representative.
Weak Entity:
Entity types that do not contain any key attribute, and hence cannot be identified
independently, are called weak entity types. A weak entity can be uniquely identified
only by considering some of its attributes in conjunction with the primary key attribute
of another entity, which is called the identifying owner entity.

Generally, a partial key is attached to a weak entity type and is used for unique
identification of the weak entities related to a particular owner entity. The following
restrictions must hold:
1. The owner entity set and the weak entity set must participate in a one-to-many relationship
set. This relationship set is called the identifying relationship set of the weak entity set.
2. The weak entity set must have total participation in the identifying relationship.
Example:
Consider the entity type Dependent related to the Employee entity, which is used to keep
track of the dependents of each employee. The attributes of Dependent are: name,
birthdate, sex and relationship. Each employee entity is said to own the dependent
entities that are related to it. However, note that the ‘Dependent’ entity does not exist on its
own; it is dependent on the Employee entity.
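
A possible table design for this example is sketched below, assuming Python's sqlite3 module. Treating name as the partial key, using empno as the employee key and deleting dependents together with their owner are assumptions made for illustration; the notes do not fix these choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dependent (
        empno        INTEGER NOT NULL REFERENCES employee (empno) ON DELETE CASCADE,
        name         TEXT NOT NULL,   -- partial key (assumed)
        birthdate    TEXT,
        sex          TEXT,
        relationship TEXT,
        PRIMARY KEY (empno, name)     -- owner key plus partial key
    );
    INSERT INTO employee VALUES (1, 'sumit'), (2, 'ada');
    INSERT INTO dependent VALUES (1, 'tolu', '2015-04-02', 'F', 'daughter');
    INSERT INTO dependent VALUES (2, 'tolu', '2018-09-15', 'M', 'son');
    """
)

def orphan_insert_rejected():
    """A dependent whose owner employee does not exist must be refused."""
    try:
        with conn:
            conn.execute("INSERT INTO dependent VALUES (99, 'x', NULL, NULL, NULL)")
        return False
    except sqlite3.IntegrityError:
        return True

rejected = orphan_insert_rejected()
with conn:
    conn.execute("DELETE FROM employee WHERE empno = 1")  # cascades to its dependents
remaining = conn.execute("SELECT empno, name FROM dependent").fetchall()
```

Two dependents may share the name 'tolu' because the name only has to be unique among the dependents of one employee. The NOT NULL foreign key expresses the total participation of Dependent in the identifying relationship.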

ER-DIAGRAM:

The overall logical structure of a database can be expressed graphically using the ER model
with the help of an ER diagram.

Symbols used in ER- diagram:

[Figure: symbols used in ER diagrams for entity, weak entity, attribute, key attribute,
composite attribute, multivalued attribute, derived attribute, relationship, identifying
relationship, one-to-one (1:1), one-to-many (1:m), many-to-one (m:1), many-to-many (m:n),
total participation and partial participation]

A university registrar's office maintains data about the following entities:

(a) Courses, including number, title, credits, syllabus and prerequisites
(b) Course offerings, including course number, year, semester, section number, instructor,
timings and classroom
(c) Students, including student-id, name and program
(d) Instructors, including identification number, name, department and title
Further, the enrollment of students in courses and the grades awarded to students in each
course they are enrolled for must be appropriately modeled.
Construct an E-R diagram for the registrar's office. Document all assumptions that you
make about the mapping constraints.

Consider a university database for the scheduling of classrooms for final exams. This
database could be modeled as the single entity set exam, with attributes course-name,
section-number, room-number and time. Alternatively, one or more additional
entity sets could be defined, along with relationship sets to replace some of the attributes
of the exam entity set, as
• course with attributes name, department and c-number
• section with attributes s-number and enrollment, and dependent as a weak entity
set on course
• room with attributes r-number, capacity and building

LECTURE-7: Advanced ER-Diagram:

Abstraction is the simplification mechanism used to hide superfluous details of a set of
objects. It allows one to concentrate on the properties that are of interest to the application.
There are two main abstraction mechanisms used to model information: generalization
(with its inverse, specialization) and aggregation.

Generalization and specialization:

Generalization is the abstracting process of viewing sets of objects as a single general class
by concentrating on the general characteristics of the constituent sets while suppressing
or ignoring their differences. It is the union of a number of lower-level entity types for
the purpose of producing a higher-level entity type. For instance, student is a
generalization of graduate or undergraduate, full-time or part-time students. Similarly,
employee is a generalization of the classes of objects cook, waiter and cashier.
Generalization is an IS_A relationship; therefore, manager IS_AN employee, cook IS_AN
employee, waiter IS_AN employee, and so forth.

Specialization is the abstracting process of introducing new characteristics to an existing
class of objects to create one or more new classes of objects. This involves taking a
higher-level entity and, using additional characteristics, generating lower-level entities. The
lower-level entities also inherit the characteristics of the higher-level entity. In applying the
characteristic size to car we can create a full-size, mid-size, compact or subcompact car.
Specialization may be seen as the reverse process of generalization: additional specific
properties are introduced at a lower level in a hierarchy of objects.

[Figure: generalization/specialization hierarchy. employee (empno, name, dob) is linked by
"Is" to full-time employee and part-time employee; full-time employee is linked by "Is" to
faculty and staff; part-time employee is linked by "Is" to teaching and casual.]

EMPLOYEE(empno, name, dob)
FULL_TIME_EMPLOYEE(empno, salary)
PART_TIME_EMPLOYEE(empno, type)
Faculty(empno, degree, interest)
Staff(empno, hour-rate)
Teaching(empno, stipend)
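
The relation schemas above can be sketched as tables in which every lower-level entity reuses the key of its higher-level entity. The code assumes Python's sqlite3 module and shows only the employee, full-time employee and faculty branch; the data types, the foreign keys and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, dob TEXT);
    -- A full-time employee IS AN employee: same key, extra attribute.
    CREATE TABLE full_time_employee (
        empno  INTEGER PRIMARY KEY REFERENCES employee (empno),
        salary INTEGER
    );
    -- A faculty member IS A full-time employee.
    CREATE TABLE faculty (
        empno    INTEGER PRIMARY KEY REFERENCES full_time_employee (empno),
        degree   TEXT,
        interest TEXT
    );
    INSERT INTO employee VALUES (1, 'sumit', '1980-01-01');
    INSERT INTO full_time_employee VALUES (1, 50000);
    INSERT INTO faculty VALUES (1, 'PhD', 'databases');
    """
)

# Inherited attributes are recovered by joining up the hierarchy on empno.
row = conn.execute(
    """
    SELECT e.name, ft.salary, f.degree
    FROM faculty AS f
    JOIN full_time_employee AS ft ON ft.empno = f.empno
    JOIN employee AS e ON e.empno = ft.empno
    """
).fetchone()
```

The join shows how a faculty member inherits name from employee and salary from full-time employee without storing them twice.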

Aggregation:
Aggregation is the process of compiling information on an object, thereby abstracting a
higher-level object. For example, the entity person is derived by aggregating the characteristics
name, address and ssn. Another form of aggregation is abstracting a relationship between
objects and viewing the relationship as an object.

[Figure: aggregation example. Entities: Employee, Job, Branch, Manager. Relationships:
Works on, Manages.]

ER- Diagram For College Database

RELATIONAL MODEL

RELATIONAL MODEL
The relational model is a simple model in which a database is represented as a collection of
“relations”, where each relation is represented by a two-dimensional table.

The relational model was proposed by E. F. Codd of IBM in 1970. The basic concept in
the relational model is that of a relation.

Properties:
o It is column homogeneous. In other words, in any given column of a table, all items
are of the same kind.
o Each item is a simple number or a character string. That is, a table must be in first
normal form.
o All rows of a table are distinct. The ordering of rows within a table is immaterial.
o The columns of a table are assigned distinct names, and the ordering of these
columns is immaterial.

Domain, attributes, tuples and relations:

Tuple:
Each row in a table represents a record and is called a tuple. A row of a table containing ‘n’
attributes is called an n-tuple.

Attributes:
The name of each column in a table is used to interpret its meaning and is called an
attribute. Each table is called a relation. For example, in an account table with the columns
account_number, branch name and balance, those column names are the attributes.

Domain:
A domain is a set of values that can be given to an attribute. So, every attribute in a table
has a specific domain. Values to these attributes cannot be assigned outside their
domains.

Keys:

Super key:
A super key is an attribute or a set of attributes used to identify the records uniquely in a
relation. For example, customer-id, (cname, customer-id), (cname,telno)

Candidate key:
Super keys of a relation can contain extra attributes. Candidate keys are minimal super
keys, i.e. such a key contains no extraneous attribute. An attribute is called extraneous if,
even after it is removed from the key, the remaining attributes still have the properties
of a key (that is, they still identify each row of the table uniquely).

In a relation R, a candidate key for R is a subset of the set of attributes of R which has
the following properties:
Uniqueness: No two distinct tuples in R have the same values for the candidate key.
Irreducibility: No proper subset of the candidate key has the uniqueness property.

A candidate key’s values must exist. They can’t be null.

The values of a candidate key must be stable. Its value cannot change outside the
control of the system. Eg: (cname, telno)
Primary key:
The primary key is the candidate key that is chosen by the database designer as the
principal means of identifying entities within an entity set. The remaining candidate
keys, if any, are called alternate keys.
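
The uniqueness and irreducibility properties can be checked mechanically on sample data. The sketch below is plain Python; the function names and the three customer rows are made up. Note the limitation: a key is a property of the relation schema that must hold for every legal set of tuples, so a check on one sample can only rule keys out, never prove them.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes (uniqueness)."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, attributes):
    """Return the minimal superkeys of this particular set of rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Irreducibility: skip any set that contains a smaller key already found.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

customers = [
    {"customer_id": 1, "cname": "ada", "telno": "111"},
    {"customer_id": 2, "cname": "ada", "telno": "222"},
    {"customer_id": 3, "cname": "bola", "telno": "111"},
]
keys = candidate_keys(customers, ["customer_id", "cname", "telno"])
print(keys)
```

On this sample, (cname, customer-id) is a super key but not a candidate key, because customer-id alone is already unique.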
