
UNIT-I

INTRODUCTION
DEFINITION:
A database-management system (DBMS) consists of a collection of interrelated data
and a set of programs to access those data.
• The collection of data is usually referred to as the database.
• The primary goal of a DBMS is to provide an environment that is both
convenient and efficient to use in retrieving and storing database information.

1.1. DATABASE SYSTEM APPLICATIONS:

Databases are widely used. Some representative applications are:

• Banking: For customer information, accounts, loans, and banking transactions.
• Airlines: For reservations and schedule information. Airlines were among the
first to use databases in a geographically distributed manner.
• Universities: For student information, course registrations, and grades.
• Credit card transactions: For purchases on credit cards and generation of
monthly statements.
• Telecommunication: For keeping records of calls made, generating monthly
bills, maintaining balances on prepaid calling cards, and storing information
about the communication networks.
• Finance: For storing information about holdings, sales and purchases of financial
instruments such as stocks and bonds; also for storing real-time market data to
enable online trading by customers and automated trading by the firm.
• Sales: For customer, product, and purchase information.
• Online retailers: For the sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations.
• Manufacturing: For management of the supply chain and for tracking
production of items in factories, inventories of items in warehouses and stores,
and orders for items.
• Human resources: For information about employees, salaries, payroll taxes,
benefits, and for generation of paychecks.

1.2. PURPOSE OF DATABASE SYSTEM

• Before database systems, one way to keep information on a computer was to store
it in permanent system files. Consider a bank that keeps information about its
customers and accounts in this way.
• To allow users to manipulate the information, the system has a number of
application programs that manipulate the files, including:
o A program to debit or credit an account.
o A program to add a new account.
o A program to find the balance of an account.
o A program to generate monthly statements.
• New application programs are added to the system as the need arises.
• Suppose the bank decides to offer checking accounts. The bank then creates new
permanent files that contain information about all the checking accounts
maintained in the bank, and it needs to write new application programs.
• This typical file-processing system is supported by a conventional operating
system. The system stores permanent records in various files, and it needs
different application programs to extract records from, and add records to, the
appropriate files.

THE MAJOR DISADVANTAGES OF THE FILE PROCESSING SYSTEM ARE

Data redundancy and inconsistency:
• Different programmers create the files and application programs.
• The files may have different formats and the programs may be written in
different languages, so the same information may be duplicated in several files.
This redundancy leads to higher storage and access cost, and to inconsistency
when the copies of the same data no longer agree.
Data isolation:
• Because the data are scattered in various files, writing a new application program
to retrieve the appropriate data is difficult.
Integrity problems:
• The data values stored in the database must satisfy certain types of consistency
constraints.
• Developers enforce these constraints by adding code to the application programs.
When new constraints are added, it becomes difficult to change the programs.
Ex: The balance of an account should never fall below rupees 250.
Atomicity problems:
• An action or event should either happen entirely or not happen at all.
• If the computer system fails, the data should be restored to the
consistent state that existed prior to the failure.
Data anomalies:
• Any change to a field value must be made correctly in every place where the
value is stored, to maintain data integrity.
• Data anomalies may exist in the form of:
o Modification anomalies.
o Insertion anomalies.
o Deletion anomalies.
Security problems:
• Not all data should be accessible to every person.
Ex: Payroll personnel need to see only the employee data and should not have access
to information about account details.

1.3. VIEW OF DATA


A DBMS is a collection of interrelated files and a set of programs that allow users to
access and modify these files.
Data Abstraction:
• For the system to be usable, it must retrieve data efficiently.
• Many database-system users are not computer trained, so developers hide the
complexity from users through several levels of abstraction, to simplify users’
interactions with the system.
The three levels of data abstraction
Physical level:
• The lowest level of abstraction describes how the data are actually stored.
• At the physical level, complex low-level data structures are described in
detail.

Logical level:
• The next higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data.
• The entire database is thus described in terms of a small number of relatively
simple structures.
• The logical level of abstraction is used by database administrators, who must
decide what information is to be kept in the database.
View level:
• The highest level of abstraction describes only part of the entire database.
• Most users need to access only a part of the database. The view level of
abstraction is defined so that their interaction with the system is simplified.
• The system may provide many views for the same database.
EX: In a Pascal-like language, a record may be declared as follows.

type customer = record
    customer_name   : string;
    social_security : string;
    customer_street : string;
    customer_city   : string;
end;

This code defines a new record type called customer with four fields. Each field has
a name and a type associated with it.
• At the physical level, a customer, account or employee record can be described
as a block of consecutive storage locations.
• At the logical level, each such record is described by a type definition.
• At the view level, computer users see a set of application programs that hide
details of the data types.
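The three levels can be illustrated with a small sketch. It is not part of the original notes: it assumes Python's built-in sqlite3 module, a hypothetical view name customer_directory, and made-up street and city values. The table definition plays the role of the logical level, the view plays the role of the view level, and the physical level is left entirely to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the record type becomes a table definition.
conn.execute(
    """CREATE TABLE customer (
           customer_name   TEXT,
           social_security TEXT,
           customer_street TEXT,
           customer_city   TEXT
       )"""
)
# Sample row; the street and city values are made up for illustration.
conn.execute("INSERT INTO customer VALUES ('Hayes', '1001', 'Main', 'Harrison')")

# View level: expose only part of the database to one class of users.
conn.execute(
    "CREATE VIEW customer_directory AS "
    "SELECT customer_name, customer_city FROM customer"
)

directory_rows = conn.execute("SELECT * FROM customer_directory").fetchall()
print(directory_rows)
```

Users of customer_directory never see the social_security column, which is the kind of simplification the view level is meant to provide.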
Instances and Schemas:
• The collection of information stored in the database at a particular moment is
called an instance of the database.
• The overall design of the database is called the database schema.
• A schema is analogous to a type definition in a programming language, and an
instance is analogous to the value of a variable at a given moment. A variable of
the customer type is declared in a Pascal-like language as follows:
var customer1 : customer;
• The variable customer1 now corresponds to an area of storage containing a
customer type record.
• Database systems have several schemas, partitioned according to the levels of
abstraction:
o Physical schema: at the lowest level.
o Logical schema: at the intermediate level.
o Subschema: at the highest level.
• In general, a database system supports one physical schema, one logical schema
and several subschemas.
Data Independence:
• The ability to modify a schema definition at one level without affecting a schema
definition at the next higher level is called data independence.
• There are two levels of data independence.
Physical data independence:
• It is the ability to modify the physical schema without causing application
programs to be rewritten.
• Modifications at the physical level are occasionally necessary to improve
performance.
Logical data independence:
• It is the ability to modify the logical schema without causing application
programs to be rewritten.
• Modifications at the logical level are necessary whenever the logical structure of
the database is altered.
• Logical data independence is more difficult to achieve than physical data
independence, since application programs are heavily dependent on the logical
structure of the data.
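Physical data independence can be seen in a minimal sketch, assuming Python's built-in sqlite3 module. The account numbers, the index name idx_account_branch and the second branch are hypothetical. An index is added (a change to the physical organization) while the query text used by the application stays exactly the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (branch_name TEXT, account_number TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("Brighton", "A-1", 500), ("Downtown", "A-2", 900)],
)

# The "application program": its query text never changes.
QUERY = "SELECT account_number FROM account WHERE branch_name = 'Brighton'"

before = conn.execute(QUERY).fetchall()

# Physical-level change: add an access path to speed up lookups by branch.
conn.execute("CREATE INDEX idx_account_branch ON account (branch_name)")

after = conn.execute(QUERY).fetchall()
print(before, after)
```

The query returns the same rows before and after the index is created; only the way the system finds them may differ.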

1.4. DATABASE LANGUAGES

A database system provides two different types of languages:

• A data-definition language (DDL), to specify the database schema.
• A data-manipulation language (DML), to express database queries and updates.
1.4.1 DATA-DEFINITION LANGUAGE (DDL)

• A database schema is specified by a set of definitions expressed in a special
language called a data-definition language (DDL).
• The result of compiling DDL statements is a set of tables that is stored in a
special file called the data dictionary (DD) or data directory.
• A data dictionary is a file that contains metadata, that is, data about data.
• The storage structure and access methods used by the database system are
specified by a set of definitions in a special type of DDL called a data storage
and definition language.
• Database systems concentrate on integrity constraints that can be tested with
minimal overhead:

o Domain constraints
o Referential integrity
o Assertions
o Authorization
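As an illustration of DDL statements that carry integrity constraints, here is a minimal sketch assuming Python's built-in sqlite3 module and SQL as the DDL. The column types, the sample values and the helper try_insert are assumptions made for the example; the minimum balance of 250 is the constraint mentioned earlier in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL: schema definitions, including two kinds of integrity constraint.
conn.executescript(
    """
    CREATE TABLE branch (
        branch_name TEXT PRIMARY KEY,
        branch_city TEXT,
        asset       INTEGER
    );
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        branch_name    TEXT REFERENCES branch(branch_name),  -- referential integrity
        balance        INTEGER CHECK (balance >= 250)         -- constraint on values
    );
    """
)
conn.execute("INSERT INTO branch VALUES ('Brighton', 'Brooklyn', 7100000)")
conn.execute("INSERT INTO account VALUES ('A-201', 'Brighton', 500)")


def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


low_balance = try_insert(("A-202", "Brighton", 100))     # violates the CHECK
unknown_branch = try_insert(("A-203", "Nowhere", 900))   # violates the reference
account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(low_balance, unknown_branch, account_count)
```

Because the constraints are declared in the schema, no application program has to repeat the checks in its own code.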

1.4.2 DATA-MANIPULATION LANGUAGE

Data manipulation means:

o The retrieval of information stored in the database
o The insertion of new information into the database
o The deletion of information from the database
o The modification of information stored in the database
The goal is to provide efficient human interaction with the system. A DML is
a language that enables users to access or manipulate data as organized by the
appropriate data model. There are two types.
Non-procedural DMLs or declarative DMLs:
These require a user to specify what data are needed, without specifying
how to get those data.
Procedural DMLs:
These require a user to specify what data are needed and how to get those
data.
• A query is a statement requesting the retrieval of information.
• The portion of a DML that involves information retrieval is called a “query
language”. In practice, the terms query language and DML are often used
synonymously.
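The difference between the two styles can be sketched as follows. This is an illustration only: it assumes Python with the built-in sqlite3 module, a plain Python loop stands in for a procedural DML, SQL stands in for a declarative DML, and the account data are hypothetical.

```python
import sqlite3

accounts = [
    ("Brighton", "A-1", 500),
    ("Downtown", "A-2", 900),
    ("Brighton", "A-3", 750),
]

# Procedural style: state what is wanted AND how to get it, record by record.
procedural = []
for branch_name, account_number, balance in accounts:
    if branch_name == "Brighton":
        procedural.append(account_number)

# Declarative style: state only what is wanted; the system chooses how.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (branch_name TEXT, account_number TEXT, balance INTEGER)"
)
conn.executemany("INSERT INTO account VALUES (?, ?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account "
        "WHERE branch_name = 'Brighton' ORDER BY account_number"
    )
]
print(procedural, declarative)
```

Both approaches produce the same answer, but only the declarative query leaves the choice of access strategy to the database system.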

1.5. TRANSACTION MANAGEMENT

Often, several operations on the database form a single logical unit of work. An
example is a funds transfer, in which one account (say A) is debited and another
account (say B) is credited.

ATOMICITY:
• Either both the credit and the debit occur, or neither occurs. The funds transfer
must happen in its entirety or not at all. This all-or-none requirement is called
“atomicity”.
CONSISTENCY:
• The value of the sum A+B must be preserved by the transfer. This correctness
requirement is called “consistency”.
DURABILITY:
• After the successful execution of a funds transfer, the new values of accounts A
and B must persist. This persistence requirement is called “durability”.
TRANSACTION:
• A transaction is a collection of operations that performs a single logical function
in a database application.
• Each transaction is a unit of both atomicity and consistency. Transactions must
therefore not violate any database consistency constraints.
• Ensuring the atomicity and durability properties is the responsibility of the
database system itself, specifically of the transaction-management component.
• When several transactions update the database concurrently, the consistency of
data may no longer be preserved, even though each individual transaction is correct.
• It is the responsibility of the concurrency-control manager to control the
interaction among the concurrent transactions.
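The funds-transfer example can be sketched as below, assuming Python's built-in sqlite3 module. The function names transfer and balances, the starting balances and the non-negative balance rule are assumptions made for the illustration. The credit is deliberately applied first, so that a failed debit shows the earlier credit being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 300)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one transaction; undo both on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


first_ok = transfer(conn, "A", "B", 200)     # succeeds: A=300, B=500
second_ok = transfer(conn, "A", "B", 1000)   # debit would make A negative
final = balances(conn)
print(first_ok, second_ok, final)
```

After the failed second transfer the credit to B has been rolled back as well, so the sum A+B is the same as before the transfers began.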

1.6. DATABASE ARCHITECTURE

System Structure:
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
The functional components of a database system can be broadly divided into
query processor components and storage manager components.
The query processor components include:
DML compiler:
Translates DML statements in a query language into low-level instructions that
the query evaluation engine understands.
Embedded DML precompiler:
Converts DML statements embedded in an application program to normal
procedure calls in the host language.
DDL interpreter:
Interprets DDL statements and records them in a set of tables containing
metadata.
Query evaluation engine:
Executes low-level instructions generated by the DML compiler.
The storage manager components include:
Authorization and integrity manager:
Tests for the satisfaction of integrity constraints and checks the authority of
users to access data.
Transaction manager:
Ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without
conflicting.
File manager:
Manages the allocation of space on disk storage and the data structures used
to represent information stored on disk.

Buffer manager:
Is responsible for fetching data from disk storage into main memory, and for
deciding what data to cache in memory.
Several data structures are required as part of the physical system implementation:
Data files:
Store the database itself.
Data dictionary:
Stores metadata about the structure of the database. The data dictionary is
used heavily.
Indices:
Provide fast access to data items that hold particular values.
Statistical data:
Store statistical information about the data in the database.
Database applications are usually partitioned into two or three parts.

[Figure: Two-tier architecture & Three-tier architecture. Two-tier: user, application
(client), network, database system (server). Three-tier: user, application client
(client), network, application server, database system (server).]

1.7. DATABASE USERS AND ADMINISTRATORS

o The primary goal of a database system is to provide an environment for
retrieving information from, and storing new information into, the database.
o There are four different types of database-system users.
Naïve users:
o They are unsophisticated users who interact with the system by invoking
one of the permanent application programs that have been written
previously.
o (Ex) A bank teller, or a customer using an ATM.
Application programmers:
o Application programmers are computer professionals who interact
with the system through DML calls, which are embedded in a program
written in a host language (Ex: COBOL, PL/I, C).
o These programs are commonly referred to as application programs.
o (Ex) A banking system includes programs that generate payroll checks,
debit accounts, credit accounts, or transfer funds between accounts.
o A special preprocessor, called the DML precompiler, converts the DML
statements to normal procedure calls in the host language.
o There are special types of programming languages that combine control
structures of Pascal-like languages with control structures for the
manipulation of a database object (for example, relations). These
languages are sometimes called fourth-generation languages (4GL).
Sophisticated users:
o These users interact with the system without writing programs. Instead, they
form their requests in a database query language. Each such query is
submitted to a query processor.
Specialized users:
o They are sophisticated users who write specialized database applications
that do not fit into the traditional data-processing framework.
o These applications include computer-aided design (CAD) systems, knowledge-
based and expert systems, and systems that store data with complex data types.
Database Administrator:
One of the main reasons for using a DBMS is to have central control of both the data
and the programs that access those data. The person who has such central control
over the system is called the database administrator (DBA).
o The functions of the DBA include:
1. Schema definition:
o The DBA creates the original database schema by writing a set of
definitions that is translated by the DDL compiler to a set of tables
that is stored permanently in the data dictionary.
2. Storage structure and access-method definition:
o The DBA creates appropriate storage structures and access methods
by writing a set of definitions, which is translated by the data-storage
and data-definition-language compiler.
3. Schema and physical-organization modification:
o The DBA carries out the relatively rare modifications either to
the database schema or to the description of the physical storage
organization.
4. Granting of authorization for data access:
o The granting of different types of authorization allows the database
administrator to regulate which parts of the database various users
can access.
5. Integrity-constraint specification:
o The data values stored in the database must satisfy certain
consistency constraints.
o The constraints must be specified explicitly by the database
administrator.

1.8 STRUCTURE OF RELATIONAL DATABASES


1. A relational database consists of a collection of tables, each of which is assigned
a unique name.
2. Each table has a structure similar to the tables used to represent E-R databases.
3. A row in a table represents a relationship among a set of values. Since a table
is a collection of such relationships, there is a close correspondence between the
concept of table and the mathematical concept of relation, from which the
relational data model takes its name.

Basic structure:
• Consider the account table. It has three column headers: branch-name,
account-number and balance.
• In the terminology of the relational model, these headers are called attributes.
For each attribute there is a set of permitted values, called the domain of that
attribute.
• (Ex) For the attribute branch-name, the domain is the set of all branch names.
• Let D1 denote this set, D2 the set of all account numbers, and D3 the set of all
balances.
• Any row of account must consist of a 3-tuple (v1, v2, v3), where
o v1 is a branch name (v1 is in D1)
o v2 is an account number (v2 is in D2)
o v3 is a balance (v3 is in D3)
• In general, account contains only a subset of the set of all possible rows, so
account is a subset of D1 × D2 × D3.
• In general, a table of n attributes must be a subset of
D1 × D2 × ... × Dn-1 × Dn
• Because tables are essentially relations, we use the mathematical terms relation
and tuple in place of the terms table and row.
• The account relation contains several tuples. Let the tuple variable t refer to
the first tuple of the relation.
o t[branch-name] denotes the value of t on the branch-name attribute
o t[branch-name] = “Brighton”
o t[balance] = 500
o t[1] denotes the value of tuple t on the first attribute (branch-name)
o t[2] denotes account-number, and so on
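The idea that a relation is a subset of a Cartesian product of domains can be checked with a short sketch. It assumes Python's standard library, deliberately tiny domains, and hypothetical account numbers. Note that Python positions start at 0, whereas the t[1], t[2] notation above starts at 1.

```python
from collections import namedtuple
from itertools import product

# Deliberately tiny domains, for illustration only.
D1 = {"Brighton", "Downtown"}   # domain of branch-name
D2 = {"A-101", "A-217"}         # domain of account-number
D3 = {500, 750}                 # domain of balance

Account = namedtuple("Account", ["branch_name", "account_number", "balance"])

# The account relation: a set of 3-tuples (v1, v2, v3).
account = {
    Account("Downtown", "A-101", 750),
    Account("Brighton", "A-217", 500),
}

all_possible_rows = set(product(D1, D2, D3))   # D1 x D2 x D3
is_relation = account <= all_possible_rows     # subset test

t = Account("Brighton", "A-217", 500)
print(len(all_possible_rows), is_relation, t.branch_name, t.balance, t[0])
```

Here t.branch_name corresponds to t[branch-name] in the notes, and t[0] corresponds to t[1].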
Database schema:
o We must differentiate between the database schema, which is the logical design
of the database, and a database instance, which is a snapshot of the data in the
database at a given instant in time.
o It is convenient to give a name to a relation schema. We use lowercase names
for relations, and names beginning with an uppercase letter for relation schemas.
o In this notation, Account-schema denotes the relation schema for relation
account.
o Account-schema = (branch-name, account-number, balance)
o We denote the fact that account is a relation on Account-schema by
account(Account-schema).
o In general, a relation schema comprises a list of attributes and their
corresponding domains.
o Consider the branch relation:
o Branch-schema = (branch-name, branch-city, asset)
o Note that the attribute branch-name appears in both Branch-schema and
Account-schema. Using common attributes in relation schemas is one way of
relating tuples of distinct relations.
Keys:
There are three types of keys:
• Super keys
• Candidate keys
• Primary keys
o Let R be a relation schema. A subset K of R is a super key for R if, restricting
consideration to relations r(R), no two distinct tuples have the same values on
all attributes in K. That is, if t1 and t2 are in r and t1 != t2, then t1[K] != t2[K].
o A candidate key is a super key for which no proper subset is a super key. The
primary key is the candidate key chosen by the database designer.
o In Branch-schema, {branch-name} and {branch-name, branch-city} are both super
keys. {branch-name, branch-city} is not a candidate key, because its subset
{branch-name} is itself a super key. {branch-name} is a candidate key, and for our
purposes it also serves as the primary key.
o The attribute branch-city is not a super key, since two branches in the same city
may have different names.
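The super-key condition can be tested against a sample relation with a short sketch, assuming plain Python. The function names is_superkey and candidate_keys and the sample branch rows are hypothetical. A key is a property of the schema, so one instance can only show that a set of attributes is not a key; it cannot prove that it is one.

```python
from itertools import combinations


def is_superkey(rows, key):
    """True if no two distinct rows agree on every attribute in key."""
    seen = set()
    for row in rows:
        value = tuple(row[attr] for attr in key)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Super keys (for this instance) with no proper subset that is one."""
    found = []
    for size in range(1, len(attributes) + 1):
        for key in combinations(attributes, size):
            minimal = not any(set(k) < set(key) for k in found)
            if minimal and is_superkey(rows, key):
                found.append(key)
    return found


# Hypothetical instance of the branch relation.
branch = [
    {"branch-name": "Brighton", "branch-city": "Brooklyn", "asset": 7100000},
    {"branch-name": "Downtown", "branch-city": "Brooklyn", "asset": 7100000},
]

name_is_key = is_superkey(branch, ("branch-name",))
city_is_key = is_superkey(branch, ("branch-city",))
keys = candidate_keys(branch, ["branch-name", "branch-city", "asset"])
print(name_is_key, city_is_key, keys)
```

In this instance two branches share a city, so branch-city fails the test, while branch-name alone passes and is reported as the only candidate key.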
When an E-R design is represented by tables, the primary keys of the relations are
derived as follows.
Strong entity set:
The primary key of the entity set becomes the primary key of the relation.

Weak entity set:
The table, and thus the relation, corresponding to a weak entity set includes:
o The attributes of the weak entity set
o The primary key of the strong entity set on which the weak
entity set depends
Relationship set:
The union of the primary keys of the related entity sets becomes a super key of the
relation. If the relationship is many-to-many, this super key is also the primary key.
Combined tables:
A binary many-to-one relationship set from A to B can be represented by a table
consisting of the attributes of A and the attributes (if any) of the relationship set.
The primary key of the “many” entity set becomes the primary key of the relation.
Multivalued attributes:
A multivalued attribute M is represented by a table consisting of the primary key
of the entity set or relationship set of which M is an attribute, plus a column C holding
an individual value of M.
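The rule for multivalued attributes can be sketched as follows, assuming Python's built-in sqlite3 module. The table and column names and the sample employees are hypothetical. The multivalued attribute dependent-name gets its own table, made of the employee's primary key plus one column holding a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        employee_id   TEXT PRIMARY KEY,
        employee_name TEXT
    );
    -- One row per individual value of the multivalued attribute.
    CREATE TABLE employee_dependent (
        employee_id    TEXT REFERENCES employee(employee_id),
        dependent_name TEXT,
        PRIMARY KEY (employee_id, dependent_name)
    );
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)", [("E1", "Hayes"), ("E2", "Jones")]
)
conn.executemany(
    "INSERT INTO employee_dependent VALUES (?, ?)",
    [("E1", "Anna"), ("E1", "Ben")],
)

dependents_per_employee = conn.execute(
    """
    SELECT e.employee_id, COUNT(d.dependent_name)
    FROM employee e
    LEFT JOIN employee_dependent d ON d.employee_id = e.employee_id
    GROUP BY e.employee_id
    ORDER BY e.employee_id
    """
).fetchall()
print(dependents_per_employee)
```

An employee with no dependents simply has no rows in employee_dependent, so no null placeholder is needed in the employee table.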
Query Languages:
o A query language is a language in which a user requests information from the
database.
o Query languages can be categorized as being either procedural or
nonprocedural.
o In a procedural language, the user instructs the system to perform a sequence
of operations on the database to compute the desired result.
o In a nonprocedural language, the user describes the information desired
without giving a specific procedure for obtaining that information.
o A complete data-manipulation language includes not only a query language,
but also a language for database modification.

1.9 RELATIONAL DATABASE DESIGN

When designing a database, you have to make decisions regarding how best to
take some system in the real world and model it in a database. This consists of deciding
which tables to create, what columns they will contain, as well as the relationships
between the tables. While it would be nice if this process was totally intuitive and
obvious, or even better automated, this is simply not the case. A well-designed database
takes time and effort to conceive, build and refine.
The benefits of a database that has been designed according to the relational model are
numerous. Some of them are:
• Data entry, updates and deletions will be efficient.
• Data retrieval, summarization and reporting will also be efficient.
• Since the database follows a well-formulated model, it behaves predictably.
• Since much of the information is stored in the database rather than in the
application, the database is somewhat self-documenting.
• Changes to the database schema are easy to make.
The goal of this article is to explain the basic principles behind relational database
design and demonstrate how to apply these principles when designing a database using
Microsoft Access. This article is by no means comprehensive and certainly not
definitive. Many books have been written on database design theory; in fact, many
careers have been devoted to its study. Instead, this article is meant as an informal
introduction to database design theory for the database developer.

1.10 ER - MODEL
The E-R (entity-relationship) data model views the real world as a set of basic
objects (entities) and relationships among these objects.
It is intended primarily for the database design process, by allowing the specification
of an enterprise schema. This schema represents the overall logical structure of the
database.

1.10.1 OVERVIEW OF THE DATABASE DESIGN PROCESS

• When medium-sized or large databases are designed for use as part of an
information system of a large organization, database design becomes complex.
This is because many users are expected to use the database, so the system must
satisfy the requirements of all these users. Careful design and testing phases are
imperative to ensure that all these requirements are satisfactorily met.

1.10.2 THE DATABASE DESIGN PROCESS

The six main phases of the database design process are:

1. Requirements collection and analysis - involves the collection of
information concerning the intended use of the database.
2. Conceptual database design - The goal of this phase is to produce a
conceptual schema for the database that is independent of a specific
DBMS. In addition to specifying the conceptual schema, we should specify
as many of the known database applications or transactions as possible.
These applications should also be specified using a notation that is
independent of any specific DBMS.
3. Choice of a DBMS - The following costs must be considered when selecting
a DBMS:
a. Software acquisition cost.
b. Maintenance cost. The cost of receiving standard maintenance
service from the vendor and for keeping the DBMS version up to
date.
c. Hardware acquisition cost. New hardware such as additional
memory, terminal, disk units, or specialized DBMS storage may be
needed.
d. Database creation and conversion cost. The cost of using the DBMS
software either to create the database system from scratch or to
convert an existing system to the new DBMS software.
e. Personnel cost. New positions of database administrator (DBA) and
staff are being created in most companies using DBMS's.
f. Training cost.
g. Operating cost. The cost of continued operation of the database
system is typically not worked into an evaluation of alternative
because it is incurred regardless of the DBMS selected.
4. Data model mapping (also called logical database design) - During this
phase we map (or transform) the conceptual schema from the high-level
data model used in phase 2 into the data model of the DBMS chosen in
phase 3. In addition, the design of external schemas (views) for specific
applications is often done during this phase.
5. Physical database design- During this phase we design the specifications
for the stored database in terms of physical storage structures, record
placement, and access paths. This corresponds to the design of the internal
schema in the terminology of the three-level DBMS architecture.
6. Database implementation

1.10.3 THE ENTITY RELATIONSHIP MODEL

An entity is a “thing” or “object” in the real world that is distinguishable from all
other objects.
o (Ex) Each person in an enterprise is an entity. Social-security number 1001
uniquely identifies one particular person in the enterprise.
o An entity set is a set of entities of the same type that share the same properties,
or attributes.
o Entity sets do not need to be disjoint.
o An entity is represented by a set of attributes. Attributes are descriptive
properties possessed by each member of an entity set.
o A database thus includes a collection of entity sets, each of which contains any
number of entities of the same type.
o An attribute, as used in the E-R model, can be characterized by the attribute
types described later in this section: simple and composite, single-valued and
multivalued, null, and derived.
Entity sets:
o A super key is a set of one or more attributes that, taken collectively, allows us
to identify uniquely an entity in the entity set.
o The social-security attribute of the entity set customer is sufficient to
distinguish one customer entity from another.
o Thus social-security is a super key. Similarly, the combination of customer-name
and social-security is a super key for the entity set customer.
o Super keys for which no proper subset is a super key are minimal. Such minimal
super keys are called “candidate keys”.
o Suppose that the combination of customer-name and customer-street is also
sufficient to distinguish among customers. Then both {social-security} and
{customer-name, customer-street} are candidate keys.
o A key (primary key, candidate key or super key) is a property of the entity set,
rather than of the individual entities.
Relationship sets:
o A relationship is an association among several entities.
o (Ex) We can define a relationship that associates customer Hayes with loan L-15.
This relationship specifies that Hayes is a customer with loan number L-15.
o A relationship set is a set of relationships of the same type. Formally, it is a
mathematical relation on n ≥ 2 entity sets. If E1, E2, ..., En are entity sets, then a
relationship set R is a subset of
{ (e1, e2, ..., en) | e1 ∈ E1, e2 ∈ E2, ..., en ∈ En }
o where (e1, e2, ..., en) is a relationship.
o The association between entity sets is referred to as participation; that is, the
entity sets E1, E2, ..., En participate in relationship set R.
o A relationship instance in an E-R schema represents an association between the
named entities in the real-world enterprise that is being modeled.
o The same entity set may participate in a relationship set more than once, in
different roles. This type of relationship set is called a recursive
relationship set.
o A relationship may also have descriptive attributes.
Simple and composite attributes:
o Simple attributes are not divided into subparts. Composite attributes, on the
other hand, can be divided into subparts.
o (Ex) customer-name could be structured as a composite attribute consisting of
first-name, middle-name and last-name.
Single-valued and multivalued attributes:
• The attributes specified in our examples all have a single value for a
particular entity.
• For instance, the loan-number attribute for a specific loan entity refers to only
one loan number. Such attributes are said to be “single-valued”.
• Any particular employee may have zero, one or more dependents, so
different employee entities within the entity set will have different numbers
of values for the dependent-name attribute. This type of attribute is said to be
“multivalued”.
Null attributes:
o A null value is used when an entity does not have a value for an attribute.
o (Ex) If a particular employee has no dependents, the dependent-name value for
that employee will be null.
o Null can also designate that an attribute value is unknown.
o (Ex) If the social-security value for a particular customer is null, the value is
missing or not known.
Derived attributes:
o The value for this type of attribute can be derived from the values of other
related attributes or entities.
o (Ex) Suppose the customer entity set has an attribute that records how many
loans a customer has. We can derive the value for this attribute by counting the
number of loan entities associated with that customer.
o (Ex) The value for employment-length can be derived from the value for
start-date and the current date.
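The employment-length example can be sketched in a few lines, assuming Python's standard library. The class name Employee, the whole-years convention and the sample dates are assumptions made for the illustration. Only start-date is stored, and the derived value is computed on request.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    name: str
    start_date: date  # stored attribute

    def employment_length(self, today: date) -> int:
        """Derived attribute: completed years between start_date and today."""
        years = today.year - self.start_date.year
        # Subtract one if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.start_date.month, self.start_date.day):
            years -= 1
        return years


e = Employee("Jones", date(2015, 6, 1))
day_before_anniversary = e.employment_length(date(2024, 5, 31))
on_anniversary = e.employment_length(date(2024, 6, 1))
print(day_before_anniversary, on_anniversary)
```

Because the value is never stored, it cannot become out of date as the current date changes.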

Constraints:
An E-R enterprise schema may define certain constraints to which the contents
of a database must conform. Two of the most important types of constraints are
mapping cardinalities and participation constraints.
1. Mapping Cardinalities
Mapping cardinalities, or cardinality ratios, express the number of entities to
which another entity can be associated via a relationship set.
For a binary relationship set R between entity sets A and B, the mapping
cardinality must be one of the following:
One to one:
An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A.
One to many:
An entity in A is associated with any number (zero or more) of
entities in B. An entity in B, however, can be associated with at most one
entity in A.
Many to one:
An entity in A is associated with at most one entity in B. An entity in B, however,
can be associated with any number (zero or more) of entities in A.
Many to many:
An entity in A is associated with any number (zero or more) of entities in B, and
an entity in B is associated with any number (zero or more) of entities in A.
2. Participation Constraints
The participation of an entity set E in a relationship set R is said to be total if every
entity in E participates in at least one relationship in R. If only some entities in E
participate in relationships in R, the participation of entity set E in relationship set R
is said to be partial.
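Both kinds of constraint can be checked against sample data with a short sketch, assuming plain Python. The function names, the relationship set borrower and every entity other than Hayes and L-15 are hypothetical. A constraint is a property of the schema, so the code reports only what one particular instance is consistent with.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as distinct (a, b) pairs."""
    a_counts = Counter(a for a, _ in pairs)  # B entities linked to each a
    b_counts = Counter(b for _, b in pairs)  # A entities linked to each b
    a_has_many = any(n > 1 for n in a_counts.values())
    b_has_many = any(n > 1 for n in b_counts.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


def participation(entity_set, pairs, side):
    """Return 'total' if every entity appears in some pair, else 'partial'.

    side is 0 for the first entity set (A) and 1 for the second (B).
    """
    participating = {pair[side] for pair in pairs}
    return "total" if set(entity_set) <= participating else "partial"


customers = {"Hayes", "Jones", "Smith"}
loans = {"L-15", "L-17", "L-23"}
borrower = {("Hayes", "L-15"), ("Hayes", "L-23"), ("Jones", "L-17")}

cardinality = mapping_cardinality(borrower)
customer_participation = participation(customers, borrower, 0)
loan_participation = participation(loans, borrower, 1)
print(cardinality, customer_participation, loan_participation)
```

In this sample Hayes holds two loans and every loan has a single customer, so the instance is one-to-many from customer to loan. Smith holds no loan, so customer participation is partial, while loan participation is total.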
Keys:
A key allows us to identify a set of attributes that suffice to distinguish entities
from each other. Keys also help uniquely identify relationships, and thus distinguish
relationships from each other.

Entity Sets:
A super key is a set of one or more attributes that, taken collectively, allow us to
identify uniquely an entity in the entity set
For example, the customer-id attribute of the entity set customer is sufficient to
distinguish one customer entity from another. Thus, customer-id is a super key.
Similarly, the combination of customer-name and customer-id is a super key for the
entity set customer. The customer-name attribute of customer is not a super key,
because several people might have the same name. If K is a super key, then so is any
superset of K.
We are often interested in super keys for which no proper subset is a super key.
Such minimal super keys are called candidate keys.
We shall use the term primary key to denote a candidate key that is chosen by the
database designer as the principal means of identifying entities within an entity set. A
key (primary, candidate, and super) is a property of the entity set, rather than of the
individual entities.
Relationship Sets:
Let R be a relationship set involving entity sets e|, E,,..., E.. Let primary-key ( e|)
denote the set of attributes that forms the primary key for entity set E,. Assume for
now that the attribute names of all primary keys are unique, and each entity set
participates only once in the relationship. The composition of the primary key for a
relationship set depends on the set of attributes associated with the relationship set R.
If the relationship set R has no attributes associated with it, then the set
of attributes

Primary-key (E1) U primary-key (E2) U…..U primary-key (En)


describes an individual relationship in set R.
• If the relationship set R has attributes a1, a2, ..., am associated with it, then the set of
attributes
primary-key (E1) U primary-key (E2) U ... U primary-key (En) U {a1, a2, ..., am}
describes an individual relationship in set R.
In both of the above cases, the set of attributes
primary-key (E1) U primary-key (E2) U ... U primary-key (En)
forms a super key for the relationship set.
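The two cases can be sketched as set unions in Python. This is a minimal sketch, assuming a hypothetical depositor relationship set between customer (primary key customer-id) and account (primary key account-number) with a hypothetical descriptive attribute access-date; the function names are illustrative only.

```python
def relationship_super_key(primary_keys):
    """Union of the primary keys of the participating entity sets."""
    result = set()
    for primary_key in primary_keys:
        result |= set(primary_key)
    return result

def relationship_attributes(primary_keys, descriptive_attrs=()):
    """Attributes describing one relationship: the key union plus a1..am."""
    return relationship_super_key(primary_keys) | set(descriptive_attrs)

# Assumed example: depositor relates customer and account.
participant_keys = [{"customer-id"}, {"account-number"}]

print(relationship_super_key(participant_keys))
print(relationship_attributes(participant_keys, ["access-date"]))
```

The descriptive attribute appears in the description of a relationship but is not part of the super key, which stays the same in both cases.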

1.10.4 ENTITY-RELATIONSHIP DIAGRAM

The overall logical structure of a database can be expressed graphically by an E-R
diagram. The diagram includes the following major components:
o Rectangles, which represent entity sets
o Dashed ellipses, which denote derived attributes
o Double lines, which indicate total participation of an entity in a relationship set
The relationship set may be
o Many-to-one
o One-to-many
o Many-to-many
o One-to-one
To distinguish among these types, we draw either a directed line (->) or an undirected
line (-) between the relationship set and the entity set in question.
If a relationship set also has attributes associated with it, then we link these
attributes to that relationship set.
Weak Entity Sets
o An entity set may not have sufficient attributes to form a primary key; such an
entity set is termed a “weak entity set”.
o An entity set that has a primary key is termed a strong entity set.
o For a weak entity set to be meaningful, it must be part of a one-to-many relationship
set with its owner entity set.
o This relationship set should have no descriptive attributes, since any required
attributes can be associated with the weak entity set.
o The concepts of strong and weak entity sets are related to existence
dependencies.
o Although a weak entity set does not have a primary key, we still need a means of
distinguishing among the entities in the weak entity set that depend on one
particular strong entity.
o The discriminator of a weak entity set is a set of attributes that allows this
distinction to be made.
o The discriminator of a weak entity is also called the “partial key” of the entity set.
o The primary key of a weak entity set is formed by the primary key of the strong
entity set on which the weak entity set is existence dependent plus the weak
entity’s discriminator.
o A weak entity set may participate as owner in an identifying relationship with
another weak entity set.
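The way a weak entity set's primary key is formed can be sketched in Python. The example assumes a hypothetical weak entity set payment with discriminator payment-number, owned by a strong entity set loan with primary key loan-number; the data and the helper names are illustrative only.

```python
def weak_entity_primary_key(owner_primary_key, discriminator):
    """Primary key of the owner entity set plus the weak set's discriminator."""
    return set(owner_primary_key) | set(discriminator)

def is_unique(attrs, entities):
    """Return True if the attribute values in attrs differ for every entity."""
    values = [tuple(entity[a] for a in sorted(attrs)) for entity in entities]
    return len(values) == len(set(values))

# Hypothetical payments: payment numbers restart at 1 for each loan.
payments = [
    {"loan-number": "L-17", "payment-number": 1, "amount": 125},
    {"loan-number": "L-17", "payment-number": 2, "amount": 125},
    {"loan-number": "L-23", "payment-number": 1, "amount": 300},
]

key = weak_entity_primary_key({"loan-number"}, {"payment-number"})
print(is_unique({"payment-number"}, payments))  # discriminator alone
print(is_unique(key, payments))                 # owner key plus discriminator
```

The discriminator separates payments only within one loan; combined with the owner's primary key it identifies each payment in the sample.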
E-R Features and Their Uses
The extended E-R features include
 Specialization
 Generalization
Specialization:
o An entity set may include subgroupings of entities that are distinct in some
way from other entities in the set
o The E-R Model provides a means for representing these distinctive entity
groupings
o An account may be further classified as one of the following:
o Savings-account
o Checking-account
o Each of these account types is described by a set of attributes that includes all
the attributes of entity set account plus additional attributes.
o For example, savings-account entities are described further by the attribute
interest-rate, whereas checking-account entities are described further by the
attribute overdraft-amount. The process of designating subgroupings within an
entity set is called specialization.
o The specialization of account allows us to distinguish among accounts
based on the type of account.
o An entity set may be specialized by more than one distinguishing feature.
o For example, a specialization based on the type of account ownership would result
in the entity sets commercial-account and personal-account.
o When more than one specialization is formed on an entity set, a particular
entity may belong to both of the specializations.
o We can apply specialization repeatedly to refine the design scheme.
o The bank may offer the following 3 types of checking-account:
o A standard checking account:
o An account with a $3.00 monthly service charge and 25 free
checks each month.
o A gold checking account:
o An account that requires a $1000.00 minimum balance and pays
2% interest.
o A senior checking account:
o An account for customers aged 65 years or older that has no
monthly service charge.
o The specialization of checking-account by account type yields
 Standard, with attribute num-checks
 Gold, with attributes minimum-balance and interest-payment
 Senior, with attribute date-of-birth
o In an E-R diagram, specialization is depicted by a triangle component labeled
ISA.
o The label ISA stands for “is a” and represents, for example, that a savings account
“is an” account.
o The ISA relationship may also be referred to as a superclass-subclass
relationship.
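As a loose analogy, the ISA relationship resembles class inheritance in a programming language. The sketch below uses Python dataclasses; the class names, field types, and the helper attribute_names are assumptions made for illustration. The analogy is imperfect: in the E-R model an entity may belong to more than one specialization, which single inheritance does not capture.

```python
from dataclasses import dataclass, fields

@dataclass
class Account:
    """Higher-level entity set: attributes shared by all accounts."""
    account_number: str
    balance: float

@dataclass
class SavingsAccount(Account):
    """Specialization that adds interest-rate."""
    interest_rate: float

@dataclass
class CheckingAccount(Account):
    """Specialization that adds overdraft-amount."""
    overdraft_amount: float

def attribute_names(entity_set):
    """List the attributes of an entity set, inherited ones first."""
    return [f.name for f in fields(entity_set)]

savings = SavingsAccount("A-101", 500.0, 0.02)
print(attribute_names(SavingsAccount))
print(isinstance(savings, Account))  # a savings account "is an" account
```

Each subclass carries all the attributes of account plus its own additional attribute, as described above.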
Generalization:
o The refinement from an initial entity set into successive levels of entity
subgroupings represents a top-down design process in which distinctions are made
explicit.
o The design process may also proceed in a bottom-up manner, in which several
entity sets are combined into a higher-level entity set on the basis of common features.

o The database designer may have first identified a checking-account entity set
with the attributes account-number, balance, and overdraft-amount,
and a savings-account entity set with the attributes account-number, balance, and
interest-rate.
o There are similarities between the checking-account entity set and the
savings-account entity set in the sense that they have several attributes in common.
o This commonality can be expressed by generalization, which is a containment
relationship that exists between a higher-level entity set and one or more lower-level
entity sets.
o For example, account is the higher-level entity set, and savings-account and
checking-account are lower-level entity sets.
o Higher- and lower-level entity sets may also be designated by the terms
superclass and subclass.
o The account entity set is the superclass of the savings-account and
checking-account subclasses.
o Generalization is a simple inversion of specialization
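The bottom-up step can be sketched in Python as a set intersection: the attributes common to the lower-level entity sets move up to the higher-level entity set, and each lower-level set keeps only what is specific to it. This is a minimal sketch; the function name generalize is hypothetical, and it assumes the shared attributes carry the same names in both entity sets.

```python
def generalize(lower_level):
    """Split attributes into those shared by all entity sets and the rest.

    lower_level maps each lower-level entity set name to its attribute set.
    Returns (common attributes, remaining attributes per entity set).
    """
    common = set.intersection(*lower_level.values())
    specific = {name: attrs - common for name, attrs in lower_level.items()}
    return common, specific

lower_level_sets = {
    "checking-account": {"account-number", "balance", "overdraft-amount"},
    "savings-account": {"account-number", "balance", "interest-rate"},
}

account_attrs, remaining = generalize(lower_level_sets)
print(account_attrs)  # attributes of the higher-level entity set account
print(remaining)      # attributes left on each lower-level entity set
```

Here account-number and balance form the higher-level entity set account, while overdraft-amount and interest-rate stay with checking-account and savings-account respectively.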
